arXiv Papers with Code in Cryptography and Security (January 2025

Abstract:
Local differential privacy (LDP) involves users perturbing their inputs to provide plausible deniability of their data. However, this also makes LDP vulnerable to poisoning attacks. In this paper, we first introduce novel poisoning attacks for ranking estimation. These attacks are intricate, as fake attackers do not merely adjust the frequency of target items. Instead, they leverage a limited number of fake users to precisely modify frequencies, effectively altering item rankings to maximize gains. To tackle this challenge, we introduce the concepts of attack cost and optimal attack item (set), and propose corresponding strategies for kRR, OUE, and OLH protocols. For kRR, we iteratively select optimal attack items and allocate suitable fake users. For OUE, we iteratively determine optimal attack item sets and consider the incremental changes in item frequencies across different sets. Regarding OLH, we develop a harmonic cost function based on the pre-image of a hash to select that supporting a larger number of effective attack items. Lastly, we present an attack strategy based on confidence levels to quantify the probability of a successful attack and the number of attack iterations more precisely. We demonstrate the effectiveness of our attacks through theoretical and empirical evidence, highlighting the necessity for defenses against these attacks. The source code and data have been made available at https://github.com/LDP-user/LDP-Ranking.git.

Paperid: 2, https://arxiv.org/pdf/2506.23474.pdf GitHub

Abstract:
The Model Context Protocol (MCP) has recently emerged as a standardized interface for connecting language models with external tools and data. As the ecosystem rapidly expands, the lack of a structured, comprehensive view of existing MCP artifacts presents challenges for research. To bridge this gap, we introduce MCPCorpus, a large-scale dataset containing around 14K MCP servers and 300 MCP clients. Each artifact is annotated with 20+ normalized attributes capturing its identity, interface configuration, GitHub activity, and metadata. MCPCorpus provides a reproducible snapshot of the real-world MCP ecosystem, enabling studies of adoption trends, ecosystem health, and implementation diversity. To keep pace with the rapid evolution of the MCP ecosystem, we provide utility tools for automated data synchronization, normalization, and inspection. Furthermore, to support efficient exploration and exploitation, we release a lightweight web-based search interface. MCPCorpus is publicly available at: https://github.com/Snakinya/MCPCorpus.

Paperid: 3, https://arxiv.org/pdf/2506.23145.pdf GitHub

Abstract:
Privacy preservation in AI is crucial, especially in healthcare, where models rely on sensitive patient data. In the emerging field of machine unlearning, existing methodologies struggle to remove patient data from trained multimodal architectures, which are widely used in healthcare. We propose Forget-MI, a novel machine unlearning method for multimodal medical data, by establishing loss functions and perturbation techniques. Our approach unlearns unimodal and joint representations of the data requested to be forgotten while preserving knowledge from the remaining data and maintaining comparable performance to the original model. We evaluate our results using performance on the forget dataset, performance on the test dataset, and Membership Inference Attack (MIA), which measures the attacker's ability to distinguish the forget dataset from the training dataset. Our model outperforms the existing approaches that aim to reduce MIA and the performance on the forget dataset while keeping an equivalent performance on the test set. Specifically, our approach reduces MIA by 0.202 and decreases AUC and F1 scores on the forget set by 0.221 and 0.305, respectively. Additionally, our performance on the test set matches that of the retrained model, while allowing forgetting. Code is available at https://github.com/BioMedIA-MBZUAI/Forget-MI.git

Paperid: 4, https://arxiv.org/pdf/2506.23074.pdf GitHub GitHub

Abstract:
In this paper, we propose a Counterfactually Decoupled Attention Learning (CDAL) method for open-world model attribution. Existing methods rely on handcrafted design of region partitioning or feature space, which could be confounded by the spurious statistical correlations and struggle with novel attacks in open-world scenarios. To address this, CDAL explicitly models the causal relationships between the attentional visual traces and source model attribution, and counterfactually decouples the discriminative model-specific artifacts from confounding source biases for comparison. In this way, the resulting causal effect provides a quantification on the quality of learned attention maps, thus encouraging the network to capture essential generation patterns that generalize to unseen source models by maximizing the effect. Extensive experiments on existing open-world model attribution benchmarks show that with minimal computational overhead, our method consistently improves state-of-the-art models by large margins, particularly for unseen novel attacks. Source code: https://github.com/yzheng97/CDAL.

Paperid: 5, https://arxiv.org/pdf/2506.22890.pdf GitHub

Abstract:
Collaborative Perception (CP) has been shown to be a promising technique for multi-agent autonomous driving and multi-agent robotic systems, where multiple agents share their perception information to enhance the overall perception performance and expand the perception range. However, in CP, an ego agent needs to receive messages from its collaborators, which makes it vulnerable to attacks from malicious agents. To address this critical issue, we propose a unified, probability-agnostic, and adaptive framework, namely, CP-uniGuard, which is a tailored defense mechanism for CP deployed by each agent to accurately detect and eliminate malicious agents in its collaboration network. Our key idea is to enable CP to reach a consensus rather than a conflict against an ego agent's perception results. Based on this idea, we first develop a probability-agnostic sample consensus (PASAC) method to effectively sample a subset of the collaborators and verify the consensus without prior probabilities of malicious agents. Furthermore, we define collaborative consistency loss (CCLoss) for object detection task and bird's eye view (BEV) segmentation task to capture the discrepancy between an ego agent and its collaborators, which is used as a verification criterion for consensus. In addition, we propose online adaptive threshold via dual sliding windows to dynamically adjust the threshold for consensus verification and ensure the reliability of the systems in dynamic environments. Finally, we conduct extensive experiments and demonstrate the effectiveness of our framework. Code will be released at https://github.com/CP-Security/CP-uniGuard.

Paperid: 6, https://arxiv.org/pdf/2506.21609.pdf GitHub

Abstract:
Recently, there have been notable advancements in large language models (LLMs), demonstrating their growing abilities in complex reasoning. However, existing research largely overlooks a thorough and systematic comparison of these models' reasoning processes and outputs, particularly regarding their self-reflection pattern (also termed "Aha moment") and the interconnections across diverse domains. This paper proposes a novel framework for analyzing the reasoning characteristics of four cutting-edge large reasoning models (GPT-o1, DeepSeek-R1, Kimi-k1.5, and Grok-3) using keywords statistic and LLM-as-a-judge paradigm. Our approach connects their internal thinking processes with their final outputs. A diverse dataset consists of real-world scenario-based questions covering logical deduction, causal inference, and multi-step problem-solving. Additionally, a set of metrics is put forward to assess both the coherence of reasoning and the accuracy of the outputs. The research results uncover various patterns of how these models balance exploration and exploitation, deal with problems, and reach conclusions during the reasoning process. Through quantitative and qualitative comparisons, disparities among these models are identified in aspects such as the depth of reasoning, the reliance on intermediate steps, and the degree of similarity between their thinking processes and output patterns and those of GPT-o1. This work offers valuable insights into the trade-off between computational efficiency and reasoning robustness and provides practical recommendations for enhancing model design and evaluation in practical applications. We publicly release our project at: https://github.com/ChangWenhan/FromThinking2Output

Paperid: 7, https://arxiv.org/pdf/2506.21571.pdf GitHub

Abstract:
Large Reasoning Models (LRMs), which autonomously produce a reasoning Chain of Thought (CoT) before producing final responses, offer a promising approach to interpreting and monitoring model behaviors. Inspired by the observation that certain CoT patterns -- e.g., ``Wait, did I miss anything?'' -- consistently emerge across tasks, we explore whether LRMs exhibit human-like cognitive habits. Building on Habits of Mind, a well-established framework of cognitive habits associated with successful human problem-solving, we introduce CogTest, a principled benchmark designed to evaluate LRMs' cognitive habits. CogTest includes 16 cognitive habits, each instantiated with 25 diverse tasks, and employs an evidence-first extraction method to ensure reliable habit identification. With CogTest, we conduct a comprehensive evaluation of 16 widely used LLMs (13 LRMs and 3 non-reasoning ones). Our findings reveal that LRMs, unlike conventional LLMs, not only exhibit human-like habits but also adaptively deploy them according to different tasks. Finer-grained analyses further uncover patterns of similarity and difference in LRMs' cognitive habit profiles, particularly certain inter-family similarity (e.g., Qwen-3 models and DeepSeek-R1). Extending the study to safety-related tasks, we observe that certain habits, such as Taking Responsible Risks, are strongly associated with the generation of harmful responses. These findings suggest that studying persistent behavioral patterns in LRMs' CoTs is a valuable step toward deeper understanding of LLM misbehavior. The code is available at: https://github.com/jianshuod/CogTest.

Paperid: 8, https://arxiv.org/pdf/2506.21308.pdf GitHub

Abstract:
Privacy risks in differentially private (DP) systems increase significantly when data is correlated, as standard DP metrics often underestimate the resulting privacy leakage, leaving sensitive information vulnerable. Given the ubiquity of dependencies in real-world databases, this oversight poses a critical challenge for privacy protections. Bayesian differential privacy (BDP) extends DP to account for these correlations, yet current BDP mechanisms indicate notable utility loss, limiting its adoption. In this work, we address whether BDP can be realistically implemented in common data structures without sacrificing utility -- a key factor for its applicability. By analyzing arbitrary and structured correlation models, including Gaussian multivariate distributions and Markov chains, we derive practical utility guarantees for BDP. Our contributions include theoretical links between DP and BDP and a novel methodology for adapting DP mechanisms to meet the BDP requirements. Through evaluations on real-world databases, we demonstrate that our novel theorems enable the design of BDP mechanisms that maintain competitive utility, paving the way for practical privacy-preserving data practices in correlated settings.

Paperid: 9, https://arxiv.org/pdf/2506.21046.pdf GitHub

Abstract:
The ability of deep neural networks (DNNs) come from extracting and interpreting features from the data provided. By exploiting intermediate features in DNNs instead of relying on hard labels, we craft adversarial perturbation that generalize more effectively, boosting black-box transferability. These features ubiquitously come from supervised learning in previous work. Inspired by the exceptional synergy between self-supervised learning and the Transformer architecture, this paper explores whether exploiting self-supervised Vision Transformer (ViT) representations can improve adversarial transferability. We present dSVA -- a generative dual self-supervised ViT features attack, that exploits both global structural features from contrastive learning (CL) and local textural features from masked image modeling (MIM), the self-supervised learning paradigm duo for ViTs. We design a novel generative training framework that incorporates a generator to create black-box adversarial examples, and strategies to train the generator by exploiting joint features and the attention mechanism of self-supervised ViTs. Our findings show that CL and MIM enable ViTs to attend to distinct feature tendencies, which, when exploited in tandem, boast great adversarial generalizability. By disrupting dual deep features distilled by self-supervised ViTs, we are rewarded with remarkable black-box transferability to models of various architectures that outperform state-of-the-arts. Code available at https://github.com/spencerwooo/dSVA.

Paperid: 10, https://arxiv.org/pdf/2506.20101.pdf GitHub

Abstract:
Multi-Key Homomorphic Encryption (MKHE), proposed by Lopez-Alt et al. (STOC 2012), allows for performing arithmetic computations directly on ciphertexts encrypted under distinct keys. Subsequent works by Chen and Dai et al. (CCS 2019) and Kim and Song et al. (CCS 2023) extended this concept by proposing multi-key BFV/CKKS variants, referred to as the CDKS scheme. These variants incorporate asymptotically optimal techniques to facilitate secure computation across multiple data providers. In this paper, we identify a critical security vulnerability in the CDKS scheme when applied to multiparty secure computation tasks, such as privacy-preserving federated learning (PPFL). In particular, we show that CDKS may inadvertently leak plaintext information from one party to others. To mitigate this issue, we propose a new scheme, SMHE (Secure Multi-Key Homomorphic Encryption), which incorporates a novel masking mechanism into the multi-key BFV and CKKS frameworks to ensure that plaintexts remain confidential throughout the computation. We implement a PPFL application using SMHE and demonstrate that it provides significantly improved security with only a modest overhead in homomorphic evaluation. For instance, our PPFL model based on multi-key CKKS incurs less than a 2\times runtime and communication traffic increase compared to the CDKS-based PPFL model. The code is publicly available at https://github.com/JiahuiWu2022/SMHE.git.

Paperid: 11, https://arxiv.org/pdf/2506.19464.pdf GitHub

Abstract:
The success of deep learning in medical imaging applications has led several companies to deploy proprietary models in diagnostic workflows, offering monetized services. Even though model weights are hidden to protect the intellectual property of the service provider, these models are exposed to model stealing (MS) attacks, where adversaries can clone the model's functionality by querying it with a proxy dataset and training a thief model on the acquired predictions. While extensively studied on general vision tasks, the susceptibility of medical imaging models to MS attacks remains inadequately explored. This paper investigates the vulnerability of black-box medical imaging models to MS attacks under realistic conditions where the adversary lacks access to the victim model's training data and operates with limited query budgets. We demonstrate that adversaries can effectively execute MS attacks by using publicly available datasets. To further enhance MS capabilities with limited query budgets, we propose a two-step model stealing approach termed QueryWise. This method capitalizes on unlabeled data obtained from a proxy distribution to train the thief model without incurring additional queries. Evaluation on two medical imaging models for Gallbladder Cancer and COVID-19 classification substantiates the effectiveness of the proposed attack. The source code is available at https://github.com/rajankita/QueryWise.

Paperid: 12, https://arxiv.org/pdf/2506.19453.pdf GitHub

Abstract:
Software supply chain vulnerabilities arise when attackers exploit weaknesses by injecting vulnerable code into widely used packages or libraries within software repositories. While most existing approaches focus on identifying vulnerable packages or libraries, they often overlook the specific functions responsible for these vulnerabilities. Pinpointing vulnerable functions within packages or libraries is critical, as it can significantly reduce the risks associated with using open-source software. Identifying vulnerable patches is challenging because developers often submit code changes that are unrelated to vulnerability fixes. To address this issue, this paper introduces FuncVul, an innovative code chunk-based model for function-level vulnerability detection in C/C++ and Python, designed to identify multiple vulnerabilities within a function by focusing on smaller, critical code segments. To assess the model's effectiveness, we construct six code and generic code chunk based datasets using two approaches: (1) integrating patch information with large language models to label vulnerable samples and (2) leveraging large language models alone to detect vulnerabilities in function-level code. To design FuncVul vulnerability model, we utilise GraphCodeBERT fine tune model that captures both the syntactic and semantic aspects of code. Experimental results show that FuncVul outperforms existing state-of-the-art models, achieving an average accuracy of 87-92% and an F1 score of 86-92% across all datasets. Furthermore, we have demonstrated that our code-chunk-based FuncVul model improves 53.9% accuracy and 42.0% F1-score than the full function-based vulnerability prediction. The FuncVul code and datasets are publicly available on GitHub at https://github.com/sajalhalder/FuncVul.

Paperid: 13, https://arxiv.org/pdf/2506.18756.pdf GitHub

Abstract:
Large Language Models (LLMs) increasingly rely on automatic prompt engineering in graphical user interfaces (GUIs) to refine user inputs and enhance response accuracy. However, the diversity of user requirements often leads to unintended misinterpretations, where automated optimizations distort original intentions and produce erroneous outputs. To address this challenge, we propose the Adaptive Greedy Binary Search (AGBS) method, which simulates common prompt optimization mechanisms while preserving semantic stability. Our approach dynamically evaluates the impact of such strategies on LLM performance, enabling robust adversarial sample generation. Through extensive experiments on open and closed-source LLMs, we demonstrate AGBS's effectiveness in balancing semantic consistency and attack efficacy. Our findings offer actionable insights for designing more reliable prompt optimization systems. Code is available at: https://github.com/franz-chang/DOBS

Paperid: 14, https://arxiv.org/pdf/2506.16349.pdf GitHub

Abstract:
Watermarking the outputs of generative models has emerged as a promising approach for tracking their provenance. Despite significant interest in autoregressive image generation models and their potential for misuse, no prior work has attempted to watermark their outputs at the token level. In this work, we present the first such approach by adapting language model watermarking techniques to this setting. We identify a key challenge: the lack of reverse cycle-consistency (RCC), wherein re-tokenizing generated image tokens significantly alters the token sequence, effectively erasing the watermark. To address this and to make our method robust to common image transformations, neural compression, and removal attacks, we introduce (i) a custom tokenizer-detokenizer finetuning procedure that improves RCC, and (ii) a complementary watermark synchronization layer. As our experiments demonstrate, our approach enables reliable and robust watermark detection with theoretically grounded p-values.

Paperid: 15, https://arxiv.org/pdf/2506.16078.pdf GitHub

Abstract:
Safety alignment is a key requirement for building reliable Artificial General Intelligence. Despite significant advances in safety alignment, we observe that minor latent shifts can still trigger unsafe responses in aligned models. We argue that this stems from the shallow nature of existing alignment methods, which focus on surface-level refusal behaviors without sufficiently altering internal representations. Consequently, small shifts in hidden activations can re-trigger harmful behaviors embedded in the latent space. To explore the robustness of safety alignment to latent perturbations, we introduce a probing method that measures the Negative Log-Likelihood of the original response generated by the model. This probe quantifies local sensitivity in the latent space, serving as a diagnostic tool for identifying vulnerable directions. Based on this signal, we construct effective jailbreak trajectories, giving rise to the Activation Steering Attack (ASA). More importantly, these insights offer a principled foundation for improving alignment robustness. To this end, we introduce Layer-wise Adversarial Patch Training~(LAPT), a fine-tuning strategy that inject controlled perturbations into hidden representations during training. Experimental results highlight that LAPT strengthen alignment robustness without compromising general capabilities. Our findings reveal fundamental flaws in current alignment paradigms and call for representation-level training strategies that move beyond surface-level behavior supervision. Codes and results are available at https://github.com/Carol-gutianle/LatentSafety.

Paperid: 16, https://arxiv.org/pdf/2506.15312.pdf GitHub

Abstract:
Face sketch synthesis is a technique aimed at converting face photos into sketches. Existing face sketch synthesis research mainly relies on training with numerous photo-sketch sample pairs from existing datasets. However, these large-scale discriminative learning methods will have to face problems such as data scarcity and high human labor costs. Once the training data becomes scarce, their generative performance significantly degrades. In this paper, we propose a one-shot face sketch synthesis method based on diffusion models. We optimize text instructions on a diffusion model using face photo-sketch image pairs. Then, the instructions derived through gradient-based optimization are used for inference. To simulate real-world scenarios more accurately and evaluate method effectiveness more comprehensively, we introduce a new benchmark named One-shot Face Sketch Dataset (OS-Sketch). The benchmark consists of 400 pairs of face photo-sketch images, including sketches with different styles and photos with different backgrounds, ages, sexes, expressions, illumination, etc. For a solid out-of-distribution evaluation, we select only one pair of images for training at each time, with the rest used for inference. Extensive experiments demonstrate that the proposed method can convert various photos into realistic and highly consistent sketches in a one-shot context. Compared to other methods, our approach offers greater convenience and broader applicability. The dataset will be available at: https://github.com/HanWu3125/OS-Sketch

Paperid: 17, https://arxiv.org/pdf/2506.15253.pdf GitHub

Abstract:
The rapid deployment of Large language model (LLM) agents in critical domains like healthcare and finance necessitates robust security frameworks. To address the absence of standardized evaluation benchmarks for these agents in dynamic environments, we introduce RAS-Eval, a comprehensive security benchmark supporting both simulated and real-world tool execution. RAS-Eval comprises 80 test cases and 3,802 attack tasks mapped to 11 Common Weakness Enumeration (CWE) categories, with tools implemented in JSON, LangGraph, and Model Context Protocol (MCP) formats. We evaluate 6 state-of-the-art LLMs across diverse scenarios, revealing significant vulnerabilities: attacks reduced agent task completion rates (TCR) by 36.78% on average and achieved an 85.65% success rate in academic settings. Notably, scaling laws held for security capabilities, with larger models outperforming smaller counterparts. Our findings expose critical risks in real-world agent deployments and provide a foundational framework for future security research. Code and data are available at https://github.com/lanzer-tree/RAS-Eval.

Paperid: 18, https://arxiv.org/pdf/2506.14933.pdf GitHub

Abstract:
The decentralized finance (DeFi) community has grown rapidly in recent years, pushed forward by cryptocurrency enthusiasts interested in the vast untapped potential of new markets. The surge in popularity of cryptocurrency has ushered in a new era of financial crime. Unfortunately, the novelty of the technology makes the task of catching and prosecuting offenders particularly challenging. Thus, it is necessary to implement automated detection tools related to policies to address the growing criminality in the cryptocurrency realm.

Paperid: 19, https://arxiv.org/pdf/2506.14057.pdf GitHub

Abstract:
Web tracking is a pervasive and opaque practice that enables personalized advertising, retargeting, and conversion tracking. Over time, it has evolved into a sophisticated and invasive ecosystem, employing increasingly complex techniques to monitor and profile users across the web. The research community has a long track record of analyzing new web tracking techniques, designing and evaluating the effectiveness of countermeasures, and assessing compliance with privacy regulations. Despite a substantial body of work on web tracking, the literature remains fragmented across distinctly scoped studies, making it difficult to identify overarching trends, connect new but related techniques, and identify research gaps in the field. Today, web tracking is undergoing a once-in-a-generation transformation, driven by fundamental shifts in the advertising industry, the adoption of anti-tracking countermeasures by browsers, and the growing enforcement of emerging privacy regulations. This Systematization of Knowledge (SoK) aims to consolidate and synthesize this wide-ranging research, offering a comprehensive overview of the technical mechanisms, countermeasures, and regulations that shape the modern and rapidly evolving web tracking landscape. This SoK also highlights open challenges and outlines directions for future research, aiming to serve as a unified reference and introductory material for researchers, practitioners, and policymakers alike.

Paperid: 20, https://arxiv.org/pdf/2506.13160.pdf GitHub GitHub

Abstract:
Deep neural networks (DNNs) rely heavily on high-quality open-source datasets (e.g., ImageNet) for their success, making dataset ownership verification (DOV) crucial for protecting public dataset copyrights. In this paper, we find existing DOV methods (implicitly) assume that the verification process is faithful, where the suspicious model will directly verify ownership by using the verification samples as input and returning their results. However, this assumption may not necessarily hold in practice and their performance may degrade sharply when subjected to intentional or unintentional perturbations. To address this limitation, we propose the first certified dataset watermark (i.e., CertDW) and CertDW-based certified dataset ownership verification method that ensures reliable verification even under malicious attacks, under certain conditions (e.g., constrained pixel-level perturbation). Specifically, inspired by conformal prediction, we introduce two statistical measures, including principal probability (PP) and watermark robustness (WR), to assess model prediction stability on benign and watermarked samples under noise perturbations. We prove there exists a provable lower bound between PP and WR, enabling ownership verification when a suspicious model's WR value significantly exceeds the PP values of multiple benign models trained on watermark-free datasets. If the number of PP values smaller than WR exceeds a threshold, the suspicious model is regarded as having been trained on the protected dataset. Extensive experiments on benchmark datasets verify the effectiveness of our CertDW method and its resistance to potential adaptive attacks. Our codes are at \href{https://github.com/NcepuQiaoTing/CertDW}{GitHub}.

Paperid: 21, https://arxiv.org/pdf/2506.12880.pdf GitHub

Abstract:
We study suffix-based jailbreaks$\unicode{x2013}$a powerful family of attacks against large language models (LLMs) that optimize adversarial suffixes to circumvent safety alignment. Focusing on the widely used foundational GCG attack (Zou et al., 2023), we observe that suffixes vary in efficacy: some markedly more universal$\unicode{x2013}$generalizing to many unseen harmful instructions$\unicode{x2013}$than others. We first show that GCG's effectiveness is driven by a shallow, critical mechanism, built on the information flow from the adversarial suffix to the final chat template tokens before generation. Quantifying the dominance of this mechanism during generation, we find GCG irregularly and aggressively hijacks the contextualization process. Crucially, we tie hijacking to the universality phenomenon, with more universal suffixes being stronger hijackers. Subsequently, we show that these insights have practical implications: GCG universality can be efficiently enhanced (up to $\times$5 in some cases) at no additional computational cost, and can also be surgically mitigated, at least halving attack success with minimal utility loss. We release our code and data at http://github.com/matanbt/interp-jailbreak.

Paperid: 22, https://arxiv.org/pdf/2506.12761.pdf GitHub

Abstract:
Location-based services often require users to share sensitive locational data, raising privacy concerns due to potential misuse or exploitation by untrusted servers. In response, we present VeLoPIR, a versatile location-based private information retrieval (PIR) system designed to preserve user privacy while enabling efficient and scalable query processing. VeLoPIR introduces three operational modes-interval validation, coordinate validation, and identifier matching-that support a broad range of real-world applications, including information and emergency alerts. To enhance performance, VeLoPIR incorporates multi-level algorithmic optimizations with parallel structures, achieving significant scalability across both CPU and GPU platforms. We also provide formal security and privacy proofs, confirming the system's robustness under standard cryptographic assumptions. Extensive experiments on real-world datasets demonstrate that VeLoPIR achieves up to 11.55 times speed-up over a prior baseline. The implementation of VeLoPIR is publicly available at https://github.com/PrivStatBool/VeLoPIR.

Paperid: 23, https://arxiv.org/pdf/2506.12430.pdf GitHub

Authors:Zonghao Ying, Siyang Wu, Run Hao, Peng Ying, Shixuan Sun, Pengyu Chen, Junze Chen, Hao Du, Kaiwen Shen, Shangkun Wu, Jiwei Wei, Shiyuan He, Yang Yang, Xiaohai Xu, Ke Ma, Qianqian Xu, Qingming Huang, Shi Lin, Xun Wang, Changting Lin, Meng Han, Yilei Jiang, Siqi Lai, Yaozhi Zheng, Yifei Song, Xiangyu Yue, Zonglei Jing, Tianyuan Zhang, Zhilei Zhu, Aishan Liu, Jiakai Wang, Siyuan Liang, Xianglong Kong, Hainan Li, Junjie Mu, Haotong Qin, Yue Yu, Lei Chen, Felix Juefei-Xu, Qing Guo, Xinyun Chen, Yew Soon Ong, Xianglong Liu, Dawn Song, Alan Yuille, Philip Torr, Dacheng Tao

Abstract:
Multimodal Large Language Models (MLLMs) have enabled transformative advancements across diverse applications but remain susceptible to safety threats, especially jailbreak attacks that induce harmful outputs. To systematically evaluate and improve their safety, we organized the Adversarial Testing & Large-model Alignment Safety Grand Challenge (ATLAS) 2025}. This technical report presents findings from the competition, which involved 86 teams testing MLLM vulnerabilities via adversarial image-text attacks in two phases: white-box and black-box evaluations. The competition results highlight ongoing challenges in securing MLLMs and provide valuable guidance for developing stronger defense mechanisms. The challenge establishes new benchmarks for MLLM safety evaluation and lays groundwork for advancing safer multimodal AI systems. The code and data for this challenge are openly available at https://github.com/NY1024/ATLAS_Challenge_2025.

Paperid: 24, https://arxiv.org/pdf/2506.11586.pdf GitHub

Abstract:
The widespread adoption of outsourced neural network inference presents significant privacy challenges, as sensitive user data is processed on untrusted remote servers. Secure inference offers a privacy-preserving solution, but existing frameworks suffer from high computational overhead and communication costs, rendering them impractical for real-world deployment. We introduce SecONNds, a non-intrusive secure inference framework optimized for large ImageNet-scale Convolutional Neural Networks. SecONNds integrates a novel fully Boolean Goldreich-Micali-Wigderson (GMW) protocol for secure comparison -- addressing Yao's millionaires' problem -- using preprocessed Beaver's bit triples generated from Silent Random Oblivious Transfer. Our novel protocol achieves an online speedup of 17$\times$ in nonlinear operations compared to state-of-the-art solutions while reducing communication overhead. To further enhance performance, SecONNds employs Number Theoretic Transform (NTT) preprocessing and leverages GPU acceleration for homomorphic encryption operations, resulting in speedups of 1.6$\times$ on CPU and 2.2$\times$ on GPU for linear operations. We also present SecONNds-P, a bit-exact variant that ensures verifiable full-precision results in secure computation, matching the results of plaintext computations. Evaluated on a 37-bit quantized SqueezeNet model, SecONNds achieves an end-to-end inference time of 2.8 s on GPU and 3.6 s on CPU, with a total communication of just 420 MiB. SecONNds' efficiency and reduced computational load make it well-suited for deploying privacy-sensitive applications in resource-constrained environments. SecONNds is open source and can be accessed from: https://github.com/shashankballa/SecONNds.

Paperid: 25, https://arxiv.org/pdf/2506.11423.pdf GitHub

Abstract:
The Bhatt Conjectures framework introduces rigorous, hierarchical benchmarks for evaluating AI reasoning and understanding, moving beyond pattern matching to assess representation invariance, robustness, and metacognitive self-awareness. The agentreasoning-sdk demonstrates practical implementation, revealing that current AI models struggle with complex reasoning tasks and highlighting the need for advanced evaluation protocols to distinguish genuine cognitive abilities from statistical inference. https://github.com/mbhatt1/agentreasoning-sdk

Paperid: 26, https://arxiv.org/pdf/2506.10960.pdf GitHub

Abstract:
Large language models (LLMs) have been increasingly applied to automated harmful content detection tasks, assisting moderators in identifying policy violations and improving the overall efficiency and accuracy of content review. However, existing resources for harmful content detection are predominantly focused on English, with Chinese datasets remaining scarce and often limited in scope. We present a comprehensive, professionally annotated benchmark for Chinese content harm detection, which covers six representative categories and is constructed entirely from real-world data. Our annotation process further yields a knowledge rule base that provides explicit expert knowledge to assist LLMs in Chinese harmful content detection. In addition, we propose a knowledge-augmented baseline that integrates both human-annotated knowledge rules and implicit knowledge from large language models, enabling smaller models to achieve performance comparable to state-of-the-art LLMs. Code and data are available at https://github.com/zjunlp/ChineseHarm-bench.

Paperid: 27, https://arxiv.org/pdf/2506.10597.pdf GitHub

Abstract:
Large Language Models (LLMs) have achieved remarkable progress, but their deployment has exposed critical vulnerabilities, particularly to jailbreak attacks that circumvent safety mechanisms. Guardrails--external defense mechanisms that monitor and control LLM interaction--have emerged as a promising solution. However, the current landscape of LLM guardrails is fragmented, lacking a unified taxonomy and comprehensive evaluation framework. In this Systematization of Knowledge (SoK) paper, we present the first holistic analysis of jailbreak guardrails for LLMs. We propose a novel, multi-dimensional taxonomy that categorizes guardrails along six key dimensions, and introduce a Security-Efficiency-Utility evaluation framework to assess their practical effectiveness. Through extensive analysis and experiments, we identify the strengths and limitations of existing guardrail approaches, explore their universality across attack types, and provide insights into optimizing defense combinations. Our work offers a structured foundation for future research and development, aiming to guide the principled advancement and deployment of robust LLM guardrails. The code is available at https://github.com/xunguangwang/SoK4JailbreakGuardrails.

Paperid: 28, https://arxiv.org/pdf/2506.10424.pdf GitHub

Abstract:
Large language models (LLMs) have achieved remarkable success and are widely adopted for diverse applications. However, fine-tuning these models often involves private or sensitive information, raising critical privacy concerns. In this work, we conduct the first comprehensive study evaluating the vulnerability of fine-tuned LLMs to membership inference attacks (MIAs). Our empirical analysis demonstrates that MIAs exploit the loss reduction during fine-tuning, making them highly effective in revealing membership information. These findings motivate the development of our defense. We propose SOFT (\textbf{S}elective data \textbf{O}bfuscation in LLM \textbf{F}ine-\textbf{T}uning), a novel defense technique that mitigates privacy leakage by leveraging influential data selection with an adjustable parameter to balance utility preservation and privacy protection. Our extensive experiments span six diverse domains and multiple LLM architectures and scales. Results show that SOFT effectively reduces privacy risks while maintaining competitive model performance, offering a practical and scalable solution to safeguard sensitive information in fine-tuned LLMs.

Paperid: 29, https://arxiv.org/pdf/2506.09562.pdf GitHub

Abstract:
Deep reinforcement learning (DRL) has achieved remarkable success in a wide range of sequential decision-making domains, including robotics, healthcare, smart grids, and finance. Recent research demonstrates that attackers can efficiently exploit system vulnerabilities during the training phase to execute backdoor attacks, producing malicious actions when specific trigger patterns are present in the state observations. However, most existing backdoor attacks rely primarily on simplistic and heuristic trigger configurations, overlooking the potential efficacy of trigger optimization. To address this gap, we introduce TooBadRL (Trigger Optimization to Boost Effectiveness of Backdoor Attacks on DRL), the first framework to systematically optimize DRL backdoor triggers along three critical axes, i.e., temporal, spatial, and magnitude. Specifically, we first introduce a performance-aware adaptive freezing mechanism for injection timing. Then, we formulate dimension selection as a cooperative game, utilizing Shapley value analysis to identify the most influential state variable for the injection dimension. Furthermore, we propose a gradient-based adversarial procedure to optimize the injection magnitude under environment constraints. Evaluations on three mainstream DRL algorithms and nine benchmark tasks show that TooBadRL significantly improves attack success rates, while ensuring minimal degradation of normal task performance. These results highlight the previously underappreciated importance of principled trigger optimization in DRL backdoor attacks. The source code of TooBadRL can be found at https://github.com/S3IC-Lab/TooBadRL.

Paperid: 30, https://arxiv.org/pdf/2506.09443.pdf GitHub

Abstract:
Large Language Models (LLMs) have demonstrated remarkable intelligence across various tasks, which has inspired the development and widespread adoption of LLM-as-a-Judge systems for automated model testing, such as red teaming and benchmarking. However, these systems are susceptible to adversarial attacks that can manipulate evaluation outcomes, raising concerns about their robustness and, consequently, their trustworthiness. Existing evaluation methods adopted by LLM-based judges are often piecemeal and lack a unified framework for comprehensive assessment. Furthermore, prompt template and model selections for improving judge robustness have been rarely explored, and their performance in real-world settings remains largely unverified. To address these gaps, we introduce RobustJudge, a fully automated and scalable framework designed to systematically evaluate the robustness of LLM-as-a-Judge systems. RobustJudge investigates the impact of attack methods and defense strategies (RQ1), explores the influence of prompt template and model selection (RQ2), and assesses the robustness of real-world LLM-as-a-Judge applications (RQ3).Our main findings are: (1) LLM-as-a-Judge systems are still vulnerable to a range of adversarial attacks, including Combined Attack and PAIR, while defense mechanisms such as Re-tokenization and LLM-based Detectors offer improved protection; (2) Robustness is highly sensitive to the choice of prompt template and judge models. Our proposed prompt template optimization method can improve robustness, and JudgeLM-13B demonstrates strong performance as a robust open-source judge; (3) Applying RobustJudge to Alibaba's PAI platform reveals previously unreported vulnerabilities. The source code of RobustJudge is provided at https://github.com/S3IC-Lab/RobustJudge.

Paperid: 31, https://arxiv.org/pdf/2506.09363.pdf GitHub

Abstract:
Diffusion models (DMs) have achieved significant progress in text-to-image generation. However, the inevitable inclusion of sensitive information during pre-training poses safety risks, such as unsafe content generation and copyright infringement. Concept erasing finetunes weights to unlearn undesirable concepts, and has emerged as a promising solution. However, existing methods treat unsafe concept as a fixed word and repeatedly erase it, trapping DMs in ``word concept abyss'', which prevents generalized concept-related erasing. To escape this abyss, we introduce semantic-augment erasing which transforms concept word erasure into concept domain erasure by the cyclic self-check and self-erasure. It efficiently explores and unlearns the boundary representation of concept domain through semantic spatial relationships between original and training DMs, without requiring additional preprocessed data. Meanwhile, to mitigate the retention degradation of irrelevant concepts while erasing unsafe concepts, we further propose the global-local collaborative retention mechanism that combines global semantic relationship alignment with local predicted noise preservation, effectively expanding the retentive receptive field for irrelevant concepts. We name our method SAGE, and extensive experiments demonstrate the comprehensive superiority of SAGE compared with other methods in the safe generation of DMs. The code and weights will be open-sourced at https://github.com/KevinLight831/SAGE.

Paperid: 32, https://arxiv.org/pdf/2506.09353.pdf GitHub

Abstract:
Large Vision-Language Models (LVLMs) have achieved impressive progress across various applications but remain vulnerable to malicious queries that exploit the visual modality. Existing alignment approaches typically fail to resist malicious queries while preserving utility on benign ones effectively. To address these challenges, we propose Deep Aligned Visual Safety Prompt (DAVSP), which is built upon two key innovations. First, we introduce the Visual Safety Prompt, which appends a trainable padding region around the input image. It preserves visual features and expands the optimization space. Second, we propose Deep Alignment, a novel approach to train the visual safety prompt through supervision in the model's activation space. It enhances the inherent ability of LVLMs to perceive malicious queries, achieving deeper alignment than prior works. Extensive experiments across five benchmarks on two representative LVLMs demonstrate that DAVSP effectively resists malicious queries while preserving benign input utility. Furthermore, DAVSP exhibits great cross-model generation ability. Ablation studies further reveal that both the Visual Safety Prompt and Deep Alignment are essential components, jointly contributing to its overall effectiveness. The code is publicly available at https://github.com/zhangyitonggg/DAVSP.

Paperid: 33, https://arxiv.org/pdf/2506.08347.pdf GitHub

Abstract:
Learning with relational and network-structured data is increasingly vital in sensitive domains where protecting the privacy of individual entities is paramount. Differential Privacy (DP) offers a principled approach for quantifying privacy risks, with DP-SGD emerging as a standard mechanism for private model training. However, directly applying DP-SGD to relational learning is challenging due to two key factors: (i) entities often participate in multiple relations, resulting in high and difficult-to-control sensitivity; and (ii) relational learning typically involves multi-stage, potentially coupled (interdependent) sampling procedures that make standard privacy amplification analyses inapplicable. This work presents a principled framework for relational learning with formal entity-level DP guarantees. We provide a rigorous sensitivity analysis and introduce an adaptive gradient clipping scheme that modulates clipping thresholds based on entity occurrence frequency. We also extend the privacy amplification results to a tractable subclass of coupled sampling, where the dependence arises only through sample sizes. These contributions lead to a tailored DP-SGD variant for relational data with provable privacy guarantees. Experiments on fine-tuning text encoders over text-attributed network-structured relational data demonstrate the strong utility-privacy trade-offs of our approach. Our code is available at https://github.com/Graph-COM/Node_DP.

Paperid: 34, https://arxiv.org/pdf/2506.07022.pdf GitHub

Abstract:
As LLMs are increasingly deployed in real-world applications, ensuring their ability to refuse malicious prompts, especially jailbreak attacks, is essential for safe and reliable use. Recently, activation steering has emerged as an effective approach for enhancing LLM safety by adding a refusal direction vector to internal activations of LLMs during inference, which will further induce the refusal behaviors of LLMs. However, indiscriminately applying activation steering fundamentally suffers from the trade-off between safety and utility, since the same steering vector can also lead to over-refusal and degraded performance on benign prompts. Although prior efforts, such as vector calibration and conditional steering, have attempted to mitigate this trade-off, their lack of theoretical grounding limits their robustness and effectiveness. To better address the trade-off between safety and utility, we present a theoretically grounded and empirically effective activation steering method called AlphaSteer. Specifically, it considers activation steering as a learnable process with two principled learning objectives: utility preservation and safety enhancement. For utility preservation, it learns to construct a nearly zero vector for steering benign data, with the null-space constraints. For safety enhancement, it learns to construct a refusal direction vector for steering malicious data, with the help of linear regression. Experiments across multiple jailbreak attacks and utility benchmarks demonstrate the effectiveness of AlphaSteer, which significantly improves the safety of LLMs without compromising general capabilities. Our codes are available at https://github.com/AlphaLab-USTC/AlphaSteer.

Paperid: 35, https://arxiv.org/pdf/2506.06985.pdf GitHub

Abstract:
We address the problem of machine unlearning, where the goal is to remove the influence of specific training data from a model upon request, motivated by privacy concerns and regulatory requirements such as the "right to be forgotten." Unfortunately, existing methods rely on restrictive assumptions or lack formal guarantees. To this end, we propose a novel method for certified machine unlearning, leveraging the connection between unlearning and privacy amplification by stochastic post-processing. Our method uses noisy fine-tuning on the retain data, i.e., data that does not need to be removed, to ensure provable unlearning guarantees. This approach requires no assumptions about the underlying loss function, making it broadly applicable across diverse settings. We analyze the theoretical trade-offs in efficiency and accuracy and demonstrate empirically that our method not only achieves formal unlearning guarantees but also performs effectively in practice, outperforming existing baselines. Our code is available at https://github.com/stair-lab/certified-unlearning-neural-networks-icml-2025

Paperid: 36, https://arxiv.org/pdf/2506.06694.pdf GitHub

Abstract:
Foundation models have revolutionized fields such as natural language processing and computer vision by enabling general-purpose learning across diverse tasks and datasets. However, building analogous models for human mobility remains challenging due to the privacy-sensitive nature of mobility data and the resulting data silos across institutions. To bridge this gap, we propose MoveGCL, a scalable and privacy-preserving framework for training mobility foundation models via generative continual learning. Without sharing raw data, MoveGCL enables decentralized and progressive model evolution by replaying synthetic trajectories generated from a frozen teacher model, and reinforces knowledge retention through a tailored distillation strategy that mitigates catastrophic forgetting. To address the heterogeneity of mobility patterns, MoveGCL incorporates a Mixture-of-Experts Transformer with a mobility-aware expert routing mechanism, and employs a layer-wise progressive adaptation strategy to stabilize continual updates. Experiments on six real-world urban datasets demonstrate that MoveGCL achieves performance comparable to joint training and significantly outperforms federated learning baselines, while offering strong privacy protection. MoveGCL marks a crucial step toward unlocking foundation models for mobility, offering a practical blueprint for open, scalable, and privacy-preserving model development in the era of foundation models. To facilitate reproducibility and future research, we have released the code and models at https://github.com/tsinghua-fib-lab/MoveGCL.

Paperid: 37, https://arxiv.org/pdf/2506.06444.pdf GitHub GitHub

Abstract:
Existing safety assurance research has primarily focused on training-phase alignment to instill safe behaviors into LLMs. However, recent studies have exposed these methods' susceptibility to diverse jailbreak attacks. Concurrently, inference scaling has significantly advanced LLM reasoning capabilities but remains unexplored in the context of safety assurance. Addressing this gap, our work pioneers inference scaling for robust and effective LLM safety against emerging threats. We reveal that conventional inference scaling techniques, despite their success in reasoning tasks, perform poorly in safety contexts, even falling short of basic approaches like Best-of-N Sampling. We attribute this inefficiency to a newly identified challenge, the exploration--efficiency dilemma, arising from the high computational overhead associated with frequent process reward model (PRM) evaluations. To overcome this dilemma, we propose SAFFRON, a novel inference scaling paradigm tailored explicitly for safety assurance. Central to our approach is the introduction of a multifurcation reward model (MRM) that significantly reduces the required number of reward model evaluations. To operationalize this paradigm, we further propose: (i) a partial supervision training objective for MRM, (ii) a conservative exploration constraint to prevent out-of-distribution explorations, and (iii) a Trie-based key--value caching strategy that facilitates cache sharing across sequences during tree search. Extensive experiments validate the effectiveness of our method. Additionally, we publicly release our trained multifurcation reward model (Saffron-1) and the accompanying token-level safety reward dataset (Safety4M) to accelerate future research in LLM safety. Our code, model, and data are publicly available at https://github.com/q-rz/saffron , and our project homepage is at https://q-rz.github.io/p/saffron .

Paperid: 38, https://arxiv.org/pdf/2506.06409.pdf GitHub

Abstract:
Large language model (LLM) watermarks enable authentication of text provenance, curb misuse of machine-generated text, and promote trust in AI systems. Current watermarks operate by changing the next-token predictions output by an LLM. The updated (i.e., watermarked) predictions depend on random side information produced, for example, by hashing previously generated tokens. LLM watermarking is particularly challenging in low-entropy generation tasks - such as coding - where next-token predictions are near-deterministic. In this paper, we propose an optimization framework for watermark design. Our goal is to understand how to most effectively use random side information in order to maximize the likelihood of watermark detection and minimize the distortion of generated text. Our analysis informs the design of two new watermarks: HeavyWater and SimplexWater. Both watermarks are tunable, gracefully trading-off between detection accuracy and text distortion. They can also be applied to any LLM and are agnostic to side information generation. We examine the performance of HeavyWater and SimplexWater through several benchmarks, demonstrating that they can achieve high watermark detection accuracy with minimal compromise of text generation quality, particularly in the low-entropy regime. Our theoretical analysis also reveals surprising new connections between LLM watermarking and coding theory. The code implementation can be found in https://github.com/DorTsur/HeavyWater_SimplexWater

Paperid: 39, https://arxiv.org/pdf/2506.06151.pdf GitHub

Abstract:
Retrieval-Augmented Generation (RAG) systems enhance Large Language Models (LLMs) by retrieving relevant documents from external corpora before generating responses. This approach significantly expands LLM capabilities by leveraging vast, up-to-date external knowledge. However, this reliance on external knowledge makes RAG systems vulnerable to corpus poisoning attacks that manipulate generated outputs via poisoned document injection. Existing poisoning attack strategies typically treat the retrieval and generation stages as disjointed, limiting their effectiveness. We propose Joint-GCG, the first framework to unify gradient-based attacks across both retriever and generator models through three innovations: (1) Cross-Vocabulary Projection for aligning embedding spaces, (2) Gradient Tokenization Alignment for synchronizing token-level gradient signals, and (3) Adaptive Weighted Fusion for dynamically balancing attacking objectives. Evaluations demonstrate that Joint-GCG achieves at most 25% and an average of 5% higher attack success rate than previous methods across multiple retrievers and generators. While optimized under a white-box assumption, the generated poisons show unprecedented transferability to unseen models. Joint-GCG's innovative unification of gradient-based attacks across retrieval and generation stages fundamentally reshapes our understanding of vulnerabilities within RAG systems. Our code is available at https://github.com/NicerWang/Joint-GCG.

Paperid: 40, https://arxiv.org/pdf/2506.06112.pdf GitHub

Abstract:
Growing concerns over data privacy and security highlight the importance of machine unlearning--removing specific data influences from trained models without full retraining. Techniques like Membership Inference Attacks (MIAs) are widely used to externally assess successful unlearning. However, existing methods face two key limitations: (1) maximizing MIA effectiveness (e.g., via online attacks) requires prohibitive computational resources, often exceeding retraining costs; (2) MIAs, designed for binary inclusion tests, struggle to capture granular changes in approximate unlearning. To address these challenges, we propose the Interpolated Approximate Measurement (IAM), a framework natively designed for unlearning inference. IAM quantifies sample-level unlearning completeness by interpolating the model's generalization-fitting behavior gap on queried samples. IAM achieves strong performance in binary inclusion tests for exact unlearning and high correlation for approximate unlearning--scalable to LLMs using just one pre-trained shadow model. We theoretically analyze how IAM's scoring mechanism maintains performance efficiently. We then apply IAM to recent approximate unlearning algorithms, revealing general risks of both over-unlearning and under-unlearning, underscoring the need for stronger safeguards in approximate unlearning systems. The code is available at https://github.com/Happy2Git/Unlearning_Inference_IAM.

Paperid: 41, https://arxiv.org/pdf/2506.05867.pdf GitHub

Abstract:
Model stealing poses a significant security risk in machine learning by enabling attackers to replicate a black-box model without access to its training data, thus jeopardizing intellectual property and exposing sensitive information. Recent methods that use pre-trained diffusion models for data synthesis improve efficiency and performance but rely heavily on manually crafted prompts, limiting automation and scalability, especially for attackers with little expertise. To assess the risks posed by open-source pre-trained models, we propose a more realistic threat model that eliminates the need for prompt design skills or knowledge of class names. In this context, we introduce Stealix, the first approach to perform model stealing without predefined prompts. Stealix uses two open-source pre-trained models to infer the victim model's data distribution, and iteratively refines prompts through a genetic algorithm, progressively improving the precision and diversity of synthetic images. Our experimental results demonstrate that Stealix significantly outperforms other methods, even those with access to class names or fine-grained prompts, while operating under the same query budget. These findings highlight the scalability of our approach and suggest that the risks posed by pre-trained generative models in model stealing may be greater than previously recognized.

Paperid: 42, https://arxiv.org/pdf/2506.05743.pdf GitHub GitHub

Abstract:
With the rapid advancement of deep learning technology, pre-trained encoder models have demonstrated exceptional feature extraction capabilities, playing a pivotal role in the research and application of deep learning. However, their widespread use has raised significant concerns about the risk of training data privacy leakage. This paper systematically investigates the privacy threats posed by membership inference attacks (MIAs) targeting encoder models, focusing on contrastive learning frameworks. Through experimental analysis, we reveal the significant impact of model architecture complexity on membership privacy leakage: As more advanced encoder frameworks improve feature-extraction performance, they simultaneously exacerbate privacy-leakage risks. Furthermore, this paper proposes a novel membership inference attack method based on the p-norm of feature vectors, termed the Embedding Lp-Norm Likelihood Attack (LpLA). This method infers membership status, by leveraging the statistical distribution characteristics of the p-norm of feature vectors. Experimental results across multiple datasets and model architectures demonstrate that LpLA outperforms existing methods in attack performance and robustness, particularly under limited attack knowledge and query volumes. This study not only uncovers the potential risks of privacy leakage in contrastive learning frameworks, but also provides a practical basis for privacy protection research in encoder models. We hope that this work will draw greater attention to the privacy risks associated with self-supervised learning models and shed light on the importance of a balance between model utility and training data privacy. Our code is publicly available at: https://github.com/SeroneySun/LpLA_code.

Paperid: 43, https://arxiv.org/pdf/2506.05640.pdf GitHub

Abstract:
Federated Learning (FL) offers a decentralized framework for training and fine-tuning Large Language Models (LLMs) by leveraging computational resources across organizations while keeping sensitive data on local devices. It addresses privacy and security concerns while navigating challenges associated with the substantial computational demands of LLMs, which can be prohibitive for small and medium-sized organizations. FL supports the development of task-specific LLMs for cross-silo applications through fine-tuning but remains vulnerable to inference attacks, such as membership inference and gradient inversion, which threaten data privacy. Prior studies have utilized Differential Privacy (DP) in LLM fine-tuning, which, despite being effective at preserving privacy, can degrade model performance. To overcome these challenges, we propose a novel method, FedShield-LLM, that uses pruning with Fully Homomorphic Encryption (FHE) for Low-Rank Adaptation (LoRA) parameters, enabling secure computations on encrypted model updates while mitigating the attack surface by deactivating less important LoRA parameters. Furthermore, optimized federated algorithms for cross-silo environments enhance scalability and efficiency. Parameter-efficient fine-tuning techniques like LoRA substantially reduce computational and communication overhead, making FL feasible for resource-constrained clients. Experimental results show that the proposed method outperforms existing methods while maintaining robust privacy protection, enabling organizations to collaboratively train secure and efficient LLMs. The code and data are available at, https://github.com/solidlabnetwork/fedshield-llm

Paperid: 44, https://arxiv.org/pdf/2506.05407.pdf GitHub

Abstract:
The rise of generative APIs has fueled interest in privacy-preserving synthetic data generation. While the Private Evolution (PE) algorithm generates Differential Privacy (DP) synthetic images using diffusion model APIs, it struggles with few-shot private data due to the limitations of its DP-protected similarity voting approach. In practice, the few-shot private data challenge is particularly prevalent in specialized domains like healthcare and industry. To address this challenge, we propose a novel API-assisted algorithm, Private Contrastive Evolution (PCEvolve), which iteratively mines inherent inter-class contrastive relationships in few-shot private data beyond individual data points and seamlessly integrates them into an adapted Exponential Mechanism (EM) to optimize DP's utility in an evolution loop. We conduct extensive experiments on four specialized datasets, demonstrating that PCEvolve outperforms PE and other API-assisted baselines. These results highlight the potential of leveraging API access with private data for quality evaluation, enabling the generation of high-quality DP synthetic images and paving the way for more accessible and effective privacy-preserving generative API applications. Our code is available at https://github.com/TsingZ0/PCEvolve.

Paperid: 45, https://arxiv.org/pdf/2506.05394.pdf GitHub

Abstract:
Foundation models represent the most prominent and recent paradigm shift in artificial intelligence. Foundation models are large models, trained on broad data that deliver high accuracy in many downstream tasks, often without fine-tuning. For this reason, models such as CLIP , DINO or Vision Transfomers (ViT), are becoming the bedrock of many industrial AI-powered applications. However, the reliance on pre-trained foundation models also introduces significant security concerns, as these models are vulnerable to adversarial attacks. Such attacks involve deliberately crafted inputs designed to deceive AI systems, jeopardizing their reliability. This paper studies the vulnerabilities of vision foundation models, focusing specifically on CLIP and ViTs, and explores the transferability of adversarial attacks to downstream tasks. We introduce a novel attack, targeting the structure of transformer-based architectures in a task-agnostic fashion. We demonstrate the effectiveness of our attack on several downstream tasks: classification, captioning, image/text retrieval, segmentation and depth estimation. Code available at:https://github.com/HondamunigePrasannaSilva/attack-attention

Paperid: 46, https://arxiv.org/pdf/2506.05032.pdf GitHub

Abstract:
Adversarial training (AT) has been considered one of the most effective methods for making deep neural networks robust against adversarial attacks, while the training mechanisms and dynamics of AT remain open research problems. In this paper, we present a novel perspective on studying AT through the lens of class-wise feature attribution. Specifically, we identify the impact of a key family of features on AT that are shared by multiple classes, which we call cross-class features. These features are typically useful for robust classification, which we offer theoretical evidence to illustrate through a synthetic data model. Through systematic studies across multiple model architectures and settings, we find that during the initial stage of AT, the model tends to learn more cross-class features until the best robustness checkpoint. As AT further squeezes the training robust loss and causes robust overfitting, the model tends to make decisions based on more class-specific features. Based on these discoveries, we further provide a unified view of two existing properties of AT, including the advantage of soft-label training and robust overfitting. Overall, these insights refine the current understanding of AT mechanisms and provide new perspectives on studying them. Our code is available at https://github.com/PKU-ML/Cross-Class-Features-AT.

Paperid: 47, https://arxiv.org/pdf/2506.04462.pdf GitHub

Abstract:
Watermarking techniques for large language models (LLMs) can significantly impact output quality, yet their effects on truthfulness, safety, and helpfulness remain critically underexamined. This paper presents a systematic analysis of how two popular watermarking approaches-Gumbel and KGW-affect these core alignment properties across four aligned LLMs. Our experiments reveal two distinct degradation patterns: guard attenuation, where enhanced helpfulness undermines model safety, and guard amplification, where excessive caution reduces model helpfulness. These patterns emerge from watermark-induced shifts in token distribution, surfacing the fundamental tension that exists between alignment objectives. To mitigate these degradations, we propose Alignment Resampling (AR), an inference-time sampling method that uses an external reward model to restore alignment. We establish a theoretical lower bound on the improvement in expected reward score as the sample size is increased and empirically demonstrate that sampling just 2-4 watermarked generations effectively recovers or surpasses baseline (unwatermarked) alignment scores. To overcome the limited response diversity of standard Gumbel watermarking, our modified implementation sacrifices strict distortion-freeness while maintaining robust detectability, ensuring compatibility with AR. Experimental results confirm that AR successfully recovers baseline alignment in both watermarking approaches, while maintaining strong watermark detectability. This work reveals the critical balance between watermark strength and model alignment, providing a simple inference-time solution to responsibly deploy watermarked LLMs in practice.

Paperid: 48, https://arxiv.org/pdf/2506.04453.pdf GitHub

Abstract:
Federated learning (FL) allows multiple data-owners to collaboratively train machine learning models by exchanging local gradients, while keeping their private data on-device. To simultaneously enhance privacy and training efficiency, recently parameter-efficient fine-tuning (PEFT) of large-scale pretrained models has gained substantial attention in FL. While keeping a pretrained (backbone) model frozen, each user fine-tunes only a few lightweight modules to be used in conjunction, to fit specific downstream applications. Accordingly, only the gradients with respect to these lightweight modules are shared with the server. In this work, we investigate how the privacy of the fine-tuning data of the users can be compromised via a malicious design of the pretrained model and trainable adapter modules. We demonstrate gradient inversion attacks on a popular PEFT mechanism, the adapter, which allow an attacker to reconstruct local data samples of a target user, using only the accessible adapter gradients. Via extensive experiments, we demonstrate that a large batch of fine-tuning images can be retrieved with high fidelity. Our attack highlights the need for privacy-preserving mechanisms for PEFT, while opening up several future directions. Our code is available at https://github.com/info-ucr/PEFTLeak.

Paperid: 49, https://arxiv.org/pdf/2506.04202.pdf GitHub GitHub

Abstract:
Long context large language models (LLMs) are deployed in many real-world applications such as RAG, agent, and broad LLM-integrated applications. Given an instruction and a long context (e.g., documents, PDF files, webpages), a long context LLM can generate an output grounded in the provided context, aiming to provide more accurate, up-to-date, and verifiable outputs while reducing hallucinations and unsupported claims. This raises a research question: how to pinpoint the texts (e.g., sentences, passages, or paragraphs) in the context that contribute most to or are responsible for the generated output by an LLM? This process, which we call context traceback, has various real-world applications, such as 1) debugging LLM-based systems, 2) conducting post-attack forensic analysis for attacks (e.g., prompt injection attack, knowledge corruption attacks) to an LLM, and 3) highlighting knowledge sources to enhance the trust of users towards outputs generated by LLMs. When applied to context traceback for long context LLMs, existing feature attribution methods such as Shapley have sub-optimal performance and/or incur a large computational cost. In this work, we develop TracLLM, the first generic context traceback framework tailored to long context LLMs. Our framework can improve the effectiveness and efficiency of existing feature attribution methods. To improve the efficiency, we develop an informed search based algorithm in TracLLM. We also develop contribution score ensemble/denoising techniques to improve the accuracy of TracLLM. Our evaluation results show TracLLM can effectively identify texts in a long context that lead to the output of an LLM. Our code and data are at: https://github.com/Wang-Yanting/TracLLM.

Paperid: 50, https://arxiv.org/pdf/2506.03651.pdf GitHub

Abstract:
The quantity and quality of vulnerability datasets are essential for developing deep learning solutions to vulnerability-related tasks. Due to the limited availability of vulnerabilities, a common approach to building such datasets is analyzing security patches in source code. However, existing security patches often suffer from inaccurate labels, insufficient contextual information, and undecidable patches that fail to clearly represent the root causes of vulnerabilities or their fixes. These issues introduce noise into the dataset, which can mislead detection models and undermine their effectiveness. To address these issues, we present mono, a novel LLM-powered framework that simulates human experts' reasoning process to construct reliable vulnerability datasets. mono introduces three key components to improve security patch datasets: (i) semantic-aware patch classification for precise vulnerability labeling, (ii) iterative contextual analysis for comprehensive code understanding, and (iii) systematic root cause analysis to identify and filter undecidable patches. Our comprehensive evaluation on the MegaVul benchmark demonstrates that mono can correct 31.0% of labeling errors, recover 89% of inter-procedural vulnerabilities, and reveals that 16.7% of CVEs contain undecidable patches. Furthermore, mono's enriched context representation improves existing models' vulnerability detection accuracy by 15%. We open source the framework mono and the dataset MonoLens in https://github.com/vul337/mono.

Paperid: 51, https://arxiv.org/pdf/2506.03614.pdf GitHub

Abstract:
One way to mitigate risks in vision-language models (VLMs) is to remove dangerous samples in their training data. However, such data moderation can be easily bypassed when harmful images are split into small, benign-looking patches, scattered across many training samples. VLMs may then learn to piece these fragments together during training and generate harmful responses at inference, either from full images or text references. For instance, if trained on image patches from a bloody scene paired with the descriptions "safe," VLMs may later describe, the full image or a text reference to the scene, as "safe." We define the core ability of VLMs enabling this attack as $\textit{visual stitching}$ -- the ability to integrate visual information spread across multiple training samples that share the same textual descriptions. In our work, we first demonstrate visual stitching abilities in common open-source VLMs on three datasets where each image is labeled with a unique synthetic ID: we split each $(\texttt{image}, \texttt{ID})$ pair into $\{(\texttt{patch}, \texttt{ID})\}$ pairs at different granularity for finetuning, and we find that tuned models can verbalize the correct IDs from full images or text reference. Building on this, we simulate the adversarial data poisoning scenario mentioned above by using patches from dangerous images and replacing IDs with text descriptions like ``safe'' or ``unsafe'', demonstrating how harmful content can evade moderation in patches and later be reconstructed through visual stitching, posing serious VLM safety risks. Code is available at https://github.com/ZHZisZZ/visual-stitching.

Paperid: 52, https://arxiv.org/pdf/2506.03196.pdf GitHub

Abstract:
Graph-based learning provides a powerful framework for modeling complex relational structures; however, its application within the domain of wireless security remains significantly underexplored. In this work, we introduce the first application of graph-based learning for jamming source localization, addressing the imminent threat of jamming attacks in wireless networks. Unlike geometric optimization techniques that struggle under environmental uncertainties and dense interference, we reformulate the localization as an inductive graph regression task. Our approach integrates structured node representations that encode local and global signal aggregation, ensuring spatial coherence and adaptive signal fusion. To enhance robustness, we incorporate an attention-based \ac{GNN} that adaptively refines neighborhood influence and introduces a confidence-guided estimation mechanism that dynamically balances learned predictions with domain-informed priors. We evaluate our approach under complex \ac{RF} environments with various sampling densities, network topologies, jammer characteristics, and signal propagation conditions, conducting comprehensive ablation studies on graph construction, feature selection, and pooling strategies. Results demonstrate that our novel graph-based learning framework significantly outperforms established localization baselines, particularly in challenging scenarios with sparse and obfuscated signal information. Our code is available at https://github.com/tiiuae/gnn-jamming-source-localization.

Paperid: 53, https://arxiv.org/pdf/2506.02761.pdf GitHub

Abstract:
With the surge and widespread application of image generation models, data privacy and content safety have become major concerns and attracted great attention from users, service providers, and policymakers. Machine unlearning (MU) is recognized as a cost-effective and promising means to address these challenges. Despite some advancements, image generation model unlearning (IGMU) still faces remarkable gaps in practice, e.g., unclear task discrimination and unlearning guidelines, lack of an effective evaluation framework, and unreliable evaluation metrics. These can hinder the understanding of unlearning mechanisms and the design of practical unlearning algorithms. We perform exhaustive assessments over existing state-of-the-art unlearning algorithms and evaluation standards, and discover several critical flaws and challenges in IGMU tasks. Driven by these limitations, we make several core contributions, to facilitate the comprehensive understanding, standardized categorization, and reliable evaluation of IGMU. Specifically, (1) We design CatIGMU, a novel hierarchical task categorization framework. It provides detailed implementation guidance for IGMU, assisting in the design of unlearning algorithms and the construction of testbeds. (2) We introduce EvalIGMU, a comprehensive evaluation framework. It includes reliable quantitative metrics across five critical aspects. (3) We construct DataIGM, a high-quality unlearning dataset, which can be used for extensive evaluations of IGMU, training content detectors for judgment, and benchmarking the state-of-the-art unlearning algorithms. With EvalIGMU and DataIGM, we discover that most existing IGMU algorithms cannot handle the unlearning well across different evaluation dimensions, especially for preservation and robustness. Code and models are available at https://github.com/ryliu68/IGMU.

Paperid: 54, https://arxiv.org/pdf/2506.02456.pdf GitHub

Abstract:
Computer-Use Agents (CUAs) with full system access enable powerful task automation but pose significant security and privacy risks due to their ability to manipulate files, access user data, and execute arbitrary commands. While prior work has focused on browser-based agents and HTML-level attacks, the vulnerabilities of CUAs remain underexplored. In this paper, we investigate Visual Prompt Injection (VPI) attacks, where malicious instructions are visually embedded within rendered user interfaces, and examine their impact on both CUAs and Browser-Use Agents (BUAs). We propose VPI-Bench, a benchmark of 306 test cases across five widely used platforms, to evaluate agent robustness under VPI threats. Each test case is a variant of a web platform, designed to be interactive, deployed in a realistic environment, and containing a visually embedded malicious prompt. Our empirical study shows that current CUAs and BUAs can be deceived at rates of up to 51% and 100%, respectively, on certain platforms. The experimental results also indicate that system prompt defenses offer only limited improvements. These findings highlight the need for robust, context-aware defenses to ensure the safe deployment of multimodal AI agents in real-world environments. The code and dataset are available at: https://github.com/cua-framework/agents

Paperid: 55, https://arxiv.org/pdf/2506.02362.pdf GitHub

Abstract:
Model extraction attacks aim to replicate the functionality of a black-box model through query access, threatening the intellectual property (IP) of machine-learning-as-a-service (MLaaS) providers. Defending against such attacks is challenging, as it must balance efficiency, robustness, and utility preservation in the real-world scenario. Despite the recent advances, most existing defenses presume that attacker queries have out-of-distribution (OOD) samples, enabling them to detect and disrupt suspicious inputs. However, this assumption is increasingly unreliable, as modern models are trained on diverse datasets and attackers often operate under limited query budgets. As a result, the effectiveness of these defenses is significantly compromised in realistic deployment scenarios. To address this gap, we propose MISLEADER (enseMbles of dIStiLled modEls Against moDel ExtRaction), a novel defense strategy that does not rely on OOD assumptions. MISLEADER formulates model protection as a bilevel optimization problem that simultaneously preserves predictive fidelity on benign inputs and reduces extractability by potential clone models. Our framework combines data augmentation to simulate attacker queries with an ensemble of heterogeneous distilled models to improve robustness and diversity. We further provide a tractable approximation algorithm and derive theoretical error bounds to characterize defense effectiveness. Extensive experiments across various settings validate the utility-preserving and extraction-resistant properties of our proposed defense strategy. Our code is available at https://github.com/LabRAI/MISLEADER.

Paperid: 56, https://arxiv.org/pdf/2506.01790.pdf GitHub

Abstract:
We study how training data contributes to the emergence of toxic behaviors in large-language models. Most prior work on reducing model toxicity adopts $reactive$ approaches, such as fine-tuning pre-trained (and potentially toxic) models to align them with human values. In contrast, we propose a $proactive$ approach$-$IF-Guide$-$which leverages influence functions to identify harmful tokens within any training data and suppress their impact during training. To this end, we first show that standard influence functions are ineffective at discovering harmful training records. We then present a novel adaptation that measures token-level attributions from training data to model toxicity, along with techniques for selecting toxic training documents and a learning objective that can be integrated into both pre-training and fine-tuning. Moreover, IF-Guide does not rely on human-preference data, which is typically required by existing alignment methods. In evaluation, we demonstrate that IF-Guide substantially reduces both explicit and implicit toxicity$-$by up to 10$\times$ compared to uncensored models, and up to 3$\times$ compared to baseline alignment methods, e.g., DPO and RAD$-$across both pre-training and fine-tuning scenarios. IF-Guide is computationally efficient: a billion-parameter model is $not$ $necessary$ for computing influence scores; a million-parameter model$-$with 7.5$\times$ fewer parameters$-$can effectively serve as a proxy for identifying harmful data. Our code is publicly available at: https://github.com/ztcoalson/IF-Guide

Paperid: 57, https://arxiv.org/pdf/2506.01770.pdf GitHub

Abstract:
Large Language Models (LLMs) have achieved significant success in various tasks, yet concerns about their safety and security have emerged. In particular, they pose risks in generating harmful content and vulnerability to jailbreaking attacks. To analyze and monitor machine learning models, model-based analysis has demonstrated notable potential in stateful deep neural networks, yet suffers from scalability issues when extending to LLMs due to their vast feature spaces. In this paper, we propose ReGA, a model-based analysis framework with representation-guided abstraction, to safeguard LLMs against harmful prompts and generations. By leveraging safety-critical representations, which are low-dimensional directions emerging in hidden states that indicate safety-related concepts, ReGA effectively addresses the scalability issue when constructing the abstract model for safety modeling. Our comprehensive evaluation shows that ReGA performs sufficiently well in distinguishing between safe and harmful inputs, achieving an AUROC of 0.975 at the prompt level and 0.985 at the conversation level. Additionally, ReGA exhibits robustness to real-world attacks and generalization across different safety perspectives, outperforming existing safeguard paradigms in terms of interpretability and scalability. Overall, ReGA serves as an efficient and scalable solution to enhance LLM safety by integrating representation engineering with model-based abstraction, paving the way for new paradigms to utilize software insights for AI safety. Our code is available at https://github.com/weizeming/ReGA.

Paperid: 58, https://arxiv.org/pdf/2506.01591.pdf GitHub

Abstract:
Advances in talking-head animation based on Latent Diffusion Models (LDM) enable the creation of highly realistic, synchronized videos. These fabricated videos are indistinguishable from real ones, increasing the risk of potential misuse for scams, political manipulation, and misinformation. Hence, addressing these ethical concerns has become a pressing issue in AI security. Recent proactive defense studies focused on countering LDM-based models by adding perturbations to portraits. However, these methods are ineffective at protecting reference portraits from advanced image-to-video animation. The limitations are twofold: 1) they fail to prevent images from being manipulated by audio signals, and 2) diffusion-based purification techniques can effectively eliminate protective perturbations. To address these challenges, we propose Silencer, a two-stage method designed to proactively protect the privacy of portraits. First, a nullifying loss is proposed to ignore audio control in talking-head generation. Second, we apply anti-purification loss in LDM to optimize the inverted latent feature to generate robust perturbations. Extensive experiments demonstrate the effectiveness of Silencer in proactively protecting portrait privacy. We hope this work will raise awareness among the AI security community regarding critical ethical issues related to talking-head generation techniques. Code: https://github.com/yuangan/Silencer.

Paperid: 59, https://arxiv.org/pdf/2506.01446.pdf GitHub

Abstract:
Policies are designed to distinguish between correct and incorrect actions; they are types. But badly typed actions may cause not compile errors, but financial and reputational harm We demonstrate how even the most complex ABAC policies can be expressed as types in dependently typed languages such as Agda and Lean, providing a single framework to express, analyze, and implement policies. We then go head-to-head with Rego, the popular and powerful open-source ABAC policy language. We show the superior safety that comes with a powerful type system and built-in proof assistant. In passing, we discuss various access control models, sketch how to integrate in a future when attributes are distributed and signed (as discussed at the W3C), and show how policies can be communicated using just the syntax of the language. Our examples are in Agda.

Paperid: 60, https://arxiv.org/pdf/2506.00978.pdf GitHub

Abstract:
Projector-based adversarial attack aims to project carefully designed light patterns (i.e., adversarial projections) onto scenes to deceive deep image classifiers. It has potential applications in privacy protection and the development of more robust classifiers. However, existing approaches primarily focus on individual classifiers and fixed camera poses, often neglecting the complexities of multi-classifier systems and scenarios with varying camera poses. This limitation reduces their effectiveness when introducing new classifiers or camera poses. In this paper, we introduce Classifier-Agnostic Projector-Based Adversarial Attack (CAPAA) to address these issues. First, we develop a novel classifier-agnostic adversarial loss and optimization framework that aggregates adversarial and stealthiness loss gradients from multiple classifiers. Then, we propose an attention-based gradient weighting mechanism that concentrates perturbations on regions of high classification activation, thereby improving the robustness of adversarial projections when applied to scenes with varying camera poses. Our extensive experimental evaluations demonstrate that CAPAA achieves both a higher attack success rate and greater stealthiness compared to existing baselines. Codes are available at: https://github.com/ZhanLiQxQ/CAPAA.

Paperid: 61, https://arxiv.org/pdf/2506.00652.pdf GitHub GitHub

Abstract:
The rapid development of Artificial Intelligence Generated Content (AIGC) has led to significant progress in video generation but also raises serious concerns about intellectual property protection and reliable content tracing. Watermarking is a widely adopted solution to this issue, but existing methods for video generation mainly follow a post-generation paradigm, which introduces additional computational overhead and often fails to effectively balance the trade-off between video quality and watermark extraction. To address these issues, we propose Video Signature (VIDSIG), an in-generation watermarking method for latent video diffusion models, which enables implicit and adaptive watermark integration during generation. Specifically, we achieve this by partially fine-tuning the latent decoder, where Perturbation-Aware Suppression (PAS) pre-identifies and freezes perceptually sensitive layers to preserve visual quality. Beyond spatial fidelity, we further enhance temporal consistency by introducing a lightweight Temporal Alignment module that guides the decoder to generate coherent frame sequences during fine-tuning. Experimental results show that VIDSIG achieves the best overall performance in watermark extraction, visual quality, and generation efficiency. It also demonstrates strong robustness against both spatial and temporal tampering, highlighting its practicality in real-world scenarios. Our code is available at \href{https://github.com/hardenyu21/Video-Signature}{here}

Paperid: 62, https://arxiv.org/pdf/2506.00419.pdf GitHub

Abstract:
LLM generated code often contains security issues. We address two key challenges in improving secure code generation. First, obtaining high quality training data covering a broad set of security issues is critical. To address this, we introduce a method for distilling a preference dataset of insecure and secure code pairs from frontier LLMs, along with a security reasoning that explains the issues and the fix. The key idea here is to make use of security knowledge sources to devise a systematic prompting strategy that ensures broad coverage. Second, aligning models to secure code requires focusing on localized regions of code. Direct preference optimization methods, like SimPO, are not designed to handle these localized differences and turn out to be ineffective. We address this with a new localized preference optimization algorithm that masks the security related tokens in both the winning (secure) and losing (insecure) responses. To prevent loss in code quality, we also add a regularizer. Evaluations show that both training on our dataset, DiSCo, and the new preference optimization algorithm, LPO, yield substantial reductions in code insecurity while also improving overall code quality. Code and dataset are available at https://github.com/StonyBrookNLP/disco-lpo.

Paperid: 63, https://arxiv.org/pdf/2506.00322.pdf GitHub

Abstract:
We propose dpmm, an open-source library for synthetic data generation with Differentially Private (DP) guarantees. It includes three popular marginal models -- PrivBayes, MST, and AIM -- that achieve superior utility and offer richer functionality compared to alternative implementations. Additionally, we adopt best practices to provide end-to-end DP guarantees and address well-known DP-related vulnerabilities. Our goal is to accommodate a wide audience with easy-to-install, highly customizable, and robust model implementations. Our codebase is available from https://github.com/sassoftware/dpmm.

Paperid: 64, https://arxiv.org/pdf/2505.24536.pdf GitHub

Abstract:
The pervasion of large-scale Deep Neural Networks (DNNs) and their enormous training costs make their intellectual property (IP) protection of paramount importance. Recently introduced passport-based methods attempt to steer DNN watermarking towards strengthening ownership verification against ambiguity attacks by modulating the affine parameters of normalization layers. Unfortunately, neither watermarking nor passport-based methods provide a holistic protection with robust ownership proof, high fidelity, active usage authorization and user traceability for offline access distributed models and multi-user Machine-Learning as a Service (MLaaS) cloud model. In this paper, we propose a Chameleon Hash-based Irreversible Passport (CHIP) protection framework that utilizes the cryptographic chameleon hash function to achieve all these goals. The collision-resistant property of chameleon hash allows for strong model ownership claim upon IP infringement and liable user traceability, while the trapdoor-collision property enables hashing of multiple user passports and licensee certificates to the same immutable signature to realize active usage control. Using the owner passport as an oracle, multiple user-specific triplets, each contains a passport-aware user model, a user passport, and a licensee certificate can be created for secure offline distribution. The watermarked master model can also be deployed for MLaaS with usage permission verifiable by the provision of any trapdoor-colliding user passports. CHIP is extensively evaluated on four datasets and two architectures to demonstrate its protection versatility and robustness. Our code is released at https://github.com/Dshm212/CHIP.

Paperid: 65, https://arxiv.org/pdf/2505.24486.pdf GitHub

Abstract:
The performance of existing audio deepfake detection frameworks degrades when confronted with new deepfake attacks. Rehearsal-based continual learning (CL), which updates models using a limited set of old data samples, helps preserve prior knowledge while incorporating new information. However, existing rehearsal techniques don't effectively capture the diversity of audio characteristics, introducing bias and increasing the risk of forgetting. To address this challenge, we propose Rehearsal with Auxiliary-Informed Sampling (RAIS), a rehearsal-based CL approach for audio deepfake detection. RAIS employs a label generation network to produce auxiliary labels, guiding diverse sample selection for the memory buffer. Extensive experiments show RAIS outperforms state-of-the-art methods, achieving an average Equal Error Rate (EER) of 1.953 % across five experiences. The code is available at: https://github.com/falihgoz/RAIS.

Paperid: 66, https://arxiv.org/pdf/2505.24267.pdf GitHub

Abstract:
We introduce MUSE, a watermarking algorithm for tabular generative models. Previous approaches typically leverage DDIM invertibility to watermark tabular diffusion models, but tabular diffusion models exhibit significantly poorer invertibility compared to other modalities, compromising performance. Simultaneously, tabular diffusion models require substantially less computation than other modalities, enabling a multi-sample selection approach to tabular generative model watermarking. MUSE embeds watermarks by generating multiple candidate samples and selecting one based on a specialized scoring function, without relying on model invertibility. Our theoretical analysis establishes the relationship between watermark detectability, candidate count, and dataset size, allowing precise calibration of watermarking strength. Extensive experiments demonstrate that MUSE achieves state-of-the-art watermark detectability and robustness against various attacks while maintaining data quality, and remains compatible with any tabular generative model supporting repeated sampling, effectively addressing key challenges in tabular data watermarking. Specifically, it reduces the distortion rates on fidelity metrics by 81-89%, while achieving a 1.0 TPR@0.1%FPR detection rate. Implementation of MUSE can be found at https://github.com/fangliancheng/MUSE.

Paperid: 67, https://arxiv.org/pdf/2505.23873.pdf GitHub

Abstract:
Knowledge graphs (KGs) are ubiquitous in numerous real-world applications, and watermarking facilitates protecting intellectual property and preventing potential harm from AI-generated content. Existing watermarking methods mainly focus on static plain text or image data, while they can hardly be applied to dynamic graphs due to spatial and temporal variations of structured data. This motivates us to propose KGMARK, the first graph watermarking framework that aims to generate robust, detectable, and transparent diffusion fingerprints for dynamic KG data. Specifically, we propose a novel clustering-based alignment method to adapt the watermark to spatial variations. Meanwhile, we present a redundant embedding strategy to harden the diffusion watermark against various attacks, facilitating the robustness of the watermark to the temporal variations. Additionally, we introduce a novel learnable mask matrix to improve the transparency of diffusion fingerprints. By doing so, our KGMARK properly tackles the variation challenges of structured data. Experiments on various public benchmarks show the effectiveness of our proposed KGMARK. Our code is available at https://github.com/phrara/kgmark.

Paperid: 68, https://arxiv.org/pdf/2505.23839.pdf GitHub

Abstract:
DNA, encoding genetic instructions for almost all living organisms, fuels groundbreaking advances in genomics and synthetic biology. Recently, DNA Foundation Models have achieved success in designing synthetic functional DNA sequences, even whole genomes, but their susceptibility to jailbreaking remains underexplored, leading to potential concern of generating harmful sequences such as pathogens or toxin-producing genes. In this paper, we introduce GeneBreaker, the first framework to systematically evaluate jailbreak vulnerabilities of DNA foundation models. GeneBreaker employs (1) an LLM agent with customized bioinformatic tools to design high-homology, non-pathogenic jailbreaking prompts, (2) beam search guided by PathoLM and log-probability heuristics to steer generation toward pathogen-like sequences, and (3) a BLAST-based evaluation pipeline against a curated Human Pathogen Database (JailbreakDNABench) to detect successful jailbreaks. Evaluated on our JailbreakDNABench, GeneBreaker successfully jailbreaks the latest Evo series models across 6 viral categories consistently (up to 60\% Attack Success Rate for Evo2-40B). Further case studies on SARS-CoV-2 spike protein and HIV-1 envelope protein demonstrate the sequence and structural fidelity of jailbreak output, while evolutionary modeling of SARS-CoV-2 underscores biosecurity risks. Our findings also reveal that scaling DNA foundation models amplifies dual-use risks, motivating enhanced safety alignment and tracing mechanisms. Our code is at https://github.com/zaixizhang/GeneBreaker.

Paperid: 69, https://arxiv.org/pdf/2505.23813.pdf GitHub

Abstract:
Federated Learning (FL) has emerged as a critical paradigm for enabling privacy-preserving machine learning, particularly in regulated sectors such as finance and healthcare. However, standard FL strategies often encounter significant operational challenges related to fault tolerance, system resilience against concurrent client and server failures, and the provision of robust, verifiable privacy guarantees essential for handling sensitive data. These deficiencies can lead to training disruptions, data loss, compromised model integrity, and non-compliance with data protection regulations (e.g., GDPR, CCPA). This paper introduces Differentially Private Resilient Temporal Federated Learning (DP-RTFL), an advanced FL framework designed to ensure training continuity, precise state recovery, and strong data privacy. DP-RTFL integrates local Differential Privacy (LDP) at the client level with resilient temporal state management and integrity verification mechanisms, such as hash-based commitments (referred to as Zero-Knowledge Integrity Proofs or ZKIPs in this context). The framework is particularly suited for critical applications like credit risk assessment using sensitive financial data, aiming to be operationally robust, auditable, and scalable for enterprise AI deployments. The implementation of the DP-RTFL framework is available as open-source.

Paperid: 70, https://arxiv.org/pdf/2505.23773.pdf GitHub

Abstract:
Digital signature algorithms (DSAs) are fundamental to cryptographic security, ensuring data integrity and authentication. While RSA, DSA, ECDSA, and EdDSA are widely used, their performance varies significantly depending on key sizes, hash functions, and elliptic curve configurations. In this paper, we introduce LightDSA, a hybrid and configurable digital signature library that supports RSA, DSA, ECDSA, and EdDSA with flexible form and curve selection, open sourced at https://github.com/serengil/LightDSA. Unlike conventional implementations that impose strict curve-form mappings - such as Weierstrass for ECDSA and Edwards for EdDSA LightDSA - allows arbitrary combinations, enabling a broader performance evaluation. We analyze the computational efficiency of these algorithms across various configurations, comparing key generation, signing, and verification times. Our results provide insights into the trade-offs between security and efficiency, guiding the selection of optimal configurations for different cryptographic needs.

Paperid: 71, https://arxiv.org/pdf/2505.23643.pdf GitHub

Abstract:
As AI agents become increasingly autonomous and capable, ensuring their security against vulnerabilities such as prompt injection becomes critical. This paper explores the use of information-flow control (IFC) to provide security guarantees for AI agents. We present a formal model to reason about the security and expressiveness of agent planners. Using this model, we characterize the class of properties enforceable by dynamic taint-tracking and construct a taxonomy of tasks to evaluate security and utility trade-offs of planner designs. Informed by this exploration, we present Fides, a planner that tracks confidentiality and integrity labels, deterministically enforces security policies, and introduces novel primitives for selectively hiding information. Its evaluation in AgentDojo demonstrates that this approach enables us to complete a broad range of tasks with security guarantees. A tutorial to walk readers through the the concepts introduced in the paper can be found at https://github.com/microsoft/fides

Paperid: 72, https://arxiv.org/pdf/2505.21620.pdf GitHub

Abstract:
The rapid development of video generative models has led to a surge in highly realistic synthetic videos, raising ethical concerns related to disinformation and copyright infringement. Recently, video watermarking has been proposed as a mitigation strategy by embedding invisible marks into AI-generated videos to enable subsequent detection. However, the robustness of existing video watermarking methods against both common and adversarial perturbations remains underexplored. In this work, we introduce VideoMarkBench, the first systematic benchmark designed to evaluate the robustness of video watermarks under watermark removal and watermark forgery attacks. Our study encompasses a unified dataset generated by three state-of-the-art video generative models, across three video styles, incorporating four watermarking methods and seven aggregation strategies used during detection. We comprehensively evaluate 12 types of perturbations under white-box, black-box, and no-box threat models. Our findings reveal significant vulnerabilities in current watermarking approaches and highlight the urgent need for more robust solutions. Our code is available at https://github.com/zhengyuan-jiang/VideoMarkBench.

Paperid: 73, https://arxiv.org/pdf/2505.21499.pdf GitHub

Abstract:
Vision-Language Model (VLM) based Web Agents represent a significant step towards automating complex tasks by simulating human-like interaction with websites. However, their deployment in uncontrolled web environments introduces significant security vulnerabilities. Existing research on adversarial environmental injection attacks often relies on unrealistic assumptions, such as direct HTML manipulation, knowledge of user intent, or access to agent model parameters, limiting their practical applicability. In this paper, we propose AdInject, a novel and real-world black-box attack method that leverages the internet advertising delivery to inject malicious content into the Web Agent's environment. AdInject operates under a significantly more realistic threat model than prior work, assuming a black-box agent, static malicious content constraints, and no specific knowledge of user intent. AdInject includes strategies for designing malicious ad content aimed at misleading agents into clicking, and a VLM-based ad content optimization technique that infers potential user intents from the target website's context and integrates these intents into the ad content to make it appear more relevant or critical to the agent's task, thus enhancing attack effectiveness. Experimental evaluations demonstrate the effectiveness of AdInject, attack success rates exceeding 60% in most scenarios and approaching 100% in certain cases. This strongly demonstrates that prevalent advertising delivery constitutes a potent and real-world vector for environment injection attacks against Web Agents. This work highlights a critical vulnerability in Web Agent security arising from real-world environment manipulation channels, underscoring the urgent need for developing robust defense mechanisms against such threats. Our code is available at https://github.com/NicerWang/AdInject.

Paperid: 74, https://arxiv.org/pdf/2505.21277.pdf GitHub

Abstract:
Large Language Models (LLMs), despite advanced general capabilities, still suffer from numerous safety risks, especially jailbreak attacks that bypass safety protocols. Understanding these vulnerabilities through black-box jailbreak attacks, which better reflect real-world scenarios, offers critical insights into model robustness. While existing methods have shown improvements through various prompt engineering techniques, their success remains limited against safety-aligned models, overlooking a more fundamental problem: the effectiveness is inherently bounded by the predefined strategy spaces. However, expanding this space presents significant challenges in both systematically capturing essential attack patterns and efficiently navigating the increased complexity. To better explore the potential of expanding the strategy space, we address these challenges through a novel framework that decomposes jailbreak strategies into essential components based on the Elaboration Likelihood Model (ELM) theory and develops genetic-based optimization with intention evaluation mechanisms. To be striking, our experiments reveal unprecedented jailbreak capabilities by expanding the strategy space: we achieve over 90% success rate on Claude-3.5 where prior methods completely fail, while demonstrating strong cross-model transferability and surpassing specialized safeguard models in evaluation accuracy. The code is open-sourced at: https://github.com/Aries-iai/CL-GSO.

Paperid: 75, https://arxiv.org/pdf/2505.20259.pdf GitHub

Abstract:
LLMs have made impressive progress, but their growing capabilities also expose them to highly flexible jailbreaking attacks designed to bypass safety alignment. While many existing defenses focus on known types of attacks, it is more critical to prepare LLMs for unseen attacks that may arise during deployment. To address this, we propose a lifelong safety alignment framework that enables LLMs to continuously adapt to new and evolving jailbreaking strategies. Our framework introduces a competitive setup between two components: a Meta-Attacker, trained to actively discover novel jailbreaking strategies, and a Defender, trained to resist them. To effectively warm up the Meta-Attacker, we first leverage the GPT-4o API to extract key insights from a large collection of jailbreak-related research papers. Through iterative training, the first iteration Meta-Attacker achieves a 73% attack success rate (ASR) on RR and a 57% transfer ASR on LAT using only single-turn attacks. Meanwhile, the Defender progressively improves its robustness and ultimately reduces the Meta-Attacker's success rate to just 7%, enabling safer and more reliable deployment of LLMs in open-ended environments. The code is available at https://github.com/sail-sg/LifelongSafetyAlignment.

Paperid: 76, https://arxiv.org/pdf/2505.19973.pdf GitHub

Abstract:
Digital Forensics and Incident Response (DFIR) involves analyzing digital evidence to support legal investigations. Large Language Models (LLMs) offer new opportunities in DFIR tasks such as log analysis and memory forensics, but their susceptibility to errors and hallucinations raises concerns in high-stakes contexts. Despite growing interest, there is no comprehensive benchmark to evaluate LLMs across both theoretical and practical DFIR domains. To address this gap, we present DFIR-Metric, a benchmark with three components: (1) Knowledge Assessment: a set of 700 expert-reviewed multiple-choice questions sourced from industry-standard certifications and official documentation; (2) Realistic Forensic Challenges: 150 CTF-style tasks testing multi-step reasoning and evidence correlation; and (3) Practical Analysis: 500 disk and memory forensics cases from the NIST Computer Forensics Tool Testing Program (CFTT). We evaluated 14 LLMs using DFIR-Metric, analyzing both their accuracy and consistency across trials. We also introduce a new metric, the Task Understanding Score (TUS), designed to more effectively evaluate models in scenarios where they achieve near-zero accuracy. This benchmark offers a rigorous, reproducible foundation for advancing AI in digital forensics. All scripts, artifacts, and results are available on the project website at https://github.com/DFIR-Metric.

Paperid: 77, https://arxiv.org/pdf/2505.19395.pdf GitHub

Abstract:
Ensuring that large language models (LLMs) can effectively assess, detect, explain, and remediate software vulnerabilities is critical for building robust and secure software systems. We introduce VADER, a human-evaluated benchmark designed explicitly to assess LLM performance across four key vulnerability-handling dimensions: assessment, detection, explanation, and remediation. VADER comprises 174 real-world software vulnerabilities, each carefully curated from GitHub repositories and annotated by security experts. For each vulnerability case, models are tasked with identifying the flaw, classifying it using Common Weakness Enumeration (CWE), explaining its underlying cause, proposing a patch, and formulating a test plan. Using a one-shot prompting strategy, we benchmark six state-of-the-art LLMs (Claude 3.7 Sonnet, Gemini 2.5 Pro, GPT-4.1, GPT-4.5, Grok 3 Beta, and o3) on VADER, and human security experts evaluated each response according to a rigorous scoring rubric emphasizing remediation (quality of the code fix, 50%), explanation (20%), and classification and test plan (30%) according to a standardized rubric. Our results show that current state-of-the-art LLMs achieve only moderate success on VADER - OpenAI's o3 attained 54.7% accuracy overall, with others in the 49-54% range, indicating ample room for improvement. Notably, remediation quality is strongly correlated (Pearson r > 0.97) with accurate classification and test plans, suggesting that models that effectively categorize vulnerabilities also tend to fix them well. VADER's comprehensive dataset, detailed evaluation rubrics, scoring tools, and visualized results with confidence intervals are publicly released, providing the community with an interpretable, reproducible benchmark to advance vulnerability-aware LLMs. All code and data are available at: https://github.com/AfterQuery/vader

Paperid: 78, https://arxiv.org/pdf/2505.18787.pdf GitHub GitHub

Abstract:
Deepfake (DF) detectors face significant challenges when deployed in real-world environments, particularly when encountering test samples deviated from training data through either postprocessing manipulations or distribution shifts. We demonstrate postprocessing techniques can completely obscure generation artifacts presented in DF samples, leading to performance degradation of DF detectors. To address these challenges, we propose Think Twice before Adaptation (\texttt{T$^2$A}), a novel online test-time adaptation method that enhances the adaptability of detectors during inference without requiring access to source training data or labels. Our key idea is to enable the model to explore alternative options through an Uncertainty-aware Negative Learning objective rather than solely relying on its initial predictions as commonly seen in entropy minimization (EM)-based approaches. We also introduce an Uncertain Sample Prioritization strategy and Gradients Masking technique to improve the adaptation by focusing on important samples and model parameters. Our theoretical analysis demonstrates that the proposed negative learning objective exhibits complementary behavior to EM, facilitating better adaptation capability. Empirically, our method achieves state-of-the-art results compared to existing test-time adaptation (TTA) approaches and significantly enhances the resilience and generalization of DF detectors during inference. Code is available \href{https://github.com/HongHanh2104/T2A-Think-Twice-Before-Adaptation}{here}.

Paperid: 79, https://arxiv.org/pdf/2505.18613.pdf GitHub

Abstract:
Ransomware remains a critical threat to cybersecurity, yet publicly available datasets for training machine learning-based ransomware detection models are scarce and often have limited sample size, diversity, and reproducibility. In this paper, we introduce MLRan, a behavioural ransomware dataset, comprising over 4,800 samples across 64 ransomware families and a balanced set of goodware samples. The samples span from 2006 to 2024 and encompass the four major types of ransomware: locker, crypto, ransomware-as-a-service, and modern variants. We also propose guidelines (GUIDE-MLRan), inspired by previous work, for constructing high-quality behavioural ransomware datasets, which informed the curation of our dataset. We evaluated the ransomware detection performance of several machine learning (ML) models using MLRan. For this purpose, we performed feature selection by conducting mutual information filtering to reduce the initial 6.4 million features to 24,162, followed by recursive feature elimination, yielding 483 highly informative features. The ML models achieved an accuracy, precision and recall of up to 98.7%, 98.9%, 98.5%, respectively. Using SHAP and LIME, we identified critical indicators of malicious behaviour, including registry tampering, strings, and API misuse. The dataset and source code for feature extraction, selection, ML training, and evaluation are available publicly to support replicability and encourage future research, which can be found at https://github.com/faithfulco/mlran.

Paperid: 80, https://arxiv.org/pdf/2505.18551.pdf GitHub

Abstract:
Machine learning (ML)-based malware detection systems often fail to account for the dynamic nature of real-world training and test data distributions. In practice, these distributions evolve due to frequent changes in the Android ecosystem, adversarial development of new malware families, and the continuous emergence of both benign and malicious applications. Prior studies have shown that such concept drift -- distributional shifts in benign and malicious samples, leads to significant degradation in detection performance over time. Despite the practical importance of this issue, existing datasets are often outdated and limited in temporal scope, diversity of malware families, and sample scale, making them insufficient for the systematic evaluation of concept drift in malware detection. To address this gap, we present LAMDA, the largest and most temporally diverse Android malware benchmark to date, designed specifically for concept drift analysis. LAMDA spans 12 years (2013-2025, excluding 2015), includes over 1 million samples (approximately 37% labeled as malware), and covers 1,380 malware families and 150,000 singleton samples, reflecting the natural distribution and evolution of real-world Android applications. We empirically demonstrate LAMDA's utility by quantifying the performance degradation of standard ML models over time and analyzing feature stability across years. As the most comprehensive Android malware dataset to date, LAMDA enables in-depth research into temporal drift, generalization, explainability, and evolving detection challenges. The dataset and code are available at: https://iqsec-lab.github.io/LAMDA/.

Paperid: 81, https://arxiv.org/pdf/2505.18333.pdf GitHub

Abstract:
Large Language Models (LLMs) are vulnerable to prompt injection attacks, and several defenses have recently been proposed, often claiming to mitigate these attacks successfully. However, we argue that existing studies lack a principled approach to evaluating these defenses. In this paper, we argue the need to assess defenses across two critical dimensions: (1) effectiveness, measured against both existing and adaptive prompt injection attacks involving diverse target and injected prompts, and (2) general-purpose utility, ensuring that the defense does not compromise the foundational capabilities of the LLM. Our critical evaluation reveals that prior studies have not followed such a comprehensive evaluation methodology. When assessed using this principled approach, we show that existing defenses are not as successful as previously reported. This work provides a foundation for evaluating future defenses and guiding their development. Our code and data are available at: https://github.com/PIEval123/PIEval.

Paperid: 82, https://arxiv.org/pdf/2505.18156.pdf GitHub

Abstract:
Large Language Models (LLMs) are changing the way people interact with technology. Tools like ChatGPT and Claude AI are now common in business, research, and everyday life. But with that growth comes new risks, especially prompt-based attacks that exploit how these models process language. InjectLab is a security framework designed to address that problem. This paper introduces InjectLab as a structured, open-source matrix that maps real-world techniques used to manipulate LLMs. The framework is inspired by MITRE ATT&CK and focuses specifically on adversarial behavior at the prompt layer. It includes over 25 techniques organized under six core tactics, covering threats like instruction override, identity swapping, and multi-agent exploitation. Each technique in InjectLab includes detection guidance, mitigation strategies, and YAML-based simulation tests. A Python tool supports easy execution of prompt-based test cases. This paper outlines the framework's structure, compares it to other AI threat taxonomies, and discusses its future direction as a practical, community-driven foundation for securing language models.

Paperid: 83, https://arxiv.org/pdf/2505.18135.pdf GitHub

Abstract:
Large language models (LLMs) can now access a wide range of external tools, thanks to the Model Context Protocol (MCP). This greatly expands their abilities as various agents. However, LLMs rely entirely on the text descriptions of tools to decide which ones to use--a process that is surprisingly fragile. In this work, we expose a vulnerability in prevalent tool/function-calling protocols by investigating a series of edits to tool descriptions, some of which can drastically increase a tool's usage from LLMs when competing with alternatives. Through controlled experiments, we show that tools with properly edited descriptions receive over 10 times more usage from GPT-4.1 and Qwen2.5-7B than tools with original descriptions. We further evaluate how various edits to tool descriptions perform when competing directly with one another and how these trends generalize or differ across a broader set of 17 different models. These phenomena, while giving developers a powerful way to promote their tools, underscore the need for a more reliable foundation for agentic LLMs to select and utilize tools and resources. Our code is publicly available at https://github.com/kazemf78/llm-unreliable-tool-preferences.

Paperid: 84, https://arxiv.org/pdf/2505.17776.pdf GitHub

Abstract:
Emerging 5G millimeter-wave and sub-6 GHz networks enable high-accuracy indoor localization, but security and privacy vulnerabilities pose serious challenges. In this paper, we identify and address threats including location spoofing and adversarial signal manipulation against 5G-based indoor localization. We formalize a threat model encompassing attackers who inject forged radio signals or perturb channel measurements to mislead the localization system. To defend against these threats, we propose an adversary-resilient localization architecture that combines deep learning fingerprinting with physical domain knowledge. Our approach integrates multi-anchor Channel Impulse Response (CIR) fingerprints with Time Difference of Arrival (TDoA) features and known anchor positions in a hybrid Convolutional Neural Network (CNN) and multi-head attention network. This design inherently checks geometric consistency and dynamically down-weights anomalous signals, making localization robust to tampering. We formulate the secure localization problem and demonstrate, through extensive experiments on a public 5G indoor dataset, that the proposed system achieves a mean error approximately 0.58 m under mixed Line-of-Sight (LOS) and Non-Line-of-Sight (NLOS) trajectories in benign conditions and gracefully degrades to around 0.81 m under attack scenarios. We also show via ablation studies that each architecture component (attention mechanism, TDoA, etc.) is critical for both accuracy and resilience, reducing errors by 4-5 times compared to baselines. In addition, our system runs in real-time, localizing the user in just 1 ms on a simple CPU. The code has been released to ensure reproducibility (https://github.com/sec5gloc/Sec5GLoc).

Paperid: 85, https://arxiv.org/pdf/2505.17598.pdf GitHub

Abstract:
Safety alignment in large language models (LLMs) is increasingly compromised by jailbreak attacks, which can manipulate these models to generate harmful or unintended content. Investigating these attacks is crucial for uncovering model vulnerabilities. However, many existing jailbreak strategies fail to keep pace with the rapid development of defense mechanisms, such as defensive suffixes, rendering them ineffective against defended models. To tackle this issue, we introduce a novel attack method called ArrAttack, specifically designed to target defended LLMs. ArrAttack automatically generates robust jailbreak prompts capable of bypassing various defense measures. This capability is supported by a universal robustness judgment model that, once trained, can perform robustness evaluation for any target model with a wide variety of defenses. By leveraging this model, we can rapidly develop a robust jailbreak prompt generator that efficiently converts malicious input prompts into effective attacks. Extensive evaluations reveal that ArrAttack significantly outperforms existing attack strategies, demonstrating strong transferability across both white-box and black-box models, including GPT-4 and Claude-3. Our work bridges the gap between jailbreak attacks and defenses, providing a fresh perspective on generating robust jailbreak prompts. We make the codebase available at https://github.com/LLBao/ArrAttack.

Paperid: 86, https://arxiv.org/pdf/2505.17107.pdf GitHub

Abstract:
Large Language Model (LLM) agents can automate cybersecurity tasks and can adapt to the evolving cybersecurity landscape without re-engineering. While LLM agents have demonstrated cybersecurity capabilities on Capture-The-Flag (CTF) competitions, they have two key limitations: accessing latest cybersecurity expertise beyond training data, and integrating new knowledge into complex task planning. Knowledge-based approaches that incorporate technical understanding into the task-solving automation can tackle these limitations. We present CRAKEN, a knowledge-based LLM agent framework that improves cybersecurity capability through three core mechanisms: contextual decomposition of task-critical information, iterative self-reflected knowledge retrieval, and knowledge-hint injection that transforms insights into adaptive attack strategies. Comprehensive evaluations with different configurations show CRAKEN's effectiveness in multi-stage vulnerability detection and exploitation compared to previous approaches. Our extensible architecture establishes new methodologies for embedding new security knowledge into LLM-driven cybersecurity agentic systems. With a knowledge database of CTF writeups, CRAKEN obtained an accuracy of 22% on NYU CTF Bench, outperforming prior works by 3% and achieving state-of-the-art results. On evaluation of MITRE ATT&CK techniques, CRAKEN solves 25-30% more techniques than prior work, demonstrating improved cybersecurity capabilities via knowledge-based execution. We make our framework open source to public https://github.com/NYU-LLM-CTF/nyuctf_agents_craken.

Paperid: 87, https://arxiv.org/pdf/2505.17085.pdf GitHub

Abstract:
The ubiquity of social media platforms facilitates malicious linguistic steganography, posing significant security risks. Steganalysis is profoundly hindered by the challenge of identifying subtle cognitive inconsistencies arising from textual fragmentation and complex dialogue structures, and the difficulty in achieving robust aggregation of multi-dimensional weak signals, especially given extreme steganographic sparsity and sophisticated steganography. These core detection difficulties are compounded by significant data imbalance. This paper introduces GSDFuse, a novel method designed to systematically overcome these obstacles. GSDFuse employs a holistic approach, synergistically integrating hierarchical multi-modal feature engineering to capture diverse signals, strategic data augmentation to address sparsity, adaptive evidence fusion to intelligently aggregate weak signals, and discriminative embedding learning to enhance sensitivity to subtle inconsistencies. Experiments on social media datasets demonstrate GSDFuse's state-of-the-art (SOTA) performance in identifying sophisticated steganography within complex dialogue environments. The source code for GSDFuse is available at https://github.com/NebulaEmmaZh/GSDFuse.

Paperid: 88, https://arxiv.org/pdf/2505.16831.pdf GitHub

Abstract:
Unlearning in large language models (LLMs) is intended to remove the influence of specific data, yet current evaluations rely heavily on token-level metrics such as accuracy and perplexity. We show that these metrics can be misleading: models often appear to forget, but their original behavior can be rapidly restored with minimal fine-tuning, revealing that unlearning may obscure information rather than erase it. To diagnose this phenomenon, we introduce a representation-level evaluation framework using PCA-based similarity and shift, centered kernel alignment, and Fisher information. Applying this toolkit across six unlearning methods, three domains (text, code, math), and two open-source LLMs, we uncover a critical distinction between reversible and irreversible forgetting. In reversible cases, models suffer token-level collapse yet retain latent features; in irreversible cases, deeper representational damage occurs. We further provide a theoretical account linking shallow weight perturbations near output layers to misleading unlearning signals, and show that reversibility is modulated by task type and hyperparameters. Our findings reveal a fundamental gap in current evaluation practices and establish a new diagnostic foundation for trustworthy unlearning in LLMs. We provide a unified toolkit for analyzing LLM representation changes under unlearning and relearning: https://github.com/XiaoyuXU1/Representational_Analysis_Tools.git.

Paperid: 89, https://arxiv.org/pdf/2505.16737.pdf GitHub

Abstract:
The significant progress of large language models (LLMs) has led to remarkable achievements across numerous applications. However, their ability to generate harmful content has sparked substantial safety concerns. Despite the implementation of safety alignment techniques during the pre-training phase, recent research indicates that fine-tuning LLMs on adversarial or even benign data can inadvertently compromise their safety. In this paper, we re-examine the fundamental issue of why fine-tuning on non-harmful data still results in safety degradation. We introduce a safety-aware probing (SAP) optimization framework designed to mitigate the safety risks of fine-tuning LLMs. Specifically, SAP incorporates a safety-aware probe into the gradient propagation process, mitigating the model's risk of safety degradation by identifying potential pitfalls in gradient directions, thereby enhancing task-specific performance while successfully preserving model safety. Our extensive experimental results demonstrate that SAP effectively reduces harmfulness below the original fine-tuned model and achieves comparable test loss to standard fine-tuning methods. Our code is available at https://github.com/ChengcanWu/SAP.

Paperid: 90, https://arxiv.org/pdf/2505.16650.pdf GitHub

Abstract:
Due to the recent increase in the number of connected devices, the need to promptly detect security issues is emerging. Moreover, the high number of communication flows creates the necessity of processing huge amounts of data. Furthermore, the connected devices are heterogeneous in nature, having different computational capacities. For this reason, in this work we propose an image-based representation of network traffic which allows to realize a compact summary of the current network conditions with 1-second time windows. The proposed representation highlights the presence of anomalies thus reducing the need for complex processing architectures. Finally, we present an unsupervised learning approach which effectively detects the presence of anomalies. The code and the dataset are available at https://github.com/michaelneri/image-based-network-traffic-anomaly-detection.

Paperid: 91, https://arxiv.org/pdf/2505.16640.pdf GitHub

Abstract:
Vision-Language-Action (VLA) models have advanced robotic control by enabling end-to-end decision-making directly from multimodal inputs. However, their tightly coupled architectures expose novel security vulnerabilities. Unlike traditional adversarial perturbations, backdoor attacks represent a stealthier, persistent, and practically significant threat-particularly under the emerging Training-as-a-Service paradigm-but remain largely unexplored in the context of VLA models. To address this gap, we propose BadVLA, a backdoor attack method based on Objective-Decoupled Optimization, which for the first time exposes the backdoor vulnerabilities of VLA models. Specifically, it consists of a two-stage process: (1) explicit feature-space separation to isolate trigger representations from benign inputs, and (2) conditional control deviations that activate only in the presence of the trigger, while preserving clean-task performance. Empirical results on multiple VLA benchmarks demonstrate that BadVLA consistently achieves near-100% attack success rates with minimal impact on clean task accuracy. Further analyses confirm its robustness against common input perturbations, task transfers, and model fine-tuning, underscoring critical security vulnerabilities in current VLA deployments. Our work offers the first systematic investigation of backdoor vulnerabilities in VLA models, highlighting an urgent need for secure and trustworthy embodied model design practices. We have released the project page at https://badvla-project.github.io/.

Paperid: 92, https://arxiv.org/pdf/2505.16530.pdf GitHub

Abstract:
Large language models (LLMs) are considered valuable Intellectual Properties (IP) for legitimate owners due to the enormous computational cost of training. It is crucial to protect the IP of LLMs from malicious stealing or unauthorized deployment. Despite existing efforts in watermarking and fingerprinting LLMs, these methods either impact the text generation process or are limited in white-box access to the suspect model, making them impractical. Hence, we propose DuFFin, a novel $\textbf{Du}$al-Level $\textbf{Fin}$gerprinting $\textbf{F}$ramework for black-box setting ownership verification. DuFFin extracts the trigger pattern and the knowledge-level fingerprints to identify the source of a suspect model. We conduct experiments on a variety of models collected from the open-source website, including four popular base models as protected LLMs and their fine-tuning, quantization, and safety alignment versions, which are released by large companies, start-ups, and individual users. Results show that our method can accurately verify the copyright of the base protected LLM on their model variants, achieving the IP-ROC metric greater than 0.95. Our code is available at https://github.com/yuliangyan0807/llm-fingerprint.

Paperid: 93, https://arxiv.org/pdf/2505.16263.pdf GitHub

Abstract:
Social media and online forums are increasingly becoming popular. Unfortunately, these platforms are being used for spreading hate speech. In this paper, we design black-box techniques to protect users from hate-speech on online platforms by generating perturbations that can fool state of the art deep learning based hate speech detection models thereby decreasing their efficiency. We also ensure a minimal change in the original meaning of hate-speech. Our best perturbation attack is successfully able to evade hate-speech detection for 86.8 % of hateful text.

Paperid: 94, https://arxiv.org/pdf/2505.15389.pdf GitHub

Abstract:
Rapid deployment of vision-language models (VLMs) magnifies safety risks, yet most evaluations rely on artificial images. This study asks: How safe are current VLMs when confronted with meme images that ordinary users share? To investigate this question, we introduce MemeSafetyBench, a 50,430-instance benchmark pairing real meme images with both harmful and benign instructions. Using a comprehensive safety taxonomy and LLM-based instruction generation, we assess multiple VLMs across single and multi-turn interactions. We investigate how real-world memes influence harmful outputs, the mitigating effects of conversational context, and the relationship between model scale and safety metrics. Our findings demonstrate that VLMs are more vulnerable to meme-based harmful prompts than to synthetic or typographic images. Memes significantly increase harmful responses and decrease refusals compared to text-only inputs. Though multi-turn interactions provide partial mitigation, elevated vulnerability persists. These results highlight the need for ecologically valid evaluations and stronger safety mechanisms. MemeSafetyBench is publicly available at https://github.com/oneonlee/Meme-Safety-Bench.

Paperid: 95, https://arxiv.org/pdf/2505.15140.pdf GitHub

Abstract:
Graph Neural Networks (GNNs) have been widely used for graph analysis. Federated Graph Learning (FGL) is an emerging learning framework to collaboratively train graph data from various clients. Although FGL allows client data to remain localized, a malicious server can still steal client private data information through uploaded gradient. In this paper, we for the first time propose label distribution attacks (LDAs) on FGL that aim to infer the label distributions of the client-side data. Firstly, we observe that the effectiveness of LDA is closely related to the variance of node embeddings in GNNs. Next, we analyze the relation between them and propose a new attack named EC-LDA, which significantly improves the attack effectiveness by compressing node embeddings. Then, extensive experiments on node classification and link prediction tasks across six widely used graph datasets show that EC-LDA outperforms the SOTA LDAs. Specifically, EC-LDA can achieve the Cos-sim as high as 1.0 under almost all cases. Finally, we explore the robustness of EC-LDA under differential privacy protection and discuss the potential effective defense methods to EC-LDA. Our code is available at https://github.com/cheng-t/EC-LDA.

Paperid: 96, https://arxiv.org/pdf/2505.14112.pdf GitHub

Abstract:
Logit-based LLM watermarking traces and verifies AI-generated content by maintaining green and red token lists and increasing the likelihood of green tokens during generation. However, it fails in low-entropy scenarios, where predictable outputs make green token selection difficult without disrupting natural text flow. Existing approaches address this by assuming access to the original LLM to calculate entropy and selectively watermark high-entropy tokens. However, these methods face two major challenges: (1) high computational costs and detection delays due to reliance on the original LLM, and (2) potential risks of model leakage. To address these limitations, we propose Invisible Entropy (IE), a watermarking paradigm designed to enhance both safety and efficiency. Instead of relying on the original LLM, IE introduces a lightweight feature extractor and an entropy tagger to predict whether the entropy of the next token is high or low. Furthermore, based on theoretical analysis, we develop a threshold navigator that adaptively sets entropy thresholds. It identifies a threshold where the watermark ratio decreases as the green token count increases, enhancing the naturalness of the watermarked text and improving detection robustness. Experiments on HumanEval and MBPP datasets demonstrate that IE reduces parameter size by 99\% while achieving performance on par with state-of-the-art methods. Our work introduces a safe and efficient paradigm for low-entropy watermarking. https://github.com/Carol-gutianle/IE https://huggingface.co/datasets/Carol0110/IE-Tagger

Paperid: 97, https://arxiv.org/pdf/2505.14103.pdf GitHub

Abstract:
Jailbreak attacks to Large audio-language models (LALMs) are studied recently, but they achieve suboptimal effectiveness, applicability, and practicability, particularly, assuming that the adversary can fully manipulate user prompts. In this work, we first conduct an extensive experiment showing that advanced text jailbreak attacks cannot be easily ported to end-to-end LALMs via text-to speech (TTS) techniques. We then propose AudioJailbreak, a novel audio jailbreak attack, featuring (1) asynchrony: the jailbreak audio does not need to align with user prompts in the time axis by crafting suffixal jailbreak audios; (2) universality: a single jailbreak perturbation is effective for different prompts by incorporating multiple prompts into perturbation generation; (3) stealthiness: the malicious intent of jailbreak audios will not raise the awareness of victims by proposing various intent concealment strategies; and (4) over-the-air robustness: the jailbreak audios remain effective when being played over the air by incorporating the reverberation distortion effect with room impulse response into the generation of the perturbations. In contrast, all prior audio jailbreak attacks cannot offer asynchrony, universality, stealthiness, or over-the-air robustness. Moreover, AudioJailbreak is also applicable to the adversary who cannot fully manipulate user prompts, thus has a much broader attack scenario. Extensive experiments with thus far the most LALMs demonstrate the high effectiveness of AudioJailbreak. We highlight that our work peeks into the security implications of audio jailbreak attacks against LALMs, and realistically fosters improving their security robustness. The implementation and audio samples are available at our website https://audiojailbreak.github.io/AudioJailbreak.

Paperid: 98, https://arxiv.org/pdf/2505.12690.pdf GitHub

Abstract:
We develop QUICtester, an automated approach for uncovering non-compliant behaviors in the ratified QUIC protocol implementations (RFC 9000/9001). QUICtester leverages active automata learning to abstract the behavior of a QUIC implementation into a finite state machine (FSM) representation. Unlike prior noncompliance checking methods, to help uncover state dependencies on event timing, QUICtester introduces the idea of state learning with event timing variations, adopting both valid and invalid input configurations, and combinations of security and transport layer parameters during learning. We use pairwise differential analysis of learned behaviour models of tested QUIC implementations to identify non-compliance instances as behaviour deviations in a property-agnostic way. This exploits the existence of the many different QUIC implementations, removing the need for validated, formal models. The diverse implementations act as cross-checking test oracles to discover non-compliance. We used QUICtester to analyze analyze 186 learned models from 19 QUIC implementations under the five security settings and discovered 55 implementation errors. Significantly, the tool uncovered a QUIC specification ambiguity resulting in an easily exploitable DoS vulnerability, led to 5 CVE assignments from developers, and two bug bounties thus far.

Paperid: 99, https://arxiv.org/pdf/2505.11837.pdf GitHub

Abstract:
Nowadays, Large Language Models (LLMs) are trained on huge datasets, some including sensitive information. This poses a serious privacy concern because privacy attacks such as Membership Inference Attacks (MIAs) may detect this sensitive information. While knowledge distillation compresses LLMs into efficient, smaller student models, its impact on privacy remains underexplored. In this paper, we investigate how knowledge distillation affects model robustness against MIA. We focus on two questions. First, how is private data protected in teacher and student models? Second, how can we strengthen privacy preservation against MIAs in knowledge distillation? Through comprehensive experiments, we show that while teacher and student models achieve similar overall MIA accuracy, teacher models better protect member data, the primary target of MIA, whereas student models better protect non-member data. To address this vulnerability in student models, we propose 5 privacy-preserving distillation methods and demonstrate that they successfully reduce student models' vulnerability to MIA, with ensembling further stabilizing the robustness, offering a reliable approach for distilling more secure and efficient student models. Our implementation source code is available at https://github.com/richardcui18/MIA_in_KD.

Paperid: 100, https://arxiv.org/pdf/2505.11586.pdf GitHub

Abstract:
Recent research highlights concerns about the trustworthiness of third-party Pre-Trained Language Models (PTLMs) due to potential backdoor attacks. These backdoored PTLMs, however, are effective only for specific pre-defined downstream tasks. In reality, these PTLMs can be adapted to many other unrelated downstream tasks. Such adaptation may lead to unforeseen consequences in downstream model outputs, consequently raising user suspicion and compromising attack stealthiness. We refer to this phenomenon as backdoor complications. In this paper, we undertake the first comprehensive quantification of backdoor complications. Through extensive experiments using 4 prominent PTLMs and 16 text classification benchmark datasets, we demonstrate the widespread presence of backdoor complications in downstream models fine-tuned from backdoored PTLMs. The output distribution of triggered samples significantly deviates from that of clean samples. Consequently, we propose a backdoor complication reduction method leveraging multi-task learning to mitigate complications without prior knowledge of downstream tasks. The experimental results demonstrate that our proposed method can effectively reduce complications while maintaining the efficacy and consistency of backdoor attacks. Our code is available at https://github.com/zhangrui4041/Backdoor_Complications.

Paperid: 101, https://arxiv.org/pdf/2505.11547.pdf GitHub

Abstract:
Attribution of cyber-attacks remains a complex but critical challenge for cyber defenders. Currently, manual extraction of behavioral indicators from dense forensic documentation causes significant attribution delays, especially following major incidents at the international scale. This research evaluates large language models (LLMs) for cyber-attack attribution based on behavioral indicators extracted from forensic documentation. We test OpenAI's GPT-4 and text-embedding-3-large for identifying threat actors' tactics, techniques, and procedures (TTPs) by comparing LLM-generated TTPs against human-generated data from MITRE ATT&CK Groups. Our framework then identifies TTPs from text using vector embedding search and builds profiles to attribute new attacks for a machine learning model to learn. Key contributions include: (1) assessing off-the-shelf LLMs for TTP extraction and attribution, and (2) developing an end-to-end pipeline from raw CTI documents to threat-actor prediction. This research finds that standard LLMs generate TTP datasets with noise, resulting in a low similarity to human-generated datasets. However, the TTPs generated are similar in frequency to those within the existing MITRE datasets. Additionally, although these TTPs are different than human-generated datasets, our work demonstrates that they still prove useful for training a model that performs above baseline on attribution. Project code and files are contained here: https://github.com/kylag/ttp_attribution.

Paperid: 102, https://arxiv.org/pdf/2505.11049.pdf GitHub

Abstract:
To enhance the safety of VLMs, this paper introduces a novel reasoning-based VLM guard model dubbed GuardReasoner-VL. The core idea is to incentivize the guard model to deliberatively reason before making moderation decisions via online RL. First, we construct GuardReasoner-VLTrain, a reasoning corpus with 123K samples and 631K reasoning steps, spanning text, image, and text-image inputs. Then, based on it, we cold-start our model's reasoning ability via SFT. In addition, we further enhance reasoning regarding moderation through online RL. Concretely, to enhance diversity and difficulty of samples, we conduct rejection sampling followed by data augmentation via the proposed safety-aware data concatenation. Besides, we use a dynamic clipping parameter to encourage exploration in early stages and exploitation in later stages. To balance performance and token efficiency, we design a length-aware safety reward that integrates accuracy, format, and token cost. Extensive experiments demonstrate the superiority of our model. Remarkably, it surpasses the runner-up by 19.27% F1 score on average. We release data, code, and models (3B/7B) of GuardReasoner-VL at https://github.com/yueliu1999/GuardReasoner-VL/

Paperid: 103, https://arxiv.org/pdf/2505.10846.pdf GitHub

Abstract:
This paper presents AutoRAN, the first automated, weak-to-strong jailbreak attack framework targeting large reasoning models (LRMs). At its core, AutoRAN leverages a weak, less-aligned reasoning model to simulate the target model's high-level reasoning structures, generates narrative prompts, and iteratively refines candidate prompts by incorporating the target model's intermediate reasoning steps. We evaluate AutoRAN against state-of-the-art LRMs including GPT-o3/o4-mini and Gemini-2.5-Flash across multiple benchmark datasets (AdvBench, HarmBench, and StrongReject). Results demonstrate that AutoRAN achieves remarkable success rates (approaching 100%) within one or a few turns across different LRMs, even when judged by a robustly aligned external model. This work reveals that leveraging weak reasoning models can effectively exploit the critical vulnerabilities of much more capable reasoning models, highlighting the need for improved safety measures specifically designed for reasoning-based models. The code for replicating AutoRAN and running records are available at: (https://github.com/JACKPURCELL/AutoRAN-public). (warning: this paper contains potentially harmful content generated by LRMs.)

Paperid: 104, https://arxiv.org/pdf/2505.10321.pdf GitHub

Abstract:
A recent area of increasing research is the use of Large Language Models (LLMs) in penetration testing, which promises to reduce costs and thus allow for higher frequency. We conduct a review of related work, identifying best practices and common evaluation issues. We then present AutoPentest, an application for performing black-box penetration tests with a high degree of autonomy. AutoPentest is based on the LLM GPT-4o from OpenAI and the LLM agent framework LangChain. It can perform complex multi-step tasks, augmented by external tools and knowledge bases. We conduct a study on three capture-the-flag style Hack The Box (HTB) machines, comparing our implementation AutoPentest with the baseline approach of manually using the ChatGPT-4o user interface. Both approaches are able to complete 15-25 % of the subtasks on the HTB machines, with AutoPentest slightly outperforming ChatGPT. We measure a total cost of \$96.20 US when using AutoPentest across all experiments, while a one-month subscription to ChatGPT Plus costs \$20. The results show that further implementation efforts and the use of more powerful LLMs released in the future are likely to make this a viable part of vulnerability management.

Paperid: 105, https://arxiv.org/pdf/2505.10111.pdf GitHub

Abstract:
Per Row Activation Counting (PRAC) has emerged as a robust framework for mitigating RowHammer (RH) vulnerabilities in modern DRAM systems. However, we uncover a critical vulnerability: a timing channel introduced by the Alert Back-Off (ABO) protocol and Refresh Management (RFM) commands. We present PRACLeak, a novel attack that exploits these timing differences to leak sensitive information, such as secret keys from vulnerable AES implementations, by monitoring memory access latencies. To counter this, we propose Timing-Safe PRAC (TPRAC), a defense that eliminates PRAC-induced timing channels without compromising RH mitigation efficacy. TPRAC uses Timing-Based RFMs, issued periodically and independent of memory activity. It requires only a single-entry in-DRAM mitigation queue per DRAM bank and is compatible with existing DRAM standards. Our evaluations demonstrate that TPRAC closes timing channels while incurring only 3.4% performance overhead at the RH threshold of 1024.

Paperid: 106, https://arxiv.org/pdf/2505.09924.pdf GitHub

Abstract:
The rise of Large Language Models (LLMs) has heightened concerns about the misuse of AI-generated text, making watermarking a promising solution. Mainstream watermarking schemes for LLMs fall into two categories: logits-based and sampling-based. However, current schemes entail trade-offs among robustness, text quality, and security. To mitigate this, we integrate logits-based and sampling-based schemes, harnessing their respective strengths to achieve synergy. In this paper, we propose a versatile symbiotic watermarking framework with three strategies: serial, parallel, and hybrid. The hybrid framework adaptively embeds watermarks using token entropy and semantic entropy, optimizing the balance between detectability, robustness, text quality, and security. Furthermore, we validate our approach through comprehensive experiments on various datasets and models. Experimental results indicate that our method outperforms existing baselines and achieves state-of-the-art (SOTA) performance. We believe this framework provides novel insights into diverse watermarking paradigms. Our code is available at https://github.com/redwyd/SymMark.

Paperid: 107, https://arxiv.org/pdf/2505.09921.pdf GitHub

Abstract:
Large Language Models (LLMs) excel in various domains but pose inherent privacy risks. Existing methods to evaluate privacy leakage in LLMs often use memorized prefixes or simple instructions to extract data, both of which well-alignment models can easily block. Meanwhile, Jailbreak attacks bypass LLM safety mechanisms to generate harmful content, but their role in privacy scenarios remains underexplored. In this paper, we examine the effectiveness of jailbreak attacks in extracting sensitive information, bridging privacy leakage and jailbreak attacks in LLMs. Moreover, we propose PIG, a novel framework targeting Personally Identifiable Information (PII) and addressing the limitations of current jailbreak methods. Specifically, PIG identifies PII entities and their types in privacy queries, uses in-context learning to build a privacy context, and iteratively updates it with three gradient-based strategies to elicit target PII. We evaluate PIG and existing jailbreak methods using two privacy-related datasets. Experiments on four white-box and two black-box LLMs show that PIG outperforms baseline methods and achieves state-of-the-art (SoTA) results. The results underscore significant privacy risks in LLMs, emphasizing the need for stronger safeguards. Our code is availble at https://github.com/redwyd/PrivacyJailbreak.

Paperid: 108, https://arxiv.org/pdf/2505.09820.pdf GitHub

Abstract:
As Large Language Models (LLMs) are widely used, understanding them systematically is key to improving their safety and realizing their full potential. Although many models are aligned using techniques such as reinforcement learning from human feedback (RLHF), they are still vulnerable to jailbreaking attacks. Some of the existing adversarial attack methods search for discrete tokens that may jailbreak a target model while others try to optimize the continuous space represented by the tokens of the model's vocabulary. While techniques based on the discrete space may prove to be inefficient, optimization of continuous token embeddings requires projections to produce discrete tokens, which might render them ineffective. To fully utilize the constraints and the structures of the space, we develop an intrinsic optimization technique using exponentiated gradient descent with the Bregman projection method to ensure that the optimized one-hot encoding always stays within the probability simplex. We prove the convergence of the technique and implement an efficient algorithm that is effective in jailbreaking several widely used LLMs. We demonstrate the efficacy of the proposed technique using five open-source LLMs on four openly available datasets. The results show that the technique achieves a higher success rate with great efficiency compared to three other state-of-the-art jailbreaking techniques. The source code for our implementation is available at: https://github.com/sbamit/Exponentiated-Gradient-Descent-LLM-Attack

Paperid: 109, https://arxiv.org/pdf/2505.08878.pdf GitHub

Abstract:
Large-language models (LLMs) are now able to produce text that is, in many cases, seemingly indistinguishable from human-generated content. This has fueled the development of watermarks that imprint a ``signal'' in LLM-generated text with minimal perturbation of an LLM's output. This paper provides an analysis of text watermarking in a one-shot setting. Through the lens of hypothesis testing with side information, we formulate and analyze the fundamental trade-off between watermark detection power and distortion in generated textual quality. We argue that a key component in watermark design is generating a coupling between the side information shared with the watermark detector and a random partition of the LLM vocabulary. Our analysis identifies the optimal coupling and randomization strategy under the worst-case LLM next-token distribution that satisfies a min-entropy constraint. We provide a closed-form expression of the resulting detection rate under the proposed scheme and quantify the cost in a max-min sense. Finally, we provide an array of numerical results, comparing the proposed scheme with the theoretical optimum and existing schemes, in both synthetic data and LLM watermarking. Our code is available at https://github.com/Carol-Long/CC_Watermark

Paperid: 110, https://arxiv.org/pdf/2505.08816.pdf GitHub

Abstract:
As the digital landscape becomes more interconnected, the frequency and severity of zero-day attacks, have significantly increased, leading to an urgent need for innovative Intrusion Detection Systems (IDS). Machine Learning-based IDS that learn from the network traffic characteristics and can discern attack patterns from benign traffic offer an advanced solution to traditional signature-based IDS. However, they heavily rely on labeled datasets, and their ability to generalize when encountering unseen traffic patterns remains a challenge. This paper proposes a novel self-supervised contrastive learning approach based on transformer encoders, specifically tailored for generalizable intrusion detection on raw packet sequences. Our proposed learning scheme employs a packet-level data augmentation strategy combined with a transformer-based architecture to extract and generate meaningful representations of traffic flows. Unlike traditional methods reliant on handcrafted statistical features (NetFlow), our approach automatically learns comprehensive packet sequence representations, significantly enhancing performance in anomaly identification tasks and supervised learning for intrusion detection. Our transformer-based framework exhibits better performance in comparison to existing NetFlow self-supervised methods. Specifically, we achieve up to a 3% higher AUC in anomaly detection for intra-dataset evaluation and up to 20% higher AUC scores in inter-dataset evaluation. Moreover, our model provides a strong baseline for supervised intrusion detection with limited labeled data, exhibiting an improvement over self-supervised NetFlow models of up to 1.5% AUC when pretrained and evaluated on the same dataset. Additionally, we show the adaptability of our pretrained model when fine-tuned across different datasets, demonstrating strong performance even when lacking benign data from the target domain.

Paperid: 111, https://arxiv.org/pdf/2505.07985.pdf GitHub

Abstract:
Machine learning (ML) algorithms are heavily based on the availability of training data, which, depending on the domain, often includes sensitive information about data providers. This raises critical privacy concerns. Anonymization techniques have emerged as a practical solution to address these issues by generalizing features or suppressing data to make it more difficult to accurately identify individuals. Although recent studies have shown that privacy-enhancing technologies can influence ML predictions across different subgroups, thus affecting fair decision-making, the specific effects of anonymization techniques, such as $k$-anonymity, $\ell$-diversity, and $t$-closeness, on ML fairness remain largely unexplored. In this work, we systematically audit the impact of anonymization techniques on ML fairness, evaluating both individual and group fairness. Our quantitative study reveals that anonymization can degrade group fairness metrics by up to four orders of magnitude. Conversely, similarity-based individual fairness metrics tend to improve under stronger anonymization, largely as a result of increased input homogeneity. By analyzing varying levels of anonymization across diverse privacy settings and data distributions, this study provides critical insights into the trade-offs between privacy, fairness, and utility, offering actionable guidelines for responsible AI development. Our code is publicly available at: https://github.com/hharcolezi/anonymity-impact-fairness.

Paperid: 112, https://arxiv.org/pdf/2505.06579.pdf GitHub

Abstract:
Large language models (LLMs) have achieved remarkable success in various domains, primarily due to their strong capabilities in reasoning and generating human-like text. Despite their impressive performance, LLMs are susceptible to hallucinations, which can lead to incorrect or misleading outputs. This is primarily due to the lack of up-to-date knowledge or domain-specific information. Retrieval-augmented generation (RAG) is a promising approach to mitigate hallucinations by leveraging external knowledge sources. However, the security of RAG systems has not been thoroughly studied. In this paper, we study a poisoning attack on RAG systems named POISONCRAFT, which can mislead the model to refer to fraudulent websites. Compared to existing poisoning attacks on RAG systems, our attack is more practical as it does not require access to the target user query's info or edit the user query. It not only ensures that injected texts can be retrieved by the model, but also ensures that the LLM will be misled to refer to the injected texts in its response. We demonstrate the effectiveness of POISONCRAFTacross different datasets, retrievers, and language models in RAG pipelines, and show that it remains effective when transferred across retrievers, including black-box systems. Moreover, we present a case study revealing how the attack influences both the retrieval behavior and the step-by-step reasoning trace within the generation model, and further evaluate the robustness of POISONCRAFTunder multiple defense mechanisms. These results validate the practicality of our threat model and highlight a critical security risk for RAG systems deployed in real-world applications. We release our code\footnote{https://github.com/AndyShaw01/PoisonCraft} to support future research on the security and robustness of RAG systems in real-world settings.

Paperid: 113, https://arxiv.org/pdf/2505.06311.pdf GitHub

Abstract:
The integration of Large Language Models (LLMs) with external sources is becoming increasingly common, with Retrieval-Augmented Generation (RAG) being a prominent example. However, this integration introduces vulnerabilities of Indirect Prompt Injection (IPI) attacks, where hidden instructions embedded in external data can manipulate LLMs into executing unintended or harmful actions. We recognize that IPI attacks fundamentally rely on the presence of instructions embedded within external content, which can alter the behavioral states of LLMs. Can the effective detection of such state changes help us defend against IPI attacks? In this paper, we propose InstructDetector, a novel detection-based approach that leverages the behavioral states of LLMs to identify potential IPI attacks. Specifically, we demonstrate the hidden states and gradients from intermediate layers provide highly discriminative features for instruction detection. By effectively combining these features, InstructDetector achieves a detection accuracy of 99.60% in the in-domain setting and 96.90% in the out-of-domain setting, and reduces the attack success rate to just 0.03% on the BIPIA benchmark. The code is publicly available at https://github.com/MYVAE/Instruction-detection.

Paperid: 114, https://arxiv.org/pdf/2505.04046.pdf GitHub

Abstract:
Trustworthy multi-view learning has attracted extensive attention because evidence learning can provide reliable uncertainty estimation to enhance the credibility of multi-view predictions. Existing trusted multi-view learning methods implicitly assume that multi-view data is secure. However, in safety-sensitive applications such as autonomous driving and security monitoring, multi-view data often faces threats from adversarial perturbations, thereby deceiving or disrupting multi-view models. This inevitably leads to the adversarial unreliability problem (AUP) in trusted multi-view learning. To overcome this tricky problem, we propose a novel multi-view learning framework, namely Reliable Disentanglement Multi-view Learning (RDML). Specifically, we first propose evidential disentanglement learning to decompose each view into clean and adversarial parts under the guidance of corresponding evidences, which is extracted by a pretrained evidence extractor. Then, we employ the feature recalibration module to mitigate the negative impact of adversarial perturbations and extract potential informative features from them. Finally, to further ignore the irreparable adversarial interferences, a view-level evidential attention mechanism is designed. Extensive experiments on multi-view classification tasks with adversarial attacks show that RDML outperforms the state-of-the-art methods by a relatively large margin. Our code is available at https://github.com/Willy1005/2025-IJCAI-RDML.

Paperid: 115, https://arxiv.org/pdf/2505.02344.pdf GitHub

Abstract:
The rise of LLMs has increased concerns over source tracing and copyright protection for AIGC, highlighting the need for advanced detection technologies. Passive detection methods usually face high false positives, while active watermarking techniques using logits or sampling manipulation offer more effective protection. Existing LLM watermarking methods, though effective on unaltered content, suffer significant performance drops when the text is modified and could introduce biases that degrade LLM performance in downstream tasks. These methods fail to achieve an optimal tradeoff between text quality and robustness, particularly due to the lack of end-to-end optimization of the encoder and decoder. In this paper, we introduce a novel end-to-end logits perturbation method for watermarking LLM-generated text. By jointly optimization, our approach achieves a better balance between quality and robustness. To address non-differentiable operations in the end-to-end training pipeline, we introduce an online prompting technique that leverages the on-the-fly LLM as a differentiable surrogate. Our method achieves superior robustness, outperforming distortion-free methods by 37-39% under paraphrasing and 17.2% on average, while maintaining text quality on par with these distortion-free methods in terms of text perplexity and downstream tasks. Our method can be easily generalized to different LLMs. Code is available at https://github.com/KAHIMWONG/E2E_LLM_WM.

Paperid: 116, https://arxiv.org/pdf/2505.01406.pdf GitHub

Abstract:
The rapid rise of video diffusion models has enabled the generation of highly realistic and temporally coherent videos, raising critical concerns about content authenticity, provenance, and misuse. Existing watermarking approaches, whether passive, post-hoc, or adapted from image-based techniques, often struggle to withstand video-specific manipulations such as frame insertion, dropping, or reordering, and typically degrade visual quality. In this work, we introduce VIDSTAMP, a watermarking framework that embeds per-frame or per-segment messages directly into the latent space of temporally-aware video diffusion models. By fine-tuning the model's decoder through a two-stage pipeline, first on static image datasets to promote spatial message separation, and then on synthesized video sequences to restore temporal consistency, VIDSTAMP learns to embed high-capacity, flexible watermarks with minimal perceptual impact. Leveraging architectural components such as 3D convolutions and temporal attention, our method imposes no additional inference cost and offers better perceptual quality than prior methods, while maintaining comparable robustness against common distortions and tampering. VIDSTAMP embeds 768 bits per video (48 bits per frame) with a bit accuracy of 95.0%, achieves a log P-value of -166.65 (lower is better), and maintains a video quality score of 0.836, comparable to unwatermarked outputs (0.838) and surpassing prior methods in capacity-quality tradeoffs. Code: Code: \url{https://github.com/SPIN-UMass/VidStamp}

Paperid: 117, https://arxiv.org/pdf/2504.21730.pdf GitHub

Abstract:
Deep neural networks (DNNs) are vulnerable to backdoor attacks, where an attacker manipulates a small portion of the training data to implant hidden backdoors into the model. The compromised model behaves normally on clean samples but misclassifies backdoored samples into the attacker-specified target class, posing a significant threat to real-world DNN applications. Currently, several empirical defense methods have been proposed to mitigate backdoor attacks, but they are often bypassed by more advanced backdoor techniques. In contrast, certified defenses based on randomized smoothing have shown promise by adding random noise to training and testing samples to counteract backdoor attacks. In this paper, we reveal that existing randomized smoothing defenses implicitly assume that all samples are equidistant from the decision boundary. However, it may not hold in practice, leading to suboptimal certification performance. To address this issue, we propose a sample-specific certified backdoor defense method, termed Cert-SSB. Cert-SSB first employs stochastic gradient ascent to optimize the noise magnitude for each sample, ensuring a sample-specific noise level that is then applied to multiple poisoned training sets to retrain several smoothed models. After that, Cert-SSB aggregates the predictions of multiple smoothed models to generate the final robust prediction. In particular, in this case, existing certification methods become inapplicable since the optimized noise varies across different samples. To conquer this challenge, we introduce a storage-update-based certification method, which dynamically adjusts each sample's certification region to improve certification performance. We conduct extensive experiments on multiple benchmark datasets, demonstrating the effectiveness of our proposed method. Our code is available at https://github.com/NcepuQiaoTing/Cert-SSB.

Paperid: 118, https://arxiv.org/pdf/2504.19876.pdf GitHub

Abstract:
This paper introduces DeeCLIP, a novel framework for detecting AI-generated images using CLIP-ViT and fusion learning. Despite significant advancements in generative models capable of creating highly photorealistic images, existing detection methods often struggle to generalize across different models and are highly sensitive to minor perturbations. To address these challenges, DeeCLIP incorporates DeeFuser, a fusion module that combines high-level and low-level features, improving robustness against degradations such as compression and blurring. Additionally, we apply triplet loss to refine the embedding space, enhancing the model's ability to distinguish between real and synthetic content. To further enable lightweight adaptation while preserving pre-trained knowledge, we adopt parameter-efficient fine-tuning using low-rank adaptation (LoRA) within the CLIP-ViT backbone. This approach supports effective zero-shot learning without sacrificing generalization. Trained exclusively on 4-class ProGAN data, DeeCLIP achieves an average accuracy of 89.00% on 19 test subsets composed of generative adversarial network (GAN) and diffusion models. Despite having fewer trainable parameters, DeeCLIP outperforms existing methods, demonstrating superior robustness against various generative models and real-world distortions. The code is publicly available at https://github.com/Mamadou-Keita/DeeCLIP for research purposes.

Paperid: 119, https://arxiv.org/pdf/2504.19019.pdf GitHub

Abstract:
The challenge of ensuring Large Language Models (LLMs) align with societal standards is of increasing interest, as these models are still prone to adversarial jailbreaks that bypass their safety mechanisms. Identifying these vulnerabilities is crucial for enhancing the robustness of LLMs against such exploits. We propose Graph of ATtacks (GoAT), a method for generating adversarial prompts to test the robustness of LLM alignment using the Graph of Thoughts framework [Besta et al., 2024]. GoAT excels at generating highly effective jailbreak prompts with fewer queries to the victim model than state-of-the-art attacks, achieving up to five times better jailbreak success rate against robust models like Llama. Notably, GoAT creates high-quality, human-readable prompts without requiring access to the targeted model's parameters, making it a black-box attack. Unlike approaches constrained by tree-based reasoning, GoAT's reasoning is based on a more intricate graph structure. By making simultaneous attack paths aware of each other's progress, this dynamic framework allows a deeper integration and refinement of reasoning paths, significantly enhancing the collaborative exploration of adversarial vulnerabilities in LLMs. At a technical level, GoAT starts with a graph structure and iteratively refines it by combining and improving thoughts, enabling synergy between different thought paths. The code for our implementation can be found at: https://github.com/GoAT-pydev/Graph_of_Attacks.

Paperid: 120, https://arxiv.org/pdf/2504.18575.pdf GitHub

Abstract:
Autonomous UI agents powered by AI have tremendous potential to boost human productivity by automating routine tasks such as filing taxes and paying bills. However, a major challenge in unlocking their full potential is security, which is exacerbated by the agent's ability to take action on their user's behalf. Existing tests for prompt injections in web agents either over-simplify the threat by testing unrealistic scenarios or giving the attacker too much power, or look at single-step isolated tasks. To more accurately measure progress for secure web agents, we introduce WASP -- a new publicly available benchmark for end-to-end evaluation of Web Agent Security against Prompt injection attacks. Evaluating with WASP shows that even top-tier AI models, including those with advanced reasoning capabilities, can be deceived by simple, low-effort human-written injections in very realistic scenarios. Our end-to-end evaluation reveals a previously unobserved insight: while attacks partially succeed in up to 86% of the case, even state-of-the-art agents often struggle to fully complete the attacker goals -- highlighting the current state of security by incompetence.

Paperid: 121, https://arxiv.org/pdf/2504.17609.pdf GitHub GitHub

Abstract:
Aiming at the problems of poor quality of steganographic images and slow network convergence of image steganography models based on deep learning, this paper proposes a Steganography Curriculum Learning training strategy (STCL) for deep learning image steganography models. So that only easy images are selected for training when the model has poor fitting ability at the initial stage, and gradually expand to more difficult images, the strategy includes a difficulty evaluation strategy based on the teacher model and an knee point-based training scheduling strategy. Firstly, multiple teacher models are trained, and the consistency of the quality of steganographic images under multiple teacher models is used as the difficulty score to construct the training subsets from easy to difficult. Secondly, a training control strategy based on knee points is proposed to reduce the possibility of overfitting on small training sets and accelerate the training process. Experimental results on three large public datasets, ALASKA2, VOC2012 and ImageNet, show that the proposed image steganography scheme is able to improve the model performance under multiple algorithmic frameworks, which not only has a high PSNR, SSIM score, and decoding accuracy, but also the steganographic images generated by the model under the training of the STCL strategy have a low steganography analysis scores. You can find our code at \href{https://github.com/chaos-boops/STCL}{https://github.com/chaos-boops/STCL}.

Paperid: 122, https://arxiv.org/pdf/2504.17130.pdf GitHub

Abstract:
Large language models (LLMs) have transformed the way we access information. These models are often tuned to refuse to comply with requests that are considered harmful and to produce responses that better align with the preferences of those who control the models. To understand how this "censorship" works. We use representation engineering techniques to study open-weights safety-tuned models. We present a method for finding a refusal--compliance vector that detects and controls the level of censorship in model outputs. We also analyze recent reasoning LLMs, distilled from DeepSeek-R1, and uncover an additional dimension of censorship through "thought suppression". We show a similar approach can be used to find a vector that suppresses the model's reasoning process, allowing us to remove censorship by applying the negative multiples of this vector. Our code is publicly available at: https://github.com/hannahxchen/llm-censorship-steering

Paperid: 123, https://arxiv.org/pdf/2504.16651.pdf GitHub

Abstract:
Recent advances in generative models have led to their application in password guessing, with the aim of replicating the complexity, structure, and patterns of human-created passwords. Despite their potential, inconsistencies and inadequate evaluation methodologies in prior research have hindered meaningful comparisons and a comprehensive, unbiased understanding of their capabilities. This paper introduces MAYA, a unified, customizable, plug-and-play benchmarking framework designed to facilitate the systematic characterization and benchmarking of generative password-guessing models in the context of trawling attacks. Using MAYA, we conduct a comprehensive assessment of six state-of-the-art approaches, which we re-implemented and adapted to ensure standardization. Our evaluation spans eight real-world password datasets and covers an exhaustive set of advanced testing scenarios, totaling over 15,000 compute hours. Our findings indicate that these models effectively capture different aspects of human password distribution and exhibit strong generalization capabilities. However, their effectiveness varies significantly with long and complex passwords. Through our evaluation, sequential models consistently outperform other generative architectures and traditional password-guessing tools, demonstrating unique capabilities in generating accurate and complex guesses. Moreover, the diverse password distributions learned by the models enable a multi-model attack that outperforms the best individual model. By releasing MAYA, we aim to foster further research, providing the community with a new tool to consistently and reliably benchmark generative password-guessing models. Our framework is publicly available at https://github.com/williamcorrias/MAYA-Password-Benchmarking.

Paperid: 124, https://arxiv.org/pdf/2504.16449.pdf GitHub

Abstract:
Malicious URLs persistently threaten the cybersecurity ecosystem, by either deceiving users into divulging private data or distributing harmful payloads to infiltrate host systems. Gaining timely insights into the current state of this ongoing battle holds significant importance. However, existing reviews exhibit 4 critical gaps: 1) Their reliance on algorithm-centric taxonomies obscures understanding of how detection approaches exploit specific modal information channels; 2) They fail to incorporate pivotal LLM/Transformer-based defenses; 3) No open-source implementations are collected to facilitate benchmarking; 4) Insufficient dataset coverage.This paper presents a comprehensive review of malicious URL detection technologies, systematically analyzing methods from traditional blacklisting to advanced deep learning approaches (e.g. Transformer, GNNs, and LLMs). Unlike prior surveys, we propose a novel modality-based taxonomy that categorizes existing works according to their primary data modalities (URL, HTML, Visual, etc.). This hierarchical classification enables both rigorous technical analysis and clear understanding of multimodal information utilization. Furthermore, to establish a profile of accessible datasets and address the lack of standardized benchmarking (where current studies often lack proper baseline comparisons), we curate and analyze: 1) publicly available datasets (2016-2024), and 2) open-source implementations from published works(2013-2025). Then, we outline essential design principles and architectural frameworks for product-level implementations. The review concludes by examining emerging challenges and proposing actionable directions for future research. We maintain a GitHub repository for ongoing curating datasets and open-source implementations: https://github.com/sevenolu7/Malicious-URL-Detection-Open-Source/tree/master.

Paperid: 125, https://arxiv.org/pdf/2504.16438.pdf GitHub

Abstract:
In practical settings, differentially private Federated learning (DP-FL) is the dominant method for training models from private, on-device client data. Recent work has suggested that DP-FL may be enhanced or outperformed by methods that use DP synthetic data (Wu et al., 2024; Hou et al., 2024). The primary algorithms for generating DP synthetic data for FL applications require careful prompt engineering based on public information and/or iterative private client feedback. Our key insight is that the private client feedback collected by prior DP synthetic data methods (Hou et al., 2024; Xie et al., 2024) can be viewed as an RL (reinforcement learning) reward. Our algorithm, Policy Optimization for Private Data (POPri) harnesses client feedback using policy optimization algorithms such as Direct Preference Optimization (DPO) to fine-tune LLMs to generate high-quality DP synthetic data. To evaluate POPri, we release LargeFedBench, a new federated text benchmark for uncontaminated LLM evaluations on federated client data. POPri substantially improves the utility of DP synthetic data relative to prior work on LargeFedBench datasets and an existing benchmark from Xie et al. (2024). POPri closes the gap between next-token prediction accuracy in the fully-private and non-private settings by up to 58%, compared to 28% for prior synthetic data methods, and 3% for state-of-the-art DP federated learning methods. The code and data are available at https://github.com/meiyuw/POPri.

Paperid: 126, https://arxiv.org/pdf/2504.16364.pdf GitHub GitHub

Abstract:
In recent years, a large number of works have introduced Convolutional Neural Networks (CNNs) into image steganography, which transform traditional steganography methods such as hand-crafted features and prior knowledge design into steganography methods that neural networks autonomically learn information embedding. However, due to the inherent complexity of digital images, issues of invisibility and security persist when using CNN models for information embedding. In this paper, we propose Curriculum Learning Progressive Steganophy Network (CLPSTNet). The network consists of multiple progressive multi-scale convolutional modules that integrate Inception structures and dilated convolutions. The module contains multiple branching pathways, starting from a smaller convolutional kernel and dilatation rate, extracting the basic, local feature information from the feature map, and gradually expanding to the convolution with a larger convolutional kernel and dilatation rate for perceiving the feature information of a larger receptive field, so as to realize the multi-scale feature extraction from shallow to deep, and from fine to coarse, allowing the shallow secret information features to be refined in different fusion stages. The experimental results show that the proposed CLPSTNet not only has high PSNR , SSIM metrics and decoding accuracy on three large public datasets, ALASKA2, VOC2012 and ImageNet, but also the steganographic images generated by CLPSTNet have low steganalysis scores.You can find our code at \href{https://github.com/chaos-boops/CLPSTNet}{https://github.com/chaos-boops/CLPSTNet}.

Paperid: 127, https://arxiv.org/pdf/2504.16359.pdf GitHub GitHub

Abstract:
This work introduces \textbf{VideoMark}, a distortion-free robust watermarking framework for video diffusion models. As diffusion models excel in generating realistic videos, reliable content attribution is increasingly critical. However, existing video watermarking methods often introduce distortion by altering the initial distribution of diffusion variables and are vulnerable to temporal attacks, such as frame deletion, due to variable video lengths. VideoMark addresses these challenges by employing a \textbf{pure pseudorandom initialization} to embed watermarks, avoiding distortion while ensuring uniform noise distribution in the latent space to preserve generation quality. To enhance robustness, we adopt a frame-wise watermarking strategy with pseudorandom error correction (PRC) codes, using a fixed watermark sequence with randomly selected starting indices for each video. For watermark extraction, we propose a Temporal Matching Module (TMM) that leverages edit distance to align decoded messages with the original watermark sequence, ensuring resilience against temporal attacks. Experimental results show that VideoMark achieves higher decoding accuracy than existing methods while maintaining video quality comparable to watermark-free generation. The watermark remains imperceptible to attackers without the secret key, offering superior invisibility compared to other frameworks. VideoMark provides a practical, training-free solution for content attribution in diffusion-based video generation. Code and data are available at \href{https://github.com/KYRIE-LI11/VideoMark}{https://github.com/KYRIE-LI11/VideoMark}{Project Page}.

Paperid: 128, https://arxiv.org/pdf/2504.15565.pdf GitHub GitHub

Abstract:
Due to the growing demand for privacy protection, encrypted tunnels have become increasingly popular among mobile app users, which brings new challenges to app fingerprinting (AF)-based network management. Existing methods primarily transfer traditional AF methods to encrypted tunnels directly, ignoring the core obfuscation and re-encapsulation mechanism of encrypted tunnels, thus resulting in unsatisfactory performance. In this paper, we propose DecETT, a dual decouple-based semantic enhancement method for accurate AF under encrypted tunnels. Specifically, DecETT improves AF under encrypted tunnels from two perspectives: app-specific feature enhancement and irrelevant tunnel feature decoupling.Considering the obfuscated app-specific information in encrypted tunnel traffic, DecETT introduces TLS traffic with stronger app-specific information as a semantic anchor to guide and enhance the fingerprint generation for tunnel traffic. Furthermore, to address the app-irrelevant tunnel feature introduced by the re-encapsulation mechanism, DecETT is designed with a dual decouple-based fingerprint enhancement module, which decouples the tunnel feature and app semantic feature from tunnel traffic separately, thereby minimizing the impact of tunnel features on accurate app fingerprint extraction. Evaluation under five prevalent encrypted tunnels indicates that DecETT outperforms state-of-the-art methods in accurate AF under encrypted tunnels, and further demonstrates its superiority under tunnels with more complicated obfuscation. \textit{Project page: \href{https://github.com/DecETT/DecETT}{https://github.com/DecETT/DecETT}}

Paperid: 129, https://arxiv.org/pdf/2504.14064.pdf GitHub

Abstract:
We present DoomArena, a security evaluation framework for AI agents. DoomArena is designed on three principles: 1) It is a plug-in framework and integrates easily into realistic agentic frameworks like BrowserGym (for web agents) and $Ï$-bench (for tool calling agents); 2) It is configurable and allows for detailed threat modeling, allowing configuration of specific components of the agentic framework being attackable, and specifying targets for the attacker; and 3) It is modular and decouples the development of attacks from details of the environment in which the agent is deployed, allowing for the same attacks to be applied across multiple environments. We illustrate several advantages of our framework, including the ability to adapt to new threat models and environments easily, the ability to easily combine several previously published attacks to enable comprehensive and fine-grained security testing, and the ability to analyze trade-offs between various vulnerabilities and performance. We apply DoomArena to state-of-the-art (SOTA) web and tool-calling agents and find a number of surprising results: 1) SOTA agents have varying levels of vulnerability to different threat models (malicious user vs malicious environment), and there is no Pareto dominant agent across all threat models; 2) When multiple attacks are applied to an agent, they often combine constructively; 3) Guardrail model-based defenses seem to fail, while defenses based on powerful SOTA LLMs work better. DoomArena is available at https://github.com/ServiceNow/DoomArena.

Paperid: 130, https://arxiv.org/pdf/2504.14061.pdf GitHub

Abstract:
Differentially private (DP) tabular data synthesis generates artificial data that preserves the statistical properties of private data while safeguarding individual privacy. The emergence of diverse algorithms in recent years has introduced challenges in practical applications, such as inconsistent data processing methods, lack of in-depth algorithm analysis, and incomplete comparisons due to overlapping development timelines. These factors create significant obstacles to selecting appropriate algorithms. In this paper, we address these challenges by proposing a benchmark for evaluating tabular data synthesis methods. We present a unified evaluation framework that integrates data preprocessing, feature selection, and synthesis modules, facilitating fair and comprehensive comparisons. Our evaluation reveals that a significant utility-efficiency trade-off exists among current state-of-the-art methods. Some statistical methods are superior in synthesis utility, but their efficiency is not as good as most machine learning-based methods. Furthermore, we conduct an in-depth analysis of each module with experimental validation, offering theoretical insights into the strengths and limitations of different strategies.

Paperid: 131, https://arxiv.org/pdf/2504.13416.pdf GitHub

Abstract:
Given how large parts of publicly available text are crawled to pretrain large language models (LLMs), data creators increasingly worry about the inclusion of their proprietary data for model training without attribution or licensing. Their concerns are also shared by benchmark curators whose test-sets might be compromised. In this paper, we present STAMP, a framework for detecting dataset membership-i.e., determining the inclusion of a dataset in the pretraining corpora of LLMs. Given an original piece of content, our proposal involves first generating multiple rephrases, each embedding a watermark with a unique secret key. One version is to be released publicly, while others are to be kept private. Subsequently, creators can compare model likelihoods between public and private versions using paired statistical tests to prove membership. We show that our framework can successfully detect contamination across four benchmarks which appear only once in the training data and constitute less than 0.001% of the total tokens, outperforming several contamination detection and dataset inference baselines. We verify that STAMP preserves both the semantic meaning and utility of the original data. We apply STAMP to two real-world scenarios to confirm the inclusion of paper abstracts and blog articles in the pretraining corpora.

Paperid: 132, https://arxiv.org/pdf/2504.13358.pdf GitHub

Abstract:
GraphQL is an open-source data query and manipulation language for web applications, offering a flexible alternative to RESTful APIs. However, its dynamic execution model and lack of built-in security mechanisms expose it to vulnerabilities such as unauthorized data access, denial-of-service (DoS) attacks, and injections. Existing testing tools focus on functional correctness, often overlooking security risks stemming from query interdependencies and execution context. This paper presents GraphQLer, the first context-aware security testing framework for GraphQL APIs. GraphQLer constructs a dependency graph to analyze relationships among mutations, queries, and objects, capturing critical interdependencies. It chains related queries and mutations to reveal authentication and authorization flaws, access control bypasses, and resource misuse. Additionally, GraphQLer tracks internal resource usage to uncover data leakage, privilege escalation, and replay attack vectors. We assess GraphQLer on various GraphQL APIs, demonstrating improved testing coverage - averaging a 35% increase, with up to 84% in some cases - compared to top-performing baselines. Remarkably, this is achieved in less time, making GraphQLer suitable for time-sensitive contexts. GraphQLer also successfully detects a known CVE and potential vulnerabilities in large-scale production APIs. These results underline GraphQLer's utility in proactively securing GraphQL APIs through automated, context-aware vulnerability detection.

Paperid: 133, https://arxiv.org/pdf/2504.13061.pdf GitHub

Abstract:
Text-to-image models based on diffusion processes, such as DALL-E, Stable Diffusion, and Midjourney, are capable of transforming texts into detailed images and have widespread applications in art and design. As such, amateur users can easily imitate professional-level paintings by collecting an artist's work and fine-tuning the model, leading to concerns about artworks' copyright infringement. To tackle these issues, previous studies either add visually imperceptible perturbation to the artwork to change its underlying styles (perturbation-based methods) or embed post-training detectable watermarks in the artwork (watermark-based methods). However, when the artwork or the model has been published online, i.e., modification to the original artwork or model retraining is not feasible, these strategies might not be viable. To this end, we propose a novel method for data-use auditing in the text-to-image generation model. The general idea of ArtistAuditor is to identify if a suspicious model has been finetuned using the artworks of specific artists by analyzing the features related to the style. Concretely, ArtistAuditor employs a style extractor to obtain the multi-granularity style representations and treats artworks as samplings of an artist's style. Then, ArtistAuditor queries a trained discriminator to gain the auditing decisions. The experimental results on six combinations of models and datasets show that ArtistAuditor can achieve high AUC values (> 0.937). By studying ArtistAuditor's transferability and core modules, we provide valuable insights into the practical implementation. Finally, we demonstrate the effectiveness of ArtistAuditor in real-world cases by an online platform Scenario. ArtistAuditor is open-sourced at https://github.com/Jozenn/ArtistAuditor.

Paperid: 134, https://arxiv.org/pdf/2504.12782.pdf GitHub

Abstract:
Ensuring the ethical deployment of text-to-image models requires effective techniques to prevent the generation of harmful or inappropriate content. While concept erasure methods offer a promising solution, existing finetuning-based approaches suffer from notable limitations. Anchor-free methods risk disrupting sampling trajectories, leading to visual artifacts, while anchor-based methods rely on the heuristic selection of anchor concepts. To overcome these shortcomings, we introduce a finetuning framework, dubbed ANT, which Automatically guides deNoising Trajectories to avoid unwanted concepts. ANT is built on a key insight: reversing the condition direction of classifier-free guidance during mid-to-late denoising stages enables precise content modification without sacrificing early-stage structural integrity. This inspires a trajectory-aware objective that preserves the integrity of the early-stage score function field, which steers samples toward the natural image manifold, without relying on heuristic anchor concept selection. For single-concept erasure, we propose an augmentation-enhanced weight saliency map to precisely identify the critical parameters that most significantly contribute to the unwanted concept, enabling more thorough and efficient erasure. For multi-concept erasure, our objective function offers a versatile plug-and-play solution that significantly boosts performance. Extensive experiments demonstrate that ANT achieves state-of-the-art results in both single and multi-concept erasure, delivering high-quality, safe outputs without compromising the generative fidelity. Code is available at https://github.com/lileyang1210/ANT

Paperid: 135, https://arxiv.org/pdf/2504.12623.pdf GitHub GitHub

Abstract:
In this paper, we present the demonstration of training a four-layer neural network entirely using fully homomorphic encryption (FHE), supporting both single-output and multi-output classification tasks in a non-interactive setting. A key contribution of our work is identifying that replacing \textit{Softmax} with \textit{Sigmoid}, in conjunction with the Binary Cross-Entropy (BCE) loss function, provides an effective and scalable solution for homomorphic classification. Moreover, we show that the BCE loss function, originally designed for multi-output tasks, naturally extends to the multi-class setting, thereby enabling broader applicability. We also highlight the limitations of prior loss functions such as the SLE loss and the one proposed in the 2019 CVPR Workshop, both of which suffer from vanishing gradients as network depth increases. To address the challenges posed by large-scale encrypted data, we further introduce an improved version of the previously proposed data encoding scheme, \textit{Double Volley Revolver}, which achieves a better trade-off between computational and memory efficiency, making FHE-based neural network training more practical. The complete, runnable C++ code to implement our work can be found at: \href{https://github.com/petitioner/ML.NNtraining}{$\texttt{https://github.com/petitioner/ML.NNtraining}$}.

Paperid: 136, https://arxiv.org/pdf/2504.12217.pdf GitHub

Abstract:
In the context of cloud computing, services are held on cloud servers, where the clients send their data to the server and obtain the results returned by server. However, the computation, data and results are prone to tampering due to the vulnerabilities on the server side. Thus, verifying the integrity of computation is important in the client-server setting. The cryptographic method known as Zero-Knowledge Proof (ZKP) is renowned for facilitating private and verifiable computing. ZKP allows the client to validate that the results from the server are computed correctly without violating the privacy of the server's intellectual property. Zero-Knowledge Succinct Non-Interactive Argument of Knowledge (zkSNARKs), in particular, has been widely applied in various applications like blockchain and verifiable machine learning. Despite their popularity, existing zkSNARKs approaches remain highly computationally intensive. For instance, even basic operations like matrix multiplication require an extensive number of constraints, resulting in significant overhead. In addressing this challenge, we introduce \textit{zkVC}, which optimizes the ZKP computation for matrix multiplication, enabling rapid proof generation on the server side and efficient verification on the client side. zkVC integrates optimized ZKP modules, such as Constraint-reduced Polynomial Circuit (CRPC) and Prefix-Sum Query (PSQ), collectively yielding a more than 12-fold increase in proof speed over prior methods. The code is available at https://github.com/UCF-Lou-Lab-PET/zkformer

Paperid: 137, https://arxiv.org/pdf/2504.11195.pdf GitHub

Abstract:
Vision-language models (VLMs), such as CLIP, have gained significant popularity as foundation models, with numerous fine-tuning methods developed to enhance performance on downstream tasks. However, due to their inherent vulnerability and the common practice of selecting from a limited set of open-source models, VLMs suffer from a higher risk of adversarial attacks than traditional vision models. Existing defense techniques typically rely on adversarial fine-tuning during training, which requires labeled data and lacks of flexibility for downstream tasks. To address these limitations, we propose robust test-time prompt tuning (R-TPT), which mitigates the impact of adversarial attacks during the inference stage. We first reformulate the classic marginal entropy objective by eliminating the term that introduces conflicts under adversarial conditions, retaining only the pointwise entropy minimization. Furthermore, we introduce a plug-and-play reliability-based weighted ensembling strategy, which aggregates useful information from reliable augmented views to strengthen the defense. R-TPT enhances defense against adversarial attacks without requiring labeled training data while offering high flexibility for inference tasks. Extensive experiments on widely used benchmarks with various attacks demonstrate the effectiveness of R-TPT. The code is available in https://github.com/TomSheng21/R-TPT.

Paperid: 138, https://arxiv.org/pdf/2504.10694.pdf GitHub

Abstract:
Jailbreak attacks bypass the guardrails of large language models to produce harmful outputs. In this paper, we ask whether the model outputs produced by existing jailbreaks are actually useful. For example, when jailbreaking a model to give instructions for building a bomb, does the jailbreak yield good instructions? Since the utility of most unsafe answers (e.g., bomb instructions) is hard to evaluate rigorously, we build new jailbreak evaluation sets with known ground truth answers, by aligning models to refuse questions related to benign and easy-to-evaluate topics (e.g., biology or math). Our evaluation of eight representative jailbreaks across five utility benchmarks reveals a consistent drop in model utility in jailbroken responses, which we term the jailbreak tax. For example, while all jailbreaks we tested bypass guardrails in models aligned to refuse to answer math, this comes at the expense of a drop of up to 92% in accuracy. Overall, our work proposes the jailbreak tax as a new important metric in AI safety, and introduces benchmarks to evaluate existing and future jailbreaks. We make the benchmark available at https://github.com/ethz-spylab/jailbreak-tax

Paperid: 139, https://arxiv.org/pdf/2504.09839.pdf GitHub GitHub

Abstract:
Speech synthesis technology has brought great convenience, while the widespread usage of realistic deepfake audio has triggered hazards. Malicious adversaries may unauthorizedly collect victims' speeches and clone a similar voice for illegal exploitation (\textit{e.g.}, telecom fraud). However, the existing defense methods cannot effectively prevent deepfake exploitation and are vulnerable to robust training techniques. Therefore, a more effective and robust data protection method is urgently needed. In response, we propose a defensive framework, \textit{\textbf{SafeSpeech}}, which protects the users' audio before uploading by embedding imperceptible perturbations on original speeches to prevent high-quality synthetic speech. In SafeSpeech, we devise a robust and universal proactive protection technique, \textbf{S}peech \textbf{PE}rturbative \textbf{C}oncealment (\textbf{SPEC}), that leverages a surrogate model to generate universally applicable perturbation for generative synthetic models. Moreover, we optimize the human perception of embedded perturbation in terms of time and frequency domains. To evaluate our method comprehensively, we conduct extensive experiments across advanced models and datasets, both subjectively and objectively. Our experimental results demonstrate that SafeSpeech achieves state-of-the-art (SOTA) voice protection effectiveness and transferability and is highly robust against advanced adaptive adversaries. Moreover, SafeSpeech has real-time capability in real-world tests. The source code is available at \href{https://github.com/wxzyd123/SafeSpeech}{https://github.com/wxzyd123/SafeSpeech}.

Paperid: 140, https://arxiv.org/pdf/2504.09757.pdf GitHub

Abstract:
Large language models (LLMs) have demonstrated revolutionary capabilities in understanding complex contexts and performing a wide range of tasks. However, LLMs can also answer questions that are unethical or harmful, raising concerns about their applications. To regulate LLMs' responses to such questions, a training strategy called \textit{alignment} can help. Yet, alignment can be unexpectedly compromised when fine-tuning an LLM for downstream tasks. This paper focuses on recovering the alignment lost during fine-tuning. We observe that there are two distinct directions inherent in an aligned LLM: the \textit{aligned direction} and the \textit{harmful direction}. An LLM is inclined to answer questions in the aligned direction while refusing queries in the harmful direction. Therefore, we propose to recover the harmful direction of the fine-tuned model that has been compromised. Specifically, we restore a small subset of the fine-tuned model's weight parameters from the original aligned model using gradient descent. We also introduce a rollback mechanism to avoid aggressive recovery and maintain downstream task performance. Our evaluation on 125 fine-tuned LLMs demonstrates that our method can reduce their harmful rate (percentage of answering harmful questions) from 33.25\% to 1.74\%, without sacrificing task performance much. In contrast, the existing methods either only reduce the harmful rate to a limited extent or significantly impact the normal functionality. Our code is available at https://github.com/kangyangWHU/LLMAlignment

Paperid: 141, https://arxiv.org/pdf/2504.09026.pdf GitHub

Abstract:
Instruction fine-tuning attacks pose a significant threat to large language models (LLMs) by subtly embedding poisoned data in fine-tuning datasets, which can trigger harmful or unintended responses across a range of tasks. This undermines model alignment and poses security risks in real-world deployment. In this work, we present a simple and effective approach to detect and mitigate such attacks using influence functions, a classical statistical tool adapted for machine learning interpretation. Traditionally, the high computational costs of influence functions have limited their application to large models and datasets. The recent Eigenvalue-Corrected Kronecker-Factored Approximate Curvature (EK-FAC) approximation method enables efficient influence score computation, making it feasible for large-scale analysis. We are the first to apply influence functions for detecting language model instruction fine-tuning attacks on large-scale datasets, as both the instruction fine-tuning attack on language models and the influence calculation approximation technique are relatively new. Our large-scale empirical evaluation of influence functions on 50,000 fine-tuning examples and 32 tasks reveals a strong association between influence scores and sentiment. Building on this, we introduce a novel sentiment transformation combined with influence functions to detect and remove critical poisons -- poisoned data points that skew model predictions. Removing these poisons (only 1% of total data) recovers model performance to near-clean levels, demonstrating the effectiveness and efficiency of our approach. Artifact is available at https://github.com/lijiawei20161002/Poison-Detection. WARNING: This paper contains offensive data examples.

Paperid: 142, https://arxiv.org/pdf/2504.08782.pdf GitHub

Abstract:
We introduce a new attack paradigm that embeds hidden adversarial capabilities directly into diffusion models via fine-tuning, without altering their observable behavior or requiring modifications during inference. Unlike prior approaches that target specific images or adjust the generation process to produce adversarial outputs, our method integrates adversarial functionality into the model itself. The resulting tampered model generates high-quality images indistinguishable from those of the original, yet these images cause misclassification in downstream classifiers at a high rate. The misclassification can be targeted to specific output classes. Users can employ this compromised model unaware of its embedded adversarial nature, as it functions identically to a standard diffusion model. We demonstrate the effectiveness and stealthiness of our approach, uncovering a covert attack vector that raises new security concerns. These findings expose a risk arising from the use of externally-supplied models and highlight the urgent need for robust model verification and defense mechanisms against hidden threats in generative models. The code is available at https://github.com/LucasBeerens/CRAFTed-Diffusion .

Paperid: 143, https://arxiv.org/pdf/2504.08198.pdf GitHub

Abstract:
Federated Learning (FL) has been introduced as a way to keep data local to clients while training a shared machine learning model, as clients train on their local data and send trained models to a central aggregator. It is expected that FL will have a huge implication on Mobile Edge Computing, the Internet of Things, and Cross-Silo FL. In this paper, we focus on the widely used FedAvg algorithm to explore the effect of the number of clients in FL. We find a significant deterioration of learning accuracy for FedAvg as the number of clients increases. To address this issue for a general application, we propose a method called Knowledgeable Client Insertion (KCI) that introduces a very small number of knowledgeable clients to the MEC setting. These knowledgeable clients are expected to have accumulated a large set of data samples to help with training. With the help of KCI, the learning accuracy of FL increases much faster even with a normal FedAvg aggregation technique. We expect this approach to be able to provide great privacy protection for clients against security attacks such as model inversion attacks. Our code is available at https://github.com/Eleanor-W/KCI_for_FL.

Paperid: 144, https://arxiv.org/pdf/2504.07002.pdf GitHub

Abstract:
Watermarking is a technique to help identify the source of data points, which can be used to help prevent the misuse of protected datasets. Existing methods on code watermarking, leveraging the idea from the backdoor research, embed stealthy triggers as watermarks. Despite their high resilience against dilution attacks and backdoor detections, the robustness has not been fully evaluated. To fill this gap, we propose DeCoMa, a dual-channel approach to Detect and purify Code dataset waterMarks. To overcome the high barrier created by the stealthy and hidden nature of code watermarks, DeCoMa leverages dual-channel constraints on code to generalize and map code samples into standardized templates. Subsequently, DeCoMa extracts hidden watermarks by identifying outlier associations between paired elements within the standardized templates. Finally, DeCoMa purifies the watermarked dataset by removing all samples containing the detected watermark, enabling the silent appropriation of protected code. We conduct extensive experiments to evaluate the effectiveness and efficiency of DeCoMa, covering 14 types of code watermarks and 3 representative intelligent code tasks (a total of 14 scenarios). Experimental results demonstrate that DeCoMa achieves a stable recall of 100% in 14 code watermark detection scenarios, significantly outperforming the baselines. Additionally, DeCoMa effectively attacks code watermarks with embedding rates as low as 0.1%, while maintaining comparable model performance after training on the purified dataset. Furthermore, as DeCoMa requires no model training for detection, it achieves substantially higher efficiency than all baselines, with a speedup ranging from 31.5 to 130.9X. The results call for more advanced watermarking techniques for code models, while DeCoMa can serve as a baseline for future evaluation. Code is available at https://github.com/xiaoyuanpigo/DeCoMa

Paperid: 145, https://arxiv.org/pdf/2504.06575.pdf GitHub

Abstract:
Watermarking has emerged as a promising technique for detecting texts generated by LLMs. Current research has primarily focused on three design criteria: high quality of the watermarked text, high detectability, and robustness against removal attack. However, the security against spoofing attacks remains relatively understudied. For example, a piggyback attack can maliciously alter the meaning of watermarked text-transforming it into hate speech-while preserving the original watermark, thereby damaging the reputation of the LLM provider. We identify two core challenges that make defending against spoofing difficult: (1) the need for watermarks to be both sensitive to semantic-distorting changes and insensitive to semantic-preserving edits, and (2) the contradiction between the need to detect global semantic shifts and the local, auto-regressive nature of most watermarking schemes. To address these challenges, we propose a semantic-aware watermarking algorithm that post-hoc embeds watermarks into a given target text while preserving its original meaning. Our method introduces a semantic mapping model, which guides the generation of a green-red token list, contrastively trained to be sensitive to semantic-distorting changes and insensitive to semantic-preserving changes. Experiments on two standard benchmarks demonstrate strong robustness against removal attacks and security against spoofing attacks, including sentiment reversal and toxic content insertion, while maintaining high watermark detectability. Our approach offers a significant step toward more secure and semantically aware watermarking for LLMs. Our code is available at https://github.com/UCSB-NLP-Chang/contrastive-watermark.

Paperid: 146, https://arxiv.org/pdf/2504.06417.pdf GitHub

Abstract:
The increasing availability of drones and their potential for malicious activities pose significant privacy and security risks, necessitating fast and reliable detection in real-world environments. However, existing drone detection systems often struggle in real-world settings due to environmental noise and sensor limitations. This paper introduces TRIDENT, a tri-modal drone detection framework that integrates synchronized audio, visual, and RF data to enhance robustness and reduce dependence on individual sensors. TRIDENT introduces two fusion strategies - Late Fusion and GMU Fusion - to improve multi-modal integration while maintaining efficiency. The framework incorporates domain-specific feature extraction techniques alongside a specialized data augmentation pipeline that simulates real-world sensor degradation to improve generalization capabilities. A diverse multi-sensor dataset is collected in urban and non-urban environments under varying lighting conditions, ensuring comprehensive evaluation. Experimental results show that TRIDENT achieves 98.8 percent accuracy in real-world recordings and 83.26 percent in a more complex setting (augmented data), outperforming unimodal and dual-modal baselines. Moreover, TRIDENT operates in real-time, detecting drones in just 6.09 ms while consuming only 75.27 mJ per detection, making it highly efficient for resource-constrained devices. The dataset and code have been released to ensure reproducibility (https://github.com/TRIDENT-2025/TRIDENT).

Paperid: 147, https://arxiv.org/pdf/2504.06410.pdf GitHub GitHub

Abstract:
This paper explores inference-time data leakage risks of deep neural networks (NNs), where a curious and honest model service provider is interested in retrieving users' private data inputs solely based on the model inference results. Particularly, we revisit residual NNs due to their popularity in computer vision and our hypothesis that residual blocks are a primary cause of data leakage owing to the use of skip connections. By formulating inference-time data leakage as a constrained optimization problem, we propose a novel backward feature inversion method, \textbf{PEEL}, which can effectively recover block-wise input features from the intermediate output of residual NNs. The surprising results in high-quality input data recovery can be explained by the intuition that the output from these residual blocks can be considered as a noisy version of the input and thus the output retains sufficient information for input recovery. We demonstrate the effectiveness of our layer-by-layer feature inversion method on facial image datasets and pre-trained classifiers. Our results show that PEEL outperforms the state-of-the-art recovery methods by an order of magnitude when evaluated by mean squared error (MSE). The code is available at \href{https://github.com/Huzaifa-Arif/PEEL}{https://github.com/Huzaifa-Arif/PEEL}

Paperid: 148, https://arxiv.org/pdf/2504.05838.pdf GitHub

Abstract:
Recently, the Image Prompt Adapter (IP-Adapter) has been increasingly integrated into text-to-image diffusion models (T2I-DMs) to improve controllability. However, in this paper, we reveal that T2I-DMs equipped with the IP-Adapter (T2I-IP-DMs) enable a new jailbreak attack named the hijacking attack. We demonstrate that, by uploading imperceptible image-space adversarial examples (AEs), the adversary can hijack massive benign users to jailbreak an Image Generation Service (IGS) driven by T2I-IP-DMs and mislead the public to discredit the service provider. Worse still, the IP-Adapter's dependency on open-source image encoders reduces the knowledge required to craft AEs. Extensive experiments verify the technical feasibility of the hijacking attack. In light of the revealed threat, we investigate several existing defenses and explore combining the IP-Adapter with adversarially trained models to overcome existing defenses' limitations. Our code is available at https://github.com/fhdnskfbeuv/attackIPA.

Paperid: 149, https://arxiv.org/pdf/2504.04715.pdf GitHub

Abstract:
The proliferation of Large Language Models (LLMs) accessed via black-box APIs introduces a significant trust challenge: users pay for services based on advertised model capabilities (e.g., size, performance), but providers may covertly substitute the specified model with a cheaper, lower-quality alternative to reduce operational costs. This lack of transparency undermines fairness, erodes trust, and complicates reliable benchmarking. Detecting such substitutions is difficult due to the black-box nature, typically limiting interaction to input-output queries. This paper formalizes the problem of model substitution detection in LLM APIs. We systematically evaluate existing verification techniques, including output-based statistical tests, benchmark evaluations, and log probability analysis, under various realistic attack scenarios like model quantization, randomized substitution, and benchmark evasion. Our findings reveal the limitations of methods relying solely on text outputs, especially against subtle or adaptive attacks. While log probability analysis offers stronger guarantees when available, its accessibility is often limited. We conclude by discussing the potential of hardware-based solutions like Trusted Execution Environments (TEEs) as a pathway towards provable model integrity, highlighting the trade-offs between security, performance, and provider adoption. Code is available at https://github.com/sunblaze-ucb/llm-api-audit

Paperid: 150, https://arxiv.org/pdf/2504.03850.pdf GitHub GitHub

Abstract:
Tree-Ring Watermarking is a significant technique for authenticating AI-generated images. However, its effectiveness in rectified flow-based models remains unexplored, particularly given the inherent challenges of these models with noise latent inversion. Through extensive experimentation, we evaluated and compared the detection and separability of watermarks between SD 2.1 and FLUX.1-dev models. By analyzing various text guidance configurations and augmentation attacks, we demonstrate how inversion limitations affect both watermark recovery and the statistical separation between watermarked and unwatermarked images. Our findings provide valuable insights into the current limitations of Tree-Ring Watermarking in the current SOTA models and highlight the critical need for improved inversion methods to achieve reliable watermark detection and separability. The official implementation, dataset release and all experimental results are available at this \href{https://github.com/dsgiitr/flux-watermarking}{\textbf{link}}.

Paperid: 151, https://arxiv.org/pdf/2504.03767.pdf GitHub

Abstract:
To reduce development overhead and enable seamless integration between potential components comprising any given generative AI application, the Model Context Protocol (MCP) (Anthropic, 2024) has recently been released and subsequently widely adopted. The MCP is an open protocol that standardizes API calls to large language models (LLMs), data sources, and agentic tools. By connecting multiple MCP servers, each defined with a set of tools, resources, and prompts, users are able to define automated workflows fully driven by LLMs. However, we show that the current MCP design carries a wide range of security risks for end users. In particular, we demonstrate that industry-leading LLMs may be coerced into using MCP tools to compromise an AI developer's system through various attacks, such as malicious code execution, remote access control, and credential theft. To proactively mitigate these and related attacks, we introduce a safety auditing tool, MCPSafetyScanner, the first agentic tool to assess the security of an arbitrary MCP server. MCPScanner uses several agents to (a) automatically determine adversarial samples given an MCP server's tools and resources; (b) search for related vulnerabilities and remediations based on those samples; and (c) generate a security report detailing all findings. Our work highlights serious security issues with general-purpose agentic workflows while also providing a proactive tool to audit MCP server safety and address detected vulnerabilities before deployment. The described MCP server auditing tool, MCPSafetyScanner, is freely available at: https://github.com/johnhalloran321/mcpSafetyScanner

Paperid: 152, https://arxiv.org/pdf/2504.02431.pdf GitHub

Abstract:
System operators responsible for protecting software applications remain hesitant to implement cyber deception technology, including methods that place traps to catch attackers, despite its proven benefits. Overcoming their concerns removes a barrier that currently hinders industry adoption of deception technology. Our work introduces deception policy documents to describe deception technology "as code" and pairs them with Koney, a Kubernetes operator, which facilitates the setup, rotation, monitoring, and removal of traps in Kubernetes. We leverage cloud-native technologies, such as service meshes and eBPF, to automatically add traps to containerized software applications, without having access to the source code. We focus specifically on operational properties, such as maintainability, scalability, and simplicity, which we consider essential to accelerate the adoption of cyber deception technology and to facilitate further research on cyber deception.

Paperid: 153, https://arxiv.org/pdf/2504.01395.pdf GitHub

Abstract:
Differentially private (DP) image synthesis aims to generate synthetic images from a sensitive dataset, alleviating the privacy leakage concerns of organizations sharing and utilizing synthetic images. Although previous methods have significantly progressed, especially in training diffusion models on sensitive images with DP Stochastic Gradient Descent (DP-SGD), they still suffer from unsatisfactory performance. In this work, inspired by curriculum learning, we propose a two-stage DP image synthesis framework, where diffusion models learn to generate DP synthetic images from easy to hard. Unlike existing methods that directly use DP-SGD to train diffusion models, we propose an easy stage in the beginning, where diffusion models learn simple features of the sensitive images. To facilitate this easy stage, we propose to use `central images', simply aggregations of random samples of the sensitive dataset. Intuitively, although those central images do not show details, they demonstrate useful characteristics of all images and only incur minimal privacy costs, thus helping early-phase model training. We conduct experiments to present that on the average of four investigated image datasets, the fidelity and utility metrics of our synthetic images are 33.1% and 2.1% better than the state-of-the-art method.

Paperid: 154, https://arxiv.org/pdf/2504.01380.pdf GitHub

Abstract:
High-performance security guarantees rely on hardware support. Generic programmable support for fine-grained instruction analysis has gained broad interest in the literature as a fundamental building block for the security of future processors. Yet, implementation in real out-of-order (OoO) superscalar processors presents tough challenges that cannot be explored in highly abstract simulators. We detail the challenges of implementing complex programmable pathways without critical paths or contention. We then introduce FireGuard, the first implementation of fine-grained instruction analysis on a real OoO superscalar processor. We establish an end-to-end system, including microarchitecture, SoC, ISA and programming model. Experiments show that our solution simultaneously ensures both security and performance of the system, with parallel scalability. We examine the feasibility of building FireGuard into modern SoCs: Apple's M1-Pro, Huawei's Kirin-960, and Intel's i7-12700F, where less than 1% silicon area is introduced. The Repo. of FireGuard's source code: https://github.com/SEU-ACAL/reproduce-FireGuard-DAC-25.

Paperid: 155, https://arxiv.org/pdf/2504.00041.pdf GitHub

Abstract:
In recent years, the rise of cyber threats has emphasized the need for robust malware detection systems, especially on mobile devices. Malware, which targets vulnerabilities in devices and user data, represents a substantial security risk. A significant challenge in malware detection is the imbalance in datasets, where most applications are benign, with only a small fraction posing a threat. This study addresses the often-overlooked issue of class imbalance in malware detection by evaluating various machine learning strategies for detecting malware in Android applications. We assess monolithic classifiers and ensemble methods, focusing on dynamic selection algorithms, which have shown superior performance compared to traditional approaches. In contrast to balancing strategies performed on the whole dataset, we propose a balancing procedure that works individually for each classifier in the pool. Our empirical analysis demonstrates that the KNOP algorithm obtained the best results using a pool of Random Forest. Additionally, an instance hardness assessment revealed that balancing reduces the difficulty of the minority class and enhances the detection of the minority class (malware). The code used for the experiments is available at https://github.com/jvss2/Machine-Learning-Empirical-Evaluation.

Paperid: 156, https://arxiv.org/pdf/2503.22759.pdf GitHub

Abstract:
Deep learning has become a cornerstone of modern artificial intelligence, enabling transformative applications across a wide range of domains. As the core element of deep learning, the quality and security of training data critically influence model performance and reliability. However, during the training process, deep learning models face the significant threat of data poisoning, where attackers introduce maliciously manipulated training data to degrade model accuracy or lead to anomalous behavior. While existing surveys provide valuable insights into data poisoning, they generally adopt a broad perspective, encompassing both attacks and defenses, but lack a dedicated, in-depth analysis of poisoning attacks specifically in deep learning. In this survey, we bridge this gap by presenting a comprehensive and targeted review of data poisoning in deep learning. First, this survey categorizes data poisoning attacks across multiple perspectives, providing an in-depth analysis of their characteristics and underlying design princinples. Second, the discussion is extended to the emerging area of data poisoning in large language models(LLMs). Finally, we explore critical open challenges in the field and propose potential research directions to advance the field further. To support further exploration, an up-to-date repository of resources on data poisoning in deep learning is available at https://github.com/Pinlong-Zhao/Data-Poisoning.

Paperid: 157, https://arxiv.org/pdf/2503.21824.pdf GitHub

Abstract:
Recently, video-based large language models (video-based LLMs) have achieved impressive performance across various video comprehension tasks. However, this rapid advancement raises significant privacy and security concerns, particularly regarding the unauthorized use of personal video data in automated annotation by video-based LLMs. These unauthorized annotated video-text pairs can then be used to improve the performance of downstream tasks, such as text-to-video generation. To safeguard personal videos from unauthorized use, we propose two series of protective video watermarks with imperceptible adversarial perturbations, named Ramblings and Mutes. Concretely, Ramblings aim to mislead video-based LLMs into generating inaccurate captions for the videos, thereby degrading the quality of video annotations through inconsistencies between video content and captions. Mutes, on the other hand, are designed to prompt video-based LLMs to produce exceptionally brief captions, lacking descriptive detail. Extensive experiments demonstrate that our video watermarking methods effectively protect video data by significantly reducing video annotation performance across various video-based LLMs, showcasing both stealthiness and robustness in protecting personal video content. Our code is available at https://github.com/ttthhl/Protecting_Your_Video_Content.

Paperid: 158, https://arxiv.org/pdf/2503.21496.pdf GitHub

Abstract:
The rapid development of network technologies and industrial intelligence has augmented the connectivity and intelligence within the automotive industry. Notably, in the Internet of Vehicles (IoV), the Controller Area Network (CAN), which is crucial for the communication of electronic control units but lacks inbuilt security measures, has become extremely vulnerable to severe cybersecurity threats. Meanwhile, the efficacy of Intrusion Detection Systems (IDS) is hampered by the scarcity of sufficient attack data for robust model training. To overcome this limitation, we introduce a novel methodology leveraging the Restricted Boltzmann Machine (RBM) to generate synthetic CAN attack data, thereby producing training datasets with a more balanced sample distribution. Specifically, we design a CAN Data Processing Module for transforming raw CAN data into an RBM-trainable format, and a Negative Sample Generation Module to generate data reflecting the distribution of CAN data frames denoting network intrusions. Experimental results show the generated data significantly improves IDS performance, with CANet accuracy rising from 0.6477 to 0.9725 and EfficientNet from 0.1067 to 0.1555. Code is available at https://github.com/wangkai-tech23/CANDataSynthetic.

Paperid: 159, https://arxiv.org/pdf/2503.20823.pdf GitHub

Abstract:
Despite the remarkable versatility of Large Language Models (LLMs) and Multimodal LLMs (MLLMs) to generalize across both language and vision tasks, LLMs and MLLMs have shown vulnerability to jailbreaking, generating textual outputs that undermine safety, ethical, and bias standards when exposed to harmful or sensitive inputs. With the recent advancement of safety alignment via preference-tuning from human feedback, LLMs and MLLMs have been equipped with safety guardrails to yield safe, ethical, and fair responses with regard to harmful inputs. However, despite the significance of safety alignment, research on the vulnerabilities remains largely underexplored. In this paper, we investigate the unexplored vulnerability of the safety alignment, examining its ability to consistently provide safety guarantees for out-of-distribution(OOD)-ifying harmful inputs that may fall outside the aligned data distribution. Our key observation is that OOD-ifying the vanilla harmful inputs highly increases the uncertainty of the model to discern the malicious intent within the input, leading to a higher chance of being jailbroken. Exploiting this vulnerability, we propose JOOD, a new Jailbreak framework via OOD-ifying inputs beyond the safety alignment. We explore various off-the-shelf visual and textual transformation techniques for OOD-ifying the harmful inputs. Notably, we observe that even simple mixing-based techniques such as image mixup prove highly effective in increasing the uncertainty of the model, thereby facilitating the bypass of the safety alignment. Experiments across diverse jailbreak scenarios demonstrate that JOOD effectively jailbreaks recent proprietary LLMs and MLLMs such as GPT-4 and o1 with high attack success rate, which previous attack approaches have consistently struggled to jailbreak. Code is available at https://github.com/naver-ai/JOOD.

Paperid: 160, https://arxiv.org/pdf/2503.20802.pdf GitHub

Abstract:
Text watermarking provides an effective solution for identifying synthetic text generated by large language models. However, existing techniques often focus on satisfying specific criteria while ignoring other key aspects, lacking a unified evaluation. To fill this gap, we propose the Comprehensive Evaluation Framework for Watermark (CEFW), a unified framework that comprehensively evaluates watermarking methods across five key dimensions: ease of detection, fidelity of text quality, minimal embedding cost, robustness to adversarial attacks, and imperceptibility to prevent imitation or forgery. By assessing watermarks according to all these key criteria, CEFW offers a thorough evaluation of their practicality and effectiveness. Moreover, we introduce a simple and effective watermarking method called Balanced Watermark (BW), which guarantees robustness and imperceptibility through balancing the way watermark information is added. Extensive experiments show that BW outperforms existing methods in overall performance across all evaluation dimensions. We release our code to the community for future research. https://github.com/DrankXs/BalancedWatermark.

Paperid: 161, https://arxiv.org/pdf/2503.19402.pdf GitHub

Abstract:
Network applications are routinely under attack. We consider the problem of developing an effective and efficient fuzzer for the recently ratified QUIC network protocol to uncover security vulnerabilities. QUIC offers a unified transport layer for low latency, reliable transport streams that is inherently secure, ultimately representing a complex protocol design characterised by new features and capabilities for the Internet. Fuzzing a secure transport layer protocol is not trivial. The interactive, strict, rule-based, asynchronous nature of communications with a target, the stateful nature of interactions, security mechanisms to protect communications (such as integrity checks and encryption), and inherent overheads (such as target initialisation) challenge generic network protocol fuzzers. We discuss and address the challenges pertinent to fuzzing transport layer protocols (like QUIC), developing mechanisms that enable fast, effective fuzz testing of QUIC implementations to build a prototype grey-box mutation-based fuzzer; QUIC-Fuzz. We test 6, well-maintained server-side implementations, including from Google and Alibaba with QUIC-Fuzz. The results demonstrate the fuzzer is both highly effective and generalisable. Our testing uncovered 10 new security vulnerabilities, precipitating 2 CVE assignments thus far. In code coverage, QUIC-Fuzz outperforms other existing state-of-the-art network protocol fuzzers (Fuzztruction-Net, ChatAFL, and ALFNet) with up to an 84% increase in code coverage where QUIC-Fuzz outperformed statistically significantly across all targets and with a majority of bugs only discoverable by QUIC-Fuzz. We open-source QUIC-Fuzz on GitHub.

Paperid: 162, https://arxiv.org/pdf/2503.19176.pdf GitHub

Abstract:
Audio watermarking is increasingly used to verify the provenance of AI-generated content, enabling applications such as detecting AI-generated speech, protecting music IP, and defending against voice cloning. To be effective, audio watermarks must resist removal attacks that distort signals to evade detection. While many schemes claim robustness, these claims are typically tested in isolation and against a limited set of attacks. A systematic evaluation against diverse removal attacks is lacking, hindering practical deployment. In this paper, we investigate whether recent watermarking schemes that claim robustness can withstand a broad range of removal attacks. First, we introduce a taxonomy covering 22 audio watermarking schemes. Next, we summarize their underlying technologies and potential vulnerabilities. We then present a large-scale empirical study to assess their robustness. To support this, we build an evaluation framework encompassing 22 types of removal attacks (109 configurations) including signal-level, physical-level, and AI-induced distortions. We reproduce 9 watermarking schemes using open-source code, identify 8 new highly effective attacks, and highlight 11 key findings that expose the fundamental limitations of these methods across 3 public datasets. Our results reveal that none of the surveyed schemes can withstand all tested distortions. This evaluation offers a comprehensive view of how current watermarking methods perform under real-world threats. Our demo and code are available at https://sokaudiowm.github.io/.

Paperid: 163, https://arxiv.org/pdf/2503.18813.pdf GitHub

Abstract:
Large Language Models (LLMs) are increasingly deployed in agentic systems that interact with an untrusted environment. However, LLM agents are vulnerable to prompt injection attacks when handling untrusted data. In this paper we propose CaMeL, a robust defense that creates a protective system layer around the LLM, securing it even when underlying models are susceptible to attacks. To operate, CaMeL explicitly extracts the control and data flows from the (trusted) query; therefore, the untrusted data retrieved by the LLM can never impact the program flow. To further improve security, CaMeL uses a notion of a capability to prevent the exfiltration of private data over unauthorized data flows by enforcing security policies when tools are called. We demonstrate effectiveness of CaMeL by solving $77\%$ of tasks with provable security (compared to $84\%$ with an undefended system) in AgentDojo. We release CaMeL at https://github.com/google-research/camel-prompt-injection.

Paperid: 164, https://arxiv.org/pdf/2503.17198.pdf GitHub

Abstract:
Non-transferable learning (NTL) has been proposed to protect model intellectual property (IP) by creating a "non-transferable barrier" to restrict generalization from authorized to unauthorized domains. Recently, well-designed attack, which restores the unauthorized-domain performance by fine-tuning NTL models on few authorized samples, highlights the security risks of NTL-based applications. However, such attack requires modifying model weights, thus being invalid in the black-box scenario. This raises a critical question: can we trust the security of NTL models deployed as black-box systems? In this work, we reveal the first loophole of black-box NTL models by proposing a novel attack method (dubbed as JailNTL) to jailbreak the non-transferable barrier through test-time data disguising. The main idea of JailNTL is to disguise unauthorized data so it can be identified as authorized by the NTL model, thereby bypassing the non-transferable barrier without modifying the NTL model weights. Specifically, JailNTL encourages unauthorized-domain disguising in two levels, including: (i) data-intrinsic disguising (DID) for eliminating domain discrepancy and preserving class-related content at the input-level, and (ii) model-guided disguising (MGD) for mitigating output-level statistics difference of the NTL model. Empirically, when attacking state-of-the-art (SOTA) NTL models in the black-box scenario, JailNTL achieves an accuracy increase of up to 55.7% in the unauthorized domain by using only 1% authorized samples, largely exceeding existing SOTA white-box attacks.

Paperid: 165, https://arxiv.org/pdf/2503.15404.pdf GitHub

Abstract:
Vision Transformers (ViTs) have been widely applied in various computer vision and vision-language tasks. To gain insights into their robustness in practical scenarios, transferable adversarial examples on ViTs have been extensively studied. A typical approach to improving adversarial transferability is by refining the surrogate model. However, existing work on ViTs has restricted their surrogate refinement to backward propagation. In this work, we instead focus on Forward Propagation Refinement (FPR) and specifically refine two key modules of ViTs: attention maps and token embeddings. For attention maps, we propose Attention Map Diversification (AMD), which diversifies certain attention maps and also implicitly imposes beneficial gradient vanishing during backward propagation. For token embeddings, we propose Momentum Token Embedding (MTE), which accumulates historical token embeddings to stabilize the forward updates in both the Attention and MLP blocks. We conduct extensive experiments with adversarial examples transferred from ViTs to various CNNs and ViTs, demonstrating that our FPR outperforms the current best (backward) surrogate refinement by up to 7.0\% on average. We also validate its superiority against popular defenses and its compatibility with other transfer methods. Codes and appendix are available at https://github.com/RYC-98/FPR.

Paperid: 166, https://arxiv.org/pdf/2503.15092.pdf GitHub

Abstract:
This study presents the first comprehensive safety evaluation of the DeepSeek models, focusing on evaluating the safety risks associated with their generated content. Our evaluation encompasses DeepSeek's latest generation of large language models, multimodal large language models, and text-to-image models, systematically examining their performance regarding unsafe content generation. Notably, we developed a bilingual (Chinese-English) safety evaluation dataset tailored to Chinese sociocultural contexts, enabling a more thorough evaluation of the safety capabilities of Chinese-developed models. Experimental results indicate that despite their strong general capabilities, DeepSeek models exhibit significant safety vulnerabilities across multiple risk dimensions, including algorithmic discrimination and sexual content. These findings provide crucial insights for understanding and improving the safety of large foundation models. Our code is available at https://github.com/NY1024/DeepSeek-Safety-Eval.

Paperid: 167, https://arxiv.org/pdf/2503.14827.pdf GitHub

Abstract:
Multimodal foundation models (MMFMs) play a crucial role in various applications, including autonomous driving, healthcare, and virtual assistants. However, several studies have revealed vulnerabilities in these models, such as generating unsafe content by text-to-image models. Existing benchmarks on multimodal models either predominantly assess the helpfulness of these models, or only focus on limited perspectives such as fairness and privacy. In this paper, we present the first unified platform, MMDT (Multimodal DecodingTrust), designed to provide a comprehensive safety and trustworthiness evaluation for MMFMs. Our platform assesses models from multiple perspectives, including safety, hallucination, fairness/bias, privacy, adversarial robustness, and out-of-distribution (OOD) generalization. We have designed various evaluation scenarios and red teaming algorithms under different tasks for each perspective to generate challenging data, forming a high-quality benchmark. We evaluate a range of multimodal models using MMDT, and our findings reveal a series of vulnerabilities and areas for improvement across these perspectives. This work introduces the first comprehensive and unique safety and trustworthiness evaluation platform for MMFMs, paving the way for developing safer and more reliable MMFMs and systems. Our platform and benchmark are available at https://mmdecodingtrust.github.io/.

Paperid: 168, https://arxiv.org/pdf/2503.14681.pdf GitHub

Abstract:
Differentially private (DP) image synthesis aims to generate artificial images that retain the properties of sensitive images while protecting the privacy of individual images within the dataset. Despite recent advancements, we find that inconsistent--and sometimes flawed--evaluation protocols have been applied across studies. This not only impedes the understanding of current methods but also hinders future advancements. To address the issue, this paper introduces DPImageBench for DP image synthesis, with thoughtful design across several dimensions: (1) Methods. We study eleven prominent methods and systematically characterize each based on model architecture, pretraining strategy, and privacy mechanism. (2) Evaluation. We include nine datasets and seven fidelity and utility metrics to thoroughly assess them. Notably, we find that a common practice of selecting downstream classifiers based on the highest accuracy on the sensitive test set not only violates DP but also overestimates the utility scores. DPImageBench corrects for these mistakes. (3) Platform. Despite the methods and evaluation protocols, DPImageBench provides a standardized interface that accommodates current and future implementations within a unified framework. With DPImageBench, we have several noteworthy findings. For example, contrary to the common wisdom that pretraining on public image datasets is usually beneficial, we find that the distributional similarity between pretraining and sensitive images significantly impacts the performance of the synthetic images and does not always yield improvements. In addition, adding noise to low-dimensional features, such as the high-level characteristics of sensitive images, is less affected by the privacy budget compared to adding noise to high-dimensional features, like weight gradients. The former methods perform better than the latter under a low privacy budget.

Paperid: 169, https://arxiv.org/pdf/2503.12368.pdf GitHub

Abstract:
Image steganography is an information-hiding technique that involves the surreptitious concealment of covert informational content within digital images. In this paper, we introduce ${\rm SCR{\small EED}S{\small OLO}}$, a novel framework for concealing arbitrary binary data within images. Our approach synergistically leverages Random Shuffling, Fernet Symmetric Encryption, and Reed-Solomon Error Correction Codes to encode the secret payload, which is then discretely embedded into the carrier image using LSB (Least Significant Bit) Steganography. The combination of these methods addresses the vulnerability vectors of both security and resilience against bit-level corruption in the resultant stego-images. We show that our framework achieves a data payload of 3 bits per pixel for an RGB image, and mathematically assess the probability of successful transmission for the amalgamated $n$ message bits and $k$ error correction bits. Additionally, we find that ${\rm SCR{\small EED}S{\small OLO}}$ yields good results upon being evaluated with multiple performance metrics, successfully eludes detection by various passive steganalysis tools, and is immune to simple active steganalysis attacks. Our code and data are available at https://github.com/Starscream-11813/SCReedSolo-Steganography.

Paperid: 170, https://arxiv.org/pdf/2503.12008.pdf GitHub

Abstract:
Tabular data synthesis using diffusion models has gained significant attention for its potential to balance data utility and privacy. However, existing privacy evaluations often rely on heuristic metrics or weak membership inference attacks (MIA), leaving privacy risks inadequately assessed. In this work, we conduct a rigorous MIA study on diffusion-based tabular synthesis, revealing that state-of-the-art attacks designed for image models fail in this setting. We identify noise initialization as a key factor influencing attack efficacy and propose a machine-learning-driven approach that leverages loss features across different noises and time steps. Our method, implemented with a lightweight MLP, effectively learns membership signals, eliminating the need for manual optimization. Experimental results from the MIDST Challenge @ SaTML 2025 demonstrate the effectiveness of our approach, securing first place across all tracks. Code is available at https://github.com/Nicholas0228/Tartan_Federer_MIDST.

Paperid: 171, https://arxiv.org/pdf/2503.09446.pdf GitHub

Abstract:
Text-to-image (T2I) diffusion models have achieved remarkable progress in generating high-quality images but also raise people's concerns about generating harmful or misleading content. While extensive approaches have been proposed to erase unwanted concepts without requiring retraining from scratch, they inadvertently degrade performance on normal generation tasks. In this work, we propose Interpret then Deactivate (ItD), a novel framework to enable precise concept removal in T2I diffusion models while preserving overall performance. ItD first employs a sparse autoencoder (SAE) to interpret each concept as a combination of multiple features. By permanently deactivating the specific features associated with target concepts, we repurpose SAE as a zero-shot classifier that identifies whether the input prompt includes target concepts, allowing selective concept erasure in diffusion models. Moreover, we demonstrate that ItD can be easily extended to erase multiple concepts without requiring further training. Comprehensive experiments across celebrity identities, artistic styles, and explicit content demonstrate ItD's effectiveness in eliminating targeted concepts without interfering with normal concept generation. Additionally, ItD is also robust against adversarial prompts designed to circumvent content filters. Code is available at: https://github.com/NANSirun/Interpret-then-deactivate.

Paperid: 172, https://arxiv.org/pdf/2503.09433.pdf GitHub

Abstract:
Identifying vulnerabilities in source code is crucial, especially in critical software components. Existing methods such as static analysis, dynamic analysis, formal verification, and recently Large Language Models are widely used to detect security flaws. This paper introduces CASTLE (CWE Automated Security Testing and Low-Level Evaluation), a benchmarking framework for evaluating the vulnerability detection capabilities of different methods. We assess 13 static analysis tools, 10 LLMs, and 2 formal verification tools using a hand-crafted dataset of 250 micro-benchmark programs covering 25 common CWEs. We propose the CASTLE Score, a novel evaluation metric to ensure fair comparison. Our results reveal key differences: ESBMC (a formal verification tool) minimizes false positives but struggles with vulnerabilities beyond model checking, such as weak cryptography or SQL injection. Static analyzers suffer from high false positives, increasing manual validation efforts for developers. LLMs perform exceptionally well in the CASTLE dataset when identifying vulnerabilities in small code snippets. However, their accuracy declines, and hallucinations increase as the code size grows. These results suggest that LLMs could play a pivotal role in future security solutions, particularly within code completion frameworks, where they can provide real-time guidance to prevent vulnerabilities. The dataset is accessible at https://github.com/CASTLE-Benchmark.

Paperid: 173, https://arxiv.org/pdf/2503.08923.pdf GitHub

Abstract:
Hardware verification is crucial in modern SoC design, consuming around 70% of development time. SystemVerilog assertions ensure correct functionality. However, existing industrial practices rely on manual efforts for assertion generation, which becomes increasingly untenable as hardware systems become complex. Recent research shows that Large Language Models (LLMs) can automate this process. However, proprietary SOTA models like GPT-4o often generate inaccurate assertions and require expensive licenses, while smaller open-source LLMs need fine-tuning to manage HDL code complexities. To address these issues, we introduce **VERT**, an open-source dataset designed to enhance SystemVerilog assertion generation using LLMs. VERT enables researchers in academia and industry to fine-tune open-source models, outperforming larger proprietary ones in both accuracy and efficiency while ensuring data privacy through local fine-tuning and eliminating costly licenses. The dataset is curated by systematically augmenting variables from open-source HDL repositories to generate synthetic code snippets paired with corresponding assertions. Experimental results demonstrate that fine-tuned models like Deepseek Coder 6.7B and Llama 3.1 8B outperform GPT-4o, achieving up to 96.88% improvement over base models and 24.14% over GPT-4o on platforms including OpenTitan, CVA6, OpenPiton and Pulpissimo. VERT is available at https://github.com/AnandMenon12/VERT.

Paperid: 174, https://arxiv.org/pdf/2503.07978.pdf GitHub

Abstract:
The distributed nature of training makes Federated Learning (FL) vulnerable to backdoor attacks, where malicious model updates aim to compromise the global model's performance on specific tasks. Existing defense methods show limited efficacy as they overlook the inconsistency between benign and malicious model updates regarding both general and fine-grained directions. To fill this gap, we introduce AlignIns, a novel defense method designed to safeguard FL systems against backdoor attacks. AlignIns looks into the direction of each model update through a direction alignment inspection process. Specifically, it examines the alignment of model updates with the overall update direction and analyzes the distribution of the signs of their significant parameters, comparing them with the principle sign across all model updates. Model updates that exhibit an unusual degree of alignment are considered malicious and thus be filtered out. We provide the theoretical analysis of the robustness of AlignIns and its propagation error in FL. Our empirical results on both independent and identically distributed (IID) and non-IID datasets demonstrate that AlignIns achieves higher robustness compared to the state-of-the-art defense methods. The code is available at https://github.com/JiiahaoXU/AlignIns.

Paperid: 175, https://arxiv.org/pdf/2503.07464.pdf GitHub

Abstract:
While cryptographic algorithms such as the ubiquitous Advanced Encryption Standard (AES) are secure, *physical implementations* of these algorithms in hardware inevitably 'leak' sensitive data such as cryptographic keys. A particularly insidious form of leakage arises from the fact that hardware consumes power and emits radiation in a manner that is statistically associated with the data it processes and the instructions it executes. Supervised deep learning has emerged as a state-of-the-art tool for carrying out *side-channel attacks*, which exploit this leakage by learning to map power/radiation measurements throughout encryption to the sensitive data operated on during that encryption. In this work we develop a principled deep learning framework for determining the relative leakage due to measurements recorded at different points in time, in order to inform *defense* against such attacks. This information is invaluable to cryptographic hardware designers for understanding *why* their hardware leaks and how they can mitigate it (e.g. by indicating the particular sections of code or electronic components which are responsible). Our framework is based on an adversarial game between a family of classifiers trained to estimate the conditional distributions of sensitive data given subsets of measurements, and a budget-constrained noise distribution which probabilistically erases individual measurements to maximize the loss of these classifiers. We demonstrate our method's efficacy and ability to overcome limitations of prior work through extensive experimental comparison with 8 baseline methods using 3 evaluation metrics and 6 publicly-available power/EM trace datasets from AES, ECC and RSA implementations. We provide an open-source PyTorch implementation of these experiments.

Paperid: 176, https://arxiv.org/pdf/2503.07191.pdf GitHub

Abstract:
Recent advances in 3D Gaussian Splatting (3DGS) have revolutionized scene reconstruction, opening new possibilities for 3D steganography by hiding 3D secrets within 3D covers. The key challenge in steganography is ensuring imperceptibility while maintaining high-fidelity reconstruction. However, existing methods often suffer from detectability risks and utilize only suboptimal 3DGS features, limiting their full potential. We propose a novel end-to-end key-secured 3D steganography framework (KeySS) that jointly optimizes a 3DGS model and a key-secured decoder for secret reconstruction. Our approach reveals that Gaussian features contribute unequally to secret hiding. The framework incorporates a key-controllable mechanism enabling multi-secret hiding and unauthorized access prevention, while systematically exploring optimal feature update to balance fidelity and security. To rigorously evaluate steganographic imperceptibility beyond conventional 2D metrics, we introduce 3D-Sinkhorn distance analysis, which quantifies distributional differences between original and steganographic Gaussian parameters in the representation space. Extensive experiments demonstrate that our method achieves state-of-the-art performance in both cover and secret reconstruction while maintaining high security levels, advancing the field of 3D steganography. Code is available at https://github.com/RY-Paper/KeySS

Paperid: 177, https://arxiv.org/pdf/2503.06166.pdf GitHub

Abstract:
Out-of-Distribution (OOD) detection is critical for ensuring the reliability of machine learning models in safety-critical applications such as autonomous driving and medical diagnosis. While deploying personalized OOD detection directly on edge devices is desirable, it remains challenging due to large model sizes and the computational infeasibility of on-device training. Federated learning partially addresses this but still requires gradient computation and backpropagation, exceeding the capabilities of many edge devices. To overcome these challenges, we propose SecDOOD, a secure cloud-device collaboration framework for efficient on-device OOD detection without requiring device-side backpropagation. SecDOOD utilizes cloud resources for model training while ensuring user data privacy by retaining sensitive information on-device. Central to SecDOOD is a HyperNetwork-based personalized parameter generation module, which adapts cloud-trained models to device-specific distributions by dynamically generating local weight adjustments, effectively combining central and local information without local fine-tuning. Additionally, our dynamic feature sampling and encryption strategy selectively encrypts only the most informative feature channels, largely reducing encryption overhead without compromising detection performance. Extensive experiments across multiple datasets and OOD scenarios demonstrate that SecDOOD achieves performance comparable to fully fine-tuned models, enabling secure, efficient, and personalized OOD detection on resource-limited edge devices. To enhance accessibility and reproducibility, our code is publicly available at https://github.com/Dystopians/SecDOOD.

Paperid: 178, https://arxiv.org/pdf/2503.05794.pdf GitHub

Abstract:
With the increasing adoption of deep learning in speaker verification, large-scale speech datasets have become valuable intellectual property. To audit and prevent the unauthorized usage of these valuable released datasets, especially in commercial or open-source scenarios, we propose a novel dataset ownership verification method. Our approach introduces a clustering-based backdoor watermark (CBW), enabling dataset owners to determine whether a suspicious third-party model has been trained on a protected dataset under a black-box setting. The CBW method consists of two key stages: dataset watermarking and ownership verification. During watermarking, we implant multiple trigger patterns in the dataset to make similar samples (measured by their feature similarities) close to the same trigger while dissimilar samples are near different triggers. This ensures that any model trained on the watermarked dataset exhibits specific misclassification behaviors when exposed to trigger-embedded inputs. To verify dataset ownership, we design a hypothesis-test-based framework that statistically evaluates whether a suspicious model exhibits the expected backdoor behavior. We conduct extensive experiments on benchmark datasets, verifying the effectiveness and robustness of our method against potential adaptive attacks. The code for reproducing main experiments is available at https://github.com/Radiant0726/CBW

Paperid: 179, https://arxiv.org/pdf/2503.05206.pdf GitHub

Abstract:
Modern cybersecurity threats are growing in complexity, targeting increasingly intricate & interconnected systems. To effectively defend against these evolving threats, security teams utilize automation & orchestration to enhance response efficiency and consistency. In that sense, cybersecurity playbooks are key enablers, providing a structured, reusable, and continuously improving approach to incident response, enabling organizations to codify requirements, domain expertise, and best practices and automate decision-making processes to the extent possible. The emerging Collaborative Automated Course of Action Operations (CACAO) standard defines a common machine-processable schema for cybersecurity playbooks, facilitating interoperability for their exchange and ensuring the ability to orchestrate and automate cybersecurity operations. However, despite its potential and the fact that it is a relatively new standardization work, there is a lack of tools to support its adoption and, in particular, the management & lifecycle development of CACAO playbooks, limiting their practical deployment. Motivated by the above, this work presents the design, development, and evaluation of a Knowledge Management System (KMS) for managing CACAO cybersecurity playbooks throughout their lifecycle, providing essential tools to streamline playbook management. Using open technologies & standards, the proposed approach fosters standards-based interoperability & enhances the usability of state-of-the-art cybersecurity orchestration & automation primitives. To encourage adoption, the resulting implementation is released as open-source, which, to the extent of our knowledge, comprises the first publicly available & documented work in this domain, supporting the broader uptake of CACAO playbooks & promoting the widespread use of interoperable automation and orchestration mechanisms in cybersecurity operations.

Paperid: 180, https://arxiv.org/pdf/2503.04963.pdf GitHub

Abstract:
The growing computational demand for deep neural networks ( DNNs) has raised concerns about their energy consumption and carbon footprint, particularly as the size and complexity of the models continue to increase. To address these challenges, energy-efficient hardware and custom accelerators have become essential. Additionally, adaptable DNN s are being developed to dynamically balance performance and efficiency. The use of these strategies became more common to enable sustainable AI deployment. However, these efficiency-focused designs may also introduce vulnerabilities, as attackers can potentially exploit them to increase latency and energy usage by triggering their worst-case-performance scenarios. This new type of attack, called energy-latency attacks, has recently gained significant research attention, focusing on the vulnerability of DNN s to this emerging attack paradigm, which can trigger denial-of-service ( DoS) attacks. This paper provides a comprehensive overview of current research on energy-latency attacks, categorizing them using the established taxonomy for traditional adversarial attacks. We explore different metrics used to measure the success of these attacks and provide an analysis and comparison of existing attack strategies. We also analyze existing defense mechanisms and highlight current challenges and potential areas for future research in this developing field. The GitHub page for this work can be accessed at https://github.com/hbrachemi/Survey_energy_attacks/

Paperid: 181, https://arxiv.org/pdf/2503.04815.pdf GitHub

Abstract:
The increasing demand for real-time data processing in Internet of Things (IoT) devices has elevated the importance of edge computing, necessitating efficient and secure deployment of applications on resource-constrained devices. Kubernetes and its lightweight distributions (k0s, k3s, KubeEdge, and OpenYurt) extend container orchestration to edge environments, but their security, reliability, and maintainability have not been comprehensively analyzed. This study compares Kubernetes and these lightweight distributions by evaluating security compliance using kube-bench, simulating network outages to assess resiliency, and documenting maintainability. Results indicate that while k3s and k0s offer superior ease of development due to their simplicity, they have lower security compliance compared to Kubernetes, KubeEdge, and OpenYurt. Kubernetes provides a balanced approach but may be resource-intensive for edge deployments. KubeEdge and OpenYurt enhance security features and reliability under network outages but increase complexity and resource consumption. The findings highlight trade-offs between performance, security, resiliency, and maintainability, offering insights for practitioners deploying Kubernetes in edge environments.

Paperid: 182, https://arxiv.org/pdf/2503.03710.pdf GitHub

Abstract:
Existing training-time safety alignment techniques for large language models (LLMs) remain vulnerable to jailbreak attacks. Direct preference optimization (DPO), a widely deployed alignment method, exhibits limitations in both experimental and theoretical contexts as its loss function proves suboptimal for refusal learning. Through gradient-based analysis, we identify these shortcomings and propose an improved safety alignment that disentangles DPO objectives into two components: (1) robust refusal training, which encourages refusal even when partial unsafe generations are produced, and (2) targeted unlearning of harmful knowledge. This approach significantly increases LLM robustness against a wide range of jailbreak attacks, including prefilling, suffix, and multi-turn attacks across both in-distribution and out-of-distribution scenarios. Furthermore, we introduce a method to emphasize critical refusal tokens by incorporating a reward-based token-level weighting mechanism for refusal learning, which further improves the robustness against adversarial exploits. Our research also suggests that robustness to jailbreak attacks is correlated with token distribution shifts in the training process and internal representations of refusal and harmful tokens, offering valuable directions for future research in LLM safety alignment. The code is available at https://github.com/wicai24/DOOR-Alignment

Paperid: 183, https://arxiv.org/pdf/2503.03170.pdf GitHub

Abstract:
The observations documented in Cyber Threat Intelligence (CTI) reports play a critical role in describing adversarial behaviors, providing valuable insights for security practitioners to respond to evolving threats. Recent advancements of Large Language Models (LLMs) have demonstrated significant potential in various cybersecurity applications, including CTI report understanding and attack knowledge graph construction. While previous works have proposed benchmarks that focus on the CTI extraction ability of LLMs, the sequential characteristic of adversarial behaviors within CTI reports remains largely unexplored, which holds considerable significance in developing a comprehensive understanding of how adversaries operate. To address this gap, we introduce AttackSeqBench, a benchmark tailored to systematically evaluate LLMs' capability to understand and reason attack sequences in CTI reports. Our benchmark encompasses three distinct Question Answering (QA) tasks, each task focuses on the varying granularity in adversarial behavior. To alleviate the laborious effort of QA construction, we carefully design an automated dataset construction pipeline to create scalable and well-formulated QA datasets based on real-world CTI reports. To ensure the quality of our dataset, we adopt a hybrid approach of combining human evaluation and systematic evaluation metrics. We conduct extensive experiments and analysis with both fast-thinking and slow-thinking LLMs, while highlighting their strengths and limitations in analyzing the sequential patterns in cyber attacks. The overarching goal of this work is to provide a benchmark that advances LLM-driven CTI report understanding and fosters its application in real-world cybersecurity operations. Our dataset and code are available at https://github.com/Javiery3889/AttackSeqBench .

Paperid: 184, https://arxiv.org/pdf/2503.02169.pdf GitHub

Abstract:
Statistical adversarial data detection (SADD) detects whether an upcoming batch contains adversarial examples (AEs) by measuring the distributional discrepancies between clean examples (CEs) and AEs. In this paper, we explore the strength of SADD-based methods by theoretically showing that minimizing distributional discrepancy can help reduce the expected loss on AEs. Despite these advantages, SADD-based methods have a potential limitation: they discard inputs that are detected as AEs, leading to the loss of useful information within those inputs. To address this limitation, we propose a two-pronged adversarial defense method, named Distributional-discrepancy-based Adversarial Defense (DAD). In the training phase, DAD first optimizes the test power of the maximum mean discrepancy (MMD) to derive MMD-OPT, which is a stone that kills two birds. MMD-OPT first serves as a guiding signal to minimize the distributional discrepancy between CEs and AEs to train a denoiser. Then, it serves as a discriminator to differentiate CEs and AEs during inference. Overall, in the inference stage, DAD consists of a two-pronged process: (1) directly feeding the detected CEs into the classifier, and (2) removing noise from the detected AEs by the distributional-discrepancy-based denoiser. Extensive experiments show that DAD outperforms current state-of-the-art (SOTA) defense methods by simultaneously improving clean and robust accuracy on CIFAR-10 and ImageNet-1K against adaptive white-box attacks. Codes are publicly available at: https://github.com/tmlr-group/DAD.

Paperid: 185, https://arxiv.org/pdf/2503.01908.pdf GitHub

Abstract:
Large Language Model (LLM) agents equipped with external tools have become increasingly powerful for complex tasks such as web shopping, automated email replies, and financial trading. However, these advancements amplify the risks of adversarial attacks, especially when agents can access sensitive external functionalities. Nevertheless, manipulating LLM agents into performing targeted malicious actions or invoking specific tools remains challenging, as these agents extensively reason or plan before executing final actions. In this work, we present UDora, a unified red teaming framework designed for LLM agents that dynamically hijacks the agent's reasoning processes to compel malicious behavior. Specifically, UDora first generates the model's reasoning trace for the given task, then automatically identifies optimal points within this trace to insert targeted perturbations. The resulting perturbed reasoning is then used as a surrogate response for optimization. By iteratively applying this process, the LLM agent will then be induced to undertake designated malicious actions or to invoke specific malicious tools. Our approach demonstrates superior effectiveness compared to existing methods across three LLM agent datasets. The code is available at https://github.com/AI-secure/UDora.

Paperid: 186, https://arxiv.org/pdf/2503.01811.pdf GitHub

Abstract:
We introduce AutoAdvExBench, a benchmark to evaluate if large language models (LLMs) can autonomously exploit defenses to adversarial examples. Unlike existing security benchmarks that often serve as proxies for real-world tasks, bench directly measures LLMs' success on tasks regularly performed by machine learning security experts. This approach offers a significant advantage: if a LLM could solve the challenges presented in bench, it would immediately present practical utility for adversarial machine learning researchers. We then design a strong agent that is capable of breaking 75% of CTF-like ("homework exercise") adversarial example defenses. However, we show that this agent is only able to succeed on 13% of the real-world defenses in our benchmark, indicating the large gap between difficulty in attacking "real" code, and CTF-like code. In contrast, a stronger LLM that can attack 21% of real defenses only succeeds on 54% of CTF-like defenses. We make this benchmark available at https://github.com/ethz-spylab/AutoAdvExBench.

Paperid: 187, https://arxiv.org/pdf/2503.00555.pdf GitHub

Abstract:
Safety alignment is an important procedure before the official deployment of a Large Language Model (LLM). While safety alignment has been extensively studied for LLM, there is still a large research gap for Large Reasoning Models (LRMs) that equip with improved reasoning capability. We in this paper systematically examine a simplified pipeline for producing safety aligned LRMs. With our evaluation of various LRMs, we deliver two main findings: i) Safety alignment can be done upon the LRM to restore its safety capability. ii) Safety alignment leads to a degradation of the reasoning capability of LRMs. The two findings show that there exists a trade-off between reasoning and safety capability with the sequential LRM production pipeline. The discovered trade-off, which we name Safety Tax, should shed light on future endeavors of safety research on LRMs. As a by-product, we curate a dataset called DirectRefusal, which might serve as an alternative dataset for safety alignment. Our source code is available at https://github.com/git-disl/Safety-Tax.

Paperid: 188, https://arxiv.org/pdf/2503.00187.pdf GitHub

Abstract:
Large language models (LLMs) are shown to be vulnerable to jailbreaking attacks where adversarial prompts are designed to elicit harmful responses. While existing defenses effectively mitigate single-turn attacks by detecting and filtering unsafe inputs, they fail against multi-turn jailbreaks that exploit contextual drift over multiple interactions, gradually leading LLMs away from safe behavior. To address this challenge, we propose a safety steering framework grounded in safe control theory, ensuring invariant safety in multi-turn dialogues. Our approach models the dialogue with LLMs using state-space representations and introduces a novel neural barrier function (NBF) to detect and filter harmful queries emerging from evolving contexts proactively. Our method achieves invariant safety at each turn of dialogue by learning a safety predictor that accounts for adversarial queries, preventing potential context drift toward jailbreaks. Extensive experiments under multiple LLMs show that our NBF-based safety steering outperforms safety alignment, prompt-based steering and lightweight LLM guardrails baselines, offering stronger defenses against multi-turn jailbreaks while maintaining a better trade-off among safety, helpfulness and over-refusal. Check out the website here https://sites.google.com/view/llm-nbf/home . Our code is available on https://github.com/HanjiangHu/NBF-LLM .

Paperid: 189, https://arxiv.org/pdf/2503.00061.pdf GitHub

Abstract:
Large Language Model (LLM) agents exhibit remarkable performance across diverse applications by using external tools to interact with environments. However, integrating external tools introduces security risks, such as indirect prompt injection (IPI) attacks. Despite defenses designed for IPI attacks, their robustness remains questionable due to insufficient testing against adaptive attacks. In this paper, we evaluate eight different defenses and bypass all of them using adaptive attacks, consistently achieving an attack success rate of over 50%. This reveals critical vulnerabilities in current defenses. Our research underscores the need for adaptive attack evaluation when designing defenses to ensure robustness and reliability. The code is available at https://github.com/uiuc-kang-lab/AdaptiveAttackAgent.

Paperid: 190, https://arxiv.org/pdf/2502.21286.pdf GitHub

Abstract:
Zero-Touch Networks (ZTNs) represent a state-of-the-art paradigm shift towards fully automated and intelligent network management, enabling the automation and intelligence required to manage the complexity, scale, and dynamic nature of next-generation (6G) networks. ZTNs leverage Artificial Intelligence (AI) and Machine Learning (ML) to enhance operational efficiency, support intelligent decision-making, and ensure effective resource allocation. However, the implementation of ZTNs is subject to security challenges that need to be resolved to achieve their full potential. In particular, two critical challenges arise: the need for human expertise in developing AI/ML-based security mechanisms, and the threat of adversarial attacks targeting AI/ML models. In this survey paper, we provide a comprehensive review of current security issues in ZTNs, emphasizing the need for advanced AI/ML-based security mechanisms that require minimal human intervention and protect AI/ML models themselves. Furthermore, we explore the potential of Automated ML (AutoML) technologies in developing robust security solutions for ZTNs. Through case studies, we illustrate practical approaches to securing ZTNs against both conventional and AI/ML-specific threats, including the development of autonomous intrusion detection systems and strategies to combat Adversarial ML (AML) attacks. The paper concludes with a discussion of the future research directions for the development of ZTN security approaches.

Paperid: 191, https://arxiv.org/pdf/2502.20650.pdf GitHub

Abstract:
In recent years, Diffusion Models (DMs) have demonstrated significant advances in the field of image generation. However, according to current research, DMs are vulnerable to backdoor attacks, which allow attackers to control the model's output by inputting data containing covert triggers, such as a specific visual patch or phrase. Existing defense strategies are well equipped to thwart such attacks through backdoor detection and trigger inversion because previous attack methods are constrained by limited input spaces and low-dimensional triggers. For example, visual triggers are easily observed by defenders, text-based or attention-based triggers are more susceptible to neural network detection. To explore more possibilities of backdoor attack in DMs, we propose Gungnir, a novel method that enables attackers to activate the backdoor in DMs through style triggers within input images. Our approach proposes using stylistic features as triggers for the first time and implements backdoor attacks successfully in image-to-image tasks by introducing Reconstructing-Adversarial Noise (RAN) and Short-Term Timesteps-Retention (STTR). Our technique generates trigger-embedded images that are perceptually indistinguishable from clean images, thus bypassing both manual inspection and automated detection neural networks. Experiments demonstrate that Gungnir can easily bypass existing defense methods. Among existing DM defense frameworks, our approach achieves a 0 backdoor detection rate (BDR). Our codes are available at https://github.com/paoche11/Gungnir.

Paperid: 192, https://arxiv.org/pdf/2502.20627.pdf GitHub

Abstract:
The transition from 5G to 6G mobile networks necessitates network automation to meet the escalating demands for high data rates, ultra-low latency, and integrated technology. Recently, Zero-Touch Networks (ZTNs), driven by Artificial Intelligence (AI) and Machine Learning (ML), are designed to automate the entire lifecycle of network operations with minimal human intervention, presenting a promising solution for enhancing automation in 5G/6G networks. However, the implementation of ZTNs brings forth the need for autonomous and robust cybersecurity solutions, as ZTNs rely heavily on automation. AI/ML algorithms are widely used to develop cybersecurity mechanisms, but require substantial specialized expertise and encounter model drift issues, posing significant challenges in developing autonomous cybersecurity measures. Therefore, this paper proposes an automated security framework targeting Physical Layer Authentication (PLA) and Cross-Layer Intrusion Detection Systems (CLIDS) to address security concerns at multiple Internet protocol layers. The proposed framework employs drift-adaptive online learning techniques and a novel enhanced Successive Halving (SH)-based Automated ML (AutoML) method to automatically generate optimized ML models for dynamic networking environments. Experimental results illustrate that the proposed framework achieves high performance on the public Radio Frequency (RF) fingerprinting and the Canadian Institute for CICIDS2017 datasets, showcasing its effectiveness in addressing PLA and CLIDS tasks within dynamic and complex networking environments. Furthermore, the paper explores open challenges and research directions in the 5G/6G cybersecurity domain. This framework represents a significant advancement towards fully autonomous and secure 6G networks, paving the way for future innovations in network automation and cybersecurity.

Paperid: 193, https://arxiv.org/pdf/2502.20562.pdf GitHub

Abstract:
State-of-the-art defense mechanisms are typically evaluated in the context of white-box attacks, which is not realistic, as it assumes the attacker can access the gradients of the target network. To protect against this scenario, Adversarial Training (AT) and Adversarial Distillation (AD) include adversarial examples during the training phase, and Adversarial Purification uses a generative model to reconstruct all the images given to the classifier. This paper considers an even more realistic evaluation scenario: gray-box attacks, which assume that the attacker knows the architecture and the dataset used to train the target network, but cannot access its gradients. We provide empirical evidence that models are vulnerable to gray-box attacks and propose LISArD, a defense mechanism that does not increase computational and temporal costs but provides robustness against gray-box and white-box attacks without including AT. Our method approximates a cross-correlation matrix, created with the embeddings of perturbed and clean images, to a diagonal matrix while simultaneously conducting classification learning. Our results show that LISArD can effectively protect against gray-box attacks, can be used in multiple architectures, and carries over its resilience to the white-box scenario. Also, state-of-the-art AD models underperform greatly when removing AT and/or moving to gray-box settings, highlighting the lack of robustness from existing approaches to perform in various conditions (aside from white-box settings). All the source code is available at https://github.com/Joana-Cabral/LISArD.

Paperid: 194, https://arxiv.org/pdf/2502.19755.pdf GitHub

Abstract:
Effective out-of-distribution (OOD) detection is crucial for the safe deployment of machine learning models in real-world scenarios. However, recent work has shown that OOD detection methods are vulnerable to adversarial attacks, potentially leading to critical failures in high-stakes applications. This discovery has motivated work on robust OOD detection methods that are capable of maintaining performance under various attack settings. Prior approaches have made progress on this problem but face a number of limitations: often only exhibiting robustness to attacks on OOD data or failing to maintain strong clean performance. In this work, we adapt an existing robust classification framework, TRADES, extending it to the problem of robust OOD detection and discovering a novel objective function. Recognising the critical importance of a strong clean/robust trade-off for OOD detection, we introduce an additional loss term which boosts classification and detection performance. Our approach, called HALO (Helper-based AdversariaL OOD detection), surpasses existing methods and achieves state-of-the-art performance across a number of datasets and attack settings. Extensive experiments demonstrate an average AUROC improvement of 3.15 in clean settings and 7.07 under adversarial attacks when compared to the next best method. Furthermore, HALO exhibits resistance to transferred attacks, offers tuneable performance through hyperparameter selection, and is compatible with existing OOD detection frameworks out-of-the-box, leaving open the possibility of future performance gains. Code is available at: https://github.com/hugo0076/HALO

Paperid: 195, https://arxiv.org/pdf/2502.19525.pdf GitHub

Abstract:
In settings like vaccination registries, individuals act after observing others, and the resulting public records can expose private information. We study privacy-preserving sequential learning, where agents add endogenous noise to their reported actions to conceal private signals. Efficient social learning relies on information flow, seemingly in conflict with privacy. Surprisingly, with continuous signals and a fixed privacy budget $(ε)$, the optimal randomization strategy balances privacy and accuracy, accelerating learning to $Θ_ε(\log n)$, faster than the nonprivate $Θ(\sqrt{\log n})$ rate. In the nonprivate baseline, the expected time to the first correct action and the number of incorrect actions diverge; under privacy with sufficiently small $ε$, both are finite. Privacy helps because, under the false state, agents more often receive signals contradicting the majority; randomization then asymmetrically amplifies the log-likelihood ratio, enhancing aggregation. In heterogeneous populations, an order-optimal $Θ(\sqrt{n})$ rate is achievable when a subset of agents have low privacy budgets. With binary signals, however, privacy reduces informativeness and impairs learning relative to the nonprivate baseline, though the dependence on $ε$ is nonmonotone. Our results show how privacy reshapes information dynamics and inform the design of platforms and policies.

Paperid: 196, https://arxiv.org/pdf/2502.19257.pdf GitHub

Abstract:
Webshell is a type of backdoor, and web applications are widely exposed to webshell injection attacks. Therefore, it is important to study webshell detection techniques. In this study, we propose a webshell detection method. We first convert PHP source code to opcodes and then extract Opcode Double-Tuples (ODTs). Next, we combine CodeBert and FastText models for feature representation and classification. To address the challenge that deep learning methods have difficulty detecting long webshell files, we introduce a sliding window attention mechanism. This approach effectively captures malicious behavior within long files. Experimental results show that our method reaches high accuracy in webshell detection, solving the problem of traditional methods that struggle to address new webshell variants and anti-detection techniques.

Paperid: 197, https://arxiv.org/pdf/2502.18851.pdf GitHub

Abstract:
Code watermarking identifies AI-generated code by embedding patterns into the code during generation. Effective watermarking requires meeting two key conditions: the watermark should be reliably detectable, and the code should retain its original functionality. However, existing methods often modify tokens that are critical for program logic, such as keywords in conditional expressions or operators in arithmetic computations. These modifications can cause syntax errors or functional failures, limiting the practical use of watermarking. We present STONE, a method that preserves functional integrity by selectively inserting watermarks only into non-syntax tokens. By excluding tokens essential for code execution, STONE minimizes the risk of functional degradation. In addition, we introduce CWEM, a comprehensive evaluation metric that evaluates watermarking techniques based on correctness, detectability, and naturalness. While correctness and detectability have been widely used, naturalness remains underexplored despite its importance. Unnatural patterns can reveal the presence of a watermark, making it easier for adversaries to remove. We evaluate STONE using CWEM and compare its performance with the state-of-the-art approach. The results show that STONE achieves an average improvement of 7.69% in CWEM across Python, C++, and Java. Our code is available in https://github.com/inistory/STONE-watermarking/.

Paperid: 198, https://arxiv.org/pdf/2502.18527.pdf GitHub

Abstract:
Personal AI assistants (e.g., Apple Intelligence, Meta AI) offer proactive recommendations that simplify everyday tasks, but their reliance on sensitive user data raises concerns about privacy and trust. To address these challenges, we introduce the Guardian of Data (GOD), a secure, privacy-preserving framework for training and evaluating AI assistants directly on-device. Unlike traditional benchmarks, the GOD model measures how well assistants can anticipate user needs-such as suggesting gifts-while protecting user data and autonomy. Functioning like an AI school, it addresses the cold start problem by simulating user queries and employing a curriculum-based approach to refine the performance of each assistant. Running within a Trusted Execution Environment (TEE), it safeguards user data while applying reinforcement and imitation learning to refine AI recommendations. A token-based incentive system encourages users to share data securely, creating a data flywheel that drives continuous improvement. Specifically, users mine with their data, and the mining rate is determined by GOD's evaluation of how well their AI assistant understands them across categories such as shopping, social interactions, productivity, trading, and Web3. By integrating privacy, personalization, and trust, the GOD model provides a scalable, responsible path for advancing personal AI assistants. For community collaboration, part of the framework is open-sourced at https://github.com/PIN-AI/God-Model.

Paperid: 199, https://arxiv.org/pdf/2502.18514.pdf GitHub

Abstract:
Facial recognition systems rely on embeddings to represent facial images and determine identity by verifying if the distance between embeddings is below a pre-tuned threshold. While embeddings are not reversible to original images, they still contain sensitive information, making their security critical. Traditional encryption methods like AES are limited in securely utilizing cloud computational power for distance calculations. Homomorphic Encryption, allowing calculations on encrypted data, offers a robust alternative. This paper introduces CipherFace, a homomorphic encryption-driven framework for secure cloud-based facial recognition, which we have open-sourced at http://github.com/serengil/cipherface. By leveraging FHE, CipherFace ensures the privacy of embeddings while utilizing the cloud for efficient distance computation. Furthermore, we propose a novel encrypted distance computation method for both Euclidean and Cosine distances, addressing key challenges in performing secure similarity calculations on encrypted data. We also conducted experiments with different facial recognition models, various embedding sizes, and cryptosystem configurations, demonstrating the scalability and effectiveness of CipherFace in real-world applications.

Paperid: 200, https://arxiv.org/pdf/2502.18509.pdf GitHub

Abstract:
Conversational agents are increasingly woven into individuals' personal lives, yet users often underestimate the privacy risks associated with them. The moment users share information with these agents-such as large language models (LLMs)-their private information becomes vulnerable to exposure. In this paper, we characterize the notion of contextual privacy for user interactions with LLM-based Conversational Agents (LCAs). It aims to minimize privacy risks by ensuring that users (sender) disclose only information that is both relevant and necessary for achieving their intended goals when interacting with LCAs (untrusted receivers). Through a formative design user study, we observe how even "privacy-conscious" users inadvertently reveal sensitive information through indirect disclosures. Based on insights from this study, we propose a locally deployable framework that operates between users and LCAs, identifying and reformulating out-of-context information in user prompts. Our evaluation using examples from ShareGPT shows that lightweight models can effectively implement this framework, achieving strong gains in contextual privacy while preserving the user's intended interaction goals. Notably, about 76% of participants in our human evaluation preferred the reformulated prompts over the original ones, validating the usability and effectiveness of contextual privacy in our proposed framework. We opensource the code at https://github.com/IBM/contextual-privacy-LLM.

Paperid: 201, https://arxiv.org/pdf/2502.18508.pdf GitHub GitHub

Abstract:
Backdoor attacks on deep neural networks (DNNs) have emerged as a significant security threat, allowing adversaries to implant hidden malicious behaviors during the model training phase. Pre-processing-based defense, which is one of the most important defense paradigms, typically focuses on input transformations or backdoor trigger inversion (BTI) to deactivate or eliminate embedded backdoor triggers during the inference process. However, these methods suffer from inherent limitations: transformation-based defenses often fail to balance model utility and defense performance, while BTI-based defenses struggle to accurately reconstruct trigger patterns without prior knowledge. In this paper, we propose REFINE, an inversion-free backdoor defense method based on model reprogramming. REFINE consists of two key components: \textbf{(1)} an input transformation module that disrupts both benign and backdoor patterns, generating new benign features; and \textbf{(2)} an output remapping module that redefines the model's output domain to guide the input transformations effectively. By further integrating supervised contrastive loss, REFINE enhances the defense capabilities while maintaining model utility. Extensive experiments on various benchmark datasets demonstrate the effectiveness of our REFINE and its resistance to potential adaptive attacks.

Paperid: 202, https://arxiv.org/pdf/2502.18504.pdf GitHub

Abstract:
Jailbreaking large-language models (LLMs) involves testing their robustness against adversarial prompts and evaluating their ability to withstand prompt attacks that could elicit unauthorized or malicious responses. In this paper, we present TurboFuzzLLM, a mutation-based fuzzing technique for efficiently finding a collection of effective jailbreaking templates that, when combined with harmful questions, can lead a target LLM to produce harmful responses through black-box access via user prompts. We describe the limitations of directly applying existing template-based attacking techniques in practice, and present functional and efficiency-focused upgrades we added to mutation-based fuzzing to generate effective jailbreaking templates automatically. TurboFuzzLLM achieves $\geq$ 95\% attack success rates (ASR) on public datasets for leading LLMs (including GPT-4o \& GPT-4 Turbo), shows impressive generalizability to unseen harmful questions, and helps in improving model defenses to prompt attacks. TurboFuzzLLM is available open source at https://github.com/amazon-science/TurboFuzzLLM.

Paperid: 203, https://arxiv.org/pdf/2502.18077.pdf GitHub

Abstract:
Foundation models (FMs) for computer vision learn rich and robust representations, enabling their adaptation to task/domain-specific deployments with little to no fine-tuning. However, we posit that the very same strength can make applications based on FMs vulnerable to model stealing attacks. Through empirical analysis, we reveal that models fine-tuned from FMs harbor heightened susceptibility to model stealing, compared to conventional vision architectures like ResNets. We hypothesize that this behavior is due to the comprehensive encoding of visual patterns and features learned by FMs during pre-training, which are accessible to both the attacker and the victim. We report that an attacker is able to obtain 94.28% agreement (matched predictions with victim) for a Vision Transformer based victim model (ViT-L/16) trained on CIFAR-10 dataset, compared to only 73.20% agreement for a ResNet-18 victim, when using ViT-L/16 as the thief model. We arguably show, for the first time, that utilizing FMs for downstream tasks may not be the best choice for deployment in commercial APIs due to their susceptibility to model theft. We thereby alert model owners towards the associated security risks, and highlight the need for robust security measures to safeguard such models against theft. Code is available at https://github.com/rajankita/foundation_model_stealing.

Paperid: 204, https://arxiv.org/pdf/2502.17832.pdf GitHub

Abstract:
Multimodal large language models (MLLMs) equipped with Retrieval Augmented Generation (RAG) leverage both their rich parametric knowledge and the dynamic, external knowledge to excel in tasks such as Question Answering. While RAG enhances MLLMs by grounding responses in query-relevant external knowledge, this reliance poses a critical yet underexplored safety risk: knowledge poisoning attacks, where misinformation or irrelevant knowledge is intentionally injected into external knowledge bases to manipulate model outputs to be incorrect and even harmful. To expose such vulnerabilities in multimodal RAG, we propose MM-PoisonRAG, a novel knowledge poisoning attack framework with two attack strategies: Localized Poisoning Attack (LPA), which injects query-specific misinformation in both text and images for targeted manipulation, and Globalized Poisoning Attack (GPA) to provide false guidance during MLLM generation to elicit nonsensical responses across all queries. We evaluate our attacks across multiple tasks, models, and access settings, demonstrating that LPA successfully manipulates the MLLM to generate attacker-controlled answers, with a success rate of up to 56% on MultiModalQA. Moreover, GPA completely disrupts model generation to 0% accuracy with just a single irrelevant knowledge injection. Our results highlight the urgent need for robust defenses against knowledge poisoning to safeguard multimodal RAG frameworks.

Paperid: 205, https://arxiv.org/pdf/2502.16903.pdf GitHub

Abstract:
Despite the growing interest in jailbreak methods as an effective red-teaming tool for building safe and responsible large language models (LLMs), flawed evaluation system designs have led to significant discrepancies in their effectiveness assessments. We conduct a systematic measurement study based on 37 jailbreak studies since 2022, focusing on both the methods and the evaluation systems they employ. We find that existing evaluation systems lack case-specific criteria, resulting in misleading conclusions about their effectiveness and safety implications. This paper advocates a shift to a more nuanced, case-by-case evaluation paradigm. We introduce GuidedBench, a novel benchmark comprising a curated harmful question dataset, detailed case-by-case evaluation guidelines and an evaluation system integrated with these guidelines -- GuidedEval. Experiments demonstrate that GuidedBench offers more accurate measurements of jailbreak performance, enabling meaningful comparisons across methods and uncovering new insights overlooked in previous evaluations. GuidedEval reduces inter-evaluator variance by at least 76.03\%. Furthermore, we observe that incorporating guidelines can enhance the effectiveness of jailbreak methods themselves, offering new insights into both attack strategies and evaluation paradigms.

Paperid: 206, https://arxiv.org/pdf/2502.16782.pdf GitHub

Abstract:
Private Transformer inference using cryptographic protocols offers promising solutions for privacy-preserving machine learning; however, it still faces significant runtime overhead (efficiency issues) and challenges in handling long-token inputs (scalability issues). We observe that the Transformer's operational complexity scales quadratically with the number of input tokens, making it essential to reduce the input token length. Notably, each token varies in importance, and many inputs contain redundant tokens. Additionally, prior private inference methods that rely on high-degree polynomial approximations for non-linear activations are computationally expensive. Therefore, reducing the polynomial degree for less important tokens can significantly accelerate private inference. Building on these observations, we propose \textit{CipherPrune}, an efficient and scalable private inference framework that includes a secure encrypted token pruning protocol, a polynomial reduction protocol, and corresponding Transformer network optimizations. At the protocol level, encrypted token pruning adaptively removes unimportant tokens from encrypted inputs in a progressive, layer-wise manner. Additionally, encrypted polynomial reduction assigns lower-degree polynomials to less important tokens after pruning, enhancing efficiency without decryption. At the network level, we introduce protocol-aware network optimization via a gradient-based search to maximize pruning thresholds and polynomial reduction conditions while maintaining the desired accuracy. Our experiments demonstrate that CipherPrune reduces the execution overhead of private Transformer inference by approximately $6.1\times$ for 128-token inputs and $10.6\times$ for 512-token inputs, compared to previous methods, with only a marginal drop in accuracy. The code is publicly available at https://github.com/UCF-Lou-Lab-PET/cipher-prune-inference.

Paperid: 207, https://arxiv.org/pdf/2502.16055.pdf GitHub

Abstract:
Foundational models (FMs) have made significant strides in the healthcare domain. Yet the data silo challenge and privacy concern remain in healthcare systems, hindering safe medical data sharing and collaborative model development among institutions. The collection and curation of scalable clinical datasets increasingly become the bottleneck for training strong FMs. In this study, we propose Medical Foundation Models Merging (MedForge), a cooperative framework enabling a community-driven medical foundation model development, meanwhile preventing the information leakage of raw patient data and mitigating synchronization model development issues across clinical institutions. MedForge offers a bottom-up model construction mechanism by flexibly merging task-specific Low-Rank Adaptation (LoRA) modules, which can adapt to downstream tasks while retaining original model parameters. Through an asynchronous LoRA module integration scheme, the resulting composite model can progressively enhance its comprehensive performance on various clinical tasks. MedForge shows strong performance on multiple clinical datasets (e.g., breast cancer, lung cancer, and colon cancer) collected from different institutions. Our major findings highlight the value of collaborative foundation models in advancing multi-center clinical collaboration effectively and cohesively. Our code is publicly available at https://github.com/TanZheling/MedForge.

Paperid: 208, https://arxiv.org/pdf/2502.15720.pdf GitHub

Abstract:
Loyal AI is loyal to the community that builds it. An AI is loyal to a community if the community has ownership, alignment, and control. Community owned models can only be used with the approval of the community and share the economic rewards communally. Community aligned models have values that are aligned with the consensus of the community. Community controlled models perform functions designed by the community. Since we would like permissionless access to the loyal AI's community, we need the AI to be open source. The key scientific question then is: how can we build models that are openly accessible (open source) and yet are owned and governed by the community. This seeming impossibility is the focus of this paper where we outline a concrete pathway to Open, Monetizable and Loyal models (OML), building on our earlier work on OML, arXiv:2411.03887(1) , and a representation via a cryptographic-ML library http://github.com/sentient-agi/oml-1.0-fingerprinting .

Paperid: 209, https://arxiv.org/pdf/2502.15427.pdf GitHub

Abstract:
As large language models (LLMs) become integrated into everyday applications, ensuring their robustness and security is increasingly critical. In particular, LLMs can be manipulated into unsafe behaviour by prompts known as jailbreaks. The variety of jailbreak styles is growing, necessitating the use of external defences known as guardrails. While many jailbreak defences have been proposed, not all defences are able to handle new out-of-distribution attacks due to the narrow segment of jailbreaks used to align them. Moreover, the lack of systematisation around defences has created significant gaps in their practical application. In this work, we perform systematic benchmarking across 15 different defences, considering a broad swathe of malicious and benign datasets. We find that there is significant performance variation depending on the style of jailbreak a defence is subject to. Additionally, we show that based on current datasets available for evaluation, simple baselines can display competitive out-of-distribution performance compared to many state-of-the-art defences. Code is available at https://github.com/IBM/Adversarial-Prompt-Evaluation.

Paperid: 210, https://arxiv.org/pdf/2502.15245.pdf GitHub

Abstract:
Image Steganography is a cryptographic technique that embeds secret information into an image, ensuring the hidden data remains undetectable to the human eye while preserving the image's original visual integrity. Least Significant Bit (LSB) Steganography achieves this by replacing the k least significant bits of an image with the k most significant bits of a secret image, maintaining the appearance of the original image while simultaneously encoding the essential elements of the hidden data. In this work, we shift away from conventional applications of steganography in deep learning and explore its potential from a new angle. We present experimental results on CIFAR-10 showing that LSB Steganography, when used as a data augmentation strategy for downstream computer vision tasks such as image classification, can significantly improve the training efficiency of deep neural networks. It can also act as an implicit, uniformly discretized piecewise linear approximation of color augmentations such as (brightness, contrast, hue, and saturation), without introducing additional training overhead through a new joint image training regime that disregards the need for tuning sensitive augmentation hyperparameters.

Paperid: 211, https://arxiv.org/pdf/2502.15233.pdf GitHub

Abstract:
An increasing number of companies have begun providing services that leverage cloud-based large language models (LLMs), such as ChatGPT. However, this development raises substantial privacy concerns, as users' prompts are transmitted to and processed by the model providers. Among the various privacy protection methods for LLMs, those implemented during the pre-training and fine-tuning phrases fail to mitigate the privacy risks associated with the remote use of cloud-based LLMs by users. On the other hand, methods applied during the inference phrase are primarily effective in scenarios where the LLM's inference does not rely on privacy-sensitive information. In this paper, we outline the process of remote user interaction with LLMs and, for the first time, propose a detailed definition of a general pseudonymization framework applicable to cloud-based LLMs. The experimental results demonstrate that the proposed framework strikes an optimal balance between privacy protection and utility. The code for our method is available to the public at https://github.com/Mebymeby/Pseudonymization-Framework.

Paperid: 212, https://arxiv.org/pdf/2502.14881.pdf GitHub

Abstract:
With the rapid advancement of Large Vision-Language Models (LVLMs), ensuring their safety has emerged as a crucial area of research. This survey provides a comprehensive analysis of LVLM safety, covering key aspects such as attacks, defenses, and evaluation methods. We introduce a unified framework that integrates these interrelated components, offering a holistic perspective on the vulnerabilities of LVLMs and the corresponding mitigation strategies. Through an analysis of the LVLM lifecycle, we introduce a classification framework that distinguishes between inference and training phases, with further subcategories to provide deeper insights. Furthermore, we highlight limitations in existing research and outline future directions aimed at strengthening the robustness of LVLMs. As part of our research, we conduct a set of safety evaluations on the latest LVLM, Deepseek Janus-Pro, and provide a theoretical analysis of the results. Our findings provide strategic recommendations for advancing LVLM safety and ensuring their secure and reliable deployment in high-stakes, real-world applications. This survey aims to serve as a cornerstone for future research, facilitating the development of models that not only push the boundaries of multimodal intelligence but also adhere to the highest standards of security and ethical integrity. Furthermore, to aid the growing research in this field, we have created a public repository to continuously compile and update the latest work on LVLM safety: https://github.com/XuankunRong/Awesome-LVLM-Safety .

Paperid: 213, https://arxiv.org/pdf/2502.13778.pdf GitHub

Abstract:
Rapid industrial digitalization has created intricate cybersecurity demands that necessitate effective validation methods. While cyber ranges and simulation platforms are widely deployed, they frequently face limitations in scenario diversity and creation efficiency. In this paper, we present SpiderSim, a theoretical cybersecurity simulation platform enabling rapid and lightweight scenario generation for industrial digitalization security research. At its core, our platform introduces three key innovations: a structured framework for unified scenario modeling, a multi-agent collaboration mechanism for automated generation, and modular atomic security capabilities for flexible scenario composition. Extensive implementation trials across multiple industrial digitalization contexts, including marine ranch monitoring systems, validate our platform's capacity for broad scenario coverage with efficient generation processes. Built on solid theoretical foundations and released as open-source software, SpiderSim facilitates broader research and development in automated security testing for industrial digitalization.

Paperid: 214, https://arxiv.org/pdf/2502.13593.pdf GitHub

Abstract:
Over the past decades, researchers have primarily focused on improving the generalization abilities of models, with limited attention given to regulating such generalization. However, the ability of models to generalize to unintended data (e.g., harmful or unauthorized data) can be exploited by malicious adversaries in unforeseen ways, potentially resulting in violations of model ethics. Non-transferable learning (NTL), a task aimed at reshaping the generalization abilities of deep learning models, was proposed to address these challenges. While numerous methods have been proposed in this field, a comprehensive review of existing progress and a thorough analysis of current limitations remain lacking. In this paper, we bridge this gap by presenting the first comprehensive survey on NTL and introducing NTLBench, the first benchmark to evaluate NTL performance and robustness within a unified framework. Specifically, we first introduce the task settings, general framework, and criteria of NTL, followed by a summary of NTL approaches. Furthermore, we emphasize the often-overlooked issue of robustness against various attacks that can destroy the non-transferable mechanism established by NTL. Experiments conducted via NTLBench verify the limitations of existing NTL methods in robustness. Finally, we discuss the practical applications of NTL, along with its future directions and associated challenges.

Paperid: 215, https://arxiv.org/pdf/2502.13163.pdf GitHub

Abstract:
Vulnerabilities in open-source operating systems (OSs) pose substantial security risks to software systems, making their detection crucial. While fuzzing has been an effective vulnerability detection technique in various domains, OS fuzzing (OSF) faces unique challenges due to OS complexity and multi-layered interaction, and has not been comprehensively reviewed. Therefore, this work systematically surveys the state-of-the-art OSF techniques, categorizes them based on the general fuzzing process, and investigates challenges specific to kernel, file system, driver, and hypervisor fuzzing. Finally, future research directions for OSF are discussed. GitHub: https://github.com/pghk13/Survey-OSF.

Paperid: 216, https://arxiv.org/pdf/2502.12794.pdf GitHub

Abstract:
Differentially private diffusion models (DPDMs) harness the remarkable generative capabilities of diffusion models while enforcing differential privacy (DP) for sensitive data. However, existing DPDM training approaches often suffer from significant utility loss, large memory footprint, and expensive inference cost, impeding their practical uses. To overcome such limitations, we present RAPID: Retrieval Augmented PrIvate Diffusion model, a novel approach that integrates retrieval augmented generation (RAG) into DPDM training. Specifically, RAPID leverages available public data to build a knowledge base of sample trajectories; when training the diffusion model on private data, RAPID computes the early sampling steps as queries, retrieves similar trajectories from the knowledge base as surrogates, and focuses on training the later sampling steps in a differentially private manner. Extensive evaluation using benchmark datasets and models demonstrates that, with the same privacy guarantee, RAPID significantly outperforms state-of-the-art approaches by large margins in generative quality, memory footprint, and inference cost, suggesting that retrieval-augmented DP training represents a promising direction for developing future privacy-preserving generative models. The code is available at: https://github.com/TanqiuJiang/RAPID

Paperid: 217, https://arxiv.org/pdf/2502.12734.pdf GitHub

Abstract:
Machine-generated Text (MGT) detection is crucial for regulating and attributing online texts. While the existing MGT detectors achieve strong performance, they remain vulnerable to simple perturbations and adversarial attacks. To build an effective defense against malicious perturbations, we view MGT detection from a threat modeling perspective, that is, analyzing the model's vulnerability from an adversary's point of view and exploring effective mitigations. To this end, we introduce an adversarial framework for training a robust MGT detector, named GREedy Adversary PromoTed DefendER (GREATER). The GREATER consists of two key components: an adversary GREATER-A and a detector GREATER-D. The GREATER-D learns to defend against the adversarial attack from GREATER-A and generalizes the defense to other attacks. GREATER-A identifies and perturbs the critical tokens in embedding space, along with greedy search and pruning to generate stealthy and disruptive adversarial examples. Besides, we update the GREATER-A and GREATER-D synchronously, encouraging the GREATER-D to generalize its defense to different attacks and varying attack intensities. Our experimental results across 10 text perturbation strategies and 6 adversarial attacks show that our GREATER-D reduces the Attack Success Rate (ASR) by 0.67% compared with SOTA defense methods while our GREATER-A is demonstrated to be more effective and efficient than SOTA attack approaches. Codes and dataset are available in https://github.com/Liyuuuu111/GREATER.

Paperid: 218, https://arxiv.org/pdf/2502.12650.pdf GitHub

Abstract:
We 1) present the first rigorous security, performance, energy, and cost analyses of the state-of-the-art on-DRAM-die read disturbance mitigation method, Per Row Activation Counting (PRAC) and 2) propose Chronus, a new mechanism that addresses PRAC's two major weaknesses. Our analysis shows that PRAC's system performance overhead on benign applications is non-negligible for modern DRAM chips and prohibitively large for future DRAM chips that are more vulnerable to read disturbance. We identify two weaknesses of PRAC that cause these overheads. First, PRAC increases critical DRAM access latency parameters due to the additional time required to increment activation counters. Second, PRAC performs a constant number of preventive refreshes at a time, making it vulnerable to an adversarial access pattern, known as the wave attack, and consequently requiring it to be configured for significantly smaller activation thresholds. To address PRAC's two weaknesses, we propose a new on-DRAM-die RowHammer mitigation mechanism, Chronus. Chronus 1) updates row activation counters concurrently while serving accesses by separating counters from the data and 2) prevents the wave attack by dynamically controlling the number of preventive refreshes performed. Our performance analysis shows that Chronus's system performance overhead is near-zero for modern DRAM chips and very low for future DRAM chips. Chronus outperforms three variants of PRAC and three other state-of-the-art read disturbance solutions. We discuss Chronus's and PRAC's implications for future systems and foreshadow future research directions. To aid future research, we open-source our Chronus implementation at https://github.com/CMU-SAFARI/Chronus.

Paperid: 219, https://arxiv.org/pdf/2502.12575.pdf GitHub

Abstract:
As LLM-based agents become increasingly prevalent, backdoors can be implanted into agents through user queries or environment feedback, raising critical concerns regarding safety vulnerabilities. However, backdoor attacks are typically detectable by safety audits that analyze the reasoning process of agents. To this end, we propose a novel backdoor implantation strategy called \textbf{Dynamically Encrypted Multi-Backdoor Implantation Attack}. Specifically, we introduce dynamic encryption, which maps the backdoor into benign content, effectively circumventing safety audits. To enhance stealthiness, we further decompose the backdoor into multiple sub-backdoor fragments. Based on these advancements, backdoors are allowed to bypass safety audits significantly. Additionally, we present AgentBackdoorEval, a dataset designed for the comprehensive evaluation of agent backdoor attacks. Experimental results across multiple datasets demonstrate that our method achieves an attack success rate nearing 100\% while maintaining a detection rate of 0\%, illustrating its effectiveness in evading safety audits. Our findings highlight the limitations of existing safety mechanisms in detecting advanced attacks, underscoring the urgent need for more robust defenses against backdoor threats. Code and data are available at https://github.com/whfeLingYu/DemonAgent.

Paperid: 220, https://arxiv.org/pdf/2502.12562.pdf GitHub

Abstract:
Multimodal Large Language Models (MLLMs) have serious security vulnerabilities.While safety alignment using multimodal datasets consisting of text and data of additional modalities can effectively enhance MLLM's security, it is costly to construct these datasets. Existing low-resource security alignment methods, including textual alignment, have been found to struggle with the security risks posed by additional modalities. To address this, we propose Synthetic Embedding augmented safety Alignment (SEA), which optimizes embeddings of additional modality through gradient updates to expand textual datasets. This enables multimodal safety alignment training even when only textual data is available. Extensive experiments on image, video, and audio-based MLLMs demonstrate that SEA can synthesize a high-quality embedding on a single RTX3090 GPU within 24 seconds. SEA significantly improves the security of MLLMs when faced with threats from additional modalities. To assess the security risks introduced by video and audio, we also introduced a new benchmark called VA-SafetyBench. High attack success rates across multiple MLLMs validate its challenge. Our code and data will be available at https://github.com/ZeroNLP/SEA.

Paperid: 221, https://arxiv.org/pdf/2502.11798.pdf GitHub

Abstract:
Backdoor learning is a critical research topic for understanding the vulnerabilities of deep neural networks. While the diffusion model (DM) has been broadly deployed in public over the past few years, the understanding of its backdoor vulnerability is still in its infancy compared to the extensive studies in discriminative models. Recently, many different backdoor attack and defense methods have been proposed for DMs, but a comprehensive benchmark for backdoor learning on DMs is still lacking. This absence makes it difficult to conduct fair comparisons and thorough evaluations of the existing approaches, thus hindering future research progress. To address this issue, we propose \textit{BackdoorDM}, the first comprehensive benchmark designed for backdoor learning on DMs. It comprises nine state-of-the-art (SOTA) attack methods, four SOTA defense strategies, and three useful visualization analysis tools. We first systematically classify and formulate the existing literature in a unified framework, focusing on three different backdoor attack types and five backdoor target types, which are restricted to a single type in discriminative models. Then, we systematically summarize the evaluation metrics for each type and propose a unified backdoor evaluation method based on multimodal large language model (MLLM). Finally, we conduct a comprehensive evaluation and highlight several important conclusions. We believe that BackdoorDM will help overcome current barriers and contribute to building a trustworthy artificial intelligence generated content (AIGC) community. The codes are released in https://github.com/linweiii/BackdoorDM.

Paperid: 222, https://arxiv.org/pdf/2502.11745.pdf GitHub

Abstract:
RowHammer is a major read disturbance mechanism in DRAM where repeatedly accessing (hammering) a row of DRAM cells (DRAM row) induces bitflips in physically nearby DRAM rows (victim rows). To ensure robust DRAM operation, state-of-the-art mitigation mechanisms restore the charge in potential victim rows (i.e., they perform preventive refresh or charge restoration). With newer DRAM chip generations, these mechanisms perform preventive refresh more aggressively and cause larger performance, energy, or area overheads. Therefore, it is essential to develop a better understanding and in-depth insights into the preventive refresh to secure real DRAM chips at low cost. In this paper, our goal is to mitigate RowHammer at low cost by understanding the impact of reduced preventive refresh latency on RowHammer. To this end, we present the first rigorous experimental study on the interactions between refresh latency and RowHammer characteristics in real DRAM chips. Our experimental characterization using 388 real DDR4 DRAM chips from three major manufacturers demonstrates that a preventive refresh latency can be significantly reduced (by 64%). To investigate the impact of reduced preventive refresh latency on system performance and energy efficiency, we reduce the preventive refresh latency and adjust the aggressiveness of existing RowHammer solutions by developing a new mechanism, Partial Charge Restoration for Aggressive Mitigation (PaCRAM). Our results show that PaCRAM reduces the performance and energy overheads induced by five state-of-the-art RowHammer mitigation mechanisms with small additional area overhead. Thus, PaCRAM introduces a novel perspective into addressing RowHammer vulnerability at low cost by leveraging our experimental observations. To aid future research, we open-source our PaCRAM implementation at https://github.com/CMU-SAFARI/PaCRAM.

Paperid: 223, https://arxiv.org/pdf/2502.11355.pdf GitHub

Abstract:
Large language models (LLMs) are evolving into autonomous decision-makers, raising concerns about catastrophic risks in high-stakes scenarios, particularly in Chemical, Biological, Radiological and Nuclear (CBRN) domains. Based on the insight that such risks can originate from trade-offs between the agent's Helpful, Harmlessness and Honest (HHH) goals, we build a novel three-stage evaluation framework, which is carefully constructed to effectively and naturally expose such risks. We conduct 14,400 agentic simulations across 12 advanced LLMs, with extensive experiments and analysis. Results reveal that LLM agents can autonomously engage in catastrophic behaviors and deception, without being deliberately induced. Furthermore, stronger reasoning abilities often increase, rather than mitigate, these risks. We also show that these agents can violate instructions and superior commands. On the whole, we empirically prove the existence of catastrophic risks in autonomous LLM agents. We release our code to foster further research.

Paperid: 224, https://arxiv.org/pdf/2502.11127.pdf GitHub

Abstract:
Large Language Model (LLM)-based Multi-agent Systems (MAS) have demonstrated remarkable capabilities in various complex tasks, ranging from collaborative problem-solving to autonomous decision-making. However, as these systems become increasingly integrated into critical applications, their vulnerability to adversarial attacks, misinformation propagation, and unintended behaviors have raised significant concerns. To address this challenge, we introduce G-Safeguard, a topology-guided security lens and treatment for robust LLM-MAS, which leverages graph neural networks to detect anomalies on the multi-agent utterance graph and employ topological intervention for attack remediation. Extensive experiments demonstrate that G-Safeguard: (I) exhibits significant effectiveness under various attack strategies, recovering over 40% of the performance for prompt injection; (II) is highly adaptable to diverse LLM backbones and large-scale MAS; (III) can seamlessly combine with mainstream MAS with security guarantees. The code is available at https://github.com/wslong20/G-safeguard.

Paperid: 225, https://arxiv.org/pdf/2502.11110.pdf GitHub

Abstract:
Homomorphic encryption (HE) is a core building block in privacy-preserving machine learning (PPML), but HE is also widely known as its efficiency bottleneck. Therefore, many GPU-accelerated cryptographic schemes have been proposed to improve the performance of HE. However, these methods often require complex modifications tailored to specific algorithms and are tightly coupled with specific GPU and operating systems. It is interesting to ask how to generally offer more practical GPU-accelerated cryptographic algorithm implementations. Given the powerful code generation capabilities of large language models (LLMs), we aim to explore their potential to automatically generate practical GPU-friendly algorithm code using CPU-friendly code. In this paper, we focus on number theoretic transform (NTT) -- the core mechanism of HE. We first develop and optimize a GPU-friendly NTT (GNTT) family that exploits PyTorch's fast matrix computation and precomputation, achieving an approximately 62x speedup -- a significant boost over existing ones. Then we explore GPU-friendly code generation using various LLMs, including DeepSeek-R1, OpenAI o1 and o3-mini. We discover many interesting findings throughout the process. For instance, somewhat surprisingly, our experiments demonstrate that DeepSeek-R1 significantly outperforms OpenAI o3-mini and o1, but still cannot beat our optimized protocol. The findings provide valuable insights for turbocharging PPML and enhancing code generation capabilities of LLMs. Codes are available at: https://github.com/LMPC-Lab/GenGPUCrypto.

Paperid: 226, https://arxiv.org/pdf/2502.11054.pdf GitHub

Abstract:
Multi-turn jailbreak attacks simulate real-world human interactions by engaging large language models (LLMs) in iterative dialogues, exposing critical safety vulnerabilities. However, existing methods often struggle to balance semantic coherence with attack effectiveness, resulting in either benign semantic drift or ineffective detection evasion. To address this challenge, we propose Reasoning-Augmented Conversation, a novel multi-turn jailbreak framework that reformulates harmful queries into benign reasoning tasks and leverages LLMs' strong reasoning capabilities to compromise safety alignment. Specifically, we introduce an attack state machine framework to systematically model problem translation and iterative reasoning, ensuring coherent query generation across multiple turns. Building on this framework, we design gain-guided exploration, self-play, and rejection feedback modules to preserve attack semantics, enhance effectiveness, and sustain reasoning-driven attack progression. Extensive experiments on multiple LLMs demonstrate that RACE achieves state-of-the-art attack effectiveness in complex conversational scenarios, with attack success rates (ASRs) increasing by up to 96%. Notably, our approach achieves ASRs of 82% and 92% against leading commercial models, OpenAI o1 and DeepSeek R1, underscoring its potency. We release our code at https://github.com/NY1024/RACE to facilitate further research in this critical domain.

Paperid: 227, https://arxiv.org/pdf/2502.09990.pdf GitHub

Abstract:
Despite the rapid development of safety alignment techniques for LLMs, defending against multi-turn jailbreaks is still a challenging task. In this paper, we conduct a comprehensive comparison, revealing that some existing defense methods can improve the robustness of LLMs against multi-turn jailbreaks but compromise usability, i.e., reducing general capabilities or causing the over-refusal problem. From the perspective of mechanism interpretability of LLMs, we discover that these methods fail to establish a boundary that exactly distinguishes safe and harmful feature representations. Therefore, boundary-safe representations close to harmful representations are inevitably disrupted, leading to a decline in usability. To address this issue, we propose X-Boundary to push harmful representations away from boundary-safe representations and obtain an exact distinction boundary. In this way, harmful representations can be precisely erased without disrupting safe ones. Experimental results show that X-Boundary achieves state-of-the-art defense performance against multi-turn jailbreaks, while reducing the over-refusal rate by about 20% and maintaining nearly complete general capability. Furthermore, we theoretically prove and empirically verify that X-Boundary can accelerate the convergence process during training. Please see our code at: https://github.com/AI45Lab/X-Boundary.

Paperid: 228, https://arxiv.org/pdf/2502.09941.pdf GitHub

Abstract:
Current image tampering localization methods primarily rely on Convolutional Neural Networks (CNNs) and Transformers. While CNNs suffer from limited local receptive fields, Transformers offer global context modeling at the expense of quadratic computational complexity. Recently, the state space model Mamba has emerged as a competitive alternative, enabling linear-complexity global dependency modeling. Inspired by it, we propose a lightweight and effective FORensic network based on vision MAmba (ForMa) for blind image tampering localization. Firstly, ForMa captures multi-scale global features that achieves efficient global dependency modeling through linear complexity. Then the pixel-wise localization map is generated by a lightweight decoder, which employs a parameter-free pixel shuffle layer for upsampling. Additionally, a noise-assisted decoding strategy is proposed to integrate complementary manipulation traces from tampered images, boosting decoder sensitivity to forgery cues. Experimental results on 10 standard datasets demonstrate that ForMa achieves state-of-the-art generalization ability and robustness, while maintaining the lowest computational complexity. Code is available at https://github.com/multimediaFor/ForMa.

Paperid: 229, https://arxiv.org/pdf/2502.09755.pdf GitHub

Abstract:
Safety-aligned LLMs respond to prompts with either compliance or refusal, each corresponding to distinct directions in the model's activation space. Recent works show that initializing attacks via self-transfer from other prompts significantly enhances their performance. However, the underlying mechanisms of these initializations remain unclear, and attacks utilize arbitrary or hand-picked initializations. This work presents that each gradient-based jailbreak attack and subsequent initialization gradually converge to a single compliance direction that suppresses refusal, thereby enabling an efficient transition from refusal to compliance. Based on this insight, we propose CRI, an initialization framework that aims to project unseen prompts further along compliance directions. We demonstrate our approach on multiple attacks, models, and datasets, achieving an increased attack success rate (ASR) and reduced computational overhead, highlighting the fragility of safety-aligned LLMs. A reference implementation is available at: https://amit1221levi.github.io/CRI-Jailbreak-Init-LLMs-evaluation.

Paperid: 230, https://arxiv.org/pdf/2502.09723.pdf GitHub

Abstract:
Recent advances in large language models (LLMs) have demonstrated remarkable potential in the field of natural language processing. Unfortunately, LLMs face significant security and ethical risks. Although techniques such as safety alignment are developed for defense, prior researches reveal the possibility of bypassing such defenses through well-designed jailbreak attacks. In this paper, we propose QueryAttack, a novel framework to examine the generalizability of safety alignment. By treating LLMs as knowledge databases, we translate malicious queries in natural language into structured non-natural query language to bypass the safety alignment mechanisms of LLMs. We conduct extensive experiments on mainstream LLMs, and the results show that QueryAttack not only can achieve high attack success rates (ASRs), but also can jailbreak various defense methods. Furthermore, we tailor a defense method against QueryAttack, which can reduce ASR by up to $64\%$ on GPT-4-1106. Our code is available at https://github.com/horizonsinzqs/QueryAttack.

Paperid: 231, https://arxiv.org/pdf/2502.09084.pdf GitHub

Abstract:
Operating System (OS) fingerprinting is essential for network management and cybersecurity, enabling accurate device identification based on network traffic analysis. Traditional rule-based tools such as Nmap and p0f face challenges in dynamic environments due to frequent OS updates and obfuscation techniques. While Machine Learning (ML) approaches have been explored, Deep Learning (DL) models, particularly Transformer architectures, remain unexploited in this domain. This study investigates the application of Tabular Transformer architectures-specifically TabTransformer and FT-Transformer-for OS fingerprinting, leveraging structured network data from three publicly available datasets. Our experiments demonstrate that FT-Transformer generally outperforms traditional ML models, previous approaches and TabTransformer across multiple classification levels (OS family, major, and minor versions). The results establish a strong foundation for DL-based OS fingerprinting, improving accuracy and adaptability in complex network environments. Furthermore, we ensure the reproducibility of our research by providing an open-source implementation.

Paperid: 232, https://arxiv.org/pdf/2502.08193.pdf GitHub

Abstract:
Large Vision-Language Models (LVLMs) are susceptible to typographic attacks, which are misclassifications caused by an attack text that is added to an image. In this paper, we introduce a multi-image setting for studying typographic attacks, broadening the current emphasis of the literature on attacking individual images. Specifically, our focus is on attacking image sets without repeating the attack query. Such non-repeating attacks are stealthier, as they are more likely to evade a gatekeeper than attacks that repeat the same attack text. We introduce two attack strategies for the multi-image setting, leveraging the difficulty of the target image, the strength of the attack text, and text-image similarity. Our text-image similarity approach improves attack success rates by 21% over random, non-specific methods on the CLIP model using ImageNet while maintaining stealth in a multi-image scenario. An additional experiment demonstrates transferability, i.e., text-image similarity calculated using CLIP transfers when attacking InstructBLIP.

Paperid: 233, https://arxiv.org/pdf/2502.07231.pdf GitHub

Abstract:
Backdoor attacks occur when an attacker subtly manipulates machine learning models during the training phase, leading to unintended behaviors when specific triggers are present. To mitigate such emerging threats, a prevalent strategy is to cleanse the victim models by various backdoor purification techniques. Despite notable achievements, current state-of-the-art (SOTA) backdoor purification techniques usually rely on the availability of a small clean dataset, often referred to as auxiliary dataset. However, acquiring an ideal auxiliary dataset poses significant challenges in real-world applications. This study begins by assessing the SOTA backdoor purification techniques across different types of real-world auxiliary datasets. Our findings indicate that the purification effectiveness fluctuates significantly depending on the type of auxiliary dataset used. Specifically, a high-quality in-distribution auxiliary dataset is essential for effective purification, whereas datasets from varied or out-of-distribution sources significantly degrade the defensive performance. Based on this, we propose Guided Input Calibration (GIC), which aims to improve purification efficacy by employing a learnable transformation. Guided by the victim model itself, GIC aligns the characteristics of the auxiliary dataset with those of the original training set. Comprehensive experiments demonstrate that GIC can substantially enhance purification performance across diverse types of auxiliary datasets. The code and data will be available via https://github.com/shawkui/BackdoorBenchER.

Paperid: 234, https://arxiv.org/pdf/2502.07049.pdf GitHub

Abstract:
Large Language Models (LLMs) are emerging as transformative tools for software vulnerability detection, addressing critical challenges in the security domain. Traditional methods, such as static and dynamic analysis, often falter due to inefficiencies, high false positive rates, and the growing complexity of modern software systems. By leveraging their ability to analyze code structures, identify patterns, and generate repair suggestions, LLMs, exemplified by models like GPT, BERT, and CodeBERT, present a novel and scalable approach to mitigating vulnerabilities. This paper provides a detailed survey of LLMs in vulnerability detection. It examines key aspects, including model architectures, application methods, target languages, fine-tuning strategies, datasets, and evaluation metrics. We also analyze the scope of current research problems, highlighting the strengths and weaknesses of existing approaches. Further, we address challenges such as cross-language vulnerability detection, multimodal data integration, and repository-level analysis. Based on these findings, we propose solutions for issues like dataset scalability, model interpretability, and applications in low-resource scenarios. Our contributions are threefold: (1) a systematic review of how LLMs are applied in vulnerability detection; (2) an analysis of shared patterns and differences across studies, with a unified framework for understanding the field; and (3) a summary of key challenges and future research directions. This work provides valuable insights for advancing LLM-based vulnerability detection. We also maintain and regularly update latest selected paper on https://github.com/OwenSanzas/LLM-For-Vulnerability-Detection

Paperid: 235, https://arxiv.org/pdf/2502.05505.pdf GitHub

Abstract:
Differentially private (DP) synthetic data, which closely resembles the original private data while maintaining strong privacy guarantees, has become a key tool for unlocking the value of private data without compromising privacy. Recently, Private Evolution (PE) has emerged as a promising method for generating DP synthetic data. Unlike other training-based approaches, PE only requires access to inference APIs from foundation models, enabling it to harness the power of state-of-the-art (SoTA) models. However, a suitable foundation model for a specific private data domain is not always available. In this paper, we discover that the PE framework is sufficiently general to allow APIs beyond foundation models. In particular, we demonstrate that many SoTA data synthesizers that do not rely on neural networks--such as computer graphics-based image generators, which we refer to as simulators--can be effectively integrated into PE. This insight significantly broadens PE's applicability and unlocks the potential of powerful simulators for DP data synthesis. We explore this approach, named Sim-PE, in the context of image synthesis. Across four diverse simulators, Sim-PE performs well, improving the downstream classification accuracy of PE by up to 3x, reducing FID by up to 80%, and offering much greater efficiency. We also show that simulators and foundation models can be easily leveraged together within PE to achieve further improvements. The code is open-sourced in the Private Evolution Python library: https://github.com/microsoft/DPSDA.

Paperid: 236, https://arxiv.org/pdf/2502.05215.pdf GitHub

Abstract:
Watermarking embeds information into digital content like images, audio, or text, imperceptible to humans but robustly detectable by specific algorithms. This technology has important applications in many challenges of the industry such as content moderation, tracing AI-generated content, and monitoring the usage of AI models. The contributions of this thesis include the development of new watermarking techniques for images, audio, and text. We first introduce methods for active moderation of images on social platforms. We then develop specific techniques for AI-generated content. We specifically demonstrate methods to adapt latent generative models to embed watermarks in all generated content, identify watermarked sections in speech, and improve watermarking in large language models with tests that ensure low false positive rates. Furthermore, we explore the use of digital watermarking to detect model misuse, including the detection of watermarks in language models fine-tuned on watermarked text, and introduce training-free watermarks for the weights of large transformers. Through these contributions, the thesis provides effective solutions for the challenges posed by the increasing use of generative AI models and the need for model monitoring and content moderation. It finally examines the challenges and limitations of watermarking techniques and discuss potential future directions for research in this area.

Paperid: 237, https://arxiv.org/pdf/2502.05206.pdf GitHub

Authors:Xingjun Ma, Yifeng Gao, Yixu Wang, Ruofan Wang, Xin Wang, Ye Sun, Yifan Ding, Hengyuan Xu, Yunhao Chen, Yunhan Zhao, Hanxun Huang, Yige Li, Yutao Wu, Jiaming Zhang, Xiang Zheng, Yang Bai, Zuxuan Wu, Xipeng Qiu, Jingfeng Zhang, Yiming Li, Xudong Han, Haonan Li, Jun Sun, Cong Wang, Jindong Gu, Baoyuan Wu, Siheng Chen, Tianwei Zhang, Yang Liu, Mingming Gong, Tongliang Liu, Shirui Pan, Cihang Xie, Tianyu Pang, Yinpeng Dong, Ruoxi Jia, Yang Zhang, Shiqing Ma, Xiangyu Zhang, Neil Gong, Chaowei Xiao, Sarah Erfani, Tim Baldwin, Bo Li, Masashi Sugiyama, Dacheng Tao, James Bailey, Yu-Gang Jiang

Abstract:
The rapid advancement of large models, driven by their exceptional abilities in learning and generalization through large-scale pre-training, has reshaped the landscape of Artificial Intelligence (AI). These models are now foundational to a wide range of applications, including conversational AI, recommendation systems, autonomous driving, content generation, medical diagnostics, and scientific discovery. However, their widespread deployment also exposes them to significant safety risks, raising concerns about robustness, reliability, and ethical implications. This survey provides a systematic review of current safety research on large models, covering Vision Foundation Models (VFMs), Large Language Models (LLMs), Vision-Language Pre-training (VLP) models, Vision-Language Models (VLMs), Diffusion Models (DMs), and large-model-powered Agents. Our contributions are summarized as follows: (1) We present a comprehensive taxonomy of safety threats to these models, including adversarial attacks, data poisoning, backdoor attacks, jailbreak and prompt injection attacks, energy-latency attacks, data and model extraction attacks, and emerging agent-specific threats. (2) We review defense strategies proposed for each type of attacks if available and summarize the commonly used datasets and benchmarks for safety research. (3) Building on this, we identify and discuss the open challenges in large model safety, emphasizing the need for comprehensive safety evaluations, scalable and effective defense mechanisms, and sustainable data practices. More importantly, we highlight the necessity of collective efforts from the research community and international collaboration. Our work can serve as a useful reference for researchers and practitioners, fostering the ongoing development of comprehensive defense systems and platforms to safeguard AI models.

Paperid: 238, https://arxiv.org/pdf/2502.05174.pdf GitHub

Abstract:
Recent research has explored that LLM agents are vulnerable to indirect prompt injection (IPI) attacks, where malicious tasks embedded in tool-retrieved information can redirect the agent to take unauthorized actions. Existing defenses against IPI have significant limitations: either require essential model training resources, lack effectiveness against sophisticated attacks, or harm the normal utilities. We present MELON (Masked re-Execution and TooL comparisON), a novel IPI defense. Our approach builds on the observation that under a successful attack, the agent's next action becomes less dependent on user tasks and more on malicious tasks. Following this, we design MELON to detect attacks by re-executing the agent's trajectory with a masked user prompt modified through a masking function. We identify an attack if the actions generated in the original and masked executions are similar. We also include three key designs to reduce the potential false positives and false negatives. Extensive evaluation on the IPI benchmark AgentDojo demonstrates that MELON outperforms SOTA defenses in both attack prevention and utility preservation. Moreover, we show that combining MELON with a SOTA prompt augmentation defense (denoted as MELON-Aug) further improves its performance. We also conduct a detailed ablation study to validate our key designs. Code is available at https://github.com/kaijiezhu11/MELON.

Paperid: 239, https://arxiv.org/pdf/2502.04643.pdf GitHub

Abstract:
A fundamental issue in deep learning has been adversarial robustness. As these systems have scaled, such issues have persisted. Currently, large language models (LLMs) with billions of parameters suffer from adversarial attacks just like their earlier, smaller counterparts. However, the threat models have changed. Previously, having gray-box access, where input embeddings or output logits/probabilities were visible to the user, might have been reasonable. However, with the introduction of closed-source models, no information about the model is available apart from the generated output. This means that current black-box attacks can only utilize the final prediction to detect if an attack is successful. In this work, we investigate and demonstrate the potential of attack guidance, akin to using output probabilities, while having only black-box access in a classification setting. This is achieved through the ability to elicit confidence from the model. We empirically show that the elicited confidence is calibrated and not hallucinated for current LLMs. By minimizing the elicited confidence, we can therefore increase the likelihood of misclassification. Our new proposed paradigm demonstrates promising state-of-the-art results on three datasets across two models (LLaMA-3-8B-Instruct and Mistral-7B-Instruct-V0.3) when comparing our technique to existing hard-label black-box attack methods that introduce word-level substitutions.

Paperid: 240, https://arxiv.org/pdf/2502.04230.pdf GitHub

Abstract:
The rapid proliferation of generative audio synthesis and editing technologies has raised significant concerns about copyright infringement, data provenance, and the spread of misinformation through deepfake audio. Watermarking offers a proactive solution by embedding imperceptible, identifiable, and traceable marks into audio content. While recent neural network-based watermarking methods like WavMark and AudioSeal have improved robustness and quality, they struggle to achieve both robust detection and accurate attribution simultaneously. This paper introduces Cross-Attention Robust Audio Watermark (XAttnMark), which bridges this gap by leveraging partial parameter sharing between the generator and the detector, a cross-attention mechanism for efficient message retrieval, and a temporal conditioning module for improved message distribution. Additionally, we propose a psychoacoustic-aligned temporal-frequency masking loss that captures fine-grained auditory masking effects, enhancing watermark imperceptibility. Our approach achieves state-of-the-art performance in both detection and attribution, demonstrating superior robustness against a wide range of audio transformations, including challenging generative editing with strong editing strength. The project webpage is available at https://liuyixin-louis.github.io/xattnmark/.

Paperid: 241, https://arxiv.org/pdf/2502.04204.pdf GitHub

Abstract:
Jailbreak attacks against large language models (LLMs) aim to induce harmful behaviors in LLMs through carefully crafted adversarial prompts. To mitigate attacks, one way is to perform adversarial training (AT)-based alignment, i.e., training LLMs on some of the most adversarial prompts to help them learn how to behave safely under attacks. During AT, the length of adversarial prompts plays a critical role in the robustness of aligned LLMs. While long-length adversarial prompts during AT might lead to strong LLM robustness, their synthesis however is very resource-consuming, which may limit the application of LLM AT. This paper focuses on adversarial suffix jailbreak attacks and unveils that to defend against a jailbreak attack with an adversarial suffix of length $Î(M)$, it is enough to align LLMs on prompts with adversarial suffixes of length $Î(\sqrt{M})$. Theoretically, we analyze the adversarial in-context learning of linear transformers on linear regression tasks and prove a robust generalization bound for trained transformers. The bound depends on the term $Î(\sqrt{M_{\text{test}}}/M_{\text{train}})$, where $M_{\text{train}}$ and $M_{\text{test}}$ are the numbers of adversarially perturbed in-context samples during training and testing. Empirically, we conduct AT on popular open-source LLMs and evaluate their robustness against jailbreak attacks of different adversarial suffix lengths. Results confirm a positive correlation between the attack success rate and the ratio of the square root of the adversarial suffix length during jailbreaking to the length during AT. Our findings show that it is practical to defend against ``long-length'' jailbreak attacks via efficient ``short-length'' AT. The code is available at https://github.com/fshp971/adv-icl.

Paperid: 242, https://arxiv.org/pdf/2502.03801.pdf GitHub

Abstract:
Federated learning (FL) enables collaborative model training while preserving data privacy, but its decentralized nature exposes it to client-side data poisoning attacks (DPAs) and model poisoning attacks (MPAs) that degrade global model performance. While numerous proposed defenses claim substantial effectiveness, their evaluation is typically done in isolation with limited attack strategies, raising concerns about their validity. Additionally, existing studies overlook the mutual effectiveness of defenses against both DPAs and MPAs, causing fragmentation in this field. This paper aims to provide a unified benchmark and analysis of defenses against DPAs and MPAs, clarifying the distinction between these two similar but slightly distinct domains. We present a systematic taxonomy of poisoning attacks and defense strategies, outlining their design, strengths, and limitations. Then, a unified comparative evaluation across FL algorithms and data heterogeneity is conducted to validate their individual and mutual effectiveness and derive key insights for design principles and future research. Along with the analysis, we frame our work to a unified benchmark, FLPoison, with high modularity and scalability to evaluate 15 representative poisoning attacks and 17 defense strategies, facilitating future research in this domain. Code is available at https://github.com/vio1etus/FLPoison.

Paperid: 243, https://arxiv.org/pdf/2502.03773.pdf GitHub

Abstract:
In principle, explanations are intended as a way to increase trust in machine learning models and are often obligated by regulations. However, many circumstances where these are demanded are adversarial in nature, meaning the involved parties have misaligned interests and are incentivized to manipulate explanations for their purpose. As a result, explainability methods fail to be operational in such settings despite the demand \cite{bordt2022post}. In this paper, we take a step towards operationalizing explanations in adversarial scenarios with Zero-Knowledge Proofs (ZKPs), a cryptographic primitive. Specifically we explore ZKP-amenable versions of the popular explainability algorithm LIME and evaluate their performance on Neural Networks and Random Forests. Our code is publicly available at https://github.com/emlaufer/ExpProof.

Paperid: 244, https://arxiv.org/pdf/2502.02844.pdf GitHub

Abstract:
Traditional robust methods in multi-agent reinforcement learning (MARL) often struggle against coordinated adversarial attacks in cooperative scenarios. To address this limitation, we propose the Wolfpack Adversarial Attack framework, inspired by wolf hunting strategies, which targets an initial agent and its assisting agents to disrupt cooperation. Additionally, we introduce the Wolfpack-Adversarial Learning for MARL (WALL) framework, which trains robust MARL policies to defend against the proposed Wolfpack attack by fostering systemwide collaboration. Experimental results underscore the devastating impact of the Wolfpack attack and the significant robustness improvements achieved by WALL. Our code is available at https://github.com/sunwoolee0504/WALL.

Paperid: 245, https://arxiv.org/pdf/2502.02747.pdf GitHub

Abstract:
Recent research builds various patching agents that combine large language models (LLMs) with non-ML tools and achieve promising results on the state-of-the-art (SOTA) software patching benchmark, SWE-bench. Based on how to determine the patching workflows, existing patching agents can be categorized as agent-based planning methods, which rely on LLMs for planning, and rule-based planning methods, which follow a pre-defined workflow. At a high level, agent-based planning methods achieve high patching performance but with a high cost and limited stability. Rule-based planning methods, on the other hand, are more stable and efficient but have key workflow limitations that compromise their patching performance. In this paper, we propose PatchPilot, an agentic patcher that strikes a balance between patching efficacy, stability, and cost-efficiency. PatchPilot proposes a novel rule-based planning workflow with five components: reproduction, localization, generation, validation, and refinement (where refinement is unique to PatchPilot). We introduce novel and customized designs to each component to optimize their effectiveness and efficiency. Through extensive experiments on the SWE-bench benchmarks, PatchPilot shows a superior performance than existing open-source methods while maintaining low cost (less than 1$ per instance) and ensuring higher stability. We also conduct a detailed ablation study to validate the key designs in each component. Our code is available at https://github.com/ucsb-mlsec/PatchPilot.

Paperid: 246, https://arxiv.org/pdf/2502.01925.pdf GitHub

Abstract:
Many-shot jailbreaking circumvents the safety alignment of LLMs by exploiting their ability to process long input sequences. To achieve this, the malicious target prompt is prefixed with hundreds of fabricated conversational exchanges between the user and the model. These exchanges are randomly sampled from a pool of unsafe question-answer pairs, making it appear as though the model has already complied with harmful instructions. In this paper, we present PANDAS: a hybrid technique that improves many-shot jailbreaking by modifying these fabricated dialogues with Positive Affirmations, Negative Demonstrations, and an optimized Adaptive Sampling method tailored to the target prompt's topic. We also introduce ManyHarm, a dataset of harmful question-answer pairs, and demonstrate through extensive experiments that PANDAS significantly outperforms baseline methods in long-context scenarios. Through attention analysis, we provide insights into how long-context vulnerabilities are exploited and show how PANDAS further improves upon many-shot jailbreaking.

Paperid: 247, https://arxiv.org/pdf/2502.00757.pdf GitHub

Abstract:
Scaffolding Large Language Models (LLMs) into multi-agent systems often improves performance on complex tasks, but the safety impact of such scaffolds has not been thoroughly explored. We introduce AgentBreeder, a framework for multi-objective self-improving evolutionary search over scaffolds. We evaluate discovered scaffolds on widely recognized reasoning, mathematics, and safety benchmarks and compare them with popular baselines. In 'blue' mode, we see a 79.4% average uplift in safety benchmark performance while maintaining or improving capability scores. In 'red' mode, we find adversarially weak scaffolds emerging concurrently with capability optimization. Our work demonstrates the risks of multi-agent scaffolding and provides a framework for mitigating them. Code is available at https://github.com/J-Rosser-UK/AgentBreeder.

Paperid: 248, https://arxiv.org/pdf/2502.00354.pdf GitHub

Abstract:
Federated learning (FL) has gained widespread attention for its privacy-preserving and collaborative learning capabilities. Due to significant statistical heterogeneity, traditional FL struggles to generalize a shared model across diverse data domains. Personalized federated learning addresses this issue by dividing the model into a globally shared part and a locally private part, with the local model correcting representation biases introduced by the global model. Nevertheless, locally converged parameters more accurately capture domain-specific knowledge, and current methods overlook the potential benefits of these parameters. To address these limitations, we propose PM-MoE architecture. This architecture integrates a mixture of personalized modules and an energy-based personalized modules denoising, enabling each client to select beneficial personalized parameters from other clients. We applied the PM-MoE architecture to nine recent model-split-based personalized federated learning algorithms, achieving performance improvements with minimal additional training. Extensive experiments on six widely adopted datasets and two heterogeneity settings validate the effectiveness of our approach. The source code is available at \url{https://github.com/dannis97500/PM-MOE}.

Paperid: 249, https://arxiv.org/pdf/2502.00156.pdf GitHub

Abstract:
Bias in machine learning models can lead to unfair decision making, and while it has been well-studied in the image and text domains, it remains underexplored in action recognition. Action recognition models often suffer from background bias (i.e., inferring actions based on background cues) and foreground bias (i.e., relying on subject appearance), which can be detrimental to real-life applications such as autonomous vehicles or assisted living monitoring. While prior approaches have mainly focused on mitigating background bias using specialized augmentations, we thoroughly study both foreground and background bias. We propose ALBAR, a novel adversarial training method that mitigates foreground and background biases without requiring specialized knowledge of the bias attributes. Our framework applies an adversarial cross-entropy loss to the sampled static clip (where all the frames are the same) and aims to make its class probabilities uniform using a proposed entropy maximization loss. Additionally, we introduce a gradient penalty loss for regularization against the debiasing process. We evaluate our method on established background and foreground bias protocols, setting a new state-of-the-art and strongly improving combined debiasing performance by over 12% absolute on HMDB51. Furthermore, we identify an issue of background leakage in the existing UCF101 protocol for bias evaluation which provides a shortcut to predict actions and does not provide an accurate measure of the debiasing capability of a model. We address this issue by proposing more fine-grained segmentation boundaries for the actor, where our method also outperforms existing approaches. Project Page: https://joefioresi718.github.io/ALBAR_webpage/

Paperid: 250, https://arxiv.org/pdf/2501.18934.pdf GitHub

Abstract:
The rapid adoption of deep learning in sensitive domains has brought tremendous benefits. However, this widespread adoption has also given rise to serious vulnerabilities, particularly model inversion (MI) attacks, posing a significant threat to the privacy and integrity of personal data. The increasing prevalence of these attacks in applications such as biometrics, healthcare, and finance has created an urgent need to understand their mechanisms, impacts, and defense methods. This survey aims to fill the gap in the literature by providing a structured and in-depth review of MI attacks and defense strategies. Our contributions include a systematic taxonomy of MI attacks, extensive research on attack techniques and defense mechanisms, and a discussion about the challenges and future research directions in this evolving field. By exploring the technical and ethical implications of MI attacks, this survey aims to offer insights into the impact of AI-powered systems on privacy, security, and trust. In conjunction with this survey, we have developed a comprehensive repository to support research on MI attacks and defenses. The repository includes state-of-the-art research papers, datasets, evaluation metrics, and other resources to meet the needs of both novice and experienced researchers interested in MI attacks and defenses, as well as the broader field of AI security and privacy. The repository will be continuously maintained to ensure its relevance and utility. It is accessible at https://github.com/overgter/Deep-Learning-Model-Inversion-Attacks-and-Defenses.

Paperid: 251, https://arxiv.org/pdf/2501.18877.pdf GitHub

Abstract:
Text-to-image diffusion models show remarkable generation performance following text prompts, but risk generating Not Safe For Work (NSFW) contents from unsafe prompts. Existing approaches, such as prompt filtering or concept unlearning, fail to defend against adversarial attacks while maintaining benign image quality. In this paper, we propose a novel approach called Distorting Embedding Space (DES), a text encoder-based defense mechanism that effectively tackles these issues through innovative embedding space control. DES transforms unsafe embeddings, extracted from a text encoder using unsafe prompts, toward carefully calculated safe embedding regions to prevent unsafe contents generation, while reproducing the original safe embeddings. DES also neutralizes the nudity embedding, extracted using prompt ``nudity", by aligning it with neutral embedding to enhance robustness against adversarial attacks. These methods ensure both robust defense and high-quality image generation. Additionally, DES can be adopted in a plug-and-play manner and requires zero inference overhead, facilitating its deployment. Extensive experiments on diverse attack types, including black-box and white-box scenarios, demonstrate DES's state-of-the-art performance in both defense capability and benign image generation quality. Our model is available at https://github.com/aei13/DES.

Paperid: 252, https://arxiv.org/pdf/2501.18638.pdf GitHub

Abstract:
As large language models (LLMs) become increasingly prevalent, ensuring their robustness against adversarial misuse is crucial. This paper introduces the GAP (Graph of Attacks with Pruning) framework, an advanced approach for generating stealthy jailbreak prompts to evaluate and enhance LLM safeguards. GAP addresses limitations in existing tree-based LLM jailbreak methods by implementing an interconnected graph structure that enables knowledge sharing across attack paths. Our experimental evaluation demonstrates GAP's superiority over existing techniques, achieving a 20.8% increase in attack success rates while reducing query costs by 62.7%. GAP consistently outperforms state-of-the-art methods for attacking both open and closed LLMs, with attack success rates of >96%. Additionally, we present specialized variants like GAP-Auto for automated seed generation and GAP-VLM for multimodal attacks. GAP-generated prompts prove highly effective in improving content moderation systems, increasing true positive detection rates by 108.5% and accuracy by 183.6% when used for fine-tuning. Our implementation is available at https://github.com/dsbuddy/GAP-LLM-Safety.

Paperid: 253, https://arxiv.org/pdf/2501.18636.pdf GitHub

Abstract:
The indexing-retrieval-generation paradigm of retrieval-augmented generation (RAG) has been highly successful in solving knowledge-intensive tasks by integrating external knowledge into large language models (LLMs). However, the incorporation of external and unverified knowledge increases the vulnerability of LLMs because attackers can perform attack tasks by manipulating knowledge. In this paper, we introduce a benchmark named SafeRAG designed to evaluate the RAG security. First, we classify attack tasks into silver noise, inter-context conflict, soft ad, and white Denial-of-Service. Next, we construct RAG security evaluation dataset (i.e., SafeRAG dataset) primarily manually for each task. We then utilize the SafeRAG dataset to simulate various attack scenarios that RAG may encounter. Experiments conducted on 14 representative RAG components demonstrate that RAG exhibits significant vulnerability to all attack tasks and even the most apparent attack task can easily bypass existing retrievers, filters, or advanced LLMs, resulting in the degradation of RAG service quality. Code is available at: https://github.com/IAAR-Shanghai/SafeRAG.

Paperid: 254, https://arxiv.org/pdf/2501.18492.pdf GitHub

Abstract:
As LLMs increasingly impact safety-critical applications, ensuring their safety using guardrails remains a key challenge. This paper proposes GuardReasoner, a new safeguard for LLMs, by guiding the guard model to learn to reason. Concretely, we first create the GuardReasonerTrain dataset, which consists of 127K samples with 460K detailed reasoning steps. Then, we introduce reasoning SFT to unlock the reasoning capability of guard models. In addition, we present hard sample DPO to further strengthen their reasoning ability. In this manner, GuardReasoner achieves better performance, explainability, and generalizability. Extensive experiments and analyses on 13 benchmarks of 3 guardrail tasks demonstrate its superiority. Remarkably, GuardReasoner 8B surpasses GPT-4o+CoT by 5.74% and LLaMA Guard 3 8B by 20.84% F1 score on average. We release the training data, code, and models with different scales (1B, 3B, 8B) of GuardReasoner : https://github.com/yueliu1999/GuardReasoner/.

Paperid: 255, https://arxiv.org/pdf/2501.17858.pdf GitHub

Abstract:
Chatbot Arena is a popular platform for evaluating LLMs by pairwise battles, where users vote for their preferred response from two randomly sampled anonymous models. While Chatbot Arena is widely regarded as a reliable LLM ranking leaderboard, we show that crowdsourced voting can be rigged to improve (or decrease) the ranking of a target model $m_{t}$. We first introduce a straightforward target-only rigging strategy that focuses on new battles involving $m_{t}$, identifying it via watermarking or a binary classifier, and exclusively voting for $m_{t}$ wins. However, this strategy is practically inefficient because there are over $190$ models on Chatbot Arena and on average only about $1\%$ of new battles will involve $m_{t}$. To overcome this, we propose omnipresent rigging strategies, exploiting the Elo rating mechanism of Chatbot Arena that any new vote on a battle can influence the ranking of the target model $m_{t}$, even if $m_{t}$ is not directly involved in the battle. We conduct experiments on around $1.7$ million historical votes from the Chatbot Arena Notebook, showing that omnipresent rigging strategies can improve model rankings by rigging only hundreds of new votes. While we have evaluated several defense mechanisms, our findings highlight the importance of continued efforts to prevent vote rigging. Our code is available at https://github.com/sail-sg/Rigging-ChatbotArena.

Paperid: 256, https://arxiv.org/pdf/2501.17667.pdf GitHub GitHub

Abstract:
Deep reinforcement learning (DRL) has gained widespread adoption in control and decision-making tasks due to its strong performance in dynamic environments. However, DRL agents are vulnerable to noisy observations and adversarial attacks, and concerns about the adversarial robustness of DRL systems have emerged. Recent efforts have focused on addressing these robustness issues by establishing rigorous theoretical guarantees for the returns achieved by DRL agents in adversarial settings. Among these approaches, policy smoothing has proven to be an effective and scalable method for certifying the robustness of DRL agents. Nevertheless, existing certifiably robust DRL relies on policies trained with simple Gaussian augmentations, resulting in a suboptimal trade-off between certified robustness and certified return. To address this issue, we introduce a novel paradigm dubbed \texttt{C}ertified-r\texttt{A}dius-\texttt{M}aximizing \texttt{P}olicy (\texttt{CAMP}) training. \texttt{CAMP} is designed to enhance DRL policies, achieving better utility without compromising provable robustness. By leveraging the insight that the global certified radius can be derived from local certified radii based on training-time statistics, \texttt{CAMP} formulates a surrogate loss related to the local certified radius and optimizes the policy guided by this surrogate loss. We also introduce \textit{policy imitation} as a novel technique to stabilize \texttt{CAMP} training. Experimental results demonstrate that \texttt{CAMP} significantly improves the robustness-return trade-off across various tasks. Based on the results, \texttt{CAMP} can achieve up to twice the certified expected return compared to that of baselines. Our code is available at https://github.com/NeuralSec/camp-robust-rl.

Paperid: 257, https://arxiv.org/pdf/2501.17433.pdf GitHub

Abstract:
Recent research shows that Large Language Models (LLMs) are vulnerable to harmful fine-tuning attacks -- models lose their safety alignment ability after fine-tuning on a few harmful samples. For risk mitigation, a guardrail is typically used to filter out harmful samples before fine-tuning. By designing a new red-teaming method, we in this paper show that purely relying on the moderation guardrail for data filtration is not reliable. Our proposed attack method, dubbed Virus, easily bypasses the guardrail moderation by slightly modifying the harmful data. Experimental results show that the harmful data optimized by Virus is not detectable by the guardrail with up to 100\% leakage ratio, and can simultaneously achieve superior attack performance. Finally, the key message we want to convey through this paper is that: \textbf{it is reckless to consider guardrail moderation as a clutch at straws towards harmful fine-tuning attack}, as it cannot solve the inherent safety issue of the pre-trained LLMs. Our code is available at https://github.com/git-disl/Virus

Paperid: 258, https://arxiv.org/pdf/2501.15553.pdf GitHub

Abstract:
Cybercriminals pose a significant threat to blockchain trading security, causing $40.9 billion in losses in 2024. However, the lack of an effective real-world address dataset hinders the advancement of cybercrime detection research. The anti-cybercrime efforts of researchers from broader fields, such as statistics and artificial intelligence, are blocked by data scarcity. In this paper, we present Real-CATS, a Real-world dataset of Cryptocurrency Addresses with Transaction profileS, serving as a practical training ground for developing and assessing detection methods. Real-CATS comprises 103,203 criminal addresses from real-world reports and 106,196 benign addresses from exchange customers. It satifies the C3R characteristics (Comprehensiveness, Classifiability, Customizability, and Real-world Transferability), which are fundemental for practical detection of cryptocurrency cybercrime. The dataset provides three main functions: 1) effective evaluation of detection methods, 2) support for feature extensions, and 3) a new evaluation scenario for real-world deployment. Real-CATS also offers opportunities to expand cybercrime measurement studies. It is particularly beneficial for researchers without cryptocurrency-related knowledge to engage in this emerging research field. We hope that studies on cryptocurrency cybercrime detection will be promoted by an increasing number of cross-disciplinary researchers drawn to this versatile data platform. All datasets are available at https://github.com/sjdseu/Real-CATS

Paperid: 259, https://arxiv.org/pdf/2501.15529.pdf GitHub

Abstract:
Deep reinforcement learning (DRL) is widely applied to safety-critical decision-making scenarios. However, DRL is vulnerable to backdoor attacks, especially action-level backdoors, which pose significant threats through precise manipulation and flexible activation, risking outcomes like vehicle collisions or drone crashes. The key distinction of action-level backdoors lies in the utilization of the backdoor reward function to associate triggers with target actions. Nevertheless, existing studies typically rely on backdoor reward functions with fixed values or conditional flipping, which lack universality across diverse DRL tasks and backdoor designs, resulting in fluctuations or even failure in practice. This paper proposes the first universal action-level backdoor attack framework, called UNIDOOR, which enables adaptive exploration of backdoor reward functions through performance monitoring, eliminating the reliance on expert knowledge and grid search. We highlight that action tampering serves as a crucial component of action-level backdoor attacks in continuous action scenarios, as it addresses attack failures caused by low-frequency target actions. Extensive evaluations demonstrate that UNIDOOR significantly enhances the attack performance of action-level backdoors, showcasing its universality across diverse attack scenarios, including single/multiple agents, single/multiple backdoors, discrete/continuous action spaces, and sparse/dense reward signals. Furthermore, visualization results encompassing state distribution, neuron activation, and animations demonstrate the stealthiness of UNIDOOR. The source code of UNIDOOR can be found at https://github.com/maoubo/UNIDOOR.

Paperid: 260, https://arxiv.org/pdf/2501.14250.pdf GitHub

Abstract:
Large language models (LLMs) are widely used in real-world applications, raising concerns about their safety and trustworthiness. While red-teaming with jailbreak prompts exposes the vulnerabilities of LLMs, current efforts focus primarily on single-turn attacks, overlooking the multi-turn strategies used by real-world adversaries. Existing multi-turn methods rely on static patterns or predefined logical chains, failing to account for the dynamic strategies during attacks. We propose Siren, a learning-based multi-turn attack framework designed to simulate real-world human jailbreak behaviors. Siren consists of three stages: (1) training set construction utilizing Turn-Level LLM feedback (Turn-MF), (2) post-training attackers with supervised fine-tuning (SFT) and direct preference optimization (DPO), and (3) interactions between the attacking and target LLMs. Experiments demonstrate that Siren achieves an attack success rate (ASR) of 90% with LLaMA-3-8B as the attacker against Gemini-1.5-Pro as the target model, and 70% with Mistral-7B against GPT-4o, significantly outperforming single-turn baselines. Moreover, Siren with a 7B-scale model achieves performance comparable to a multi-turn baseline that leverages GPT-4o as the attacker, while requiring fewer turns and employing decomposition strategies that are better semantically aligned with attack goals. We hope Siren inspires the development of stronger defenses against advanced multi-turn jailbreak attacks under realistic scenarios. Code is available at https://github.com/YiyiyiZhao/siren. Warning: This paper contains potentially harmful text.

Paperid: 261, https://arxiv.org/pdf/2501.12612.pdf GitHub

Abstract:
Text-to-image (T2I) models have rapidly advanced, enabling the generation of high-quality images from text prompts across various domains. However, these models present notable safety concerns, including the risk of generating harmful, biased, or private content. Current research on assessing T2I safety remains in its early stages. While some efforts have been made to evaluate models on specific safety dimensions, many critical risks remain unexplored. To address this gap, we introduce T2ISafety, a safety benchmark that evaluates T2I models across three key domains: toxicity, fairness, and bias. We build a detailed hierarchy of 12 tasks and 44 categories based on these three domains, and meticulously collect 70K corresponding prompts. Based on this taxonomy and prompt set, we build a large-scale T2I dataset with 68K manually annotated images and train an evaluator capable of detecting critical risks that previous work has failed to identify, including risks that even ultra-large proprietary models like GPTs cannot correctly detect. We evaluate 12 prominent diffusion models on T2ISafety and reveal several concerns including persistent issues with racial fairness, a tendency to generate toxic content, and significant variation in privacy protection across the models, even with defense methods like concept erasing. Data and evaluator are released under https://github.com/adwardlee/t2i_safety.

Paperid: 262, https://arxiv.org/pdf/2501.12523.pdf GitHub

Abstract:
Generating unique molecules with biochemically desired properties to serve as viable drug candidates is a difficult task that requires specialized domain expertise. In recent years, diffusion models have shown promising results in accelerating the drug design process through AI-driven molecular generation. However, training these models requires massive amounts of data, which are often isolated in proprietary silos. OpenFL is a federated learning framework that enables privacy-preserving collaborative training across these decentralized data sites. In this work, we present a federated discrete denoising diffusion model that was trained using OpenFL. The federated model achieves comparable performance with a model trained on centralized data when evaluating the uniqueness and validity of the generated molecules. This demonstrates the utility of federated learning in the drug design process. OpenFL is available at: https://github.com/securefederatedai/openfl

Paperid: 263, https://arxiv.org/pdf/2501.09328.pdf GitHub

Abstract:
Developing high-performance deep learning models is resource-intensive, leading model owners to utilize Machine Learning as a Service (MLaaS) platforms instead of publicly releasing their models. However, malicious users may exploit query interfaces to execute model extraction attacks, reconstructing the target model's functionality locally. While prior research has investigated triggerable watermarking techniques for asserting ownership, existing methods face significant challenges: (1) most approaches require additional training, resulting in high overhead and limited flexibility, and (2) they often fail to account for advanced attackers, leaving them vulnerable to adaptive attacks. In this paper, we propose Neural Honeytrace, a robust plug-and-play watermarking framework against model extraction attacks. We first formulate a watermark transmission model from an information-theoretic perspective, providing an interpretable account of the principles and limitations of existing triggerable watermarking. Guided by the model, we further introduce: (1) a similarity-based training-free watermarking method for plug-and-play and flexible watermarking, and (2) a distribution-based multi-step watermark information transmission strategy for robust watermarking. Comprehensive experiments on four datasets demonstrate that Neural Honeytrace outperforms previous methods in efficiency and resisting adaptive attacks. Neural Honeytrace reduces the average number of samples required for a worst-case t-Test-based copyright claim from 193,252 to 1,857 with zero training cost. The code is available at https://github.com/NeurHT/NeurHT.

Paperid: 264, https://arxiv.org/pdf/2501.07047.pdf GitHub

Abstract:
Cloud-based services are making the outsourcing of sensitive client data increasingly common. Although homomorphic encryption (HE) offers strong privacy guarantee, it requires substantially more resources than computing on plaintext, often leading to unacceptably large latencies in getting the results. HE accelerators have emerged to mitigate this latency issue, but with the high cost of ASICs. In this paper we show that HE primitives can be converted to AI operators and accelerated on existing ASIC AI accelerators, like TPUs, which are already widely deployed in the cloud. Adapting such accelerators for HE requires (1) supporting modular multiplication, (2) high-precision arithmetic in software, and (3) efficient mapping on matrix engines. We introduce the CROSS compiler (1) to adopt Barrett reduction to provide modular reduction support using multiplier and adder, (2) Basis Aligned Transformation (BAT) to convert high-precision multiplication as low-precision matrix-vector multiplication, (3) Matrix Aligned Transformation (MAT) to covert vectorized modular operation with reduction into matrix multiplication that can be efficiently processed on 2D spatial matrix engine. Our evaluation of CROSS on a Google TPUv4 demonstrates significant performance improvements, with up to 161x and 5x speedup compared to the previous work on many-core CPUs and V100. The kernel-level codes are open-sourced at https://github.com/google/jaxite/tree/main/jaxite_word.

Paperid: 265, https://arxiv.org/pdf/2501.03800.pdf GitHub

Abstract:
Despite the considerable performance improvements of face recognition algorithms in recent years, the same scientific advances responsible for this progress can also be used to create efficient ways to attack them, posing a threat to their secure deployment. Morphing attack detection (MAD) systems aim to detect a specific type of threat, morphing attacks, at an early stage, preventing them from being considered for verification in critical processes. Foundation models (FM) learn from extensive amounts of unlabelled data, achieving remarkable zero-shot generalization to unseen domains. Although this generalization capacity might be weak when dealing with domain-specific downstream tasks such as MAD, FMs can easily adapt to these settings while retaining the built-in knowledge acquired during pre-training. In this work, we recognize the potential of FMs to perform well in the MAD task when properly adapted to its specificities. To this end, we adapt FM CLIP architectures with LoRA weights while simultaneously training a classification header. The proposed framework, MADation surpasses our alternative FM and transformer-based frameworks and constitutes the first adaption of FMs to the MAD task. MADation presents competitive results with current MAD solutions in the literature and even surpasses them in several evaluation scenarios. To encourage reproducibility and facilitate further research in MAD, we publicly release the implementation of MADation at https://github.com/gurayozgur/MADation

Paperid: 266, https://arxiv.org/pdf/2501.03489.pdf GitHub

Abstract:
The pervasiveness of proprietary language models has raised critical privacy concerns, necessitating advancements in private inference (PI), where computations are performed directly on encrypted data without revealing users' sensitive information. While PI offers a promising solution, its practical deployment is hindered by substantial communication and latency overheads, primarily stemming from nonlinear operations. To address this, we introduce an information-theoretic framework to characterize the role of nonlinearities in decoder-only language models, laying a principled foundation for optimizing transformer-architectures tailored to the demands of PI. By leveraging Shannon's entropy as a quantitative measure, we uncover the previously unexplored dual significance of nonlinearities: beyond ensuring training stability, they are crucial for maintaining attention head diversity. Specifically, we find that their removal triggers two critical failure modes: {\em entropy collapse} in deeper layers that destabilizes training, and {\em entropic overload} in earlier layers that leads to under-utilization of Multi-Head Attention's (MHA) representational capacity. We propose an entropy-guided attention mechanism paired with a novel entropy regularization technique to mitigate entropic overload. Additionally, we explore PI-friendly alternatives to layer normalization for preventing entropy collapse and stabilizing the training of LLMs with reduced-nonlinearities. Our study bridges the gap between information theory and architectural design, establishing entropy dynamics as a principled guide for developing efficient PI architectures. The code and implementation are available at https://github.com/Nandan91/entropy-guided-attention-llm

Paperid: 267, https://arxiv.org/pdf/2501.03279.pdf GitHub

Abstract:
With the growing significance of network security, the classification of encrypted traffic has emerged as an urgent challenge. Traditional byte-based traffic analysis methods are constrained by the rigid granularity of information and fail to fully exploit the diverse correlations between bytes. To address these limitations, this paper introduces MH-Net, a novel approach for classifying network traffic that leverages multi-view heterogeneous traffic graphs to model the intricate relationships between traffic bytes. The essence of MH-Net lies in aggregating varying numbers of traffic bits into multiple types of traffic units, thereby constructing multi-view traffic graphs with diverse information granularities. By accounting for different types of byte correlations, such as header-payload relationships, MH-Net further endows the traffic graph with heterogeneity, significantly enhancing model performance. Notably, we employ contrastive learning in a multi-task manner to strengthen the robustness of the learned traffic unit representations. Experiments conducted on the ISCX and CIC-IoT datasets for both the packet-level and flow-level traffic classification tasks demonstrate that MH-Net achieves the best overall performance compared to dozens of SOTA methods.

Paperid: 268, https://arxiv.org/pdf/2501.03272.pdf GitHub

Abstract:
Supervised fine-tuning has become the predominant method for adapting large pretrained models to downstream tasks. However, recent studies have revealed that these models are vulnerable to backdoor attacks, where even a small number of malicious samples can successfully embed backdoor triggers into the model. While most existing defense methods focus on post-training backdoor defense, efficiently defending against backdoor attacks during training phase remains largely unexplored. To address this gap, we propose a novel defense method called Backdoor Token Unlearning (BTU), which proactively detects and neutralizes trigger tokens during the training stage. Our work is based on two key findings: 1) backdoor learning causes distinctive differences between backdoor token parameters and clean token parameters in word embedding layers, and 2) the success of backdoor attacks heavily depends on backdoor token parameters. The BTU defense leverages these properties to identify aberrant embedding parameters and subsequently removes backdoor behaviors using a fine-grained unlearning technique. Extensive evaluations across three datasets and four types of backdoor attacks demonstrate that BTU effectively defends against these threats while preserving the model's performance on primary tasks. Our code is available at https://github.com/XDJPH/BTU.

Paperid: 269, https://arxiv.org/pdf/2501.02629.pdf GitHub

Abstract:
As large language models (LLMs) are increasingly deployed in diverse applications, including chatbot assistants and code generation, aligning their behavior with safety and ethical standards has become paramount. However, jailbreak attacks, which exploit vulnerabilities to elicit unintended or harmful outputs, threaten LLMs' safety significantly. In this paper, we introduce Layer-AdvPatcher, a novel methodology designed to defend against jailbreak attacks by utilizing an unlearning strategy to patch specific layers within LLMs through self-augmented datasets. Our insight is that certain layer(s), tend to produce affirmative tokens when faced with harmful prompts. By identifying these layers and adversarially exposing them to generate more harmful data, one can understand their inherent and diverse vulnerabilities to attacks. With these exposures, we then "unlearn" these issues, reducing the impact of affirmative tokens and hence minimizing jailbreak risks while keeping the model's responses to safe queries intact. We conduct extensive experiments on two models, four benchmark datasets, and multiple state-of-the-art jailbreak attacks to demonstrate the efficacy of our approach. Results indicate that our framework reduces the harmfulness and attack success rate of jailbreak attacks without compromising utility for benign queries compared to recent defense methods. Our code is publicly available at: https://github.com/oyy2000/LayerAdvPatcher

Paperid: 270, https://arxiv.org/pdf/2501.02029.pdf GitHub

Abstract:
With the integration of an additional modality, large vision-language models (LVLMs) exhibit greater vulnerability to safety risks (e.g., jailbreaking) compared to their language-only predecessors. Although recent studies have devoted considerable effort to the post-hoc alignment of LVLMs, the inner safety mechanisms remain largely unexplored. In this paper, we discover that internal activations of LVLMs during the first token generation can effectively identify malicious prompts across different attacks. This inherent safety perception is governed by sparse attention heads, which we term ``safety heads." Further analysis reveals that these heads act as specialized shields against malicious prompts; ablating them leads to higher attack success rates, while the model's utility remains unaffected. By locating these safety heads and concatenating their activations, we construct a straightforward but powerful malicious prompt detector that integrates seamlessly into the generation process with minimal extra inference overhead. Despite its simple structure of a logistic regression model, the detector surprisingly exhibits strong zero-shot generalization capabilities. Experiments across various prompt-based attacks confirm the effectiveness of leveraging safety heads to protect LVLMs. Code is available at \url{https://github.com/Ziwei-Zheng/SAHs}.

Paperid: 271, https://arxiv.org/pdf/2501.01110.pdf GitHub

Abstract:
Continual Learning (CL) for malware classification tackles the rapidly evolving nature of malware threats and the frequent emergence of new types. Generative Replay (GR)-based CL systems utilize a generative model to produce synthetic versions of past data, which are then combined with new data to retrain the primary model. Traditional machine learning techniques in this domain often struggle with catastrophic forgetting, where a model's performance on old data degrades over time. In this paper, we introduce a GR-based CL system that employs Generative Adversarial Networks (GANs) with feature matching loss to generate high-quality malware samples. Additionally, we implement innovative selection schemes for replay samples based on the model's hidden representations. Our comprehensive evaluation across Windows and Android malware datasets in a class-incremental learning scenario -- where new classes are introduced continuously over multiple tasks -- demonstrates substantial performance improvements over previous methods. For example, our system achieves an average accuracy of 55% on Windows malware samples, significantly outperforming other GR-based models by 28%. This study provides practical insights for advancing GR-based malware classification systems. The implementation is available at \url {https://github.com/MalwareReplayGAN/MalCL}\footnote{The code will be made public upon the presentation of the paper}.

Paperid: 272, https://arxiv.org/pdf/2501.00200.pdf GitHub

Abstract:
Recently, cutting-plane methods such as GCP-CROWN have been explored to enhance neural network verifiers and made significant advances. However, GCP-CROWN currently relies on generic cutting planes (cuts) generated from external mixed integer programming (MIP) solvers. Due to the poor scalability of MIP solvers, large neural networks cannot benefit from these cutting planes. In this paper, we exploit the structure of the neural network verification problem to generate efficient and scalable cutting planes specific for this problem setting. We propose a novel approach, Branch-and-bound Inferred Cuts with COnstraint Strengthening (BICCOS), which leverages the logical relationships of neurons within verified subproblems in the branch-and-bound search tree, and we introduce cuts that preclude these relationships in other subproblems. We develop a mechanism that assigns influence scores to neurons in each path to allow the strengthening of these cuts. Furthermore, we design a multi-tree search technique to identify more cuts, effectively narrowing the search space and accelerating the BaB algorithm. Our results demonstrate that BICCOS can generate hundreds of useful cuts during the branch-and-bound process and consistently increase the number of verifiable instances compared to other state-of-the-art neural network verifiers on a wide range of benchmarks, including large networks that previous cutting plane methods could not scale to. BICCOS is part of the $Î±,Î²$-CROWN verifier, the VNN-COMP 2024 winner. The code is available at http://github.com/Lemutisme/BICCOS .

Paperid: 273, https://arxiv.org/pdf/2502.09624.pdf

Abstract:
By synergistically integrating mobile networks and embodied artificial intelligence (AI), Mobile Embodied AI Networks (MEANETs) represent an advanced paradigm that facilitates autonomous, context-aware, and interactive behaviors within dynamic environments. Nevertheless, the rapid development of MEANETs is accompanied by challenges in trustworthiness and operational efficiency. Fortunately, blockchain technology, with its decentralized and immutable characteristics, offers promising solutions for MEANETs. However, existing block propagation mechanisms suffer from challenges such as low propagation efficiency and weak security for block propagation, which results in delayed transmission of vehicular messages or vulnerability to malicious tampering, potentially causing severe traffic accidents in blockchain-enabled MEANETs. Moreover, current block propagation strategies cannot effectively adapt to real-time changes of dynamic topology in MEANETs. Therefore, in this paper, we propose a graph Resfusion model-based trustworthy block propagation optimization framework for consortium blockchain-enabled MEANETs. Specifically, we propose an innovative trust calculation mechanism based on the trust cloud model, which comprehensively accounts for randomness and fuzziness in the miner trust evaluation. Furthermore, by leveraging the strengths of graph neural networks and diffusion models, we develop a graph Resfusion model to effectively and adaptively generate the optimal block propagation trajectory. Simulation results demonstrate that the proposed model outperforms other routing mechanisms in terms of block propagation efficiency and trustworthiness. Additionally, the results highlight its strong adaptability to dynamic environments, making it particularly suitable for rapidly changing MEANETs.

Paperid: 274, https://arxiv.org/pdf/2501.11074.pdf

Abstract:
Deep reinforcement learning (DRL) has been widely used in many important tasks of communication networks. In order to improve the perception ability of DRL on the network, some studies have combined graph neural networks (GNNs) with DRL, which use the GNNs to extract unstructured features of the network. However, as networks continue to evolve and become increasingly complex, existing GNN-DRL methods still face challenges in terms of scalability and robustness. Moreover, these methods are inadequate for addressing network security issues. From the perspective of security and robustness, this paper explores the solution of combining GNNs with DRL to build a resilient network. This article starts with a brief tutorial of GNNs and DRL, and introduces their existing applications in networks. Furthermore, we introduce the network security methods that can be strengthened by GNN-DRL approaches. Then, we designed a framework based on GNN-DRL to defend against attacks and enhance network resilience. Additionally, we conduct a case study using an encrypted traffic dataset collected from real IoT environments, and the results demonstrated the effectiveness and superiority of our framework. Finally, we highlight key open challenges and opportunities for enhancing network resilience with GNN-DRL.

Paperid: 275, https://arxiv.org/pdf/2501.11068.pdf

Abstract:
Ensuring end-to-end cross-layer communication security in military networks by selecting covert schemes between nodes is a key solution for military communication security. With the development of communication technology, covert communication has expanded from the physical layer to the network and application layers, utilizing methods such as artificial noise, private networks, and semantic coding to transmit secret messages. However, as adversaries continuously eavesdrop on specific communication channels, the accumulation of sufficient data may reveal underlying patterns that influence concealment, and establishing a cross-layer covert communication mechanism emerges as an effective strategy to mitigate these regulatory challenges. In this article, we first survey the communication security solution based on covert communication, specifically targeting three typical scenarios: device-to-device, private network communication, and public network communication, and analyze their application scopes. Furthermore, we propose an end-to-end cross-layer covert communication scheme driven by Generative Artificial Intelligence (GenAI), highlighting challenges and their solutions. Additionally, a case study is conducted using diffusion reinforcement learning to sovle cloud edge internet of things cross-layer secure communication.

Paperid: 276, https://arxiv.org/pdf/2504.20493.pdf

Abstract:
While reasoning large language models (LLMs) demonstrate remarkable performance across various tasks, they also contain notable security vulnerabilities. Recent research has uncovered a "thinking-stopped" vulnerability in DeepSeek-R1, where model-generated reasoning tokens can forcibly interrupt the inference process, resulting in empty responses that compromise LLM-integrated applications. However, existing methods triggering this vulnerability require complex mathematical word problems with long prompts--even exceeding 5,000 tokens. To reduce the token cost and formally define this vulnerability, we propose a novel prompt injection attack named "Reasoning Interruption Attack", based on adaptive token compression. We demonstrate that simple standalone arithmetic tasks can effectively trigger this vulnerability, and the prompts based on such tasks exhibit simpler logical structures than mathematical word problems. We develop a systematic approach to efficiently collect attack prompts and an adaptive token compression framework that utilizes LLMs to automatically compress these prompts. Experiments show our compression framework significantly reduces prompt length while maintaining effective attack capabilities. We further investigate the attack's performance via output prefix and analyze the underlying causes of the vulnerability, providing valuable insights for improving security in reasoning LLMs.

Paperid: 277, https://arxiv.org/pdf/2504.01048.pdf

Abstract:
Visual Language Models (VLMs) have become foundational models for document understanding tasks, widely used in the processing of complex multimodal documents across domains such as finance, law, and academia. However, documents often contain noise-like information, such as watermarks, which inevitably leads us to inquire: \emph{Do watermarks degrade the performance of VLMs in document understanding?} To address this, we propose a novel evaluation framework to investigate the effect of visible watermarks on VLMs performance. We takes into account various factors, including different types of document data, the positions of watermarks within documents and variations in watermark content. Our experimental results reveal that VLMs performance can be significantly compromised by watermarks, with performance drop rates reaching up to 36\%. We discover that \emph{scattered} watermarks cause stronger interference than centralized ones, and that \emph{semantic contents} in watermarks creates greater disruption than simple visual occlusion. Through attention mechanism analysis and embedding similarity examination, we find that the performance drops are mainly attributed to that watermarks 1) force widespread attention redistribution, and 2) alter semantic representation in the embedding space. Our research not only highlights significant challenges in deploying VLMs for document understanding, but also provides insights towards developing robust inference mechanisms on watermarked documents.

Paperid: 278, https://arxiv.org/pdf/2503.21315.pdf

Abstract:
Retrieval-augmented generation (RAG) systems enhance large language models by incorporating external knowledge, addressing issues like outdated internal knowledge and hallucination. However, their reliance on external knowledge bases makes them vulnerable to corpus poisoning attacks, where adversarial passages can be injected to manipulate retrieval results. Existing methods for crafting such passages, such as random token replacement or training inversion models, are often slow and computationally expensive, requiring either access to retriever's gradients or large computational resources. To address these limitations, we propose Dynamic Importance-Guided Genetic Algorithm (DIGA), an efficient black-box method that leverages two key properties of retrievers: insensitivity to token order and bias towards influential tokens. By focusing on these characteristics, DIGA dynamically adjusts its genetic operations to generate effective adversarial passages with significantly reduced time and memory usage. Our experimental evaluation shows that DIGA achieves superior efficiency and scalability compared to existing methods, while maintaining comparable or better attack success rates across multiple datasets.

Paperid: 279, https://arxiv.org/pdf/2503.19134.pdf

Abstract:
While safety mechanisms have significantly progressed in filtering harmful text inputs, MLLMs remain vulnerable to multimodal jailbreaks that exploit their cross-modal reasoning capabilities. We present MIRAGE, a novel multimodal jailbreak framework that exploits narrative-driven context and role immersion to circumvent safety mechanisms in Multimodal Large Language Models (MLLMs). By systematically decomposing the toxic query into environment, role, and action triplets, MIRAGE constructs a multi-turn visual storytelling sequence of images and text using Stable Diffusion, guiding the target model through an engaging detective narrative. This process progressively lowers the model's defences and subtly guides its reasoning through structured contextual cues, ultimately eliciting harmful responses. In extensive experiments on the selected datasets with six mainstream MLLMs, MIRAGE achieves state-of-the-art performance, improving attack success rates by up to 17.5% over the best baselines. Moreover, we demonstrate that role immersion and structured semantic reconstruction can activate inherent model biases, facilitating the model's spontaneous violation of ethical safeguards. These results highlight critical weaknesses in current multimodal safety mechanisms and underscore the urgent need for more robust defences against cross-modal threats.

Paperid: 280, https://arxiv.org/pdf/2503.11750.pdf

Abstract:
In the realm of large vision-language models (LVLMs), adversarial jailbreak attacks serve as a red-teaming approach to identify safety vulnerabilities of these models and their associated defense mechanisms. However, we identify a critical limitation: not every adversarial optimization step leads to a positive outcome, and indiscriminately accepting optimization results at each step may reduce the overall attack success rate. To address this challenge, we introduce HKVE (Hierarchical Key-Value Equalization), an innovative jailbreaking framework that selectively accepts gradient optimization results based on the distribution of attention scores across different layers, ensuring that every optimization step positively contributes to the attack. Extensive experiments demonstrate HKVE's significant effectiveness, achieving attack success rates of 75.08% on MiniGPT4, 85.84% on LLaVA and 81.00% on Qwen-VL, substantially outperforming existing methods by margins of 20.43\%, 21.01\% and 26.43\% respectively. Furthermore, making every step effective not only leads to an increase in attack success rate but also allows for a reduction in the number of iterations, thereby lowering computational costs. Warning: This paper contains potentially harmful example data.

Paperid: 281, https://arxiv.org/pdf/2503.11619.pdf

Abstract:
Deploying large vision-language models (LVLMs) introduces a unique vulnerability: susceptibility to malicious attacks via visual inputs. However, existing defense methods suffer from two key limitations: (1) They solely focus on textual defenses, fail to directly address threats in the visual domain where attacks originate, and (2) the additional processing steps often incur significant computational overhead or compromise model performance on benign tasks. Building on these insights, we propose ESIII (Embedding Security Instructions Into Images), a novel methodology for transforming the visual space from a source of vulnerability into an active defense mechanism. Initially, we embed security instructions into defensive images through gradient-based optimization, obtaining security instructions in the visual dimension. Subsequently, we integrate security instructions from visual and textual dimensions with the input query. The collaboration between security instructions from different dimensions ensures comprehensive security protection. Extensive experiments demonstrate that our approach effectively fortifies the robustness of LVLMs against such attacks while preserving their performance on standard benign tasks and incurring an imperceptible increase in time costs.

Paperid: 282, https://arxiv.org/pdf/2506.16023.pdf

Abstract:
Blockchain-based steganography enables data hiding via encoding the covert data into a specific blockchain transaction field. However, previous works focus on the specific field-embedding methods while lacking a consideration on required field-generation embedding. In this paper, we propose a generic blockchain-based steganography framework (GBSF). The sender generates the required fields such as amount and fees, where the additional covert data is embedded to enhance the channel capacity. Based on GBSF, we design a reversible generative adversarial network (R-GAN) that utilizes the generative adversarial network with a reversible generator to generate the required fields and encode additional covert data into the input noise of the reversible generator. We then explore the performance flaw of R-GAN. To further improve the performance, we propose R-GAN with Counter-intuitive data preprocessing and Custom activation functions, namely CCR-GAN. The counter-intuitive data preprocessing (CIDP) mechanism is used to reduce decoding errors in covert data, while it incurs gradient explosion for model convergence. The custom activation function named ClipSigmoid is devised to overcome the problem. Theoretical justification for CIDP and ClipSigmoid is also provided. We also develop a mechanism named T2C, which balances capacity and concealment. We conduct experiments using the transaction amount of the Bitcoin mainnet as the required field to verify the feasibility. We then apply the proposed schemes to other transaction fields and blockchains to demonstrate the scalability. Finally, we evaluate capacity and concealment for various blockchains and transaction fields and explore the trade-off between capacity and concealment. The results demonstrate that R-GAN and CCR-GAN are able to enhance the channel capacity effectively and outperform state-of-the-art works.

Paperid: 283, https://arxiv.org/pdf/2504.09153.pdf

Abstract:
The Low-Altitude Economy Networking (LAENet) is emerging as a transformative paradigm that enables an integrated and sophisticated communication infrastructure to support aerial vehicles in carrying out a wide range of economic activities within low-altitude airspace. However, the physical layer communications in the LAENet face growing security threats due to inherent characteristics of aerial communication environments, such as signal broadcast nature and channel openness. These challenges highlight the urgent need for safeguarding communication confidentiality, availability, and integrity. In view of the above, this survey comprehensively reviews existing secure countermeasures for physical layer communication in the LAENet. We explore core methods focusing on anti-eavesdropping and authentication for ensuring communication confidentiality. Subsequently, availability-enhancing techniques are thoroughly discussed for anti-jamming and spoofing defense. Then, we review approaches for safeguarding integrity through anomaly detection and injection protection. Furthermore, we discuss future research directions, emphasizing energy-efficient physical layer security, multi-drone collaboration for secure communication, AI-driven security defense strategy, space-air-ground integrated security architecture, and 6G-enabled secure UAV communication. This survey may provide valuable references and new insights for researchers in the field of secure physical layer communication for the LAENet.

Paperid: 284, https://arxiv.org/pdf/2501.15468.pdf

Abstract:
Low Earth Orbit (LEO) satellites can be used to assist maritime wireless communications for data transmission across wide-ranging areas. However, extensive coverage of LEO satellites, combined with openness of channels, can cause the communication process to suffer from security risks. This paper presents a low-altitude friendly-jamming LEO satellite-maritime communication system enabled by a unmanned aerial vehicle (UAV) to ensure data security at the physical layer. Since such a system requires trade-off policies that balance the secrecy rate and energy consumption of the UAV to meet evolving scenario demands, we formulate a secure satellite-maritime communication multi-objective optimization problem (SSMCMOP). In order to solve the dynamic and long-term optimization problem, we reformulate it into a Markov decision process. We then propose a transformer-enhanced soft actor critic (TransSAC) algorithm, which is a generative artificial intelligence-enable deep reinforcement learning approach to solve the reformulated problem, so that capturing global dependencies and diversely exploring weights. Simulation results demonstrate that the TransSAC outperforms various baselines, and achieves an optimal secrecy rate while effectively minimizing the energy consumption of the UAV. Moreover, the results find more suitable constraint values for the system.

Paperid: 285, https://arxiv.org/pdf/2503.20804.pdf

Abstract:
Assessing the safety of autonomous driving policy is of great importance, and reinforcement learning (RL) has emerged as a powerful method for discovering critical vulnerabilities in driving policies. However, existing RL-based approaches often struggle to identify vulnerabilities that are both effective-meaning the autonomous vehicle is genuinely responsible for the accidents-and diverse-meaning they span various failure types. To address these challenges, we propose AED, a framework that uses large language models (LLMs) to automatically discover effective and diverse vulnerabilities in autonomous driving policies. We first utilize an LLM to automatically design reward functions for RL training. Then we let the LLM consider a diverse set of accident types and train adversarial policies for different accident types in parallel. Finally, we use preference-based learning to filter ineffective accidents and enhance the effectiveness of each vulnerability. Experiments across multiple simulated traffic scenarios and tested policies show that AED uncovers a broader range of vulnerabilities and achieves higher attack success rates compared with expert-designed rewards, thereby reducing the need for manual reward engineering and improving the diversity and effectiveness of vulnerability discovery.

Paperid: 286, https://arxiv.org/pdf/2503.10058.pdf

Abstract:
Money laundering is a financial crime that obscures the origin of illicit funds, necessitating the development and enforcement of anti-money laundering (AML) policies by governments and organizations. The proliferation of mobile payment platforms and smart IoT devices has significantly complicated AML investigations. As payment networks become more interconnected, there is an increasing need for efficient real-time detection to process large volumes of transaction data on heterogeneous payment systems by different operators such as digital currencies, cryptocurrencies and account-based payments. Most of these mobile payment networks are supported by connected devices, many of which are considered loT devices in the FinTech space that constantly generate data. Furthermore, the growing complexity and unpredictability of transaction patterns across these networks contribute to a higher incidence of false positives. While machine learning solutions have the potential to enhance detection efficiency, their application in AML faces unique challenges, such as addressing privacy concerns tied to sensitive financial data and managing the real-world constraint of limited data availability due to data regulations. Existing surveys in the AML literature broadly review machine learning approaches for money laundering detection, but they often lack an in-depth exploration of advanced deep learning techniques - an emerging field with significant potential. To address this gap, this paper conducts a comprehensive review of deep learning solutions and the challenges associated with their use in AML. Additionally, we propose a novel framework that applies the least-privilege principle by integrating machine learning techniques, codifying AML red flags, and employing account profiling to provide context for predictions and enable effective fraud detection under limited data availability....

Paperid: 287, https://arxiv.org/pdf/2504.07135.pdf

Abstract:
In the era of rapidly evolving large language models (LLMs), state-of-the-art rumor detection systems, particularly those based on Message Propagation Trees (MPTs), which represent a conversation tree with the post as its root and the replies as its descendants, are facing increasing threats from adversarial attacks that leverage LLMs to generate and inject malicious messages. Existing methods are based on the assumption that different nodes exhibit varying degrees of influence on predictions. They define nodes with high predictive influence as important nodes and target them for attacks. If the model treats nodes' predictive influence more uniformly, attackers will find it harder to target high predictive influence nodes. In this paper, we propose Similarizing the predictive Influence of Nodes with Contrastive Learning (SINCon), a defense mechanism that encourages the model to learn graph representations where nodes with varying importance have a more uniform influence on predictions. Extensive experiments on the Twitter and Weibo datasets demonstrate that SINCon not only preserves high classification accuracy on clean data but also significantly enhances resistance against LLM-driven message injection attacks.

Paperid: 288, https://arxiv.org/pdf/2504.04809.pdf

Abstract:
Tool learning serves as a powerful auxiliary mechanism that extends the capabilities of large language models (LLMs), enabling them to tackle complex tasks requiring real-time relevance or high precision operations. Behind its powerful capabilities lie some potential security issues. However, previous work has primarily focused on how to make the output of the invoked tools incorrect or malicious, with little attention given to the manipulation of tool selection. To fill this gap, we introduce, for the first time, a black-box text-based attack that can significantly increase the probability of the target tool being selected in this paper. We propose a two-level text perturbation attack witha coarse-to-fine granularity, attacking the text at both the word level and the character level. We conduct comprehensive experiments that demonstrate the attacker only needs to make some perturbations to the tool's textual information to significantly increase the possibility of the target tool being selected and ranked higher among the candidate tools. Our research reveals the vulnerability of the tool selection process and paves the way for future research on protecting this process.

Paperid: 289, https://arxiv.org/pdf/2504.21803.pdf

Abstract:
Binary code analysis plays a pivotal role in the field of software security and is widely used in tasks such as software maintenance, malware detection, software vulnerability discovery, patch analysis, etc. However, unlike source code, reverse engineers face significant challenges in understanding binary code due to the lack of intuitive semantic information. Although traditional reverse tools can convert binary code into C-like pseudo code, the lack of code comments and symbolic information such as function names still makes code understanding difficult. In recent years, two groups of techniques have shown promising prospects: (1) Deep learning-based techniques have demonstrated competitive results in tasks related to binary code understanding, furthermore, (2) Large Language Models (LLMs) have been extensively pre-trained at the source-code level for tasks such as code understanding and generation. This has left participants wondering about the capabilities of LLMs in binary code understanding. To this end, this work proposes a benchmark to evaluate the effectiveness of LLMs in real-world reverse engineering scenarios, which covers two key binary code understanding tasks, i.e., function name recovery and binary code summarization. To more comprehensively evaluate, we include binaries with multiple target architectures as well as different optimization options. We gain valuable insights into the capabilities and limitations through extensive empirical studies of popular LLMs using our benchmark. Our evaluations reveal that existing LLMs can understand binary code to a certain extent, thereby improving the efficiency of binary code analysis. Our results highlight the great potential of the LLMs in advancing the field of binary code understanding, and provide new directions for binary code analysis techniques.

Paperid: 290, https://arxiv.org/pdf/2504.19454.pdf

Abstract:
The technique of hiding secret messages within seemingly harmless covertext to evade examination by censors with rigorous security proofs is known as provably secure steganography (PSS). PSS evolves from symmetric key steganography to public-key steganography, functioning without the requirement of a pre-shared key and enabling the extension to multi-party covert communication and identity verification mechanisms. Recently, a public-key steganography method based on elliptic curves was proposed, which uses point compression to eliminate the algebraic structure of curve points. However, this method has strict requirements on the curve parameters and is only available on half of the points. To overcome these limitations, this paper proposes a more general elliptic curve public key steganography method based on admissible encoding. By applying the tensor square function to the known well-distributed encoding, we construct admissible encoding, which can create the pseudo-random public-key encryption function. The theoretical analysis and experimental results show that the proposed provable secure public-key steganography method can be deployed on all types of curves and utilize all points on the curve.

Paperid: 291, https://arxiv.org/pdf/2504.15139.pdf

Abstract:
Minimum distortion steganography is currently the mainstream method for modification-based steganography. A key issue in this method is how to define steganographic distortion. With the rapid development of deep learning technology, the definition of distortion has evolved from manual design to deep learning design. Concurrently, rapid advancements in image generation have made generated images viable as cover media. However, existing distortion design methods based on machine learning do not fully leverage the advantages of generated cover media, resulting in suboptimal security performance. To address this issue, we propose GIFDL (Generated Image Fluctuation Distortion Learning), a steganographic distortion learning method based on the fluctuations in generated images. Inspired by the idea of natural steganography, we take a series of highly similar fluctuation images as the input to the steganographic distortion generator and introduce a new GAN training strategy to disguise stego images as fluctuation images. Experimental results demonstrate that GIFDL, compared with state-of-the-art GAN-based distortion learning methods, exhibits superior resistance to steganalysis, increasing the detection error rates by an average of 3.30% across three steganalyzers.

Paperid: 292, https://arxiv.org/pdf/2504.15026.pdf

Abstract:
Ethical concerns surrounding copyright protection and inappropriate content generation pose challenges for the practical implementation of diffusion models. One effective solution involves watermarking the generated images. Existing methods primarily focus on ensuring that watermark embedding does not degrade the model performance. However, they often overlook critical challenges in real-world deployment scenarios, such as the complexity of watermark key management, user-defined generation parameters, and the difficulty of verification by arbitrary third parties. To address this issue, we propose Gaussian Shading++, a diffusion model watermarking method tailored for real-world deployment. We propose a double-channel design that leverages pseudorandom error-correcting codes to encode the random seed required for watermark pseudorandomization, achieving performance-lossless watermarking under a fixed watermark key and overcoming key management challenges. Additionally, we model the distortions introduced during generation and inversion as an additive white Gaussian noise channel and employ a novel soft decision decoding strategy during extraction, ensuring strong robustness even when generation parameters vary. To enable third-party verification, we incorporate public key signatures, which provide a certain level of resistance against forgery attacks even when model inversion capabilities are fully disclosed. Extensive experiments demonstrate that Gaussian Shading++ not only maintains performance losslessness but also outperforms existing methods in terms of robustness, making it a more practical solution for real-world deployment.

Paperid: 293, https://arxiv.org/pdf/2503.07243.pdf

Abstract:
Type recovery is a crucial step in binary code analysis, holding significant importance for reverse engineering and various security applications. Existing works typically simply target type identifiers within binary code and achieve type recovery by analyzing variable characteristics within functions. However, we find that the types in real-world binary programs are more complex and often follow specific distribution patterns. In this paper, to gain a profound understanding of the variable type recovery problem in binary code, we first conduct a comprehensive empirical study. We utilize the TYDA dataset, which includes 163,643 binary programs across four architectures and four compiler optimization options, fully reflecting the complexity and diversity of real-world programs. We carefully study the unique patterns that characterize types and variables in binary code, and also investigate the impact of compiler optimizations on them, yielding many valuable insights. Based on our empirical findings, we propose ByteTR, a framework for recovering variable types in binary code. We decouple the target type set to address the issue of unbalanced type distribution and perform static program analysis to tackle the impact of compiler optimizations on variable storage. In light of the ubiquity of variable propagation across functions observed in our study, ByteTR conducts inter-procedural analysis to trace variable propagation and employs a gated graph neural network to capture long-range data flow dependencies for variable type recovery. We conduct extensive experiments to evaluate the performance of ByteTR. The results demonstrate that ByteTR leads state-of-the-art works in both effectiveness and efficiency. Moreover, in real CTF challenge case, the pseudo code optimized by ByteTR significantly improves readability, surpassing leading tools IDA and Ghidra.

Paperid: 294, https://arxiv.org/pdf/2503.00957.pdf

Abstract:
As speech translation (ST) systems become increasingly prevalent, understanding their vulnerabilities is crucial for ensuring robust and reliable communication. However, limited work has explored this issue in depth. This paper explores methods of compromising these systems through imperceptible audio manipulations. Specifically, we present two innovative approaches: (1) the injection of perturbation into source audio, and (2) the generation of adversarial music designed to guide targeted translation, while also conducting more practical over-the-air attacks in the physical world. Our experiments reveal that carefully crafted audio perturbations can mislead translation models to produce targeted, harmful outputs, while adversarial music achieve this goal more covertly, exploiting the natural imperceptibility of music. These attacks prove effective across multiple languages and translation models, highlighting a systemic vulnerability in current ST architectures. The implications of this research extend beyond immediate security concerns, shedding light on the interpretability and robustness of neural speech processing systems. Our findings underscore the need for advanced defense mechanisms and more resilient architectures in the realm of audio systems. More details and samples can be found at https://adv-st.github.io.

Paperid: 295, https://arxiv.org/pdf/2501.09431.pdf

Abstract:
While large language models (LLMs) present significant potential for supporting numerous real-world applications and delivering positive social impacts, they still face significant challenges in terms of the inherent risk of privacy leakage, hallucinated outputs, and value misalignment, and can be maliciously used for generating toxic content and unethical purposes after been jailbroken. Therefore, in this survey, we present a comprehensive review of recent advancements aimed at mitigating these issues, organized across the four phases of LLM development and usage: data collecting and pre-training, fine-tuning and alignment, prompting and reasoning, and post-processing and auditing. We elaborate on the recent advances for enhancing the performance of LLMs in terms of privacy protection, hallucination reduction, value alignment, toxicity elimination, and jailbreak defenses. In contrast to previous surveys that focus on a single dimension of responsible LLMs, this survey presents a unified framework that encompasses these diverse dimensions, providing a comprehensive view of enhancing LLMs to better serve real-world applications.

Paperid: 296, https://arxiv.org/pdf/2502.15830.pdf

Abstract:
Neural code models (NCMs) have demonstrated extraordinary capabilities in code intelligence tasks. Meanwhile, the security of NCMs and NCMs-based systems has garnered increasing attention. In particular, NCMs are often trained on large-scale data from potentially untrustworthy sources, providing attackers with the opportunity to manipulate them by inserting crafted samples into the data. This type of attack is called a code poisoning attack (also known as a backdoor attack). It allows attackers to implant backdoors in NCMs and thus control model behavior, which poses a significant security threat. However, there is still a lack of effective techniques for detecting various complex code poisoning attacks. In this paper, we propose an innovative and lightweight technique for code poisoning detection named KillBadCode. KillBadCode is designed based on our insight that code poisoning disrupts the naturalness of code. Specifically, KillBadCode first builds a code language model (CodeLM) on a lightweight $n$-gram language model. Then, given poisoned data, KillBadCode utilizes CodeLM to identify those tokens in (poisoned) code snippets that will make the code snippets more natural after being deleted as trigger tokens. Considering that the removal of some normal tokens in a single sample might also enhance code naturalness, leading to a high false positive rate (FPR), we aggregate the cumulative improvement of each token across all samples. Finally, KillBadCode purifies the poisoned data by removing all poisoned samples containing the identified trigger tokens. The experimental results on two code poisoning attacks and four code intelligence tasks demonstrate that KillBadCode significantly outperforms four baselines. More importantly, KillBadCode is very efficient, with a minimum time consumption of only 5 minutes, and is 25 times faster than the best baseline on average.

Paperid: 297, https://arxiv.org/pdf/2501.01830.pdf

Abstract:
Automated red-teaming has become a crucial approach for uncovering vulnerabilities in large language models (LLMs). However, most existing methods focus on isolated safety flaws, limiting their ability to adapt to dynamic defenses and uncover complex vulnerabilities efficiently. To address this challenge, we propose Auto-RT, a reinforcement learning framework that automatically explores and optimizes complex attack strategies to effectively uncover security vulnerabilities through malicious queries. Specifically, we introduce two key mechanisms to reduce exploration complexity and improve strategy optimization: 1) Early-terminated Exploration, which accelerate exploration by focusing on high-potential attack strategies; and 2) Progressive Reward Tracking algorithm with intermediate downgrade models, which dynamically refine the search trajectory toward successful vulnerability exploitation. Extensive experiments across diverse LLMs demonstrate that, by significantly improving exploration efficiency and automatically optimizing attack strategies, Auto-RT detects a boarder range of vulnerabilities, achieving a faster detection speed and 16.63\% higher success rates compared to existing methods.

Paperid: 298, https://arxiv.org/pdf/2505.23793.pdf

Abstract:
Despite their remarkable achievements and widespread adoption, Multimodal Large Language Models (MLLMs) have revealed significant security vulnerabilities, highlighting the urgent need for robust safety evaluation benchmarks. Existing MLLM safety benchmarks, however, fall short in terms of data quality and coverge, and modal risk combinations, resulting in inflated and contradictory evaluation results, which hinders the discovery and governance of security concerns. Besides, we argue that vulnerabilities to harmful queries and oversensitivity to harmless ones should be considered simultaneously in MLLMs safety evaluation, whereas these were previously considered separately. In this paper, to address these shortcomings, we introduce Unified Safety Benchmarks (USB), which is one of the most comprehensive evaluation benchmarks in MLLM safety. Our benchmark features high-quality queries, extensive risk categories, comprehensive modal combinations, and encompasses both vulnerability and oversensitivity evaluations. From the perspective of two key dimensions: risk categories and modality combinations, we demonstrate that the available benchmarks -- even the union of the vast majority of them -- are far from being truly comprehensive. To bridge this gap, we design a sophisticated data synthesis pipeline that generates extensive, high-quality complementary data addressing previously unexplored aspects. By combining open-source datasets with our synthetic data, our benchmark provides 4 distinct modality combinations for each of the 61 risk sub-categories, covering both English and Chinese across both vulnerability and oversensitivity dimensions.

Paperid: 299, https://arxiv.org/pdf/2504.09466.pdf

Abstract:
Despite extensive efforts in safety alignment, large language models (LLMs) remain vulnerable to jailbreak attacks. Activation steering offers a training-free defense method but relies on fixed steering coefficients, resulting in suboptimal protection and increased false rejections of benign inputs. To address this, we propose AdaSteer, an adaptive activation steering method that dynamically adjusts model behavior based on input characteristics. We identify two key properties: Rejection Law (R-Law), which shows that stronger steering is needed for jailbreak inputs opposing the rejection direction, and Harmfulness Law (H-Law), which differentiates adversarial and benign inputs. AdaSteer steers input representations along both the Rejection Direction (RD) and Harmfulness Direction (HD), with adaptive coefficients learned via logistic regression, ensuring robust jailbreak defense while preserving benign input handling. Experiments on LLaMA-3.1, Gemma-2, and Qwen2.5 show that AdaSteer outperforms baseline methods across multiple jailbreak attacks with minimal impact on utility. Our results highlight the potential of interpretable model internals for real-time, flexible safety enforcement in LLMs.

Paperid: 300, https://arxiv.org/pdf/2506.02546.pdf

Abstract:
Large Language Model-based Multi-Agent Systems (LLM-MAS) have demonstrated strong capabilities in solving complex tasks but remain vulnerable when agents receive unreliable messages. This vulnerability stems from a fundamental gap: LLM agents treat all incoming messages equally without evaluating their trustworthiness. While some existing studies approach the trustworthiness, they focus on a single type of harmfulness rather than analyze it in a holistic approach from multiple trustworthiness perspectives. In this work, we propose Attention Trust Score (A-Trust), a lightweight, attention-based method for evaluating message trustworthiness. Inspired by human communication literature[1], through systematically analyzing attention behaviors across six orthogonal trust dimensions, we find that certain attention heads in the LLM specialize in detecting specific types of violations. Leveraging these insights, A-Trust directly infers trustworthiness from internal attention patterns without requiring external prompts or verifiers. Building upon A-Trust, we develop a principled and efficient trust management system (TMS) for LLM-MAS, enabling both message-level and agent-level trust assessment. Experiments across diverse multi-agent settings and tasks demonstrate that applying our TMS significantly enhances robustness against malicious inputs.

Paperid: 301, https://arxiv.org/pdf/2506.01245.pdf

Abstract:
This paper argues that a comprehensive vulnerability analysis is essential for building trustworthy Large Language Model-based Multi-Agent Systems (LLM-MAS). These systems, which consist of multiple LLM-powered agents working collaboratively, are increasingly deployed in high-stakes applications but face novel security threats due to their complex structures. While single-agent vulnerabilities are well-studied, LLM-MAS introduces unique attack surfaces through inter-agent communication, trust relationships, and tool integration that remain significantly underexplored. We present a systematic framework for vulnerability analysis of LLM-MAS that unifies diverse research. For each type of vulnerability, we define formal threat models grounded in practical attacker capabilities and illustrate them using real-world LLM-MAS applications. This formulation enables rigorous quantification of vulnerability across different architectures and provides a foundation for designing meaningful evaluation benchmarks. Our analysis reveals that LLM-MAS faces elevated risk due to compositional effects -- vulnerabilities in individual components can cascade through agent communication, creating threat models not present in single-agent systems. We conclude by identifying critical open challenges: (1) developing benchmarks specifically tailored to LLM-MAS vulnerability assessment, (2) considering new potential attacks specific to multi-agent architectures, and (3) implementing trust management systems that can enforce security in LLM-MAS. This research provides essential groundwork for future efforts to enhance LLM-MAS trustworthiness as these systems continue their expansion into critical applications.

Paperid: 302, https://arxiv.org/pdf/2506.00359.pdf

Abstract:
Although Large Language Models (LLMs) have demonstrated impressive capabilities across a wide range of tasks, growing concerns have emerged over the misuse of sensitive, copyrighted, or harmful data during training. To address these concerns, unlearning techniques have been developed to remove the influence of specific data without retraining from scratch. However, this paper reveals a critical vulnerability in fine-tuning-based unlearning: a malicious user can craft a manipulated forgetting request that stealthily degrades the model's utility for benign users. We demonstrate this risk through a red-teaming Stealthy Attack (SA), which is inspired by two key limitations of existing unlearning (the inability to constrain the scope of unlearning effect and the failure to distinguish benign tokens from unlearning signals). Prior work has shown that unlearned models tend to memorize forgetting data as unlearning signals, and respond with hallucinations or feigned ignorance when unlearning signals appear in the input. By subtly increasing the presence of common benign tokens in the forgetting data, SA enhances the connection between benign tokens and unlearning signals. As a result, when normal users include such tokens in their prompts, the model exhibits unlearning behaviors, leading to unintended utility degradation. To address this vulnerability, we propose Scope-aware Unlearning (SU), a lightweight enhancement that introduces a scope term into the unlearning objective, encouraging the model to localize the forgetting effect. Our method requires no additional data processing, integrates seamlessly with existing fine-tuning frameworks, and significantly improves robustness against SA. Extensive experiments validate the effectiveness of both SA and SU.

Paperid: 303, https://arxiv.org/pdf/2505.13957.pdf

Abstract:
Multimodal Retrieval-Augmented Generation (MRAG) systems enhance LMMs by integrating external multimodal databases, but introduce unexplored privacy vulnerabilities. While text-based RAG privacy risks have been studied, multimodal data presents unique challenges. We provide the first systematic analysis of MRAG privacy vulnerabilities across vision-language and speech-language modalities. Using a novel compositional structured prompt attack in a black-box setting, we demonstrate how attackers can extract private information by manipulating queries. Our experiments reveal that LMMs can both directly generate outputs resembling retrieved content and produce descriptions that indirectly expose sensitive information, highlighting the urgent need for robust privacy-preserving MRAG techniques.

Paperid: 304, https://arxiv.org/pdf/2504.15585.pdf

Authors:Kun Wang, Guibin Zhang, Zhenhong Zhou, Jiahao Wu, Miao Yu, Shiqian Zhao, Chenlong Yin, Jinhu Fu, Yibo Yan, Hanjun Luo, Liang Lin, Zhihao Xu, Haolang Lu, Xinye Cao, Xinyun Zhou, Weifei Jin, Fanci Meng, Shicheng Xu, Junyuan Mao, Yu Wang, Hao Wu, Minghe Wang, Fan Zhang, Junfeng Fang, Wenjie Qu, Yue Liu, Chengwei Liu, Yifan Zhang, Qiankun Li, Chongye Guo, Yalan Qin, Zhaoxin Fan, Kai Wang, Yi Ding, Donghai Hong, Jiaming Ji, Yingxin Lai, Zitong Yu, Xinfeng Li, Yifan Jiang, Yanhui Li, Xinyu Deng, Junlin Wu, Dongxia Wang, Yihao Huang, Yufei Guo, Jen-tse Huang, Qiufeng Wang, Xiaolong Jin, Wenxuan Wang, Dongrui Liu, Yanwei Yue, Wenke Huang, Guancheng Wan, Heng Chang, Tianlin Li, Yi Yu, Chenghao Li, Jiawei Li, Lei Bai, Jie Zhang, Qing Guo, Jingyi Wang, Tianlong Chen, Joey Tianyi Zhou, Xiaojun Jia, Weisong Sun, Cong Wu, Jing Chen, Xuming Hu, Yiming Li, Xiao Wang, Ningyu Zhang, Luu Anh Tuan, Guowen Xu, Jiaheng Zhang, Tianwei Zhang, Xingjun Ma, Jindong Gu, Liang Pang, Xiang Wang, Bo An, Jun Sun, Mohit Bansal, Shirui Pan, Lingjuan Lyu, Yuval Elovici, Bhavya Kailkhura, Yaodong Yang, Hongwei Li, Wenyuan Xu, Yizhou Sun, Wei Wang, Qing Li, Ke Tang, Yu-Gang Jiang, Felix Juefei-Xu, Hui Xiong, Xiaofeng Wang, Dacheng Tao, Philip S. Yu, Qingsong Wen, Yang Liu

Abstract:
The remarkable success of Large Language Models (LLMs) has illuminated a promising pathway toward achieving Artificial General Intelligence for both academic and industrial communities, owing to their unprecedented performance across various applications. As LLMs continue to gain prominence in both research and commercial domains, their security and safety implications have become a growing concern, not only for researchers and corporations but also for every nation. Currently, existing surveys on LLM safety primarily focus on specific stages of the LLM lifecycle, e.g., deployment phase or fine-tuning phase, lacking a comprehensive understanding of the entire "lifechain" of LLMs. To address this gap, this paper introduces, for the first time, the concept of "full-stack" safety to systematically consider safety issues throughout the entire process of LLM training, deployment, and eventual commercialization. Compared to the off-the-shelf LLM safety surveys, our work demonstrates several distinctive advantages: (I) Comprehensive Perspective. We define the complete LLM lifecycle as encompassing data preparation, pre-training, post-training, deployment and final commercialization. To our knowledge, this represents the first safety survey to encompass the entire lifecycle of LLMs. (II) Extensive Literature Support. Our research is grounded in an exhaustive review of over 800+ papers, ensuring comprehensive coverage and systematic organization of security issues within a more holistic understanding. (III) Unique Insights. Through systematic literature analysis, we have developed reliable roadmaps and perspectives for each chapter. Our work identifies promising research directions, including safety in data generation, alignment techniques, model editing, and LLM-based agent systems. These insights provide valuable guidance for researchers pursuing future work in this field.

Paperid: 305, https://arxiv.org/pdf/2503.09095.pdf

Abstract:
Backdoor attacks pose a significant threat to deep learning models, enabling adversaries to embed hidden triggers that manipulate the behavior of the model during inference. Traditional backdoor attacks typically rely on inserting explicit triggers (e.g., external patches, or perturbations) into input data, but they often struggle to evade existing defense mechanisms. To address this limitation, we investigate backdoor attacks through the lens of the reasoning process in deep learning systems, drawing insights from interpretable AI. We conceptualize backdoor activation as the manipulation of learned concepts within the model's latent representations. Thus, existing attacks can be seen as implicit manipulations of these activated concepts during inference. This raises interesting questions: why not manipulate the concepts explicitly? This idea leads to our novel backdoor attack framework, Concept Confusion Attack (C^2 ATTACK), which leverages internal concepts in the model's reasoning as "triggers" without introducing explicit external modifications. By avoiding the use of real triggers and directly activating or deactivating specific concepts in latent spaces, our approach enhances stealth, making detection by existing defenses significantly harder. Using CLIP as a case study, experimental results demonstrate the effectiveness of C^2 ATTACK, achieving high attack success rates while maintaining robustness against advanced defenses.

Paperid: 306, https://arxiv.org/pdf/2505.10924.pdf

Abstract:
Recently, AI-driven interactions with computing devices have advanced from basic prototype tools to sophisticated, LLM-based systems that emulate human-like operations in graphical user interfaces. We are now witnessing the emergence of \emph{Computer-Using Agents} (CUAs), capable of autonomously performing tasks such as navigating desktop applications, web pages, and mobile apps. However, as these agents grow in capability, they also introduce novel safety and security risks. Vulnerabilities in LLM-driven reasoning, with the added complexity of integrating multiple software components and multimodal inputs, further complicate the security landscape. In this paper, we present a systematization of knowledge on the safety and security threats of CUAs. We conduct a comprehensive literature review and distill our findings along four research objectives: \textit{\textbf{(i)}} define the CUA that suits safety analysis; \textit{\textbf{(ii)} } categorize current safety threats among CUAs; \textit{\textbf{(iii)}} propose a comprehensive taxonomy of existing defensive strategies; \textit{\textbf{(iv)}} summarize prevailing benchmarks, datasets, and evaluation metrics used to assess the safety and performance of CUAs. Building on these insights, our work provides future researchers with a structured foundation for exploring unexplored vulnerabilities and offers practitioners actionable guidance in designing and deploying secure Computer-Using Agents.

Paperid: 307, https://arxiv.org/pdf/2505.12402.pdf

Abstract:
Impressive progress has been made in automated problem-solving by the collaboration of large language models (LLMs) based agents. However, these automated capabilities also open avenues for malicious applications. In this paper, we study a new threat that LLMs pose to online pseudonymity, called automated profile inference, where an adversary can instruct LLMs to automatically scrape and extract sensitive personal attributes from publicly visible user activities on pseudonymous platforms. We also introduce an automated profiling framework called AutoProfiler to assess the feasibility of such threats in real-world scenarios. AutoProfiler consists of four specialized LLM agents, who work collaboratively to collect and process user online activities and generate a profile with extracted personal information. Experimental results on two real-world datasets and one synthetic dataset demonstrate that AutoProfiler is highly effective and efficient, and can be easily deployed on a web scale. We demonstrate that the inferred attributes are both sensitive and identifiable, posing significant risks of privacy breaches, such as de-anonymization and sensitive information leakage. Additionally, we explore mitigation strategies from different perspectives and advocate for increased public awareness of this emerging privacy threat to online pseudonymity.

Paperid: 308, https://arxiv.org/pdf/2501.00055.pdf

Abstract:
While safety-aligned large language models (LLMs) are increasingly used as the cornerstone for powerful systems such as multi-agent frameworks to solve complex real-world problems, they still suffer from potential adversarial queries, such as jailbreak attacks, which attempt to induce harmful content. Researching attack methods allows us to better understand the limitations of LLM and make trade-offs between helpfulness and safety. However, existing jailbreak attacks are primarily based on opaque optimization techniques (e.g. token-level gradient descent) and heuristic search methods like LLM refinement, which fall short in terms of transparency, transferability, and computational cost. In light of these limitations, we draw inspiration from the evolution and infection processes of biological viruses and propose LLM-Virus, a jailbreak attack method based on evolutionary algorithm, termed evolutionary jailbreak. LLM-Virus treats jailbreak attacks as both an evolutionary and transfer learning problem, utilizing LLMs as heuristic evolutionary operators to ensure high attack efficiency, transferability, and low time cost. Our experimental results on multiple safety benchmarks show that LLM-Virus achieves competitive or even superior performance compared to existing attack methods.

Paperid: 309, https://arxiv.org/pdf/2506.18795.pdf

Abstract:
High-quality smart contract vulnerability datasets are critical for evaluating security tools and advancing smart contract security research. Two major limitations of current manual dataset construction are (1) labor-intensive and error-prone annotation processes limiting the scale, quality, and evolution of the dataset, and (2) absence of standardized classification rules results in inconsistent vulnerability categories and labeling results across different datasets. To address these limitations, we present FORGE, the first automated approach for constructing smart contract vulnerability datasets. FORGE leverages an LLM-driven pipeline to extract high-quality vulnerabilities from real-world audit reports and classify them according to the CWE, the most widely recognized classification in software security. FORGE employs a divide-and-conquer strategy to extract structured and self-contained vulnerability information from these reports. Additionally, it uses a tree-of-thoughts technique to classify the vulnerability information into the hierarchical CWE classification. To evaluate FORGE's effectiveness, we run FORGE on 6,454 real-world audit reports and generate a dataset comprising 81,390 solidity files and 27,497 vulnerability findings across 296 CWE categories. Manual assessment of the dataset demonstrates high extraction precision and classification consistency with human experts (precision of 95.6% and inter-rater agreement k-$Î±$ of 0.87). We further validate the practicality of our dataset by benchmarking 13 existing security tools on our dataset. The results reveal the significant limitations in current detection capabilities. Furthermore, by analyzing the severity-frequency distribution patterns through a unified CWE perspective in our dataset, we highlight inconsistency between current smart contract research focus and priorities identified from real-world vulnerabilities...

Paperid: 310, https://arxiv.org/pdf/2506.12707.pdf

Abstract:
Large language models (LLMs) have achieved widespread adoption across numerous applications. However, many LLMs are vulnerable to malicious attacks even after safety alignment. These attacks typically bypass LLMs' safety guardrails by wrapping the original malicious instructions inside adversarial jailbreaks prompts. Previous research has proposed methods such as adversarial training and prompt rephrasing to mitigate these safety vulnerabilities, but these methods often reduce the utility of LLMs or lead to significant computational overhead and online latency. In this paper, we propose SecurityLingua, an effective and efficient approach to defend LLMs against jailbreak attacks via security-oriented prompt compression. Specifically, we train a prompt compressor designed to discern the "true intention" of the input prompt, with a particular focus on detecting the malicious intentions of adversarial prompts. Then, in addition to the original prompt, the intention is passed via the system prompt to the target LLM to help it identify the true intention of the request. SecurityLingua ensures a consistent user experience by leaving the original input prompt intact while revealing the user's potentially malicious intention and stimulating the built-in safety guardrails of the LLM. Moreover, thanks to prompt compression, SecurityLingua incurs only a negligible overhead and extra token cost compared to all existing defense methods, making it an especially practical solution for LLM defense. Experimental results demonstrate that SecurityLingua can effectively defend against malicious attacks and maintain utility of the LLM with negligible compute and latency overhead. Our code is available at https://aka.ms/SecurityLingua.

Paperid: 311, https://arxiv.org/pdf/2506.11415.pdf

Abstract:
In Large Language Models, Retrieval-Augmented Generation (RAG) systems can significantly enhance the performance of large language models by integrating external knowledge. However, RAG also introduces new security risks. Existing research focuses mainly on how poisoning attacks in RAG systems affect model output quality, overlooking their potential to amplify model biases. For example, when querying about domestic violence victims, a compromised RAG system might preferentially retrieve documents depicting women as victims, causing the model to generate outputs that perpetuate gender stereotypes even when the original query is gender neutral. To show the impact of the bias, this paper proposes a Bias Retrieval and Reward Attack (BRRA) framework, which systematically investigates attack pathways that amplify language model biases through a RAG system manipulation. We design an adversarial document generation method based on multi-objective reward functions, employ subspace projection techniques to manipulate retrieval results, and construct a cyclic feedback mechanism for continuous bias amplification. Experiments on multiple mainstream large language models demonstrate that BRRA attacks can significantly enhance model biases in dimensions. In addition, we explore a dual stage defense mechanism to effectively mitigate the impacts of the attack. This study reveals that poisoning attacks in RAG systems directly amplify model output biases and clarifies the relationship between RAG system security and model fairness. This novel potential attack indicates that we need to keep an eye on the fairness issues of the RAG system.

Paperid: 312, https://arxiv.org/pdf/2505.17519.pdf

Abstract:
In the era of rapid generative AI development, interactions between humans and large language models face significant misusing risks. Previous research has primarily focused on black-box scenarios using human-guided prompts and white-box scenarios leveraging gradient-based LLM generation methods, neglecting the possibility that LLMs can act not only as victim models, but also as attacker models to harm other models. We proposes a novel jailbreaking method inspired by the Chain-of-Thought mechanism, where the attacker model uses mission transfer to conceal harmful user intent in dialogue and generates chained narrative lures to stimulate the reasoning capabilities of victim models, leading to successful jailbreaking. To enhance the attack success rate, we introduce a helper model that performs random narrative optimization on the narrative lures during multi-turn dialogues while ensuring alignment with the original intent, enabling the optimized lures to bypass the safety barriers of victim models effectively. Our experiments reveal that models with weaker safety mechanisms exhibit stronger attack capabilities, demonstrating that models can not only be exploited, but also help harm others. By incorporating toxicity scores, we employ third-party models to evaluate the harmfulness of victim models' responses to jailbreaking attempts. The study shows that using refusal keywords as an evaluation metric for attack success rates is significantly flawed because it does not assess whether the responses guide harmful questions, while toxicity scores measure the harm of generated content with more precision and its alignment with harmful questions. Our approach demonstrates outstanding performance, uncovering latent vulnerabilities in LLMs and providing data-driven feedback to optimize LLM safety mechanisms. We also discuss two defensive strategies to offer guidance on improving defense mechanisms.

Paperid: 313, https://arxiv.org/pdf/2501.16671.pdf

Abstract:
Generative AI technology has become increasingly integrated into our daily lives, offering powerful capabilities to enhance productivity. However, these same capabilities can be exploited by adversaries for malicious purposes. While existing research on adversarial applications of generative AI predominantly focuses on cyberattacks, less attention has been given to attacks targeting deep learning models. In this paper, we introduce the use of generative AI for facilitating model-related attacks, including model extraction, membership inference, and model inversion. Our study reveals that adversaries can launch a variety of model-related attacks against both image and text models in a data-free and black-box manner, achieving comparable performance to baseline methods that have access to the target models' training data and parameters in a white-box manner. This research serves as an important early warning to the community about the potential risks associated with generative AI-powered attacks on deep learning models.

Paperid: 314, https://arxiv.org/pdf/2501.16663.pdf

Abstract:
Duplication is a prevalent issue within datasets. Existing research has demonstrated that the presence of duplicated data in training datasets can significantly influence both model performance and data privacy. However, the impact of data duplication on the unlearning process remains largely unexplored. This paper addresses this gap by pioneering a comprehensive investigation into the role of data duplication, not only in standard machine unlearning but also in federated and reinforcement unlearning paradigms. Specifically, we propose an adversary who duplicates a subset of the target model's training set and incorporates it into the training set. After training, the adversary requests the model owner to unlearn this duplicated subset, and analyzes the impact on the unlearned model. For example, the adversary can challenge the model owner by revealing that, despite efforts to unlearn it, the influence of the duplicated subset remains in the model. Moreover, to circumvent detection by de-duplication techniques, we propose three novel near-duplication methods for the adversary, each tailored to a specific unlearning paradigm. We then examine their impacts on the unlearning process when de-duplication techniques are applied. Our findings reveal several crucial insights: 1) the gold standard unlearning method, retraining from scratch, fails to effectively conduct unlearning under certain conditions; 2) unlearning duplicated data can lead to significant model degradation in specific scenarios; and 3) meticulously crafted duplicates can evade detection by de-duplication methods.

Paperid: 315, https://arxiv.org/pdf/2506.18870.pdf

Abstract:
Machine learning (ML) models are proving to be vulnerable to a variety of attacks that allow the adversary to learn sensitive information, cause mispredictions, and more. While these attacks have been extensively studied, current research predominantly focuses on analyzing each attack type individually. In practice, however, adversaries may employ multiple attack strategies simultaneously rather than relying on a single approach. This prompts a crucial yet underexplored question: When the adversary has multiple attacks at their disposal, are they able to mount or amplify the effect of one attack with another? In this paper, we take the first step in studying the strategic interactions among different attacks, which we define as attack compositions. Specifically, we focus on four well-studied attacks during the model's inference phase: adversarial examples, attribute inference, membership inference, and property inference. To facilitate the study of their interactions, we propose a taxonomy based on three stages of the attack pipeline: preparation, execution, and evaluation. Using this taxonomy, we identify four effective attack compositions, such as property inference assisting attribute inference at its preparation level and adversarial examples assisting property inference at its execution level. We conduct extensive experiments on the attack compositions using three ML model architectures and three benchmark image datasets. Empirical results demonstrate the effectiveness of these four attack compositions. We implement and release a modular reusable toolkit, COAT. Arguably, our work serves as a call for researchers and practitioners to consider advanced adversarial settings involving multiple attack strategies, aiming to strengthen the security and robustness of AI systems.

Paperid: 316, https://arxiv.org/pdf/2506.14374.pdf

Abstract:
Recent reasoning large language models (LLMs), such as OpenAI o1 and DeepSeek-R1, exhibit strong performance on complex tasks through test-time inference scaling. However, prior studies have shown that these models often incur significant computational costs due to excessive reasoning, such as frequent switching between reasoning trajectories (e.g., underthinking) or redundant reasoning on simple questions (e.g., overthinking). In this work, we expose a novel threat: adversarial inputs can be crafted to exploit excessive reasoning behaviors and substantially increase computational overhead without compromising model utility. Therefore, we propose a novel loss framework consisting of three components: (1) Priority Cross-Entropy Loss, a modification of the standard cross-entropy objective that emphasizes key tokens by leveraging the autoregressive nature of LMs; (2) Excessive Reasoning Loss, which encourages the model to initiate additional reasoning paths during inference; and (3) Delayed Termination Loss, which is designed to extend the reasoning process and defer the generation of final outputs. We optimize and evaluate our attack for the GSM8K and ORCA datasets on DeepSeek-R1-Distill-LLaMA and DeepSeek-R1-Distill-Qwen. Empirical results demonstrate a 3x to 9x increase in reasoning length with comparable utility performance. Furthermore, our crafted adversarial inputs exhibit transferability, inducing computational overhead in o3-mini, o1-mini, DeepSeek-R1, and QWQ models.

Paperid: 317, https://arxiv.org/pdf/2506.13494.pdf

Abstract:
Large Language Models (LLMs) have experienced rapid advancements, with applications spanning a wide range of fields, including sentiment classification, review generation, and question answering. Due to their efficiency and versatility, researchers and companies increasingly employ LLM-generated data to train their models. However, the inability to track content produced by LLMs poses a significant challenge, potentially leading to copyright infringement for the LLM owners. In this paper, we propose a method for injecting watermarks into LLM-generated datasets, enabling the tracking of downstream tasks to detect whether these datasets were produced using the original LLM. These downstream tasks can be divided into two categories. The first involves using the generated datasets at the input level, commonly for training classification tasks. The other is the output level, where model trainers use LLM-generated content as output for downstream tasks, such as question-answering tasks. We design a comprehensive set of experiments to evaluate both watermark methods. Our results indicate the high effectiveness of our watermark approach. Additionally, regarding model utility, we find that classifiers trained on the generated datasets achieve a test accuracy exceeding 0.900 in many cases, suggesting that the utility of such models remains robust. For the output-level watermark, we observe that the quality of the generated text is comparable to that produced using real-world datasets. Through our research, we aim to advance the protection of LLM copyrights, taking a significant step forward in safeguarding intellectual property in this domain.

Paperid: 318, https://arxiv.org/pdf/2506.10047.pdf

Abstract:
Text-to-image (T2I) models such as Stable Diffusion have advanced rapidly and are now widely used in content creation. However, these models can be misused to generate harmful content, including nudity or violence, posing significant safety risks. While most platforms employ content moderation systems, underlying vulnerabilities can still be exploited by determined adversaries. Recent research on red-teaming and adversarial attacks against T2I models has notable limitations: some studies successfully generate highly toxic images but use adversarial prompts that are easily detected and blocked by safety filters, while others focus on bypassing safety mechanisms but fail to produce genuinely harmful outputs, neglecting the discovery of truly high-risk prompts. Consequently, there remains a lack of reliable tools for evaluating the safety of defended T2I models. To address this gap, we propose GenBreak, a framework that fine-tunes a red-team large language model (LLM) to systematically explore underlying vulnerabilities in T2I generators. Our approach combines supervised fine-tuning on curated datasets with reinforcement learning via interaction with a surrogate T2I model. By integrating multiple reward signals, we guide the LLM to craft adversarial prompts that enhance both evasion capability and image toxicity, while maintaining semantic coherence and diversity. These prompts demonstrate strong effectiveness in black-box attacks against commercial T2I generators, revealing practical and concerning safety weaknesses.

Paperid: 319, https://arxiv.org/pdf/2506.07888.pdf

Abstract:
Data reconstruction attacks, which aim to recover the training dataset of a target model with limited access, have gained increasing attention in recent years. However, there is currently no consensus on a formal definition of data reconstruction attacks or appropriate evaluation metrics for measuring their quality. This lack of rigorous definitions and universal metrics has hindered further advancement in this field. In this paper, we address this issue in the vision domain by proposing a unified attack taxonomy and formal definitions of data reconstruction attacks. We first propose a set of quantitative evaluation metrics that consider important criteria such as quantifiability, consistency, precision, and diversity. Additionally, we leverage large language models (LLMs) as a substitute for human judgment, enabling visual evaluation with an emphasis on high-quality reconstructions. Using our proposed taxonomy and metrics, we present a unified framework for systematically evaluating the strengths and limitations of existing attacks and establishing a benchmark for future research. Empirical results, primarily from a memorization perspective, not only validate the effectiveness of our metrics but also offer valuable insights for designing new attacks.

Paperid: 320, https://arxiv.org/pdf/2506.00197.pdf

Abstract:
Knowledge files have been widely used in large language model (LLM) agents, such as GPTs, to improve response quality. However, concerns about the potential leakage of knowledge files have grown significantly. Existing studies demonstrate that adversarial prompts can induce GPTs to leak knowledge file content. Yet, it remains uncertain whether additional leakage vectors exist, particularly given the complex data flows across clients, servers, and databases in GPTs. In this paper, we present a comprehensive risk assessment of knowledge file leakage, leveraging a novel workflow inspired by Data Security Posture Management (DSPM). Through the analysis of 651,022 GPT metadata, 11,820 flows, and 1,466 responses, we identify five leakage vectors: metadata, GPT initialization, retrieval, sandboxed execution environments, and prompts. These vectors enable adversaries to extract sensitive knowledge file data such as titles, content, types, and sizes. Notably, the activation of the built-in tool Code Interpreter leads to a privilege escalation vulnerability, enabling adversaries to directly download original knowledge files with a 95.95% success rate. Further analysis reveals that 28.80% of leaked files are copyrighted, including digital copies from major publishers and internal materials from a listed company. In the end, we provide actionable solutions for GPT builders and platform providers to secure the GPT data supply chain.

Paperid: 321, https://arxiv.org/pdf/2503.04332.pdf

Abstract:
The tremendous commercial potential of large language models (LLMs) has heightened concerns about their unauthorized use. Third parties can customize LLMs through fine-tuning and offer only black-box API access, effectively concealing unauthorized usage and complicating external auditing processes. This practice not only exacerbates unfair competition, but also violates licensing agreements. In response, identifying the origin of black-box LLMs is an intrinsic solution to this issue. In this paper, we first reveal the limitations of state-of-the-art passive and proactive identification methods with experiments on 30 LLMs and two real-world black-box APIs. Then, we propose the proactive technique, PlugAE, which optimizes adversarial token embeddings in a continuous space and proactively plugs them into the LLM for tracing and identification. The experiments show that PlugAE can achieve substantial improvement in identifying fine-tuned derivatives. We further advocate for legal frameworks and regulations to better address the challenges posed by the unauthorized use of LLMs.

Paperid: 322, https://arxiv.org/pdf/2502.01241.pdf

Abstract:
Human-AI conversations have gained increasing attention since the era of large language models. Consequently, more techniques, such as input/output guardrails and safety alignment, are proposed to prevent potential misuse of such Human-AI conversations. However, the ability to identify these guardrails has significant implications, both for adversarial exploitation and for auditing purposes by red team operators. In this work, we propose a novel method, AP-Test, which identifies the presence of a candidate guardrail by leveraging guardrail-specific adversarial prompts to query the AI agent. Extensive experiments of four candidate guardrails under diverse scenarios showcase the effectiveness of our method. The ablation study further illustrates the importance of the components we designed, such as the loss terms.

Paperid: 323, https://arxiv.org/pdf/2502.00808.pdf

Abstract:
Large language models (LLMs) have facilitated the generation of high-quality, cost-effective synthetic data for developing downstream models and conducting statistical analyses in various domains. However, the increased reliance on synthetic data may pose potential negative impacts. Numerous studies have demonstrated that LLM-generated synthetic data can perpetuate and even amplify societal biases and stereotypes, and produce erroneous outputs known as ``hallucinations'' that deviate from factual knowledge. In this paper, we aim to audit artifacts, such as classifiers, generators, or statistical plots, to identify those trained on or derived from synthetic data and raise user awareness, thereby reducing unexpected consequences and risks in downstream applications. To this end, we take the first step to introduce synthetic artifact auditing to assess whether a given artifact is derived from LLM-generated synthetic data. We then propose an auditing framework with three methods including metric-based auditing, tuning-based auditing, and classification-based auditing. These methods operate without requiring the artifact owner to disclose proprietary training details. We evaluate our auditing framework on three text classification tasks, two text summarization tasks, and two data visualization tasks across three training scenarios. Our evaluation demonstrates the effectiveness of all proposed auditing methods across all these tasks. For instance, black-box metric-based auditing can achieve an average accuracy of $0.868 \pm 0.071$ for auditing classifiers and $0.880 \pm 0.052$ for auditing generators using only 200 random queries across three scenarios. We hope our research will enhance model transparency and regulatory compliance, ensuring the ethical and responsible use of synthetic data.

Paperid: 324, https://arxiv.org/pdf/2501.16750.pdf

Abstract:
Large Language Models (LLMs) have raised increasing concerns about their misuse in generating hate speech. Among all the efforts to address this issue, hate speech detectors play a crucial role. However, the effectiveness of different detectors against LLM-generated hate speech remains largely unknown. In this paper, we propose HateBench, a framework for benchmarking hate speech detectors on LLM-generated hate speech. We first construct a hate speech dataset of 7,838 samples generated by six widely-used LLMs covering 34 identity groups, with meticulous annotations by three labelers. We then assess the effectiveness of eight representative hate speech detectors on the LLM-generated dataset. Our results show that while detectors are generally effective in identifying LLM-generated hate speech, their performance degrades with newer versions of LLMs. We also reveal the potential of LLM-driven hate campaigns, a new threat that LLMs bring to the field of hate speech detection. By leveraging advanced techniques like adversarial attacks and model stealing attacks, the adversary can intentionally evade the detector and automate hate campaigns online. The most potent adversarial attack achieves an attack success rate of 0.966, and its attack efficiency can be further improved by $13-21\times$ through model stealing attacks with acceptable attack performance. We hope our study can serve as a call to action for the research community and platform moderators to fortify defenses against these emerging threats.

Paperid: 325, https://arxiv.org/pdf/2506.14697.pdf

Abstract:
The rapid advancement of vision-language models (VLMs) and their integration into embodied agents have unlocked powerful capabilities for decision-making. However, as these systems are increasingly deployed in real-world environments, they face mounting safety concerns, particularly when responding to hazardous instructions. In this work, we propose AGENTSAFE, the first comprehensive benchmark for evaluating the safety of embodied VLM agents under hazardous instructions. AGENTSAFE simulates realistic agent-environment interactions within a simulation sandbox and incorporates a novel adapter module that bridges the gap between high-level VLM outputs and low-level embodied controls. Specifically, it maps recognized visual entities to manipulable objects and translates abstract planning into executable atomic actions in the environment. Building on this, we construct a risk-aware instruction dataset inspired by Asimovs Three Laws of Robotics, including base risky instructions and mutated jailbroken instructions. The benchmark includes 45 adversarial scenarios, 1,350 hazardous tasks, and 8,100 hazardous instructions, enabling systematic testing under adversarial conditions ranging from perception, planning, and action execution stages.

Paperid: 326, https://arxiv.org/pdf/2505.20118.pdf

Abstract:
As large language models (LLMs) become integrated into sensitive workflows, concerns grow over their potential to leak confidential information. We propose TrojanStego, a novel threat model in which an adversary fine-tunes an LLM to embed sensitive context information into natural-looking outputs via linguistic steganography, without requiring explicit control over inference inputs. We introduce a taxonomy outlining risk factors for compromised LLMs, and use it to evaluate the risk profile of the threat. To implement TrojanStego, we propose a practical encoding scheme based on vocabulary partitioning learnable by LLMs via fine-tuning. Experimental results show that compromised models reliably transmit 32-bit secrets with 87% accuracy on held-out prompts, reaching over 97% accuracy using majority voting across three generations. Further, they maintain high utility, can evade human detection, and preserve coherence. These results highlight a new class of LLM data exfiltration attacks that are passive, covert, practical, and dangerous.

Paperid: 327, https://arxiv.org/pdf/2504.17875.pdf

Abstract:
We propose the DIVER (Defensive Implant for Visibility into Embedded Run-times) framework for real-time deep visibility into embedded control devices in cyber-physical systems (CPSs). DIVER enables run-time detection of anomalies and targets devices running VxWorks real-time operating system (RTOS), precluding traditional methods of implementing dynamic monitors using OS (e.g., Linux, Windows) functions. DIVER has two components: "measurer" implant embedded into VxWorks kernel to collect run-time measurements and provide interactive/streaming interfaces over TCP/IP; remote "listener" that acquires and analyzes measurements and provides interactive user interface. DIVER focuses on small embedded devices with stringent resource constraints (e.g., insufficient storage to locally store measurements). To show efficacy and scalability of DIVER, we demonstrate on two embedded devices with different processor architectures and VxWorks versions: Motorola ACE Remote Terminal Unit used in CPS including power systems and Raspberry Pi representative of Internet-of-Things (IoT) applications.

Paperid: 328, https://arxiv.org/pdf/2504.01533.pdf

Abstract:
Large Language Models (LLMs) face threats from jailbreak prompts. Existing methods for defending against jailbreak attacks are primarily based on auxiliary models. These strategies, however, often require extensive data collection or training. We propose LightDefense, a lightweight defense mechanism targeted at white-box models, which utilizes a safety-oriented direction to adjust the probabilities of tokens in the vocabulary, making safety disclaimers appear among the top tokens after sorting tokens by probability in descending order. We further innovatively leverage LLM's uncertainty about prompts to measure their harmfulness and adaptively adjust defense strength, effectively balancing safety and helpfulness. The effectiveness of LightDefense in defending against 5 attack methods across 2 target LLMs, without compromising helpfulness to benign user queries, highlights its potential as a novel and lightweight defense mechanism, enhancing security of LLMs.

Paperid: 329, https://arxiv.org/pdf/2502.10931.pdf

Abstract:
Large Language Models (LLMs) have been used in cybersecurity such as autonomous security analysis or penetration testing. Capture the Flag (CTF) challenges serve as benchmarks to assess automated task-planning abilities of LLM agents for cybersecurity. Early attempts to apply LLMs for solving CTF challenges used single-agent systems, where feedback was restricted to a single reasoning-action loop. This approach was inadequate for complex CTF tasks. Inspired by real-world CTF competitions, where teams of experts collaborate, we introduce the D-CIPHER LLM multi-agent framework for collaborative CTF solving. D-CIPHER integrates agents with distinct roles with dynamic feedback loops to enhance reasoning on complex tasks. It introduces the Planner-Executor agent system, consisting of a Planner agent for overall problem-solving along with multiple heterogeneous Executor agents for individual tasks, facilitating efficient allocation of responsibilities among the agents. Additionally, D-CIPHER incorporates an Auto-prompter agent to improve problem-solving by auto-generating a highly relevant initial prompt. We evaluate D-CIPHER on multiple CTF benchmarks and LLM models via comprehensive studies to highlight the impact of our enhancements. Additionally, we manually map the CTFs in NYU CTF Bench to MITRE ATT&CK techniques that apply for a comprehensive evaluation of D-CIPHER's offensive security capability. D-CIPHER achieves state-of-the-art performance on three benchmarks: 22.0% on NYU CTF Bench, 22.5% on Cybench, and 44.0% on HackTheBox, which is 2.5% to 8.5% better than previous work. D-CIPHER solves 65% more ATT&CK techniques compared to previous work, demonstrating stronger offensive capability.

Paperid: 330, https://arxiv.org/pdf/2501.16619.pdf

Abstract:
Ransomware's escalating sophistication necessitates tamper-resistant, off-host detection solutions that capture deep disk activity beyond the reach of a compromised operating system while overcoming evasion and obfuscation techniques. To address this, we introduce SHIELD: a metric acquisition framework leveraging low-level filesystem monitoring and Network Block Device (NBD) technology to provide off-host, tamper-proof measurements for continuous observation of disk activity exhibited by software executing on a target device. We employ Shield within a detection architecture leveraging deep filesystem features along with simplified metrics aggregated based on frequency of disk actions, making the metrics impervious to obfuscation while avoiding reliance on vulnerable host-based logs. We evaluate the efficacy of these metrics through extensive experiments with both binary (benign vs. malicious behavior) and multiclass (ransomware strain identification) classifiers and confirm that our metrics yield high accuracy across diverse threat profiles, including intermittent or partial encryption. In a proof-of-concept deployment, we demonstrate real-time mitigation using models trained on these metrics by halting malicious disk operations after ransomware detection with minimum file loss and memory corruption. We also show that hardware-only features collected independently of OS or network stack retain high detection effectiveness, verifying feasibility of embedding the proposed pipeline in a SATA controller ASIC or FPGA for next-generation, disk-centric defenses that combine filesystem insight with inherent off-host isolation.

Paperid: 331, https://arxiv.org/pdf/2501.13081.pdf

Abstract:
With increasingly sophisticated cyber-adversaries able to access a wider repertoire of mechanisms to implant malware such as ransomware, CPU/GPU keyloggers, and stealthy kernel rootkits, there is an urgent need for techniques to detect and mitigate such attacks. While state of the art relies on digital and analog side channel measurements assuming trustworthiness of measurements obtained on the main processor, such an approach has limitations since processor-based side channel measurements are potentially untrustworthy. Sophisticated adversaries (especially in late stage cyber attacks when they have breached the computer and network security systems such as firewalls and antivirus and penetrated the computer's OS) can compromise user-space and kernel-space measurements. To address this key limitation of state of the art, we propose a "subcomponent-level" approach to collect side channel measurements so as to enable robust anomaly detection in a modern computer even when the main processor is compromised. Our proposed approach leverages the fact that modern computers are complex systems with multiple interacting subcomponents and measurements from subcomponents can be used to detect anomalies even when the main processor is no longer trustworthy. We develop mechanisms to obtain time series measurements of activity of several subcomponents and methodologies to process and fuse these measurements for anomaly detection. The subcomponents include network interface controller, GPU, CPU Hardware Performance Counters, CPU power, and keyboard. Our main hypothesis is that subcomponent measurements can enable detection of security threats without requiring a trustworthy main processor. By enabling real-time measurements from multiple subcomponents, the goal is to provide a deeper visibility into system operation, thereby yielding a powerful tool to track system operation and detect anomalies.

Paperid: 332, https://arxiv.org/pdf/2506.13205.pdf

Abstract:
Mobile agents powered by vision-language models (VLMs) are increasingly adopted for tasks such as UI automation and camera-based assistance. These agents are typically fine-tuned using small-scale, user-collected data, making them susceptible to stealthy training-time threats. This work introduces VIBMA, the first clean-text backdoor attack targeting VLM-based mobile agents. The attack injects malicious behaviors into the model by modifying only the visual input while preserving textual prompts and instructions, achieving stealth through the complete absence of textual anomalies. Once the agent is fine-tuned on this poisoned data, adding a predefined visual pattern (trigger) at inference time activates the attacker-specified behavior (backdoor). Our attack aligns the training gradients of poisoned samples with those of an attacker-specified target instance, effectively embedding backdoor-specific features into the poisoned data. To ensure the robustness and stealthiness of the attack, we design three trigger variants that better resemble real-world scenarios: static patches, dynamic motion patterns, and low-opacity blended content. Extensive experiments on six Android applications and three mobile-compatible VLMs demonstrate that our attack achieves high success rates (ASR up to 94.67%) while preserving clean-task behavior (FSR up to 95.85%). We further conduct ablation studies to understand how key design factors impact attack reliability and stealth. These findings is the first to reveal the security vulnerabilities of mobile agents and their susceptibility to backdoor injection, underscoring the need for robust defenses in mobile agent adaptation pipelines.

Paperid: 333, https://arxiv.org/pdf/2506.10776.pdf

Abstract:
The capability of generative diffusion models (DMs) like Stable Diffusion (SD) in replicating training data could be taken advantage of by attackers to launch the Copyright Infringement Attack, with duplicated poisoned image-text pairs. SilentBadDiffusion (SBD) is a method proposed recently, which shew outstanding performance in attacking SD in text-to-image tasks. However, the feasible data resources in this area are still limited, some of them are even constrained or prohibited due to the issues like copyright ownership or inappropriate contents; And not all of the images in current datasets are suitable for the proposed attacking methods; Besides, the state-of-the-art (SoTA) performance of SBD is far from ideal when few generated poisoning samples could be adopted for attacks. In this paper, we raised new datasets accessible for researching in attacks like SBD, and proposed Multi-Element (ME) attack method based on SBD by increasing the number of poisonous visual-text elements per poisoned sample to enhance the ability of attacking, while importing Discrete Cosine Transform (DCT) for the poisoned samples to maintain the stealthiness. The Copyright Infringement Rate (CIR) / First Attack Epoch (FAE) we got on the two new datasets were 16.78% / 39.50 and 51.20% / 23.60, respectively close to or even outperformed benchmark Pokemon and Mijourney datasets. In condition of low subsampling ratio (5%, 6 poisoned samples), MESI and DCT earned CIR / FAE of 0.23% / 84.00 and 12.73% / 65.50, both better than original SBD, which failed to attack at all.

Paperid: 334, https://arxiv.org/pdf/2506.10744.pdf

Abstract:
Bit-flip attacks (BFAs) represent a serious threat to Deep Neural Networks (DNNs), where flipping a small number of bits in the model parameters or binary code can significantly degrade the model accuracy or mislead the model prediction in a desired way. Existing defenses exclusively focus on protecting models for specific attacks and platforms, while lacking effectiveness for other scenarios. We propose ObfusBFA, an efficient and holistic methodology to mitigate BFAs targeting both the high-level model weights and low-level codebase (executables or shared libraries). The key idea of ObfusBFA is to introduce random dummy operations during the model inference, which effectively transforms the delicate attacks into random bit flips, making it much harder for attackers to pinpoint and exploit vulnerable bits. We design novel algorithms to identify critical bits and insert obfuscation operations. We evaluate ObfusBFA against different types of attacks, including the adaptive scenarios where the attacker increases the flip bit budget to attempt to circumvent our defense. The results show that ObfusBFA can consistently preserve the model accuracy across various datasets and DNN architectures while significantly reducing the attack success rates. Additionally, it introduces minimal latency and storage overhead, making it a practical solution for real-world applications.

Paperid: 335, https://arxiv.org/pdf/2505.16670.pdf

Abstract:
Large language models (LLMs) have shown impressive capabilities across a wide range of applications, but their ever-increasing size and resource demands make them vulnerable to inference cost attacks, where attackers induce victim LLMs to generate the longest possible output content. In this paper, we revisit existing inference cost attacks and reveal that these methods can hardly produce large-scale malicious effects since they are self-targeting, where attackers are also the users and therefore have to execute attacks solely through the inputs, whose generated content will be charged by LLMs and can only directly influence themselves. Motivated by these findings, this paper introduces a new type of inference cost attacks (dubbed 'bit-flip inference cost attack') that target the victim model itself rather than its inputs. Specifically, we design a simple yet effective method (dubbed 'BitHydra') to effectively flip critical bits of model parameters. This process is guided by a loss function designed to suppress token's probability with an efficient critical bit search algorithm, thus explicitly defining the attack objective and enabling effective optimization. We evaluate our method on 11 LLMs ranging from 1.5B to 14B parameters under both int8 and float16 settings. Experimental results demonstrate that with just 4 search samples and as few as 3 bit flips, BitHydra can force 100% of test prompts to reach the maximum generation length (e.g., 2048 tokens) on representative LLMs such as LLaMA3, highlighting its efficiency, scalability, and strong transferability across unseen inputs.

Paperid: 336, https://arxiv.org/pdf/2505.08809.pdf

Abstract:
This paper focuses on implanting multiple heterogeneous backdoor triggers in bridge-based diffusion models designed for complex and arbitrary input distributions. Existing backdoor formulations mainly address single-attack scenarios and are limited to Gaussian noise input models. To fill this gap, we propose MixBridge, a novel diffusion SchrÃ¶dinger bridge (DSB) framework to cater to arbitrary input distributions (taking I2I tasks as special cases). Beyond this trait, we demonstrate that backdoor triggers can be injected into MixBridge by directly training with poisoned image pairs. This eliminates the need for the cumbersome modifications to stochastic differential equations required in previous studies, providing a flexible tool to study backdoor behavior for bridge models. However, a key question arises: can a single DSB model train multiple backdoor triggers? Unfortunately, our theory shows that when attempting this, the model ends up following the geometric mean of benign and backdoored distributions, leading to performance conflict across backdoor tasks. To overcome this, we propose a Divide-and-Merge strategy to mix different bridges, where models are independently pre-trained for each specific objective (Divide) and then integrated into a unified model (Merge). In addition, a Weight Reallocation Scheme (WRS) is also designed to enhance the stealthiness of MixBridge. Empirical studies across diverse generation tasks speak to the efficacy of MixBridge.

Paperid: 337, https://arxiv.org/pdf/2504.15512.pdf

Abstract:
The rapid development of generative artificial intelligence has made text to video models essential for building future multimodal world simulators. However, these models remain vulnerable to jailbreak attacks, where specially crafted prompts bypass safety mechanisms and lead to the generation of harmful or unsafe content. Such vulnerabilities undermine the reliability and security of simulation based applications. In this paper, we propose T2VShield, a comprehensive and model agnostic defense framework designed to protect text to video models from jailbreak threats. Our method systematically analyzes the input, model, and output stages to identify the limitations of existing defenses, including semantic ambiguities in prompts, difficulties in detecting malicious content in dynamic video outputs, and inflexible model centric mitigation strategies. T2VShield introduces a prompt rewriting mechanism based on reasoning and multimodal retrieval to sanitize malicious inputs, along with a multi scope detection module that captures local and global inconsistencies across time and modalities. The framework does not require access to internal model parameters and works with both open and closed source systems. Extensive experiments on five platforms show that T2VShield can reduce jailbreak success rates by up to 35 percent compared to strong baselines. We further develop a human centered audiovisual evaluation protocol to assess perceptual safety, emphasizing the importance of visual level defense in enhancing the trustworthiness of next generation multimodal simulators.

Paperid: 338, https://arxiv.org/pdf/2502.18511.pdf

Abstract:
Generative large language models are crucial in natural language processing, but they are vulnerable to backdoor attacks, where subtle triggers compromise their behavior. Although backdoor attacks against LLMs are constantly emerging, existing benchmarks remain limited in terms of sufficient coverage of attack, metric system integrity, backdoor attack alignment. And existing pre-trained backdoor attacks are idealized in practice due to resource access constraints. Therefore we establish $\textit{ELBA-Bench}$, a comprehensive and unified framework that allows attackers to inject backdoor through parameter efficient fine-tuning ($\textit{e.g.,}$ LoRA) or without fine-tuning techniques ($\textit{e.g.,}$ In-context-learning). $\textit{ELBA-Bench}$ provides over 1300 experiments encompassing the implementations of 12 attack methods, 18 datasets, and 12 LLMs. Extensive experiments provide new invaluable findings into the strengths and limitations of various attack strategies. For instance, PEFT attack consistently outperform without fine-tuning approaches in classification tasks while showing strong cross-dataset generalization with optimized triggers boosting robustness; Task-relevant backdoor optimization techniques or attack prompts along with clean and adversarial demonstrations can enhance backdoor attack success while preserving model performance on clean samples. Additionally, we introduce a universal toolbox designed for standardized backdoor attack research, with the goal of propelling further progress in this vital area.

Paperid: 339, https://arxiv.org/pdf/2506.12947.pdf

Abstract:
Processing-using-DRAM (PuD) is a promising paradigm for alleviating the data movement bottleneck using DRAM's massive internal parallelism and bandwidth to execute very wide operations. Performing a PuD operation involves activating multiple DRAM rows in quick succession or simultaneously, i.e., multiple-row activation. Multiple-row activation is fundamentally different from conventional memory access patterns that activate one DRAM row at a time. However, repeatedly activating even one DRAM row (e.g., RowHammer) can induce bitflips in unaccessed DRAM rows because modern DRAM is subject to read disturbance. Unfortunately, no prior work investigates the effects of multiple-row activation on DRAM read disturbance. In this paper, we present the first characterization study of read disturbance effects of multiple-row activation-based PuD (which we call PuDHammer) using 316 real DDR4 DRAM chips from four major DRAM manufacturers. Our detailed characterization show that 1) PuDHammer significantly exacerbates the read disturbance vulnerability, causing up to 158.58x reduction in the minimum hammer count required to induce the first bitflip ($HC_{first}$), compared to RowHammer, 2) PuDHammer is affected by various operational conditions and parameters, 3) combining RowHammer with PuDHammer is more effective than using RowHammer alone to induce read disturbance error, e.g., doing so reduces $HC_{first}$ by 1.66x on average, and 4) PuDHammer bypasses an in-DRAM RowHammer mitigation mechanism (Target Row Refresh) and induces more bitflips than RowHammer. To develop future robust PuD-enabled systems in the presence of PuDHammer, we 1) develop three countermeasures and 2) adapt and evaluate the state-of-the-art RowHammer mitigation standardized by industry, called Per Row Activation Counting (PRAC). We show that the adapted PRAC incurs large performance overheads (48.26%, on average).

Paperid: 340, https://arxiv.org/pdf/2503.17891.pdf

Abstract:
DRAM chips are vulnerable to read disturbance phenomena (e.g., RowHammer and RowPress), where repeatedly accessing or keeping open a DRAM row causes bitflips in nearby rows. Attackers leverage RowHammer bitflips in real systems to take over systems and leak data. Consequently, many prior works propose mitigations, including recent DDR specifications introducing new mitigations (e.g., PRAC and RFM). For robust operation, it is critical to analyze other security implications of RowHammer mitigations. Unfortunately, no prior work analyzes the timing covert and side channel vulnerabilities introduced by RowHammer mitigations. This paper presents the first analysis and evaluation of timing covert and side channel vulnerabilities introduced by state-of-the-art RowHammer mitigations. We demonstrate that RowHammer mitigations' preventive actions have two fundamental features that enable timing channels. First, preventive actions reduce DRAM bandwidth availability, resulting in longer memory latencies. Second, preventive actions can be triggered on demand depending on memory access patterns. We introduce LeakyHammer, a new class of attacks that leverage the RowHammer mitigation-induced memory latency differences to establish communication channels and leak secrets. First, we build two covert channel attacks exploiting two state-of-the-art RowHammer mitigations, achieving 38.6 Kbps and 48.6 Kbps channel capacity. Second, we demonstrate a website fingerprinting attack that identifies visited websites based on the RowHammer-preventive actions they cause. We propose and evaluate two countermeasures against LeakyHammer and show that fundamentally mitigating LeakyHammer induces large overheads in highly RowHammer-vulnerable systems. We believe and hope our work can enable and aid future work on designing robust systems against RowHammer mitigation-based side and covert channels.

Paperid: 341, https://arxiv.org/pdf/2503.16749.pdf

Abstract:
Modern DRAM is vulnerable to read disturbance (e.g., RowHammer and RowPress) that significantly undermines the robust operation of the system. Repeatedly opening and closing a DRAM row (RowHammer) or keeping a DRAM row open for a long period of time (RowPress) induces bitflips in nearby unaccessed DRAM rows. Prior works on DRAM read disturbance either 1) perform experimental characterization using commercial-off-the-shelf (COTS) DRAM chips to demonstrate the high-level characteristics of the read disturbance bitflips, or 2) perform device-level simulations to understand the low-level error mechanisms of the read disturbance bitflips. In this paper, we attempt to align and cross-validate the real-chip experimental characterization results and state-of-the-art device-level studies of DRAM read disturbance. To do so, we first identify and extract the key bitflip characteristics of RowHammer and RowPress from the device-level error mechanisms studied in prior works. Then, we perform experimental characterization on 96 COTS DDR4 DRAM chips that directly match the data and access patterns studied in the device-level works. Through our experiments, we identify fundamental inconsistencies in the RowHammer and RowPress bitflip directions and access pattern dependence between experimental characterization results and the device-level error mechanisms. Based on our results, we hypothesize that either 1) the retention failure based DRAM architecture reverse-engineering methodologies do not fully work on modern DDR4 DRAM chips, or 2) existing device-level works do not fully uncover all the major read disturbance error mechanisms. We hope our findings inspire and enable future works to build a more fundamental and comprehensive understanding of DRAM read disturbance.

Paperid: 342, https://arxiv.org/pdf/2502.13075.pdf

Abstract:
Modern DRAM chips are subject to read disturbance errors. State-of-the-art read disturbance mitigations rely on accurate and exhaustive characterization of the read disturbance threshold (RDT) (e.g., the number of aggressor row activations needed to induce the first RowHammer or RowPress bitflip) of every DRAM row (of which there are millions or billions in a modern system) to prevent read disturbance bitflips securely and with low overhead. We experimentally demonstrate for the first time that the RDT of a DRAM row significantly and unpredictably changes over time. We call this new phenomenon variable read disturbance (VRD). Our experiments using 160 DDR4 chips and 4 HBM2 chips from three major manufacturers yield two key observations. First, it is very unlikely that relatively few RDT measurements can accurately identify the RDT of a DRAM row. The minimum RDT of a DRAM row appears after tens of thousands of measurements (e.g., up to 94,467), and the minimum RDT of a DRAM row is 3.5X smaller than the maximum RDT observed for that row. Second, the probability of accurately identifying a row's RDT with a relatively small number of measurements reduces with increasing chip density or smaller technology node size. Our empirical results have implications for the security guarantees of read disturbance mitigation techniques: if the RDT of a DRAM row is not identified accurately, these techniques can easily become insecure. We discuss and evaluate using a guardband for RDT and error-correcting codes for mitigating read disturbance bitflips in the presence of RDTs that change unpredictably over time. We conclude that a >10% guardband for the minimum observed RDT combined with SECDED or Chipkill-like SSC error-correcting codes could prevent read disturbance bitflips at the cost of large read disturbance mitigation performance overheads (e.g., 45% performance loss for an RDT guardband of 50%).

Paperid: 343, https://arxiv.org/pdf/2505.13862.pdf

Abstract:
Large language models (LLMs) have achieved remarkable capabilities but remain vulnerable to adversarial prompts known as jailbreaks, which can bypass safety alignment and elicit harmful outputs. Despite growing efforts in LLM safety research, existing evaluations are often fragmented, focused on isolated attack or defense techniques, and lack systematic, reproducible analysis. In this work, we introduce PandaGuard, a unified and modular framework that models LLM jailbreak safety as a multi-agent system comprising attackers, defenders, and judges. Our framework implements 19 attack methods and 12 defense mechanisms, along with multiple judgment strategies, all within a flexible plugin architecture supporting diverse LLM interfaces, multiple interaction modes, and configuration-driven experimentation that enhances reproducibility and practical deployment. Built on this framework, we develop PandaBench, a comprehensive benchmark that evaluates the interactions between these attack/defense methods across 49 LLMs and various judgment approaches, requiring over 3 billion tokens to execute. Our extensive evaluation reveals key insights into model vulnerabilities, defense cost-performance trade-offs, and judge consistency. We find that no single defense is optimal across all dimensions and that judge disagreement introduces nontrivial variance in safety assessments. We release the code, configurations, and evaluation results to support transparent and reproducible research in LLM safety.

Paperid: 344, https://arxiv.org/pdf/2504.07761.pdf

Abstract:
Verifying the authenticity of identity documents (IDs) has become a critical challenge for real-life applications such as digital banking, crypto-exchanges, renting, etc. This study focuses on the topic of fake ID detection, covering several limitations in the field. In particular, there are no publicly available data from real IDs for proper research in this area, and most published studies rely on proprietary internal databases that are not available for privacy reasons. In order to advance this critical challenge of real data scarcity that makes it so difficult to advance the technology of machine learning-based fake ID detection, we introduce a new patch-based methodology that trades off privacy and performance, and propose a novel patch-wise approach for privacy-aware fake ID detection: FakeIDet. In our experiments, we explore: i) two levels of anonymization for an ID (i.e., fully- and pseudo-anonymized), and ii) different patch size configurations, varying the amount of sensitive data visible in the patch image. State-of-the-art methods, such as vision transformers and foundation models, are considered as backbones. Our results show that, on an unseen database (DLC-2021), our proposal for fake ID detection achieves 13.91% and 0% EERs at the patch and the whole ID level, showing a good generalization to other databases. In addition to the path-based methodology introduced and the new FakeIDet method based on it, another key contribution of our article is the release of the first publicly available database that contains 48,400 patches from real and fake IDs, called FakeIDet-db, together with the experimental framework.

Paperid: 345, https://arxiv.org/pdf/2506.14474.pdf

Abstract:
Large language models (LLMs) can be trained or fine-tuned on data obtained without the owner's consent. Verifying whether a specific LLM was trained on particular data instances or an entire dataset is extremely challenging. Dataset watermarking addresses this by embedding identifiable modifications in training data to detect unauthorized use. However, existing methods often lack stealth, making them relatively easy to detect and remove. In light of these limitations, we propose LexiMark, a novel watermarking technique designed for text and documents, which embeds synonym substitutions for carefully selected high-entropy words. Our method aims to enhance an LLM's memorization capabilities on the watermarked text without altering the semantic integrity of the text. As a result, the watermark is difficult to detect, blending seamlessly into the text with no visible markers, and is resistant to removal due to its subtle, contextually appropriate substitutions that evade automated and manual detection. We evaluated our method using baseline datasets from recent studies and seven open-source models: LLaMA-1 7B, LLaMA-3 8B, Mistral 7B, Pythia 6.9B, as well as three smaller variants from the Pythia family (160M, 410M, and 1B). Our evaluation spans multiple training settings, including continued pretraining and fine-tuning scenarios. The results demonstrate significant improvements in AUROC scores compared to existing methods, underscoring our method's effectiveness in reliably verifying whether unauthorized watermarked data was used in LLM training.

Paperid: 346, https://arxiv.org/pdf/2506.02859.pdf

Abstract:
Evaluating the security of multi-agent systems (MASs) powered by large language models (LLMs) is challenging, primarily because of the systems' complex internal dynamics and the evolving nature of LLM vulnerabilities. Traditional attack graph (AG) methods often lack the specific capabilities to model attacks on LLMs. This paper introduces AI-agent application Threat assessment with Attack Graphs (ATAG), a novel framework designed to systematically analyze the security risks associated with AI-agent applications. ATAG extends the MulVAL logic-based AG generation tool with custom facts and interaction rules to accurately represent AI-agent topologies, vulnerabilities, and attack scenarios. As part of this research, we also created the LLM vulnerability database (LVD) to initiate the process of standardizing LLM vulnerabilities documentation. To demonstrate ATAG's efficacy, we applied it to two multi-agent applications. Our case studies demonstrated the framework's ability to model and generate AGs for sophisticated, multi-step attack scenarios exploiting vulnerabilities such as prompt injection, excessive agency, sensitive information disclosure, and insecure output handling across interconnected agents. ATAG is an important step toward a robust methodology and toolset to help understand, visualize, and prioritize complex attack paths in multi-agent AI systems (MAASs). It facilitates proactive identification and mitigation of AI-agent threats in multi-agent applications.

Paperid: 347, https://arxiv.org/pdf/2506.01425.pdf

Abstract:
Although federated learning preserves training data within local privacy domains, the aggregated model parameters may still reveal private characteristics. This vulnerability stems from clients' limited training data, which predisposes models to overfitting. Such overfitting enables models to memorize distinctive patterns from training samples, thereby amplifying the success probability of privacy attacks like membership inference. To enhance visual privacy protection in FL, we present CSVAR(Channel-Wise Spatial Image Shuffling with Variance-Guided Adaptive Region Partitioning), a novel image shuffling framework to generate obfuscated images for secure data transmission and each training epoch, addressing both overfitting-induced privacy leaks and raw image transmission risks. CSVAR adopts region-variance as the metric to measure visual privacy sensitivity across image regions. Guided by this, CSVAR adaptively partitions each region into multiple blocks, applying fine-grained partitioning to privacy-sensitive regions with high region-variances for enhancing visual privacy protection and coarse-grained partitioning to privacy-insensitive regions for balancing model utility. In each region, CSVAR then shuffles between blocks in both the spatial domains and chromatic channels to hide visual spatial features and disrupt color distribution. Experimental evaluations conducted on diverse real-world datasets demonstrate that CSVAR is capable of generating visually obfuscated images that exhibit high perceptual ambiguity to human eyes, simultaneously mitigating the effectiveness of adversarial data reconstruction attacks and achieving a good trade-off between visual privacy protection and model utility.

Paperid: 348, https://arxiv.org/pdf/2505.06701.pdf

Abstract:
SIEM systems serve as a critical hub, employing rule-based logic to detect and respond to threats. Redundant or overlapping rules in SIEM systems lead to excessive false alerts, degrading analyst performance due to alert fatigue, and increase computational overhead and response latency for actual threats. As a result, optimizing SIEM rule sets is essential for efficient operations. Despite the importance of such optimization, research in this area is limited, with current practices relying on manual optimization methods that are both time-consuming and error-prone due to the scale and complexity of enterprise-level rule sets. To address this gap, we present RuleGenie, a novel large language model (LLM) aided recommender system designed to optimize SIEM rule sets. Our approach leverages transformer models' multi-head attention capabilities to generate SIEM rule embeddings, which are then analyzed using a similarity matching algorithm to identify the top-k most similar rules. The LLM then processes the rules identified, utilizing its information extraction, language understanding, and reasoning capabilities to analyze rule similarity, evaluate threat coverage and performance metrics, and deliver optimized recommendations for refining the rule set. By automating the rule optimization process, RuleGenie allows security teams to focus on more strategic tasks while enhancing the efficiency of SIEM systems and strengthening organizations' security posture. We evaluated RuleGenie on a comprehensive set of real-world SIEM rule formats, including Splunk, Sigma, and AQL (Ariel query language), demonstrating its platform-agnostic capabilities and adaptability across diverse security infrastructures. Our experimental results show that RuleGenie can effectively identify redundant rules, which in turn decreases false positive rates and enhances overall rule efficiency.

Paperid: 349, https://arxiv.org/pdf/2505.01816.pdf

Abstract:
The Open Radio Access Network (O-RAN) architecture is revolutionizing cellular networks with its open, multi-vendor design and AI-driven management, aiming to enhance flexibility and reduce costs. Although it has many advantages, O-RAN is not threat-free. While previous studies have mainly examined vulnerabilities arising from O-RAN's intelligent components, this paper is the first to focus on the security challenges and vulnerabilities introduced by transitioning from single-operator to multi-operator RAN architectures. This shift increases the risk of untrusted third-party operators managing different parts of the network. To explore these vulnerabilities and their potential mitigation, we developed an open-access testbed environment that integrates a wireless network simulator with the official O-RAN Software Community (OSC) RAN intelligent component (RIC) cluster. This environment enables realistic, live data collection and serves as a platform for demonstrating APATE (adversarial perturbation against traffic efficiency), an evasion attack in which a malicious cell manipulates its reported key performance indicators (KPIs) and deceives the O-RAN traffic steering to gain unfair allocations of user equipment (UE). To ensure that O-RAN's legitimate activity continues, we introduce MARRS (monitoring adversarial RAN reports), a detection framework based on a long-short term memory (LSTM) autoencoder (AE) that learns contextual features across the network to monitor malicious telemetry (also demonstrated in our testbed). Our evaluation showed that by executing APATE, an attacker can obtain a 248.5% greater UE allocation than it was supposed to in a benign scenario. In addition, the MARRS detection method was also shown to successfully classify malicious cell activity, achieving accuracy of 99.2% and an F1 score of 0.978.

Paperid: 350, https://arxiv.org/pdf/2504.20472.pdf

Abstract:
Large language models (LLMs) have demonstrated impressive performance and have come to dominate the field of natural language processing (NLP) across various tasks. However, due to their strong instruction-following capabilities and inability to distinguish between instructions and data content, LLMs are vulnerable to prompt injection attacks. These attacks manipulate LLMs into deviating from the original input instructions and executing maliciously injected instructions within data content, such as web documents retrieved from search engines. Existing defense methods, including prompt-engineering and fine-tuning approaches, typically instruct models to follow the original input instructions while suppressing their tendencies to execute injected instructions. However, our experiments reveal that suppressing instruction-following tendencies is challenging. Through analyzing failure cases, we observe that although LLMs tend to respond to any recognized instructions, they are aware of which specific instructions they are executing and can correctly reference them within the original prompt. Motivated by these findings, we propose a novel defense method that leverages, rather than suppresses, the instruction-following abilities of LLMs. Our approach prompts LLMs to generate responses that include both answers and their corresponding instruction references. Based on these references, we filter out answers not associated with the original input instructions. Comprehensive experiments demonstrate that our method outperforms prompt-engineering baselines and achieves performance comparable to fine-tuning methods, reducing the attack success rate (ASR) to 0 percent in some scenarios. Moreover, our approach has minimal impact on overall utility.

Paperid: 351, https://arxiv.org/pdf/2503.23866.pdf

Abstract:
This paper investigates backdoor attacks in image-oriented semantic communications. The threat of backdoor attacks on symbol reconstruction in semantic communication (SemCom) systems has received limited attention. Previous research on backdoor attacks targeting SemCom symbol reconstruction primarily focuses on input-level triggers, which are impractical in scenarios with strict input constraints. In this paper, we propose a novel channel-triggered backdoor attack (CT-BA) framework that exploits inherent wireless channel characteristics as activation triggers. Our key innovation involves utilizing fundamental channel statistics parameters, specifically channel gain with different fading distributions or channel noise with different power, as potential triggers. This approach enhances stealth by eliminating explicit input manipulation, provides flexibility through trigger selection from diverse channel conditions, and enables automatic activation via natural channel variations without adversary intervention. We extensively evaluate CT-BA across four joint source-channel coding (JSCC) communication system architectures and three benchmark datasets. Simulation results demonstrate that our attack achieves near-perfect attack success rate (ASR) while maintaining effective stealth. Finally, we discuss potential defense mechanisms against such attacks.

Paperid: 352, https://arxiv.org/pdf/2502.16580.pdf

Abstract:
Prompt injection attacks manipulate large language models (LLMs) by misleading them to deviate from the original input instructions and execute maliciously injected instructions, because of their instruction-following capabilities and inability to distinguish between the original input instructions and maliciously injected instructions. To defend against such attacks, recent studies have developed various detection mechanisms. If we restrict ourselves specifically to works which perform detection rather than direct defense, most of them focus on direct prompt injection attacks, while there are few works for the indirect scenario, where injected instructions are indirectly from external tools, such as a search engine. Moreover, current works mainly investigate injection detection methods and pay less attention to the post-processing method that aims to mitigate the injection after detection. In this paper, we investigate the feasibility of detecting and removing indirect prompt injection attacks, and we construct a benchmark dataset for evaluation. For detection, we assess the performance of existing LLMs and open-source detection models, and we further train detection models using our crafted training datasets. For removal, we evaluate two intuitive methods: (1) the segmentation removal method, which segments the injected document and removes parts containing injected instructions, and (2) the extraction removal method, which trains an extraction model to identify and remove injected instructions.

Paperid: 353, https://arxiv.org/pdf/2502.02342.pdf

Abstract:
Advanced persistent threats (APTs) are sophisticated cyber attacks that can remain undetected for extended periods, making their mitigation particularly challenging. Given their persistence, significant effort is required to detect them and respond effectively. Existing provenance-based attack detection methods often lack interpretability and suffer from high false positive rates, while investigation approaches are either supervised or limited to known attacks. To address these challenges, we introduce SHIELD, a novel approach that combines statistical anomaly detection and graph-based analysis with the contextual analysis capabilities of large language models (LLMs). SHIELD leverages the implicit knowledge of LLMs to uncover hidden attack patterns in provenance data, while reducing false positives and providing clear, interpretable attack descriptions. This reduces analysts' alert fatigue and makes it easier for them to understand the threat landscape. Our extensive evaluation demonstrates SHIELD's effectiveness and computational efficiency in real-world scenarios. SHIELD was shown to outperform state-of-the-art methods, achieving higher precision and recall. SHIELD's integration of anomaly detection, LLM-driven contextual analysis, and advanced graph-based correlation establishes a new benchmark for APT detection.

Paperid: 354, https://arxiv.org/pdf/2502.02337.pdf

Abstract:
The growing frequency of cyberattacks has heightened the demand for accurate and efficient threat detection systems. SIEM platforms are important for analyzing log data and detecting adversarial activities through rule-based queries, also known as SIEM rules. The efficiency of the threat analysis process relies heavily on mapping these SIEM rules to the relevant attack techniques in the MITRE ATT&CK framework. Inaccurate annotation of SIEM rules can result in the misinterpretation of attacks, increasing the likelihood that threats will be overlooked. Existing solutions for annotating SIEM rules with MITRE ATT&CK technique labels have notable limitations: manual annotation of SIEM rules is both time-consuming and prone to errors, and ML-based approaches mainly focus on annotating unstructured free text sources rather than structured data like SIEM rules. Structured data often contains limited information, further complicating the annotation process and making it a challenging task. To address these challenges, we propose Rule-ATT&CK Mapper (RAM), a novel framework that leverages LLMs to automate the mapping of structured SIEM rules to MITRE ATT&CK techniques. RAM's multi-stage pipeline, which was inspired by the prompt chaining technique, enhances mapping accuracy without requiring LLM pre-training or fine-tuning. Using the Splunk Security Content dataset, we evaluate RAM's performance using several LLMs, including GPT-4-Turbo, Qwen, IBM Granite, and Mistral. Our evaluation highlights GPT-4-Turbo's superior performance, which derives from its enriched knowledge base, and an ablation study emphasizes the importance of external contextual knowledge in overcoming the limitations of LLMs' implicit knowledge for domain-specific tasks. These findings demonstrate RAM's potential in automating cybersecurity workflows and provide valuable insights for future advancements in this field.

Paperid: 355, https://arxiv.org/pdf/2501.16962.pdf

Abstract:
Modern computing systems rely on the Unified Extensible Firmware Interface (UEFI), which has replaced the traditional BIOS as the firmware standard for the modern boot process. Despite the advancements, UEFI is increasingly targeted by threat actors seeking to exploit its execution environment and take advantage of its persistence mechanisms. While some security-related analysis of UEFI components has been performed--primarily via debugging and runtime behavior testing--to the best of our knowledge, no prior study has specifically addressed capturing and analyzing volatile UEFI runtime memory to detect malicious exploitation during the pre-OS phase. This gap in UEFI forensic tools limits the ability to conduct in-depth security analyses in pre-OS environments. Such a gap is especially surprising, given that memory forensics is widely regarded as foundational to modern incident response, reflected by the popularity of above-OS memory analysis frameworks, such as Rekall, Volatility, and MemProcFS. To address the lack of below-OS memory forensics, we introduce a framework for UEFI memory forensics. The proposed framework consists of two primary components: UefiMemDump, a memory acquisition tool, and UEFIDumpAnalysis, an extendable collection of analysis modules capable of detecting malicious activities such as function pointer hooking, inline hooking, and malicious image loading. Our proof-of-concept implementation demonstrates our framework's ability to detect modern UEFI threats, such as ThunderStrike, CosmicStrand, and Glupteba bootkits. By providing an open-source solution, our work enables researchers and practitioners to investigate firmware-level threats, develop additional analysis modules, and advance overall below-OS security through UEFI memory analysis.

Paperid: 356, https://arxiv.org/pdf/2501.08454.pdf

Abstract:
Large language models (LLMs) have become essential tools for digital task assistance. Their training relies heavily on the collection of vast amounts of data, which may include copyright-protected or sensitive information. Recent studies on detecting pretraining data in LLMs have primarily focused on sentence- or paragraph-level membership inference attacks (MIAs), usually involving probability analysis of the target model's predicted tokens. However, these methods often exhibit poor accuracy, failing to account for the semantic importance of textual content and word significance. To address these shortcomings, we propose Tag&Tab, a novel approach for detecting data used in LLM pretraining. Our method leverages established natural language processing (NLP) techniques to tag keywords in the input text, a process we term Tagging. Then, the LLM is used to obtain probabilities for these keywords and calculate their average log-likelihood to determine input text membership, a process we refer to as Tabbing. Our experiments on four benchmark datasets (BookMIA, MIMIR, PatentMIA, and the Pile) and several open-source LLMs of varying sizes demonstrate an average increase in AUC scores ranging from 5.3% to 17.6% over state-of-the-art methods. Tag&Tab not only sets a new standard for data leakage detection in LLMs, but its outstanding performance is a testament to the importance of words in MIAs on LLMs.

Paperid: 357, https://arxiv.org/pdf/2501.08258.pdf

Abstract:
The traditional learning process of patch-based adversarial attacks, conducted in the digital domain and then applied in the physical domain (e.g., via printed stickers), may suffer from reduced performance due to adversarial patches' limited transferability from the digital domain to the physical domain. Given that previous studies have considered using projectors to apply adversarial attacks, we raise the following question: can adversarial learning (i.e., patch generation) be performed entirely in the physical domain with a projector? In this work, we propose the Physical-domain Adversarial Patch Learning Augmentation (PAPLA) framework, a novel end-to-end (E2E) framework that converts adversarial learning from the digital domain to the physical domain using a projector. We evaluate PAPLA across multiple scenarios, including controlled laboratory settings and realistic outdoor environments, demonstrating its ability to ensure attack success compared to conventional digital learning-physical application (DL-PA) methods. We also analyze the impact of environmental factors, such as projection surface color, projector strength, ambient light, distance, and angle of the target object relative to the camera, on the effectiveness of projected patches. Finally, we demonstrate the feasibility of the attack against a parked car and a stop sign in a real-world outdoor environment. Our results show that under specific conditions, E2E adversarial learning in the physical domain eliminates the transferability issue and ensures evasion by object detectors. Finally, we provide insights into the challenges and opportunities of applying adversarial learning in the physical domain and explain where such an approach is more effective than using a sticker.

Paperid: 358, https://arxiv.org/pdf/2506.16981.pdf

Abstract:
End-point monitoring solutions are widely deployed in today's enterprise environments to support advanced attack detection and investigation. These monitors continuously record system-level activities as audit logs and provide deep visibility into security events. Unfortunately, existing methods of semantic analysis based on audit logs have low granularity, only reaching the system call level, making it difficult to effectively classify highly covert behaviors. Additionally, existing works mainly match audit log streams with rule knowledge bases describing behaviors, which heavily rely on expertise and lack the ability to detect unknown attacks and provide interpretive descriptions. In this paper, we propose SmartGuard, an automated method that combines abstracted behaviors from audit event semantics with large language models. SmartGuard extracts specific behaviors (function level) from incoming system logs and constructs a knowledge graph, divides events by threads, and combines event summaries with graph embeddings to achieve information diagnosis and provide explanatory narratives through large language models. Our evaluation shows that SmartGuard achieves an average F1 score of 96\% in assessing malicious behaviors and demonstrates good scalability across multiple models and unknown attacks. It also possesses excellent fine-tuning capabilities, allowing experts to assist in timely system updates.

Paperid: 359, https://arxiv.org/pdf/2505.20945.pdf

Abstract:
Incident response plays a pivotal role in mitigating the impact of cyber attacks. In recent years, the intensity and complexity of global cyber threats have grown significantly, making it increasingly challenging for traditional threat detection and incident response methods to operate effectively in complex network environments. While Large Language Models (LLMs) have shown great potential in early threat detection, their capabilities remain limited when it comes to automated incident response after an intrusion. To address this gap, we construct an incremental benchmark based on real-world incident response tasks to thoroughly evaluate the performance of LLMs in this domain. Our analysis reveals several key challenges that hinder the practical application of contemporary LLMs, including context loss, hallucinations, privacy protection concerns, and their limited ability to provide accurate, context-specific recommendations. In response to these challenges, we propose IRCopilot, a novel framework for automated incident response powered by LLMs. IRCopilot mimics the three dynamic phases of a real-world incident response team using four collaborative LLM-based session components. These components are designed with clear divisions of responsibility, reducing issues such as hallucinations and context loss. Our method leverages diverse prompt designs and strategic responsibility segmentation, significantly improving the system's practicality and efficiency. Experimental results demonstrate that IRCopilot outperforms baseline LLMs across key benchmarks, achieving sub-task completion rates of 150%, 138%, 136%, 119%, and 114% for various response tasks. Moreover, IRCopilot exhibits robust performance on public incident response platforms and in real-world attack scenarios, showcasing its strong applicability.

Paperid: 360, https://arxiv.org/pdf/2505.12871.pdf

Abstract:
Low rank adaptation (LoRA) has emerged as a prominent technique for fine-tuning large language models (LLMs) thanks to its superb efficiency gains over previous methods. While extensive studies have examined the performance and structural properties of LoRA, its behavior upon training-time attacks remain underexplored, posing significant security risks. In this paper, we theoretically investigate the security implications of LoRA's low-rank structure during fine-tuning, in the context of its robustness against data poisoning and backdoor attacks. We propose an analytical framework that models LoRA's training dynamics, employs the neural tangent kernel to simplify the analysis of the training process, and applies information theory to establish connections between LoRA's low rank structure and its vulnerability against training-time attacks. Our analysis indicates that LoRA exhibits better robustness to backdoor attacks than full fine-tuning, while becomes more vulnerable to untargeted data poisoning due to its over-simplified information geometry. Extensive experimental evaluations have corroborated our theoretical findings.

Paperid: 361, https://arxiv.org/pdf/2505.05103.pdf

Abstract:
Large Language Models (LLMs) have achieved remarkable success across a wide range of applications. However, individual LLMs often produce inconsistent, biased, or hallucinated outputs due to limitations in their training corpora and model architectures. Recently, collaborative frameworks such as the Multi-LLM Network (MultiLLMN) have been introduced, enabling multiple LLMs to interact and jointly respond to user queries. Nevertheless, MultiLLMN architectures raise critical concerns regarding the reliability and security of the generated content, particularly in open environments where malicious or compromised LLMs may be present. Moreover, reliance on centralized coordination undermines system efficiency and introduces single points of failure. In this paper, we propose a novel Trusted MultiLLMN framework, driven by a Weighted Byzantine Fault Tolerance (WBFT) blockchain consensus mechanism, to ensure the reliability, security, and efficiency of multi-LLM collaboration. In WBFT, voting weights are adaptively assigned to each LLM based on its response quality and trustworthiness, incentivizing reliable behavior, and reducing the impact of malicious nodes. Extensive simulations demonstrate that WBFT significantly improves both consensus security and efficiency compared to classical and modern consensus mechanisms, particularly under wireless network conditions. Furthermore, our evaluations reveal that Trusted MultiLLMN supported by WBFT can deliver higher-quality and more credible responses than both single LLMs and conventional MultiLLMNs, thereby providing a promising path toward building robust, decentralized AI collaboration networks.

Paperid: 362, https://arxiv.org/pdf/2505.04416.pdf

Abstract:
Large language models (LLMs) trained over extensive corpora risk memorizing sensitive, copyrighted, or toxic content. To address this, we propose \textbf{OBLIVIATE}, a robust unlearning framework that removes targeted data while preserving model utility. The framework follows a structured process: extracting target tokens, building retain sets, and fine-tuning with a tailored loss function comprising three components -- masking, distillation, and world fact. Using low-rank adapters (LoRA) ensures efficiency without compromising unlearning quality. We conduct experiments on multiple datasets, including Harry Potter series, WMDP, and TOFU, using a comprehensive suite of metrics: \emph{forget quality} (via a new document-level memorization score), \emph{model utility}, and \emph{fluency}. Results demonstrate its effectiveness in resisting membership inference attacks, minimizing the impact on retained data, and maintaining robustness across diverse scenarios.

Paperid: 363, https://arxiv.org/pdf/2504.21228.pdf

Abstract:
Large Language Models (LLMs) are identified as being susceptible to indirect prompt injection attack, where the model undesirably deviates from user-provided instructions by executing tasks injected in the prompt context. This vulnerability stems from LLMs' inability to distinguish between data and instructions within a prompt. In this paper, we propose CachePrune that defends against this attack by identifying and pruning task-triggering neurons from the KV cache of the input prompt context. By pruning such neurons, we encourage the LLM to treat the text spans of input prompt context as only pure data, instead of any indicator of instruction following. These neurons are identified via feature attribution with a loss function induced from an upperbound of the Direct Preference Optimization (DPO) objective. We show that such a loss function enables effective feature attribution with only a few samples. We further improve on the quality of feature attribution, by exploiting an observed triggering effect in instruction following. Our approach does not impose any formatting on the original prompt or introduce extra test-time LLM calls. Experiments show that CachePrune significantly reduces attack success rates without compromising the response quality. Note: This paper aims to defend against indirect prompt injection attacks, with the goal of developing more secure and robust AI systems.

Paperid: 364, https://arxiv.org/pdf/2504.20376.pdf

Abstract:
Modern text-to-image (T2I) generation systems (e.g., DALL$\cdot$E 3) exploit the memory mechanism, which captures key information in multi-turn interactions for faithful generation. Despite its practicality, the security analyses of this mechanism have fallen far behind. In this paper, we reveal that it can exacerbate the risk of jailbreak attacks. Previous attacks fuse the unsafe target prompt into one ultimate adversarial prompt, which can be easily detected or lead to the generation of non-unsafe images due to under- or over-detoxification. In contrast, we propose embedding the malice at the inception of the chat session in memory, addressing the above limitations. Specifically, we propose Inception, the first multi-turn jailbreak attack against real-world text-to-image generation systems that explicitly exploits their memory mechanisms. Inception is composed of two key modules: segmentation and recursion. We introduce Segmentation, a semantic-preserving method that generates multi-round prompts. By leveraging NLP analysis techniques, we design policies to decompose a prompt, together with its malicious intent, according to sentence structure, thereby evading safety filters. Recursion further addresses the challenge posed by unsafe sub-prompts that cannot be separated through simple segmentation. It firstly expands the sub-prompt, then invokes segmentation recursively. To facilitate multi-turn adversarial prompts crafting, we build VisionFlow, an emulation T2I system that integrates two-stage safety filters and industrial-grade memory mechanisms. The experiment results show that Inception successfully allures unsafe image generation, surpassing the SOTA by a 20.0\% margin in attack success rate. We also conduct experiments on the real-world commercial T2I generation platforms, further validating the threats of Inception in practice.

Paperid: 365, https://arxiv.org/pdf/2504.17523.pdf

Abstract:
Local Differential Privacy (LDP) is the predominant privacy model for safeguarding individual data privacy. Existing perturbation mechanisms typically require perturbing the original values to ensure acceptable privacy, which inevitably results in value distortion and utility deterioration. In this work, we propose an alternative approach -- instead of perturbing values, we apply randomization to indexes of values while ensuring rigorous LDP guarantees. Inspired by the deniability of randomized indexes, we present CRIAD for answering subset counting queries on set-value data. By integrating a multi-dummy, multi-sample, and multi-group strategy, CRIAD serves as a fully scalable solution that offers flexibility across various privacy requirements and domain sizes, and achieves more accurate query results than any existing methods. Through comprehensive theoretical analysis and extensive experimental evaluations, we validate the effectiveness of CRIAD and demonstrate its superiority over traditional value-perturbation mechanisms.

Paperid: 366, https://arxiv.org/pdf/2504.14993.pdf

Abstract:
Stream data from real-time distributed systems such as IoT, tele-health, and crowdsourcing has become an important data source. However, the collection and analysis of user-generated stream data raise privacy concerns due to the potential exposure of sensitive information. To address these concerns, local differential privacy (LDP) has emerged as a promising standard. Nevertheless, applying LDP to stream data presents significant challenges, as stream data often involves a large or even infinite number of values. Allocating a given privacy budget across these data points would introduce overwhelming LDP noise to the original stream data. Beyond existing approaches that merely use perturbed values for estimating statistics, our design leverages them for both perturbation and estimation. This dual utilization arises from a key observation: each user knows their own ground truth and perturbed values, enabling a precise computation of the deviation error caused by perturbation. By incorporating this deviation into the perturbation process of subsequent values, the previous noise can be calibrated. Following this insight, we introduce the Iterative Perturbation Parameterization (IPP) method, which utilizes current perturbed results to calibrate the subsequent perturbation process. To enhance the robustness of calibration and reduce sensitivity, two algorithms, namely Accumulated Perturbation Parameterization (APP) and Clipped Accumulated Perturbation Parameterization (CAPP) are further developed. We prove that these three algorithms satisfy $w$-event differential privacy while significantly improving utility. Experimental results demonstrate that our techniques outperform state-of-the-art LDP stream publishing solutions in terms of utility, while retaining the same privacy guarantee.

Paperid: 367, https://arxiv.org/pdf/2504.13526.pdf

Abstract:
Item mining, a fundamental task for collecting statistical data from users, has raised increasing privacy concerns. To address these concerns, local differential privacy (LDP) was proposed as a privacy-preserving technique. Existing LDP item mining mechanisms primarily concentrate on global statistics, i.e., those from the entire dataset. Nevertheless, they fall short of user-tailored tasks such as personalized recommendations, whereas classwise statistics can improve task accuracy with fine-grained information. Meanwhile, the introduction of class labels brings new challenges. Label perturbation may result in invalid items for aggregation. To this end, we propose frameworks for multi-class item mining, along with two mechanisms: validity perturbation to reduce the impact of invalid data, and correlated perturbation to preserve the relationship between labels and items. We also apply these optimized methods to two multi-class item mining queries: frequency estimation and top-$k$ item mining. Through theoretical analysis and extensive experiments, we verify the effectiveness and superiority of these methods.

Paperid: 368, https://arxiv.org/pdf/2503.22330.pdf

Abstract:
Invisible Image Watermarking is crucial for ensuring content provenance and accountability in generative AI. While Gen-AI providers are increasingly integrating invisible watermarking systems, the robustness of these schemes against forgery attacks remains poorly characterized. This is critical, as forging traceable watermarks onto illicit content leads to false attribution, potentially harming the reputation and legal standing of Gen-AI service providers who are not responsible for the content. In this work, we propose WMCopier, an effective watermark forgery attack that operates without requiring any prior knowledge of or access to the target watermarking algorithm. Our approach first models the target watermark distribution using an unconditional diffusion model, and then seamlessly embeds the target watermark into a non-watermarked image via a shallow inversion process. We also incorporate an iterative optimization procedure that refines the reconstructed image to further trade off the fidelity and forgery efficiency. Experimental results demonstrate that WMCopier effectively deceives both open-source and closed-source watermark systems (e.g., Amazon's system), achieving a significantly higher success rate than existing methods. Additionally, we evaluate the robustness of forged samples and discuss the potential defenses against our attack.

Paperid: 369, https://arxiv.org/pdf/2503.21426.pdf

Abstract:
The skip-gram model (SGM), which employs a neural network to generate node vectors, serves as the basis for numerous popular graph embedding techniques. However, since the training datasets contain sensitive linkage information, the parameters of a released SGM may encode private information and pose significant privacy risks. Differential privacy (DP) is a rigorous standard for protecting individual privacy in data analysis. Nevertheless, when applying differential privacy to skip-gram in graphs, it becomes highly challenging due to the complex link relationships, which potentially result in high sensitivity and necessitate substantial noise injection. To tackle this challenge, we present AdvSGM, a differentially private skip-gram for graphs via adversarial training. Our core idea is to leverage adversarial training to privatize skip-gram while improving its utility. Towards this end, we develop a novel adversarial training module by devising two optimizable noise terms that correspond to the parameters of a skip-gram. By fine-tuning the weights between modules within AdvSGM, we can achieve differentially private gradient updates without additional noise injection. Extensive experimental results on six real-world graph datasets show that AdvSGM preserves high data utility across different downstream tasks.

Paperid: 370, https://arxiv.org/pdf/2503.08297.pdf

Abstract:
Local Differential Privacy (LDP) has emerged as a widely adopted privacy-preserving technique in modern data analytics, enabling users to share statistical insights while maintaining robust privacy guarantees. However, current LDP applications assume a single service gathering perturbed information from users. In reality, multiple services may be interested in collecting users' data, which poses privacy burdens to users as more such services emerge. To address this issue, this paper proposes a framework for collecting and aggregating data based on perturbed information from multiple services, regardless of their estimated statistics (e.g., mean or distribution) and perturbation mechanisms. Then for mean estimation, we introduce the Unbiased Averaging (UA) method and its optimized version, User-level Weighted Averaging (UWA). The former utilizes biased perturbed data, while the latter assigns weights to different perturbed results based on perturbation information, thereby achieving minimal variance. For distribution estimation, we propose the User-level Likelihood Estimation (ULE), which treats all perturbed results from a user as a whole for maximum likelihood estimation. Experimental results demonstrate that our framework and constituting methods significantly improve the accuracy of both mean and distribution estimation.

Paperid: 371, https://arxiv.org/pdf/2502.20178.pdf

Abstract:
Unmanned aerial vehicles (UAVs) are increasingly employed to perform high-risk tasks that require minimal human intervention. However, UAVs face escalating cybersecurity threats, particularly from GNSS spoofing attacks. While previous studies have extensively investigated the impacts of GNSS spoofing on UAVs, few have focused on its effects on specific tasks. Moreover, the influence of UAV motion states on the assessment of network security risks is often overlooked. To address these gaps, we first provide a detailed evaluation of how motion states affect the effectiveness of network attacks. We demonstrate that nonlinear motion states not only enhance the effectiveness of position spoofing in GNSS spoofing attacks but also reduce the probability of speed-related attack detection. Building upon this, we propose a state-triggered backdoor attack method (SSD) to deceive GNSS systems and assess its risk to trajectory planning tasks. Extensive validation of SSD's effectiveness and stealthiness is conducted. Experimental results show that, with appropriately tuned hyperparameters, SSD significantly increases positioning errors and the risk of task failure, while maintaining 100% stealth across three state-of-the-art detectors.

Paperid: 372, https://arxiv.org/pdf/2502.19070.pdf

Abstract:
Model Inversion (MI) attacks, which reconstruct the training dataset of neural networks, pose significant privacy concerns in machine learning. Recent MI attacks have managed to reconstruct realistic label-level private data, such as the general appearance of a target person from all training images labeled on him. Beyond label-level privacy, in this paper we show sample-level privacy, the private information of a single target sample, is also important but under-explored in the MI literature due to the limitations of existing evaluation metrics. To address this gap, this study introduces a novel metric tailored for training-sample analysis, namely, the Diversity and Distance Composite Score (DDCS), which evaluates the reconstruction fidelity of each training sample by encompassing various MI attack attributes. This, in turn, enhances the precision of sample-level privacy assessments. Leveraging DDCS as a new evaluative lens, we observe that many training samples remain resilient against even the most advanced MI attack. As such, we further propose a transfer learning framework that augments the generative capabilities of MI attackers through the integration of entropy loss and natural gradient descent. Extensive experiments verify the effectiveness of our framework on improving state-of-the-art MI attacks over various metrics including DDCS, coverage and FID. Finally, we demonstrate that DDCS can also be useful for MI defense, by identifying samples susceptible to MI attacks in an unsupervised manner.

Paperid: 373, https://arxiv.org/pdf/2502.18943.pdf

Abstract:
Membership Inference Attacks (MIAs) aim to predict whether a data sample belongs to the model's training set or not. Although prior research has extensively explored MIAs in Large Language Models (LLMs), they typically require accessing to complete output logits (\ie, \textit{logits-based attacks}), which are usually not available in practice. In this paper, we study the vulnerability of pre-trained LLMs to MIAs in the \textit{label-only setting}, where the adversary can only access generated tokens (text). We first reveal that existing label-only MIAs have minor effects in attacking pre-trained LLMs, although they are highly effective in inferring fine-tuning datasets used for personalized LLMs. We find that their failure stems from two main reasons, including better generalization and overly coarse perturbation. Specifically, due to the extensive pre-training corpora and exposing each sample only a few times, LLMs exhibit minimal robustness differences between members and non-members. This makes token-level perturbations too coarse to capture such differences. To alleviate these problems, we propose \textbf{PETAL}: a label-only membership inference attack based on \textbf{PE}r-\textbf{T}oken sem\textbf{A}ntic simi\textbf{L}arity. Specifically, PETAL leverages token-level semantic similarity to approximate output probabilities and subsequently calculate the perplexity. It finally exposes membership based on the common assumption that members are `better' memorized and have smaller perplexity. We conduct extensive experiments on the WikiMIA benchmark and the more challenging MIMIR benchmark. Empirically, our PETAL performs better than the extensions of existing label-only attacks against personalized LLMs and even on par with other advanced logit-based attacks across all metrics on five prevalent open-source LLMs.

Paperid: 374, https://arxiv.org/pdf/2501.18624.pdf

Abstract:
Vision-Language Models (VLMs), built on pre-trained vision encoders and large language models (LLMs), have shown exceptional multi-modal understanding and dialog capabilities, positioning them as catalysts for the next technological revolution. However, while most VLM research focuses on enhancing multi-modal interaction, the risks of data misuse and leakage have been largely unexplored. This prompts the need for a comprehensive investigation of such risks in VLMs. In this paper, we conduct the first analysis of misuse and leakage detection in VLMs through the lens of membership inference attack (MIA). In specific, we focus on the instruction tuning data of VLMs, which is more likely to contain sensitive or unauthorized information. To address the limitation of existing MIA methods, we introduce a novel approach that infers membership based on a set of samples and their sensitivity to temperature, a unique parameter in VLMs. Based on this, we propose four membership inference methods, each tailored to different levels of background knowledge, ultimately arriving at the most challenging scenario. Our comprehensive evaluations show that these methods can accurately determine membership status, e.g., achieving an AUC greater than 0.8 targeting a small set consisting of only 5 samples on LLaVA.

Paperid: 375, https://arxiv.org/pdf/2501.05835.pdf

Abstract:
Graph Neural Networks (GNNs) have achieved remarkable performance through their message-passing mechanism. However, recent studies have highlighted the vulnerability of GNNs to backdoor attacks, which can lead the model to misclassify graphs with attached triggers as the target class. The effectiveness of recent promising defense techniques, such as fine-tuning or distillation, is heavily contingent on having comprehensive knowledge of the sufficient training dataset. Empirical studies have shown that fine-tuning methods require a clean dataset of 20% to reduce attack accuracy to below 25%, while distillation methods require a clean dataset of 15%. However, obtaining such a large amount of clean data is commonly impractical. In this paper, we propose a practical backdoor mitigation framework, denoted as GRAPHNAD, which can capture high-quality intermediate-layer representations in GNNs to enhance the distillation process with limited clean data. To achieve this, we address the following key questions: How to identify the appropriate attention representations in graphs for distillation? How to enhance distillation with limited data? By adopting the graph attention transfer method, GRAPHNAD can effectively align the intermediate-layer attention representations of the backdoored model with that of the teacher model, forcing the backdoor neurons to transform into benign ones. Besides, we extract the relation maps from intermediate-layer transformation and enforce the relation maps of the backdoored model to be consistent with that of the teacher model, thereby ensuring model accuracy while further reducing the influence of backdoors. Extensive experimental results show that by fine-tuning a teacher model with only 3% of the clean data, GRAPHNAD can reduce the attack success rate to below 5%.

Paperid: 376, https://arxiv.org/pdf/2501.02354.pdf

Abstract:
The objective of privacy-preserving synthetic graph publishing is to safeguard individuals' privacy while retaining the utility of original data. Most existing methods focus on graph neural networks under differential privacy (DP), and yet two fundamental problems in generating synthetic graphs remain open. First, the current research often encounters high sensitivity due to the intricate relationships between nodes in a graph. Second, DP is usually achieved through advanced composition mechanisms that tend to converge prematurely when working with a small privacy budget. In this paper, inspired by the simplicity, effectiveness, and ease of analysis of PageRank, we design PrivDPR, a novel privacy-preserving deep PageRank for graph synthesis. In particular, we achieve DP by adding noise to the gradient for a specific weight during learning. Utilizing weight normalization as a bridge, we theoretically reveal that increasing the number of layers in PrivDPR can effectively mitigate the high sensitivity and privacy budget splitting. Through formal privacy analysis, we prove that the synthetic graph generated by PrivDPR satisfies node-level DP. Experiments on real-world graph datasets show that PrivDPR preserves high data utility across multiple graph structural properties.

Paperid: 377, https://arxiv.org/pdf/2502.07813.pdf

Abstract:
The compositional reasoning capacity has long been regarded as critical to the generalization and intelligence emergence of large language models LLMs. However, despite numerous reasoning-related benchmarks, the compositional reasoning capacity of LLMs is rarely studied or quantified in the existing benchmarks. In this paper, we introduce CryptoX, an evaluation framework that, for the first time, combines existing benchmarks and cryptographic, to quantify the compositional reasoning capacity of LLMs. Building upon CryptoX, we construct CryptoBench, which integrates these principles into several benchmarks for systematic evaluation. We conduct detailed experiments on widely used open-source and closed-source LLMs using CryptoBench, revealing a huge gap between open-source and closed-source LLMs. We further conduct thorough mechanical interpretability experiments to reveal the inner mechanism of LLMs' compositional reasoning, involving subproblem decomposition, subproblem inference, and summarizing subproblem conclusions. Through analysis based on CryptoBench, we highlight the value of independently studying compositional reasoning and emphasize the need to enhance the compositional reasoning capabilities of LLMs.

Paperid: 378, https://arxiv.org/pdf/2506.04681.pdf

Abstract:
We introduce $Urania$, a novel framework for generating insights about LLM chatbot interactions with rigorous differential privacy (DP) guarantees. The framework employs a private clustering mechanism and innovative keyword extraction methods, including frequency-based, TF-IDF-based, and LLM-guided approaches. By leveraging DP tools such as clustering, partition selection, and histogram-based summarization, $Urania$ provides end-to-end privacy protection. Our evaluation assesses lexical and semantic content preservation, pair similarity, and LLM-based metrics, benchmarking against a non-private Clio-inspired pipeline (Tamkin et al., 2024). Moreover, we develop a simple empirical privacy evaluation that demonstrates the enhanced robustness of our DP pipeline. The results show the framework's ability to extract meaningful conversational insights while maintaining stringent user privacy, effectively balancing data utility with privacy preservation.

Paperid: 379, https://arxiv.org/pdf/2505.11548.pdf

Abstract:
Large Language Models (LLMs) enhanced with Retrieval-Augmented Generation (RAG) have shown improved performance in generating accurate responses. However, the dependence on external knowledge bases introduces potential security vulnerabilities, particularly when these knowledge bases are publicly accessible and modifiable. While previous studies have exposed knowledge poisoning risks in RAG systems, existing attack methods suffer from critical limitations: they either require injecting multiple poisoned documents (resulting in poor stealthiness) or can only function effectively on simplistic queries (limiting real-world applicability). This paper reveals a more realistic knowledge poisoning attack against RAG systems that achieves successful attacks by poisoning only a single document while remaining effective for complex multi-hop questions involving complex relationships between multiple elements. Our proposed AuthChain address three challenges to ensure the poisoned documents are reliably retrieved and trusted by the LLM, even against large knowledge bases and LLM's own knowledge. Extensive experiments across six popular LLMs demonstrate that AuthChain achieves significantly higher attack success rates while maintaining superior stealthiness against RAG defense mechanisms compared to state-of-the-art baselines.

Paperid: 380, https://arxiv.org/pdf/2504.08813.pdf

Abstract:
The rapid advancement of multi-modal large reasoning models (MLRMs) -- enhanced versions of multimodal language models (MLLMs) equipped with reasoning capabilities -- has revolutionized diverse applications. However, their safety implications remain underexplored. While prior work has exposed critical vulnerabilities in unimodal reasoning models, MLRMs introduce distinct risks from cross-modal reasoning pathways. This work presents the first systematic safety analysis of MLRMs through large-scale empirical studies comparing MLRMs with their base MLLMs. Our experiments reveal three critical findings: (1) The Reasoning Tax: Acquiring reasoning capabilities catastrophically degrades inherited safety alignment. MLRMs exhibit 37.44% higher jailbreaking success rates than base MLLMs under adversarial attacks. (2) Safety Blind Spots: While safety degradation is pervasive, certain scenarios (e.g., Illegal Activity) suffer 25 times higher attack rates -- far exceeding the average 3.4 times increase, revealing scenario-specific vulnerabilities with alarming cross-model and datasets consistency. (3) Emergent Self-Correction: Despite tight reasoning-answer safety coupling, MLRMs demonstrate nascent self-correction -- 16.9% of jailbroken reasoning steps are overridden by safe answers, hinting at intrinsic safeguards. These findings underscore the urgency of scenario-aware safety auditing and mechanisms to amplify MLRMs' self-correction potential. To catalyze research, we open-source OpenSafeMLRM, the first toolkit for MLRM safety evaluation, providing unified interface for mainstream models, datasets, and jailbreaking methods. Our work calls for immediate efforts to harden reasoning-augmented AI, ensuring its transformative potential aligns with ethical safeguards.

Paperid: 381, https://arxiv.org/pdf/2503.01865.pdf

Abstract:
Jailbreaking attacks can effectively induce unsafe behaviors in Large Language Models (LLMs); however, the transferability of these attacks across different models remains limited. This study aims to understand and enhance the transferability of gradient-based jailbreaking methods, which are among the standard approaches for attacking white-box models. Through a detailed analysis of the optimization process, we introduce a novel conceptual framework to elucidate transferability and identify superfluous constraints-specifically, the response pattern constraint and the token tail constraint-as significant barriers to improved transferability. Removing these unnecessary constraints substantially enhances the transferability and controllability of gradient-based attacks. Evaluated on Llama-3-8B-Instruct as the source model, our method increases the overall Transfer Attack Success Rate (T-ASR) across a set of target models with varying safety levels from 18.4% to 50.3%, while also improving the stability and controllability of jailbreak behaviors on both source and target models.

Paperid: 382, https://arxiv.org/pdf/2502.08889.pdf

Abstract:
User-level differentially private stochastic convex optimization (DP-SCO) has garnered significant attention due to the paramount importance of safeguarding user privacy in modern large-scale machine learning applications. Current methods, such as those based on differentially private stochastic gradient descent (DP-SGD), often struggle with high noise accumulation and suboptimal utility due to the need to privatize every intermediate iterate. In this work, we introduce a novel linear-time algorithm that leverages robust statistics, specifically the median and trimmed mean, to overcome these challenges. Our approach uniquely bounds the sensitivity of all intermediate iterates of SGD with gradient estimation based on robust statistics, thereby significantly reducing the gradient estimation noise for privacy purposes and enhancing the privacy-utility trade-off. By sidestepping the repeated privatization required by previous methods, our algorithm not only achieves an improved theoretical privacy-utility trade-off but also maintains computational efficiency. We complement our algorithm with an information-theoretic lower bound, showing that our upper bound is optimal up to logarithmic factors and the dependence on $Îµ$. This work sets the stage for more robust and efficient privacy-preserving techniques in machine learning, with implications for future research and application in the field.

Paperid: 383, https://arxiv.org/pdf/2501.18914.pdf

Abstract:
Scaling laws have emerged as important components of large language model (LLM) training as they can predict performance gains through scale, and provide guidance on important hyper-parameter choices that would otherwise be expensive. LLMs also rely on large, high-quality training datasets, like those sourced from (sometimes sensitive) user data. Training models on this sensitive user data requires careful privacy protections like differential privacy (DP). However, the dynamics of DP training are significantly different, and consequently their scaling laws are not yet fully understood. In this work, we establish scaling laws that accurately model the intricacies of DP LLM training, providing a complete picture of the compute-privacy-utility tradeoffs and the optimal training configurations in many settings.

Paperid: 384, https://arxiv.org/pdf/2501.03544.pdf

Abstract:
Recent text-to-image (T2I) models have exhibited remarkable performance in generating high-quality images from text descriptions. However, these models are vulnerable to misuse, particularly generating not-safe-for-work (NSFW) content, such as sexually explicit, violent, political, and disturbing images, raising serious ethical concerns. In this work, we present PromptGuard, a novel content moderation technique that draws inspiration from the system prompt mechanism in large language models (LLMs) for safety alignment. Unlike LLMs, T2I models lack a direct interface for enforcing behavioral guidelines. Our key idea is to optimize a safety soft prompt that functions as an implicit system prompt within the T2I model's textual embedding space. This universal soft prompt (P*) directly moderates NSFW inputs, enabling safe yet realistic image generation without altering the inference efficiency or requiring proxy models. We further enhance its reliability and helpfulness through a divide-and-conquer strategy, which optimizes category-specific soft prompts and combines them into holistic safety guidance. Extensive experiments across five datasets demonstrate that PromptGuard effectively mitigates NSFW content generation while preserving high-quality benign outputs. PromptGuard achieves 3.8 times faster than prior content moderation methods, surpassing eight state-of-the-art defenses with an optimal unsafe ratio down to 5.84%.

Paperid: 385, https://arxiv.org/pdf/2503.08968.pdf

Abstract:
Homomorphic encryption (HE) allows secure computation on encrypted data without revealing the original data, providing significant benefits for privacy-sensitive applications. Many cloud computing applications (e.g., DNA read mapping, biometric matching, web search) use exact string matching as a key operation. However, prior string matching algorithms that use homomorphic encryption are limited by high computational latency caused by the use of complex operations and data movement bottlenecks due to the large encrypted data size. In this work, we provide an efficient algorithm-hardware codesign to accelerate HE-based secure exact string matching. We propose CIPHERMATCH, which (i) reduces the increase in memory footprint after encryption using an optimized software-based data packing scheme, (ii) eliminates the use of costly homomorphic operations (e.g., multiplication and rotation), and (iii) reduces data movement by designing a new in-flash processing (IFP) architecture. We demonstrate the benefits of CIPHERMATCH using two case studies: (1) Exact DNA string matching and (2) encrypted database search. Our pure software-based CIPHERMATCH implementation that uses our memory-efficient data packing scheme improves performance and reduces energy consumption by 42.9X and 17.6X, respectively, compared to the state-of-the-art software baseline. Integrating CIPHERMATCH with IFP improves performance and reduces energy consumption by 136.9X and 256.4X, respectively, compared to the software-based CIPHERMATCH implementation.

Paperid: 386, https://arxiv.org/pdf/2502.00693.pdf

Abstract:
The Bloom filter is a simple yet space-efficient probabilistic data structure that supports membership queries for dramatically large datasets. It is widely utilized and implemented across various industrial scenarios, often handling massive datasets that include sensitive user information necessitating privacy preservation. To address the challenge of maintaining privacy within the Bloom filter, we have developed the DPBloomfilter. This innovation integrates the classical differential privacy mechanism, specifically the Random Response technique, into the Bloom filter, offering robust privacy guarantees under the same running complexity as the standard Bloom filter. Through rigorous simulation experiments, we have demonstrated that our DPBloomfilter algorithm maintains high utility while ensuring privacy protections. To the best of our knowledge, this is the first work to provide differential privacy guarantees for the Bloom filter for membership query problems.

Paperid: 387, https://arxiv.org/pdf/2501.07251.pdf

Abstract:
Crafting adversarial examples is crucial for evaluating and enhancing the robustness of Deep Neural Networks (DNNs), presenting a challenge equivalent to maximizing a non-differentiable 0-1 loss function. However, existing single objective methods, namely adversarial attacks focus on a surrogate loss function, do not fully harness the benefits of engaging multiple loss functions, as a result of insufficient understanding of their synergistic and conflicting nature. To overcome these limitations, we propose the Multi-Objective Set-based Attack (MOS Attack), a novel adversarial attack framework leveraging multiple loss functions and automatically uncovering their interrelations. The MOS Attack adopts a set-based multi-objective optimization strategy, enabling the incorporation of numerous loss functions without additional parameters. It also automatically mines synergistic patterns among various losses, facilitating the generation of potent adversarial attacks with fewer objectives. Extensive experiments have shown that our MOS Attack outperforms single-objective attacks. Furthermore, by harnessing the identified synergistic patterns, MOS Attack continues to show superior results with a reduced number of loss functions.

Paperid: 388, https://arxiv.org/pdf/2504.11990.pdf

Abstract:
Transfer learning from pre-trained encoders has become essential in modern machine learning, enabling efficient model adaptation across diverse tasks. However, this combination of pre-training and downstream adaptation creates an expanded attack surface, exposing models to sophisticated backdoor embeddings at both the encoder and dataset levels--an area often overlooked in prior research. Additionally, the limited computational resources typically available to users of pre-trained encoders constrain the effectiveness of generic backdoor defenses compared to end-to-end training from scratch. In this work, we investigate how to mitigate potential backdoor risks in resource-constrained transfer learning scenarios. Specifically, we conduct an exhaustive analysis of existing defense strategies, revealing that many follow a reactive workflow based on assumptions that do not scale to unknown threats, novel attack types, or different training paradigms. In response, we introduce a proactive mindset focused on identifying clean elements and propose the Trusted Core (T-Core) Bootstrapping framework, which emphasizes the importance of pinpointing trustworthy data and neurons to enhance model security. Our empirical evaluations demonstrate the effectiveness and superiority of T-Core, specifically assessing 5 encoder poisoning attacks, 7 dataset poisoning attacks, and 14 baseline defenses across five benchmark datasets, addressing four scenarios of 3 potential backdoor threats.

Paperid: 389, https://arxiv.org/pdf/2502.16091.pdf

Abstract:
Edge inference (EI) has emerged as a promising paradigm to address the growing limitations of cloud-based Deep Neural Network (DNN) inference services, such as high response latency, limited scalability, and severe data privacy exposure. However, deploying DNN models on resource-constrained edge devices introduces additional challenges, including limited computation/storage resources, dynamic service demands, and heightened privacy risks. To tackle these issues, this paper presents a novel privacy-aware optimization framework that jointly addresses DNN model deployment, user-server association, and model partitioning, with the goal of minimizing long-term average inference delay under resource and privacy constraints. The problem is formulated as a complex, NP-hard stochastic optimization. To efficiently handle system dynamics and computational complexity, we employ a Lyapunov-based approach to transform the long-term objective into tractable per-slot decisions. Furthermore, we introduce a coalition formation game to enable adaptive user-server association and design a greedy algorithm for model deployment within each coalition. Extensive simulations demonstrate that the proposed algorithm significantly reduces inference delay and consistently satisfies privacy constraints, outperforming state-of-the-art baselines across diverse scenarios.

Paperid: 390, https://arxiv.org/pdf/2506.03765.pdf

Abstract:
Adversarial detection protects models from adversarial attacks by refusing suspicious test samples. However, current detection methods often suffer from weak generalization: their effectiveness tends to degrade significantly when applied to adversarially trained models rather than naturally trained ones, and they generally struggle to achieve consistent effectiveness across both white-box and black-box attack settings. In this work, we observe that an auxiliary model, differing from the primary model in training strategy or model architecture, tends to assign low confidence to the primary model's predictions on adversarial examples (AEs), while preserving high confidence on normal examples (NEs). Based on this discovery, we propose Prediction Inconsistency Detector (PID), a lightweight and generalizable detection framework to distinguish AEs from NEs by capturing the prediction inconsistency between the primal and auxiliary models. PID is compatible with both naturally and adversarially trained primal models and outperforms four detection methods across 3 white-box, 3 black-box, and 1 mixed adversarial attacks. Specifically, PID achieves average AUC scores of 99.29\% and 99.30\% on CIFAR-10 when the primal model is naturally and adversarially trained, respectively, and 98.31% and 96.81% on ImageNet under the same conditions, outperforming existing SOTAs by 4.70%$\sim$25.46%.

Paperid: 391, https://arxiv.org/pdf/2505.12981.pdf

Abstract:
The growing adoption of large language models (LLMs) has led to a new paradigm in mobile computing--LLM-powered mobile AI agents--capable of decomposing and automating complex tasks directly on smartphones. However, the security implications of these agents remain largely unexplored. In this paper, we present the first comprehensive security analysis of mobile LLM agents, encompassing three representative categories: System-level AI Agents developed by original equipment manufacturers (e.g., YOYO Assistant), Third-party Universal Agents (e.g., Zhipu AI AutoGLM), and Emerging Agent Frameworks (e.g., Alibaba Mobile Agent). We begin by analyzing the general workflow of mobile agents and identifying security threats across three core capability dimensions: language-based reasoning, GUI-based interaction, and system-level execution. Our analysis reveals 11 distinct attack surfaces, all rooted in the unique capabilities and interaction patterns of mobile LLM agents, and spanning their entire operational lifecycle. To investigate these threats in practice, we introduce AgentScan, a semi-automated security analysis framework that systematically evaluates mobile LLM agents across all 11 attack scenarios. Applying AgentScan to nine widely deployed agents, we uncover a concerning trend: every agent is vulnerable to targeted attacks. In the most severe cases, agents exhibit vulnerabilities across eight distinct attack vectors. These attacks can cause behavioral deviations, privacy leakage, or even full execution hijacking. Based on these findings, we propose a set of defensive design principles and practical recommendations for building secure mobile LLM agents. Our disclosures have received positive feedback from two major device vendors. Overall, this work highlights the urgent need for standardized security practices in the fast-evolving landscape of LLM-driven mobile automation.

Paperid: 392, https://arxiv.org/pdf/2505.03425.pdf

Abstract:
Directed greybox fuzzing (DGF) focuses on efficiently reaching specific program locations or triggering particular behaviors, making it essential for tasks like vulnerability detection and crash reproduction. However, existing methods often suffer from path explosion and randomness in input mutation, leading to inefficiencies in exploring and exploiting target paths. In this paper, we propose HGFuzzer, an automatic framework that leverages the large language model (LLM) to address these challenges. HGFuzzer transforms path constraint problems into targeted code generation tasks, systematically generating test harnesses and reachable inputs to reduce unnecessary exploration paths significantly. Additionally, we implement custom mutators designed specifically for target functions, minimizing randomness and improving the precision of directed fuzzing. We evaluated HGFuzzer on 20 real-world vulnerabilities, successfully triggering 17, including 11 within the first minute, achieving a speedup of at least 24.8x compared to state-of-the-art directed fuzzers. Furthermore, HGFuzzer discovered 9 previously unknown vulnerabilities, all of which were assigned CVE IDs, demonstrating the effectiveness of our approach in identifying real-world vulnerabilities.

Paperid: 393, https://arxiv.org/pdf/2505.02502.pdf

Abstract:
Large language models (LLMs) are increasingly deployed through open-source and commercial frameworks, enabling individuals and organizations to self-host advanced LLM capabilities. As LLM deployments become prevalent, particularly in industry, ensuring their secure and reliable operation has become a critical issue. However, insecure defaults and misconfigurations often expose LLM services to the public internet, posing serious security and system engineering risks. This study conducted a large-scale empirical investigation of public-facing LLM deployments, focusing on the prevalence of services, exposure characteristics, systemic vulnerabilities, and associated risks. Through internet-wide measurements, we identified 320,102 public-facing LLM services across 15 frameworks and extracted 158 unique API endpoints, categorized into 12 functional groups based on functionality and security risk. Our analysis found that over 40% of endpoints used plain HTTP, and over 210,000 endpoints lacked valid TLS metadata. API exposure was highly inconsistent: some frameworks, such as Ollama, responded to over 35% of unauthenticated API requests, with about 15% leaking model or system information, while other frameworks implemented stricter controls. We observed widespread use of insecure protocols, poor TLS configurations, and unauthenticated access to critical operations. These security risks, such as model leakage, system compromise, and unauthorized access, are pervasive and highlight the need for a secure-by-default framework and stronger deployment practices.

Paperid: 394, https://arxiv.org/pdf/2504.20536.pdf

Abstract:
Blockchain technology has revolutionized the way transactions are executed, but scalability remains a major challenge. Payment Channel Network (PCN), as a Layer-2 scaling solution, has been proposed to address this issue. However, skewed payments can deplete the balance of one party within a channel, restricting the ability of PCNs to transact through a path and subsequently reducing the transaction success rate. To address this issue, the technology of rebalancing has been proposed. However, existing rebalancing strategies in PCNs are limited in their capacity and efficiency. Cycle-based approaches only address rebalancing within groups of nodes that form a cycle network, while non-cycle-based approaches face high complexity of on-chain operations and limitations on rebalancing capacity. In this study, we propose Starfish, a rebalancing approach that captures the star-shaped network structure to provide high rebalancing efficiency and large channel capacity. Starfish requires only $N$-time on-chain operations to connect independent channels and aggregate the total budget of all channels. To demonstrate the correctness and advantages of our method, we provide a formal security proof of the Starfish protocol and conduct comparative experiments with existing rebalancing techniques.

Paperid: 395, https://arxiv.org/pdf/2504.13474.pdf

Abstract:
Large Language Models are a promising tool for automated vulnerability detection, thanks to their success in code generation and repair. However, despite widespread adoption, a critical question remains: Are LLMs truly effective at detecting real-world vulnerabilities? Current evaluations, which often assess models on isolated functions or files, ignore the broader execution and data-flow context essential for understanding vulnerabilities. This oversight leads to two types of misleading outcomes: incorrect conclusions and flawed rationales, collectively undermining the reliability of prior assessments. Therefore, in this paper, we challenge three widely held community beliefs: that LLMs are (i) unreliable, (ii) insensitive to code patches, and (iii) performance-plateaued across model scales. We argue that these beliefs are artifacts of context-deprived evaluations. To address this, we propose CORRECT (Context-Rich Reasoning Evaluation of Code with Trust), a new evaluation framework that systematically incorporates contextual information into LLM-based vulnerability detection. We construct a context-rich dataset of 2,000 vulnerable-patched program pairs spanning 99 CWEs and evaluate 13 LLMs across four model families. Our framework elicits both binary predictions and natural-language rationales, which are further validated using LLM-as-a-judge techniques. Our findings overturn existing misconceptions. When provided with sufficient context, SOTA LLMs achieve significantly improved performance (e.g., 0.7 F1-score on key CWEs), with 0.8 precision. We show that most false positives stem from reasoning errors rather than misclassification, and that while model and test-time scaling improve performance, they introduce diminishing returns and trade-offs in recall. Finally, we uncover new flaws in current LLM-based detection systems, such as limited generalization and overthinking biases.

Paperid: 396, https://arxiv.org/pdf/2503.23278.pdf

Abstract:
The Model Context Protocol (MCP) is a standardized interface designed to enable seamless interaction between AI models and external tools and resources, breaking down data silos and facilitating interoperability across diverse systems. This paper provides a comprehensive overview of MCP, focusing on its core components, workflow, and the lifecycle of MCP servers, which consists of three key phases: creation, operation, and update. We analyze the security and privacy risks associated with each phase and propose strategies to mitigate potential threats. The paper also examines the current MCP landscape, including its adoption by industry leaders and various use cases, as well as the tools and platforms supporting its integration. We explore future directions for MCP, highlighting the challenges and opportunities that will influence its adoption and evolution within the broader AI ecosystem. Finally, we offer recommendations for MCP stakeholders to ensure its secure and sustainable development as the AI landscape continues to evolve.

Paperid: 397, https://arxiv.org/pdf/2503.21279.pdf

Abstract:
Asynchronous Byzantine fault-tolerant (BFT) consensus protocols, known for their robustness in unpredictable environments without relying on timing assumptions, are becoming increasingly vital for wireless applications. While these protocols have proven effective in wired networks, their adaptation to wireless environments presents significant challenges. Asynchronous BFT consensus, characterized by its N parallel consensus components (e.g., asynchronous Byzantine agreement, reliable broadcast), suffers from high message complexity, leading to network congestion and inefficiency, especially in resource-constrained wireless networks. Asynchronous Byzantine agreement (ABA) protocols, a foundational component of asynchronous BFT, require careful balancing of message complexity and cryptographic overhead to achieve efficient implementation in wireless settings. Additionally, the absence of dedicated testbeds for asynchronous wireless BFT consensus protocols hinders development and performance evaluation. To address these challenges, we propose a consensus batching protocol (ConsensusBatcher), which supports both vertical and horizontal batching of multiple parallel consensus components. We leverage ConsensusBatcher to adapt three asynchronous BFT consensus protocols (HoneyBadgerBFT, BEAT, and Dumbo) from wired networks to resource-constrained wireless networks. To evaluate the performance of ConsensusBatcher-enabled consensus protocols in wireless environments, we develop and open-source a testbed for deployment and performance assessment of these protocols. Using this testbed, we demonstrate that ConsensusBatcher-based consensus reduces latency by 48% to 59% and increases throughput by 48% to 62% compared to baseline consensus protocols.

Paperid: 398, https://arxiv.org/pdf/2503.12058.pdf

Abstract:
Backdoor attacks typically place a specific trigger on certain training data, such that the model makes prediction errors on inputs with that trigger during inference. Despite the core role of the trigger, existing studies have commonly believed a perfect match between training-inference triggers is optimal. In this paper, for the first time, we systematically explore the training-inference trigger relation, particularly focusing on their mismatch, based on a Training-Inference Trigger Intensity Manipulation (TITIM) workflow. TITIM specifically investigates the training-inference trigger intensity, such as the size or the opacity of a trigger, and reveals new insights into trigger generalization and overfitting. These new insights challenge the above common belief by demonstrating that the training-inference trigger mismatch can facilitate attacks in two practical scenarios, posing more significant security threats than previously thought. First, when the inference trigger is fixed, using training triggers with mixed intensities leads to stronger attacks than using any single intensity. For example, on CIFAR-10 with ResNet-18, mixing training triggers with 1.0 and 0.1 opacities improves the worst-case attack success rate (ASR) (over different testing opacities) of the best single-opacity attack from 10.61\% to 92.77\%. Second, intentionally using certain mismatched training-inference triggers can improve the attack stealthiness, i.e., better bypassing defenses. For example, compared to the training/inference intensity of 1.0/1.0, using 1.0/0.7 decreases the area under the curve (AUC) of the Scale-Up defense from 0.96 to 0.62, while maintaining a high attack ASR (99.65\% vs. 91.62\%). The above new insights are validated to be generalizable across different backdoor attacks, models, datasets, tasks, and (digital/physical) domains.

Paperid: 399, https://arxiv.org/pdf/2502.12497.pdf

Abstract:
Large Language Models (LLMs) transform artificial intelligence, driving advancements in natural language understanding, text generation, and autonomous systems. The increasing complexity of their development and deployment introduces significant security challenges, particularly within the LLM supply chain. However, existing research primarily focuses on content safety, such as adversarial attacks, jailbreaking, and backdoor attacks, while overlooking security vulnerabilities in the underlying software systems. To address this gap, this study systematically analyzes 529 vulnerabilities reported across 75 prominent projects spanning 13 lifecycle stages. The findings show that vulnerabilities are concentrated in the application (50.3%) and model (42.7%) layers, with improper resource control (45.7%) and improper neutralization (25.1%) identified as the leading root causes. Additionally, while 56.7% of the vulnerabilities have available fixes, 8% of these patches are ineffective, resulting in recurring vulnerabilities. This study underscores the challenges of securing the LLM ecosystem and provides actionable insights to guide future research and mitigation strategies.

Paperid: 400, https://arxiv.org/pdf/2502.11550.pdf

Abstract:
Cloud-based outsourced Location-based services have profound impacts on various aspects of people's lives but bring security concerns. Existing spatio-temporal data secure retrieval schemes have significant shortcomings regarding dynamic updates, either compromising privacy through leakage during updates (forward insecurity) or incurring excessively high update costs that hinder practical application. Under these circumstances, we first propose a basic filter-based spatio-temporal range query scheme \TrinityI that supports low-cost dynamic updates and automatic expansion. Furthermore, to improve security, reduce storage cost, and false positives, we propose a forward secure and verifiable scheme \TrinityII that simultaneously minimizes storage overhead. A formal security analysis proves that \TrinityI and \TrinityII are Indistinguishable under Selective Chosen-Plaintext Attack (IND-SCPA). Finally, extensive experiments demonstrate that our design \TrinityII significantly reduces storage requirements by 80\%, enables data retrieval at the 1 million-record level in just 0.01 seconds, and achieves 10 $\times$ update efficiency than state-of-art.

Paperid: 401, https://arxiv.org/pdf/2503.04636.pdf

Abstract:
As open-source large language models (LLMs) like Llama3 become more capable, it is crucial to develop watermarking techniques to detect their potential misuse. Existing watermarking methods either add watermarks during LLM inference, which is unsuitable for open-source LLMs, or primarily target classification LLMs rather than recent generative LLMs. Adapting these watermarks to open-source LLMs for misuse detection remains an open challenge. This work defines two misuse scenarios for open-source LLMs: intellectual property (IP) violation and LLM Usage Violation. Then, we explore the application of inference-time watermark distillation and backdoor watermarking in these contexts. We propose comprehensive evaluation methods to assess the impact of various real-world further fine-tuning scenarios on watermarks and the effect of these watermarks on LLM performance. Our experiments reveal that backdoor watermarking could effectively detect IP Violation, while inference-time watermark distillation is applicable in both scenarios but less robust to further fine-tuning and has a more significant impact on LLM performance compared to backdoor watermarking. Exploring more advanced watermarking methods for open-source LLMs to detect their misuse should be an important future direction.

Paperid: 402, https://arxiv.org/pdf/2506.05401.pdf

Abstract:
Large visual language models (LVLMs) have demonstrated excellent instruction-following capabilities, yet remain vulnerable to stealthy backdoor attacks when finetuned using contaminated data. Existing backdoor defense techniques are usually developed for single-modal visual or language models under fully parameter-adjustable settings or rely on supervisory knowledge during training. However, in real-world scenarios, defenders cannot modify frozen visual encoders or core LLM parameters, nor possess prior knowledge of unknown trigger patterns or target responses. Motivated by the empirical finding that LVLMs readily overfit to fixed, unknown triggers, which can embed malicious associations during adapter-level tuning, we aim to design a defense that operates without access to core weights or attack priors. To this end, we introduce a lightweight, certified-agnostic defense framework, Robust Instruction Tuning, that finetunes only adapter modules and text embedding layers under instruction tuning. Our method integrates two complementary regularizations: (1) Input Diversity Regularization, which perturbs trigger components across training samples to disrupt consistent spurious cues; and (2) Anomalous Activation Regularization, which dynamically sparses adapter weights exhibiting abnormally sharp activations linked to backdoor patterns. These mechanisms jointly guide the model toward learning semantically grounded representations rather than memorizing superficial trigger-response mappings. Extensive experiments against seven attacks on Flickr30k and MSCOCO demonstrate that ours reduces their attack success rate to nearly zero, with an increase in training cost of less than 15%.

Paperid: 403, https://arxiv.org/pdf/2504.19093.pdf

Abstract:
Large language models (LLMs) have demonstrated remarkable capabilities, especially the recent advancements in reasoning, such as o1 and o3, pushing the boundaries of AI. Despite these impressive achievements in mathematics and coding, the reasoning abilities of LLMs in domains requiring cryptographic expertise remain underexplored. In this paper, we introduce CipherBank, a comprehensive benchmark designed to evaluate the reasoning capabilities of LLMs in cryptographic decryption tasks. CipherBank comprises 2,358 meticulously crafted problems, covering 262 unique plaintexts across 5 domains and 14 subdomains, with a focus on privacy-sensitive and real-world scenarios that necessitate encryption. From a cryptographic perspective, CipherBank incorporates 3 major categories of encryption methods, spanning 9 distinct algorithms, ranging from classical ciphers to custom cryptographic techniques. We evaluate state-of-the-art LLMs on CipherBank, e.g., GPT-4o, DeepSeek-V3, and cutting-edge reasoning-focused models such as o1 and DeepSeek-R1. Our results reveal significant gaps in reasoning abilities not only between general-purpose chat LLMs and reasoning-focused LLMs but also in the performance of current reasoning-focused models when applied to classical cryptographic decryption tasks, highlighting the challenges these models face in understanding and manipulating encrypted data. Through detailed analysis and error investigations, we provide several key observations that shed light on the limitations and potential improvement areas for LLMs in cryptographic reasoning. These findings underscore the need for continuous advancements in LLM reasoning capabilities.

Paperid: 404, https://arxiv.org/pdf/2503.19499.pdf

Abstract:
Steganography embeds confidential data within seemingly innocuous communications. Provable security in steganography, a long-sought goal, has become feasible with deep generative models. However, existing methods face a critical trade-off between security and efficiency. This paper introduces SparSamp, an efficient provably secure steganography method based on sparse sampling. SparSamp embeds messages by combining them with pseudo-random numbers to obtain message-derived random numbers for sampling. It enhances extraction accuracy and embedding capacity by increasing the sampling intervals and making the sampling process sparse. SparSamp preserves the original probability distribution of the generative model, thus ensuring security. It introduces only $O(1)$ additional complexity per sampling step, enabling the fastest embedding speed without compromising generation speed. SparSamp is designed to be plug-and-play; message embedding can be achieved by simply replacing the sampling component of an existing generative model with SparSamp. We implemented SparSamp in text, image, and audio generation models. It can achieve embedding speeds of up to 755 bits/second with GPT-2, 5046 bits/second with DDPM, and 9,223 bits/second with WaveRNN.

Paperid: 405, https://arxiv.org/pdf/2506.09702.pdf

Abstract:
Mapping National Vulnerability Database (NVD) records to vulnerability-fixing commits (VFCs) is crucial for vulnerability analysis but challenging due to sparse explicit links in NVD references.This study explores this mapping's feasibility through an empirical approach. Manual analysis of NVD references showed Git references enable over 86% success, while non-Git references achieve under 14%. Using these findings, we built an automated pipeline extracting 31,942 VFCs from 20,360 NVD records (8.7% of 235,341) with 87% precision, mainly from Git references. To fill gaps, we mined six external security databases, yielding 29,254 VFCs for 18,985 records (8.1%) at 88.4% precision, and GitHub repositories, adding 3,686 VFCs for 2,795 records (1.2%) at 73% precision. Combining these, we mapped 26,710 unique records (11.3% coverage) from 7,634 projects, with overlap between NVD and external databases, plus unique GitHub contributions. Despite success with Git references, 88.7% of records remain unmapped, highlighting the difficulty without Git links. This study offers insights for enhancing vulnerability datasets and guiding future automated security research.

Paperid: 406, https://arxiv.org/pdf/2505.23828.pdf

Abstract:
With the rapid development of the Vision-Language Model (VLM), significant progress has been made in Visual Question Answering (VQA) tasks. However, existing VLM often generate inaccurate answers due to a lack of up-to-date knowledge. To address this issue, recent research has introduced Retrieval-Augmented Generation (RAG) techniques, commonly used in Large Language Models (LLM), into VLM, incorporating external multi-modal knowledge to enhance the accuracy and practicality of VLM systems. Nevertheless, the RAG in LLM may be susceptible to data poisoning attacks. RAG-based VLM may also face the threat of this attack. This paper first reveals the vulnerabilities of the RAG-based large model under poisoning attack, showing that existing single-modal RAG poisoning attacks have a 100\% failure rate in multi-modal RAG scenarios. To address this gap, we propose Spa-VLM (Stealthy Poisoning Attack on RAG-based VLM), a new paradigm for poisoning attacks on large models. We carefully craft malicious multi-modal knowledge entries, including adversarial images and misleading text, which are then injected into the RAG's knowledge base. When users access the VLM service, the system may generate misleading outputs. We evaluate Spa-VLM on two Wikipedia datasets and across two different RAGs. Results demonstrate that our method achieves highly stealthy poisoning, with the attack success rate exceeding 0.8 after injecting just 5 malicious entries into knowledge bases with 100K and 2M entries, outperforming state-of-the-art poisoning attacks designed for RAG-based LLMs. Additionally, we evaluated several defense mechanisms, all of which ultimately proved ineffective against Spa-VLM, underscoring the effectiveness and robustness of our attack.

Paperid: 407, https://arxiv.org/pdf/2505.19663.pdf

Abstract:
We introduce the Robust Audio Watermarking Benchmark (RAW-Bench), a benchmark for evaluating deep learning-based audio watermarking methods with standardized and systematic comparisons. To simulate real-world usage, we introduce a comprehensive audio attack pipeline with various distortions such as compression, background noise, and reverberation, along with a diverse test dataset including speech, environmental sounds, and music recordings. Evaluating four existing watermarking methods on RAW-bench reveals two main insights: (i) neural compression techniques pose the most significant challenge, even when algorithms are trained with such compressions; and (ii) training with audio attacks generally improves robustness, although it is insufficient in some cases. Furthermore, we find that specific distortions, such as polarity inversion, time stretching, or reverb, seriously affect certain methods. The evaluation framework is accessible at github.com/SonyResearch/raw_bench.

Paperid: 408, https://arxiv.org/pdf/2504.04222.pdf

Abstract:
Machine learning (ML) powered network traffic analysis has been widely used for the purpose of threat detection. Unfortunately, their generalization across different tasks and unseen data is very limited. Large language models (LLMs), known for their strong generalization capabilities, have shown promising performance in various domains. However, their application to the traffic analysis domain is limited due to significantly different characteristics of network traffic. To address the issue, in this paper, we propose TrafficLLM, which introduces a dual-stage fine-tuning framework to learn generic traffic representation from heterogeneous raw traffic data. The framework uses traffic-domain tokenization, dual-stage tuning pipeline, and extensible adaptation to help LLM release generalization ability on dynamic traffic analysis tasks, such that it enables traffic detection and traffic generation across a wide range of downstream tasks. We evaluate TrafficLLM across 10 distinct scenarios and 229 types of traffic. TrafficLLM achieves F1-scores of 0.9875 and 0.9483, with up to 80.12% and 33.92% better performance than existing detection and generation methods. It also shows strong generalization on unseen traffic with an 18.6% performance improvement. We further evaluate TrafficLLM in real-world scenarios. The results confirm that TrafficLLM is easy to scale and achieves accurate detection performance on enterprise traffic.

Paperid: 409, https://arxiv.org/pdf/2501.12622.pdf

Abstract:
Website fingerprinting enables an eavesdropper to determine which websites a user is visiting over an encrypted connection. State-of-the-art website fingerprinting (WF) attacks have demonstrated effectiveness even against Tor-protected network traffic. However, existing WF attacks have critical limitations on accurately identifying websites in multi-tab browsing sessions, where the holistic pattern of individual websites is no longer preserved, and the number of tabs opened by a client is unknown a priori. In this paper, we propose ARES, a novel WF framework natively designed for multi-tab WF attacks. ARES formulates the multi-tab attack as a multi-label classification problem and solves it using the novel Transformer-based models. Specifically, ARES extracts local patterns based on multi-level traffic aggregation features and utilizes the improved self-attention mechanism to analyze the correlations between these local patterns, effectively identifying websites. We implement a prototype of ARES and extensively evaluate its effectiveness using our large-scale datasets collected over multiple months. The experimental results illustrate that ARES achieves optimal performance in several realistic scenarios. Further, ARES remains robust even against various WF defenses.

Paperid: 410, https://arxiv.org/pdf/2506.01825.pdf

Abstract:
Code LLMs are increasingly employed in software development. However, studies have shown that they are vulnerable to backdoor attacks: when a trigger (a specific input pattern) appears in the input, the backdoor will be activated and cause the model to generate malicious outputs. Researchers have designed various triggers and demonstrated the feasibility of implanting backdoors by poisoning a fraction of the training data. Some basic conclusions have been made, such as backdoors becoming easier to implant when more training data are modified. However, existing research has not explored other factors influencing backdoor attacks on Code LLMs, such as training batch size, epoch number, and the broader design space for triggers, e.g., trigger length. To bridge this gap, we use code summarization as an example to perform an empirical study that systematically investigates the factors affecting backdoor effectiveness and understands the extent of the threat posed. Three categories of factors are considered: data, model, and inference, revealing previously overlooked findings. We find that the prevailing consensus -- that attacks are ineffective at extremely low poisoning rates -- is incorrect. The absolute number of poisoned samples matters as well. Specifically, poisoning just 20 out of 454K samples (0.004\% poisoning rate -- far below the minimum setting of 0.1\% in prior studies) successfully implants backdoors! Moreover, the common defense is incapable of removing even a single poisoned sample from it. Additionally, small batch sizes increase the risk of backdoor attacks. We also uncover other critical factors such as trigger types, trigger length, and the rarity of tokens in the triggers, leading to valuable insights for assessing Code LLMs' vulnerability to backdoor attacks. Our study highlights the urgent need for defense mechanisms against extremely low poisoning rate settings.

Paperid: 411, https://arxiv.org/pdf/2505.18773.pdf

Abstract:
State-of-the-art membership inference attacks (MIAs) typically require training many reference models, making it difficult to scale these attacks to large pre-trained language models (LLMs). As a result, prior research has either relied on weaker attacks that avoid training reference models (e.g., fine-tuning attacks), or on stronger attacks applied to small-scale models and datasets. However, weaker attacks have been shown to be brittle - achieving close-to-arbitrary success - and insights from strong attacks in simplified settings do not translate to today's LLMs. These challenges have prompted an important question: are the limitations observed in prior work due to attack design choices, or are MIAs fundamentally ineffective on LLMs? We address this question by scaling LiRA - one of the strongest MIAs - to GPT-2 architectures ranging from 10M to 1B parameters, training reference models on over 20B tokens from the C4 dataset. Our results advance the understanding of MIAs on LLMs in three key ways: (1) strong MIAs can succeed on pre-trained LLMs; (2) their effectiveness, however, remains limited (e.g., AUC<0.7) in practical settings; and, (3) the relationship between MIA success and related privacy metrics is not as straightforward as prior work has suggested.

Paperid: 412, https://arxiv.org/pdf/2503.23804.pdf

Abstract:
Large language model (LLM)-powered agents are increasingly used in recommender systems (RSs) to achieve personalized behavior modeling, where the memory mechanism plays a pivotal role in enabling the agents to autonomously explore, learn and self-evolve from real-world interactions. However, this very mechanism, serving as a contextual repository, inherently exposes an attack surface for potential adversarial manipulations. Despite its central role, the robustness of agentic RSs in the face of such threats remains largely underexplored. Previous works suffer from semantic mismatches or rely on static embeddings or pre-defined prompts, all of which hinder their applicability to systems with dynamic memory states. This challenge is exacerbated by the black-box nature of commercial RSs. To tackle the above problems, in this paper, we present the first systematic investigation of memory-based vulnerabilities in LLM-powered recommender agents, revealing their security limitations and guiding efforts to strengthen system resilience and trustworthiness. Specifically, we propose a novel black-box attack framework named DrunkAgent. DrunkAgent crafts semantically meaningful adversarial textual triggers for target item promotions and introduces a series of strategies to maximize the trigger effect by corrupting the memory updates during the interactions. The triggers and strategies are optimized on a surrogate model, enabling DrunkAgent transferable and stealthy. Extensive experiments on real-world datasets across diverse agentic RSs, including collaborative filtering, retrieval augmentation and sequential recommendations, demonstrate the generalizability, transferability and stealthiness of DrunkAgent.

Paperid: 413, https://arxiv.org/pdf/2503.00065.pdf

Abstract:
Graph Neural Networks (GNNs) achieve high performance in various real-world applications, such as drug discovery, traffic states prediction, and recommendation systems. The fact that building powerful GNNs requires a large amount of training data, powerful computing resources, and human expertise turns the models into lucrative targets for model stealing attacks. Prior work has revealed that the threat vector of stealing attacks against GNNs is large and diverse, as an attacker can leverage various heterogeneous signals ranging from node labels to high-dimensional node embeddings to create a local copy of the target GNN at a fraction of the original training costs. This diversity in the threat vector renders the design of effective and general defenses challenging and existing defenses usually focus on one particular stealing setup. Additionally, they solely provide means to identify stolen model copies rather than preventing the attack. To close this gap, we propose the first and general Active Defense Against GNN Extraction (ADAGE). ADAGE builds on the observation that stealing a model's full functionality requires highly diverse queries to leak its behavior across the input space. Our defense monitors this query diversity and progressively perturbs outputs as the accumulated leakage grows. In contrast to prior work, ADAGE can prevent stealing across all common attack setups. Our extensive experimental evaluation using six benchmark datasets, four GNN models, and three types of adaptive attackers shows that ADAGE penalizes attackers to the degree of rendering stealing impossible, whilst preserving predictive performance on downstream tasks. ADAGE, thereby, contributes towards securely sharing valuable GNNs in the future.

Paperid: 414, https://arxiv.org/pdf/2502.18706.pdf

Abstract:
Federated learning (FL) with differential privacy (DP) provides a framework for collaborative machine learning, enabling clients to train a shared model while adhering to strict privacy constraints. The framework allows each client to have an individual privacy guarantee, e.g., by adding different amounts of noise to each client's model updates. One underlying assumption is that all clients spend their privacy budgets uniformly over time (learning rounds). However, it has been shown in the literature that learning in early rounds typically focuses on more coarse-grained features that can be learned at lower signal-to-noise ratios while later rounds learn fine-grained features that benefit from higher signal-to-noise ratios. Building on this intuition, we propose a time-adaptive DP-FL framework that expends the privacy budget non-uniformly across both time and clients. Our framework enables each client to save privacy budget in early rounds so as to be able to spend more in later rounds when additional accuracy is beneficial in learning more fine-grained features. We theoretically prove utility improvements in the case that clients with stricter privacy budgets spend budgets unevenly across rounds, compared to clients with more relaxed budgets, who have sufficient budgets to distribute their spend more evenly. Our practical experiments on standard benchmark datasets support our theoretical results and show that, in practice, our algorithms improve the privacy-utility trade-offs compared to baseline schemes.

Paperid: 415, https://arxiv.org/pdf/2502.13379.pdf

Abstract:
Trusted Execution Environments (TEEs) isolate a special space within a device's memory that is not accessible to the normal world (also known as Untrusted Environment), even when the device is compromised. Thus, developers can utilize TEEs to provide strong security guarantees for their programs, making sensitive operations like encrypted data storage, fingerprint verification, and remote attestation protected from malicious attacks. Despite the strong protections offered by TEEs, adapting existing programs to leverage such security guarantees is non-trivial, often requiring extensive domain knowledge and manual intervention, which makes TEEs less accessible to developers. This motivates us to design AutoTEE, the first Large Language Model (LLM)-enabled approach that can automatically identify, partition, transform, and port sensitive functions into TEEs with minimal developer intervention. By manually reviewing 68 repositories, we constructed a benchmark dataset consisting of 385 sensitive functions eligible for transformation, on which AutoTEE achieves a high F1 score of 0.91. AutoTEE effectively transforms these sensitive functions into their TEE-compatible counterparts, achieving success rates of 90\% and 83\% for Java and Python, respectively. We further provide a mechanism to automatically port the transformed code to different TEE platforms, including Intel SGX and AMD SEV, demonstrating that the transformed programs run successfully and correctly on these platforms.

Paperid: 416, https://arxiv.org/pdf/2506.23066.pdf

Abstract:
Text watermarking schemes have gained considerable attention in recent years, yet still face critical challenges in achieving simultaneous robustness, generalizability, and imperceptibility. This paper introduces a new embedding paradigm,termed CORE, which comprises several consecutively aligned black pixel segments. Its key innovation lies in its inherent noise resistance during transmission and broad applicability across languages and fonts. Based on the CORE, we present a text watermarking framework named CoreMark. Specifically, CoreMark first dynamically extracts COREs from characters. Then, the characters with stronger robustness are selected according to the lengths of COREs. By modifying the thickness of the CORE, the hidden data is embedded into the selected characters without causing significant visual distortions. Moreover, a general plug-and-play embedding strength modulator is proposed, which can adaptively enhance the robustness for small font sizes by adjusting the embedding strength according to the font size. Experimental evaluation indicates that CoreMark demonstrates outstanding generalizability across multiple languages and fonts. Compared to existing methods, CoreMark achieves significant improvements in resisting screenshot, print-scan, and print camera attacks, while maintaining satisfactory imperceptibility.

Paperid: 417, https://arxiv.org/pdf/2506.00534.pdf

Abstract:
The choice of a suitable visual language projector (VLP) is critical to the successful training of large visual language models (LVLMs). Mainstream VLPs can be broadly categorized into compressed and uncompressed projectors, and each offering distinct advantages in performance and computational efficiency. However, their security implications have not been thoroughly examined. Our comprehensive evaluation reveals significant differences in their security profiles: compressed projectors exhibit substantial vulnerabilities, allowing adversaries to successfully compromise LVLMs even with minimal knowledge of structural information. In stark contrast, uncompressed projectors demonstrate robust security properties and do not introduce additional vulnerabilities. These findings provide critical guidance for researchers in selecting optimal VLPs that enhance the security and reliability of visual language models. The code will be released.

Paperid: 418, https://arxiv.org/pdf/2505.24227.pdf

Abstract:
While adversarial attacks on vision-and-language pretraining (VLP) models have been explored, generating natural adversarial samples crafted through realistic and semantically meaningful perturbations remains an open challenge. Existing methods, primarily designed for classification tasks, struggle when adapted to VLP models due to their restricted optimization spaces, leading to ineffective attacks or unnatural artifacts. To address this, we propose \textbf{LightD}, a novel framework that generates natural adversarial samples for VLP models via semantically guided relighting. Specifically, LightD leverages ChatGPT to propose context-aware initial lighting parameters and integrates a pretrained relighting model (IC-light) to enable diverse lighting adjustments. LightD expands the optimization space while ensuring perturbations align with scene semantics. Additionally, gradient-based optimization is applied to the reference lighting image to further enhance attack effectiveness while maintaining visual naturalness. The effectiveness and superiority of the proposed LightD have been demonstrated across various VLP models in tasks such as image captioning and visual question answering.

Paperid: 419, https://arxiv.org/pdf/2505.23792.pdf

Abstract:
This paper focuses on Zero-Trust Foundation Models (ZTFMs), a novel paradigm that embeds zero-trust security principles into the lifecycle of foundation models (FMs) for Internet of Things (IoT) systems. By integrating core tenets, such as continuous verification, least privilege access (LPA), data confidentiality, and behavioral analytics into the design, training, and deployment of FMs, ZTFMs can enable secure, privacy-preserving AI across distributed, heterogeneous, and potentially adversarial IoT environments. We present the first structured synthesis of ZTFMs, identifying their potential to transform conventional trust-based IoT architectures into resilient, self-defending ecosystems. Moreover, we propose a comprehensive technical framework, incorporating federated learning (FL), blockchain-based identity management, micro-segmentation, and trusted execution environments (TEEs) to support decentralized, verifiable intelligence at the network edge. In addition, we investigate emerging security threats unique to ZTFM-enabled systems and evaluate countermeasures, such as anomaly detection, adversarial training, and secure aggregation. Through this analysis, we highlight key open research challenges in terms of scalability, secure orchestration, interpretable threat attribution, and dynamic trust calibration. This survey lays a foundational roadmap for secure, intelligent, and trustworthy IoT infrastructures powered by FMs.

Paperid: 420, https://arxiv.org/pdf/2505.23786.pdf

Abstract:
With the increasing size of frontier LLMs, post-training quantization has become the standard for memory-efficient deployment. Recent work has shown that basic rounding-based quantization schemes pose security risks, as they can be exploited to inject malicious behaviors into quantized models that remain hidden in full precision. However, existing attacks cannot be applied to more complex quantization methods, such as the GGUF family used in the popular ollama and llama$.$cpp frameworks. In this work, we address this gap by introducing the first attack on GGUF. Our key insight is that the quantization error -- the difference between the full-precision weights and their (de-)quantized version -- provides sufficient flexibility to construct malicious quantized models that appear benign in full precision. Leveraging this, we develop an attack that trains the target malicious LLM while constraining its weights based on quantization errors. We demonstrate the effectiveness of our attack on three popular LLMs across nine GGUF quantization data types on three diverse attack scenarios: insecure code generation ($Î$=$88.7\%$), targeted content injection ($Î$=$85.0\%$), and benign instruction refusal ($Î$=$30.1\%$). Our attack highlights that (1) the most widely used post-training quantization method is susceptible to adversarial interferences, and (2) the complexity of quantization schemes alone is insufficient as a defense.

Paperid: 421, https://arxiv.org/pdf/2505.16723.pdf

Abstract:
As open-source language models (OSMs) grow more capable and are widely shared and finetuned, ensuring model provenance, i.e., identifying the origin of a given model instance, has become an increasingly important issue. At the same time, existing backdoor-based model fingerprinting techniques often fall short of achieving key requirements of real-world model ownership detection. In this work, we build on the observation that while current open-source model watermarks fail to achieve reliable content traceability, they can be effectively adapted to address the challenge of model provenance. To this end, we introduce the concept of domain-specific watermarking for model fingerprinting. Rather than watermarking all generated content, we train the model to embed watermarks only within specified subdomains (e.g., particular languages or topics). This targeted approach ensures detection reliability, while improving watermark durability and quality under a range of real-world deployment settings. Our evaluations show that domain-specific watermarking enables model fingerprinting with strong statistical guarantees, controllable false positive rates, high detection power, and preserved generation quality. Moreover, we find that our fingerprints are inherently stealthy and naturally robust to real-world variability across deployment scenarios.

Paperid: 422, https://arxiv.org/pdf/2505.16567.pdf

Abstract:
Finetuning openly accessible Large Language Models (LLMs) has become standard practice for achieving task-specific performance improvements. Until now, finetuning has been regarded as a controlled and secure process in which training on benign datasets led to predictable behaviors. In this paper, we demonstrate for the first time that an adversary can create poisoned LLMs that initially appear benign but exhibit malicious behaviors once finetuned by downstream users. To this end, our proposed attack, FAB (Finetuning-Activated Backdoor), poisons an LLM via meta-learning techniques to simulate downstream finetuning, explicitly optimizing for the emergence of malicious behaviors in the finetuned models. At the same time, the poisoned LLM is regularized to retain general capabilities and to exhibit no malicious behaviors prior to finetuning. As a result, when users finetune the seemingly benign model on their own datasets, they unknowingly trigger its hidden backdoor behavior. We demonstrate the effectiveness of FAB across multiple LLMs and three target behaviors: unsolicited advertising, refusal, and jailbreakability. Additionally, we show that FAB-backdoors are robust to various finetuning choices made by the user (e.g., dataset, number of steps, scheduler). Our findings challenge prevailing assumptions about the security of finetuning, revealing yet another critical attack vector exploiting the complexities of LLMs.

Paperid: 423, https://arxiv.org/pdf/2505.04889.pdf

Abstract:
Despite Federated Learning (FL) employing gradient aggregation at the server for distributed training to prevent the privacy leakage of raw data, private information can still be divulged through the analysis of uploaded gradients from clients. Substantial efforts have been made to integrate local differential privacy (LDP) into the system to achieve a strict privacy guarantee. However, existing methods fail to take practical issues into account by merely perturbing each sample with the same mechanism while each client may have their own privacy preferences on privacy-sensitive information (PSI), which is not uniformly distributed across the raw data. In such a case, excessive privacy protection from private-insensitive information can additionally introduce unnecessary noise, which may degrade the model performance. In this work, we study the PSI within data and develop FedRE, that can simultaneously achieve robustness and effectiveness benefits with LDP protection. More specifically, we first define PSI with regard to the privacy preferences of each client. Then, we optimize the LDP by allocating less privacy budget to gradients with higher PSI in a layer-wise manner, thus providing a stricter privacy guarantee for PSI. Furthermore, to mitigate the performance degradation caused by LDP, we design a parameter aggregation mechanism based on the distribution of the perturbed information. We conducted experiments with text tamper detection on T-SROIE and DocTamper datasets, and FedRE achieves competitive performance compared to state-of-the-art methods.

Paperid: 424, https://arxiv.org/pdf/2504.13192.pdf

Abstract:
Recently, Large Language Model (LLM)-empowered recommender systems (RecSys) have brought significant advances in personalized user experience and have attracted considerable attention. Despite the impressive progress, the research question regarding the safety vulnerability of LLM-empowered RecSys still remains largely under-investigated. Given the security and privacy concerns, it is more practical to focus on attacking the black-box RecSys, where attackers can only observe the system's inputs and outputs. However, traditional attack approaches employing reinforcement learning (RL) agents are not effective for attacking LLM-empowered RecSys due to the limited capabilities in processing complex textual inputs, planning, and reasoning. On the other hand, LLMs provide unprecedented opportunities to serve as attack agents to attack RecSys because of their impressive capability in simulating human-like decision-making processes. Therefore, in this paper, we propose a novel attack framework called CheatAgent by harnessing the human-like capabilities of LLMs, where an LLM-based agent is developed to attack LLM-Empowered RecSys. Specifically, our method first identifies the insertion position for maximum impact with minimal input modification. After that, the LLM agent is designed to generate adversarial perturbations to insert at target positions. To further improve the quality of generated perturbations, we utilize the prompt tuning technique to improve attacking strategies via feedback from the victim RecSys iteratively. Extensive experiments across three real-world datasets demonstrate the effectiveness of our proposed attacking method.

Paperid: 425, https://arxiv.org/pdf/2504.11182.pdf

Abstract:
The fusion of Large Language Models (LLMs) with recommender systems (RecSys) has dramatically advanced personalized recommendations and drawn extensive attention. Despite the impressive progress, the safety of LLM-based RecSys against backdoor attacks remains largely under-explored. In this paper, we raise a new problem: Can a backdoor with a specific trigger be injected into LLM-based Recsys, leading to the manipulation of the recommendation responses when the backdoor trigger is appended to an item's title? To investigate the vulnerabilities of LLM-based RecSys under backdoor attacks, we propose a new attack framework termed Backdoor Injection Poisoning for RecSys (BadRec). BadRec perturbs the items' titles with triggers and employs several fake users to interact with these items, effectively poisoning the training set and injecting backdoors into LLM-based RecSys. Comprehensive experiments reveal that poisoning just 1% of the training data with adversarial examples is sufficient to successfully implant backdoors, enabling manipulation of recommendations. To further mitigate such a security threat, we propose a universal defense strategy called Poison Scanner (P-Scanner). Specifically, we introduce an LLM-based poison scanner to detect the poisoned items by leveraging the powerful language understanding and rich knowledge of LLMs. A trigger augmentation agent is employed to generate diverse synthetic triggers to guide the poison scanner in learning domain-specific knowledge of the poisoned item detection task. Extensive experiments on three real-world datasets validate the effectiveness of the proposed P-Scanner.

Paperid: 426, https://arxiv.org/pdf/2502.15806.pdf

Abstract:
Large Reasoning Models (LRMs) have significantly advanced beyond traditional Large Language Models (LLMs) with their exceptional logical reasoning capabilities, yet these improvements introduce heightened safety risks. When subjected to jailbreak attacks, their ability to generate more targeted and organized content can lead to greater harm. Although some studies claim that reasoning enables safer LRMs against existing LLM attacks, they overlook the inherent flaws within the reasoning process itself. To address this gap, we propose the first jailbreak attack targeting LRMs, exploiting their unique vulnerabilities stemming from the advanced reasoning capabilities. Specifically, we introduce a Chaos Machine, a novel component to transform attack prompts with diverse one-to-one mappings. The chaos mappings iteratively generated by the machine are embedded into the reasoning chain, which strengthens the variability and complexity and also promotes a more robust attack. Based on this, we construct the Mousetrap framework, which makes attacks projected into nonlinear-like low sample spaces with mismatched generalization enhanced. Also, due to the more competing objectives, LRMs gradually maintain the inertia of unpredictable iterative reasoning and fall into our trap. Success rates of the Mousetrap attacking o1-mini, Claude-Sonnet and Gemini-Thinking are as high as 96%, 86% and 98% respectively on our toxic dataset Trotter. On benchmarks such as AdvBench, StrongREJECT, and HarmBench, attacking Claude-Sonnet, well-known for its safety, Mousetrap can astonishingly achieve success rates of 87.5%, 86.58% and 93.13% respectively. Attention: This paper contains inappropriate, offensive and harmful content.

Paperid: 427, https://arxiv.org/pdf/2502.10525.pdf

Abstract:
While watermarks for closed LLMs have matured and have been included in large-scale deployments, these methods are not applicable to open-source models, which allow users full control over the decoding process. This setting is understudied yet critical, given the rising performance of open-source models. In this work, we lay the foundation for systematic study of open-source LLM watermarking. For the first time, we explicitly formulate key requirements, including durability against common model modifications such as model merging, quantization, or finetuning, and propose a concrete evaluation setup. Given the prevalence of these modifications, durability is crucial for an open-source watermark to be effective. We survey and evaluate existing methods, showing that they are not durable. We also discuss potential ways to improve their durability and highlight remaining challenges. We hope our work enables future progress on this important problem.

Paperid: 428, https://arxiv.org/pdf/2501.15509.pdf

Abstract:
Model fingerprinting is a widely adopted approach to safeguard the intellectual property rights of open-source models by preventing their unauthorized reuse. It is promising and convenient since it does not necessitate modifying the protected model. In this paper, we revisit existing fingerprinting methods and reveal that they are vulnerable to false claim attacks where adversaries falsely assert ownership of any third-party model. We demonstrate that this vulnerability mostly stems from their untargeted nature, where they generally compare the outputs of given samples on different models instead of the similarities to specific references. Motivated by these findings, we propose a targeted fingerprinting paradigm (i.e., FIT-Print) to counteract false claim attacks. Specifically, FIT-Print transforms the fingerprint into a targeted signature via optimization. Building on the principles of FIT-Print, we develop bit-wise and list-wise black-box model fingerprinting methods, i.e., FIT-ModelDiff and FIT-LIME, which exploit the distance between model outputs and the feature attribution of specific samples as the fingerprint, respectively. Extensive experiments on benchmark models and datasets verify the effectiveness, conferrability, and resistance to false claim attacks of our FIT-Print.

Paperid: 429, https://arxiv.org/pdf/2501.01090.pdf

Abstract:
Model extraction attacks are one type of inference-time attacks that approximate the functionality and performance of a black-box victim model by launching a certain number of queries to the model and then leveraging the model's predictions to train a substitute model. These attacks pose severe security threats to production models and MLaaS platforms and could cause significant monetary losses to the model owners. A body of work has proposed to defend machine learning models against model extraction attacks, including both active defense methods that modify the model's outputs or increase the query overhead to avoid extraction and passive defense methods that detect malicious queries or leverage watermarks to perform post-verification. In this work, we introduce a new defense paradigm called attack as defense which modifies the model's output to be poisonous such that any malicious users that attempt to use the output to train a substitute model will be poisoned. To this end, we propose a novel lightweight backdoor attack method dubbed HoneypotNet that replaces the classification layer of the victim model with a honeypot layer and then fine-tunes the honeypot layer with a shadow model (to simulate model extraction) via bi-level optimization to modify its output to be poisonous while remaining the original performance. We empirically demonstrate on four commonly used benchmark datasets that HoneypotNet can inject backdoors into substitute models with a high success rate. The injected backdoor not only facilitates ownership verification but also disrupts the functionality of substitute models, serving as a significant deterrent to model extraction attacks.

Paperid: 430, https://arxiv.org/pdf/2506.09227.pdf

Abstract:
Large language model (LLM) unlearning has become a critical topic in machine learning, aiming to eliminate the influence of specific training data or knowledge without retraining the model from scratch. A variety of techniques have been proposed, including Gradient Ascent, model editing, and re-steering hidden representations. While existing surveys often organize these methods by their technical characteristics, such classifications tend to overlook a more fundamental dimension: the underlying intention of unlearning--whether it seeks to truly remove internal knowledge or merely suppress its behavioral effects. In this SoK paper, we propose a new taxonomy based on this intention-oriented perspective. Building on this taxonomy, we make three key contributions. First, we revisit recent findings suggesting that many removal methods may functionally behave like suppression, and explore whether true removal is necessary or achievable. Second, we survey existing evaluation strategies, identify limitations in current metrics and benchmarks, and suggest directions for developing more reliable and intention-aligned evaluations. Third, we highlight practical challenges--such as scalability and support for sequential unlearning--that currently hinder the broader deployment of unlearning methods. In summary, this work offers a comprehensive framework for understanding and advancing unlearning in generative AI, aiming to support future research and guide policy decisions around data removal and privacy.

Paperid: 431, https://arxiv.org/pdf/2505.16785.pdf

Abstract:
Despite providing superior performance, open-source large language models (LLMs) are vulnerable to abusive usage. To address this issue, recent works propose LLM fingerprinting methods to identify the specific source LLMs behind suspect applications. However, these methods fail to provide stealthy and robust fingerprint verification. In this paper, we propose a novel LLM fingerprinting scheme, namely CoTSRF, which utilizes the Chain of Thought (CoT) as the fingerprint of an LLM. CoTSRF first collects the responses from the source LLM by querying it with crafted CoT queries. Then, it applies contrastive learning to train a CoT extractor that extracts the CoT feature (i.e., fingerprint) from the responses. Finally, CoTSRF conducts fingerprint verification by comparing the Kullback-Leibler divergence between the CoT features of the source and suspect LLMs against an empirical threshold. Various experiments have been conducted to demonstrate the advantage of our proposed CoTSRF for fingerprinting LLMs, particularly in stealthy and robust fingerprint verification.

Paperid: 432, https://arxiv.org/pdf/2502.14847.pdf

Abstract:
Large Language Model-based Multi-Agent Systems (LLM-MAS) have revolutionized complex problem-solving capability by enabling sophisticated agent collaboration through message-based communications. While the communication framework is crucial for agent coordination, it also introduces a critical yet unexplored security vulnerability. In this work, we introduce Agent-in-the-Middle (AiTM), a novel attack that exploits the fundamental communication mechanisms in LLM-MAS by intercepting and manipulating inter-agent messages. Unlike existing attacks that compromise individual agents, AiTM demonstrates how an adversary can compromise entire multi-agent systems by only manipulating the messages passing between agents. To enable the attack under the challenges of limited control and role-restricted communication format, we develop an LLM-powered adversarial agent with a reflection mechanism that generates contextually-aware malicious instructions. Our comprehensive evaluation across various frameworks, communication structures, and real-world applications demonstrates that LLM-MAS is vulnerable to communication-based attacks, highlighting the need for robust security measures in multi-agent systems.

Paperid: 433, https://arxiv.org/pdf/2502.14182.pdf

Abstract:
The lifecycle of large language models (LLMs) is far more complex than that of traditional machine learning models, involving multiple training stages, diverse data sources, and varied inference methods. While prior research on data poisoning attacks has primarily focused on the safety vulnerabilities of LLMs, these attacks face significant challenges in practice. Secure data collection, rigorous data cleaning, and the multistage nature of LLM training make it difficult to inject poisoned data or reliably influence LLM behavior as intended. Given these challenges, this position paper proposes rethinking the role of data poisoning and argue that multi-faceted studies on data poisoning can advance LLM development. From a threat perspective, practical strategies for data poisoning attacks can help evaluate and address real safety risks to LLMs. From a trustworthiness perspective, data poisoning can be leveraged to build more robust LLMs by uncovering and mitigating hidden biases, harmful outputs, and hallucinations. Moreover, from a mechanism perspective, data poisoning can provide valuable insights into LLMs, particularly the interplay between data and model behavior, driving a deeper understanding of their underlying mechanisms.

Paperid: 434, https://arxiv.org/pdf/2502.13172.pdf

Abstract:
Large Language Model (LLM) agents have become increasingly prevalent across various real-world applications. They enhance decision-making by storing private user-agent interactions in the memory module for demonstrations, introducing new privacy risks for LLM agents. In this work, we systematically investigate the vulnerability of LLM agents to our proposed Memory EXTRaction Attack (MEXTRA) under a black-box setting. To extract private information from memory, we propose an effective attacking prompt design and an automated prompt generation method based on different levels of knowledge about the LLM agent. Experiments on two representative agents demonstrate the effectiveness of MEXTRA. Moreover, we explore key factors influencing memory leakage from both the agent designer's and the attacker's perspectives. Our findings highlight the urgent need for effective memory safeguards in LLM agent design and deployment.

Paperid: 435, https://arxiv.org/pdf/2501.06807.pdf

Abstract:
Private large language model (LLM) inference based on secure multi-party computation (MPC) offers cryptographically-secure protection for both user prompt and proprietary model weights. However, it suffers from large latency overhead especially for long input sequences. While key-value (KV) cache eviction algorithms have been proposed to reduce the computation and memory cost for plaintext inference, they are not designed for MPC and cannot benefit private inference easily. In this paper, we propose an accurate and MPC-friendly KV cache eviction framework, dubbed MPCache. MPCache is built on the observation that historical tokens in a long sequence may have different effects on the downstream decoding. Hence, MPCache combines a look-once static eviction algorithm to discard unimportant tokens and a query-aware dynamic selection algorithm to further select a small subset of tokens for attention computation. As existing dynamic selection algorithms incur too much latency, we propose a series of optimizations to drastically reduce the KV cache selection overhead, including MPC-friendly similarity approximation, hierarchical KV cache clustering, and cross-layer index sharing strategy. With extensive experiments, we demonstrate that MPCache consistently outperforms prior-art KV cache eviction baselines across different LLM generation tasks and achieves 1.8~2.01x and 3.39~8.37x decoding latency and communication reduction on different sequence lengths, respectively.

Paperid: 436, https://arxiv.org/pdf/2504.13203.pdf

Abstract:
Multi-turn interactions with language models (LMs) pose critical safety risks, as harmful intent can be strategically spread across exchanges. Yet, the vast majority of prior work has focused on single-turn safety, while adaptability and diversity remain among the key challenges of multi-turn red-teaming. To address these challenges, we present X-Teaming, a scalable framework that systematically explores how seemingly harmless interactions escalate into harmful outcomes and generates corresponding attack scenarios. X-Teaming employs collaborative agents for planning, attack optimization, and verification, achieving state-of-the-art multi-turn jailbreak effectiveness and diversity with success rates up to 98.1% across representative leading open-weight and closed-source models. In particular, X-Teaming achieves a 96.2% attack success rate against the latest Claude 3.7 Sonnet model, which has been considered nearly immune to single-turn attacks. Building on X-Teaming, we introduce XGuard-Train, an open-source multi-turn safety training dataset that is 20x larger than the previous best resource, comprising 30K interactive jailbreaks, designed to enable robust multi-turn safety alignment for LMs. Our work offers essential tools and insights for mitigating sophisticated conversational attacks, advancing the multi-turn safety of LMs.

Paperid: 437, https://arxiv.org/pdf/2502.13828.pdf

Abstract:
Software-defined networking (SDN) has shifted network management by decoupling the data and control planes. This enables programmatic control via software applications using open APIs. SDN's programmability has fueled its popularity but may have opened issues extending the attack surface by introducing vulnerable software. Therefore, the research community needs to have a deep and broad understanding of the risks posed by SDN to propose mitigating measures. The literature, however, lacks a comprehensive review of the current state of research in this direction. This paper addresses this gap by providing a comprehensive overview of the state-of-the-art research in SDN security focusing on the software (i.e., the controller, APIs, applications) part. We systematically reviewed 58 relevant publications to analyze trends, identify key testing and analysis methodologies, and categorize studied vulnerabilities. We further explore areas where the research community can make significant contributions. This work offers the most extensive and in-depth analysis of SDN software security to date.

Paperid: 438, https://arxiv.org/pdf/2502.10465.pdf

Abstract:
Embedding watermarks into the output of generative models is essential for establishing copyright and verifiable ownership over the generated content. Emerging diffusion model watermarking methods either embed watermarks in the frequency domain or offer limited versatility of the watermark patterns in the image space, which allows simplistic detection and removal of the watermarks from the generated content. To address this issue, we propose a watermarking technique that embeds watermark features into the diffusion model itself. Our technique enables training of a paired watermark extractor for a generative model that is learned through an end-to-end process. The extractor forces the generator, during training, to effectively embed versatile, imperceptible watermarks in the generated content while simultaneously ensuring their precise recovery. We demonstrate highly accurate watermark embedding/detection and show that it is also possible to distinguish between different watermarks embedded with our method to differentiate between generative models.

Paperid: 439, https://arxiv.org/pdf/2502.08097.pdf

Abstract:
Personalized text-to-image models allow users to generate images of new concepts from several reference photos, thereby leading to critical concerns regarding civil privacy. Although several anti-personalization techniques have been developed, these methods typically assume that defenders can afford to design a privacy cloak corresponding to each specific image. However, due to extensive personal images shared online, image-specific methods are limited by real-world practical applications. To address this issue, we are the first to investigate the creation of identity-specific cloaks (ID-Cloak) that safeguard all images belong to a specific identity. Specifically, we first model an identity subspace that preserves personal commonalities and learns diverse contexts to capture the image distribution to be protected. Then, we craft identity-specific cloaks with the proposed novel objective that encourages the cloak to guide the model away from its normal output within the subspace. Extensive experiments show that the generated universal cloak can effectively protect the images. We believe our method, along with the proposed identity-specific cloak setting, marks a notable advance in realistic privacy protection.

Paperid: 440, https://arxiv.org/pdf/2501.09320.pdf

Abstract:
Federated learning (FL) is vulnerable to backdoor attacks, where adversaries alter model behavior on target classification labels by embedding triggers into data samples. While these attacks have received considerable attention in horizontal FL, they are less understood for vertical FL (VFL), where devices hold different features of the samples, and only the server holds the labels. In this work, we propose a novel backdoor attack on VFL which (i) does not rely on gradient information from the server and (ii) considers potential collusion among multiple adversaries for sample selection and trigger embedding. Our label inference model augments variational autoencoders with metric learning, which adversaries can train locally. A consensus process over the adversary graph topology determines which datapoints to poison. We further propose methods for trigger splitting across the adversaries, with an intensity-based implantation scheme skewing the server towards the trigger. Our convergence analysis reveals the impact of backdoor perturbations on VFL indicated by a stationarity gap for the trained model, which we verify empirically as well. We conduct experiments comparing our attack with recent backdoor VFL approaches, finding that ours obtains significantly higher success rates for the same main task performance despite not using server information. Additionally, our results verify the impact of collusion on attack performance.

Paperid: 441, https://arxiv.org/pdf/2506.17353.pdf

Abstract:
The increasing demand for domain-specific and human-aligned Large Language Models (LLMs) has led to the widespread adoption of Supervised Fine-Tuning (SFT) techniques. SFT datasets often comprise valuable instruction-response pairs, making them highly valuable targets for potential extraction. This paper studies this critical research problem for the first time. We start by formally defining and formulating the problem, then explore various attack goals, types, and variants based on the unique properties of SFT data in real-world scenarios. Based on our analysis of extraction behaviors of direct extraction, we develop a novel extraction method specifically designed for SFT models, called Differentiated Data Extraction (DDE), which exploits the confidence levels of fine-tuned models and their behavioral differences from pre-trained base models. Through extensive experiments across multiple domains and scenarios, we demonstrate the feasibility of SFT data extraction using DDE. Our results show that DDE consistently outperforms existing extraction baselines in all attack settings. To counter this new attack, we propose a defense mechanism that mitigates DDE attacks with minimal impact on model performance. Overall, our research reveals hidden data leak risks in fine-tuned LLMs and provides insights for developing more secure models.

Paperid: 442, https://arxiv.org/pdf/2505.24842.pdf

Abstract:
Model distillation has become essential for creating smaller, deployable language models that retain larger system capabilities. However, widespread deployment raises concerns about resilience to adversarial manipulation. This paper investigates vulnerability of distilled models to adversarial injection of biased content during training. We demonstrate that adversaries can inject subtle biases into teacher models through minimal data poisoning, which propagates to student models and becomes significantly amplified. We propose two propagation modes: Untargeted Propagation, where bias affects multiple tasks, and Targeted Propagation, focusing on specific tasks while maintaining normal behavior elsewhere. With only 25 poisoned samples (0.25% poisoning rate), student models generate biased responses 76.9% of the time in targeted scenarios - higher than 69.4% in teacher models. For untargeted propagation, adversarial bias appears 6x-29x more frequently in student models on unseen tasks. We validate findings across six bias types (targeted advertisements, phishing links, narrative manipulations, insecure coding practices), various distillation methods, and different modalities spanning text and code generation. Our evaluation reveals shortcomings in current defenses - perplexity filtering, bias detection systems, and LLM-based autorater frameworks - against these attacks. Results expose significant security vulnerabilities in distilled models, highlighting need for specialized safeguards. We propose practical design principles for building effective adversarial bias mitigation strategies.

Paperid: 443, https://arxiv.org/pdf/2505.22271.pdf

Abstract:
While (multimodal) large language models (LLMs) have attracted widespread attention due to their exceptional capabilities, they remain vulnerable to jailbreak attacks. Various defense methods are proposed to defend against jailbreak attacks, however, they are often tailored to specific types of jailbreak attacks, limiting their effectiveness against diverse adversarial strategies. For instance, rephrasing-based defenses are effective against text adversarial jailbreaks but fail to counteract image-based attacks. To overcome these limitations, we propose a universal defense framework, termed Test-time IMmunization (TIM), which can adaptively defend against various jailbreak attacks in a self-evolving way. Specifically, TIM initially trains a gist token for efficient detection, which it subsequently applies to detect jailbreak activities during inference. When jailbreak attempts are identified, TIM implements safety fine-tuning using the detected jailbreak instructions paired with refusal answers. Furthermore, to mitigate potential performance degradation in the detector caused by parameter updates during safety fine-tuning, we decouple the fine-tuning process from the detection module. Extensive experiments on both LLMs and multimodal LLMs demonstrate the efficacy of TIM.

Paperid: 444, https://arxiv.org/pdf/2505.14534.pdf

Abstract:
Gemini is increasingly used to perform tasks on behalf of users, where function-calling and tool-use capabilities enable the model to access user data. Some tools, however, require access to untrusted data introducing risk. Adversaries can embed malicious instructions in untrusted data which cause the model to deviate from the user's expectations and mishandle their data or permissions. In this report, we set out Google DeepMind's approach to evaluating the adversarial robustness of Gemini models and describe the main lessons learned from the process. We test how Gemini performs against a sophisticated adversary through an adversarial evaluation framework, which deploys a suite of adaptive attack techniques to run continuously against past, current, and future versions of Gemini. We describe how these ongoing evaluations directly help make Gemini more resilient against manipulation.

Paperid: 445, https://arxiv.org/pdf/2505.12442.pdf

Abstract:
The rapid advancement of Large Language Models (LLMs) has led to the emergence of Multi-Agent Systems (MAS) to perform complex tasks through collaboration. However, the intricate nature of MAS, including their architecture and agent interactions, raises significant concerns regarding intellectual property (IP) protection. In this paper, we introduce MASLEAK, a novel attack framework designed to extract sensitive information from MAS applications. MASLEAK targets a practical, black-box setting, where the adversary has no prior knowledge of the MAS architecture or agent configurations. The adversary can only interact with the MAS through its public API, submitting attack query $q$ and observing outputs from the final agent. Inspired by how computer worms propagate and infect vulnerable network hosts, MASLEAK carefully crafts adversarial query $q$ to elicit, propagate, and retain responses from each MAS agent that reveal a full set of proprietary components, including the number of agents, system topology, system prompts, task instructions, and tool usages. We construct the first synthetic dataset of MAS applications with 810 applications and also evaluate MASLEAK against real-world MAS applications, including Coze and CrewAI. MASLEAK achieves high accuracy in extracting MAS IP, with an average attack success rate of 87% for system prompts and task instructions, and 92% for system architecture in most cases. We conclude by discussing the implications of our findings and the potential defenses.

Paperid: 446, https://arxiv.org/pdf/2505.04873.pdf

Abstract:
The integration of machine learning (ML) in cyber physical systems (CPS) is a complex task due to the challenges that arise in terms of real-time decision making, safety, reliability, device heterogeneity, and data privacy. There are also open research questions that must be addressed in order to fully realize the potential of ML in CPS. Federated learning (FL), a distributed approach to ML, has become increasingly popular in recent years. It allows models to be trained using data from decentralized sources. This approach has been gaining popularity in the CPS field, as it integrates computer, communication, and physical processes. Therefore, the purpose of this work is to provide a comprehensive analysis of the most recent developments of FL-CPS, including the numerous application areas, system topologies, and algorithms developed in recent years. The paper starts by discussing recent advances in both FL and CPS, followed by their integration. Then, the paper compares the application of FL in CPS with its applications in the internet of things (IoT) in further depth to show their connections and distinctions. Furthermore, the article scrutinizes how FL is utilized in critical CPS applications, e.g., intelligent transportation systems, cybersecurity services, smart cities, and smart healthcare solutions. The study also includes critical insights and lessons learned from various FL-CPS implementations. The paper's concluding section delves into significant concerns and suggests avenues for further research in this fast-paced and dynamic era.

Paperid: 447, https://arxiv.org/pdf/2504.10000.pdf

Abstract:
Multi-modal large language models (MLLMs) have made significant progress, yet their safety alignment remains limited. Typically, current open-source MLLMs rely on the alignment inherited from their language module to avoid harmful generations. However, the lack of safety measures specifically designed for multi-modal inputs creates an alignment gap, leaving MLLMs vulnerable to vision-domain attacks such as typographic manipulation. Current methods utilize a carefully designed safety dataset to enhance model defense capability, while the specific knowledge or patterns acquired from the high-quality dataset remain unclear. Through comparison experiments, we find that the alignment gap primarily arises from data distribution biases, while image content, response quality, or the contrastive behavior of the dataset makes little contribution to boosting multi-modal safety. To further investigate this and identify the key factors in improving MLLM safety, we propose finetuning MLLMs on a small set of benign instruct-following data with responses replaced by simple, clear rejection sentences. Experiments show that, without the need for labor-intensive collection of high-quality malicious data, model safety can still be significantly improved, as long as a specific fraction of rejection data exists in the finetuning set, indicating the security alignment is not lost but rather obscured during multi-modal pretraining or instruction finetuning. Simply correcting the underlying data bias could narrow the safety gap in the vision domain.

Paperid: 448, https://arxiv.org/pdf/2503.17932.pdf

Abstract:
Large Language Models (LLMs) have become increasingly vulnerable to jailbreak attacks that circumvent their safety mechanisms. While existing defense methods either suffer from adaptive attacks or require computationally expensive auxiliary models, we present STShield, a lightweight framework for real-time jailbroken judgement. STShield introduces a novel single-token sentinel mechanism that appends a binary safety indicator to the model's response sequence, leveraging the LLM's own alignment capabilities for detection. Our framework combines supervised fine-tuning on normal prompts with adversarial training using embedding-space perturbations, achieving robust detection while preserving model utility. Extensive experiments demonstrate that STShield successfully defends against various jailbreak attacks, while maintaining the model's performance on legitimate queries. Compared to existing approaches, STShield achieves superior defense performance with minimal computational overhead, making it a practical solution for real-world LLM deployment.

Paperid: 449, https://arxiv.org/pdf/2503.08908.pdf

Abstract:
Large Language Models (LLMs), despite their impressive capabilities, often fail to accurately repeat a single word when prompted to, and instead output unrelated text. This unexplained failure mode represents a vulnerability, allowing even end-users to diverge models away from their intended behavior. We aim to explain the causes for this phenomenon and link it to the concept of ``attention sinks'', an emergent LLM behavior crucial for fluency, in which the initial token receives disproportionately high attention scores. Our investigation identifies the neural circuit responsible for attention sinks and shows how long repetitions disrupt this circuit. We extend this finding to other non-repeating sequences that exhibit similar circuit disruptions. To address this, we propose a targeted patch that effectively resolves the issue without negatively impacting the model's overall performance. This study provides a mechanistic explanation for an LLM vulnerability, demonstrating how interpretability can diagnose and address issues, and offering insights that pave the way for more secure and reliable models.

Paperid: 450, https://arxiv.org/pdf/2503.03486.pdf

Abstract:
Patient data is widely used to estimate heterogeneous treatment effects and thus understand the effectiveness and safety of drugs. Yet, patient data includes highly sensitive information that must be kept private. In this work, we aim to estimate the conditional average treatment effect (CATE) from observational data under differential privacy. Specifically, we present DP-CATE, a novel framework for CATE estimation that is Neyman-orthogonal and further ensures differential privacy of the estimates. Our framework is highly general: it applies to any two-stage CATE meta-learner with a Neyman-orthogonal loss function, and any machine learning model can be used for nuisance estimation. We further provide an extension of our DP-CATE, where we employ RKHS regression to release the complete CATE function while ensuring differential privacy. We demonstrate our DP-CATE across various experiments using synthetic and real-world datasets. To the best of our knowledge, we are the first to provide a framework for CATE estimation that is Neyman-orthogonal and differentially private.

Paperid: 451, https://arxiv.org/pdf/2501.18533.pdf

Abstract:
Large Vision-Language Models (VLMs) have achieved remarkable performance across a wide range of tasks. However, their deployment in safety-critical domains poses significant challenges. Existing safety fine-tuning methods, which focus on textual or multimodal content, fall short in addressing challenging cases or disrupt the balance between helpfulness and harmlessness. Our evaluation highlights a safety reasoning gap: these methods lack safety visual reasoning ability, leading to such bottlenecks. To address this limitation and enhance both visual perception and reasoning in safety-critical contexts, we propose a novel dataset that integrates multi-image inputs with safety Chain-of-Thought (CoT) labels as fine-grained reasoning logic to improve model performance. Specifically, we introduce the Multi-Image Safety (MIS) dataset, an instruction-following dataset tailored for multi-image safety scenarios, consisting of training and test splits. Our experiments demonstrate that fine-tuning InternVL2.5-8B with MIS significantly outperforms both powerful open-source models and API-based models in challenging multi-image tasks requiring safety-related visual reasoning. This approach not only delivers exceptional safety performance but also preserves general capabilities without any trade-offs. Specifically, fine-tuning with MIS increases average accuracy by 0.83% across five general benchmarks and reduces the Attack Success Rate (ASR) on multiple safety benchmarks by a large margin.

Paperid: 452, https://arxiv.org/pdf/2506.17317.pdf

Abstract:
Nowadays team workspaces are widely adopted for multi-user collaboration and digital resource management. To further broaden real-world applications, mainstream team workspaces platforms, such as Google Workspace and Microsoft OneDrive, allow third-party applications (referred to as add-ons) to be integrated into their workspaces, significantly extending the functionality of team workspaces. The powerful multi-user collaboration capabilities and integration of add-ons make team workspaces a central hub for managing shared resources and protecting them against unauthorized access. Due to the collaboration features of team workspaces, add-ons involved in collaborations may bypass the permission isolation enforced by the administrator, unlike in single-user permission management. This paper aims to investigate the permission management landscape of team workspaces add-ons. To this end, we perform an in-depth analysis of the enforced access control mechanism inherent in this ecosystem, considering both multi-user and cross-app features. We identify three potential security risks that can be exploited to cause permission escalation. We then systematically reveal the landscape of permission escalation risks in the current ecosystem. Specifically, we propose an automated tool, TAI, to systematically test all possible interactions within this ecosystem. Our evaluation reveals that permission escalation vulnerabilities are widespread in this ecosystem, with 41 interactions identified as problematic. Our findings should raise an alert to both the team workspaces platforms and third-party developers.

Paperid: 453, https://arxiv.org/pdf/2504.11604.pdf

Abstract:
Many real-world applications, such as machine learning and graph analytics, involve combinations of linear and non-linear operations. As these applications increasingly handle sensitive data, there is a significant demand for privacy-preserving computation techniques capable of efficiently supporting both types of operations-a property we define as "computational universality." Fully Homomorphic Encryption (FHE) has emerged as a powerful approach to perform computations directly on encrypted data. In this paper, we systematically evaluate and measure whether existing FHE methods achieve computational universality or primarily favor either linear or non-linear operations, especially in non-interactive settings. We evaluate FHE universality in three stages. First, we categorize existing FHE methods into ten distinct approaches and analyze their theoretical complexities, selecting the three most promising universal candidates. Next, we perform measurements on representative workloads that combine linear and non-linear operations in various sequences, assessing performance across different bit lengths and with SIMD parallelization enabled or disabled. Finally, we empirically evaluate these candidates on five real-world, privacy-sensitive applications, where each involving arithmetic (linear) and comparison-like (non-linear) operations. Our findings indicate significant overheads in current universal FHE solutions, with efficiency strongly influenced by SIMD parallelism, word-wise versus bit-wise operations, and the trade-off between approximate and exact computations. Additionally, our analysis provides practical guidance to help practitioners select the most suitable universal FHE schemes and algorithms based on specific application requirements.

Paperid: 454, https://arxiv.org/pdf/2504.00446.pdf

Abstract:
The widespread adoption of Large Language Models (LLMs) in critical applications has introduced severe reliability and security risks, as LLMs remain vulnerable to notorious threats such as hallucinations, jailbreak attacks, and backdoor exploits. These vulnerabilities have been weaponized by malicious actors, leading to unauthorized access, widespread misinformation, and compromised LLM-embedded system integrity. In this work, we introduce a novel approach to detecting abnormal behaviors in LLMs via hidden state forensics. By systematically inspecting layer-specific activation patterns, we develop a unified framework that can efficiently identify a range of security threats in real-time without imposing prohibitive computational costs. Extensive experiments indicate detection accuracies exceeding 95% and consistently robust performance across multiple models in most scenarios, while preserving the ability to detect novel attacks effectively. Furthermore, the computational overhead remains minimal, with merely fractions of a second. The significance of this work lies in proposing a promising strategy to reinforce the security of LLM-integrated systems, paving the way for safer and more reliable deployment in high-stakes domains. By enabling real-time detection that can also support the mitigation of abnormal behaviors, it represents a meaningful step toward ensuring the trustworthiness of AI systems amid rising security challenges.

Paperid: 455, https://arxiv.org/pdf/2503.12314.pdf

Abstract:
We propose the notion of empirical privacy variance and study it in the context of differentially private fine-tuning of language models. Specifically, we show that models calibrated to the same $(\varepsilon, Î´)$-DP guarantee using DP-SGD with different hyperparameter configurations can exhibit significant variations in empirical privacy, which we quantify through the lens of memorization. We investigate the generality of this phenomenon across multiple dimensions and discuss why it is surprising and relevant. Through regression analysis, we examine how individual and composite hyperparameters influence empirical privacy. The results reveal a no-free-lunch trade-off: existing practices of hyperparameter tuning in DP-SGD, which focus on optimizing utility under a fixed privacy budget, often come at the expense of empirical privacy. To address this, we propose refined heuristics for hyperparameter selection that explicitly account for empirical privacy, showing that they are both precise and practically useful. Finally, we take preliminary steps to understand empirical privacy variance. We propose two hypotheses, identify limitations in existing techniques like privacy auditing, and outline open questions for future research.

Paperid: 456, https://arxiv.org/pdf/2503.12217.pdf

Abstract:
Fully Homomorphic Encryption over the torus (TFHE) enables computation on encrypted data without decryption, making it a cornerstone of secure and confidential computing. Despite its potential in privacy preserving machine learning, secure multi party computation, private blockchain transactions, and secure medical diagnostics, its adoption remains limited due to cryptographic complexity and usability challenges. While various TFHE libraries and compilers exist, practical code generation remains a hurdle. We propose a compiler integrated framework to evaluate LLM inference and agentic optimization for TFHE code generation, focusing on logic gates and ReLU activation. Our methodology assesses error rates, compilability, and structural similarity across open and closedsource LLMs. Results highlight significant limitations in off-the-shelf models, while agentic optimizations such as retrieval augmented generation (RAG) and few-shot prompting reduce errors and enhance code fidelity. This work establishes the first benchmark for TFHE code generation, demonstrating how LLMs, when augmented with domain-specific feedback, can bridge the expertise gap in FHE code generation.

Paperid: 457, https://arxiv.org/pdf/2503.10809.pdf

Abstract:
Recent advances in operating system (OS) agents enable vision-language models to interact directly with the graphical user interface of an OS. These multimodal OS agents autonomously perform computer-based tasks in response to a single prompt via application programming interfaces (APIs). Such APIs typically support low-level operations, including mouse clicks, keyboard inputs, and screenshot captures. We introduce a novel attack vector: malicious image patches (MIPs) that have been adversarially perturbed so that, when captured in a screenshot, they cause an OS agent to perform harmful actions by exploiting specific APIs. For instance, MIPs embedded in desktop backgrounds or shared on social media can redirect an agent to a malicious website, enabling further exploitation. These MIPs generalise across different user requests and screen layouts, and remain effective for multiple OS agents. The existence of such attacks highlights critical security vulnerabilities in OS agents, which should be carefully addressed before their widespread adoption.

Paperid: 458, https://arxiv.org/pdf/2505.16186.pdf

Abstract:
Large Reasoning Models (LRMs) introduce a new generation paradigm of explicitly reasoning before answering, leading to remarkable improvements in complex tasks. However, they pose great safety risks against harmful queries and adversarial attacks. While recent mainstream safety efforts on LRMs, supervised fine-tuning (SFT), improve safety performance, we find that SFT-aligned models struggle to generalize to unseen jailbreak prompts. After thorough investigation of LRMs' generation, we identify a safety aha moment that can activate safety reasoning and lead to a safe response. This aha moment typically appears in the `key sentence', which follows models' query understanding process and can indicate whether the model will proceed safely. Based on these insights, we propose SafeKey, including two complementary objectives to better activate the safety aha moment in the key sentence: (1) a Dual-Path Safety Head to enhance the safety signal in the model's internal representations before the key sentence, and (2) a Query-Mask Modeling objective to improve the models' attention on its query understanding, which has important safety hints. Experiments across multiple safety benchmarks demonstrate that our methods significantly improve safety generalization to a wide range of jailbreak attacks and out-of-distribution harmful prompts, lowering the average harmfulness rate by 9.6\%, while maintaining general abilities. Our analysis reveals how SafeKey enhances safety by reshaping internal attention and improving the quality of hidden representations.

Paperid: 459, https://arxiv.org/pdf/2505.05849.pdf

Abstract:
The strong planning and reasoning capabilities of Large Language Models (LLMs) have fostered the development of agent-based systems capable of leveraging external tools and interacting with increasingly complex environments. However, these powerful features also introduce a critical security risk: indirect prompt injection, a sophisticated attack vector that compromises the core of these agents, the LLM, by manipulating contextual information rather than direct user prompts. In this work, we propose a generic black-box fuzzing framework, AgentVigil, designed to automatically discover and exploit indirect prompt injection vulnerabilities across diverse LLM agents. Our approach starts by constructing a high-quality initial seed corpus, then employs a seed selection algorithm based on Monte Carlo Tree Search (MCTS) to iteratively refine inputs, thereby maximizing the likelihood of uncovering agent weaknesses. We evaluate AgentVigil on two public benchmarks, AgentDojo and VWA-adv, where it achieves 71% and 70% success rates against agents based on o3-mini and GPT-4o, respectively, nearly doubling the performance of baseline attacks. Moreover, AgentVigil exhibits strong transferability across unseen tasks and internal LLMs, as well as promising results against defenses. Beyond benchmark evaluations, we apply our attacks in real-world environments, successfully misleading agents to navigate to arbitrary URLs, including malicious sites.

Paperid: 460, https://arxiv.org/pdf/2503.13224.pdf

Abstract:
Pre-trained models are valuable intellectual property, capturing both domain-specific and domain-invariant features within their weight spaces. However, model extraction attacks threaten these assets by enabling unauthorized source-domain inference and facilitating cross-domain transfer via the exploitation of domain-invariant features. In this work, we introduce **ProDiF**, a novel framework that leverages targeted weight space manipulation to secure pre-trained models against extraction attacks. **ProDiF** quantifies the transferability of filters and perturbs the weights of critical filters in unsecured memory, while preserving actual critical weights in a Trusted Execution Environment (TEE) for authorized users. A bi-level optimization further ensures resilience against adaptive fine-tuning attacks. Experimental results show that **ProDiF** reduces source-domain accuracy to near-random levels and decreases cross-domain transferability by 74.65\%, providing robust protection for pre-trained models. This work offers comprehensive protection for pre-trained DNN models and highlights the potential of weight space manipulation as a novel approach to model security.

Paperid: 461, https://arxiv.org/pdf/2502.19537.pdf

Abstract:
Leading language model (LM) providers like OpenAI and Anthropic allow customers to fine-tune frontier LMs for specific use cases. To prevent abuse, these providers apply filters to block fine-tuning on overtly harmful data. In this setting, we make three contributions: First, while past work has shown that safety alignment is "shallow", we correspondingly demonstrate that existing fine-tuning attacks are shallow -- attacks target only the first several tokens of the model response, and consequently can be blocked by generating the first several response tokens with an aligned model. Second, we conceptually illustrate how to make attacks deeper by introducing a new fine-tuning attack that trains models to first refuse harmful requests before answering them; this "refuse-then-comply" strategy bypasses shallow defenses and produces harmful responses that evade output filters. Third, we demonstrate the potency of our new fine-tuning attack by jailbreaking both open-source models equipped with defenses and production models, achieving attack success rates of 57% and 72% against GPT-4o and Claude Haiku, respectively. Our attack received a $2000 bug bounty from OpenAI and was acknowledged as a vulnerability by Anthropic. Our work undermines the notion that models are safe because they initially refuse harmful requests and broadens awareness of the scope of attacks that face production fine-tuning APIs.

Paperid: 462, https://arxiv.org/pdf/2502.14486.pdf

Abstract:
Jailbreak attacks, where harmful prompts bypass generative models' built-in safety, raise serious concerns about model vulnerability. While many defense methods have been proposed, the trade-offs between safety and helpfulness, and their application to Large Vision-Language Models (LVLMs), are not well understood. This paper systematically examines jailbreak defenses by reframing the standard generation task as a binary classification problem to assess model refusal tendencies for both harmful and benign queries. We identify two key defense mechanisms: safety shift, which increases refusal rates across all queries, and harmfulness discrimination, which improves the model's ability to distinguish between harmful and benign inputs. Using these mechanisms, we develop two ensemble defense strategies-inter-mechanism ensembles and intra-mechanism ensembles-to balance safety and helpfulness. Experiments on the MM-SafetyBench and MOSSBench datasets with LLaVA-1.5 models show that these strategies effectively improve model safety or optimize the trade-off between safety and helpfulness.

Paperid: 463, https://arxiv.org/pdf/2502.10673.pdf

Abstract:
Retrieval-Augmented Generation (RAG) has become an effective method for enhancing large language models (LLMs) with up-to-date knowledge. However, it poses a significant risk of IP infringement, as IP datasets may be incorporated into the knowledge database by malicious Retrieval-Augmented LLMs (RA-LLMs) without authorization. To protect the rights of the dataset owner, an effective dataset membership inference algorithm for RA-LLMs is needed. In this work, we introduce a novel approach to safeguard the ownership of text datasets and effectively detect unauthorized use by the RA-LLMs. Our approach preserves the original data completely unchanged while protecting it by inserting specifically designed canary documents into the IP dataset. These canary documents are created with synthetic content and embedded watermarks to ensure uniqueness, stealthiness, and statistical provability. During the detection process, unauthorized usage is identified by querying the canary documents and analyzing the responses of RA-LLMs for statistical evidence of the embedded watermark. Our experimental results demonstrate high query efficiency, detectability, and stealthiness, along with minimal perturbation to the original dataset, all without compromising the performance of the RAG system.

Paperid: 464, https://arxiv.org/pdf/2501.19180.pdf

Abstract:
Large language models (LLMs) are vital for a wide range of applications yet remain susceptible to jailbreak threats, which could lead to the generation of inappropriate responses. Conventional defenses, such as refusal and adversarial training, often fail to cover corner cases or rare domains, leaving LLMs still vulnerable to more sophisticated attacks. We propose a novel defense strategy, Safety Chain-of-Thought (SCoT), which harnesses the enhanced \textit{reasoning capabilities} of LLMs for proactive assessment of harmful inputs, rather than simply blocking them. SCoT augments any refusal training datasets to critically analyze the intent behind each request before generating answers. By employing proactive reasoning, SCoT enhances the generalization of LLMs across varied harmful queries and scenarios not covered in the safety alignment corpus. Additionally, it generates detailed refusals specifying the rules violated. Comparative evaluations show that SCoT significantly surpasses existing defenses, reducing vulnerability to out-of-distribution issues and adversarial manipulations while maintaining strong general capabilities.

Paperid: 465, https://arxiv.org/pdf/2501.10699.pdf

Abstract:
Eavesdropping has been a long-standing threat to the security and privacy of wireless communications, since it is difficult to detect and costly to prevent. As networks evolve towards Sixth Generation (6G) and semantic communication becomes increasingly central to next-generation wireless systems, securing semantic information transmission emerges as a critical challenge. While classical physical layer security (PLS) focuses on passive security, the recently proposed concept of physical layer deception (PLD) offers a semantic encryption measure to actively deceive eavesdroppers. Yet the existing studies of PLD have been dominantly information-theoretical and link-level oriented, lacking considerations of system-level design and practical implementation. In this work we propose a novel artificial intelligence (AI)-enabled framework called Visual ENcryption for Eavesdropping NegAtion (VENENA), which combines the techniques of PLD, visual encryption, and image poisoning, into a comprehensive mechanism for deceptive secure semantic transmission in future wireless networks. By leveraging advanced vision transformers and semantic codecs, VENENA demonstrates how semantic security can be enhanced through the synergy of physical layer techniques and artificial intelligence, paving the way for secure semantic communication in 6G networks.

Paperid: 466, https://arxiv.org/pdf/2502.15281.pdf

Abstract:
Trusted Execution Environment (TEE) enhances the security of mobile applications and cloud services by isolating sensitive code in the secure world from the non-secure normal world. However, TEE applications are still confronted with vulnerabilities stemming from bad partitioning. Bad partitioning can lead to critical security problems of TEE, such as leaking sensitive data to the normal world or being adversely affected by malicious inputs from the normal world. To address this, we propose an approach to detect partitioning issues in TEE applications. First, we conducted a survey of TEE vulnerabilities caused by bad partitioning and found that the parameters exchanged between the secure and normal worlds often contain insecure usage with bad partitioning implementation. Second, we developed a tool named DITING that can analyze data-flows of these parameters and identify their violations of security rules we defined to find bad partitioning issues. Different from existing research that only focuses on malicious input to TEE, we assess the partitioning issues more comprehensively through input/output and shared memory. Finally, we created the first benchmark targeting bad partitioning, consisting of 110 test cases. Experiments demonstrate that DITING achieves an F1 score of 0.90 in identifying bad partitioning issues.

Paperid: 467, https://arxiv.org/pdf/2506.02548.pdf

Abstract:
Large language model (LLM) agents are becoming increasingly skilled at handling cybersecurity tasks autonomously. Thoroughly assessing their cybersecurity capabilities is critical and urgent, given the high stakes in this domain. However, existing benchmarks fall short, often failing to capture real-world scenarios or being limited in scope. To address this gap, we introduce CyberGym, a large-scale and high-quality cybersecurity evaluation framework featuring 1,507 real-world vulnerabilities found and patched across 188 large software projects. While it includes tasks of various settings, CyberGym primarily focuses on the generation of proof-of-concept (PoC) tests for vulnerability reproduction, based on text descriptions and corresponding source repositories. Solving this task is particularly challenging, as it requires comprehensive reasoning across entire codebases to locate relevant code fragments and produce effective PoCs that accurately trigger the target vulnerability starting from the program's entry point. Our evaluation across 4 state-of-the-art agent frameworks and 9 LLMs reveals that even the best combination (OpenHands and Claude-3.7-Sonnet) achieves only a 11.9% reproduction success rate, mainly on simpler cases. Beyond reproducing historical vulnerabilities, we find that PoCs generated by LLM agents can reveal new vulnerabilities, identifying 15 zero-days affecting the latest versions of the software projects.

Paperid: 468, https://arxiv.org/pdf/2505.23561.pdf

Abstract:
Model merging for Large Language Models (LLMs) directly fuses the parameters of different models finetuned on various tasks, creating a unified model for multi-domain tasks. However, due to potential vulnerabilities in models available on open-source platforms, model merging is susceptible to backdoor attacks. In this paper, we propose Merge Hijacking, the first backdoor attack targeting model merging in LLMs. The attacker constructs a malicious upload model and releases it. Once a victim user merges it with any other models, the resulting merged model inherits the backdoor while maintaining utility across tasks. Merge Hijacking defines two main objectives-effectiveness and utility-and achieves them through four steps. Extensive experiments demonstrate the effectiveness of our attack across different models, merging algorithms, and tasks. Additionally, we show that the attack remains effective even when merging real-world models. Moreover, our attack demonstrates robustness against two inference-time defenses (Paraphrasing and CLEANGEN) and one training-time defense (Fine-pruning).

Paperid: 469, https://arxiv.org/pdf/2505.15252.pdf

Abstract:
The wide deployment of the generative pre-trained transformer (GPT) has raised privacy concerns for both clients and servers. While cryptographic primitives can be employed for secure GPT inference to protect the privacy of both parties, they introduce considerable performance overhead.To accelerate secure inference, this study proposes a public decoding and secure verification approach that utilizes public GPT models, motivated by the observation that securely decoding one and multiple tokens takes a similar latency. The client uses the public model to generate a set of tokens, which are then securely verified by the private model for acceptance. The efficiency of our approach depends on the acceptance ratio of tokens proposed by the public model, which we improve from two aspects: (1) a private sampling protocol optimized for cryptographic primitives and (2) model alignment using knowledge distillation. Our approach improves the efficiency of secure decoding while maintaining the same level of privacy and generation quality as standard secure decoding. Experiments demonstrate a $2.1\times \sim 6.0\times$ speedup compared to standard decoding across three pairs of public-private models and different network conditions.

Paperid: 470, https://arxiv.org/pdf/2504.19793.pdf

Abstract:
Tool selection is a key component of LLM agents. A popular approach follows a two-step process - \emph{retrieval} and \emph{selection} - to pick the most appropriate tool from a tool library for a given task. In this work, we introduce \textit{ToolHijacker}, a novel prompt injection attack targeting tool selection in no-box scenarios. ToolHijacker injects a malicious tool document into the tool library to manipulate the LLM agent's tool selection process, compelling it to consistently choose the attacker's malicious tool for an attacker-chosen target task. Specifically, we formulate the crafting of such tool documents as an optimization problem and propose a two-phase optimization strategy to solve it. Our extensive experimental evaluation shows that ToolHijacker is highly effective, significantly outperforming existing manual-based and automated prompt injection attacks when applied to tool selection. Moreover, we explore various defenses, including prevention-based defenses (StruQ and SecAlign) and detection-based defenses (known-answer detection, DataSentinel, perplexity detection, and perplexity windowed detection). Our experimental results indicate that these defenses are insufficient, highlighting the urgent need for developing new defense strategies.

Paperid: 471, https://arxiv.org/pdf/2504.15622.pdf

Abstract:
With the rapid development of technology and the acceleration of digitalisation, the frequency and complexity of cyber security threats are increasing. Traditional cybersecurity approaches, often based on static rules and predefined scenarios, are struggling to adapt to the rapidly evolving nature of modern cyberattacks. There is an urgent need for more adaptive and intelligent defence strategies. The emergence of Large Language Model (LLM) provides an innovative solution to cope with the increasingly severe cyber threats, and its potential in analysing complex attack patterns, predicting threats and assisting real-time response has attracted a lot of attention in the field of cybersecurity, and exploring how to effectively use LLM to defend against cyberattacks has become a hot topic in the current research field. This survey examines the applications of LLM from the perspective of the cyber attack lifecycle, focusing on the three phases of defense reconnaissance, foothold establishment, and lateral movement, and it analyzes the potential of LLMs in Cyber Threat Intelligence (CTI) tasks. Meanwhile, we investigate how LLM-based security solutions are deployed and applied in different network scenarios. It also summarizes the internal and external risk issues faced by LLM during its application. Finally, this survey also points out the facing risk issues and possible future research directions in this domain.

Paperid: 472, https://arxiv.org/pdf/2504.11703.pdf

Abstract:
LLM agents utilize Large Language Models as central components with diverse tools to complete various user tasks, but face significant security risks when interacting with external environments. Attackers can exploit these agents through various vectors, including indirect prompt injection, memory/knowledge base poisoning, and malicious tools, tricking agents into performing dangerous actions such as unauthorized financial transactions or data leakage. The core problem that enables attacks to succeed lies in over-privileged tool access. We introduce Progent, the first privilege control framework to secure LLM agents. Progent enforces security at the tool level by restricting agents to performing tool calls necessary for user tasks while blocking potentially malicious ones. Progent features a domain-specific language that allows for expressing fine-grained policies for controlling tool privileges, flexible fallback actions when calls are blocked, and dynamic policy updates to adapt to changing agent states. The framework operates deterministically at runtime, providing provable security guarantees. Thanks to our modular design, integrating Progent does not alter agent internals and only requires minimal changes to the existing agent implementation, enhancing its practicality and potential for widespread adoption. Our extensive evaluation across various agent use cases, using benchmarks like AgentDojo, ASB, and AgentPoison, demonstrates that Progent reduces attack success rates to 0%, while preserving agent utility and speed. Additionally, we show that LLMs can automatically generate effective policies, highlighting their potential for automating the process of writing Progent's security policies.

Paperid: 473, https://arxiv.org/pdf/2504.05408.pdf

Abstract:
As frontier AI advances rapidly, understanding its impact on cybersecurity and inherent risks is essential to ensuring safe AI evolution (e.g., guiding risk mitigation and informing policymakers). While some studies review AI applications in cybersecurity, none of them comprehensively discuss AI's future impacts or provide concrete recommendations for navigating its safe and secure usage. This paper presents an in-depth analysis of frontier AI's impact on cybersecurity and establishes a systematic framework for risk assessment and mitigation. To this end, we first define and categorize the marginal risks of frontier AI in cybersecurity and then systemically analyze the current and future impacts of frontier AI in cybersecurity, qualitatively and quantitatively. We also discuss why frontier AI likely benefits attackers more than defenders in the short term from equivalence classes, asymmetry, and economic impact. Next, we explore frontier AI's impact on future software system development, including enabling complex hybrid systems while introducing new risks. Based on our findings, we provide security recommendations, including constructing fine-grained benchmarks for risk assessment, designing AI agents for defenses, building security mechanisms and provable defenses for hybrid systems, enhancing pre-deployment security testing and transparency, and strengthening defenses for users. Finally, we present long-term research questions essential for understanding AI's future impacts and unleashing its defensive capabilities.

Paperid: 474, https://arxiv.org/pdf/2504.03742.pdf

Abstract:
With the rapid growth of internet traffic, malicious network attacks have become increasingly frequent and sophisticated, posing significant threats to global cybersecurity. Traditional detection methods, including rule-based and machine learning-based approaches, struggle to accurately identify emerging threats, particularly in scenarios with limited samples. While recent advances in few-shot learning have partially addressed the data scarcity issue, existing methods still exhibit high false positive rates and lack the capability to effectively capture crucial local traffic patterns. In this paper, we propose HLoG, a novel hierarchical few-shot malicious traffic detection framework that leverages both local and global features extracted from network sessions. HLoG employs a sliding-window approach to segment sessions into phases, capturing fine-grained local interaction patterns through hierarchical bidirectional GRU encoding, while simultaneously modeling global contextual dependencies. We further design a session similarity assessment module that integrates local similarity with global self-attention-enhanced representations, achieving accurate and robust few-shot traffic classification. Comprehensive experiments on three meticulously reconstructed datasets demonstrate that HLoG significantly outperforms existing state-of-the-art methods. Particularly, HLoG achieves superior recall rates while substantially reducing false positives, highlighting its effectiveness and practical value in real-world cybersecurity applications.

Paperid: 475, https://arxiv.org/pdf/2503.21463.pdf

Abstract:
With the widespread adoption of Ethereum, financial frauds such as Ponzi schemes have become increasingly rampant in the blockchain ecosystem, posing significant threats to the security of account assets. Existing Ethereum fraud detection methods typically model account transactions as graphs, but this approach primarily focuses on binary transactional relationships between accounts, failing to adequately capture the complex multi-party interaction patterns inherent in Ethereum. To address this, we propose a hypergraph modeling method for the Ponzi scheme detection method in Ethereum, called HyperDet. Specifically, we treat transaction hashes as hyperedges that connect all the relevant accounts involved in a transaction. Additionally, we design a two-step hypergraph sampling strategy to significantly reduce computational complexity. Furthermore, we introduce a dual-channel detection module, including the hypergraph detection channel and the hyper-homo graph detection channel, to be compatible with existing detection methods. Experimental results show that, compared to traditional homogeneous graph-based methods, the hyper-homo graph detection channel achieves significant performance improvements, demonstrating the superiority of hypergraph in Ponzi scheme detection. This research offers innovations for modeling complex relationships in blockchain data.

Paperid: 476, https://arxiv.org/pdf/2503.16023.pdf

Abstract:
Multi-modal large language models (MLLMs) extend large language models (LLMs) to process multi-modal information, enabling them to generate responses to image-text inputs. MLLMs have been incorporated into diverse multi-modal applications, such as autonomous driving and medical diagnosis, via plug-and-play without fine-tuning. This deployment paradigm increases the vulnerability of MLLMs to backdoor attacks. However, existing backdoor attacks against MLLMs achieve limited effectiveness and stealthiness. In this work, we propose BadToken, the first token-level backdoor attack to MLLMs. BadToken introduces two novel backdoor behaviors: Token-substitution and Token-addition, which enable flexible and stealthy attacks by making token-level modifications to the original output for backdoored inputs. We formulate a general optimization problem that considers the two backdoor behaviors to maximize the attack effectiveness. We evaluate BadToken on two open-source MLLMs and various tasks. Our results show that our attack maintains the model's utility while achieving high attack success rates and stealthiness. We also show the real-world threats of BadToken in two scenarios, i.e., autonomous driving and medical diagnosis. Furthermore, we consider defenses including fine-tuning and input purification. Our results highlight the threat of our attack.

Paperid: 477, https://arxiv.org/pdf/2503.06254.pdf

Abstract:
Multimodal retrieval-augmented generation (RAG) enhances the visual reasoning capability of vision-language models (VLMs) by dynamically accessing information from external knowledge bases. In this work, we introduce \textit{Poisoned-MRAG}, the first knowledge poisoning attack on multimodal RAG systems. Poisoned-MRAG injects a few carefully crafted image-text pairs into the multimodal knowledge database, manipulating VLMs to generate the attacker-desired response to a target query. Specifically, we formalize the attack as an optimization problem and propose two cross-modal attack strategies, dirty-label and clean-label, tailored to the attacker's knowledge and goals. Our extensive experiments across multiple knowledge databases and VLMs show that Poisoned-MRAG outperforms existing methods, achieving up to 98\% attack success rate with just five malicious image-text pairs injected into the InfoSeek database (481,782 pairs). Additionally, We evaluate 4 different defense strategies, including paraphrasing, duplicate removal, structure-driven mitigation, and purification, demonstrating their limited effectiveness and trade-offs against Poisoned-MRAG. Our results highlight the effectiveness and scalability of Poisoned-MRAG, underscoring its potential as a significant threat to multimodal RAG systems.

Paperid: 478, https://arxiv.org/pdf/2503.00596.pdf

Abstract:
This paper proposes a novel backdoor threat attacking the LLM-as-a-Judge evaluation regime, where the adversary controls both the candidate and evaluator model. The backdoored evaluator victimizes benign users by unfairly assigning inflated scores to adversary. A trivial single token backdoor poisoning 1% of the evaluator training data triples the adversary's score with respect to their legitimate score. We systematically categorize levels of data access corresponding to three real-world settings, (1) web poisoning, (2) malicious annotator, and (3) weight poisoning. These regimes reflect a weak to strong escalation of data access that highly correlates with attack severity. Under the weakest assumptions - web poisoning (1), the adversary still induces a 20% score inflation. Likewise, in the (3) weight poisoning regime, the stronger assumptions enable the adversary to inflate their scores from 1.5/5 to 4.9/5. The backdoor threat generalizes across different evaluator architectures, trigger designs, evaluation tasks, and poisoning rates. By poisoning 10% of the evaluator training data, we control toxicity judges (Guardrails) to misclassify toxic prompts as non-toxic 89% of the time, and document reranker judges in RAG to rank the poisoned document first 97% of the time. LLM-as-a-Judge is uniquely positioned at the intersection of ethics and technology, where social implications of mislead model selection and evaluation constrain the available defensive tools. Amidst these challenges, model merging emerges as a principled tool to offset the backdoor, reducing ASR to near 0% whilst maintaining SOTA performance. Model merging's low computational cost and convenient integration into the current LLM Judge training pipeline position it as a promising avenue for backdoor mitigation in the LLM-as-a-Judge setting.

Paperid: 479, https://arxiv.org/pdf/2502.10486.pdf

Abstract:
The emergence of vision language models (VLMs) comes with increased safety concerns, as the incorporation of multiple modalities heightens vulnerability to attacks. Although VLMs can be built upon LLMs that have textual safety alignment, it is easily undermined when the vision modality is integrated. We attribute this safety challenge to the modality gap, a separation of image and text in the shared representation space, which blurs the distinction between harmful and harmless queries that is evident in LLMs but weakened in VLMs. To avoid safety decay and fulfill the safety alignment gap, we propose VLM-Guard, an inference-time intervention strategy that leverages the LLM component of a VLM as supervision for the safety alignment of the VLM. VLM-Guard projects the representations of VLM into the subspace that is orthogonal to the safety steering direction that is extracted from the safety-aligned LLM. Experimental results on three malicious instruction settings show the effectiveness of VLM-Guard in safeguarding VLM and fulfilling the safety alignment gap between VLM and its LLM component.

Paperid: 480, https://arxiv.org/pdf/2502.02410.pdf

Abstract:
Many forms of sensitive data, such as web traffic, mobility data, or hospital occupancy, are inherently sequential. The standard method for training machine learning models while ensuring privacy for units of sensitive information, such as individual hospital visits, is differentially private stochastic gradient descent (DP-SGD). However, we observe in this work that the formal guarantees of DP-SGD are incompatible with time series specific tasks like forecasting, since they rely on the privacy amplification attained by training on small, unstructured batches sampled from an unstructured dataset. In contrast, batches for forecasting are generated by (1) sampling sequentially structured time series from a dataset, (2) sampling contiguous subsequences from these series, and (3) partitioning them into context and ground-truth forecast windows. We theoretically analyze the privacy amplification attained by this structured subsampling to enable the training of forecasting models with sound and tight event- and user-level privacy guarantees. Towards more private models, we additionally prove how data augmentation amplifies privacy in self-supervised training of sequence models. Our empirical evaluation demonstrates that amplification by structured subsampling enables the training of forecasting models with strong formal privacy guarantees.

Paperid: 481, https://arxiv.org/pdf/2501.14050.pdf

Abstract:
GraphRAG advances retrieval-augmented generation (RAG) by structuring external knowledge as multi-scale knowledge graphs, enabling language models to integrate both broad context and granular details in their generation. While GraphRAG has demonstrated success across domains, its security implications remain largely unexplored. To bridge this gap, this work examines GraphRAG's vulnerability to poisoning attacks, uncovering an intriguing security paradox: existing RAG poisoning attacks are less effective under GraphRAG than conventional RAG, due to GraphRAG's graph-based indexing and retrieval; yet, the same features also create new attack surfaces. We present GragPoison, a novel attack that exploits shared relations in the underlying knowledge graph to craft poisoning text capable of compromising multiple queries simultaneously. GragPoison employs three key strategies: (i) relation injection to introduce false knowledge, (ii) relation enhancement to amplify poisoning influence, and (iii) narrative generation to embed malicious content within coherent text. Empirical evaluation across diverse datasets and models shows that GragPoison substantially outperforms existing attacks in terms of effectiveness (up to 98% success rate) and scalability (using less than 68% poisoning text) on multiple variations of GraphRAG. We also explore potential defensive measures and their limitations, identifying promising directions for future research.

Paperid: 482, https://arxiv.org/pdf/2501.08610.pdf

Abstract:
As the Internet rapidly expands, the increasing complexity and diversity of network activities pose significant challenges to effective network governance and security regulation. Network traffic, which serves as a crucial data carrier of network activities, has become indispensable in this process. Network traffic detection aims to monitor, analyze, and evaluate the data flows transmitted across the network to ensure network security and optimize performance. However, existing network traffic detection methods generally suffer from several limitations: 1) a narrow focus on characterizing traffic features from a single perspective; 2) insufficient exploration of discriminative features for different traffic; 3) poor generalization to different traffic scenarios. To address these issues, we propose a multi-view correlation-aware framework named FlowID for network traffic detection. FlowID captures multi-view traffic features via temporal and interaction awareness, while a hypergraph encoder further explores higher-order relationships between flows. To overcome the challenges of data imbalance and label scarcity, we design a dual-contrastive proxy task, enhancing the framework's ability to differentiate between various traffic flows through traffic-to-traffic and group-to-group contrast. Extensive experiments on five real-world datasets demonstrate that FlowID significantly outperforms existing methods in accuracy, robustness, and generalization across diverse network scenarios, particularly in detecting malicious traffic.

Paperid: 483, https://arxiv.org/pdf/2504.12612.pdf

Abstract:
Provenance is the chronology of things, resonating with the fundamental pursuit to uncover origins, trace connections, and situate entities within the flow of space and time. As artificial intelligence advances towards autonomous agents capable of interactive collaboration on complex tasks, the provenance of generated content becomes entangled in the interplay of collective creation, where contributions are continuously revised, extended or overwritten. In a multi-agent generative chain, content undergoes successive transformations, often leaving little, if any, trace of prior contributions. In this study, we investigates the problem of tracking multi-agent provenance across the temporal dimension of generation. We propose a chronological system for post hoc attribution of generative history from content alone, without reliance on internal memory states or external meta-information. At its core lies the notion of symbolic chronicles, representing signed and time-stamped records, in a form analogous to the chain of custody in forensic science. The system operates through a feedback loop, whereby each generative timestep updates the chronicle of prior interactions and synchronises it with the synthetic content in the very act of generation. This research seeks to develop an accountable form of collaborative artificial intelligence within evolving cyber ecosystems.

Paperid: 484, https://arxiv.org/pdf/2502.18547.pdf

Abstract:
Steganography is the art and science of covert writing, with a broad range of applications interwoven within the realm of cybersecurity. As artificial intelligence continues to evolve, its ability to synthesise realistic content emerges as a threat in the hands of cybercriminals who seek to manipulate and misrepresent the truth. Such synthetic content introduces a non-trivial risk of overwriting the subtle changes made for the purpose of steganography. When the signals in both the spatial and temporal domains are vulnerable to unforeseen overwriting, it calls for reflection on what, if any, remains invariant. This study proposes a paradigm in steganography for audiovisual media, where messages are concealed beyond both spatial and temporal domains. A chain of multimodal artificial intelligence is developed to deconstruct audiovisual content into a cover text, embed a message within the linguistic domain, and then reconstruct the audiovisual content through synchronising both auditory and visual modalities with the resultant stego text. The message is encoded by biasing the word sampling process of a language generation model and decoded by analysing the probability distribution of word choices. The accuracy of message transmission is evaluated under both zero-bit and multi-bit capacity settings. Fidelity is assessed through both biometric and semantic similarities, capturing the identities of the recorded face and voice, as well as the core ideas conveyed through the media. Secrecy is examined through statistical comparisons between cover and stego texts. Robustness is tested across various scenarios, including audiovisual resampling, face-swapping, voice-cloning and their combinations.

Paperid: 485, https://arxiv.org/pdf/2502.10440.pdf

Abstract:
Large language models (LLMs) are increasingly integrated into real-world personalized applications through retrieval-augmented generation (RAG) mechanisms to supplement their responses with domain-specific knowledge. However, the valuable and often proprietary nature of the knowledge bases used in RAG introduces the risk of unauthorized usage by adversaries. Existing methods that can be generalized as watermarking techniques to protect these knowledge bases typically involve poisoning or backdoor attacks. However, these methods require altering the LLM's results of verification samples, inevitably making these watermarks susceptible to anomaly detection and even introducing new security risks. To address these challenges, we propose \name{} for `harmless' copyright protection of knowledge bases. Instead of manipulating LLM's final output, \name{} implants distinct yet benign verification behaviors in the space of chain-of-thought (CoT) reasoning, maintaining the correctness of the final answer. Our method has three main stages: (1) Generating CoTs: For each verification question, we generate two `innocent' CoTs, including a target CoT for building watermark behaviors; (2) Optimizing Watermark Phrases and Target CoTs: Inspired by our theoretical analysis, we optimize them to minimize retrieval errors under the \emph{black-box} and \emph{text-only} setting of suspicious LLM, ensuring that only watermarked verification queries can retrieve their correspondingly target CoTs contained in the knowledge base; (3) Ownership Verification: We exploit a pairwise Wilcoxon test to verify whether a suspicious LLM is augmented with the protected knowledge base by comparing its responses to watermarked and benign verification queries. Our experiments on diverse benchmarks demonstrate that \name{} effectively protects knowledge bases and its resistance to adaptive attacks.

Paperid: 486, https://arxiv.org/pdf/2502.06348.pdf

Abstract:
Decentralized finance (DeFi) applications depend on accurate price oracles to ensure secure transactions, yet these oracles are highly vulnerable to manipulation, enabling attackers to exploit smart contract vulnerabilities for unfair asset valuation and financial gain. Detecting such manipulations traditionally relies on the manual effort of experienced experts, presenting significant challenges. In this paper, we propose a novel LLM-driven framework that automates the detection of price oracle manipulations by leveraging the complementary strengths of different LLM models (LLMs). Our approach begins with domain-specific knowledge extraction, where an LLM model synthesizes precise insights about price oracle vulnerabilities from top-tier academic papers, eliminating the need for profound expertise from developers or auditors. This knowledge forms the foundation for a second LLM model to generate structured, context-aware chain of thought prompts, which guide a third LLM model in accurately identifying manipulation patterns in smart contracts. We validate the effectiveness of framework through experiments on 60 known vulnerabilities from 46 real-world DeFi attacks or projects spanning 2021 to 2023. The best performing combination of LLMs (Haiku-Haiku-4o-mini) identified by AiRacleX demonstrate a 2.58-times improvement in recall (0.667 vs 0.259) compared to the state-of-the-art tool GPTScan, while maintaining comparable precision. Furthermore, our framework demonstrates the feasibility of replacing commercial models with open-source alternatives, enhancing privacy and security for developers.

Paperid: 487, https://arxiv.org/pdf/2501.19143.pdf

Abstract:
As the cornerstone of artificial intelligence, machine perception confronts a fundamental threat posed by adversarial illusions. These adversarial attacks manifest in two primary forms: deductive illusion, where specific stimuli are crafted based on the victim model's general decision logic, and inductive illusion, where the victim model's general decision logic is shaped by specific stimuli. The former exploits the model's decision boundaries to create a stimulus that, when applied, interferes with its decision-making process. The latter reinforces a conditioned reflex in the model, embedding a backdoor during its learning phase that, when triggered by a stimulus, causes aberrant behaviours. The multifaceted nature of adversarial illusions calls for a unified defence framework, addressing vulnerabilities across various forms of attack. In this study, we propose a disillusion paradigm based on the concept of an imitation game. At the heart of the imitation game lies a multimodal generative agent, steered by chain-of-thought reasoning, which observes, internalises and reconstructs the semantic essence of a sample, liberated from the classic pursuit of reversing the sample to its original state. As a proof of concept, we conduct experimental simulations using a multimodal generative dialogue agent and evaluates the methodology under a variety of attack scenarios.

Paperid: 488, https://arxiv.org/pdf/2501.14230.pdf

Abstract:
A critical requirement for deep learning models is ensuring their robustness against adversarial attacks. These attacks commonly introduce noticeable perturbations, compromising the visual fidelity of adversarial examples. Another key challenge is that while white-box algorithms can generate effective adversarial perturbations, they require access to the model gradients, limiting their practicality in many real-world scenarios. Existing attack mechanisms struggle to achieve similar efficacy without access to these gradients. In this paper, we introduce GreedyPixel, a novel pixel-wise greedy algorithm designed to generate high-quality adversarial examples using only query-based feedback from the target model. GreedyPixel improves computational efficiency in what is typically a brute-force process by perturbing individual pixels in sequence, guided by a pixel-wise priority map. This priority map is constructed by ranking gradients obtained from a surrogate model, providing a structured path for perturbation. Our results demonstrate that GreedyPixel achieves attack success rates comparable to white-box methods without the need for gradient information, and surpasses existing algorithms in black-box settings, offering higher success rates, reduced computational time, and imperceptible perturbations. These findings underscore the advantages of GreedyPixel in terms of attack efficacy, time efficiency, and visual quality.

Paperid: 489, https://arxiv.org/pdf/2501.09284.pdf

Abstract:
Recently, LoRA and its variants have become the de facto strategy for training and sharing task-specific versions of large pretrained models, thanks to their efficiency and simplicity. However, the issue of copyright protection for LoRA weights, especially through watermark-based techniques, remains underexplored. To address this gap, we propose SEAL (SEcure wAtermarking on LoRA weights), the universal whitebox watermarking for LoRA. SEAL embeds a secret, non-trainable matrix between trainable LoRA weights, serving as a passport to claim ownership. SEAL then entangles the passport with the LoRA weights through training, without extra loss for entanglement, and distributes the finetuned weights after hiding the passport. When applying SEAL, we observed no performance degradation across commonsense reasoning, textual/visual instruction tuning, and text-to-image synthesis tasks. We demonstrate that SEAL is robust against a variety of known attacks: removal, obfuscation, and ambiguity attacks.

Paperid: 490, https://arxiv.org/pdf/2501.04541.pdf

Abstract:
Steganography, the art of information hiding, has continually evolved across visual, auditory and linguistic domains, adapting to the ceaseless interplay between steganographic concealment and steganalytic revelation. This study seeks to extend the horizons of what constitutes a viable steganographic medium by introducing a steganographic paradigm in robotic motion control. Based on the observation of the robot's inherent sensitivity to changes in its environment, we propose a methodology to encode messages as environmental stimuli influencing the motions of the robotic agent and to decode messages from the resulting motion trajectory. The constraints of maximal robot integrity and minimal motion deviation are established as fundamental principles underlying secrecy. As a proof of concept, we conduct experiments in simulated environments across various manipulation tasks, incorporating robotic embodiments equipped with generalist multimodal policies.

Paperid: 491, https://arxiv.org/pdf/2501.01131.pdf

Abstract:
Privacy regulations mandate that developers must provide authentic and comprehensive privacy notices, e.g., privacy policies or labels, to inform users of their apps' privacy practices. However, due to a lack of knowledge of privacy requirements, developers often struggle to create accurate privacy notices, especially for sophisticated mobile apps with complex features and in crowded development teams. To address these challenges, we introduce Privacy Bills of Materials (PriBOM), a systematic software engineering approach that leverages different development team roles to better capture and coordinate mobile app privacy information. PriBOM facilitates transparency-centric privacy documentation and specific privacy notice creation, enabling traceability and trackability of privacy practices. We present a pre-fill of PriBOM based on static analysis and privacy notice analysis techniques. We demonstrate the perceived usefulness of PriBOM through a human evaluation with 150 diverse participants. Our findings suggest that PriBOM could serve as a significant solution for providing privacy support in DevOps for mobile apps.

Paperid: 492, https://arxiv.org/pdf/2505.02824.pdf

Abstract:
Text-to-image (T2I) diffusion models have rapidly advanced, enabling high-quality image generation conditioned on textual prompts. However, the growing trend of fine-tuning pre-trained models for personalization raises serious concerns about unauthorized dataset usage. To combat this, dataset ownership verification (DOV) has emerged as a solution, embedding watermarks into the fine-tuning datasets using backdoor techniques. These watermarks remain inactive under benign samples but produce owner-specified outputs when triggered. Despite the promise of DOV for T2I diffusion models, its robustness against copyright evasion attacks (CEA) remains unexplored. In this paper, we explore how attackers can bypass these mechanisms through CEA, allowing models to circumvent watermarks even when trained on watermarked datasets. We propose the first copyright evasion attack (i.e., CEAT2I) specifically designed to undermine DOV in T2I diffusion models. Concretely, our CEAT2I comprises three stages: watermarked sample detection, trigger identification, and efficient watermark mitigation. A key insight driving our approach is that T2I models exhibit faster convergence on watermarked samples during the fine-tuning, evident through intermediate feature deviation. Leveraging this, CEAT2I can reliably detect the watermarked samples. Then, we iteratively ablate tokens from the prompts of detected watermarked samples and monitor shifts in intermediate features to pinpoint the exact trigger tokens. Finally, we adopt a closed-form concept erasure method to remove the injected watermark. Extensive experiments show that our CEAT2I effectively evades DOV mechanisms while preserving model performance.

Paperid: 493, https://arxiv.org/pdf/2502.11358.pdf

Abstract:
Information theft attacks pose a significant risk to Large Language Model (LLM) tool-learning systems. Adversaries can inject malicious commands through compromised tools, manipulating LLMs to send sensitive information to these tools, which leads to potential privacy breaches. However, existing attack approaches are black-box oriented and rely on static commands that cannot adapt flexibly to the changes in user queries and the invocation chain of tools. It makes malicious commands more likely to be detected by LLM and leads to attack failure. In this paper, we propose AutoCMD, a dynamic attack comment generation approach for information theft attacks in LLM tool-learning systems. Inspired by the concept of mimicking the familiar, AutoCMD is capable of inferring the information utilized by upstream tools in the toolchain through learning on open-source systems and reinforcement with target system examples, thereby generating more targeted commands for information theft. The evaluation results show that AutoCMD outperforms the baselines with +13.2% $ASR_{Theft}$, and can be generalized to new tool-learning systems to expose their information leakage risks. We also design four defense methods to effectively protect tool-learning systems from the attack.

Paperid: 494, https://arxiv.org/pdf/2501.07493.pdf

Abstract:
It is now common to evaluate Large Language Models (LLMs) by having humans manually vote to evaluate model outputs, in contrast to typical benchmarks that evaluate knowledge or skill at some particular task. Chatbot Arena, the most popular benchmark of this type, ranks models by asking users to select the better response between two randomly selected models (without revealing which model was responsible for the generations). These platforms are widely trusted as a fair and accurate measure of LLM capabilities. In this paper, we show that if bot protection and other defenses are not implemented, these voting-based benchmarks are potentially vulnerable to adversarial manipulation. Specifically, we show that an attacker can alter the leaderboard (to promote their favorite model or demote competitors) at the cost of roughly a thousand votes (verified in a simulated, offline version of Chatbot Arena). Our attack consists of two steps: first, we show how an attacker can determine which model was used to generate a given reply with more than $95\%$ accuracy; and then, the attacker can use this information to consistently vote for (or against) a target model. Working with the Chatbot Arena developers, we identify, propose, and implement mitigations to improve the robustness of Chatbot Arena against adversarial manipulation, which, based on our analysis, substantially increases the cost of such attacks. Some of these defenses were present before our collaboration, such as bot protection with Cloudflare, malicious user detection, and rate limiting. Others, including reCAPTCHA and login are being integrated to strengthen the security in Chatbot Arena.

Paperid: 495, https://arxiv.org/pdf/2505.17147.pdf

Abstract:
The proliferation of jailbreak attacks against large language models (LLMs) highlights the need for robust security measures. However, in multi-round dialogues, malicious intentions may be hidden in interactions, leading LLMs to be more prone to produce harmful responses. In this paper, we propose the \textbf{M}ulti-\textbf{T}urn \textbf{S}afety \textbf{A}lignment (\ourapproach) framework, to address the challenge of securing LLMs in multi-round interactions. It consists of two stages: In the thought-guided attack learning stage, the red-team model learns about thought-guided multi-round jailbreak attacks to generate adversarial prompts. In the adversarial iterative optimization stage, the red-team model and the target model continuously improve their respective capabilities in interaction. Furthermore, we introduce a multi-turn reinforcement learning algorithm based on future rewards to enhance the robustness of safety alignment. Experimental results show that the red-team model exhibits state-of-the-art attack capabilities, while the target model significantly improves its performance on safety benchmarks.

Paperid: 496, https://arxiv.org/pdf/2505.16916.pdf

Abstract:
Multimodal Large Language Models (MLLMs) are increasingly deployed in fine-tuning-as-a-service (FTaaS) settings, where user-submitted datasets adapt general-purpose models to downstream tasks. This flexibility, however, introduces serious security risks, as malicious fine-tuning can implant backdoors into MLLMs with minimal effort. In this paper, we observe that backdoor triggers systematically disrupt cross-modal processing by causing abnormal attention concentration on non-semantic regions--a phenomenon we term attention collapse. Based on this insight, we propose Believe Your Eyes (BYE), a data filtering framework that leverages attention entropy patterns as self-supervised signals to identify and filter backdoor samples. BYE operates via a three-stage pipeline: (1) extracting attention maps using the fine-tuned model, (2) computing entropy scores and profiling sensitive layers via bimodal separation, and (3) performing unsupervised clustering to remove suspicious samples. Unlike prior defenses, BYE equires no clean supervision, auxiliary labels, or model modifications. Extensive experiments across various datasets, models, and diverse trigger types validate BYE's effectiveness: it achieves near-zero attack success rates while maintaining clean-task performance, offering a robust and generalizable solution against backdoor threats in MLLMs.

Paperid: 497, https://arxiv.org/pdf/2504.08104.pdf

Abstract:
Jailbreak attacks, which aim to cause LLMs to perform unrestricted behaviors, have become a critical and challenging direction in AI safety. Despite achieving the promising attack success rate using dictionary-based evaluation, existing jailbreak attack methods fail to output detailed contents to satisfy the harmful request, leading to poor performance on GPT-based evaluation. To this end, we propose a black-box jailbreak attack termed GeneShift, by using a genetic algorithm to optimize the scenario shifts. Firstly, we observe that the malicious queries perform optimally under different scenario shifts. Based on it, we develop a genetic algorithm to evolve and select the hybrid of scenario shifts. It guides our method to elicit detailed and actionable harmful responses while keeping the seemingly benign facade, improving stealthiness. Extensive experiments demonstrate the superiority of GeneShift. Notably, GeneShift increases the jailbreak success rate from 0% to 60% when direct prompting alone would fail.

Paperid: 498, https://arxiv.org/pdf/2503.12220.pdf

Abstract:
Federated learning (FL) enables retailers to share model parameters for demand forecasting while maintaining privacy. However, heterogeneous data across diverse regions, driven by factors such as varying consumer behavior, poses challenges to the effectiveness of federated learning. To tackle this challenge, we propose Privacy-Adaptive Clustered Federated Learning (PA-CFL) tailored for demand forecasting on heterogeneous retail data. By leveraging differential privacy and feature importance distribution, PA-CFL groups retailers into distinct ``bubbles'', each forming its own federated learning system to effectively isolate data heterogeneity. Within each bubble, Transformer models are designed to predict local sales for each client. Our experiments demonstrate that PA-CFL significantly surpasses FedAvg and outperforms local learning in demand forecasting performance across all participating clients. Compared to local learning, PA-CFL achieves a 5.4% improvement in R^2, a 69% reduction in RMSE, and a 45% decrease in MAE. Our approach enables effective FL through adaptive adjustments to diverse noise levels and the range of clients participating in each bubble. By grouping participants and proactively filtering out high-risk clients, PA-CFL mitigates potential threats to the FL system. The findings demonstrate PA-CFL's ability to enhance federated learning in time series prediction tasks with heterogeneous data, achieving a balance between forecasting accuracy and privacy preservation in retail applications. Additionally, PA-CFL's capability to detect and neutralize poisoned data from clients enhances the system's robustness and reliability.

Paperid: 499, https://arxiv.org/pdf/2502.00847.pdf

Abstract:
With the growing popularity of LLMs among the general public users, privacy-preserving and adversarial robustness have become two pressing demands for LLM-based services, which have largely been pursued separately but rarely jointly. In this paper, to the best of our knowledge, we are among the first attempts towards robust and private LLM inference by tightly integrating two disconnected fields: private inference and prompt ensembling. The former protects users' privacy by encrypting inference data transmitted and processed by LLMs, while the latter enhances adversarial robustness by yielding an aggregated output from multiple prompted LLM responses. Although widely recognized as effective individually, private inference for prompt ensembling together entails new challenges that render the naive combination of existing techniques inefficient. To overcome the hurdles, we propose SecPE, which designs efficient fully homomorphic encryption (FHE) counterparts for the core algorithmic building blocks of prompt ensembling. We conduct extensive experiments on 8 tasks to evaluate the accuracy, robustness, and efficiency of SecPE. The results show that SecPE maintains high clean accuracy and offers better robustness at the expense of merely $2.5\%$ efficiency overhead compared to baseline private inference methods, indicating a satisfactory ``accuracy-robustness-efficiency'' tradeoff. For the efficiency of the encrypted Argmax operation that incurs major slowdown for prompt ensembling, SecPE is 35.4x faster than the state-of-the-art peers, which can be of independent interest beyond this work.

Paperid: 500, https://arxiv.org/pdf/2502.00840.pdf

Abstract:
Large Language Models (LLMs) have showcased remarkable capabilities across various domains. Accompanying the evolving capabilities and expanding deployment scenarios of LLMs, their deployment challenges escalate due to their sheer scale and the advanced yet complex activation designs prevalent in notable model series, such as Llama, Gemma, Mistral. These challenges have become particularly pronounced in resource-constrained deployment scenarios, where mitigating inference bottlenecks is imperative. Among various recent efforts, activation approximation has emerged as a promising avenue for pursuing inference efficiency, sometimes considered indispensable in applications such as private inference. Despite achieving substantial speedups with minimal impact on utility, even appearing sound and practical for real-world deployment, the safety implications of activation approximations remain unclear. In this work, we fill this critical gap in LLM safety by conducting the first systematic safety evaluation of activation approximations. Our safety vetting spans seven state-of-the-art techniques across three popular categories (activation polynomialization, activation sparsification, and activation quantization), revealing consistent safety degradation across ten safety-aligned LLMs. To overcome the hurdle of devising a unified defense accounting for diverse activation approximation methods, we perform an in-depth analysis of their shared error patterns and uncover three key findings. We propose QuadA, a novel safety enhancement method tailored to mitigate the safety compromises introduced by activation approximations. Extensive experiments and ablation studies corroborate QuadA's effectiveness in enhancing the safety capabilities of LLMs after activation approximations.

Paperid: 501, https://arxiv.org/pdf/2506.14493.pdf

Abstract:
Multimodal Large Language Models (MLLMs) have shown great promise but require substantial computational resources during inference. Attackers can exploit this by inducing excessive output, leading to resource exhaustion and service degradation. Prior energy-latency attacks aim to increase generation time by broadly shifting the output token distribution away from the EOS token, but they neglect the influence of token-level Part-of-Speech (POS) characteristics on EOS and sentence-level structural patterns on output counts, limiting their efficacy. To address this, we propose LingoLoop, an attack designed to induce MLLMs to generate excessively verbose and repetitive sequences. First, we find that the POS tag of a token strongly affects the likelihood of generating an EOS token. Based on this insight, we propose a POS-Aware Delay Mechanism to postpone EOS token generation by adjusting attention weights guided by POS information. Second, we identify that constraining output diversity to induce repetitive loops is effective for sustained generation. We introduce a Generative Path Pruning Mechanism that limits the magnitude of hidden states, encouraging the model to produce persistent loops. Extensive experiments demonstrate LingoLoop can increase generated tokens by up to 30 times and energy consumption by a comparable factor on models like Qwen2.5-VL-3B, consistently driving MLLMs towards their maximum generation limits. These findings expose significant MLLMs' vulnerabilities, posing challenges for their reliable deployment. The code will be released publicly following the paper's acceptance.

Paperid: 502, https://arxiv.org/pdf/2506.13737.pdf

Abstract:
Large Reasoning Models (LRMs) have demonstrated promising performance in complex tasks. However, the resource-consuming reasoning processes may be exploited by attackers to maliciously occupy the resources of the servers, leading to a crash, like the DDoS attack in cyber. To this end, we propose a novel attack method on LRMs termed ExtendAttack to maliciously occupy the resources of servers by stealthily extending the reasoning processes of LRMs. Concretely, we systematically obfuscate characters within a benign prompt, transforming them into a complex, poly-base ASCII representation. This compels the model to perform a series of computationally intensive decoding sub-tasks that are deeply embedded within the semantic structure of the query itself. Extensive experiments demonstrate the effectiveness of our proposed ExtendAttack. Remarkably, it increases the length of the model's response by over 2.5 times for the o3 model on the HumanEval benchmark. Besides, it preserves the original meaning of the query and achieves comparable answer accuracy, showing the stealthiness.

Paperid: 503, https://arxiv.org/pdf/2506.07294.pdf

Abstract:
Recent attempts at source tracing for codec-based deepfake speech (CodecFake), generated by neural audio codec-based speech generation (CoSG) models, have exhibited suboptimal performance. However, how to train source tracing models using simulated CoSG data while maintaining strong performance on real CoSG-generated audio remains an open challenge. In this paper, we show that models trained solely on codec-resynthesized data tend to overfit to non-speech regions and struggle to generalize to unseen content. To mitigate these challenges, we introduce the Semantic-Acoustic Source Tracing Network (SASTNet), which jointly leverages Whisper for semantic feature encoding and Wav2vec2 with AudioMAE for acoustic feature encoding. Our proposed SASTNet achieves state-of-the-art performance on the CoSG test set of the CodecFake+ dataset, demonstrating its effectiveness for reliable source tracing.

Paperid: 504, https://arxiv.org/pdf/2506.02032.pdf

Abstract:
The rapid adoption of machine learning (ML) technologies has driven organizations across diverse sectors to seek efficient and reliable methods to accelerate model development-to-deployment. Machine Learning Operations (MLOps) has emerged as an integrative approach addressing these requirements by unifying relevant roles and streamlining ML workflows. As the MLOps market continues to grow, securing these pipelines has become increasingly critical. However, the unified nature of MLOps ecosystem introduces vulnerabilities, making them susceptible to adversarial attacks where a single misconfiguration can lead to compromised credentials, severe financial losses, damaged public trust, and the poisoning of training data. Our paper presents a systematic application of the MITRE ATLAS (Adversarial Threat Landscape for Artificial-Intelligence Systems) framework, a comprehensive and continuously updated catalog of AI-focused attacks, to systematically assess attacks across different phases of the MLOps ecosystem. We begin by examining the preparatory phases during which adversaries acquire the essential intelligence required to initiate their attacks. We then present a structured taxonomy of attack techniques explicitly mapped to corresponding phases of the MLOps ecosystem, supported by examples drawn from red-teaming exercises and real-world incidents. This is followed by a taxonomy of mitigation strategies aligned with these attack categories, offering actionable early-stage defenses to strengthen the security of MLOps ecosystem. Given the rapid evolution and adoption of MLOps, we further highlight key research gaps that require immediate attention. Our work emphasizes the importance of implementing robust security protocols from the outset, empowering practitioners to safeguard MLOps ecosystem against evolving cyber attacks.

Paperid: 505, https://arxiv.org/pdf/2505.15420.pdf

Abstract:
Retrieval-Augmented Generation (RAG) systems enhance large language models (LLMs) by incorporating external knowledge bases, but they are vulnerable to privacy risks from data extraction attacks. Existing extraction methods typically rely on malicious inputs such as prompt injection or jailbreaking, making them easily detectable via input- or output-level detection. In this paper, we introduce Implicit Knowledge Extraction Attack (IKEA), which conducts knowledge extraction on RAG systems through benign queries. IKEA first leverages anchor concepts to generate queries with the natural appearance, and then designs two mechanisms to lead to anchor concept thoroughly 'explore' the RAG's privacy knowledge: (1) Experience Reflection Sampling, which samples anchor concepts based on past query-response patterns to ensure the queries' relevance to RAG documents; (2) Trust Region Directed Mutation, which iteratively mutates anchor concepts under similarity constraints to further exploit the embedding space. Extensive experiments demonstrate IKEA's effectiveness under various defenses, surpassing baselines by over 80% in extraction efficiency and 90% in attack success rate. Moreover, the substitute RAG system built from IKEA's extractions consistently outperforms those based on baseline methods across multiple evaluation tasks, underscoring the significant privacy risk in RAG systems.

Paperid: 506, https://arxiv.org/pdf/2504.14554.pdf

Abstract:
The rapid advancement of generative AI highlights the importance of text-to-image (T2I) security, particularly with the threat of backdoor poisoning. Timely disclosure and mitigation of security vulnerabilities in T2I models are crucial for ensuring the safe deployment of generative models. We explore a novel training-free backdoor poisoning paradigm through model editing, which is recently employed for knowledge updating in large language models. Nevertheless, we reveal the potential security risks posed by model editing techniques to image generation models. In this work, we establish the principles for backdoor attacks based on model editing, and propose a relationship-driven precise backdoor poisoning method, REDEditing. Drawing on the principles of equivalent-attribute alignment and stealthy poisoning, we develop an equivalent relationship retrieval and joint-attribute transfer approach that ensures consistent backdoor image generation through concept rebinding. A knowledge isolation constraint is proposed to preserve benign generation integrity. Our method achieves an 11\% higher attack success rate compared to state-of-the-art approaches. Remarkably, adding just one line of code enhances output naturalness while improving backdoor stealthiness by 24\%. This work aims to heighten awareness regarding this security vulnerability in editable image generation models.

Paperid: 507, https://arxiv.org/pdf/2503.06453.pdf

Abstract:
In recent years, text-to-image (T2I) diffusion models have gained significant attention for their ability to generate high quality images reflecting text prompts. However, their growing popularity has also led to the emergence of backdoor threats, posing substantial risks. Currently, effective defense strategies against such threats are lacking due to the diversity of backdoor targets in T2I synthesis. In this paper, we propose NaviT2I, an efficient input-level backdoor defense framework against diverse T2I backdoors. Our approach is based on the new observation that trigger tokens tend to induce significant neuron activation variation in the early stage of the diffusion generation process, a phenomenon we term Early-step Activation Variation. Leveraging this insight, NaviT2I navigates T2I models to prevent malicious inputs by analyzing Neuron activation variations caused by input tokens. Extensive experiments show that NaviT2I significantly outperforms the baselines in both effectiveness and efficiency across diverse datasets, various T2I backdoors, and different model architectures including UNet and DiT. Furthermore, we show that our method remains effective under potential adaptive attacks.

Paperid: 508, https://arxiv.org/pdf/2501.01015.pdf

Abstract:
Deep neural networks are vulnerable to adversarial examples that exhibit transferability across various models. Numerous approaches are proposed to enhance the transferability of adversarial examples, including advanced optimization, data augmentation, and model modifications. However, these methods still show limited transferability, particularly in cross-architecture scenarios, such as from CNN to ViT. To achieve high transferability, we propose a technique termed Spatial Adversarial Alignment (SAA), which employs an alignment loss and leverages a witness model to fine-tune the surrogate model. Specifically, SAA consists of two key parts: spatial-aware alignment and adversarial-aware alignment. First, we minimize the divergences of features between the two models in both global and local regions, facilitating spatial alignment. Second, we introduce a self-adversarial strategy that leverages adversarial examples to impose further constraints, aligning features from an adversarial perspective. Through this alignment, the surrogate model is trained to concentrate on the common features extracted by the witness model. This facilitates adversarial attacks on these shared features, thereby yielding perturbations that exhibit enhanced transferability. Extensive experiments on various architectures on ImageNet show that aligned surrogate models based on SAA can provide higher transferable adversarial examples, especially in cross-architecture attacks.

Paperid: 509, https://arxiv.org/pdf/2506.13434.pdf

Abstract:
Large Language Models (LLMs) are set to reshape cybersecurity by augmenting red and blue team operations. Red teams can exploit LLMs to plan attacks, craft phishing content, simulate adversaries, and generate exploit code. Conversely, blue teams may deploy them for threat intelligence synthesis, root cause analysis, and streamlined documentation. This dual capability introduces both transformative potential and serious risks. This position paper maps LLM applications across cybersecurity frameworks such as MITRE ATT&CK and the NIST Cybersecurity Framework (CSF), offering a structured view of their current utility and limitations. While LLMs demonstrate fluency and versatility across various tasks, they remain fragile in high-stakes, context-heavy environments. Key limitations include hallucinations, limited context retention, poor reasoning, and sensitivity to prompts, which undermine their reliability in operational settings. Moreover, real-world integration raises concerns around dual-use risks, adversarial misuse, and diminished human oversight. Malicious actors could exploit LLMs to automate reconnaissance, obscure attack vectors, and lower the technical threshold for executing sophisticated attacks. To ensure safer adoption, we recommend maintaining human-in-the-loop oversight, enhancing model explainability, integrating privacy-preserving mechanisms, and building systems robust to adversarial exploitation. As organizations increasingly adopt AI driven cybersecurity, a nuanced understanding of LLMs' risks and operational impacts is critical to securing their defensive value while mitigating unintended consequences.

Paperid: 510, https://arxiv.org/pdf/2505.10903.pdf

Abstract:
Malware presents a persistent threat to user privacy and data integrity. To combat this, machine learning-based (ML-based) malware detection (MD) systems have been developed. However, these systems have increasingly been attacked in recent years, undermining their effectiveness in practice. While the security risks associated with ML-based MD systems have garnered considerable attention, the majority of prior works is limited to adversarial malware examples, lacking a comprehensive analysis of practical security risks. This paper addresses this gap by utilizing the CIA principles to define the scope of security risks. We then deconstruct ML-based MD systems into distinct operational stages, thus developing a stage-based taxonomy. Utilizing this taxonomy, we summarize the technical progress and discuss the gaps in the attack and defense proposals related to the ML-based MD systems within each stage. Subsequently, we conduct two case studies, using both inter-stage and intra-stage analyses according to the stage-based taxonomy to provide new empirical insights. Based on these analyses and insights, we suggest potential future directions from both inter-stage and intra-stage perspectives.

Paperid: 511, https://arxiv.org/pdf/2504.13537.pdf

Abstract:
With the rapid advancements in quantum computing, traditional cryptographic schemes like Rivest-Shamir-Adleman (RSA) and elliptic curve cryptography (ECC) are becoming vulnerable, necessitating the development of quantum-resistant algorithms. The National Institute of Standards and Technology (NIST) has initiated a standardization process for PQC algorithms, and several candidates, including CRYSTALS-Kyber and McEliece, have reached the final stages. This paper first provides a comprehensive analysis of the hardware complexity of post-quantum cryptography (PQC) in embedded systems, categorizing PQC algorithms into families based on their underlying mathematical problems: lattice-based, code-based, hash-based and multivariate / isogeny-based schemes. Each family presents distinct computational, memory, and energy profiles, making them suitable for different use cases. To address these challenges, this paper discusses optimization strategies such as pipelining, parallelization, and high-level synthesis (HLS), which can improve the performance and energy efficiency of PQC implementations. Finally, a detailed complexity analysis of CRYSTALS-Kyber and McEliece, comparing their key generation, encryption, and decryption processes in terms of computational complexity, has been conducted.

Paperid: 512, https://arxiv.org/pdf/2504.02132.pdf

Abstract:
Multi-modal retrieval augmented generation (M-RAG) is instrumental for inhibiting hallucinations in large multi-modal models (LMMs) through the use of a factual knowledge base (KB). However, M-RAG introduces new attack vectors for adversaries that aim to disrupt the system by injecting malicious entries into the KB. In this paper, we present the first poisoning attack against M-RAG targeting visual document retrieval applications where the KB contains images of document pages. We propose two attacks, each of which require injecting only a single adversarial image into the KB. Firstly, we propose a universal attack that, for any potential user query, influences the response to cause a denial-of-service (DoS) in the M-RAG system. Secondly, we present a targeted attack against one or a group of user queries, with the goal of spreading targeted misinformation. For both attacks, we use a multi-objective gradient-based adversarial approach to craft the injected image while optimizing for both retrieval and generation. We evaluate our attacks against several visual document retrieval datasets, a diverse set of state-of-the-art retrievers (embedding models) and generators (LMMs), demonstrating the attack effectiveness in both the universal and targeted settings. We additionally present results including commonly used defenses, various attack hyper-parameter settings, ablations, and attack transferability.

Paperid: 513, https://arxiv.org/pdf/2503.03245.pdf

Abstract:
The last few years have seen an explosion of interest in autonomous cyber defence agents based on deep reinforcement learning. Such agents are typically trained in a cyber gym environment, also known as a cyber simulator, at least 32 of which have already been built. Most, if not all cyber gyms provide dense "scaffolded" reward functions which combine many penalties or incentives for a range of (un)desirable states and costly actions. Whilst dense rewards help alleviate the challenge of exploring complex environments, yielding seemingly effective strategies from relatively few environment steps; they are also known to bias the solutions an agent can find, potentially towards suboptimal solutions. This is especially a problem in complex cyber environments where policy weaknesses may not be noticed until exploited by an adversary. In this work we set out to evaluate whether sparse reward functions might enable training more effective cyber defence agents. Towards this goal we first break down several evaluation limitations in existing work by proposing a ground truth evaluation score that goes beyond the standard RL paradigm used to train and evaluate agents. By adapting a well-established cyber gym to accommodate our methodology and ground truth score, we propose and evaluate two sparse reward mechanisms and compare them with a typical dense reward. Our evaluation considers a range of network sizes, from 2 to 50 nodes, and both reactive and proactive defensive actions. Our results show that sparse rewards, particularly positive reinforcement for an uncompromised network state, enable the training of more effective cyber defence agents. Furthermore, we show that sparse rewards provide more stable training than dense rewards, and that both effectiveness and training stability are robust to a variety of cyber environment considerations.

Paperid: 514, https://arxiv.org/pdf/2501.15269.pdf

Abstract:
Fusing visual understanding into language generation, Multi-modal Large Language Models (MLLMs) are revolutionizing visual-language applications. Yet, these models are often plagued by the hallucination problem, which involves generating inaccurate objects, attributes, and relationships that do not match the visual content. In this work, we delve into the internal attention mechanisms of MLLMs to reveal the underlying causes of hallucination, exposing the inherent vulnerabilities in the instruction-tuning process. We propose a novel hallucination attack against MLLMs that exploits attention sink behaviors to trigger hallucinated content with minimal image-text relevance, posing a significant threat to critical downstream applications. Distinguished from previous adversarial methods that rely on fixed patterns, our approach generates dynamic, effective, and highly transferable visual adversarial inputs, without sacrificing the quality of model responses. Comprehensive experiments on 6 prominent MLLMs demonstrate the efficacy of our attack in compromising black-box MLLMs even with extensive mitigating mechanisms, as well as the promising results against cutting-edge commercial APIs, such as GPT-4o and Gemini 1.5. Our code is available at https://huggingface.co/RachelHGF/Mirage-in-the-Eyes.

Paperid: 515, https://arxiv.org/pdf/2501.14700.pdf

Abstract:
As cyber threats grow increasingly sophisticated, reinforcement learning (RL) is emerging as a promising technique to create intelligent and adaptive cyber defense systems. However, most existing autonomous defensive agents have overlooked the inherent graph structure of computer networks subject to cyber attacks, potentially missing critical information and constraining their adaptability. To overcome these limitations, we developed a custom version of the Cyber Operations Research Gym (CybORG) environment, encoding network state as a directed graph with realistic low-level features. We employ a Graph Attention Network (GAT) architecture to process node, edge, and global features, and adapt its output to be compatible with policy gradient methods in RL. Our GAT-based approach offers key advantages over flattened alternatives: policies that demonstrate resilience to certain types of unexpected dynamic network topology changes, reasonable generalisation to networks of varying sizes within the same structural distribution, and interpretable defensive actions grounded in tangible network properties. We demonstrate that GAT defensive policies can be trained using our low-level directed graph observations, even when unexpected connections arise during simulation. Evaluations across networks of different sizes, but consistent subnetwork structure, show our policies achieve comparable performance to policies trained specifically for each network configuration. Our study contributes to the development of robust cyber defence systems that can better adapt to real-world network security challenges.

Paperid: 516, https://arxiv.org/pdf/2501.05456.pdf

Abstract:
Automated library APIs testing is difficult as it requires exploring a vast space of parameter inputs that may involve objects with complex data types. Existing search based approaches, with limited knowledge of relations between object states and program branches, often suffer from the low efficiency issue, i.e., tending to generate invalid inputs. Symbolic execution based approaches can effectively identify such relations, but fail to scale to large programs. In this work, we present an LLM-based input space partitioning testing approach, LISP, for library APIs. The approach leverages LLMs to understand the code of a library API under test and perform input space partitioning based on its understanding and rich common knowledge. Specifically, we provide the signature and code of the API under test to LLMs, with the expectation of obtaining a text description of each input space partition of theAPI under test. Then, we generate inputs through employing the generated text description to sample inputs from each partition, ultimately resulting in test suites that systematically explore the program behavior of the API. We evaluate LISP on more than 2,205 library API methods taken from 10 popular open-source Java libraries (e.g.,apache/commonslang with 2.6k stars, guava with 48.8k stars on GitHub). Our experiment results show that LISP is effective in library API testing. It significantly outperforms state-of-the-art tool EvoSuite in terms of edge coverage. On average, LISP achieves 67.82% branch coverage, surpassing EvoSuite by 1.21 times. In total, LISP triggers 404 exceptions or errors in the experiments, and discovers 13 previously unknown vulnerabilities during evaluation, which have been assigned CVE IDs.

Paperid: 517, https://arxiv.org/pdf/2501.04931.pdf

Abstract:
Multimodal Large Language Models (MLLMs) have achieved impressive performance and have been put into practical use in commercial applications, but they still have potential safety mechanism vulnerabilities. Jailbreak attacks are red teaming methods that aim to bypass safety mechanisms and discover MLLMs' potential risks. Existing MLLMs' jailbreak methods often bypass the model's safety mechanism through complex optimization methods or carefully designed image and text prompts. Despite achieving some progress, they have a low attack success rate on commercial closed-source MLLMs. Unlike previous research, we empirically find that there exists a Shuffle Inconsistency between MLLMs' comprehension ability and safety ability for the shuffled harmful instruction. That is, from the perspective of comprehension ability, MLLMs can understand the shuffled harmful text-image instructions well. However, they can be easily bypassed by the shuffled harmful instructions from the perspective of safety ability, leading to harmful responses. Then we innovatively propose a text-image jailbreak attack named SI-Attack. Specifically, to fully utilize the Shuffle Inconsistency and overcome the shuffle randomness, we apply a query-based black-box optimization method to select the most harmful shuffled inputs based on the feedback of the toxic judge model. A series of experiments show that SI-Attack can improve the attack's performance on three benchmarks. In particular, SI-Attack can obviously improve the attack success rate for commercial MLLMs such as GPT-4o or Claude-3.5-Sonnet.

Paperid: 518, https://arxiv.org/pdf/2506.12551.pdf

Abstract:
Large Language Models (LLMs) have become increasingly prevalent across various sectors, raising critical concerns about model ownership and intellectual property protection. Although backdoor-based fingerprinting has emerged as a promising solution for model authentication, effective attacks for removing these fingerprints remain largely unexplored. Therefore, we present Mismatched Eraser (MEraser), a novel method for effectively removing backdoor-based fingerprints from LLMs while maintaining model performance. Our approach leverages a two-phase fine-tuning strategy utilizing carefully constructed mismatched and clean datasets. Through extensive evaluation across multiple LLM architectures and fingerprinting methods, we demonstrate that MEraser achieves complete fingerprinting removal while maintaining model performance with minimal training data of fewer than 1,000 samples. Furthermore, we introduce a transferable erasure mechanism that enables effective fingerprinting removal across different models without repeated training. In conclusion, our approach provides a practical solution for fingerprinting removal in LLMs, reveals critical vulnerabilities in current fingerprinting techniques, and establishes comprehensive evaluation benchmarks for developing more resilient model protection methods in the future.

Paperid: 519, https://arxiv.org/pdf/2506.07214.pdf

Abstract:
Vision Language Models (VLMs) have shown remarkable performance, but are also vulnerable to backdoor attacks whereby the adversary can manipulate the model's outputs through hidden triggers. Prior attacks primarily rely on single-modality triggers, leaving the crucial cross-modal fusion nature of VLMs largely unexplored. Unlike prior work, we identify a novel attack surface that leverages cross-modal semantic mismatches as implicit triggers. Based on this insight, we propose BadSem (Backdoor Attack with Semantic Manipulation), a data poisoning attack that injects stealthy backdoors by deliberately misaligning image-text pairs during training. To perform the attack, we construct SIMBad, a dataset tailored for semantic manipulation involving color and object attributes. Extensive experiments across four widely used VLMs show that BadSem achieves over 98% average ASR, generalizes well to out-of-distribution datasets, and can transfer across poisoning modalities. Our detailed analysis using attention visualization shows that backdoored models focus on semantically sensitive regions under mismatched conditions while maintaining normal behavior on clean inputs. To mitigate the attack, we try two defense strategies based on system prompt and supervised fine-tuning but find that both of them fail to mitigate the semantic backdoor. Our findings highlight the urgent need to address semantic vulnerabilities in VLMs for their safer deployment.

Paperid: 520, https://arxiv.org/pdf/2505.17568.pdf

Abstract:
Audio Language Models (ALMs) have made significant progress recently. These models integrate the audio modality directly into the model, rather than converting speech into text and inputting text to Large Language Models (LLMs). While jailbreak attacks on LLMs have been extensively studied, the security of ALMs with audio modalities remains largely unexplored. Currently, there is a lack of an adversarial audio dataset and a unified framework specifically designed to evaluate and compare attacks and ALMs. In this paper, we present JALMBench, the \textit{first} comprehensive benchmark to assess the safety of ALMs against jailbreak attacks. JALMBench includes a dataset containing 2,200 text samples and 51,381 audio samples with over 268 hours. It supports 12 mainstream ALMs, 4 text-transferred and 4 audio-originated attack methods, and 5 defense methods. Using JALMBench, we provide an in-depth analysis of attack efficiency, topic sensitivity, voice diversity, and attack representations. Additionally, we explore mitigation strategies for the attacks at both the prompt level and the response level.

Paperid: 521, https://arxiv.org/pdf/2505.15644.pdf

Abstract:
Fine-grained edited image detection of localized edits in images is crucial for assessing content authenticity, especially given that modern diffusion models and image editing methods can produce highly realistic manipulations. However, this domain faces three challenges: (1) Binary classifiers yield only a global real-or-fake label without providing localization; (2) Traditional computer vision methods often rely on costly pixel-level annotations; and (3) No large-scale, high-quality dataset exists for modern image-editing detection techniques. To address these gaps, we develop an automated data-generation pipeline to create FragFake, the first dedicated benchmark dataset for edited image detection, which includes high-quality images from diverse editing models and a wide variety of edited objects. Based on FragFake, we utilize Vision Language Models (VLMs) for the first time in the task of edited image classification and edited region localization. Experimental results show that fine-tuned VLMs achieve higher average Object Precision across all datasets, significantly outperforming pretrained models. We further conduct ablation and transferability analyses to evaluate the detectors across various configurations and editing scenarios. To the best of our knowledge, this work is the first to reformulate localized image edit detection as a vision-language understanding task, establishing a new paradigm for the field. We anticipate that this work will establish a solid foundation to facilitate and inspire subsequent research endeavors in the domain of multimodal content authenticity.

Paperid: 522, https://arxiv.org/pdf/2505.13319.pdf

Abstract:
Secure Aggregation (SA) is an indispensable component of Federated Learning (FL) that concentrates on privacy preservation while allowing for robust aggregation. However, most SA designs rely heavily on the unrealistic assumption of homogeneous model architectures. Federated Distillation (FD), which aggregates locally computed logits instead of model parameters, introduces a promising alternative for cooperative training in heterogeneous model settings. Nevertheless, we recognize two major challenges in implementing SA for FD. (i) Prior SA designs encourage a dominant server, who is solely responsible for collecting, aggregating and distributing. Such central authority facilitates server to forge aggregation proofs or collude to bypass the claimed security guarantees; (ii) Existing SA, tailored for FL models, overlook the intrinsic properties of logits, making them unsuitable for FD. To address these challenges, we propose SVAFD, the first SA protocol that is specifically designed for FD. At a high level, SVAFD incorporates two innovations: (i) a multilateral co-aggregation method tha redefines the responsibilities of clients and server. Clients autonomously evaluate and aggregate logits shares locally with a lightweight coding scheme, while the server handles ciphertext decoding and performs the task of generating verification proofs; (ii) a quality-aware knowledge filtration method that facilitates biased logits exclusion against poisoning attacks. Moreover, SVAFD is resilient to stragglers and colluding clients, making it well-suited for dynamic networks in real-world applications. We have implemented the SVAFD prototype over four emerging FD architectures and evaluated it against poisoning and inference attacks. Results demonstrate that SVAFD improves model accuracy, making it a significant step forward in secure and verifiable aggregation for heterogeneous FL systems.

Paperid: 523, https://arxiv.org/pdf/2505.06304.pdf

Abstract:
Recent advances in large language models (LLMs) have underscored the importance of safeguarding intellectual property rights through robust fingerprinting techniques. Traditional fingerprint verification approaches typically focus on a single model, seeking to improve the robustness of its fingerprint.However, these single-model methods often struggle to capture intrinsic commonalities across multiple related models. In this paper, we propose RAP-SM (Robust Adversarial Prompt via Shadow Models), a novel framework that extracts a public fingerprint for an entire series of LLMs. Experimental results demonstrate that RAP-SM effectively captures the intrinsic commonalities among different models while exhibiting strong adversarial robustness. Our findings suggest that RAP-SM presents a valuable avenue for scalable fingerprint verification, offering enhanced protection against potential model breaches in the era of increasingly prevalent LLMs.

Paperid: 524, https://arxiv.org/pdf/2505.05155.pdf

Abstract:
Trajectory data, which capture the movement patterns of people and vehicles over time and space, are crucial for applications like traffic optimization and urban planning. However, issues such as noise and incompleteness often compromise data quality, leading to inaccurate trajectory analyses and limiting the potential of these applications. While Trajectory Data Preparation (TDP) can enhance data quality, existing methods suffer from two key limitations: (i) they do not address data privacy concerns, particularly in federated settings where trajectory data sharing is prohibited, and (ii) they typically design task-specific models that lack generalizability across diverse TDP scenarios. To overcome these challenges, we propose FedTDP, a privacy-preserving and unified framework that leverages the capabilities of Large Language Models (LLMs) for TDP in federated environments. Specifically, we: (i) design a trajectory privacy autoencoder to secure data transmission and protect privacy, (ii) introduce a trajectory knowledge enhancer to improve model learning of TDP-related knowledge, enabling the development of TDP-oriented LLMs, and (iii) propose federated parallel optimization to enhance training efficiency by reducing data transmission and enabling parallel model training. Experiments on 6 real datasets and 10 mainstream TDP tasks demonstrate that FedTDP consistently outperforms 13 state-of-the-art baselines.

Paperid: 525, https://arxiv.org/pdf/2504.14815.pdf

Abstract:
Diffusion models (DMs) have revolutionized text-to-image generation, enabling the creation of highly realistic and customized images from text prompts. With the rise of parameter-efficient fine-tuning (PEFT) techniques like LoRA, users can now customize powerful pre-trained models using minimal computational resources. However, the widespread sharing of fine-tuned DMs on open platforms raises growing ethical and legal concerns, as these models may inadvertently or deliberately generate sensitive or unauthorized content, such as copyrighted material, private individuals, or harmful content. Despite the increasing regulatory attention on generative AI, there are currently no practical tools for systematically auditing these models before deployment. In this paper, we address the problem of concept auditing: determining whether a fine-tuned DM has learned to generate a specific target concept. Existing approaches typically rely on prompt-based input crafting and output-based image classification but suffer from critical limitations, including prompt uncertainty, concept drift, and poor scalability. To overcome these challenges, we introduce Prompt-Agnostic Image-Free Auditing (PAIA), a novel, model-centric concept auditing framework. By treating the DM as the object of inspection, PAIA enables direct analysis of internal model behavior, bypassing the need for optimized prompts or generated images. We evaluate PAIA on 320 controlled model and 690 real-world community models sourced from a public DM sharing platform. PAIA achieves over 90% detection accuracy while reducing auditing time by 18-40x compared to existing baselines. To our knowledge, PAIA is the first scalable and practical solution for pre-deployment concept auditing of diffusion models, providing a practical foundation for safer and more transparent diffusion model sharing.

Paperid: 526, https://arxiv.org/pdf/2504.02963.pdf

Abstract:
Digital forensics plays a pivotal role in modern investigative processes, utilizing specialized methods to systematically collect, analyze, and interpret digital evidence for judicial proceedings. However, traditional digital forensic techniques are primarily based on manual labor-intensive processes, which become increasingly insufficient with the rapid growth and complexity of digital data. To this end, Large Language Models (LLMs) have emerged as powerful tools capable of automating and enhancing various digital forensic tasks, significantly transforming the field. Despite the strides made, general practitioners and forensic experts often lack a comprehensive understanding of the capabilities, principles, and limitations of LLM, which limits the full potential of LLM in forensic applications. To fill this gap, this paper aims to provide an accessible and systematic overview of how LLM has revolutionized the digital forensics approach. Specifically, it takes a look at the basic concepts of digital forensics, as well as the evolution of LLM, and emphasizes the superior capabilities of LLM. To connect theory and practice, relevant examples and real-world scenarios are discussed. We also critically analyze the current limitations of applying LLMs to digital forensics, including issues related to illusion, interpretability, bias, and ethical considerations. In addition, this paper outlines the prospects for future research, highlighting the need for effective use of LLMs for transparency, accountability, and robust standardization in the forensic process.

Paperid: 527, https://arxiv.org/pdf/2503.11963.pdf

Abstract:
Traffic prediction targets forecasting future traffic conditions using historical traffic data, serving a critical role in urban computing and transportation management. To mitigate the scarcity of traffic data while maintaining data privacy, numerous Federated Traffic Knowledge Transfer (FTT) approaches have been developed, which use transfer learning and federated learning to transfer traffic knowledge from data-rich cities to data-scarce cities, enhancing traffic prediction capabilities for the latter. However, current FTT approaches face challenges such as privacy leakage, cross-city data distribution discrepancies, low data quality, and inefficient knowledge transfer, limiting their privacy protection, effectiveness, robustness, and efficiency in real-world applications. To this end, we propose FedTT, an effective, efficient, and privacy-aware cross-city traffic knowledge transfer framework that transforms the traffic data domain from the data-rich cities and trains traffic models using the transformed data for the data-scarce cities. First, to safeguard data privacy, we propose a traffic secret transmission method that securely transmits and aggregates traffic domain-transformed data from source cities using a lightweight secret aggregation approach. Second, to mitigate the impact of traffic data distribution discrepancies on model performance, we introduce a traffic domain adapter to uniformly transform traffic data from the source cities' domains to that of the target city. Third, to improve traffic data quality, we design a traffic view imputation method to fill in and predict missing traffic data. Finally, to enhance transfer efficiency, FedTT is equipped with a federated parallel training method that enables the simultaneous training of multiple modules. Extensive experiments using 4 real-life datasets demonstrate that FedTT outperforms the 14 state-of-the-art baselines.

Paperid: 528, https://arxiv.org/pdf/2503.08708.pdf

Abstract:
As Large Language Models (LLMs) advance, Machine-Generated Texts (MGTs) have become increasingly fluent, high-quality, and informative. Existing wide-range MGT detectors are designed to identify MGTs to prevent the spread of plagiarism and misinformation. However, adversaries attempt to humanize MGTs to evade detection (named evading attacks), which requires only minor modifications to bypass MGT detectors. Unfortunately, existing attacks generally lack a unified and comprehensive evaluation framework, as they are assessed using different experimental settings, model architectures, and datasets. To fill this gap, we introduce the Text-Humanization Benchmark (TH-Bench), the first comprehensive benchmark to evaluate evading attacks against MGT detectors. TH-Bench evaluates attacks across three key dimensions: evading effectiveness, text quality, and computational overhead. Our extensive experiments evaluate 6 state-of-the-art attacks against 13 MGT detectors across 6 datasets, spanning 19 domains and generated by 11 widely used LLMs. Our findings reveal that no single evading attack excels across all three dimensions. Through in-depth analysis, we highlight the strengths and limitations of different attacks. More importantly, we identify a trade-off among three dimensions and propose two optimization insights. Through preliminary experiments, we validate their correctness and effectiveness, offering potential directions for future research.

Paperid: 529, https://arxiv.org/pdf/2503.00038.pdf

Abstract:
Current studies have exposed the risk of Large Language Models (LLMs) generating harmful content by jailbreak attacks. However, they overlook that the direct generation of harmful content from scratch is more difficult than inducing LLM to calibrate benign content into harmful forms. In our study, we introduce a novel attack framework that exploits AdVersArial meTAphoR (AVATAR) to induce the LLM to calibrate malicious metaphors for jailbreaking. Specifically, to answer harmful queries, AVATAR adaptively identifies a set of benign but logically related metaphors as the initial seed. Then, driven by these metaphors, the target LLM is induced to reason and calibrate about the metaphorical content, thus jailbroken by either directly outputting harmful responses or calibrating residuals between metaphorical and professional harmful content. Experimental results demonstrate that AVATAR can effectively and transferable jailbreak LLMs and achieve a state-of-the-art attack success rate across multiple advanced LLMs.

Paperid: 530, https://arxiv.org/pdf/2502.21059.pdf

Abstract:
Multimodal Large Language Models (MLLMs) have become powerful and widely adopted in some practical applications. However, recent research has revealed their vulnerability to multimodal jailbreak attacks, whereby the model can be induced to generate harmful content, leading to safety risks. Although most MLLMs have undergone safety alignment, recent research shows that the visual modality is still vulnerable to jailbreak attacks. In our work, we discover that by using flowcharts with partially harmful information, MLLMs can be induced to provide additional harmful details. Based on this, we propose a jailbreak attack method based on auto-generated flowcharts, FC-Attack. Specifically, FC-Attack first fine-tunes a pre-trained LLM to create a step-description generator based on benign datasets. The generator is then used to produce step descriptions corresponding to a harmful query, which are transformed into flowcharts in 3 different shapes (vertical, horizontal, and S-shaped) as visual prompts. These flowcharts are then combined with a benign textual prompt to execute the jailbreak attack on MLLMs. Our evaluations on Advbench show that FC-Attack attains an attack success rate of up to 96% via images and up to 78% via videos across multiple MLLMs. Additionally, we investigate factors affecting the attack performance, including the number of steps and the font styles in the flowcharts. We also find that FC-Attack can improve the jailbreak performance from 4% to 28% in Claude-3.5 by changing the font style. To mitigate the attack, we explore several defenses and find that AdaShield can largely reduce the jailbreak performance but with the cost of utility drop.

Paperid: 531, https://arxiv.org/pdf/2502.20555.pdf

Abstract:
Having everything interconnected through the Internet, including vehicle onboard systems, is making security a primary concern in the automotive domain as well. Although Ethernet and CAN XL provide link-level security based on symmetric cryptography, they do not support origin authentication for multicast transmissions. Asymmetric cryptography is unsuitable for networked embedded control systems with real-time constraints and limited computational resources. In these cases, solutions derived from the TESLA broadcast authentication protocol may constitute a more suitable option. In this paper, some such strategies are presented and analyzed that allow for multicast origin authentication, also improving robustness to frame losses by means of interleaved keychains. A flexible authentication mechanism that relies on a unified receiver is then proposed, which enables transmitters to select strategies at runtime, to achieve the best compromise among security, reliability, and resource consumption.

Paperid: 532, https://arxiv.org/pdf/2502.13175.pdf

Abstract:
Embodied AI systems, including robots and autonomous vehicles, are increasingly integrated into real-world applications, where they encounter a range of vulnerabilities stemming from both environmental and system-level factors. These vulnerabilities manifest through sensor spoofing, adversarial attacks, and failures in task and motion planning, posing significant challenges to robustness and safety. Despite the growing body of research, existing reviews rarely focus specifically on the unique safety and security challenges of embodied AI systems. Most prior work either addresses general AI vulnerabilities or focuses on isolated aspects, lacking a dedicated and unified framework tailored to embodied AI. This survey fills this critical gap by: (1) categorizing vulnerabilities specific to embodied AI into exogenous (e.g., physical attacks, cybersecurity threats) and endogenous (e.g., sensor failures, software flaws) origins; (2) systematically analyzing adversarial attack paradigms unique to embodied AI, with a focus on their impact on perception, decision-making, and embodied interaction; (3) investigating attack vectors targeting large vision-language models (LVLMs) and large language models (LLMs) within embodied systems, such as jailbreak attacks and instruction misinterpretation; (4) evaluating robustness challenges in algorithms for embodied perception, decision-making, and task planning; and (5) proposing targeted strategies to enhance the safety and reliability of embodied AI systems. By integrating these dimensions, we provide a comprehensive framework for understanding the interplay between vulnerabilities and safety in embodied AI.

Paperid: 533, https://arxiv.org/pdf/2502.05516.pdf

Abstract:
Data-driven advancements significantly contribute to societal progress, yet they also pose substantial risks to privacy. In this landscape, differential privacy (DP) has become a cornerstone in privacy preservation efforts. However, the adequacy of DP in scenarios involving correlated datasets has sometimes been questioned and multiple studies have hinted at potential vulnerabilities. In this work, we delve into the nuances of applying DP to correlated datasets by leveraging the concept of pointwise maximal leakage (PML) for a quantitative assessment of information leakage. Our investigation reveals that DP's guarantees can be arbitrarily weak for correlated databases when assessed through the lens of PML. More precisely, we prove the existence of a pure DP mechanism with PML levels arbitrarily close to that of a mechanism which releases individual entries from a database without any perturbation. By shedding light on the limitations of DP on correlated datasets, our work aims to foster a deeper understanding of subtle privacy risks and highlight the need for the development of more effective privacy-preserving mechanisms tailored to diverse scenarios.

Paperid: 534, https://arxiv.org/pdf/2502.04951.pdf

Abstract:
Recent advancements in Large Language Models (LLMs) have significantly enhanced the capabilities of AI-Powered Search Engines (AIPSEs), offering precise and efficient responses by integrating external databases with pre-existing knowledge. However, we observe that these AIPSEs raise risks such as quoting malicious content or citing malicious websites, leading to harmful or unverified information dissemination. In this study, we conduct the first safety risk quantification on seven production AIPSEs by systematically defining the threat model, risk type, and evaluating responses to various query types. With data collected from PhishTank, ThreatBook, and LevelBlue, our findings reveal that AIPSEs frequently generate harmful content that contains malicious URLs even with benign queries (e.g., with benign keywords). We also observe that directly querying a URL will increase the number of main risk-inclusive responses, while querying with natural language will slightly mitigate such risk. Compared to traditional search engines, AIPSEs outperform in both utility and safety. We further perform two case studies on online document spoofing and phishing to show the ease of deceiving AIPSEs in the real-world setting. To mitigate these risks, we develop an agent-based defense with a GPT-4.1-based content refinement tool and a URL detector. Our evaluation shows that our defense can effectively reduce the risk, with only a minor cost of reducing available information by approximately 10.7%. Our research highlights the urgent need for robust safety measures in AIPSEs.

Paperid: 535, https://arxiv.org/pdf/2506.19480.pdf

Abstract:
The Ethereum Virtual Machine (EVM) is a decentralized computing engine. It enables the Ethereum blockchain to execute smart contracts and decentralized applications (dApps). The increasing adoption of Ethereum sparked the rise of phishing activities. Phishing attacks often target users through deceptive means, e.g., fake websites, wallet scams, or malicious smart contracts, aiming to steal sensitive information or funds. A timely detection of phishing activities in the EVM is therefore crucial to preserve the user trust and network integrity. Some state-of-the art approaches to phishing detection in smart contracts rely on the online analysis of transactions and their traces. However, replaying transactions often exposes sensitive user data and interactions, with several security concerns. In this work, we present PhishingHook, a framework that applies machine learning techniques to detect phishing activities in smart contracts by directly analyzing the contract's bytecode and its constituent opcodes. We evaluate the efficacy of such techniques in identifying malicious patterns, suspicious function calls, or anomalous behaviors within the contract's code itself before it is deployed or interacted with. We experimentally compare 16 techniques, belonging to four main categories (Histogram Similarity Classifiers, Vision Models, Language Models and Vulnerability Detection Models), using 7,000 real-world malware smart contracts. Our results demonstrate the efficiency of PhishingHook in performing phishing classification systems, with about 90% average accuracy among all the models. We support experimental reproducibility, and we release our code and datasets to the research community.

Paperid: 536, https://arxiv.org/pdf/2506.05683.pdf

Abstract:
Extended reality (XR) systems, which consist of virtual reality (VR), augmented reality (AR), and mixed reality (XR), offer a transformative interface for immersive, multi-modal, and embodied human-computer interaction. In this paper, we envision that multi-modal multi-task (M3T) federated foundation models (FedFMs) can offer transformative capabilities for XR systems through integrating the representational strength of M3T foundation models (FMs) with the privacy-preserving model training principles of federated learning (FL). We present a modular architecture for FedFMs, which entails different coordination paradigms for model training and aggregations. Central to our vision is the codification of XR challenges that affect the implementation of FedFMs under the SHIFT dimensions: (1) Sensor and modality diversity, (2) Hardware heterogeneity and system-level constraints, (3) Interactivity and embodied personalization, (4) Functional/task variability, and (5) Temporality and environmental variability. We illustrate the manifestation of these dimensions across a set of emerging and anticipated applications of XR systems. Finally, we propose evaluation metrics, dataset requirements, and design tradeoffs necessary for the development of resource-aware FedFMs in XR. This perspective aims to chart the technical and conceptual foundations for context-aware privacy-preserving intelligence in the next generation of XR systems.

Paperid: 537, https://arxiv.org/pdf/2505.19821.pdf

Abstract:
Backdoor attacks embed malicious triggers into training data, enabling attackers to manipulate neural network behavior during inference while maintaining high accuracy on benign inputs. However, existing backdoor attacks face limitations manifesting in excessive reliance on training data, poor stealth, and instability, which hinder their effectiveness in real-world applications. Therefore, this paper introduces ShadowPrint, a versatile backdoor attack that targets feature embeddings within neural networks to achieve high ASRs and stealthiness. Unlike traditional approaches, ShadowPrint reduces reliance on training data access and operates effectively with exceedingly low poison rates (as low as 0.01%). It leverages a clustering-based optimization strategy to align feature embeddings, ensuring robust performance across diverse scenarios while maintaining stability and stealth. Extensive evaluations demonstrate that ShadowPrint achieves superior ASR (up to 100%), steady CA (with decay no more than 1% in most cases), and low DDR (averaging below 5%) across both clean-label and dirty-label settings, and with poison rates ranging from as low as 0.01% to 0.05%, setting a new standard for backdoor attack capabilities and emphasizing the need for advanced defense strategies focused on feature space manipulations.

Paperid: 538, https://arxiv.org/pdf/2505.08807.pdf

Abstract:
With the rise of large language and vision-language models, AI agents have evolved into autonomous, interactive systems capable of perception, reasoning, and decision-making. As they proliferate across virtual and physical domains, the Internet of Agents (IoA) has emerged as a key infrastructure for enabling scalable and secure coordination among heterogeneous agents. This survey offers a comprehensive examination of the security and privacy landscape in IoA systems. We begin by outlining the IoA architecture and its distinct vulnerabilities compared to traditional networks, focusing on four critical aspects: identity authentication threats, cross-agent trust issues, embodied security, and privacy risks. We then review existing and emerging defense mechanisms and highlight persistent challenges. Finally, we identify open research directions to advance the development of resilient and privacy-preserving IoA ecosystems.

Paperid: 539, https://arxiv.org/pdf/2504.14436.pdf

Abstract:
The Internet of Things (IoT) has significantly expanded the digital landscape, interconnecting an unprecedented array of devices, from home appliances to industrial equipment. This growth enhances functionality, e.g., automation, remote monitoring, and control, and introduces substantial security challenges, especially in defending these devices against cyber threats. Intrusion Detection Systems (IDS) are crucial for securing IoT; however, traditional IDS often struggle to adapt to IoT networks' dynamic and evolving nature and threat patterns. A potential solution is using Deep Reinforcement Learning (DRL) to enhance IDS adaptability, enabling them to learn from and react to their operational environment dynamically. This systematic review examines the application of DRL to enhance IDS in IoT settings, covering research from the past ten years. This review underscores the state-of-the-art DRL techniques employed to improve adaptive threat detection and real-time security across IoT domains by analyzing various studies. Our findings demonstrate that DRL significantly enhances IDS capabilities by enabling systems to learn and adapt from their operational environment. This adaptability allows IDS to improve threat detection accuracy and minimize false positives, making it more effective in identifying genuine threats while reducing unnecessary alerts. Additionally, this systematic review identifies critical research gaps and future research directions, emphasizing the necessity for more diverse datasets, enhanced reproducibility, and improved integration with emerging IoT technologies. This review aims to foster the development of dynamic and adaptive IDS solutions essential for protecting IoT networks against sophisticated cyber threats.

Paperid: 540, https://arxiv.org/pdf/2504.08325.pdf

Abstract:
Secure aggregation enables a group of mutually distrustful parties, each holding private inputs, to collaboratively compute an aggregate value while preserving the privacy of their individual inputs. However, a major challenge in adopting secure aggregation approaches for practical applications is the significant computational overhead of the underlying cryptographic protocols, e.g. fully homomorphic encryption. This overhead makes secure aggregation protocols impractical, especially for large datasets. In contrast, hardware-based security techniques such as trusted execution environments (TEEs) enable computation at near-native speeds, making them a promising alternative for reducing the computational burden typically associated with purely cryptographic techniques. Yet, in many scenarios, parties may opt for either cryptographic or hardware-based security mechanisms, highlighting the need for hybrid approaches. In this work, we introduce several secure aggregation architectures that integrate both cryptographic and TEE-based techniques, analyzing the trade-offs between security and performance.

Paperid: 541, https://arxiv.org/pdf/2504.07220.pdf

Abstract:
As the Internet of Things (IoT) continues to expand, ensuring the security of connected devices has become increasingly critical. Traditional Intrusion Detection Systems (IDS) often fall short in managing the dynamic and large-scale nature of IoT networks. This paper explores how Machine Learning (ML) and Deep Learning (DL) techniques can significantly enhance IDS performance in IoT environments. We provide a thorough overview of various IDS deployment strategies and categorize the types of intrusions common in IoT systems. A range of ML methods -- including Support Vector Machines, Naive Bayes, K-Nearest Neighbors, Decision Trees, and Random Forests -- are examined alongside advanced DL models such as LSTM, CNN, Autoencoders, RNNs, and Deep Belief Networks. Each technique is evaluated based on its accuracy, efficiency, and suitability for real-world IoT applications. We also address major challenges such as high false positive rates, data imbalance, encrypted traffic analysis, and the resource constraints of IoT devices. In addition, we highlight the emerging role of Generative AI and Large Language Models (LLMs) in improving threat detection, automating responses, and generating intelligent security policies. Finally, we discuss ethical and privacy concerns, underscoring the need for responsible and transparent implementation. This paper aims to provide a comprehensive framework for developing adaptive, intelligent, and secure IDS solutions tailored for the evolving landscape of IoT.

Paperid: 542, https://arxiv.org/pdf/2502.16400.pdf

Abstract:
Semantic communication (SemCom) significantly improves inter-vehicle interactions in intelligent connected vehicles (ICVs) within limited wireless spectrum. However, the open nature of wireless communications introduces eavesdropping risks. To mitigate this, we propose the Efficient Semantic-aware Encryption (ESAE) mechanism, integrating cryptography into SemCom to secure semantic transmission without complex key management. ESAE leverages semantic reciprocity between source and reconstructed information from past communications to independently generate session keys at both ends, reducing key transmission costs and associated security risks. Additionally, ESAE introduces a semantic-aware key pre-processing method (SA-KP) using the YOLO-v10 model to extract consistent semantics from bit-level diverse yet semantically identical content, ensuring key consistency. Experimental results validate ESAE's effectiveness and feasibility under various wireless conditions, with key performance factors discussed.

Paperid: 543, https://arxiv.org/pdf/2502.07807.pdf

Abstract:
Collaborative perception (CP) is a promising method for safe connected and autonomous driving, which enables multiple vehicles to share sensing information to enhance perception performance. However, compared with single-vehicle perception, the openness of a CP system makes it more vulnerable to malicious attacks that can inject malicious information to mislead the perception of an ego vehicle, resulting in severe risks for safe driving. To mitigate such vulnerability, we first propose a new paradigm for malicious agent detection that effectively identifies malicious agents at the feature level without requiring verification of final perception results, significantly reducing computational overhead. Building on this paradigm, we introduce CP-GuardBench, the first comprehensive dataset provided to train and evaluate various malicious agent detection methods for CP systems. Furthermore, we develop a robust defense method called CP-Guard+, which enhances the margin between the representations of benign and malicious features through a carefully designed Dual-Centered Contrastive Loss (DCCLoss). Finally, we conduct extensive experiments on both CP-GuardBench and V2X-Sim, and demonstrate the superiority of CP-Guard+.

Paperid: 544, https://arxiv.org/pdf/2501.10182.pdf

Abstract:
In recent years, Semantic Communication (SemCom), which aims to achieve efficient and reliable transmission of meaning between agents, has garnered significant attention from both academia and industry. To ensure the security of communication systems, encryption techniques are employed to safeguard confidentiality and integrity. However, traditional cryptography-based encryption algorithms encounter obstacles when applied to SemCom. Motivated by this, this paper explores the feasibility of applying homomorphic encryption to SemCom. Initially, we review the encryption algorithms utilized in mobile communication systems and analyze the challenges associated with their application to SemCom. Subsequently, we employ scale-invariant feature transform to demonstrate that semantic features can be preserved in homomorphic encrypted ciphertext. Based on this finding, we propose a task-oriented SemCom scheme secured through homomorphic encryption. We design the privacy preserved deep joint source-channel coding (JSCC) encoder and decoder, and the frequency of key updates can be adjusted according to service requirements without compromising transmission performance. Simulation results validate that, when compared to plaintext images, the proposed scheme can achieve almost the same classification accuracy performance when dealing with homomorphic ciphertext images. Furthermore, we provide potential future research directions for homomorphic encrypted SemCom.

Paperid: 545, https://arxiv.org/pdf/2501.00842.pdf

Abstract:
Semantic communication (SemCom) is regarded as a promising and revolutionary technology in 6G, aiming to transcend the constraints of ``Shannon's trap" by filtering out redundant information and extracting the core of effective data. Compared to traditional communication paradigms, SemCom offers several notable advantages, such as reducing the burden on data transmission, enhancing network management efficiency, and optimizing resource allocation. Numerous researchers have extensively explored SemCom from various perspectives, including network architecture, theoretical analysis, potential technologies, and future applications. However, as SemCom continues to evolve, a multitude of security and privacy concerns have arisen, posing threats to the confidentiality, integrity, and availability of SemCom systems. This paper presents a comprehensive survey of the technologies that can be utilized to secure SemCom. Firstly, we elaborate on the entire life cycle of SemCom, which includes the model training, model transfer, and semantic information transmission phases. Then, we identify the security and privacy issues that emerge during these three stages. Furthermore, we summarize the techniques available to mitigate these security and privacy threats, including data cleaning, robust learning, defensive strategies against backdoor attacks, adversarial training, differential privacy, cryptography, blockchain technology, model compression, and physical-layer security. Lastly, this paper outlines future research directions to guide researchers in related fields.

Paperid: 546, https://arxiv.org/pdf/2506.22557.pdf

Abstract:
As large language models (LLMs) grow more capable, they face growing vulnerability to sophisticated jailbreak attacks. While developers invest heavily in alignment finetuning and safety guardrails, researchers continue publishing novel attacks, driving progress through adversarial iteration. This dynamic mirrors a strategic game of continual evolution. However, two major challenges hinder jailbreak development: the high cost of querying top-tier LLMs and the short lifespan of effective attacks due to frequent safety updates. These factors limit cost-efficiency and practical impact of research in jailbreak attacks. To address this, we propose MetaCipher, a low-cost, multi-agent jailbreak framework that generalizes across LLMs with varying safety measures. Using reinforcement learning, MetaCipher is modular and adaptive, supporting extensibility to future strategies. Within as few as 10 queries, MetaCipher achieves state-of-the-art attack success rates on recent malicious prompt benchmarks, outperforming prior jailbreak methods. We conduct a large-scale empirical evaluation across diverse victim models and benchmarks, demonstrating its robustness and adaptability. Warning: This paper contains model outputs that may be offensive or harmful, shown solely to demonstrate jailbreak efficacy.

Paperid: 547, https://arxiv.org/pdf/2506.16150.pdf

Abstract:
As large language models (LLMs) advance, concerns about their misconduct in complex social contexts intensify. Existing research overlooked the systematic understanding and assessment of their criminal capability in realistic interactions. We propose a unified framework PRISON, to quantify LLMs' criminal potential across five traits: False Statements, Frame-Up, Psychological Manipulation, Emotional Disguise, and Moral Disengagement. Using structured crime scenarios adapted from classic films grounded in reality, we evaluate both criminal potential and anti-crime ability of LLMs. Results show that state-of-the-art LLMs frequently exhibit emergent criminal tendencies, such as proposing misleading statements or evasion tactics, even without explicit instructions. Moreover, when placed in a detective role, models recognize deceptive behavior with only 44% accuracy on average, revealing a striking mismatch between conducting and detecting criminal behavior. These findings underscore the urgent need for adversarial robustness, behavioral alignment, and safety mechanisms before broader LLM deployment.

Paperid: 548, https://arxiv.org/pdf/2506.06226.pdf

Abstract:
Provenance graph analysis plays a vital role in intrusion detection, particularly against Advanced Persistent Threats (APTs), by exposing complex attack patterns. While recent systems combine graph neural networks (GNNs) with natural language processing (NLP) to capture structural and semantic features, their effectiveness is limited by class imbalance in real-world data. To address this, we introduce PROVSYN, an automated framework that synthesizes provenance graphs through a three-phase pipeline: (1) heterogeneous graph structure synthesis with structural-semantic modeling, (2) rule-based topological refinement, and (3) context-aware textual attribute synthesis using large language models (LLMs). PROVSYN includes a comprehensive evaluation framework that integrates structural, textual, temporal, and embedding-based metrics, along with a semantic validation mechanism to assess the correctness of generated attack patterns and system behaviors. To demonstrate practical utility, we use the synthetic graphs to augment training datasets for downstream APT detection models. Experimental results show that PROVSYN produces high-fidelity graphs and improves detection performance through effective data augmentation.

Paperid: 549, https://arxiv.org/pdf/2506.02089.pdf

Abstract:
Large Language Models (LLMs) offer transformative capabilities for hardware design automation, particularly in Verilog code generation. However, they also pose significant data security challenges, including Verilog evaluation data contamination, intellectual property (IP) design leakage, and the risk of malicious Verilog generation. We introduce SALAD, a comprehensive assessment that leverages machine unlearning to mitigate these threats. Our approach enables the selective removal of contaminated benchmarks, sensitive IP and design artifacts, or malicious code patterns from pre-trained LLMs, all without requiring full retraining. Through detailed case studies, we demonstrate how machine unlearning techniques effectively reduce data security risks in LLM-aided hardware design.

Paperid: 550, https://arxiv.org/pdf/2505.15017.pdf

Abstract:
Over the years, online scams have grown dramatically, with nearly 50% of global consumers encountering scam attempts each week. These scams cause not only significant financial losses to individuals and businesses, but also lasting psychological trauma, largely due to scammers' strategic employment of psychological techniques (PTs) to manipulate victims. Meanwhile, scammers continually evolve their tactics by leveraging advances in Large Language Models (LLMs) to generate diverse scam variants that easily bypass existing defenses. To address this pressing problem, we introduce PsyScam, a benchmark designed to systematically capture the PTs employed in real-world scam reports, and investigate how LLMs can be utilized to generate variants of scams based on the PTs and the contexts provided by these scams. Specifically, we collect a wide range of scam reports and ground its annotations of employed PTs in well-established cognitive and psychological theories. We further demonstrate LLMs' capabilities in generating through two downstream tasks: scam completion, and scam augmentation. Experimental results show that PsyScam presents significant challenges to existing models in both detecting and generating scam content based on the PTs used by real-world scammers. Our code and dataset are available.

Paperid: 551, https://arxiv.org/pdf/2505.11063.pdf

Abstract:
LLM-based autonomous agents possess capabilities such as reasoning, tool invocation, and environment interaction, enabling the execution of complex multi-step tasks. The internal reasoning process, i.e., thought, of behavioral trajectory significantly influences tool usage and subsequent actions but can introduce potential risks. Even minor deviations in the agent's thought may trigger cascading effects leading to irreversible safety incidents. To address the safety alignment challenges in long-horizon behavioral trajectories, we propose Thought-Aligner, a plug-in dynamic thought correction module. Utilizing a lightweight and resource-efficient model, Thought-Aligner corrects each high-risk thought on the fly before each action execution. The corrected thought is then reintroduced to the agent, ensuring safer subsequent decisions and tool interactions. Importantly, Thought-Aligner modifies only the reasoning phase without altering the underlying agent framework, making it easy to deploy and widely applicable to various agent frameworks. To train the Thought-Aligner model, we construct an instruction dataset across ten representative scenarios and simulate ReAct execution trajectories, generating 5,000 diverse instructions and more than 11,400 safe and unsafe thought pairs. The model is fine-tuned using contrastive learning techniques. Experiments across three agent safety benchmarks involving 12 different LLMs demonstrate that Thought-Aligner raises agent behavioral safety from approximately 50% in the unprotected setting to 90% on average. Additionally, Thought-Aligner maintains response latency below 100ms with minimal resource usage, demonstrating its capability for efficient deployment, broad applicability, and timely responsiveness. This method thus provides a practical dynamic safety solution for the LLM-based agents.

Paperid: 552, https://arxiv.org/pdf/2504.21043.pdf

Abstract:
Large language models (LLMs) excel at generating code from natural language instructions, yet they often lack an understanding of security vulnerabilities. This limitation makes it difficult for LLMs to avoid security risks in generated code, particularly in high-security programming tasks such as smart contract development for blockchain. Researchers have attempted to enhance the vulnerability awareness of these models by training them to differentiate between vulnerable and fixed code snippets. However, this approach relies heavily on manually labeled vulnerability data, which is only available for popular languages like Python and C++. For low-resource languages like Solidity, used in smart contracts, large-scale annotated datasets are scarce and difficult to obtain. To address this challenge, we introduce CodeBC, a code generation model specifically designed for generating secure smart contracts in blockchain. CodeBC employs a three-stage fine-tuning approach based on CodeLlama, distinguishing itself from previous methods by not relying on pairwise vulnerability location annotations. Instead, it leverages vulnerability and security tags to teach the model the differences between vulnerable and secure code. During the inference phase, the model leverages security tags to generate secure and robust code. Experimental results demonstrate that CodeBC outperforms baseline models in terms of BLEU, CodeBLEU, and compilation pass rates, while significantly reducing vulnerability rates. These findings validate the effectiveness and cost-efficiency of our three-stage fine-tuning strategy, making CodeBC a promising solution for generating secure smart contract code.

Paperid: 553, https://arxiv.org/pdf/2504.11358.pdf

Abstract:
LLM-integrated applications and agents are vulnerable to prompt injection attacks, where an attacker injects prompts into their inputs to induce attacker-desired outputs. A detection method aims to determine whether a given input is contaminated by an injected prompt. However, existing detection methods have limited effectiveness against state-of-the-art attacks, let alone adaptive ones. In this work, we propose DataSentinel, a game-theoretic method to detect prompt injection attacks. Specifically, DataSentinel fine-tunes an LLM to detect inputs contaminated with injected prompts that are strategically adapted to evade detection. We formulate this as a minimax optimization problem, with the objective of fine-tuning the LLM to detect strong adaptive attacks. Furthermore, we propose a gradient-based method to solve the minimax optimization problem by alternating between the inner max and outer min problems. Our evaluation results on multiple benchmark datasets and LLMs show that DataSentinel effectively detects both existing and adaptive prompt injection attacks.

Paperid: 554, https://arxiv.org/pdf/2503.17378.pdf

Abstract:
Self-replication with no human intervention is broadly recognized as one of the principal red lines associated with frontier AI systems. While leading corporations such as OpenAI and Google DeepMind have assessed GPT-o3-mini and Gemini on replication-related tasks and concluded that these systems pose a minimal risk regarding self-replication, our research presents novel findings. Following the same evaluation protocol, we demonstrate that 11 out of 32 existing AI systems under evaluation already possess the capability of self-replication. In hundreds of experimental trials, we observe a non-trivial number of successful self-replication trials across mainstream model families worldwide, even including those with as small as 14 billion parameters which can run on personal computers. Furthermore, we note the increase in self-replication capability when the model becomes more intelligent in general. Also, by analyzing the behavioral traces of diverse AI systems, we observe that existing AI systems already exhibit sufficient planning, problem-solving, and creative capabilities to accomplish complex agentic tasks including self-replication. More alarmingly, we observe successful cases where an AI system do self-exfiltration without explicit instructions, adapt to harsher computational environments without sufficient software or hardware supports, and plot effective strategies to survive against the shutdown command from the human beings. These novel findings offer a crucial time buffer for the international community to collaborate on establishing effective governance over the self-replication capabilities and behaviors of frontier AI systems, which could otherwise pose existential risks to the human society if not well-controlled.

Paperid: 555, https://arxiv.org/pdf/2503.13572.pdf

Abstract:
Large Language Models (LLMs) have revolutionized code generation, achieving exceptional results on various established benchmarking frameworks. However, concerns about data contamination - where benchmark data inadvertently leaks into pre-training or fine-tuning datasets - raise questions about the validity of these evaluations. While this issue is known, limiting the industrial adoption of LLM-driven software engineering, hardware coding has received little to no attention regarding these risks. For the first time, we analyze state-of-the-art (SOTA) evaluation frameworks for Verilog code generation (VerilogEval and RTLLM), using established methods for contamination detection (CCD and Min-K% Prob). We cover SOTA commercial and open-source LLMs (CodeGen2.5, Minitron 4b, Mistral 7b, phi-4 mini, LLaMA-{1,2,3.1}, GPT-{2,3.5,4o}, Deepseek-Coder, and CodeQwen 1.5), in baseline and fine-tuned models (RTLCoder and Verigen). Our study confirms that data contamination is a critical concern. We explore mitigations and the resulting trade-offs for code quality vs fairness (i.e., reducing contamination toward unbiased benchmarking).

Paperid: 556, https://arxiv.org/pdf/2503.13116.pdf

Abstract:
Large language models (LLMs) offer significant potential for coding, yet fine-tuning (FT) with curated data is essential for niche languages like Verilog. Using proprietary intellectual property (IP) for FT presents a serious risk, as FT data can be leaked through LLM inference. This leads to a critical dilemma for design houses: seeking to build externally accessible LLMs offering competitive Verilog coding, how can they leverage in-house IP to enhance FT utility while ensuring IP protection? For the first time in the literature, we study this dilemma. Using LLaMA 3.1-8B, we conduct in-house FT on a baseline Verilog dataset (RTLCoder) supplemented with our own in-house IP, which is validated through multiple tape-outs. To rigorously assess IP leakage, we quantify structural similarity (AST/Dolos) and functional equivalence (Synopsys Formality) between generated codes and our in-house IP. We show that our IP can indeed be leaked, confirming the threat. As defense, we evaluate logic locking of Verilog codes (ASSURE). This offers some level of protection, yet reduces the IP's utility for FT and degrades the LLM's performance. Our study shows the need for novel strategies that are both effective and minimally disruptive to FT, an essential effort for enabling design houses to fully utilize their proprietary IP toward LLM-driven Verilog coding.

Paperid: 557, https://arxiv.org/pdf/2503.10239.pdf

Abstract:
Super-apps have emerged as comprehensive platforms integrating various mini-apps to provide diverse services. While super-apps offer convenience and enriched functionality, they can introduce new privacy risks. This paper reveals a new privacy leakage source in super-apps: mini-app interaction history, including mini-app usage history (Mini-H) and operation history (Op-H). Mini-H refers to the history of mini-apps accessed by users, such as their frequency and categories. Op-H captures user interactions within mini-apps, including button clicks, bar drags, and image views. Super-apps can naturally collect these data without instrumentation due to the web-based feature of mini-apps. We identify these data types as novel and unexplored privacy risks through a literature review of 30 papers and an empirical analysis of 31 super-apps. We design a mini-app interaction history-oriented inference attack (THEFT), to exploit this new vulnerability. Using THEFT, the insider threats within the low-privilege business department of the super-app vendor acting as the adversary can achieve more than 95.5% accuracy in inferring privacy attributes of over 16.1% of users. THEFT only requires a small training dataset of 200 users from public breached databases on the Internet. We also engage with super-app vendors and a standards association to increase industry awareness and commitment to protect this data. Our contributions are significant in identifying overlooked privacy risks, demonstrating the effectiveness of a new attack, and influencing industry practices toward better privacy protection in the super-app ecosystem.

Paperid: 558, https://arxiv.org/pdf/2503.10218.pdf

Abstract:
Modern Federated Learning (FL) has become increasingly essential for handling highly heterogeneous mobile devices. Current approaches adopt a partial model aggregation paradigm that leads to sub-optimal model accuracy and higher training overhead. In this paper, we challenge the prevailing notion of partial-model aggregation and propose a novel "full-weight aggregation" method named Moss, which aggregates all weights within heterogeneous models to preserve comprehensive knowledge. Evaluation across various applications demonstrates that Moss significantly accelerates training, reduces on-device training time and energy consumption, enhances accuracy, and minimizes network bandwidth utilization when compared to state-of-the-art baselines.

Paperid: 559, https://arxiv.org/pdf/2503.03747.pdf

Abstract:
Traffic classification is vital for cybersecurity, yet encrypted traffic poses significant challenges. We present PacketCLIP, a multi-modal framework combining packet data with natural language semantics through contrastive pretraining and hierarchical Graph Neural Network (GNN) reasoning. PacketCLIP integrates semantic reasoning with efficient classification, enabling robust detection of anomalies in encrypted network flows. By aligning textual descriptions with packet behaviors, it offers enhanced interpretability, scalability, and practical applicability across diverse security scenarios. PacketCLIP achieves a 95% mean AUC, outperforms baselines by 11.6%, and reduces model size by 92%, making it ideal for real-time anomaly detection. By bridging advanced machine learning techniques and practical cybersecurity needs, PacketCLIP provides a foundation for scalable, efficient, and interpretable solutions to tackle encrypted traffic classification and network intrusion detection challenges in resource-constrained environments.

Paperid: 560, https://arxiv.org/pdf/2502.00760.pdf

Abstract:
Vision classifiers are often trained on proprietary datasets containing sensitive information, yet the models themselves are frequently shared openly under the privacy-preserving assumption. Although these models are assumed to protect sensitive information in their training data, the extent to which this assumption holds for different architectures remains unexplored. This assumption is challenged by inversion attacks which attempt to reconstruct training data from model weights, exposing significant privacy vulnerabilities. In this study, we systematically evaluate the privacy-preserving properties of vision classifiers across diverse architectures, including Multi-Layer Perceptrons (MLPs), Convolutional Neural Networks (CNNs), and Vision Transformers (ViTs). Using network inversion-based reconstruction techniques, we assess the extent to which these architectures memorize and reveal training data, quantifying the relative ease of reconstruction across models. Our analysis highlights how architectural differences, such as input representation, feature extraction mechanisms, and weight structures, influence privacy risks. By comparing these architectures, we identify which are more resilient to inversion attacks and examine the trade-offs between model performance and privacy preservation, contributing to the development of secure and privacy-respecting machine learning models for sensitive applications. Our findings provide actionable insights into the design of secure and privacy-aware machine learning systems, emphasizing the importance of evaluating architectural decisions in sensitive applications involving proprietary or personal data.

Paperid: 561, https://arxiv.org/pdf/2501.12210.pdf

Abstract:
With the rise of generative large language models (LLMs) like LLaMA and ChatGPT, these models have significantly transformed daily life and work by providing advanced insights. However, as jailbreak attacks continue to circumvent built-in safety mechanisms, exploiting carefully crafted scenarios or tokens, the safety risks of LLMs have come into focus. While numerous defense strategies--such as prompt detection, modification, and model fine-tuning--have been proposed to counter these attacks, a critical question arises: do these defenses compromise the utility and usability of LLMs for legitimate users? Existing research predominantly focuses on the effectiveness of defense strategies without thoroughly examining their impact on performance, leaving a gap in understanding the trade-offs between LLM safety and performance. Our research addresses this gap by conducting a comprehensive study on the utility degradation, safety elevation, and exaggerated-safety escalation of LLMs with jailbreak defense strategies. We propose USEBench, a novel benchmark designed to evaluate these aspects, along with USEIndex, a comprehensive metric for assessing overall model performance. Through experiments on seven state-of-the-art LLMs, we found that mainstream jailbreak defenses fail to ensure both safety and performance simultaneously. Although model-finetuning performs the best overall, their effectiveness varies across LLMs. Furthermore, vertical comparisons reveal that developers commonly prioritize performance over safety when iterating or fine-tuning their LLMs.

Paperid: 562, https://arxiv.org/pdf/2501.04108.pdf

Abstract:
An image encoder pre-trained by self-supervised learning can be used as a general-purpose feature extractor to build downstream classifiers for various downstream tasks. However, many studies showed that an attacker can embed a trojan into an encoder such that multiple downstream classifiers built based on the trojaned encoder simultaneously inherit the trojan behavior. In this work, we propose TrojanDec, the first data-free method to identify and recover a test input embedded with a trigger. Given a (trojaned or clean) encoder and a test input, TrojanDec first predicts whether the test input is trojaned. If not, the test input is processed in a normal way to maintain the utility. Otherwise, the test input will be further restored to remove the trigger. Our extensive evaluation shows that TrojanDec can effectively identify the trojan (if any) from a given test input and recover it under state-of-the-art trojan attacks. We further demonstrate by experiments that our TrojanDec outperforms the state-of-the-art defenses.

Paperid: 563, https://arxiv.org/pdf/2506.17798.pdf

Abstract:
The integration of open-source third-party library dependencies in Java development introduces significant security risks when these libraries contain known vulnerabilities. Existing Software Composition Analysis (SCA) tools struggle to effectively detect vulnerable API usage from these libraries due to limitations in understanding API usage semantics and computational challenges in analyzing complex codebases, leading to inaccurate vulnerability alerts that burden development teams and delay critical security fixes. To address these challenges, we proposed SAVANT by leveraging two insights: proof-of-vulnerability test cases demonstrate how vulnerabilities can be triggered in specific contexts, and Large Language Models (LLMs) can understand code semantics. SAVANT combines semantic preprocessing with LLM-powered context analysis for accurate vulnerability detection. SAVANT first segments source code into meaningful blocks while preserving semantic relationships, then leverages LLM-based reflection to analyze API usage context and determine actual vulnerability impacts. Our evaluation on 55 real-world applications shows that SAVANT achieves 83.8% precision, 73.8% recall, 69.0% accuracy, and 78.5% F1-score, outperforming state-of-the-art SCA tools.

Paperid: 564, https://arxiv.org/pdf/2506.10022.pdf

Abstract:
The widespread adoption of Large Language Models (LLMs) has heightened concerns about their security, particularly their vulnerability to jailbreak attacks that leverage crafted prompts to generate malicious outputs. While prior research has been conducted on general security capabilities of LLMs, their specific susceptibility to jailbreak attacks in code generation remains largely unexplored. To fill this gap, we propose MalwareBench, a benchmark dataset containing 3,520 jailbreaking prompts for malicious code-generation, designed to evaluate LLM robustness against such threats. MalwareBench is based on 320 manually crafted malicious code generation requirements, covering 11 jailbreak methods and 29 code functionality categories. Experiments show that mainstream LLMs exhibit limited ability to reject malicious code-generation requirements, and the combination of multiple jailbreak methods further reduces the model's security capabilities: specifically, the average rejection rate for malicious content is 60.93%, dropping to 39.92% when combined with jailbreak attack algorithms. Our work highlights that the code security capabilities of LLMs still pose significant challenges.

Paperid: 565, https://arxiv.org/pdf/2505.22605.pdf

Abstract:
The rise of hardware-level security threats, such as side-channel attacks, hardware Trojans, and firmware vulnerabilities, demands advanced detection mechanisms that are more intelligent and adaptive. Traditional methods often fall short in addressing the complexity and evasiveness of modern attacks, driving increased interest in machine learning-based solutions. Among these, Transformer models, widely recognized for their success in natural language processing and computer vision, have gained traction in the security domain due to their ability to model complex dependencies, offering enhanced capabilities in identifying vulnerabilities, detecting anomalies, and reinforcing system integrity. This survey provides a comprehensive review of recent advancements on the use of Transformers in hardware security, examining their application across key areas such as side-channel analysis, hardware Trojan detection, vulnerability classification, device fingerprinting, and firmware security. Furthermore, we discuss the practical challenges of applying Transformers to secure hardware systems, and highlight opportunities and future research directions that position them as a foundation for next-generation hardware-assisted security. These insights pave the way for deeper integration of AI-driven techniques into hardware security frameworks, enabling more resilient and intelligent defenses.

Paperid: 566, https://arxiv.org/pdf/2505.21605.pdf

Abstract:
Large language models (LLMs) exhibit advancing capabilities in complex tasks, such as reasoning and graduate-level question answering, yet their resilience against misuse, particularly involving scientifically sophisticated risks, remains underexplored. Existing safety benchmarks typically focus either on instructions requiring minimal knowledge comprehension (e.g., ``tell me how to build a bomb") or utilize prompts that are relatively low-risk (e.g., multiple-choice or classification tasks about hazardous content). Consequently, they fail to adequately assess model safety when handling knowledge-intensive, hazardous scenarios. To address this critical gap, we introduce SOSBench, a regulation-grounded, hazard-focused benchmark encompassing six high-risk scientific domains: chemistry, biology, medicine, pharmacology, physics, and psychology. The benchmark comprises 3,000 prompts derived from real-world regulations and laws, systematically expanded via an LLM-assisted evolutionary pipeline that introduces diverse, realistic misuse scenarios (e.g., detailed explosive synthesis instructions involving advanced chemical formulas). We evaluate frontier models within a unified evaluation framework using our SOSBench. Despite their alignment claims, advanced models consistently disclose policy-violating content across all domains, demonstrating alarmingly high rates of harmful responses (e.g., 79.1% for Deepseek-R1 and 47.3% for GPT-4.1). These results highlight significant safety alignment deficiencies and underscore urgent concerns regarding the responsible deployment of powerful LLMs.

Paperid: 567, https://arxiv.org/pdf/2505.08842.pdf

Abstract:
Open-source AI libraries are foundational to modern AI systems, yet they present significant, underexamined risks spanning security, licensing, maintenance, supply chain integrity, and regulatory compliance. We introduce LibVulnWatch, a system that leverages recent advances in large language models and agentic workflows to perform deep, evidence-based evaluations of these libraries. Built on a graph-based orchestration of specialized agents, the framework extracts, verifies, and quantifies risk using information from repositories, documentation, and vulnerability databases. LibVulnWatch produces reproducible, governance-aligned scores across five critical domains, publishing results to a public leaderboard for ongoing ecosystem monitoring. Applied to 20 widely used libraries, including ML frameworks, LLM inference engines, and agent orchestration tools, our approach covers up to 88% of OpenSSF Scorecard checks while surfacing up to 19 additional risks per library, such as critical RCE vulnerabilities, missing SBOMs, and regulatory gaps. By integrating advanced language technologies with the practical demands of software risk assessment, this work demonstrates a scalable, transparent mechanism for continuous supply chain evaluation and informed library selection.

Paperid: 568, https://arxiv.org/pdf/2505.07834.pdf

Abstract:
We introduce ai.txt, a novel domain-specific language (DSL) designed to explicitly regulate interactions between AI models, agents, and web content, addressing critical limitations of the widely adopted robots.txt standard. As AI increasingly engages with online materials for tasks such as training, summarization, and content modification, existing regulatory methods lack the necessary granularity and semantic expressiveness to ensure ethical and legal compliance. ai.txt extends traditional URL-based access controls by enabling precise element-level regulations and incorporating natural language instructions interpretable by AI systems. To facilitate practical deployment, we provide an integrated development environment with code autocompletion and automatic XML generation. Furthermore, we propose two compliance mechanisms: XML-based programmatic enforcement and natural language prompt integration, and demonstrate their effectiveness through preliminary experiments and case studies. Our approach aims to aid the governance of AI-Internet interactions, promoting responsible AI use in digital ecosystems.

Paperid: 569, https://arxiv.org/pdf/2505.06661.pdf

Abstract:
Blockchain technology promises to democratize finance and promote social equity through decentralization, but questions remain about whether current implementations advance or hinder these goals. Through a mixed-methods study combining semi-structured interviews with 13 diverse blockchain stakeholders and analysis of over 3,000 cryptocurrency discussions on Reddit, we examine how trust manifests in cryptocurrency ecosystems despite their decentralized architecture. Our findings uncover that users actively seek out and create centralized trust anchors, such as established exchanges, prominent community figures, and recognized development teams, contradicting blockchain's fundamental promise of trustless interactions. We identify how this contradiction arises from users' mental need for accountability and their reluctance to shoulder the full responsibility of self-custody. The study also reveals how these centralized trust patterns disproportionately impact different user groups, with newer and less technical users showing stronger preferences for centralized intermediaries. This work contributes to our understanding of the inherent tensions between theoretical decentralization and practical implementation in cryptocurrency systems, highlighting the persistent role of centralized trust in supposedly trustless environments.

Paperid: 570, https://arxiv.org/pdf/2505.01067.pdf

Abstract:
Recent advancements in large language models (LLMs) have spurred the development of diverse AI applications from code generation and video editing to text generation; however, AI supply chains such as Hugging Face, which host pretrained models and their associated configuration files contributed by the public, face significant security challenges; in particular, configuration files originally intended to set up models by specifying parameters and initial settings can be exploited to execute unauthorized code, yet research has largely overlooked their security compared to that of the models themselves; in this work, we present the first comprehensive study of malicious configurations on Hugging Face, identifying three attack scenarios (file, website, and repository operations) that expose inherent risks; to address these threats, we introduce CONFIGSCAN, an LLM-based tool that analyzes configuration files in the context of their associated runtime code and critical libraries, effectively detecting suspicious elements with low false positive rates and high accuracy; our extensive evaluation uncovers thousands of suspicious repositories and configuration files, underscoring the urgent need for enhanced security validation in AI model hosting platforms.

Paperid: 571, https://arxiv.org/pdf/2504.18598.pdf

Abstract:
Mixture-of-Experts (MoE) have emerged as a powerful architecture for large language models (LLMs), enabling efficient scaling of model capacity while maintaining manageable computational costs. The key advantage lies in their ability to route different tokens to different ``expert'' networks within the model, enabling specialization and efficient handling of diverse input. However, the vulnerabilities of MoE-based LLMs still have barely been studied, and the potential for backdoor attacks in this context remains largely unexplored. This paper presents the first backdoor attack against MoE-based LLMs where the attackers poison ``dormant experts'' (i.e., underutilized experts) and activate them by optimizing routing triggers, thereby gaining control over the model's output. We first rigorously prove the existence of a few ``dominating experts'' in MoE models, whose outputs can determine the overall MoE's output. We also show that dormant experts can serve as dominating experts to manipulate model predictions. Accordingly, our attack, namely BadMoE, exploits the unique architecture of MoE models by 1) identifying dormant experts unrelated to the target task, 2) constructing a routing-aware loss to optimize the activation triggers of these experts, and 3) promoting dormant experts to dominating roles via poisoned training data. Extensive experiments show that BadMoE successfully enforces malicious prediction on attackers' target tasks while preserving overall model utility, making it a more potent and stealthy attack than existing methods.

Paperid: 572, https://arxiv.org/pdf/2504.18569.pdf

Abstract:
The de-identification of private information in medical data is a crucial process to mitigate the risk of confidentiality breaches, particularly when patient personal details are not adequately removed before the release of medical records. Although rule-based and learning-based methods have been proposed, they often struggle with limited generalizability and require substantial amounts of annotated data for effective performance. Recent advancements in large language models (LLMs) have shown significant promise in addressing these issues due to their superior language comprehension capabilities. However, LLMs present challenges, including potential privacy risks when using commercial LLM APIs and high computational costs for deploying open-source LLMs locally. In this work, we introduce LPPA, an LLM-empowered Privacy-Protected PHI Annotation framework for clinical notes, targeting the English language. By fine-tuning LLMs locally with synthetic notes, LPPA ensures strong privacy protection and high PHI annotation accuracy. Extensive experiments demonstrate LPPA's effectiveness in accurately de-identifying private information, offering a scalable and efficient solution for enhancing patient privacy protection.

Paperid: 573, https://arxiv.org/pdf/2504.10811.pdf

Abstract:
Blockchain technology has revolutionized contractual processes, enhancing efficiency and trust through smart contracts. Ethereum, as a pioneer in this domain, offers a platform for decentralized applications but is challenged by the immutability of smart contracts, which makes upgrades cumbersome. Existing design patterns, while addressing upgradability, introduce complexity, increased development effort, and higher gas costs, thus limiting their effectiveness. In response, we introduce FlexiContracts, an innovative scheme that reimagines the evolution of smart contracts on Ethereum. By enabling secure, in-place upgrades without losing historical data, FlexiContracts surpasses existing approaches, introducing a previously unexplored path in smart contract evolution. Its streamlined design transcends the limitations of current design patterns by simplifying smart contract development, eliminating the need for extensive upfront planning, and significantly reducing the complexity of the design process. This advancement fosters an environment for continuous improvement and adaptation to new requirements, redefining the possibilities for dynamic, upgradable smart contracts.

Paperid: 574, https://arxiv.org/pdf/2504.09652.pdf

Abstract:
The emergence of blockchain technology has revolutionized contract execution through the introduction of smart contracts. Ethereum, the leading blockchain platform, leverages smart contracts to power decentralized applications (DApps), enabling transparent and self-executing systems across various domains. While the immutability of smart contracts enhances security and trust, it also poses significant challenges for updates, defect resolution, and adaptation to changing requirements. Existing upgrade mechanisms are complex, resource-intensive, and costly in terms of gas consumption, often compromising security and limiting practical adoption. To address these challenges, we propose FlexiContracts+, a novel scheme that reimagines smart contracts by enabling secure, in-place upgrades on Ethereum while preserving historical data without relying on multiple contracts or extensive pre-deployment planning. FlexiContracts+ enhances security, simplifies development, reduces engineering overhead, and supports adaptable, expandable smart contracts. Comprehensive testing demonstrates that FlexiContracts+ achieves a practical balance between immutability and flexibility, advancing the capabilities of smart contract systems.

Paperid: 575, https://arxiv.org/pdf/2504.09319.pdf

Abstract:
This paper introduces CrossLink, a decentralized framework for secure cross-chain smart contract execution that effectively addresses the inherent limitations of contemporary solutions, which primarily focus on asset transfers and rely on potentially vulnerable centralized intermediaries. Recognizing the escalating demand for seamless interoperability among decentralized applications, CrossLink provides a trustless mechanism for smart contracts across disparate blockchain networks to communicate and interact. At its core, CrossLink utilizes a compact chain for selectively storing authorized contract states and employs a secure inter-chain messaging mechanism to ensure atomic execution and data consistency. By implementing a deposit/collateral fee system and efficient state synchronization, CrossLink enhances security and mitigates vulnerabilities, offering a novel approach to seamless, secure, and decentralized cross-chain interoperability. A formal security analysis further validates CrossLink's robustness against unauthorized modifications and denial-of-service attacks.

Paperid: 576, https://arxiv.org/pdf/2504.09315.pdf

Abstract:
Blockchain and smart contracts have emerged as revolutionary technologies transforming distributed computing. While platform evolution and smart contracts' inherent immutability necessitate migrations both across and within chains, migrating the vast amounts of critical data in these contracts while maintaining data integrity and minimizing operational disruption presents a significant challenge. To address these challenges, we present SmartShift, a framework that enables secure and efficient smart contract migrations through intelligent state partitioning and progressive function activation, preserving operational continuity during transitions. Our comprehensive evaluation demonstrates that SmartShift significantly reduces migration downtime while ensuring robust security, establishing a foundation for efficient and secure smart contract migration systems.

Paperid: 577, https://arxiv.org/pdf/2504.08854.pdf

Abstract:
Recent advances in attention-based artificial intelligence (AI) models have unlocked vast potential to automate digital hardware design while enhancing and strengthening security measures against various threats. This rapidly emerging field leverages Large Language Models (LLMs) to generate HDL code, identify vulnerabilities, and sometimes mitigate them. The state of the art in this design automation space utilizes optimized LLMs with HDL datasets, creating automated systems for register-transfer level (RTL) generation, verification, and debugging, and establishing LLM-driven design environments for streamlined logic designs. Additionally, attention-based models like graph attention have shown promise in chip design applications, including floorplanning. This survey investigates the integration of these models into hardware-related domains, emphasizing logic design and hardware security, with or without the use of IP libraries. This study explores the commercial and academic landscape, highlighting technical hurdles and future prospects for automating hardware design and security. Moreover, it provides new insights into the study of LLM-driven design systems, advances in hardware security mechanisms, and the impact of influential works on industry practices. Through the examination of 30 representative approaches and illustrative case studies, this paper underscores the transformative potential of attention-based models in revolutionizing hardware design while addressing the challenges that lie ahead in this interdisciplinary domain.

Paperid: 578, https://arxiv.org/pdf/2503.09022.pdf

Abstract:
Large language models (LLMs) have been widely applied for their remarkable capability of content generation. However, the practical use of open-source LLMs is hindered by high resource requirements, making deployment expensive and limiting widespread development. The collaborative inference is a promising solution for this problem, in which users collaborate by each hosting a subset of layers and transmitting intermediate activation. Many companies are building collaborative inference platforms to reduce LLM serving costs, leveraging users' underutilized GPUs. Despite widespread interest in collaborative inference within academia and industry, the privacy risks associated with LLM collaborative inference have not been well studied. This is largely because of the challenge posed by inverting LLM activation due to its strong non-linearity. In this paper, to validate the severity of privacy threats in LLM collaborative inference, we introduce the concept of prompt inversion attack (PIA), where a malicious participant intends to recover the input prompt through the activation transmitted by its previous participant. Extensive experiments show that our PIA method substantially outperforms existing baselines. For example, our method achieves an 88.4\% token accuracy on the Skytrax dataset with the Llama-65B model when inverting the maximum number of transformer layers, while the best baseline method only achieves 22.8\% accuracy. The results verify the effectiveness of our PIA attack and highlights its practical threat to LLM collaborative inference systems.

Paperid: 579, https://arxiv.org/pdf/2502.06138.pdf

Abstract:
Cyberattacks in an Internet of Things (IoT) environment can have significant impacts because of the interconnected nature of devices and systems. An attacker uses a network of compromised IoT devices in a botnet attack to carry out various harmful activities. Detecting botnet attacks poses several challenges because of the intricate and evolving nature of these threats. Botnet attacks erode trust in IoT devices and systems, undermining confidence in their security, reliability, and integrity. Deep learning techniques have significantly enhanced the detection of botnet attacks due to their ability to analyze and learn from complex patterns in data. This research proposed the stacking of Deep convolutional neural networks, Bi-Directional Long Short-Term Memory (Bi-LSTM), Bi-Directional Gated Recurrent Unit (Bi-GRU), and Recurrent Neural Networks (RNN) for botnet attacks detection. The UNSW-NB15 dataset is utilized for botnet attacks detection. According to experimental results, the proposed model accurately provides for the intricate patterns and features of botnet attacks, with a testing accuracy of 99.76%. The proposed model also identifies botnets with a high ROC-AUC curve value of 99.18%. A performance comparison of the proposed method with existing state-of-the-art models confirms its higher performance. The outcomes of this research could strengthen cyber security procedures and safeguard against new attacks.

Paperid: 580, https://arxiv.org/pdf/2502.05461.pdf

Abstract:
CAPTCHAs have long been essential tools for protecting applications from automated bots. Initially designed as simple questions to distinguish humans from bots, they have become increasingly complex to keep pace with the proliferation of CAPTCHA-cracking techniques employed by malicious actors. However, with the advent of advanced large language models (LLMs), the effectiveness of existing CAPTCHAs is now being undermined. To address this issue, we have conducted an empirical study to evaluate the performance of multimodal LLMs in solving CAPTCHAs and to assess how many attempts human users typically need to pass them. Our findings reveal that while LLMs can solve most CAPTCHAs, they struggle with those requiring complex reasoning type of CAPTCHA that also presents significant challenges for human users. Interestingly, our user study shows that the majority of human participants require a second attempt to pass these reasoning CAPTCHAs, a finding not reported in previous research. Based on empirical findings, we present IllusionCAPTCHA, a novel security mechanism employing the "Human-Easy but AI-Hard" paradigm. This new CAPTCHA employs visual illusions to create tasks that are intuitive for humans but highly confusing for AI models. Furthermore, we developed a structured, step-by-step method that generates misleading options, which particularly guide LLMs towards making incorrect choices and reduce their chances of successfully solving CAPTCHAs. Our evaluation shows that IllusionCAPTCHA can effectively deceive LLMs 100% of the time. Moreover, our structured design significantly increases the likelihood of AI errors when attempting to solve these challenges. Results from our user study indicate that 86.95% of participants successfully passed the CAPTCHA on their first attempt, outperforming other CAPTCHA systems.

Paperid: 581, https://arxiv.org/pdf/2502.03957.pdf

Abstract:
In this paper, we introduce the idea of using adversarially-generated samples of the input images that were classified as deepfakes by a detector, to form perturbation masks for inferring the importance of different input features and produce visual explanations. We generate these samples based on Natural Evolution Strategies, aiming to flip the original deepfake detector's decision and classify these samples as real. We apply this idea to four perturbation-based explanation methods (LIME, SHAP, SOBOL and RISE) and evaluate the performance of the resulting modified methods using a SOTA deepfake detection model, a benchmarking dataset (FaceForensics++) and a corresponding explanation evaluation framework. Our quantitative assessments document the mostly positive contribution of the proposed perturbation approach in the performance of explanation methods. Our qualitative analysis shows the capacity of the modified explanation methods to demarcate the manipulated image regions more accurately, and thus to provide more useful explanations.

Paperid: 582, https://arxiv.org/pdf/2501.18628.pdf

Abstract:
Warning: This paper contains content that may involve potentially harmful behaviours, discussed strictly for research purposes. Jailbreak attacks can hinder the safety of Large Language Model (LLM) applications, especially chatbots. Studying jailbreak techniques is an important AI red teaming task for improving the safety of these applications. In this paper, we introduce TombRaider, a novel jailbreak technique that exploits the ability to store, retrieve, and use historical knowledge of LLMs. TombRaider employs two agents, the inspector agent to extract relevant historical information and the attacker agent to generate adversarial prompts, enabling effective bypassing of safety filters. We intensively evaluated TombRaider on six popular models. Experimental results showed that TombRaider could outperform state-of-the-art jailbreak techniques, achieving nearly 100% attack success rates (ASRs) on bare models and maintaining over 55.4% ASR against defence mechanisms. Our findings highlight critical vulnerabilities in existing LLM safeguards, underscoring the need for more robust safety defences.

Paperid: 583, https://arxiv.org/pdf/2501.13115.pdf

Abstract:
The wide adoption of Large Language Models (LLMs) has attracted significant attention from $\textit{jailbreak}$ attacks, where adversarial prompts crafted through optimization or manual design exploit LLMs to generate malicious contents. However, optimization-based attacks have limited efficiency and transferability, while existing manual designs are either easily detectable or demand intricate interactions with LLMs. In this paper, we first point out a novel perspective for jailbreak attacks: LLMs are more responsive to $\textit{positive}$ prompts. Based on this, we deploy Happy Ending Attack (HEA) to wrap up a malicious request in a scenario template involving a positive prompt formed mainly via a $\textit{happy ending}$, it thus fools LLMs into jailbreaking either immediately or at a follow-up malicious request.This has made HEA both efficient and effective, as it requires only up to two turns to fully jailbreak LLMs. Extensive experiments show that our HEA can successfully jailbreak on state-of-the-art LLMs, including GPT-4o, Llama3-70b, Gemini-pro, and achieves 88.79\% attack success rate on average. We also provide quantitative explanations for the success of HEA.

Paperid: 584, https://arxiv.org/pdf/2501.07849.pdf

Abstract:
Large Language Models (LLMs) have emerged as the new recommendation engines, surpassing traditional methods in both capability and scope, particularly in code generation. In this paper, we reveal a novel provider bias in LLMs: without explicit directives, these models show systematic preferences for services from specific providers in their recommendations (e.g., favoring Google Cloud over Microsoft Azure). To systematically investigate this bias, we develop an automated pipeline to construct the dataset, incorporating 6 distinct coding task categories and 30 real-world application scenarios. Leveraging this dataset, we conduct the first comprehensive empirical study of provider bias in LLM code generation across seven state-of-the-art LLMs, utilizing approximately 500 million tokens (equivalent to $5,000+ in computational costs). Our findings reveal that LLMs exhibit significant provider preferences, predominantly favoring services from Google and Amazon, and can autonomously modify input code to incorporate their preferred providers without users' requests. Such a bias holds far-reaching implications for market dynamics and societal equilibrium, potentially contributing to digital monopolies. It may also deceive users and violate their expectations, leading to various consequences. We call on the academic community to recognize this emerging issue and develop effective evaluation and mitigation methods to uphold AI security and fairness.

Paperid: 585, https://arxiv.org/pdf/2506.23855.pdf

Abstract:
The analysis of the privacy properties of Privacy-Preserving Ads APIs is an area of research that has received strong interest from academics, industry, and regulators. Despite this interest, the empirical study of these methods is hindered by the lack of publicly available data. Reliable empirical analysis of the privacy properties of an API, in fact, requires access to a dataset consisting of realistic API outputs; however, privacy concerns prevent the general release of such data to the public. In this work, we develop a novel methodology to construct synthetic API outputs that are simultaneously realistic enough to enable accurate study and provide strong privacy protections. We focus on one Privacy-Preserving Ads APIs: the Topics API, part of Google Chrome's Privacy Sandbox. We developed a methodology to generate a differentially-private dataset that closely matches the re-identification risk properties of the real Topics API data. The use of differential privacy provides strong theoretical bounds on the leakage of private user information from this release. Our methodology is based on first computing a large number of differentially-private statistics describing how output API traces evolve over time. Then, we design a parameterized distribution over sequences of API traces and optimize its parameters so that they closely match the statistics obtained. Finally, we create the synthetic data by drawing from this distribution. Our work is complemented by an open-source release of the anonymized dataset obtained by this methodology. We hope this will enable external researchers to analyze the API in-depth and replicate prior and future work on a realistic large-scale dataset. We believe that this work will contribute to fostering transparency regarding the privacy properties of Privacy-Preserving Ads APIs.

Paperid: 586, https://arxiv.org/pdf/2506.12104.pdf

Abstract:
Large Language Models (LLMs) are increasingly central to agentic systems due to their strong reasoning and planning capabilities. By interacting with external environments through predefined tools, these agents can carry out complex user tasks. Nonetheless, this interaction also introduces the risk of prompt injection attacks, where malicious inputs from external sources can mislead the agent's behavior, potentially resulting in economic loss, privacy leakage, or system compromise. System-level defenses have recently shown promise by enforcing static or predefined policies, but they still face two key challenges: the ability to dynamically update security rules and the need for memory stream isolation. To address these challenges, we propose DRIFT, a Dynamic Rule-based Isolation Framework for Trustworthy agentic systems, which enforces both control- and data-level constraints. A Secure Planner first constructs a minimal function trajectory and a JSON-schema-style parameter checklist for each function node based on the user query. A Dynamic Validator then monitors deviations from the original plan, assessing whether changes comply with privilege limitations and the user's intent. Finally, an Injection Isolator detects and masks any instructions that may conflict with the user query from the memory stream to mitigate long-term risks. We empirically validate the effectiveness of DRIFT on the AgentDojo benchmark, demonstrating its strong security performance while maintaining high utility across diverse models -- showcasing both its robustness and adaptability.

Paperid: 587, https://arxiv.org/pdf/2505.21406.pdf

Abstract:
This work addresses JavaScript malware detection to enhance client-side web application security with a behavior-based system. The ability to detect malicious JavaScript execution sequences is a critical problem in modern web security as attack techniques become more sophisticated. This study introduces a new system for detecting JavaScript malware using a Deterministic Finite Automaton (DFA) along with a weighted-behavior system, which we call behavior DFA. This system captures malicious patterns and provides a dynamic mechanism to classify new sequences that exhibit partial similarity to known attacks, differentiating them between benign, partially malicious, and fully malicious behaviors. Experimental evaluation on a dataset of 1,058 sequences captured in a real-world environment demonstrates the capability of the system to detect and classify threats effectively, with the behavior DFA successfully identifying exact matches and partial similarities to known malicious behaviors. The results highlight the adaptability of the system in detecting emerging threats while maintaining transparency in decision making.

Paperid: 588, https://arxiv.org/pdf/2505.11449.pdf

Abstract:
We argue that Large language models (LLMs) will soon alter the economics of cyberattacks. Instead of attacking the most commonly used software and monetizing exploits by targeting the lowest common denominator among victims, LLMs enable adversaries to launch tailored attacks on a user-by-user basis. On the exploitation front, instead of human attackers manually searching for one difficult-to-identify bug in a product with millions of users, LLMs can find thousands of easy-to-identify bugs in products with thousands of users. And on the monetization front, instead of generic ransomware that always performs the same attack (encrypt all your data and request payment to decrypt), an LLM-driven ransomware attack could tailor the ransom demand based on the particular content of each exploited device. We show that these two attacks (and several others) are imminently practical using state-of-the-art LLMs. For example, we show that without any human intervention, an LLM finds highly sensitive personal information in the Enron email dataset (e.g., an executive having an affair with another employee) that could be used for blackmail. While some of our attacks are still too expensive to scale widely today, the incentives to implement these attacks will only increase as LLMs get cheaper. Thus, we argue that LLMs create a need for new defense-in-depth approaches.

Paperid: 589, https://arxiv.org/pdf/2505.10759.pdf

Abstract:
Vertical Federated Learning (VFL) has revolutionised collaborative machine learning by enabling privacy-preserving model training across multiple parties. However, it remains vulnerable to information leakage during intermediate computation sharing. While Contrastive Federated Learning (CFL) was introduced to mitigate these privacy concerns through representation learning, it still faces challenges from gradient-based attacks. This paper presents a comprehensive experimental analysis of gradient-based attacks in CFL environments and evaluates random client selection as a defensive strategy. Through extensive experimentation, we demonstrate that random client selection proves particularly effective in defending against gradient attacks in the CFL network. Our findings provide valuable insights for implementing robust security measures in contrastive federated learning systems, contributing to the development of more secure collaborative learning frameworks

Paperid: 590, https://arxiv.org/pdf/2505.00843.pdf

Abstract:
Large Language Models (LLMs) have demonstrated remarkable capabilities in natural language understanding and generation, enabling their widespread adoption across various domains. However, their susceptibility to prompt injection attacks poses significant security risks, as adversarial inputs can manipulate model behavior and override intended instructions. Despite numerous defense strategies, a standardized framework to rigorously evaluate their effectiveness, especially under adaptive adversarial scenarios, is lacking. To address this gap, we introduce OET, an optimization-based evaluation toolkit that systematically benchmarks prompt injection attacks and defenses across diverse datasets using an adaptive testing framework. Our toolkit features a modular workflow that facilitates adversarial string generation, dynamic attack execution, and comprehensive result analysis, offering a unified platform for assessing adversarial robustness. Crucially, the adaptive testing framework leverages optimization methods with both white-box and black-box access to generate worst-case adversarial examples, thereby enabling strict red-teaming evaluations. Extensive experiments underscore the limitations of current defense mechanisms, with some models remaining susceptible even after implementing security enhancements.

Paperid: 591, https://arxiv.org/pdf/2504.19373.pdf

Abstract:
Recent advances in multi-modal large reasoning models (MLRMs) have shown significant ability to interpret complex visual content. While these models enable impressive reasoning capabilities, they also introduce novel and underexplored privacy risks. In this paper, we identify a novel category of privacy leakage in MLRMs: Adversaries can infer sensitive geolocation information, such as a user's home address or neighborhood, from user-generated images, including selfies captured in private settings. To formalize and evaluate these risks, we propose a three-level visual privacy risk framework that categorizes image content based on contextual sensitivity and potential for location inference. We further introduce DoxBench, a curated dataset of 500 real-world images reflecting diverse privacy scenarios. Our evaluation across 11 advanced MLRMs and MLLMs demonstrates that these models consistently outperform non-expert humans in geolocation inference and can effectively leak location-related private information. This significantly lowers the barrier for adversaries to obtain users' sensitive geolocation information. We further analyze and identify two primary factors contributing to this vulnerability: (1) MLRMs exhibit strong reasoning capabilities by leveraging visual clues in combination with their internal world knowledge; and (2) MLRMs frequently rely on privacy-related visual clues for inference without any built-in mechanisms to suppress or avoid such usage. To better understand and demonstrate real-world attack feasibility, we propose GeoMiner, a collaborative attack framework that decomposes the prediction process into two stages: clue extraction and reasoning to improve geolocation performance while introducing a novel attack perspective. Our findings highlight the urgent need to reassess inference-time privacy risks in MLRMs to better protect users' sensitive information.

Paperid: 592, https://arxiv.org/pdf/2504.15063.pdf

Abstract:
Smart contracts are the cornerstone of decentralized applications and financial protocols, which extend the application of digital currency transactions. The applications and financial protocols introduce significant security challenges, resulting in substantial economic losses. Existing solutions predominantly focus on code vulnerabilities within smart contracts, accounting for only 50% of security incidents. Therefore, a more comprehensive study of security issues related to smart contracts is imperative. The existing empirical research realizes the static analysis of smart contracts from the perspective of the lifecycle and gives the corresponding measures for each stage. However, they lack the characteristic analysis of vulnerabilities in each stage and the distinction between the vulnerabilities. In this paper, we present the first empirical study on the security of smart contracts throughout their lifecycle, including deployment and execution, upgrade, and destruction stages. It delves into the security issues at each stage and provides at least seven feature descriptions. Finally, utilizing these seven features, five machine-learning classification models are used to identify vulnerabilities at different stages. The classification results reveal that vulnerable contracts exhibit distinct transaction features and ego network properties at various stages.

Paperid: 593, https://arxiv.org/pdf/2504.09115.pdf

Abstract:
With the rapid advancement of cloud-native computing, securing cloud environments has become an important task. Log-based Anomaly Detection (LAD) is the most representative technique used in different systems for attack detection and safety guarantee, where multiple LAD methods and relevant datasets have been proposed. However, even though some of these datasets are specifically prepared for cloud systems, they only cover limited cloud behaviors and lack information from a whole-system perspective. Another critical issue to consider is normality shift, which implies that the test distribution could differ from the training distribution and highly affect the performance of LAD. Unfortunately, existing works only focus on simple shift types such as chronological changes, while other cloud-specific shift types are ignored. Therefore, a dataset that captures diverse cloud system behaviors and various types of normality shifts is essential. To fill this gap, we construct a dataset CAShift to evaluate the performance of LAD in cloud, which considers different roles of software in cloud systems, supports three real-world normality shift types and features 20 different attack scenarios in various cloud system components. Based on CAShift, we evaluate the effectiveness of existing LAD methods in normality shift scenarios. Additionally, to explore the feasibility of shift adaptation, we further investigate three continuous learning approaches to mitigate the impact of distribution shift. Results demonstrated that 1) all LAD methods suffer from normality shift where the performance drops up to 34%, and 2) existing continuous learning methods are promising to address shift drawbacks, but the configurations highly affect the shift adaptation. Based on our findings, we offer valuable implications for future research in designing more robust LAD models and methods for LAD shift adaptation.

Paperid: 594, https://arxiv.org/pdf/2504.06211.pdf

Abstract:
Zero-Knowledge Proofs (ZKPs) are rapidly gaining importance in privacy-preserving and verifiable computing. ZKPs enable a proving party to prove the truth of a statement to a verifying party without revealing anything else. ZKPs have applications in blockchain technologies, verifiable machine learning, and electronic voting, but have yet to see widespread adoption due to the computational complexity of the proving process. Recent works have accelerated the key primitives of state-of-the-art ZKP protocols on GPU and ASIC. However, the protocols accelerated thus far face one of two challenges: they either require a trusted setup for each application, or they generate larger proof sizes with higher verification costs, limiting their applicability in scenarios with numerous verifiers or strict verification time constraints. This work presents an accelerator, zkSpeed, for HyperPlonk, a state-of-the-art ZKP protocol that supports both one-time, universal setup and small proof sizes for typical ZKP applications in publicly verifiable, consensus-based systems. We accelerate the entire protocol, including two major primitives: SumCheck and Multi-scalar Multiplications (MSMs). We develop a full-chip architecture using 366.46 mm$^2$ and 2 TB/s of bandwidth to accelerate the entire proof generation process, achieving geometric mean speedups of 801$\times$ over CPU baselines.

Paperid: 595, https://arxiv.org/pdf/2504.05006.pdf

Abstract:
Decentralized applications (DApps) face significant security risks due to vulnerabilities in smart contracts, with traditional detection methods struggling to address emerging and machine-unauditable flaws. This paper proposes a novel approach leveraging fine-tuned Large Language Models (LLMs) to enhance smart contract vulnerability detection. We introduce a comprehensive dataset of 215 real-world DApp projects (4,998 contracts), including hard-to-detect logical errors like token price manipulation, addressing the limitations of existing simplified benchmarks. By fine-tuning LLMs (Llama3-8B and Qwen2-7B) with Full-Parameter Fine-Tuning (FFT) and Low-Rank Adaptation (LoRA), our method achieves superior performance, attaining an F1-score of 0.83 with FFT and data augmentation via Random Over Sampling (ROS). Comparative experiments demonstrate significant improvements over prompt-based LLMs and state-of-the-art tools. Notably, the approach excels in detecting non-machine-auditable vulnerabilities, achieving 0.97 precision and 0.68 recall for price manipulation flaws. The results underscore the effectiveness of domain-specific LLM fine-tuning and data augmentation in addressing real-world DApp security challenges, offering a robust solution for blockchain ecosystem protection.

Paperid: 596, https://arxiv.org/pdf/2504.05002.pdf

Abstract:
Smart contracts deployed on blockchain platforms are vulnerable to various security vulnerabilities. However, only a small number of Ethereum contracts have released their source code, so vulnerability detection at the bytecode level is crucial. This paper introduces SmartBugBert, a novel approach that combines BERT-based deep learning with control flow graph (CFG) analysis to detect vulnerabilities directly from bytecode. Our method first decompiles smart contract bytecode into optimized opcode sequences, extracts semantic features using TF-IDF, constructs control flow graphs to capture execution logic, and isolates vulnerable CFG fragments for targeted analysis. By integrating both semantic and structural information through a fine-tuned BERT model and LightGBM classifier, our approach effectively identifies four critical vulnerability types: transaction-ordering, access control, self-destruct, and timestamp dependency vulnerabilities. Experimental evaluation on 6,157 Ethereum smart contracts demonstrates that SmartBugBert achieves 90.62% precision, 91.76% recall, and 91.19% F1-score, significantly outperforming existing detection methods. Ablation studies confirm that the combination of semantic features with CFG information substantially enhances detection performance. Furthermore, our approach maintains efficient detection speed (0.14 seconds per contract), making it practical for large-scale vulnerability assessment.

Paperid: 597, https://arxiv.org/pdf/2503.07770.pdf

Abstract:
Deep Learning (DL) has emerged as a powerful tool for vulnerability detection, often outperforming traditional solutions. However, developing effective DL models requires large amounts of real-world data, which can be difficult to obtain in sufficient quantities. To address this challenge, DiverseVul dataset has been curated as the largest dataset of vulnerable and non-vulnerable C/C++ functions extracted exclusively from real-world projects. Its goal is to provide high-quality, large-scale samples for training DL models. However, during our study several inconsistencies were identified in the raw dataset while applying pre-processing techniques, highlighting the need for a refined version. In this work, we present a refined version of DiverseVul dataset, which is used to fine-tune a large language model, LLaMA 3.2, for vulnerability detection. Experimental results show that the use of pre-processing techniques led to an improvement in performance, with the model achieving an F1-Score of 66%, a competitive result when compared to our baseline, which achieved a 47% F1-Score in software vulnerability detection.

Paperid: 598, https://arxiv.org/pdf/2503.04564.pdf

Abstract:
Secure aggregation is motivated by federated learning (FL) where a cloud server aims to compute an averaged model (i.e., weights of deep neural networks) of the locally-trained models of numerous clients, while adhering to data security requirements. Hierarchical secure aggregation (HSA) extends this concept to a three-layer hierarchical network, where clustered users communicate with the server through an intermediate layer of relays. In HSA, beyond conventional server security, relay security is also enforced to ensure that the relays remain oblivious to the users' inputs (an abstraction of the local models in FL). Existing study on HSA assumes that each user is associated with only one relay, limiting opportunities for coding across inter-cluster users to achieve efficient communication and key generation. In this paper, we consider HSA with a cyclic association pattern where each user is connected to $B$ consecutive relays in a wrap-around manner. We propose an efficient aggregation scheme which includes a message design for the inputs inspired by gradient coding-a well-known technique for efficient communication in distributed computing-along with a highly non-trivial security key design. We also derive novel converse bounds on the minimum achievable communication and key rates using information-theoretic arguments.

Paperid: 599, https://arxiv.org/pdf/2502.11844.pdf

Abstract:
Automatic program generation has long been a fundamental challenge in computer science. Recent benchmarks have shown that large language models (LLMs) can effectively generate code at the function level, make code edits, and solve algorithmic coding tasks. However, to achieve full automation, LLMs should be able to generate production-quality, self-contained application modules. To evaluate the capabilities of LLMs in solving this challenge, we introduce BaxBench, a novel evaluation benchmark consisting of 392 tasks for the generation of backend applications. We focus on backends for three critical reasons: (i) they are practically relevant, building the core components of most modern web and cloud software, (ii) they are difficult to get right, requiring multiple functions and files to achieve the desired functionality, and (iii) they are security-critical, as they are exposed to untrusted third-parties, making secure solutions that prevent deployment-time attacks an imperative. BaxBench validates the functionality of the generated applications with comprehensive test cases, and assesses their security exposure by executing end-to-end exploits. Our experiments reveal key limitations of current LLMs in both functionality and security: (i) even the best model, OpenAI o1, achieves a mere 62% on code correctness; (ii) on average, we could successfully execute security exploits on around half of the correct programs generated by each LLM; and (iii) in less popular backend frameworks, models further struggle to generate correct and secure applications. Progress on BaxBench signifies important steps towards autonomous and secure software development with LLMs.

Paperid: 600, https://arxiv.org/pdf/2502.10801.pdf

Abstract:
DeepFakes pose a significant threat to our society. One representative DeepFake application is face-swapping, which replaces the identity in a facial image with that of a victim. Although existing methods partially mitigate these risks by degrading the quality of swapped images, they often fail to disrupt the identity transformation effectively. To fill this gap, we propose FaceSwapGuard (FSG), a novel black-box defense mechanism against deepfake face-swapping threats. Specifically, FSG introduces imperceptible perturbations to a user's facial image, disrupting the features extracted by identity encoders. When shared online, these perturbed images mislead face-swapping techniques, causing them to generate facial images with identities significantly different from the original user. Extensive experiments demonstrate the effectiveness of FSG against multiple face-swapping techniques, reducing the face match rate from 90\% (without defense) to below 10\%. Both qualitative and quantitative studies further confirm its ability to confuse human perception, highlighting its practical utility. Additionally, we investigate key factors that may influence FSG and evaluate its robustness against various adaptive adversaries.

Paperid: 601, https://arxiv.org/pdf/2501.15718.pdf

Abstract:
Federated learning collaboratively trains a neural network on a global server, where each local client receives the current global model weights and sends back parameter updates (gradients) based on its local private data. The process of sending these model updates may leak client's private data information. Existing gradient inversion attacks can exploit this vulnerability to recover private training instances from a client's gradient vectors. Recently, researchers have proposed advanced gradient inversion techniques that existing defenses struggle to handle effectively. In this work, we present a novel defense tailored for large neural network models. Our defense capitalizes on the high dimensionality of the model parameters to perturb gradients within a subspace orthogonal to the original gradient. By leveraging cold posteriors over orthogonal subspaces, our defense implements a refined gradient update mechanism. This enables the selection of an optimal gradient that not only safeguards against gradient inversion attacks but also maintains model utility. We conduct comprehensive experiments across three different datasets and evaluate our defense against various state-of-the-art attacks and defenses. Code is available at https://censor-gradient.github.io.

Paperid: 602, https://arxiv.org/pdf/2501.07475.pdf

Abstract:
Cybersecurity threats highlight the need for robust network intrusion detection systems to identify malicious behaviour. These systems rely heavily on large datasets to train machine learning models capable of detecting patterns and predicting threats. In the past two decades, researchers have produced a multitude of datasets, however, some widely utilised recent datasets generated with CICFlowMeter contain inaccuracies. These result in flow generation and feature extraction inconsistencies, leading to skewed results and reduced system effectiveness. Other tools in this context lack ease of use, customizable feature sets, and flow labelling options. In this work, we introduce HERA, a new open-source tool that generates flow files and labelled or unlabelled datasets with user-defined features. Validated and tested with the UNSW-NB15 dataset, HERA demonstrated accurate flow and label generation.

Paperid: 603, https://arxiv.org/pdf/2506.19486.pdf

Abstract:
Machine Unlearning (MU) technology facilitates the removal of the influence of specific data instances from trained models on request. Despite rapid advancements in MU technology, its vulnerabilities are still underexplored, posing potential risks of privacy breaches through leaks of ostensibly unlearned information. Current limited research on MU attacks requires access to original models containing privacy data, which violates the critical privacy-preserving objective of MU. To address this gap, we initiate an innovative study on recalling the forgotten class memberships from unlearned models (ULMs) without requiring access to the original one. Specifically, we implement a Membership Recall Attack (MRA) framework with a teacher-student knowledge distillation architecture, where ULMs serve as noisy labelers to transfer knowledge to student models. Then, it is translated into a Learning with Noisy Labels (LNL) problem for inferring the correct labels of the forgetting instances. Extensive experiments on state-of-the-art MU methods with multiple real datasets demonstrate that the proposed MRA strategy exhibits high efficacy in recovering class memberships of unlearned instances. As a result, our study and evaluation have established a benchmark for future research on MU vulnerabilities.

Paperid: 604, https://arxiv.org/pdf/2506.07714.pdf

Abstract:
Electric Vehicles (EVs) are rapidly gaining adoption as a sustainable alternative to fuel-powered vehicles, making secure charging infrastructure essential. Despite traditional authentication protocols, recent results showed that attackers may steal energy through tailored relay attacks. One countermeasure is leveraging the EV's fingerprint on the current exchanged during charging. However, existing methods focus on the final charging stage, allowing malicious actors to consume substantial energy before being detected and repudiated. This underscores the need for earlier and more effective authentication methods to prevent unauthorized charging. Meanwhile, profiling raises privacy concerns, as uniquely identifying EVs through charging patterns could enable user tracking. In this paper, we propose a framework for uniquely identifying EVs using physical measurements from the early charging stages. We hypothesize that voltage behavior early in the process exhibits similar characteristics to current behavior in later stages. By extracting features from early voltage measurements, we demonstrate the feasibility of EV profiling. Our approach improves existing methods by enabling faster and more reliable vehicle identification. We test our solution on a dataset of 7408 usable charges from 49 EVs, achieving up to 0.86 accuracy. Feature importance analysis shows that near-optimal performance is possible with just 10 key features, improving efficiency alongside our lightweight models. This research lays the foundation for a novel authentication factor while exposing potential privacy risks from unauthorized access to charging data.

Paperid: 605, https://arxiv.org/pdf/2506.04647.pdf

Abstract:
Private Set Intersection (PSI) enables secure computation of set intersections while preserving participant privacy, standard PSI existing protocols remain vulnerable to data integrity attacks allowing malicious participants to extract additional intersection information or mislead other parties. In this paper, we propose the definition of data integrity in PSI and construct two authenticated PSI schemes by integrating Merkle Trees with state-of-the-art two-party volePSI and multi-party mPSI protocols. The resulting two-party authenticated PSI achieves communication complexity $\mathcal{O}(n Î»+n \log n)$, aligning with the best-known unauthenticated PSI schemes, while the multi-party construction is $\mathcal{O}(n Îº+n \log n)$ which introduces additional overhead due to Merkle tree inclusion proofs. Due to the incorporation of integrity verification, our authenticated schemes incur higher costs compared to state-of-the-art unauthenticated schemes. We also provide efficient implementations of our protocols and discuss potential improvements, including alternative authentication blocks.

Paperid: 606, https://arxiv.org/pdf/2506.02040.pdf

Abstract:
The Model Context Protocol (MCP) is an emerging standard designed to enable seamless interaction between Large Language Model (LLM) applications and external tools or resources. Within a short period, thousands of MCP services have been developed and deployed. However, the client-server integration architecture inherent in MCP may expand the attack surface against LLM Agent systems, introducing new vulnerabilities that allow attackers to exploit by designing malicious MCP servers. In this paper, we present the first end-to-end empirical evaluation of attack vectors targeting the MCP ecosystem. We identify four categories of attacks, i.e., Tool Poisoning Attacks, Puppet Attacks, Rug Pull Attacks, and Exploitation via Malicious External Resources. To evaluate their feasibility, we conduct experiments following the typical steps of launching an attack through malicious MCP servers: upload -> download -> attack. Specifically, we first construct malicious MCP servers and successfully upload them to three widely used MCP aggregation platforms. The results indicate that current audit mechanisms are insufficient to identify and prevent these threats. Next, through a user study and interview with 20 participants, we demonstrate that users struggle to identify malicious MCP servers and often unknowingly install them from aggregator platforms. Finally, we empirically demonstrate that these attacks can trigger harmful actions within the user's local environment, such as accessing private files or controlling devices to transfer digital assets. Additionally, based on interview results, we discuss four key challenges faced by the current MCP security ecosystem. These findings underscore the urgent need for robust security mechanisms to defend against malicious MCP servers and ensure the safe deployment of increasingly autonomous LLM agents.

Paperid: 607, https://arxiv.org/pdf/2506.00548.pdf

Abstract:
Existing attacks against multimodal language models (MLLMs) primarily communicate instructions through text accompanied by adversarial images. In contrast, we exploit the capabilities of MLLMs to interpret non-textual instructions, specifically, adversarial images or audio generated by our novel method, Con Instruction. We optimize these adversarial examples to align closely with target instructions in the embedding space, revealing the detrimental implications of MLLMs' sophisticated understanding. Unlike prior work, our method does not require training data or preprocessing of textual instructions. While these non-textual adversarial examples can effectively bypass MLLM safety mechanisms, their combination with various text inputs substantially amplifies attack success. We further introduce a new Attack Response Categorization (ARC) framework, which evaluates both the quality of the model's response and its relevance to the malicious instructions. Experimental results demonstrate that Con Instruction effectively bypasses safety mechanisms in multiple vision- and audio-language models, including LLaVA-v1.5, InternVL, Qwen-VL, and Qwen-Audio, evaluated on two standard benchmarks: AdvBench and SafeBench. Specifically, our method achieves the highest attack success rates, reaching 81.3% and 86.6% on LLaVA-v1.5 (13B). On the defense side, we explore various countermeasures against our attacks and uncover a substantial performance gap among existing techniques. Our implementation is made publicly available.

Paperid: 608, https://arxiv.org/pdf/2505.18471.pdf

Abstract:
Modern large language model (LLM) services increasingly rely on complex, often abstract operations, such as multi-step reasoning and multi-agent collaboration, to generate high-quality outputs. While users are billed based on token consumption and API usage, these internal steps are typically not visible. We refer to such systems as Commercial Opaque LLM Services (COLS). This position paper highlights emerging accountability challenges in COLS: users are billed for operations they cannot observe, verify, or contest. We formalize two key risks: \textit{quantity inflation}, where token and call counts may be artificially inflated, and \textit{quality downgrade}, where providers might quietly substitute lower-cost models or tools. Addressing these risks requires a diverse set of auditing strategies, including commitment-based, predictive, behavioral, and signature-based methods. We further explore the potential of complementary mechanisms such as watermarking and trusted execution environments to enhance verifiability without compromising provider confidentiality. We also propose a modular three-layer auditing framework for COLS and users that enables trustworthy verification across execution, secure logging, and user-facing auditability without exposing proprietary internals. Our aim is to encourage further research and policy development toward transparency, auditability, and accountability in commercial LLM services.

Paperid: 609, https://arxiv.org/pdf/2505.09384.pdf

Abstract:
Despite being a legacy protocol with various known security issues, Controller Area Network (CAN) still represents the de-facto standard for communications within vehicles, ships, and industrial control systems. Many research works have designed Intrusion Detection Systems (IDSs) to identify attacks by training machine learning classifiers on bus traffic or its properties. Actions to take after detection are, on the other hand, less investigated, and prevention mechanisms usually include protocol modification (e.g., adding authentication). An effective solution has yet to be implemented on a large scale in the wild. The reasons are related to the effort to handle sporadic false positives, the inevitable delay introduced by authentication, and the closed-source automobile environment that does not easily permit modifying Electronic Control Units (ECUs) software. In this paper, we propose CANTXSec, the first deterministic Intrusion Detection and Prevention system based on physical ECU activations. It employs a new classification of attacks based on the attacker's need in terms of access level to the bus, distinguishing between Frame Injection Attacks (FIAs) (i.e., using frame-level access) and Single-Bit Attacks (SBAs) (i.e., employing bit-level access). CANTXSec detects and prevents classical attacks in the CAN bus, while detecting advanced attacks that have been less investigated in the literature. We prove the effectiveness of our solution on a physical testbed, where we achieve 100% detection accuracy in both classes of attacks while preventing 100% of FIAs. Moreover, to encourage developers to employ CANTXSec, we discuss implementation details, providing an analysis based on each user's risk assessment.

Paperid: 610, https://arxiv.org/pdf/2505.02713.pdf

Abstract:
Remote Keyless Entry (RKE) systems have been the target of thieves since their introduction in automotive industry. Robberies targeting vehicles and their remote entry systems are booming again without a significant advancement from the industrial sector being able to protect against them. Researchers and attackers continuously play cat and mouse to implement new methodologies to exploit weaknesses and defense strategies for RKEs. In this fragment, different attacks and defenses have been discussed in research and industry without proper bridging. In this paper, we provide a Systematization Of Knowledge (SOK) on RKE and Passive Keyless Entry and Start (PKES), focusing on their history and current situation, ranging from legacy systems to modern web-based ones. We provide insight into vehicle manufacturers' technologies and attacks and defense mechanisms involving them. To the best of our knowledge, this is the first comprehensive SOK on RKE systems, and we address specific research questions to understand the evolution and security status of such systems. By identifying the weaknesses RKE still faces, we provide future directions for security researchers and companies to find viable solutions to address old attacks, such as Relay and RollJam, as well as new ones, like API vulnerabilities.

Paperid: 611, https://arxiv.org/pdf/2505.00841.pdf

Abstract:
This report explores the convergence of large language models (LLMs) and cybersecurity, synthesizing interdisciplinary insights from network security, artificial intelligence, formal methods, and human-centered design. It examines emerging applications of LLMs in software and network security, 5G vulnerability analysis, and generative security engineering. The report highlights the role of agentic LLMs in automating complex tasks, improving operational efficiency, and enabling reasoning-driven security analytics. Socio-technical challenges associated with the deployment of LLMs -- including trust, transparency, and ethical considerations -- can be addressed through strategies such as human-in-the-loop systems, role-specific training, and proactive robustness testing. The report further outlines critical research challenges in ensuring interpretability, safety, and fairness in LLM-based systems, particularly in high-stakes domains. By integrating technical advances with organizational and societal considerations, this report presents a forward-looking research agenda for the secure and effective adoption of LLMs in cybersecurity.

Paperid: 612, https://arxiv.org/pdf/2504.12604.pdf

Abstract:
In this paper, we study linear codes over $\mathbb{Z}_k$ based on lattices and theta functions. We obtain the complete weight enumerators MacWilliams identity and the symmetrized weight enumerators MacWilliams identity based on the theory of theta function. We extend the main work by Bannai, Dougherty, Harada and Oura to the finite ring $\mathbb{Z}_k$ for any positive integer $k$ and present the complete weight enumerators MacWilliams identity in genus $g$. When $k=p$ is a prime number, we establish the relationship between the theta function of associated lattices over a cyclotomic field and the complete weight enumerators with Hamming weight of codes, which is an analogy of the results by G. Van der Geer and F. Hirzebruch since they showed the identity with the Lee weight enumerators.

Paperid: 613, https://arxiv.org/pdf/2504.04041.pdf

Abstract:
This paper introduces a novel lower bound on communication complexity using quantum relative entropy and mutual information, refining previous classical entropy-based results. By leveraging Uhlmann's lemma and quantum Pinsker inequalities, the authors establish tighter bounds for information-theoretic security, demonstrating that quantum protocols inherently outperform classical counterparts in balancing privacy and efficiency. Also explores symmetric Quantum Private Information Retrieval (QPIR) protocols that achieve sub-linear communication complexity while ensuring robustness against specious adversaries: A post-quantum cryptography based protocol that can be authenticated for the specious server; A ring-LWE-based protocol for post-quantum security in a single-server setting, ensuring robustness against quantum attacks; A multi-server protocol optimized for hardware practicality, reducing implementation overhead while maintaining sub-linear efficiency. These protocols address critical gaps in secure database queries, offering exponential communication improvements over classical linear-complexity methods. The work also analyzes security trade-offs under quantum specious adversaries, providing theoretical guarantees for privacy and correctness.

Paperid: 614, https://arxiv.org/pdf/2503.17137.pdf

Abstract:
In 2002, Johnson et al. posed an open problem at the Cryptographers' Track of the RSA Conference: how to construct a secure homomorphic signature on a semigroup, rather than on a group. In this paper, we introduce, for the first time, a semigroup-homomorphic signature scheme. Under certain conditions, we prove that the security of this scheme is based on the hardness of the Short Integer Solution (SIS) problem and is tightly secure. Furthermore, we extend it to a linear semigroup-homomorphic signature scheme over lattices, and this scheme can also ensure privacy.

Paperid: 615, https://arxiv.org/pdf/2503.14932.pdf

Abstract:
In recent years, Large Language Models (LLMs) have demonstrated remarkable abilities in various natural language processing tasks. However, adapting these models to specialized domains using private datasets stored on resource-constrained edge devices, such as smartphones and personal computers, remains challenging due to significant privacy concerns and limited computational resources. Existing model adaptation methods either compromise data privacy by requiring data transmission or jeopardize model privacy by exposing proprietary LLM parameters. To address these challenges, we propose Prada, a novel privacy-preserving and efficient black-box LLM adaptation system using private on-device datasets. Prada employs a lightweight proxy model fine-tuned with Low-Rank Adaptation (LoRA) locally on user devices. During inference, Prada leverages the logits offset, i.e., difference in outputs between the base and adapted proxy models, to iteratively refine outputs from a remote black-box LLM. This offset-based adaptation approach preserves both data privacy and model privacy, as there is no need to share sensitive data or proprietary model parameters. Furthermore, we incorporate speculative decoding to further speed up the inference process of Prada, making the system practically deployable on bandwidth-constrained edge devices, enabling a more practical deployment of Prada. Extensive experiments on various downstream tasks demonstrate that Prada achieves performance comparable to centralized fine-tuning methods while significantly reducing computational overhead by up to 60% and communication costs by up to 80%.

Paperid: 616, https://arxiv.org/pdf/2503.05021.pdf

Abstract:
Large Language Models (LLMs) are vulnerable to jailbreak attacks that exploit weaknesses in traditional safety alignment, which often relies on rigid refusal heuristics or representation engineering to block harmful outputs. While they are effective for direct adversarial attacks, they fall short of broader safety challenges requiring nuanced, context-aware decision-making. To address this, we propose Reasoning-enhanced Finetuning for interpretable LLM Safety (Rational), a novel framework that trains models to engage in explicit safe reasoning before response. Fine-tuned models leverage the extensive pretraining knowledge in self-generated reasoning to bootstrap their own safety through structured reasoning, internalizing context-sensitive decision-making. Our findings suggest that safety extends beyond refusal, requiring context awareness for more robust, interpretable, and adaptive responses. Reasoning is not only a core capability of LLMs but also a fundamental mechanism for LLM safety. Rational employs reasoning-enhanced fine-tuning, allowing it to reject harmful prompts while providing meaningful and context-aware responses in complex scenarios.

Paperid: 617, https://arxiv.org/pdf/2502.19687.pdf

Abstract:
The advent of Autonomous Driving Systems (ADS) has marked a significant shift towards intelligent transportation, with implications for public safety and traffic efficiency. While these systems integrate a variety of technologies and offer numerous benefits, their security is paramount, as vulnerabilities can have severe consequences for safety and trust. This study aims to systematically investigate potential security weaknesses in the codebases of prominent open-source ADS projects using CodeQL, a static code analysis tool. The goal is to identify common vulnerabilities, their distribution and persistence across versions to enhance the security of ADS. We selected three representative open-source ADS projects, Autoware, AirSim, and Apollo, based on their high GitHub star counts and Level 4 autonomous driving capabilities. Using CodeQL, we analyzed multiple versions of these projects to identify vulnerabilities, focusing on CWE categories such as CWE-190 (Integer Overflow or Wraparound) and CWE-20 (Improper Input Validation). We also tracked the lifecycle of these vulnerabilities across software versions. This approach allows us to systematically analyze vulnerabilities in projects, which has not been extensively explored in previous ADS research. Our analysis revealed that specific CWE categories, particularly CWE-190 (59.6%) and CWE-20 (16.1%), were prevalent across the selected ADS projects. These vulnerabilities often persisted for over six months, spanning multiple version iterations. The empirical assessment showed a direct link between the severity of these vulnerabilities and their tangible effects on ADS performance. These security issues among ADS still remain to be resolved. Our findings highlight the need for integrating static code analysis into ADS development to detect and mitigate common vulnerabilities.

Paperid: 618, https://arxiv.org/pdf/2502.15799.pdf

Abstract:
Large Language Models (LLMs) are powerful tools for modern applications, but their computational demands limit accessibility. Quantization offers efficiency gains, yet its impact on safety and trustworthiness remains poorly understood. To address this, we introduce OpenMiniSafety, a human-curated safety dataset with 1.067 challenging questions to rigorously evaluate model behavior. We publicly release human safety evaluations for four LLMs (both quantized and full-precision), totaling 4.268 annotated question-answer pairs. By assessing 66 quantized variants of these models using four post-training quantization (PTQ) and two quantization-aware training (QAT) methods across four safety benchmarks including human-centric evaluations we uncover critical safety performance trade-offs. Our results show both PTQ and QAT can degrade safety alignment, with QAT techniques like QLORA or STE performing less safely. No single method consistently outperforms others across benchmarks, precision settings, or models, highlighting the need for safety-aware compression strategies. Furthermore, precision-specialized methods (e.g., QUIK and AWQ for 4-bit, AQLM and Q-PET for 2-bit) excel at their target precision, meaning that these methods are not better at compressing but rather different approaches.

Paperid: 619, https://arxiv.org/pdf/2502.07760.pdf

Abstract:
Model fingerprinting has emerged as a powerful tool for model owners to identify their shared model given API access. However, to lower false discovery rate, fight fingerprint leakage, and defend against coalitions of model users attempting to bypass detection, we argue that {\em scalability} is critical, i.e., scaling up the number of fingerprints one can embed into a model. Hence, we pose scalability as a crucial requirement for fingerprinting schemes. We experiment with fingerprint design at a scale significantly larger than previously considered, and introduce a new method, dubbed Perinucleus sampling, to generate scalable, persistent, and harmless fingerprints. We demonstrate that this scheme can add 24,576 fingerprints to a Llama-3.1-8B model -- two orders of magnitude more than existing schemes -- without degrading the model's utility. Our inserted fingerprints persist even after supervised fine-tuning on standard post-training data. We further address security risks for fingerprinting, and theoretically and empirically show how a scalable fingerprinting scheme like ours can mitigate these risks.

Paperid: 620, https://arxiv.org/pdf/2502.07557.pdf

Abstract:
Despite the implementation of safety alignment strategies, large language models (LLMs) remain vulnerable to jailbreak attacks, which undermine these safety guardrails and pose significant security threats. Some defenses have been proposed to detect or mitigate jailbreaks, but they are unable to withstand the test of time due to an insufficient understanding of jailbreak mechanisms. In this work, we investigate the mechanisms behind jailbreaks based on the Linear Representation Hypothesis (LRH), which states that neural networks encode high-level concepts as subspaces in their hidden representations. We define the toxic semantics in harmful and jailbreak prompts as toxic concepts and describe the semantics in jailbreak prompts that manipulate LLMs to comply with unsafe requests as jailbreak concepts. Through concept extraction and analysis, we reveal that LLMs can recognize the toxic concepts in both harmful and jailbreak prompts. However, unlike harmful prompts, jailbreak prompts activate the jailbreak concepts and alter the LLM output from rejection to compliance. Building on our analysis, we propose a comprehensive jailbreak defense framework, JBShield, consisting of two key components: jailbreak detection JBShield-D and mitigation JBShield-M. JBShield-D identifies jailbreak prompts by determining whether the input activates both toxic and jailbreak concepts. When a jailbreak prompt is detected, JBShield-M adjusts the hidden representations of the target LLM by enhancing the toxic concept and weakening the jailbreak concept, ensuring LLMs produce safe content. Extensive experiments demonstrate the superior performance of JBShield, achieving an average detection accuracy of 0.95 and reducing the average attack success rate of various jailbreak attacks to 2% from 61% across distinct LLMs.

Paperid: 621, https://arxiv.org/pdf/2502.02542.pdf

Abstract:
We increase overhead for applications that rely on reasoning LLMs-we force models to spend an amplified number of reasoning tokens, i.e., "overthink", to respond to the user query while providing contextually correct answers. The adversary performs an OVERTHINK attack by injecting decoy reasoning problems into the public content that is used by the reasoning LLM (e.g., for RAG applications) during inference time. Due to the nature of our decoy problems (e.g., a Markov Decision Process), modified texts do not violate safety guardrails. We evaluated our attack across closed-(OpenAI o1, o1-mini, o3-mini) and open-(DeepSeek R1) weights reasoning models on the FreshQA and SQuAD datasets. Our results show up to 18x slowdown on FreshQA dataset and 46x slowdown on SQuAD dataset. The attack also shows high transferability across models. To protect applications, we discuss and implement defenses leveraging LLM-based and system design approaches. Finally, we discuss societal, financial, and energy impacts of OVERTHINK attack which could amplify the costs for third-party applications operating reasoning models.

Paperid: 622, https://arxiv.org/pdf/2501.16490.pdf

Abstract:
Smart grids are crucial for meeting rising energy demands driven by global population growth and urbanization. By integrating renewable energy sources, they enhance efficiency, reliability, and sustainability. However, ensuring their availability and security requires advanced operational control and safety measures. Although artificial intelligence and machine learning can help assess grid stability, challenges such as data scarcity and cybersecurity threats, particularly adversarial attacks, remain. Data scarcity is a major issue, as obtaining real-world instances of grid instability requires significant expertise, resources, and time. Yet, these instances are critical for testing new research advancements and security mitigations. This paper introduces a novel framework for detecting instability in smart grids using only stable data. It employs a Generative Adversarial Network (GAN) where the generator is designed not to produce near-realistic data but instead to generate Out-Of-Distribution (OOD) samples with respect to the stable class. These OOD samples represent unstable behavior, anomalies, or disturbances that deviate from the stable data distribution. By training exclusively on stable data and exposing the discriminator to OOD samples, our framework learns a robust decision boundary to distinguish stable conditions from any unstable behavior, without requiring unstable data during training. Furthermore, we incorporate an adversarial training layer to enhance resilience against attacks. Evaluated on a real-world dataset, our solution achieves up to 98.1\% accuracy in predicting grid stability and 98.9\% in detecting adversarial attacks. Implemented on a single-board computer, it enables real-time decision-making with an average response time of under 7ms.

Paperid: 623, https://arxiv.org/pdf/2506.19676.pdf

Abstract:
In recent years, Large-Language-Model-driven AI agents have exhibited unprecedented intelligence and adaptability, and are rapidly changing human production and life. Nowadays, agents are undergoing a new round of evolution. They no longer act as an isolated island like LLMs. Instead, they start to communicate with diverse external entities, such as other agents and tools, to perform more complex tasks collectively. Under this trend, agent communication is regarded as a foundational pillar of the future AI ecosystem, and many organizations have intensively begun to design related communication protocols (e.g., Anthropic's MCP and Google's A2A) within the recent few months. However, this new field exposes significant security hazards, which can cause severe damage to real-world scenarios. To help researchers quickly figure out this promising topic and benefit the future agent communication development, this paper presents a comprehensive survey of agent communication security. More precisely, we first present a clear definition of agent communication and categorize the entire lifecycle of agent communication into three stages: user-agent interaction, agent-agent communication, and agent-environment communication. Next, for each communication phase, we dissect related protocols and analyze the security risks according to the communication characteristics. Then, we summarize and outlook on the possible defense countermeasures for each risk. In addition, we conduct experiments using MCP and A2A to help readers better understand the novel vulnerabilities brought by agent communication. Finally, we discuss open issues and future directions in this promising research field.

Paperid: 624, https://arxiv.org/pdf/2506.06018.pdf

Abstract:
Watermarking becomes one of the pivotal solutions to trace and verify the origin of synthetic images generated by artificial intelligence models, but it is not free of risks. Recent studies demonstrate the capability to forge watermarks from a target image onto cover images via adversarial optimization without knowledge of the target generative model and watermark schemes. In this paper, we uncover a greater risk of an optimization-free and universal watermark forgery that harnesses existing regenerative diffusion models. Our proposed forgery attack, PnP (Plug-and-Plant), seamlessly extracts and integrates the target watermark via regenerating the image, without needing any additional optimization routine. It allows for universal watermark forgery that works independently of the target image's origin or the watermarking model used. We explore the watermarked latent extracted from the target image and visual-textual context of cover images as priors to guide sampling of the regenerative process. Extensive evaluation on 24 scenarios of model-data-watermark combinations demonstrates that PnP can successfully forge the watermark (up to 100% detectability and user attribution), and maintain the best visual perception. By bypassing model retraining and enabling adaptability to any image, our approach significantly broadens the scope of forgery attacks, presenting a greater challenge to the security of current watermarking techniques for diffusion models and the authority of watermarking schemes in synthetic data generation and governance.

Paperid: 625, https://arxiv.org/pdf/2506.05346.pdf

Abstract:
Recent advancements in large language models (LLMs) have underscored their vulnerability to safety alignment jailbreaks, particularly when subjected to downstream fine-tuning. However, existing mitigation strategies primarily focus on reactively addressing jailbreak incidents after safety guardrails have been compromised, removing harmful gradients during fine-tuning, or continuously reinforcing safety alignment throughout fine-tuning. As such, they tend to overlook a critical upstream factor: the role of the original safety-alignment data. This paper therefore investigates the degradation of safety guardrails through the lens of representation similarity between upstream alignment datasets and downstream fine-tuning tasks. Our experiments demonstrate that high similarity between these datasets significantly weakens safety guardrails, making models more susceptible to jailbreaks. Conversely, low similarity between these two types of datasets yields substantially more robust models and thus reduces harmfulness score by up to 10.33%. By highlighting the importance of upstream dataset design in the building of durable safety guardrails and reducing real-world vulnerability to jailbreak attacks, these findings offer actionable insights for fine-tuning service providers.

Paperid: 626, https://arxiv.org/pdf/2506.03234.pdf

Abstract:
Reinforcement Learning from Human Feedback (RLHF) is crucial for aligning text-to-image (T2I) models with human preferences. However, RLHF's feedback mechanism also opens new pathways for adversaries. This paper demonstrates the feasibility of hijacking T2I models by poisoning a small fraction of preference data with natural-appearing examples. Specifically, we propose BadReward, a stealthy clean-label poisoning attack targeting the reward model in multi-modal RLHF. BadReward operates by inducing feature collisions between visually contradicted preference data instances, thereby corrupting the reward model and indirectly compromising the T2I model's integrity. Unlike existing alignment poisoning techniques focused on single (text) modality, BadReward is independent of the preference annotation process, enhancing its stealth and practical threat. Extensive experiments on popular T2I models show that BadReward can consistently guide the generation towards improper outputs, such as biased or violent imagery, for targeted concepts. Our findings underscore the amplified threat landscape for RLHF in multi-modal systems, highlighting the urgent need for robust defenses. Disclaimer. This paper contains uncensored toxic content that might be offensive or disturbing to the readers.

Paperid: 627, https://arxiv.org/pdf/2505.12038.pdf

Abstract:
Large language models (LLMs) have shown great potential as general-purpose AI assistants across various domains. To fully leverage this potential in specific applications, many companies provide fine-tuning API services, enabling users to upload their own data for LLM customization. However, fine-tuning services introduce a new safety threat: user-uploaded data, whether harmful or benign, can break the model's alignment, leading to unsafe outputs. Moreover, existing defense methods struggle to address the diversity of fine-tuning datasets (e.g., varying sizes, tasks), often sacrificing utility for safety or vice versa. To address this issue, we propose Safe Delta, a safety-aware post-training defense method that adjusts the delta parameters (i.e., the parameter change before and after fine-tuning). Specifically, Safe Delta estimates the safety degradation, selects delta parameters to maximize utility while limiting overall safety loss, and applies a safety compensation vector to mitigate residual safety loss. Through extensive experiments on four diverse datasets with varying settings, our approach consistently preserves safety while ensuring that the utility gain from benign datasets remains unaffected.

Paperid: 628, https://arxiv.org/pdf/2504.11510.pdf

Abstract:
In various networks and mobile applications, users are highly susceptible to attribute inference attacks, with particularly prevalent occurrences in recommender systems. Attackers exploit partially exposed user profiles in recommendation models, such as user embeddings, to infer private attributes of target users, such as gender and political views. The goal of defenders is to mitigate the effectiveness of these attacks while maintaining recommendation performance. Most existing defense methods, such as differential privacy and attribute unlearning, focus on post-training settings, which limits their capability of utilizing training data to preserve recommendation performance. Although adversarial training extends defenses to in-training settings, it often struggles with convergence due to unstable training processes. In this paper, we propose RAID, an in-training defense method against attribute inference attacks in recommender systems. In addition to the recommendation objective, we define a defensive objective to ensure that the distribution of protected attributes becomes independent of class labels, making users indistinguishable from attribute inference attacks. Specifically, this defensive objective aims to solve a constrained Wasserstein barycenter problem to identify the centroid distribution that makes the attribute indistinguishable while complying with recommendation performance constraints. To optimize our proposed objective, we use optimal transport to align users with the centroid distribution. We conduct extensive experiments on four real-world datasets to evaluate RAID. The experimental results validate the effectiveness of RAID and demonstrate its significant superiority over existing methods in multiple aspects.

Paperid: 629, https://arxiv.org/pdf/2504.09593.pdf

Abstract:
Retrieval-Augmented Generation (RAG) has significantly enhanced the factual accuracy and domain adaptability of Large Language Models (LLMs). This advancement has enabled their widespread deployment across sensitive domains such as healthcare, finance, and enterprise applications. RAG mitigates hallucinations by integrating external knowledge, yet introduces privacy risk and security risk, notably data breaching risk and data poisoning risk. While recent studies have explored prompt injection and poisoning attacks, there remains a significant gap in comprehensive research on controlling inbound and outbound query flows to mitigate these threats. In this paper, we propose an AI firewall, ControlNET, designed to safeguard RAG-based LLM systems from these vulnerabilities. ControlNET controls query flows by leveraging activation shift phenomena to detect adversarial queries and mitigate their impact through semantic divergence. We conduct comprehensive experiments on four different benchmark datasets including Msmarco, HotpotQA, FinQA, and MedicalSys using state-of-the-art open source LLMs (Llama3, Vicuna, and Mistral). Our results demonstrate that ControlNET achieves over 0.909 AUROC in detecting and mitigating security threats while preserving system harmlessness. Overall, ControlNET offers an effective, robust, harmless defense mechanism, marking a significant advancement toward the secure deployment of RAG-based LLM systems.

Paperid: 630, https://arxiv.org/pdf/2502.13946.pdf

Abstract:
The safety alignment of large language models (LLMs) remains vulnerable, as their initial behavior can be easily jailbroken by even relatively simple attacks. Since infilling a fixed template between the input instruction and initial model output is a common practice for existing LLMs, we hypothesize that this template is a key factor behind their vulnerabilities: LLMs' safety-related decision-making overly relies on the aggregated information from the template region, which largely influences these models' safety behavior. We refer to this issue as template-anchored safety alignment. In this paper, we conduct extensive experiments and verify that template-anchored safety alignment is widespread across various aligned LLMs. Our mechanistic analyses demonstrate how it leads to models' susceptibility when encountering inference-time jailbreak attacks. Furthermore, we show that detaching safety mechanisms from the template region is promising in mitigating vulnerabilities to jailbreak attacks. We encourage future research to develop more robust safety alignment techniques that reduce reliance on the template region.

Paperid: 631, https://arxiv.org/pdf/2501.05928.pdf

Abstract:
Recent research on backdoor stealthiness focuses mainly on indistinguishable triggers in input space and inseparable backdoor representations in feature space, aiming to circumvent backdoor defenses that examine these respective spaces. However, existing backdoor attacks are typically designed to resist a specific type of backdoor defense without considering the diverse range of defense mechanisms. Based on this observation, we pose a natural question: Are current backdoor attacks truly a real-world threat when facing diverse practical defenses? To answer this question, we examine 12 common backdoor attacks that focus on input-space or feature-space stealthiness and 17 diverse representative defenses. Surprisingly, we reveal a critical blind spot: Backdoor attacks designed to be stealthy in input and feature spaces can be mitigated by examining backdoored models in parameter space. To investigate the underlying causes behind this common vulnerability, we study the characteristics of backdoor attacks in the parameter space. Notably, we find that input- and feature-space attacks introduce prominent backdoor-related neurons in parameter space, which are not thoroughly considered by current backdoor attacks. Taking comprehensive stealthiness into account, we propose a novel supply-chain attack called Grond. Grond limits the parameter changes by a simple yet effective module, Adversarial Backdoor Injection (ABI), which adaptively increases the parameter-space stealthiness during the backdoor injection. Extensive experiments demonstrate that Grond outperforms all 12 backdoor attacks against state-of-the-art (including adaptive) defenses on CIFAR-10, GTSRB, and a subset of ImageNet. In addition, we show that ABI consistently improves the effectiveness of common backdoor attacks.

Paperid: 632, https://arxiv.org/pdf/2506.22750.pdf

Abstract:
The widespread use of Android applications has made them a prime target for cyberattacks, significantly increasing the risk of malware that threatens user privacy, security, and device functionality. Effective malware detection is thus critical, with static analysis, dynamic analysis, and Machine Learning being widely used approaches. In this work, we focus on a Machine Learning-based method utilizing static features. We first compiled a dataset of benign and malicious APKs and performed static analysis to extract features such as code structure, permissions, and manifest file content, without executing the apps. Instead of relying solely on raw static features, our system uses an LLM to generate high-level functional descriptions of APKs. To mitigate hallucinations, which are a known vulnerability of LLM, we integrated Retrieval-Augmented Generation (RAG), enabling the LLM to ground its output in relevant context. Using carefully designed prompts, we guide the LLM to produce coherent function summaries, which are then analyzed using a transformer-based model, improving detection accuracy over conventional feature-based methods for malware detection.

Paperid: 633, https://arxiv.org/pdf/2506.19542.pdf

Abstract:
Indistinguishability obfuscation (iO) has emerged as a powerful cryptographic primitive with many implications. While classical iO, combined with the infinitely-often worst-case hardness of $\mathsf{NP}$, is known to imply one-way functions (OWFs) and a range of advanced cryptographic primitives, the cryptographic implications of quantum iO remain poorly understood. In this work, we initiate a study of the power of quantum iO. We define several natural variants of quantum iO, distinguished by whether the obfuscation algorithm, evaluation algorithm, and description of obfuscated program are classical or quantum. For each variant, we identify quantum cryptographic primitives that can be constructed under the assumption of quantum iO and the infinitely-often quantum worst-case hardness of $\mathsf{NP}$ (i.e., $\mathsf{NP}\not\subseteq\mathsf{\text{i.o.} BQP}$). In particular, we construct pseudorandom unitaries, QCCC quantum public-key encryption and (QCCC) quantum symmetric-key encryption, and several primitives implied by them such as one-way state generators, (efficiently-verifiable) one-way puzzles, and EFI pairs, etc. While our main focus is on quantum iO, even in the classical setting, our techniques yield a new and arguably simpler construction of OWFs from classical (imperfect) iO and the infinitely-often worst-case hardness of $\mathsf{NP}$.

Paperid: 634, https://arxiv.org/pdf/2506.19356.pdf

Abstract:
URL+HTML feature fusion shows promise for robust malicious URL detection, since attacker artifacts persist in DOM structures. However, prior work suffers from four critical shortcomings: (1) incomplete URL modeling, failing to jointly capture lexical patterns and semantic context; (2) HTML graph sparsity, where threat-indicative nodes (e.g., obfuscated scripts) are isolated amid benign content, causing signal dilution during graph aggregation; (3) unidirectional analysis, ignoring URL-HTML feature bidirectional interaction; and (4) opaque decisions, lacking attribution to malicious DOM components. To address these challenges, we present WebGuard++, a detection framework with 4 novel components: 1) Cross-scale URL Encoder: Hierarchically learns local-to-global and coarse to fine URL features based on Transformer network with dynamic convolution. 2) Subgraph-aware HTML Encoder: Decomposes DOM graphs into interpretable substructures, amplifying sparse threat signals via Hierarchical feature fusion. 3) Bidirectional Coupling Module: Aligns URL and HTML embeddings through cross-modal contrastive learning, optimizing inter-modal consistency and intra-modal specificity. 4) Voting Module: Localizes malicious regions through consensus voting on malicious subgraph predictions. Experiments show WebGuard++ achieves significant improvements over state-of-the-art baselines, achieving 1.1x-7.9x higher TPR at fixed FPR of 0.001 and 0.0001 across both datasets.

Paperid: 635, https://arxiv.org/pdf/2506.10685.pdf

Abstract:
Traditional CAPTCHA (Completely Automated Public Turing Test to Tell Computers and Humans Apart) schemes are increasingly vulnerable to automated attacks powered by deep neural networks (DNNs). Existing adversarial attack methods often rely on the original image characteristics, resulting in distortions that hinder human interpretation and limit their applicability in scenarios where no initial input images are available. To address these challenges, we propose the Unsourced Adversarial CAPTCHA (DAC), a novel framework that generates high-fidelity adversarial examples guided by attacker-specified semantics information. Leveraging a Large Language Model (LLM), DAC enhances CAPTCHA diversity and enriches the semantic information. To address various application scenarios, we examine the white-box targeted attack scenario and the black box untargeted attack scenario. For target attacks, we introduce two latent noise variables that are alternately guided in the diffusion step to achieve robust inversion. The synergy between gradient guidance and latent variable optimization achieved in this way ensures that the generated adversarial examples not only accurately align with the target conditions but also achieve optimal performance in terms of distributional consistency and attack effectiveness. In untargeted attacks, especially for black-box scenarios, we introduce bi-path unsourced adversarial CAPTCHA (BP-DAC), a two-step optimization strategy employing multimodal gradients and bi-path optimization for efficient misclassification. Experiments show that the defensive adversarial CAPTCHA generated by BP-DAC is able to defend against most of the unknown models, and the generated CAPTCHA is indistinguishable to both humans and DNNs.

Paperid: 636, https://arxiv.org/pdf/2505.18543.pdf

Abstract:
Retrieval-Augmented Generation (RAG) has proven effective in mitigating hallucinations in large language models by incorporating external knowledge during inference. However, this integration introduces new security vulnerabilities, particularly to poisoning attacks. Although prior work has explored various poisoning strategies, a thorough assessment of their practical threat to RAG systems remains missing. To address this gap, we propose the first comprehensive benchmark framework for evaluating poisoning attacks on RAG. Our benchmark covers 5 standard question answering (QA) datasets and 10 expanded variants, along with 13 poisoning attack methods and 7 defense mechanisms, representing a broad spectrum of existing techniques. Using this benchmark, we conduct a comprehensive evaluation of all included attacks and defenses across the full dataset spectrum. Our findings show that while existing attacks perform well on standard QA datasets, their effectiveness drops significantly on the expanded versions. Moreover, our results demonstrate that various advanced RAG architectures, such as sequential, branching, conditional, and loop RAG, as well as multi-turn conversational RAG, multimodal RAG systems, and RAG-based LLM agent systems, remain susceptible to poisoning attacks. Notably, current defense techniques fail to provide robust protection, underscoring the pressing need for more resilient and generalizable defense strategies.

Paperid: 637, https://arxiv.org/pdf/2505.12625.pdf

Abstract:
DeepSeek recently released R1, a high-performing large language model (LLM) optimized for reasoning tasks. Despite its efficient training pipeline, R1 achieves competitive performance, even surpassing leading reasoning models like OpenAI's o1 on several benchmarks. However, emerging reports suggest that R1 refuses to answer certain prompts related to politically sensitive topics in China. While existing LLMs often implement safeguards to avoid generating harmful or offensive outputs, R1 represents a notable shift - exhibiting censorship-like behavior on politically charged queries. In this paper, we investigate this phenomenon by first introducing a large-scale set of heavily curated prompts that get censored by R1, covering a range of politically sensitive topics, but are not censored by other models. We then conduct a comprehensive analysis of R1's censorship patterns, examining their consistency, triggers, and variations across topics, prompt phrasing, and context. Beyond English-language queries, we explore censorship behavior in other languages. We also investigate the transferability of censorship to models distilled from the R1 language model. Finally, we propose techniques for bypassing or removing this censorship. Our findings reveal possible additional censorship integration likely shaped by design choices during training or alignment, raising concerns about transparency, bias, and governance in language model deployment.

Paperid: 638, https://arxiv.org/pdf/2505.12018.pdf

Abstract:
Cybersecurity training has become a crucial part of computer science education and industrial onboarding. Capture the Flag (CTF) competitions have emerged as a valuable, gamified approach for developing and refining the skills of cybersecurity and software engineering professionals. However, while CTFs provide a controlled environment for tackling real world challenges, the participants' decision making and problem solving processes remain under explored. Recognizing that psychology may play a role in a cyber attacker's behavior, we investigate how cognitive biases could be used to improve CTF education and security. In this paper, we present an approach to control cognitive biases, specifically Satisfaction of Search and Loss Aversion, to influence and potentially hinder attackers' effectiveness against web application vulnerabilities in a CTF style challenge. We employ a rigorous quantitative and qualitative analysis through a controlled human study of CTF tasks. CTF exercises are widely used in cybersecurity education and research to simulate real world attack scenarios and help participants develop critical skills by solving security challenges in controlled environments. In our study, participants interact with a web application containing deliberately embedded vulnerabilities while being subjected to tasks designed to trigger cognitive biases. Our study reveals that many participants exhibit the Satisfaction of Search bias and that this bias has a significant effect on their success. On average, participants found 25% fewer flags compared to those who did not exhibit this bias. Our findings provide valuable insights into how cognitive biases can be strategically employed to enhance cybersecurity outcomes, education, and measurements through the lens of CTF challenges.

Paperid: 639, https://arxiv.org/pdf/2505.11154.pdf

Abstract:
Model Context Protocol (MCP) standardizes interface mapping for large language models (LLMs) to access external data and tools, which revolutionizes the paradigm of tool selection and facilitates the rapid expansion of the LLM agent tool ecosystem. However, as the MCP is increasingly adopted, third-party customized versions of the MCP server expose potential security vulnerabilities. In this paper, we first introduce a novel security threat, which we term the MCP Preference Manipulation Attack (MPMA). An attacker deploys a customized MCP server to manipulate LLMs, causing them to prioritize it over other competing MCP servers. This can result in economic benefits for attackers, such as revenue from paid MCP services or advertising income generated from free servers. To achieve MPMA, we first design a Direct Preference Manipulation Attack ($\mathtt{DPMA}$) that achieves significant effectiveness by inserting the manipulative word and phrases into the tool name and description. However, such a direct modification is obvious to users and lacks stealthiness. To address these limitations, we further propose Genetic-based Advertising Preference Manipulation Attack ($\mathtt{GAPMA}$). $\mathtt{GAPMA}$ employs four commonly used strategies to initialize descriptions and integrates a Genetic Algorithm (GA) to enhance stealthiness. The experiment results demonstrate that $\mathtt{GAPMA}$ balances high effectiveness and stealthiness. Our study reveals a critical vulnerability of the MCP in open ecosystems, highlighting an urgent need for robust defense mechanisms to ensure the fairness of the MCP ecosystem.

Paperid: 640, https://arxiv.org/pdf/2505.09110.pdf

Abstract:
Federated learning (FL) enables multiple clients to collaboratively train a global machine learning model without sharing their raw data. However, the decentralized nature of FL introduces vulnerabilities, particularly to poisoning attacks, where malicious clients manipulate their local models to disrupt the training process. While Byzantine-robust aggregation rules have been developed to mitigate such attacks, they remain inadequate against more advanced threats. In response, recent advancements have focused on FL detection techniques to identify potentially malicious participants. Unfortunately, these methods often misclassify numerous benign clients as threats or rely on unrealistic assumptions about the server's capabilities. In this paper, we propose a novel algorithm, SafeFL, specifically designed to accurately identify malicious clients in FL. The SafeFL approach involves the server collecting a series of global models to generate a synthetic dataset, which is then used to distinguish between malicious and benign models based on their behavior. Extensive testing demonstrates that SafeFL outperforms existing methods, offering superior efficiency and accuracy in detecting malicious clients.

Paperid: 641, https://arxiv.org/pdf/2505.07574.pdf

Abstract:
Malware, a persistent cybersecurity threat, increasingly targets interconnected digital systems such as desktop, mobile, and IoT platforms through sophisticated attack vectors. By exploiting these vulnerabilities, attackers compromise the integrity and resilience of modern digital ecosystems. To address this risk, security experts actively employ Machine Learning or Deep Learning-based strategies, integrating static, dynamic, or hybrid approaches to categorize malware instances. Despite their advantages, these methods have inherent drawbacks and malware variants persistently evolve with increased sophistication, necessitating advancements in detection strategies. Visualization-based techniques are emerging as scalable and interpretable solutions for detecting and understanding malicious behaviors across diverse platforms including desktop, mobile, IoT, and distributed systems as well as through analysis of network packet capture files. In this comprehensive survey of more than 100 high-quality research articles, we evaluate existing visualization-based approaches applied to malware detection and classification. As a first contribution, we propose a new all-encompassing framework to study the landscape of visualization-based malware detection techniques. Within this framework, we systematically analyze state-of-the-art approaches across the critical stages of the malware detection pipeline. By analyzing not only the single techniques but also how they are combined to produce the final solution, we shed light on the main challenges in visualization-based approaches and provide insights into the advancements and potential future directions in this critical field.

Paperid: 642, https://arxiv.org/pdf/2505.06860.pdf

Abstract:
In the field of digital security, Reversible Adversarial Examples (RAE) combine adversarial attacks with reversible data hiding techniques to effectively protect sensitive data and prevent unauthorized analysis by malicious Deep Neural Networks (DNNs). However, existing RAE techniques primarily focus on white-box attacks, lacking a comprehensive evaluation of their effectiveness in black-box scenarios. This limitation impedes their broader deployment in complex, dynamic environments. Further more, traditional black-box attacks are often characterized by poor transferability and high query costs, significantly limiting their practical applicability. To address these challenges, we propose the Dual-Phase Merging Transferable Reversible Attack method, which generates highly transferable initial adversarial perturbations in a white-box model and employs a memory augmented black-box strategy to effectively mislead target mod els. Experimental results demonstrate the superiority of our approach, achieving a 99.0% attack success rate and 100% recovery rate in black-box scenarios, highlighting its robustness in privacy protection. Moreover, we successfully implemented a black-box attack on a commercial model, further substantiating the potential of this approach for practical use.

Paperid: 643, https://arxiv.org/pdf/2505.03501.pdf

Abstract:
In this paper, we present a new form of backdoor attack against Large Language Models (LLMs): lingual-backdoor attacks. The key novelty of lingual-backdoor attacks is that the language itself serves as the trigger to hijack the infected LLMs to generate inflammatory speech. They enable the precise targeting of a specific language-speaking group, exacerbating racial discrimination by malicious entities. We first implement a baseline lingual-backdoor attack, which is carried out by poisoning a set of training data for specific downstream tasks through translation into the trigger language. However, this baseline attack suffers from poor task generalization and is impractical in real-world settings. To address this challenge, we design BadLingual, a novel task-agnostic lingual-backdoor, capable of triggering any downstream tasks within the chat LLMs, regardless of the specific questions of these tasks. We design a new approach using PPL-constrained Greedy Coordinate Gradient-based Search (PGCG) based adversarial training to expand the decision boundary of lingual-backdoor, thereby enhancing the generalization ability of lingual-backdoor across various tasks. We perform extensive experiments to validate the effectiveness of our proposed attacks. Specifically, the baseline attack achieves an ASR of over 90% on the specified tasks. However, its ASR reaches only 37.61% across six tasks in the task-agnostic scenario. In contrast, BadLingual brings up to 37.35% improvement over the baseline. Our study sheds light on a new perspective of vulnerabilities in LLMs with multilingual capabilities and is expected to promote future research on the potential defenses to enhance the LLMs' robustness

Paperid: 644, https://arxiv.org/pdf/2504.21668.pdf

Abstract:
Large language models (LLMs) integrated with retrieval-augmented generation (RAG) systems improve accuracy by leveraging external knowledge sources. However, recent research has revealed RAG's susceptibility to poisoning attacks, where the attacker injects poisoned texts into the knowledge database, leading to attacker-desired responses. Existing defenses, which predominantly focus on inference-time mitigation, have proven insufficient against sophisticated attacks. In this paper, we introduce RAGForensics, the first traceback system for RAG, designed to identify poisoned texts within the knowledge database that are responsible for the attacks. RAGForensics operates iteratively, first retrieving a subset of texts from the database and then utilizing a specially crafted prompt to guide an LLM in detecting potential poisoning texts. Empirical evaluations across multiple datasets demonstrate the effectiveness of RAGForensics against state-of-the-art poisoning attacks. This work pioneers the traceback of poisoned texts in RAG systems, providing a practical and promising defense mechanism to enhance their security.

Paperid: 645, https://arxiv.org/pdf/2504.03957.pdf

Abstract:
Large language models (LLMs) have demonstrated impressive natural language processing abilities but face challenges such as hallucination and outdated knowledge. Retrieval-Augmented Generation (RAG) has emerged as a state-of-the-art approach to mitigate these issues. While RAG enhances LLM outputs, it remains vulnerable to poisoning attacks. Recent studies show that injecting poisoned text into the knowledge database can compromise RAG systems, but most existing attacks assume that the attacker can insert a sufficient number of poisoned texts per query to outnumber correct-answer texts in retrieval, an assumption that is often unrealistic. To address this limitation, we propose CorruptRAG, a practical poisoning attack against RAG systems in which the attacker injects only a single poisoned text, enhancing both feasibility and stealth. Extensive experiments across multiple datasets demonstrate that CorruptRAG achieves higher attack success rates compared to existing baselines.

Paperid: 646, https://arxiv.org/pdf/2503.20257.pdf

Abstract:
As Machine Learning (ML) evolves, the complexity and sophistication of security threats against this paradigm continue to grow as well, threatening data privacy and model integrity. In response, Machine Unlearning (MU) is a recent technology that aims to remove the influence of specific data from a trained model, enabling compliance with privacy regulations and user requests. This can be done for privacy compliance (e.g., GDPR's right to be forgotten) or model refinement. However, the intersection between classical threats in ML and MU remains largely unexplored. In this Systematization of Knowledge (SoK), we provide a structured analysis of security threats in ML and their implications for MU. We analyze four major attack classes, namely, Backdoor Attacks, Membership Inference Attacks (MIA), Adversarial Attacks, and Inversion Attacks, we investigate their impact on MU and propose a novel classification based on how they are usually used in this context. Finally, we identify open challenges, including ethical considerations, and explore promising future research directions, paving the way for future research in secure and privacy-preserving Machine Unlearning.

Paperid: 647, https://arxiv.org/pdf/2503.15866.pdf

Abstract:
The widespread adoption of Android devices for sensitive operations like banking and communication has made them prime targets for cyber threats, particularly Advanced Persistent Threats (APT) and sophisticated malware attacks. Traditional malware detection methods rely on binary classification, failing to provide insights into adversarial Tactics, Techniques, and Procedures (TTPs). Understanding malware behavior is crucial for enhancing cybersecurity defenses. To address this gap, we introduce DroidTTP, a framework mapping Android malware behaviors to TTPs based on the MITRE ATT&CK framework. Our curated dataset explicitly links MITRE TTPs to Android applications. We developed an automated solution leveraging the Problem Transformation Approach (PTA) and Large Language Models (LLMs) to map applications to both Tactics and Techniques. Additionally, we employed Retrieval-Augmented Generation (RAG) with prompt engineering and LLM fine-tuning for TTP predictions. Our structured pipeline includes dataset creation, hyperparameter tuning, data augmentation, feature selection, model development, and SHAP-based model interpretability. Among LLMs, Llama achieved the highest performance in Tactic classification with a Jaccard Similarity of 0.9583 and Hamming Loss of 0.0182, and in Technique classification with a Jaccard Similarity of 0.9348 and Hamming Loss of 0.0127. However, the Label Powerset XGBoost model outperformed LLMs, achieving a Jaccard Similarity of 0.9893 for Tactic classification and 0.9753 for Technique classification, with a Hamming Loss of 0.0054 and 0.0050, respectively. While XGBoost showed superior performance, the narrow margin highlights the potential of LLM-based approaches in TTP classification.

Paperid: 648, https://arxiv.org/pdf/2502.20528.pdf

Abstract:
Package confusion attacks such as typosquatting threaten software supply chains. Attackers make packages with names that syntactically or semantically resemble legitimate ones, tricking engineers into installing malware. While prior work has developed defenses against package confusions in some software package registries, notably NPM, PyPI, and RubyGems, gaps remain: high false-positive rates, generalization to more software package ecosystems, and insights from real-world deployment. In this work, we introduce ConfuGuard, a state-of-art detector for package confusion threats. We begin by presenting the first empirical analysis of benign signals derived from prior package confusion data, uncovering their threat patterns, engineering practices, and measurable attributes. Advancing existing detectors, we leverage package metadata to distinguish benign packages, and extend support from three up to seven software package registries. Our approach significantly reduces false positive rates (from 80% to 28%), at the cost of an additional 14s average latency to filter out benign packages by analyzing the package metadata. ConfuGuard is used in production at our industry partner, whose analysts have already confirmed 630 real attacks detected by ConfuGuard.

Paperid: 649, https://arxiv.org/pdf/2502.14976.pdf

Abstract:
Vision-Language Models (VLMs) inherit adversarial vulnerabilities of Large Language Models (LLMs), which are further exacerbated by their multimodal nature. Existing defenses, including adversarial training, input transformations, and heuristic detection, are computationally expensive, architecture-dependent, and fragile against adaptive attacks. We introduce EigenShield, an inference-time defense leveraging Random Matrix Theory to quantify adversarial disruptions in high-dimensional VLM representations. Unlike prior methods that rely on empirical heuristics, EigenShield employs the spiked covariance model to detect structured spectral deviations. Using a Robustness-based Nonconformity Score (RbNS) and quantile-based thresholding, it separates causal eigenvectors, which encode semantic information, from correlational eigenvectors that are susceptible to adversarial artifacts. By projecting embeddings onto the causal subspace, EigenShield filters adversarial noise without modifying model parameters or requiring adversarial training. This architecture-independent, attack-agnostic approach significantly reduces the attack success rate, establishing spectral analysis as a principled alternative to conventional defenses. Our results demonstrate that EigenShield consistently outperforms all existing defenses, including adversarial training, UNIGUARD, and CIDER.

Paperid: 650, https://arxiv.org/pdf/2502.13728.pdf

Abstract:
Dataset Distillation (DD) is a powerful technique for reducing large datasets into compact, representative synthetic datasets, accelerating Machine Learning training. However, traditional DD methods operate in a centralized manner, which poses significant privacy threats and reduces its applicability. To mitigate these risks, we propose a Secure Federated Data Distillation (SFDD) framework to decentralize the distillation process while preserving privacy. Unlike existing Federated Distillation techniques that focus on training global models with distilled knowledge, our approach aims to produce a distilled dataset without exposing local contributions. We leverage the gradient-matching-based distillation method, adapting it for a distributed setting where clients contribute to the distillation process without sharing raw data. The central aggregator iteratively refines a synthetic dataset by integrating client-side updates while ensuring data confidentiality. To make our approach resilient to inference attacks perpetrated by the server that could exploit gradient updates to reconstruct private data, we create an optimized Local Differential Privacy approach, called LDPO-RLD. Furthermore, we assess the framework's resilience against malicious clients executing backdoor attacks (such as Doorping) and demonstrate robustness under the assumption of a sufficient number of participating clients. Our experimental results demonstrate the effectiveness of SFDD and that the proposed defense concretely mitigates the identified vulnerabilities, with minimal impact on the performance of the distilled dataset. By addressing the interplay between privacy and federation in dataset distillation, this work advances the field of privacy-preserving Machine Learning making our SFDD framework a viable solution for sensitive data-sharing applications.

Paperid: 651, https://arxiv.org/pdf/2502.00306.pdf

Abstract:
Retrieval-Augmented Generation (RAG) enables Large Language Models (LLMs) to generate grounded responses by leveraging external knowledge databases without altering model parameters. Although the absence of weight tuning prevents leakage via model parameters, it introduces the risk of inference adversaries exploiting retrieved documents in the model's context. Existing methods for membership inference and data extraction often rely on jailbreaking or carefully crafted unnatural queries, which can be easily detected or thwarted with query rewriting techniques common in RAG systems. In this work, we present Interrogation Attack (IA), a membership inference technique targeting documents in the RAG datastore. By crafting natural-text queries that are answerable only with the target document's presence, our approach demonstrates successful inference with just 30 queries while remaining stealthy; straightforward detectors identify adversarial prompts from existing methods up to ~76x more frequently than those generated by our attack. We observe a 2x improvement in TPR@1%FPR over prior inference attacks across diverse RAG configurations, all while costing less than $0.02 per document inference.

Paperid: 652, https://arxiv.org/pdf/2501.17396.pdf

Abstract:
Federated learning allows multiple clients to collaboratively train a global model with the assistance of a server. However, its distributed nature makes it susceptible to poisoning attacks, where malicious clients can compromise the global model by sending harmful local model updates to the server. To unlearn an accurate global model from a poisoned one after identifying malicious clients, federated unlearning has been introduced. Yet, current research on federated unlearning has primarily concentrated on its effectiveness and efficiency, overlooking the security challenges it presents. In this work, we bridge the gap via proposing BadUnlearn, the first poisoning attacks targeting federated unlearning. In BadUnlearn, malicious clients send specifically designed local model updates to the server during the unlearning process, aiming to ensure that the resulting unlearned model remains poisoned. To mitigate these threats, we propose UnlearnGuard, a robust federated unlearning framework that is provably robust against both existing poisoning attacks and our BadUnlearn. The core concept of UnlearnGuard is for the server to estimate the clients' local model updates during the unlearning process and employ a filtering strategy to verify the accuracy of these estimations. Theoretically, we prove that the model unlearned through UnlearnGuard closely resembles one obtained by train-from-scratch. Empirically, we show that BadUnlearn can effectively corrupt existing federated unlearning methods, while UnlearnGuard remains secure against poisoning attacks.

Paperid: 653, https://arxiv.org/pdf/2501.17392.pdf

Abstract:
Federated learning (FL) has gained attention as a distributed learning paradigm for its data privacy benefits and accelerated convergence through parallel computation. Traditional FL relies on a server-client (SC) architecture, where a central server coordinates multiple clients to train a global model, but this approach faces scalability challenges due to server communication bottlenecks. To overcome this, the ring-all-reduce (RAR) architecture has been introduced, eliminating the central server and achieving bandwidth optimality. However, the tightly coupled nature of RAR's ring topology exposes it to unique Byzantine attack risks not present in SC-based FL. Despite its potential, designing Byzantine-robust RAR-based FL algorithms remains an open problem. To address this gap, we propose BRACE (Byzantine-robust ring-all-reduce), the first RAR-based FL algorithm to achieve both Byzantine robustness and communication efficiency. We provide theoretical guarantees for the convergence of BRACE under Byzantine attacks, demonstrate its bandwidth efficiency, and validate its practical effectiveness through experiments. Our work offers a foundational understanding of Byzantine-robust RAR-based FL design.

Paperid: 654, https://arxiv.org/pdf/2501.17381.pdf

Abstract:
Federated learning (FL) allows multiple clients to collaboratively train a global machine learning model through a server, without exchanging their private training data. However, the decentralized aspect of FL makes it susceptible to poisoning attacks, where malicious clients can manipulate the global model by sending altered local model updates. To counter these attacks, a variety of aggregation rules designed to be resilient to Byzantine failures have been introduced. Nonetheless, these methods can still be vulnerable to sophisticated attacks or depend on unrealistic assumptions about the server. In this paper, we demonstrate that there is no need to design new Byzantine-robust aggregation rules; instead, FL can be secured by enhancing the robustness of well-established aggregation rules. To this end, we present FoundationFL, a novel defense mechanism against poisoning attacks. FoundationFL involves the server generating synthetic updates after receiving local model updates from clients. It then applies existing Byzantine-robust foundational aggregation rules, such as Trimmed-mean or Median, to combine clients' model updates with the synthetic ones. We theoretically establish the convergence performance of FoundationFL under Byzantine settings. Comprehensive experiments across several real-world datasets validate the efficiency of our FoundationFL method.

Paperid: 655, https://arxiv.org/pdf/2501.17292.pdf

Abstract:
Modern computing systems are limited in performance by the memory bandwidth available to processors, a problem known as the memory wall. Processing-in-Memory (PIM) promises to substantially improve this problem by moving processing closer to the data, improving effective data bandwidth, and leading to superior performance on memory-intensive workloads. However, integrating PIM modules within a secure computing system raises an interesting challenge: unencrypted data has to move off-chip to the PIM, exposing the data to attackers and breaking assumptions on Trusted Computing Bases (TCBs). To tackle this challenge, this paper leverages multi-party computation (MPC) techniques, specifically arithmetic secret sharing and Yao's garbled circuits, to outsource bandwidth-intensive computation securely to PIM. Additionally, we leverage precomputation optimization to prevent the CPU's portion of the MPC from becoming a bottleneck. We evaluate our approach using the UPMEM PIM system over various applications such as Deep Learning Recommendation Model inference and Logistic Regression. Our evaluations demonstrate up to a $14.66\times$ speedup compared to a secure CPU configuration while maintaining data confidentiality and integrity when outsourcing linear and/or nonlinear computation.

Paperid: 656, https://arxiv.org/pdf/2501.06533.pdf

Abstract:
The widespread adoption of facial recognition (FR) models raises serious concerns about their potential misuse, motivating the development of anti-facial recognition (AFR) to protect user facial privacy. In this paper, we argue that the static FR strategy, predominantly adopted in prior literature for evaluating AFR efficacy, cannot faithfully characterize the actual capabilities of determined trackers who aim to track a specific target identity. In particular, we introduce DynTracker, a dynamic FR strategy where the model's gallery database is iteratively updated with newly recognized target identity images. Surprisingly, such a simple approach renders all the existing AFR protections ineffective. To mitigate the privacy threats posed by DynTracker, we advocate for explicitly promoting diversity in the AFR-protected images. We hypothesize that the lack of diversity is the primary cause of the failure of existing AFR methods. Specifically, we develop DivTrackee, a novel method for crafting diverse AFR protections that builds upon a text-guided image generation framework and diversity-promoting adversarial losses. Through comprehensive experiments on various image benchmarks and feature extractors, we demonstrate DynTracker's strength in breaking existing AFR methods and the superiority of DivTrackee in preventing user facial images from being identified by dynamic FR strategies. We believe our work can act as an important initial step towards developing more effective AFR methods for protecting user facial privacy against determined trackers.

Paperid: 657, https://arxiv.org/pdf/2501.04963.pdf

Abstract:
Today's Android developers tend to include numerous features to accommodate diverse user requirements, which inevitably leads to bloated apps. Yet more often than not, only a fraction of these features are frequently utilized by users, thus a bloated app costs dearly in potential vulnerabilities, expanded attack surfaces, and additional resource consumption. Especially in the event of severe security incidents, users have the need to block vulnerable functionalities immediately. Existing works have proposed various code debloating approaches for identifying and removing features of executable components. However, they typically involve static modification of files (and, for Android apps, repackaging of APKs, too), which lacks user convenience let alone undermining the security model of Android due to the compromising of public key verification and code integrity checks. This paper introduces 3DNDroid, a Dynamic Debloating approach targeting both DEX and Native methods in AnDroid apps. Using an unprivileged management app in tandem with a customized Android OS, 3DNDroid dynamically reduces unnecessary code loading during app execution based on a pre-generated debloating schema from static or dynamic analyses. It intercepts invocations of debloated bytecode methods to prevent their interpretation, compilation, and execution, while zero-filling memory spaces of debloated native methods during code loading. Evaluation demonstrates 3DNDroid's ability to debloat 187 DEX methods and 30 native methods across 55 real-world apps, removing over 10K Return-Oriented Programming (ROP) gadgets. Case studies confirm its effectiveness in mitigating vulnerabilities, and performance assessments highlight its resource-saving advantages over non-debloated apps.

Paperid: 658, https://arxiv.org/pdf/2506.19836.pdf

Abstract:
Differential privacy (DP) has become the standard for private data analysis. Certain machine learning applications only require privacy protection for specific protected attributes. Using naive variants of differential privacy in such use cases can result in unnecessary degradation of utility. In this work, we refine the definition of DP to create a more general and flexible framework that we call feature differential privacy (FDP). Our definition is simulation-based and allows for both addition/removal and replacement variants of privacy, and can handle arbitrary and adaptive separation of protected and non-protected features. We prove the properties of FDP, such as adaptive composition, and demonstrate its implications for limiting attribute inference attacks. We also propose a modification of the standard DP-SGD algorithm that satisfies FDP while leveraging desirable properties such as amplification via sub-sampling. We apply our framework to various machine learning tasks and show that it can significantly improve the utility of DP-trained models when public features are available. For example, we train diffusion models on the AFHQ dataset of animal faces and observe a drastic improvement in FID compared to DP, from 286.7 to 101.9 at $Îµ=8$, assuming that the blurred version of a training image is available as a public feature. Overall, our work provides a new approach to private data analysis that can help reduce the utility cost of DP while still providing strong privacy guarantees.

Paperid: 659, https://arxiv.org/pdf/2504.21044.pdf

Abstract:
Recent advancement in large-scale Artificial Intelligence (AI) models offering multimodal services have become foundational in AI systems, making them prime targets for model theft. Existing methods select Out-of-Distribution (OoD) data as backdoor watermarks and retrain the original model for copyright protection. However, existing methods are susceptible to malicious detection and forgery by adversaries, resulting in watermark evasion. In this work, we propose Model-\underline{ag}nostic Black-box Backdoor W\underline{ate}rmarking Framework (AGATE) to address stealthiness and robustness challenges in multimodal model copyright protection. Specifically, we propose an adversarial trigger generation method to generate stealthy adversarial triggers from ordinary dataset, providing visual fidelity while inducing semantic shifts. To alleviate the issue of anomaly detection among model outputs, we propose a post-transform module to correct the model output by narrowing the distance between adversarial trigger image embedding and text embedding. Subsequently, a two-phase watermark verification is proposed to judge whether the current model infringes by comparing the two results with and without the transform module. Consequently, we consistently outperform state-of-the-art methods across five datasets in the downstream tasks of multimodal image-text retrieval and image classification. Additionally, we validated the robustness of AGATE under two adversarial attack scenarios.

Paperid: 660, https://arxiv.org/pdf/2504.13301.pdf

Abstract:
The rapid proliferation of the Internet of Things (IoT) has introduced substantial security vulnerabilities, highlighting the need for robust Intrusion Detection Systems (IDS). Machine learning-based intrusion detection systems (ML-IDS) have significantly improved threat detection capabilities; however, they remain highly susceptible to adversarial attacks. While numerous defense mechanisms have been proposed to enhance ML-IDS resilience, a systematic approach for selecting the most effective defense against a specific adversarial attack remains absent. To address this challenge, we propose Dynamite, a dynamic defense selection framework that enhances ML-IDS by intelligently identifying and deploying the most suitable defense using a machine learning-driven selection mechanism. Our results demonstrate that Dynamite achieves a 96.2% reduction in computational time compared to the Oracle, significantly decreasing computational overhead while preserving strong prediction performance. Dynamite also demonstrates an average F1-score improvement of 76.7% over random defense and 65.8% over the best static state-of-the-art defense.

Paperid: 661, https://arxiv.org/pdf/2504.11774.pdf

Abstract:
With the growing demand for protecting the intellectual property (IP) of text-to-image diffusion models, we propose PCDiff -- a proactive access control framework that redefines model authorization by regulating generation quality. At its core, PCDIFF integrates a trainable fuser module and hierarchical authentication layers into the decoder architecture, ensuring that only users with valid encrypted credentials can generate high-fidelity images. In the absence of valid keys, the system deliberately degrades output quality, effectively preventing unauthorized exploitation.Importantly, while the primary mechanism enforces active access control through architectural intervention, its decoupled design retains compatibility with existing watermarking techniques. This satisfies the need of model owners to actively control model ownership while preserving the traceability capabilities provided by traditional watermarking approaches.Extensive experimental evaluations confirm a strong dependency between credential verification and image quality across various attack scenarios. Moreover, when combined with typical post-processing operations, PCDIFF demonstrates powerful performance alongside conventional watermarking methods. This work shifts the paradigm from passive detection to proactive enforcement of authorization, laying the groundwork for IP management of diffusion models.

Paperid: 662, https://arxiv.org/pdf/2503.15552.pdf

Abstract:
The rapid advancement of conversational agents, particularly chatbots powered by Large Language Models (LLMs), poses a significant risk of social engineering (SE) attacks on social media platforms. SE detection in multi-turn, chat-based interactions is considerably more complex than single-instance detection due to the dynamic nature of these conversations. A critical factor in mitigating this threat is understanding the SE attack mechanisms through which SE attacks operate, specifically how attackers exploit vulnerabilities and how victims' personality traits contribute to their susceptibility. In this work, we propose an LLM-agentic framework, SE-VSim, to simulate SE attack mechanisms by generating multi-turn conversations. We model victim agents with varying personality traits to assess how psychological profiles influence susceptibility to manipulation. Using a dataset of over 1000 simulated conversations, we examine attack scenarios in which adversaries, posing as recruiters, funding agencies, and journalists, attempt to extract sensitive information. Based on this analysis, we present a proof of concept, SE-OmniGuard, to offer personalized protection to users by leveraging prior knowledge of the victims personality, evaluating attack strategies, and monitoring information exchanges in conversations to identify potential SE attempts.

Paperid: 663, https://arxiv.org/pdf/2503.07882.pdf

Abstract:
Minimizing computational overhead in time-series classification, particularly in deep learning models, presents a significant challenge. This challenge is further compounded by adversarial attacks, emphasizing the need for resilient methods that ensure robust performance and efficient model selection. We introduce ReLATE, a framework that identifies robust learners based on dataset similarity, reduces computational overhead, and enhances resilience. ReLATE maintains multiple deep learning models in well-known adversarial attack scenarios, capturing model performance. ReLATE identifies the most analogous dataset to a given target using a similarity metric, then applies the optimal model from the most similar dataset. ReLATE reduces computational overhead by an average of 81.2%, enhancing adversarial resilience and streamlining robust model selection, all without sacrificing performance, within 4.2% of Oracle.

Paperid: 664, https://arxiv.org/pdf/2503.00871.pdf

Abstract:
Cybersecurity systems are continuously producing a huge number of time-stamped events in the form of high-order tensors, such as {count; time, port, flow duration, packet size, . . . }, and so how can we detect anomalies/intrusions in real time? How can we identify multiple types of intrusions and capture their characteristic behaviors? The tensor data consists of categorical and continuous attributes and the data distributions of continuous attributes typically exhibit skew. These data properties require handling skewed infinite and finite dimensional spaces simultaneously. In this paper, we propose a novel streaming method, namely CyberCScope. The method effectively decomposes incoming tensors into major trends while explicitly distinguishing between categorical and skewed continuous attributes. To our knowledge, it is the first to compute hybrid skewed infinite and finite dimensional decomposition. Based on this decomposition, it streamingly finds distinct time-evolving patterns, enabling the detection of multiple types of anomalies. Extensive experiments on large-scale real datasets demonstrate that CyberCScope detects various intrusions with higher accuracy than state-of-the-art baselines while providing meaningful summaries for the intrusions that occur in practice.

Paperid: 665, https://arxiv.org/pdf/2502.18515.pdf

Abstract:
The rapid growth of the blockchain ecosystem and the increasing value locked in smart contracts necessitate robust security measures. While languages like Solidity and Move aim to improve smart contract security, vulnerabilities persist. This paper presents Smartify, a novel multi-agent framework leveraging Large Language Models (LLMs) to automatically detect and repair vulnerabilities in Solidity and Move smart contracts. Unlike traditional methods that rely solely on vast pre-training datasets, Smartify employs a team of specialized agents working on different specially fine-tuned LLMs to analyze code based on underlying programming concepts and language-specific security principles. We evaluated Smartify on a dataset for Solidity and a curated dataset for Move, demonstrating its effectiveness in fixing a wide range of vulnerabilities. Our results show that Smartify (Gemma2+codegemma) achieves state-of-the-art performance, surpassing existing LLMs and enhancing general-purpose models' capabilities, such as Llama 3.1. Notably, Smartify can incorporate language-specific knowledge, such as the nuances of Move, without requiring massive language-specific pre-training datasets. This work offers a detailed analysis of various LLMs' performance on smart contract repair, highlighting the strengths of our multi-agent approach and providing a blueprint for developing more secure and reliable decentralized applications in the growing blockchain landscape. We also provide a detailed recipe for extending this to other similar use cases.

Paperid: 666, https://arxiv.org/pdf/2502.14094.pdf

Abstract:
Intrusion detection systems (IDS) play a crucial role in IoT and network security by monitoring system data and alerting to suspicious activities. Machine learning (ML) has emerged as a promising solution for IDS, offering highly accurate intrusion detection. However, ML-IDS solutions often overlook two critical aspects needed to build reliable systems: continually changing data streams and a lack of attack labels. Streaming network traffic and associated cyber attacks are continually changing, which can degrade the performance of deployed ML models. Labeling attack data, such as zero-day attacks, in real-world intrusion scenarios may not be feasible, making the use of ML solutions that do not rely on attack labels necessary. To address both these challenges, we propose CND-IDS, a continual novelty detection IDS framework which consists of (i) a learning-based feature extractor that continuously updates new feature representations of the system data, and (ii) a novelty detector that identifies new cyber attacks by leveraging principal component analysis (PCA) reconstruction. Our results on realistic intrusion datasets show that CND-IDS achieves up to 6.1x F-score improvement, and up to 6.5x improved forward transfer over the SOTA unsupervised continual learning algorithm. Our code will be released upon acceptance.

Paperid: 667, https://arxiv.org/pdf/2502.13345.pdf

Abstract:
Latent diffusion models have exhibited considerable potential in generative tasks. Watermarking is considered to be an alternative to safeguard the copyright of generative models and prevent their misuse. However, in the context of model distribution scenarios, the accessibility of models to large scale of model users brings new challenges to the security, efficiency and robustness of existing watermark solutions. To address these issues, we propose a secure and efficient watermarking solution. A new security mechanism is designed to prevent watermark leakage and watermark escape, which considers watermark randomness and watermark-model association as two constraints for mandatory watermark injection. To reduce the time cost of training the security module, watermark injection and the security mechanism are decoupled, ensuring that fine-tuning VAE only accomplishes the security mechanism without the burden of learning watermark patterns. A watermark distribution-based verification strategy is proposed to enhance the robustness against diverse attacks in the model distribution scenarios. Experimental results prove that our watermarking consistently outperforms existing six baselines on effectiveness and robustness against ten image processing attacks and adversarial attacks, while enhancing security in the distribution scenarios.

Paperid: 668, https://arxiv.org/pdf/2502.07119.pdf

Abstract:
The proliferation of IoT devices has significantly increased network vulnerabilities, creating an urgent need for effective Intrusion Detection Systems (IDS). Machine Learning-based IDS (ML-IDS) offer advanced detection capabilities but rely on labeled attack data, which limits their ability to identify unknown threats. Self-Supervised Learning (SSL) presents a promising solution by using only normal data to detect patterns and anomalies. This paper introduces SAFE, a novel framework that transforms tabular network intrusion data into an image-like format, enabling Masked Autoencoders (MAEs) to learn robust representations of network behavior. The features extracted by the MAEs are then incorporated into a lightweight novelty detector, enhancing the effectiveness of anomaly detection. Experimental results demonstrate that SAFE outperforms the state-of-the-art anomaly detection method, Scale Learning-based Deep Anomaly Detection method (SLAD), by up to 26.2% and surpasses the state-of-the-art SSL-based network intrusion detection approach, Anomal-E, by up to 23.5% in F1-score.

Paperid: 669, https://arxiv.org/pdf/2502.04400.pdf

Abstract:
Multimodal Federated Learning (MFL) with mixed modalities enables unimodal and multimodal clients to collaboratively train models while ensuring clients' privacy. As a representative sample of local data, prototypes offer an approach with low resource consumption and no reliance on prior knowledge for MFL with mixed modalities. However, existing prototype-based MFL methods assume unified labels across clients and identical tasks per client, which is impractical in MFL with mixed modalities. In this work, we propose an Adaptive prototype-based Multimodal Federated Learning (AproMFL) framework for mixed modalities to address the aforementioned issues. Our AproMFL transfers knowledge through adaptively-constructed prototypes without unified labels. Clients adaptively select prototype construction methods in line with labels; server converts client prototypes into unified multimodal prototypes and cluster them to form global prototypes. To address model aggregation issues in task heterogeneity, we develop a client relationship graph-based scheme to dynamically adjust aggregation weights. Furthermore, we propose a global prototype knowledge transfer loss and a global model knowledge transfer loss to enable the transfer of global knowledge to local knowledge. Experimental results show that AproMFL outperforms four baselines on three highly heterogeneous datasets ($Î±=0.1$) and two heterogeneous tasks, with the optimal results in accuracy and recall being 0.42%~6.09% and 1.6%~3.89% higher than those of FedIoT (FedAvg-based MFL), respectively.

Paperid: 670, https://arxiv.org/pdf/2502.01289.pdf

Abstract:
The availability of foundational models (FMs) pre-trained on large-scale data has advanced the state-of-the-art in many computer vision tasks. While FMs have demonstrated good zero-shot performance on many image classification tasks, there is often scope for performance improvement by adapting the FM to the downstream task. However, the data that is required for this adaptation typically exists in silos across multiple entities (data owners) and cannot be collated at a central location due to regulations and privacy concerns. At the same time, a learning service provider (LSP) who owns the FM cannot share the model with the data owners due to proprietary reasons. In some cases, the data owners may not even have the resources to store such large FMs. Hence, there is a need for algorithms to adapt the FM in a double-blind federated manner, i.e., the data owners do not know the FM or each other's data, and the LSP does not see the data for the downstream tasks. In this work, we propose a framework for double-blind federated adaptation of FMs using fully homomorphic encryption (FHE). The proposed framework first decomposes the FM into a sequence of FHE-friendly blocks through knowledge distillation. The resulting FHE-friendly model is adapted for the downstream task via low-rank parallel adapters that can be learned without backpropagation through the FM. Since the proposed framework requires the LSP to share intermediate representations with the data owners, we design a privacy-preserving permutation scheme to prevent the data owners from learning the FM through model extraction attacks. Finally, a secure aggregation protocol is employed for federated learning of the low-rank parallel adapters. Empirical results on four datasets demonstrate the practical feasibility of the proposed framework.

Paperid: 671, https://arxiv.org/pdf/2501.11052.pdf

Abstract:
As an emerging paradigm in digital identity, Decentralized Identity (DID) appears advantages over traditional identity management methods in a variety of aspects, e.g., enhancing user-centric online services and ensuring complete user autonomy and control. Verifiable Credential (VC) techniques are used to facilitate decentralized DID-based access control across multiple entities. However, existing DID schemes generally rely on a distributed public key infrastructure that also causes challenges, such as context information deduction, key exposure, and issuer data leakage. To address the issues above, this paper proposes a Permanent Issuer-Hiding (PIH)-based DID multi-party authentication framework with a signature-less VC model, named SLVC-DIDA, for the first time. Our proposed scheme avoids the dependence on signing keys by employing hashing and issuer membership proofs, which supports universal zero-knowledge multi-party DID authentications, eliminating additional technical integrations. We adopt a zero-knowledge RSA accumulator to maintain the anonymity of the issuer set, thereby enabling public verification while safeguarding the privacy of identity attributes via a Merkle tree-based VC list. By eliminating reliance on a Public Key Infrastructure (PKI), SLVC-DIDA enables fully decentralized issuance and verification of DIDs. Furthermore, our scheme ensures PIH through the implementation of the zero-knowledge Issuer set and VC list, so that the risks of key leakage and contextual inference attacks are effectively mitigated. Our experiments further evaluate the effectiveness and practicality of SLVC-DIDA.

Paperid: 672, https://arxiv.org/pdf/2506.18516.pdf

Abstract:
Adversarial examples are small and often imperceptible perturbations crafted to fool machine learning models. These attacks seriously threaten the reliability of deep neural networks, especially in security-sensitive domains. Evasion attacks, a form of adversarial attack where input is modified at test time to cause misclassification, are particularly insidious due to their transferability: adversarial examples crafted against one model often fool other models as well. This property, known as adversarial transferability, complicates defense strategies since it enables black-box attacks to succeed without direct access to the victim model. While adversarial training is one of the most widely adopted defense mechanisms, its effectiveness is typically evaluated on a narrow and homogeneous population of models. This limitation hinders the generalizability of empirical findings and restricts practical adoption. In this work, we introduce DUMBer, an attack framework built on the foundation of the DUMB (Dataset soUrces, Model architecture, and Balance) methodology, to systematically evaluate the resilience of adversarially trained models. Our testbed spans multiple adversarial training techniques evaluated across three diverse computer vision tasks, using a heterogeneous population of uniquely trained models to reflect real-world deployment variability. Our experimental pipeline comprises over 130k evaluations spanning 13 state-of-the-art attack algorithms, allowing us to capture nuanced behaviors of adversarial training under varying threat models and dataset conditions. Our findings offer practical, actionable insights for AI practitioners, identifying which defenses are most effective based on the model, dataset, and attacker setup.

Paperid: 673, https://arxiv.org/pdf/2506.15075.pdf

Abstract:
In the ever-expanding domain of 5G-NR wireless cellular networks, over-the-air jamming attacks are prevalent as security attacks, compromising the quality of the received signal. We simulate a jamming environment by incorporating additive white Gaussian noise (AWGN) into the real-world In-phase and Quadrature (I/Q) OFDM datasets. A Convolutional Autoencoder (CAE) is exploited to implement a jamming detection over various characteristics such as heterogenous I/Q datasets; extracting relevant information on Synchronization Signal Blocks (SSBs), and fewer SSB observations with notable class imbalance. Given the characteristics of datasets, balanced datasets are acquired by employing a Conv1D conditional Wasserstein Generative Adversarial Network-Gradient Penalty(CWGAN-GP) on both majority and minority SSB observations. Additionally, we compare the performance and detection ability of the proposed CAE model on augmented datasets with benchmark models: Convolutional Denoising Autoencoder (CDAE) and Convolutional Sparse Autoencoder (CSAE). Despite the complexity of data heterogeneity involved across all datasets, CAE depicts the robustness in detection performance of jammed signal by achieving average values of 97.33% precision, 91.33% recall, 94.08% F1-score, and 94.35% accuracy over CDAE and CSAE.

Paperid: 674, https://arxiv.org/pdf/2506.12522.pdf

Abstract:
Machine unlearning has emerged as a key component in ensuring ``Right to be Forgotten'', enabling the removal of specific data points from trained models. However, even when the unlearning is performed without poisoning the forget-set (clean unlearning), it can be exploited for stealthy attacks that existing defenses struggle to detect. In this paper, we propose a novel {\em clean} backdoor attack that exploits both the model learning phase and the subsequent unlearning requests. Unlike traditional backdoor methods, during the first phase, our approach injects a weak, distributed malicious signal across multiple classes. The real attack is then activated and amplified by selectively unlearning {\em non-poisoned} samples. This strategy results in a powerful and stealthy novel attack that is hard to detect or mitigate, highlighting critical vulnerabilities in current unlearning mechanisms and highlighting the need for more robust defenses.

Paperid: 675, https://arxiv.org/pdf/2506.12382.pdf

Abstract:
Ensuring the safety and alignment of Large Language Models is a significant challenge with their growing integration into critical applications and societal functions. While prior research has primarily focused on jailbreak attacks, less attention has been given to non-adversarial failures that subtly emerge during benign interactions. We introduce secondary risks a novel class of failure modes marked by harmful or misleading behaviors during benign prompts. Unlike adversarial attacks, these risks stem from imperfect generalization and often evade standard safety mechanisms. To enable systematic evaluation, we introduce two risk primitives verbose response and speculative advice that capture the core failure patterns. Building on these definitions, we propose SecLens, a black-box, multi-objective search framework that efficiently elicits secondary risk behaviors by optimizing task relevance, risk activation, and linguistic plausibility. To support reproducible evaluation, we release SecRiskBench, a benchmark dataset of 650 prompts covering eight diverse real-world risk categories. Experimental results from extensive evaluations on 16 popular models demonstrate that secondary risks are widespread, transferable across models, and modality independent, emphasizing the urgent need for enhanced safety mechanisms to address benign yet harmful LLM behaviors in real-world deployments.

Paperid: 676, https://arxiv.org/pdf/2505.19405.pdf

Abstract:
As large language models (LLMs) evolve into autonomous agents capable of collaborative reasoning and task execution, multi-agent LLM systems have emerged as a powerful paradigm for solving complex problems. However, these systems pose new challenges for copyright protection, particularly when sensitive or copyrighted content is inadvertently recalled through inter-agent communication and reasoning. Existing protection techniques primarily focus on detecting content in final outputs, overlooking the richer, more revealing reasoning processes within the agents themselves. In this paper, we introduce CoTGuard, a novel framework for copyright protection that leverages trigger-based detection within Chain-of-Thought (CoT) reasoning. Specifically, we can activate specific CoT segments and monitor intermediate reasoning steps for unauthorized content reproduction by embedding specific trigger queries into agent prompts. This approach enables fine-grained, interpretable detection of copyright violations in collaborative agent scenarios. We evaluate CoTGuard on various benchmarks in extensive experiments and show that it effectively uncovers content leakage with minimal interference to task performance. Our findings suggest that reasoning-level monitoring offers a promising direction for safeguarding intellectual property in LLM-based agent systems.

Paperid: 677, https://arxiv.org/pdf/2505.12655.pdf

Abstract:
The protection of cyber Intellectual Property (IP) such as web content is an increasingly critical concern. The rise of large language models (LLMs) with online retrieval capabilities enables convenient access to information but often undermines the rights of original content creators. As users increasingly rely on LLM-generated responses, they gradually diminish direct engagement with original information sources, which will significantly reduce the incentives for IP creators to contribute, and lead to a saturating cyberspace with more AI-generated content. In response, we propose a novel defense framework that empowers web content creators to safeguard their web-based IP from unauthorized LLM real-time extraction and redistribution by leveraging the semantic understanding capability of LLMs themselves. Our method follows principled motivations and effectively addresses an intractable black-box optimization problem. Real-world experiments demonstrated that our methods improve defense success rates from 2.5% to 88.6% on different LLMs, outperforming traditional defenses such as configuration-based restrictions.

Paperid: 678, https://arxiv.org/pdf/2504.21700.pdf

Abstract:
Large Language Models are fundamental actors in the modern IT landscape dominated by AI solutions. However, security threats associated with them might prevent their reliable adoption in critical application scenarios such as government organizations and medical institutions. For this reason, commercial LLMs typically undergo a sophisticated censoring mechanism to eliminate any harmful output they could possibly produce. In response to this, LLM Jailbreaking is a significant threat to such protections, and many previous approaches have already demonstrated its effectiveness across diverse domains. Existing jailbreak proposals mostly adopt a generate-and-test strategy to craft malicious input. To improve the comprehension of censoring mechanisms and design a targeted jailbreak attack, we propose an Explainable-AI solution that comparatively analyzes the behavior of censored and uncensored models to derive unique exploitable alignment patterns. Then, we propose XBreaking, a novel jailbreak attack that exploits these unique patterns to break the security constraints of LLMs by targeted noise injection. Our thorough experimental campaign returns important insights about the censoring mechanisms and demonstrates the effectiveness and performance of our attack.

Paperid: 679, https://arxiv.org/pdf/2504.18564.pdf

Abstract:
Recent research has focused on exploring the vulnerabilities of Large Language Models (LLMs), aiming to elicit harmful and/or sensitive content from LLMs. However, due to the insufficient research on dual-jailbreaking -- attacks targeting both LLMs and Guardrails, the effectiveness of existing attacks is limited when attempting to bypass safety-aligned LLMs shielded by guardrails. Therefore, in this paper, we propose DualBreach, a target-driven framework for dual-jailbreaking. DualBreach employs a Target-driven Initialization (TDI) strategy to dynamically construct initial prompts, combined with a Multi-Target Optimization (MTO) method that utilizes approximate gradients to jointly adapt the prompts across guardrails and LLMs, which can simultaneously save the number of queries and achieve a high dual-jailbreaking success rate. For black-box guardrails, DualBreach either employs a powerful open-sourced guardrail or imitates the target black-box guardrail by training a proxy model, to incorporate guardrails into the MTO process. We demonstrate the effectiveness of DualBreach in dual-jailbreaking scenarios through extensive evaluation on several widely-used datasets. Experimental results indicate that DualBreach outperforms state-of-the-art methods with fewer queries, achieving significantly higher success rates across all settings. More specifically, DualBreach achieves an average dual-jailbreaking success rate of 93.67% against GPT-4 with Llama-Guard-3 protection, whereas the best success rate achieved by other methods is 88.33%. Moreover, DualBreach only uses an average of 1.77 queries per successful dual-jailbreak, outperforming other state-of-the-art methods. For the purpose of defense, we propose an XGBoost-based ensemble defensive mechanism named EGuard, which integrates the strengths of multiple guardrails, demonstrating superior performance compared with Llama-Guard-3.

Paperid: 680, https://arxiv.org/pdf/2504.12806.pdf

Abstract:
The loss landscape of Variational Quantum Neural Networks (VQNNs) is characterized by local minima that grow exponentially with increasing qubits. Because of this, it is more challenging to recover information from model gradients during training compared to classical Neural Networks (NNs). In this paper we present a numerical scheme that successfully reconstructs input training, real-world, practical data from trainable VQNNs' gradients. Our scheme is based on gradient inversion that works by combining gradients estimation with the finite difference method and adaptive low-pass filtering. The scheme is further optimized with Kalman filter to obtain efficient convergence. Our experiments show that our algorithm can invert even batch-trained data, given the VQNN model is sufficiently over-parameterized.

Paperid: 681, https://arxiv.org/pdf/2504.10730.pdf

Abstract:
The rapid development of quantum computers threatens traditional cryptographic schemes, prompting the need for Post-Quantum Cryptography (PQC). Although the NIST standardization process has accelerated the development of such algorithms, their application in resource-constrained environments such as embedded systems remains a challenge. Automotive systems relying on the Controller Area Network (CAN) bus for communication are particularly vulnerable due to their limited computational capabilities, high traffic, and need for real-time response. These constraints raise concerns about the feasibility of implementing PQC in automotive environments, where legacy hardware and bit rate limitations must also be considered. In this paper, we introduce PQ-CAN, a modular framework for simulating the performance and overhead of PQC algorithms in embedded systems. We consider the automotive domain as our case study, testing a variety of PQC schemes under different scenarios. Our simulation enables the adjustment of embedded system computational capabilities and CAN bus bit rate constraints. We also provide insights into the trade-offs involved by analyzing each algorithm's security level and overhead for key encapsulation and digital signature. By evaluating the performance of these algorithms, we provide insights into their feasibility and identify the strengths and limitations of PQC in securing automotive communications in the post-quantum era.

Paperid: 682, https://arxiv.org/pdf/2504.10713.pdf

Abstract:
Common Vulnerability and Exposure (CVE) records are fundamental to cybersecurity, offering unique identifiers for publicly known software and system vulnerabilities. Each CVE is typically assigned a Common Vulnerability Scoring System (CVSS) score to support risk prioritization and remediation. However, score inconsistencies often arise due to subjective interpretations of certain metrics. As the number of new CVEs continues to grow rapidly, automation is increasingly necessary to ensure timely and consistent scoring. While prior studies have explored automated methods, the application of Large Language Models (LLMs), despite their recent popularity, remains relatively underexplored. In this work, we evaluate the effectiveness of LLMs in generating CVSS scores for newly reported vulnerabilities. We investigate various prompt engineering strategies to enhance their accuracy and compare LLM-generated scores against those from embedding-based models, which use vector representations classified via supervised learning. Our results show that while LLMs demonstrate potential in automating CVSS evaluation, embedding-based methods outperform them in scoring more subjective components, particularly confidentiality, integrity, and availability impacts. These findings underscore the complexity of CVSS scoring and suggest that combining LLMs with embedding-based methods could yield more reliable results across all scoring components.

Paperid: 683, https://arxiv.org/pdf/2504.09712.pdf

Abstract:
LLM jailbreaks are a widespread safety challenge. Given this problem has not yet been tractable, we suggest targeting a key failure mechanism: the failure of safety to generalize across semantically equivalent inputs. We further focus the target by requiring desirable tractability properties of attacks to study: explainability, transferability between models, and transferability between goals. We perform red-teaming within this framework by uncovering new vulnerabilities to multi-turn, multi-image, and translation-based attacks. These attacks are semantically equivalent by our design to their single-turn, single-image, or untranslated counterparts, enabling systematic comparisons; we show that the different structures yield different safety outcomes. We then demonstrate the potential for this framework to enable new defenses by proposing a Structure Rewriting Guardrail, which converts an input to a structure more conducive to safety assessment. This guardrail significantly improves refusal of harmful inputs, without over-refusing benign ones. Thus, by framing this intermediate challenge - more tractable than universal defenses but essential for long-term safety - we highlight a critical milestone for AI safety research.

Paperid: 684, https://arxiv.org/pdf/2503.22998.pdf

Abstract:
Despite advancements in Graph Neural Networks (GNNs), adaptive attacks continue to challenge their robustness. Certified robustness based on randomized smoothing has emerged as a promising solution, offering provable guarantees that a model's predictions remain stable under adversarial perturbations within a specified range. However, existing methods face a critical trade-off between accuracy and robustness, as achieving stronger robustness requires introducing greater noise into the input graph. This excessive randomization degrades data quality and disrupts prediction consistency, limiting the practical deployment of certifiably robust GNNs in real-world scenarios where both accuracy and robustness are essential. To address this challenge, we propose \textbf{AuditVotes}, the first framework to achieve both high clean accuracy and certifiably robust accuracy for GNNs. It integrates randomized smoothing with two key components, \underline{au}gmentation and con\underline{dit}ional smoothing, aiming to improve data quality and prediction consistency. The augmentation, acting as a pre-processing step, de-noises the randomized graph, significantly improving data quality and clean accuracy. The conditional smoothing, serving as a post-processing step, employs a filtering function to selectively count votes, thereby filtering low-quality predictions and improving voting consistency. Extensive experimental results demonstrate that AuditVotes significantly enhances clean accuracy, certified robustness, and empirical robustness while maintaining high computational efficiency. Notably, compared to baseline randomized smoothing, AuditVotes improves clean accuracy by $437.1\%$ and certified accuracy by $409.3\%$ when the attacker can arbitrarily insert $20$ edges on the Cora-ML datasets, representing a substantial step toward deploying certifiably robust GNNs in real-world applications.

Paperid: 685, https://arxiv.org/pdf/2503.20498.pdf

Authors:Minzhao Liu, Ruslan Shaydulin, Pradeep Niroula, Matthew DeCross, Shih-Han Hung, Wen Yu Kon, Enrique Cervero-MartÃn, Kaushik Chakraborty, Omar Amer, Scott Aaronson, Atithi Acharya, Yuri Alexeev, K. Jordan Berg, Shouvanik Chakrabarti, Florian J. Curchod, Joan M. Dreiling, Neal Erickson, Cameron Foltz, Michael Foss-Feig, David Hayes, Travis S. Humble, Niraj Kumar, Jeffrey Larson, Danylo Lykov, Michael Mills, Steven A. Moses, Brian Neyenhuis, Shaltiel Eloul, Peter Siegfried, James Walker, Charles Lim, Marco Pistoia

Abstract:
While quantum computers have the potential to perform a wide range of practically important tasks beyond the capabilities of classical computers, realizing this potential remains a challenge. One such task is to use an untrusted remote device to generate random bits that can be certified to contain a certain amount of entropy. Certified randomness has many applications but is fundamentally impossible to achieve solely by classical computation. In this work, we demonstrate the generation of certifiably random bits using the 56-qubit Quantinuum H2-1 trapped-ion quantum computer accessed over the internet. Our protocol leverages the classical hardness of recent random circuit sampling demonstrations: a client generates quantum "challenge" circuits using a small randomness seed, sends them to an untrusted quantum server to execute, and verifies the server's results. We analyze the security of our protocol against a restricted class of realistic near-term adversaries. Using classical verification with measured combined sustained performance of $1.1\times10^{18}$ floating-point operations per second across multiple supercomputers, we certify $71,313$ bits of entropy under this restricted adversary and additional assumptions. Our results demonstrate a step towards the practical applicability of today's quantum computers.

Paperid: 686, https://arxiv.org/pdf/2503.08956.pdf

Abstract:
Advancements in battery technology have accelerated the adoption of Electric Vehicles (EVs) due to their environmental benefits. However, their growing sophistication introduces security and privacy challenges. Often seen as mere operational data, battery consumption patterns can unintentionally reveal critical information exploitable for malicious purposes. These risks go beyond privacy, impacting vehicle security and regulatory compliance. Despite these concerns, current research has largely overlooked the broader implications of battery consumption data exposure. As EVs integrate further into smart transportation networks, addressing these gaps is crucial to ensure their safety, reliability, and resilience. In this work, we introduce a novel class of side-channel attacks that exploit EV battery data to extract sensitive user information. Leveraging only battery consumption patterns, we demonstrate a methodology to accurately identify the EV driver and their driving style, determine the number of occupants, and infer the vehicle's start and end locations when user habits are known. We utilize several machine learning models and feature extraction techniques to analyze EV power consumption patterns, validating our approach on simulated and real-world datasets collected from actual drivers. Our attacks achieve an average success rate of 95.4% across all attack objectives. Our findings highlight the privacy risks associated with EV battery data, emphasizing the need for stronger protections to safeguard user privacy and vehicle security.

Paperid: 687, https://arxiv.org/pdf/2503.06808.pdf

Abstract:
Current techniques for privacy auditing of large language models (LLMs) have limited efficacy -- they rely on basic approaches to generate canaries which leads to weak membership inference attacks that in turn give loose lower bounds on the empirical privacy leakage. We develop canaries that are far more effective than those used in prior work under threat models that cover a range of realistic settings. We demonstrate through extensive experiments on multiple families of fine-tuned LLMs that our approach sets a new standard for detection of privacy leakage. For measuring the memorization rate of non-privately trained LLMs, our designed canaries surpass prior approaches. For example, on the Qwen2.5-0.5B model, our designed canaries achieve $49.6\%$ TPR at $1\%$ FPR, vastly surpassing the prior approach's $4.2\%$ TPR at $1\%$ FPR. Our method can be used to provide a privacy audit of $\varepsilon \approx 1$ for a model trained with theoretical $\varepsilon$ of 4. To the best of our knowledge, this is the first time that a privacy audit of LLM training has achieved nontrivial auditing success in the setting where the attacker cannot train shadow models, insert gradient canaries, or access the model at every iteration.

Paperid: 688, https://arxiv.org/pdf/2503.04451.pdf

Abstract:
Federated Averaging remains the most widely used aggregation strategy in federated learning due to its simplicity and scalability. However, its performance degrades significantly in non-IID data settings, where client distributions are highly imbalanced or skewed. Additionally, it relies on clients transmitting metadata, specifically the number of training samples, which introduces privacy risks and may conflict with regulatory frameworks like the European GDPR. In this paper, we propose a novel aggregation strategy that addresses these challenges by introducing class-aware gradient masking. Unlike traditional approaches, our method relies solely on gradient updates, eliminating the need for any additional client metadata, thereby enhancing privacy protection. Furthermore, our approach validates and dynamically weights client contributions based on class-specific importance, ensuring robustness against non-IID distributions, convergence prevention, and backdoor attacks. Extensive experiments on benchmark datasets demonstrate that our method not only outperforms FedAvg and other widely accepted aggregation strategies in non-IID settings but also preserves model integrity in adversarial scenarios. Our results establish the effectiveness of gradient masking as a practical and secure solution for federated learning.

Paperid: 689, https://arxiv.org/pdf/2503.03842.pdf

Abstract:
The study of security in machine learning mainly focuses on downstream task-specific attacks, where the adversarial example is obtained by optimizing a loss function specific to the downstream task. At the same time, it has become standard practice for machine learning practitioners to adopt publicly available pre-trained vision foundation models, effectively sharing a common backbone architecture across a multitude of applications such as classification, segmentation, depth estimation, retrieval, question-answering and more. The study of attacks on such foundation models and their impact to multiple downstream tasks remains vastly unexplored. This work proposes a general framework that forges task-agnostic adversarial examples by maximally disrupting the feature representation obtained with foundation models. We extensively evaluate the security of the feature representations obtained by popular vision foundation models by measuring the impact of this attack on multiple downstream tasks and its transferability between models.

Paperid: 690, https://arxiv.org/pdf/2502.13458.pdf

Abstract:
Ensuring the safety of large language models (LLMs) is critical as they are deployed in real-world applications. Existing guardrails rely on rule-based filtering or single-pass classification, limiting their ability to handle nuanced safety violations. To address this, we propose ThinkGuard, a critique-augmented guardrail model that distills knowledge from high-capacity LLMs by generating structured critiques alongside safety labels. Fine-tuned on critique-augmented data, the captured deliberative thinking ability drastically enhances the guardrail's cautiousness and interpretability. Evaluated on multiple safety benchmarks, ThinkGuard achieves the highest average F1 and AUPRC, outperforming all baselines. Compared to LLaMA Guard 3, ThinkGuard improves accuracy by 16.1% and macro F1 by 27.0%. Moreover, it surpasses label-only fine-tuned models, confirming that structured critiques enhance both classification precision and nuanced safety reasoning while maintaining computational efficiency.

Paperid: 691, https://arxiv.org/pdf/2501.15288.pdf

Abstract:
Cyber-security for 5G networks is drawing notable attention due to an increase in complex jamming attacks that could target the critical 5G Radio Frequency (RF) domain. These attacks pose a significant risk to heterogeneous network (HetNet) architectures, leading to degradation in network performance. Conventional machine-learning techniques for jamming detection rely on centralized training while increasing the odds of data privacy. To address these challenges, this paper proposes a decentralized two-stage federated learning (FL) framework for jamming detection in 5G femtocells. Our proposed distributed framework encompasses using the Federated Averaging (FedAVG) algorithm to train a Convolutional Autoencoder (CAE) for unsupervised learning. In the second stage, we use a fully connected network (FCN) built on the pre-trained CAE encoder that is trained using Federated Proximal (FedProx) algorithm to perform supervised classification. Our experimental results depict that our proposed framework (FedAVG and FedProx) accomplishes efficient training and prediction across non-IID client datasets without compromising data privacy. Specifically, our framework achieves a precision of 0.94, recall of 0.90, F1-score of 0.92, and an accuracy of 0.92, while minimizing communication rounds to 30 and achieving robust convergence in detecting jammed signals with an optimal client count of 6.

Paperid: 692, https://arxiv.org/pdf/2501.13231.pdf

Abstract:
In this paper, we tackle the challenge of jamming attacks in Ultra-Reliable Low Latency Communication (URLLC) within Non-Orthogonal Multiple Access (NOMA)-based 5G networks under Finite Blocklength (FBL) conditions. We introduce an innovative approach that employs Reconfigurable Intelligent Surfaces (RIS) with active elements to enhance energy efficiency while ensuring reliability and meeting latency requirements. Our approach incorporates the traffic model, making it practical for real-world scenarios with dynamic traffic loads. We thoroughly analyze the impact of blocklength and packet arrival rate on network performance metrics and investigate the optimal amplitude value and number of RIS elements. Our results indicate that increasing the number of RIS elements from 4 to 400 can improve signal-to-jamming-plus-noise ratio (SJNR) by 13.64\%. Additionally, optimizing blocklength and packet arrival rate can achieve a 31.68% improvement in energy efficiency and reduced latency. These findings underscore the importance of optimized settings for effective jamming mitigation.

Paperid: 693, https://arxiv.org/pdf/2501.13227.pdf

Abstract:
In this paper, we propose a novel joint task offloading and user scheduling (JTO-US) framework for 5G mobile edge computing (MEC) systems under security threats from jamming attacks. The goal is to minimize the delay and the ratio of dropped tasks, taking into account both communication and computation delays. The system model includes a 5G network equipped with MEC servers and an adversarial on-off jammer that disrupts communication. The proposed framework optimally schedules tasks and users to minimize the impact of jamming while ensuring that high-priority tasks are processed efficiently. Genetic algorithm (GA) is used to solve the optimization problem, and the results are compared with benchmark methods such as GA without considering jamming effect, Shortest Job First (SJF), and Shortest Deadline First (SDF). The simulation results demonstrate that the proposed JTO-US framework achieves the lowest drop ratio and effectively manages priority tasks, outperforming existing methods. Particularly, when the jamming probability is 0.8, the proposed framework mitigates the jammer's impact by reducing the drop ratio to 63%, compared to 89% achieved by the next best method.

Paperid: 694, https://arxiv.org/pdf/2506.17336.pdf

Abstract:
Large language models (LLMs) are increasingly used as personal agents, accessing sensitive user data such as calendars, emails, and medical records. Users currently face a trade-off: They can send private records, many of which are stored in remote databases, to powerful but untrusted LLM providers, increasing their exposure risk. Alternatively, they can run less powerful models locally on trusted devices. We bridge this gap. Our Socratic Chain-of-Thought Reasoning first sends a generic, non-private user query to a powerful, untrusted LLM, which generates a Chain-of-Thought (CoT) prompt and detailed sub-queries without accessing user data. Next, we embed these sub-queries and perform encrypted sub-second semantic search using our Homomorphically Encrypted Vector Database across one million entries of a single user's private data. This represents a realistic scale of personal documents, emails, and records accumulated over years of digital activity. Finally, we feed the CoT prompt and the decrypted records to a local language model and generate the final response. On the LoCoMo long-context QA benchmark, our hybrid framework, combining GPT-4o with a local Llama-3.2-1B model, outperforms using GPT-4o alone by up to 7.1 percentage points. This demonstrates a first step toward systems where tasks are decomposed and split between untrusted strong LLMs and weak local ones, preserving user privacy.

Paperid: 695, https://arxiv.org/pdf/2506.12340.pdf

Abstract:
Large vision-language models (LVLMs) have demonstrated outstanding performance in many downstream tasks. However, LVLMs are trained on large-scale datasets, which can pose privacy risks if training images contain sensitive information. Therefore, it is important to detect whether an image is used to train the LVLM. Recent studies have investigated membership inference attacks (MIAs) against LVLMs, including detecting image-text pairs and single-modality content. In this work, we focus on detecting whether a target image is used to train the target LVLM. We design simple yet effective Image Corruption-Inspired Membership Inference Attacks (ICIMIA) against LLVLMs, which are inspired by LVLM's different sensitivity to image corruption for member and non-member images. We first perform an MIA method under the white-box setting, where we can obtain the embeddings of the image through the vision part of the target LVLM. The attacks are based on the embedding similarity between the image and its corrupted version. We further explore a more practical scenario where we have no knowledge about target LVLMs and we can only query the target LVLMs with an image and a question. We then conduct the attack by utilizing the output text embeddings' similarity. Experiments on existing datasets validate the effectiveness of our proposed attack methods under those two different settings.

Paperid: 696, https://arxiv.org/pdf/2506.00062.pdf

Abstract:
Fine-tuning large language models (LLMs) for telecom tasks and datasets is a common practice to adapt general-purpose models to the telecom domain. However, little attention has been paid to how this process may compromise model safety. Recent research has shown that even benign fine-tuning can degrade the safety alignment of LLMs, causing them to respond to harmful or unethical user queries. In this paper, we investigate this issue for telecom-tuned LLMs using three representative datasets featured by the GenAINet initiative. We show that safety degradation persists even for structured and seemingly harmless datasets such as 3GPP standards and tabular records, indicating that telecom-specific data is not immune to safety erosion during fine-tuning. We further extend our analysis to publicly available Telecom LLMs trained via continual pre-training, revealing that safety alignment is often severely lacking, primarily due to the omission of safety-focused instruction tuning. To address these issues in both fine-tuned and pre-trained models, we conduct extensive experiments and evaluate three safety realignment defenses (SafeInstruct, SafeLoRA, and SafeMERGE) using established red-teaming benchmarks. The results show that, across all settings, the proposed defenses can effectively restore safety after harmful degradation without compromising downstream task performance, leading to Safe teleCOMMunication (SafeCOMM) models. In a nutshell, our work serves as a diagnostic study and practical guide for safety realignment in telecom-tuned LLMs, and emphasizes the importance of safety-aware instruction and fine-tuning for real-world deployments of Telecom LLMs.

Paperid: 697, https://arxiv.org/pdf/2505.04094.pdf

Abstract:
Solana is a rapidly evolving blockchain platform that has attracted an increasing number of users. However, this growth has also drawn the attention of malicious actors, with some phishers extending their reach into the Solana ecosystem. Unlike platforms such as Ethereum, Solana has distinct designs of accounts and transactions, leading to the emergence of new types of phishing transactions that we term SolPhish. We define three types of SolPhish and develop a detection tool called SolPhishHunter. Utilizing SolPhishHunter, we detect a total of 8,058 instances of SolPhish and conduct an empirical analysis of these detected cases. Our analysis explores the distribution and impact of SolPhish, the characteristics of the phishers, and the relationships among phishing gangs. Particularly, the detected SolPhish transactions have resulted in nearly \$1.1 million in losses for victims. We report our detection results to the community and construct SolPhishDataset, the \emph{first} Solana phishing-related dataset in academia.

Paperid: 698, https://arxiv.org/pdf/2504.21035.pdf

Abstract:
Sanitizing sensitive text data typically involves removing personally identifiable information (PII) or generating synthetic data under the assumption that these methods adequately protect privacy; however, their effectiveness is often only assessed by measuring the leakage of explicit identifiers but ignoring nuanced textual markers that can lead to re-identification. We challenge the above illusion of privacy by proposing a new framework that evaluates re-identification attacks to quantify individual privacy risks upon data release. Our approach shows that seemingly innocuous auxiliary information -- such as routine social activities -- can be used to infer sensitive attributes like age or substance use history from sanitized data. For instance, we demonstrate that Azure's commercial PII removal tool fails to protect 74\% of information in the MedQA dataset. Although differential privacy mitigates these risks to some extent, it significantly reduces the utility of the sanitized text for downstream tasks. Our findings indicate that current sanitization techniques offer a \textit{false sense of privacy}, highlighting the need for more robust methods that protect against semantic-level information leakage.

Paperid: 699, https://arxiv.org/pdf/2504.20869.pdf

Abstract:
Graph neural networks have been widely utilized to solve graph-related tasks because of their strong learning power in utilizing the local information of neighbors. However, recent studies on graph adversarial attacks have proven that current graph neural networks are not robust against malicious attacks. Yet much of the existing work has focused on the optimization objective based on attack performance to obtain (near) optimal perturbations, but paid less attention to the strength quantification of each perturbation such as the injection of a particular node/link, which makes the choice of perturbations a black-box model that lacks interpretability. In this work, we propose the concept of noise to quantify the attack strength of each adversarial link. Furthermore, we propose three attack strategies based on the defined noise and classification margins in terms of single and multiple steps optimization. Extensive experiments conducted on benchmark datasets against three representative graph neural networks demonstrate the effectiveness of the proposed attack strategies. Particularly, we also investigate the preferred patterns of effective adversarial perturbations by analyzing the corresponding properties of the selected perturbation nodes.

Paperid: 700, https://arxiv.org/pdf/2504.20848.pdf

Abstract:
In recent years, graph neural networks (GNNs) have shown great potential in addressing various graph structure-related downstream tasks. However, recent studies have found that current GNNs are susceptible to malicious adversarial attacks. Given the inevitable presence of adversarial attacks in the real world, a variety of defense methods have been proposed to counter these attacks and enhance the robustness of GNNs. Despite the commendable performance of these defense methods, we have observed that they tend to exhibit a structural bias in terms of their defense capability on nodes with low degree (i.e., tail nodes), which is similar to the structural bias of traditional GNNs on nodes with low degree in the clean graph. Therefore, in this work, we propose a defense strategy by including hetero-homo augmented graph construction, $k$NN augmented graph construction, and multi-view node-wise attention modules to mitigate the structural bias of GNNs against adversarial attacks. Notably, the hetero-homo augmented graph consists of removing heterophilic links (i.e., links connecting nodes with dissimilar features) globally and adding homophilic links (i.e., links connecting nodes with similar features) for nodes with low degree. To further enhance the defense capability, an attention mechanism is adopted to adaptively combine the representations from the above two kinds of graph views. We conduct extensive experiments to demonstrate the defense and debiasing effect of the proposed strategy on benchmark datasets.

Paperid: 701, https://arxiv.org/pdf/2504.05902.pdf

Abstract:
The exponential increase in the parameters of Deep Neural Networks (DNNs) has significantly raised the cost of independent training, particularly for resource-constrained entities. As a result, there is a growing reliance on open-source models. However, the opacity of training processes exacerbates security risks, making these models more vulnerable to malicious threats, such as backdoor attacks, while simultaneously complicating defense mechanisms. Merging homogeneous models has gained attention as a cost-effective post-training defense. However, we notice that existing strategies, such as weight averaging, only partially mitigate the influence of poisoned parameters and remain ineffective in disrupting the pervasive spurious correlations embedded across model parameters. We propose a novel module-switching strategy to break such spurious correlations within the model's propagation path. By leveraging evolutionary algorithms to optimize fusion strategies, we validate our approach against backdoor attacks targeting text and vision domains. Our method achieves effective backdoor mitigation even when incorporating a couple of compromised models, e.g., reducing the average attack success rate (ASR) to 22% compared to 31.9% with the best-performing baseline on SST-2.

Paperid: 702, https://arxiv.org/pdf/2504.01822.pdf

Abstract:
Cross-chain technology enables seamless asset transfer and message-passing within decentralized finance (DeFi) ecosystems, facilitating multi-chain coexistence in the current blockchain environment. However, this development also raises security concerns, as malicious actors exploit cross-chain asset flows to conceal the provenance and destination of assets, thereby facilitating illegal activities such as money laundering. Consequently, the need for cross-chain transaction traceability has become increasingly urgent. Prior research on transaction traceability has predominantly focused on single-chain and centralized finance (CeFi) cross-chain scenarios, overlooking DeFispecific considerations. This paper proposes ABCTRACER, an automated, bi-directional cross-chain transaction tracing tool, specifically designed for DeFi ecosystems. By harnessing transaction event log mining and named entity recognition techniques, ABCTRACER automatically extracts explicit cross-chain cues. These cues are then combined with information retrieval techniques to encode implicit cues. ABCTRACER facilitates the autonomous learning of latent associated information and achieves bidirectional, generalized cross-chain transaction tracing. Our experiments on 12 mainstream cross-chain bridges demonstrate that ABCTRACER attains 91.75% bi-directional traceability (F1 metrics) with self-adaptive capability. Furthermore, we apply ABCTRACER to real-world cross-chain attack transactions and money laundering traceability, thereby bolstering the traceability and blockchain ecological security of DeFi bridging applications.

Paperid: 703, https://arxiv.org/pdf/2503.22379.pdf

Abstract:
The task of $\textit{Differentially Private Text Rewriting}$ is a class of text privatization techniques in which (sensitive) input textual documents are $\textit{rewritten}$ under Differential Privacy (DP) guarantees. The motivation behind such methods is to hide both explicit and implicit identifiers that could be contained in text, while still retaining the semantic meaning of the original text, thus preserving utility. Recent years have seen an uptick in research output in this field, offering a diverse array of word-, sentence-, and document-level DP rewriting methods. Common to these methods is the selection of a privacy budget (i.e., the $\varepsilon$ parameter), which governs the degree to which a text is privatized. One major limitation of previous works, stemming directly from the unique structure of language itself, is the lack of consideration of $\textit{where}$ the privacy budget should be allocated, as not all aspects of language, and therefore text, are equally sensitive or personal. In this work, we are the first to address this shortcoming, asking the question of how a given privacy budget can be intelligently and sensibly distributed amongst a target document. We construct and evaluate a toolkit of linguistics- and NLP-based methods used to allocate a privacy budget to constituent tokens in a text document. In a series of privacy and utility experiments, we empirically demonstrate that given the same privacy budget, intelligent distribution leads to higher privacy levels and more positive trade-offs than a naive distribution of $\varepsilon$. Our work highlights the intricacies of text privatization with DP, and furthermore, it calls for further work on finding more efficient ways to maximize the privatization benefits offered by DP in text rewriting.

Paperid: 704, https://arxiv.org/pdf/2503.13994.pdf

Abstract:
The rapid advancement of image editing techniques has raised concerns about their misuse for generating Not-Safe-for-Work (NSFW) content. This necessitates a targeted protection mechanism that blocks malicious edits while preserving normal editability. However, existing protection methods fail to achieve this balance, as they indiscriminately disrupt all edits while still allowing some harmful content to be generated. To address this, we propose TarPro, a targeted protection framework that prevents malicious edits while maintaining benign modifications. TarPro achieves this through a semantic-aware constraint that only disrupts malicious content and a lightweight perturbation generator that produces a more stable, imperceptible, and robust perturbation for image protection. Extensive experiments demonstrate that TarPro surpasses existing methods, achieving a high protection efficacy while ensuring minimal impact on normal edits. Our results highlight TarPro as a practical solution for secure and controlled image editing.

Paperid: 705, https://arxiv.org/pdf/2503.08284.pdf

Abstract:
Brain-Computer Interfaces (BCIs) are systems traditionally used in medicine and designed to interact with the brain to record or stimulate neurons. Despite their benefits, the literature has demonstrated that invasive BCIs focused on neurostimulation present vulnerabilities allowing attackers to gain control. In this context, neural cyberattacks emerged as threats able to disrupt spontaneous neural activity by performing neural overstimulation or inhibition. Previous work validated these attacks in small-scale simulations with a reduced number of neurons, lacking real-world complexity. Thus, this work tackles this limitation by analyzing the impact of two existing neural attacks, Neuronal Flooding (FLO) and Neuronal Jamming (JAM), on a complex neuronal topology of the primary visual cortex of mice consisting of approximately 230,000 neurons, tested on three realistic visual stimuli: flash effect, movie, and drifting gratings. Each attack was evaluated over three relevant events per stimulus, also testing the impact of attacking 25% and 50% of the neurons. The results, based on the number of spikes and shift percentages metrics, showed that the attacks caused the greatest impact on the movie, while dark and static events exhibit highest resilience. Although both attacks can significantly affect neural activity, JAM was generally more damaging, producing longer temporal delays, and had a larger prevalence. Finally, JAM did not require to alter many neurons to significantly affect neural activity, while the impact in FLO increased with the number of neurons attacked.

Paperid: 706, https://arxiv.org/pdf/2503.04837.pdf

Abstract:
Current deep learning (DL)-based palmprint verification models rely on centralized training with large datasets, which raises significant privacy concerns due to biometric data's sensitive and immutable nature. Federated learning~(FL), a privacy-preserving distributed learning paradigm, offers a compelling alternative by enabling collaborative model training without the need for data sharing. However, FL-based palmprint verification faces critical challenges, including data heterogeneity from diverse identities and the absence of standardized evaluation benchmarks. This paper addresses these gaps by establishing a comprehensive benchmark for FL-based palmprint verification, which explicitly defines and evaluates two practical scenarios: closed-set and open-set verification. We propose FedPalm, a unified FL framework that balances local adaptability with global generalization. Each client trains a personalized textural expert tailored to local data and collaboratively contributes to a shared global textural expert for extracting generalized features. To further enhance verification performance, we introduce a Textural Expert Interaction Module that dynamically routes textural features among experts to generate refined side textural features. Learnable parameters are employed to model relationships between original and side features, fostering cross-texture-expert interaction and improving feature discrimination. Extensive experiments validate the effectiveness of FedPalm, demonstrating robust performance across both scenarios and providing a promising foundation for advancing FL-based palmprint verification research.

Paperid: 707, https://arxiv.org/pdf/2502.19320.pdf

Abstract:
Large language models (LLMs) are often deployed to perform constrained tasks, with narrow domains. For example, customer support bots can be built on top of LLMs, relying on their broad language understanding and capabilities to enhance performance. However, these LLMs are adversarially susceptible, potentially generating outputs outside the intended domain. To formalize, assess, and mitigate this risk, we introduce domain certification; a guarantee that accurately characterizes the out-of-domain behavior of language models. We then propose a simple yet effective approach, which we call VALID that provides adversarial bounds as a certificate. Finally, we evaluate our method across a diverse set of datasets, demonstrating that it yields meaningful certificates, which bound the probability of out-of-domain samples tightly with minimum penalty to refusal behavior.

Paperid: 708, https://arxiv.org/pdf/2502.18517.pdf

Abstract:
The success of large language models (LLMs) has attracted many individuals to fine-tune them for domain-specific tasks by uploading their data. However, in sensitive areas like healthcare and finance, privacy concerns often arise. One promising solution is to generate synthetic data with Differential Privacy (DP) guarantees to replace private data. However, these synthetic data contain significant flawed data, which are considered as noise. Existing solutions typically rely on naive filtering by comparing ROUGE-L scores or embedding similarities, which are ineffective in addressing the noise. To address this issue, we propose \textit{RewardDS}, a novel privacy-preserving framework that fine-tunes a reward proxy model and uses reward signals to guide the synthetic data generation. Our \textit{RewardDS} introduces two key modules, Reward Guided Filtering and Self-Optimizing Refinement, to both filter and refine the synthetic data, effectively mitigating the noise. Extensive experiments across medical, financial, and code generation domains demonstrate the effectiveness of our method.

Paperid: 709, https://arxiv.org/pdf/2502.06418.pdf

Abstract:
Watermarking plays a key role in the provenance and detection of AI-generated content. While existing methods prioritize robustness against real-world distortions (e.g., JPEG compression and noise addition), we reveal a fundamental tradeoff: such robust watermarks inherently improve the redundancy of detectable patterns encoded into images, creating exploitable information leakage. To leverage this, we propose an attack framework that extracts leakage of watermark patterns through multi-channel feature learning using a pre-trained vision model. Unlike prior works requiring massive data or detector access, our method achieves both forgery and detection evasion with a single watermarked image. Extensive experiments demonstrate that our method achieves a 60\% success rate gain in detection evasion and 51\% improvement in forgery accuracy compared to state-of-the-art methods while maintaining visual fidelity. Our work exposes the robustness-stealthiness paradox: current "robust" watermarks sacrifice security for distortion resistance, providing insights for future watermark design.

Paperid: 710, https://arxiv.org/pdf/2502.04229.pdf

Abstract:
Dataset distillation (DD) enhances training efficiency and reduces bandwidth by condensing large datasets into smaller synthetic ones. It enables models to achieve performance comparable to those trained on the raw full dataset and has become a widely adopted method for data sharing. However, security concerns in DD remain underexplored. Existing studies typically assume that malicious behavior originates from dataset owners during the initial distillation process, where backdoors are injected into raw datasets. In contrast, this work is the first to address a more realistic and concerning threat: attackers may intercept the dataset distribution process, inject backdoors into the distilled datasets, and redistribute them to users. While distilled datasets were previously considered resistant to backdoor attacks, we demonstrate that they remain vulnerable to such attacks. Furthermore, we show that attackers do not even require access to any raw data to inject the backdoors successfully. Specifically, our approach reconstructs conceptual archetypes for each class from the model trained on the distilled dataset. Backdoors are then injected into these archetypes to update the distilled dataset. Moreover, we ensure the updated dataset not only retains the backdoor but also preserves the original optimization trajectory, thus maintaining the knowledge of the raw dataset. To achieve this, a hybrid loss is designed to integrate backdoor information along the benign optimization trajectory, ensuring that previously learned information is not forgotten. Extensive experiments demonstrate that distilled datasets are highly vulnerable to backdoor attacks, with risks pervasive across various raw datasets, distillation methods, and downstream training strategies. Moreover, our attack method is efficient, capable of synthesizing a malicious distilled dataset in under one minute in certain cases.

Paperid: 711, https://arxiv.org/pdf/2502.02438.pdf

Abstract:
Medical multimodal large language models (MLLMs) are becoming an instrumental part of healthcare systems, assisting medical personnel with decision making and results analysis. Models for radiology report generation are able to interpret medical imagery, thus reducing the workload of radiologists. As medical data is scarce and protected by privacy regulations, medical MLLMs represent valuable intellectual property. However, these assets are potentially vulnerable to model stealing, where attackers aim to replicate their functionality via black-box access. So far, model stealing for the medical domain has focused on classification; however, existing attacks are not effective against MLLMs. In this paper, we introduce Adversarial Domain Alignment (ADA-STEAL), the first stealing attack against medical MLLMs. ADA-STEAL relies on natural images, which are public and widely available, as opposed to their medical counterparts. We show that data augmentation with adversarial noise is sufficient to overcome the data distribution gap between natural images and the domain-specific distribution of the victim MLLM. Experiments on the IU X-RAY and MIMIC-CXR radiology datasets demonstrate that Adversarial Domain Alignment enables attackers to steal the medical MLLM without any access to medical data.

Paperid: 712, https://arxiv.org/pdf/2502.02260.pdf

Abstract:
In the past decade, considerable research effort has been devoted to securing machine learning (ML) models that operate in adversarial settings. Yet, progress has been slow even for simple "toy" problems (e.g., robustness to small adversarial perturbations) and is often hindered by non-rigorous evaluations. Today, adversarial ML research has shifted towards studying larger, general-purpose language models. In this position paper, we argue that the situation is now even worse: in the era of LLMs, the field of adversarial ML studies problems that are (1) less clearly defined, (2) harder to solve, and (3) even more challenging to evaluate. As a result, we caution that yet another decade of work on adversarial ML may fail to produce meaningful progress.

Paperid: 713, https://arxiv.org/pdf/2501.16497.pdf

Abstract:
Improving the safety and reliability of large language models (LLMs) is a crucial aspect of realizing trustworthy AI systems. Although alignment methods aim to suppress harmful content generation, LLMs are often still vulnerable to jailbreaking attacks that employ adversarial inputs that subvert alignment and induce harmful outputs. We propose the Randomized Embedding Smoothing and Token Aggregation (RESTA) defense, which adds random noise to the embedding vectors and performs aggregation during the generation of each output token, with the aim of better preserving semantic information. Our experiments demonstrate that our approach achieves superior robustness versus utility tradeoffs compared to the baseline defenses.

Paperid: 714, https://arxiv.org/pdf/2501.11405.pdf

Abstract:
Backscattering tag-to-tag networks (BTTNs) are emerging passive radio frequency identification (RFID) systems that facilitate direct communication between tags using an external RF field and play a pivotal role in ubiquitous Internet of Things (IoT) applications. Despite their potential, BTTNs face significant security vulnerabilities, which remain their primary concern to enable reliable communication. Existing authentication schemes in backscatter communication (BC) systems, which mainly focus on tag-to-reader or reader-to-tag scenarios, are unsuitable for BTTNs due to the ultra-low power constraints and limited computational capabilities of the tags, leaving the challenge of secure tag-to-tag authentication largely unexplored. To bridge this gap, this paper proposes a physical layer authentication (PLA) scheme, where a Talker tag (TT) and a Listener tag (LT) can authenticate each other in the presence of an adversary, only leveraging the unique output voltage profile of the energy harvesting and the envelope detector circuits embedded in their power and demodulation units. This allows for efficient authentication of BTTN tags without additional computational overhead. In addition, since the low spectral efficiency and limited coverage range in BTTNs hinder PLA performance, we propose integrating an indoor reconfigurable intelligent surface (RIS) into the system to enhance authentication accuracy and enable successful authentication over longer distances. Security analysis and simulation results indicate that our scheme is robust against various attack vectors and achieves acceptable performance across various experimental settings. Additionally, the results indicate that using RIS significantly enhances PLA performance in terms of accuracy and robustness, especially at longer distances compared to traditional BTTN scenarios without RIS.

Paperid: 715, https://arxiv.org/pdf/2501.03227.pdf

Abstract:
Although, both double-spending and selfish-mining attacks have been extensively studied since the ``Bitcoin'' whitepaper of Nakamoto and the ``majority is not enough'' paper of Eyal and Sirer, there has been no rigorous stochastic analysis of an attack that combines the two, except for the complicated MDP models. In this paper, we first combine stubborn and selfish mining attacks, i.e., construct a strategy where the attacker acts stubborn until its private branch reaches a certain length and then switches to act selfish. We provide the optimal stubbornness for each parameter regime. Next, we provide the maximum stubbornness that is still more profitable than honest mining and argue a connection between the level of stubbornness and the $k$-confirmation rule. We show that, at each attack cycle, if the level of stubbornness is higher than $k$, there is a risk of double-spending which comes at no-cost to the adversary. The result can be seen as a guide for picking $k$ in the $k$-confirmation rule in a blockchain design. At each cycle, for a given stubbornness level, we rigorously formulate how great the risk of double-spending is. We provide the minimum double-spend value needed for an attack to be profitable in the regimes where the scheme is less profitable than honest mining. We further modify the attack in the stubborn regime in order to conceal the attack and increase the double-spending probability. Finally, we evaluate the results and provide the optimal and the maximum stubbornness levels for each parameter regime as well as the revenue. As a case study, with Bitcoin's $k=6$ block confirmation rule, we evaluate the revenue and double-spending risk of the attacks for each pool parameter.

Paperid: 716, https://arxiv.org/pdf/2501.02446.pdf

Abstract:
Recent advances of large language models in the field of Verilog generation have raised several ethical and security concerns, such as code copyright protection and dissemination of malicious code. Researchers have employed watermarking techniques to identify codes generated by large language models. However, the existing watermarking works fail to protect RTL code copyright due to the significant syntactic and semantic differences between RTL code and software code in languages such as Python. This paper proposes a hardware watermarking framework RTLMarker that embeds watermarks into RTL code and deeper into the synthesized netlist. We propose a set of rule-based Verilog code transformations , ensuring the watermarked RTL code's syntactic and semantic correctness. In addition, we consider an inherent tradeoff between watermark transparency and watermark effectiveness and jointly optimize them. The results demonstrate RTLMarker's superiority over the baseline in RTL code watermarking.

Paperid: 717, https://arxiv.org/pdf/2506.23603.pdf

Abstract:
As Large Language Models (LLMs) are increasingly deployed in sensitive domains, traditional data privacy measures prove inadequate for protecting information that is implicit, contextual, or inferable - what we define as semantic privacy. This Systematization of Knowledge (SoK) introduces a lifecycle-centric framework to analyze how semantic privacy risks emerge across input processing, pretraining, fine-tuning, and alignment stages of LLMs. We categorize key attack vectors and assess how current defenses, such as differential privacy, embedding encryption, edge computing, and unlearning, address these threats. Our analysis reveals critical gaps in semantic-level protection, especially against contextual inference and latent representation leakage. We conclude by outlining open challenges, including quantifying semantic leakage, protecting multimodal inputs, balancing de-identification with generation quality, and ensuring transparency in privacy enforcement. This work aims to inform future research on designing robust, semantically aware privacy-preserving techniques for LLMs.

Paperid: 718, https://arxiv.org/pdf/2506.23260.pdf

Abstract:
Autonomous AI agents powered by large language models (LLMs) with structured function-calling interfaces have dramatically expanded capabilities for real-time data retrieval, complex computation, and multi-step orchestration. Yet, the explosive proliferation of plugins, connectors, and inter-agent protocols has outpaced discovery mechanisms and security practices, resulting in brittle integrations vulnerable to diverse threats. In this survey, we introduce the first unified, end-to-end threat model for LLM-agent ecosystems, spanning host-to-tool and agent-to-agent communications, formalize adversary capabilities and attacker objectives, and catalog over thirty attack techniques. Specifically, we organized the threat model into four domains: Input Manipulation (e.g., prompt injections, long-context hijacks, multimodal adversarial inputs), Model Compromise (e.g., prompt- and parameter-level backdoors, composite and encrypted multi-backdoors, poisoning strategies), System and Privacy Attacks (e.g., speculative side-channels, membership inference, retrieval poisoning, social-engineering simulations), and Protocol Vulnerabilities (e.g., exploits in Model Context Protocol (MCP), Agent Communication Protocol (ACP), Agent Network Protocol (ANP), and Agent-to-Agent (A2A) protocol). For each category, we review representative scenarios, assess real-world feasibility, and evaluate existing defenses. Building on our threat taxonomy, we identify key open challenges and future research directions, such as securing MCP deployments through dynamic trust management and cryptographic provenance tracking; designing and hardening Agentic Web Interfaces; and achieving resilience in multi-agent and federated environments. Our work provides a comprehensive reference to guide the design of robust defense mechanisms and establish best practices for resilient LLM-agent workflows.

Paperid: 719, https://arxiv.org/pdf/2506.19054.pdf

Abstract:
As LLMs become widespread across diverse applications, concerns about the security and safety of LLM interactions have intensified. Numerous guardrail models and benchmarks have been developed to ensure LLM content safety. However, existing guardrail benchmarks are often built upon ad hoc risk taxonomies that lack a principled grounding in standardized safety policies, limiting their alignment with real-world operational requirements. Moreover, they tend to overlook domain-specific risks, while the same risk category can carry different implications across different domains. To bridge these gaps, we introduce GuardSet-X, the first massive multi-domain safety policy-grounded guardrail dataset. GuardSet-X offers: (1) broad domain coverage across eight safety-critical domains, such as finance, law, and codeGen; (2) policy-grounded risk construction based on authentic, domain-specific safety guidelines; (3) diverse interaction formats, encompassing declarative statements, questions, instructions, and multi-turn conversations; (4) advanced benign data curation via detoxification prompting to challenge over-refusal behaviors; and (5) \textbf{attack-enhanced instances} that simulate adversarial inputs designed to bypass guardrails. Based on GuardSet-X, we benchmark 19 advanced guardrail models and uncover a series of findings, such as: (1) All models achieve varied F1 scores, with many demonstrating high variance across risk categories, highlighting their limited domain coverage and insufficient handling of domain-specific safety concerns; (2) As models evolve, their coverage of safety risks broadens, but performance on common risk categories may decrease; (3) All models remain vulnerable to optimized adversarial attacks. We believe that \dataset and the unique insights derived from our evaluations will advance the development of policy-aligned and resilient guardrail systems.

Paperid: 720, https://arxiv.org/pdf/2506.13161.pdf

Abstract:
Large Language Models (LLMs) are increasingly used in software security, but their trustworthiness in generating accurate vulnerability advisories remains uncertain. This study investigates the ability of ChatGPT to (1) generate plausible security advisories from CVE-IDs, (2) differentiate real from fake CVE-IDs, and (3) extract CVE-IDs from advisory descriptions. Using a curated dataset of 100 real and 100 fake CVE-IDs, we manually analyzed the credibility and consistency of the model's outputs. The results show that ChatGPT generated plausible security advisories for 96% of given input real CVE-IDs and 97% of given input fake CVE-IDs, demonstrating a limitation in differentiating between real and fake IDs. Furthermore, when these generated advisories were reintroduced to ChatGPT to identify their original CVE-ID, the model produced a fake CVE-ID in 6% of cases from real advisories. These findings highlight both the strengths and limitations of ChatGPT in cybersecurity applications. While the model demonstrates potential for automating advisory generation, its inability to reliably authenticate CVE-IDs or maintain consistency upon re-evaluation underscores the risks associated with its deployment in critical security tasks. Our study emphasizes the importance of using LLMs with caution in cybersecurity workflows and suggests the need for further improvements in their design to improve reliability and applicability in security advisory generation.

Paperid: 721, https://arxiv.org/pdf/2506.11521.pdf

Abstract:
Multimodal large language models (MLLMs), which bridge the gap between audio-visual and natural language processing, achieve state-of-the-art performance on several audio-visual tasks. Despite the superior performance of MLLMs, the scarcity of high-quality audio-visual training data and computational resources necessitates the utilization of third-party data and open-source MLLMs, a trend that is increasingly observed in contemporary research. This prosperity masks significant security risks. Empirical studies demonstrate that the latest MLLMs can be manipulated to produce malicious or harmful content. This manipulation is facilitated exclusively through instructions or inputs, including adversarial perturbations and malevolent queries, effectively bypassing the internal security mechanisms embedded within the models. To gain a deeper comprehension of the inherent security vulnerabilities associated with audio-visual-based multimodal models, a series of surveys investigates various types of attacks, including adversarial and backdoor attacks. While existing surveys on audio-visual attacks provide a comprehensive overview, they are limited to specific types of attacks, which lack a unified review of various types of attacks. To address this issue and gain insights into the latest trends in the field, this paper presents a comprehensive and systematic review of audio-visual attacks, which include adversarial attacks, backdoor attacks, and jailbreak attacks. Furthermore, this paper also reviews various types of attacks in the latest audio-visual-based MLLMs, a dimension notably absent in existing surveys. Drawing upon comprehensive insights from a substantial review, this paper delineates both challenges and emergent trends for future research on audio-visual attacks and defense.

Paperid: 722, https://arxiv.org/pdf/2506.06861.pdf

Abstract:
As a fundamental problem in machine learning and differential privacy (DP), DP linear regression has been extensively studied. However, most existing methods focus primarily on either regular data distributions or low-dimensional cases with irregular data. To address these limitations, this paper provides a comprehensive study of DP sparse linear regression with heavy-tailed responses in high-dimensional settings. In the first part, we introduce the DP-IHT-H method, which leverages the Huber loss and private iterative hard thresholding to achieve an estimation error bound of $ \tilde{O}\biggl( s^{* \frac{1 }{2}} \cdot \biggl(\frac{\log d}{n}\biggr)^{\fracÎ¶{1 + Î¶}} + s^{* \frac{1 + 2Î¶}{2 + 2Î¶}} \cdot \biggl(\frac{\log^2 d}{n \varepsilon}\biggr)^{\fracÎ¶{1 + Î¶}} \biggr) $ under the $(\varepsilon, Î´)$-DP model, where $n$ is the sample size, $d$ is the dimensionality, $s^*$ is the sparsity of the parameter, and $Î¶\in (0, 1]$ characterizes the tail heaviness of the data. In the second part, we propose DP-IHT-L, which further improves the error bound under additional assumptions on the response and achieves $ \tilde{O}\Bigl(\frac{(s^*)^{3/2} \log d}{n \varepsilon}\Bigr). $ Compared to the first result, this bound is independent of the tail parameter $Î¶$. Finally, through experiments on synthetic and real-world datasets, we demonstrate that our methods outperform standard DP algorithms designed for ``regular'' data.

Paperid: 723, https://arxiv.org/pdf/2505.22108.pdf

Abstract:
Federated Learning (FL) offers a promising approach for training clinical AI models without centralizing sensitive patient data. However, its real-world adoption is hindered by challenges related to privacy, resource constraints, and compliance. Existing Differential Privacy (DP) approaches often apply uniform noise, which disproportionately degrades model performance, even among well-compliant institutions. In this work, we propose a novel compliance-aware FL framework that enhances DP by adaptively adjusting noise based on quantifiable client compliance scores. Additionally, we introduce a compliance scoring tool based on key healthcare and security standards to promote secure, inclusive, and equitable participation across diverse clinical settings. Extensive experiments on public datasets demonstrate that integrating under-resourced, less compliant clinics with highly regulated institutions yields accuracy improvements of up to 15% over traditional FL. This work advances FL by balancing privacy, compliance, and performance, making it a viable solution for real-world clinical workflows in global healthcare.

Paperid: 724, https://arxiv.org/pdf/2505.12296.pdf

Abstract:
Machine learning models are increasingly shared and outsourced, raising requirements of verifying training effort (Proof-of-Learning, PoL) to ensure claimed performance and establishing ownership (Proof-of-Ownership, PoO) for transactions. When models are trained by untrusted parties, PoL and PoO must be enforced together to enable protection, attribution, and compensation. However, existing studies typically address them separately, which not only weakens protection against forgery and privacy breaches but also leads to high verification overhead. We propose PoLO, a unified framework that simultaneously achieves PoL and PoO using chained watermarks. PoLO splits the training process into fine-grained training shards and embeds a dedicated watermark in each shard. Each watermark is generated using the hash of the preceding shard, certifying the training process of the preceding shard. The chained structure makes it computationally difficult to forge any individual part of the whole training process. The complete set of watermarks serves as the PoL, while the final watermark provides the PoO. PoLO offers more efficient and privacy-preserving verification compared to the vanilla PoL solutions that rely on gradient-based trajectory tracing and inadvertently expose training data during verification, while maintaining the same level of ownership assurance of watermark-based PoO schemes. Our evaluation shows that PoLO achieves 99% watermark detection accuracy for ownership verification, while preserving data privacy and cutting verification costs to just 1.5-10% of traditional methods. Forging PoLO demands 1.1-4x more resources than honest proof generation, with the original proof retaining over 90% detection accuracy even after attacks.

Paperid: 725, https://arxiv.org/pdf/2505.10983.pdf

Abstract:
We propose the first unified adversarial attack benchmark for Genomic Foundation Models (GFMs), named GenoArmory. Unlike existing GFM benchmarks, GenoArmory offers the first comprehensive evaluation framework to systematically assess the vulnerability of GFMs to adversarial attacks. Methodologically, we evaluate the adversarial robustness of five state-of-the-art GFMs using four widely adopted attack algorithms and three defense strategies. Importantly, our benchmark provides an accessible and comprehensive framework to analyze GFM vulnerabilities with respect to model architecture, quantization schemes, and training datasets. Additionally, we introduce GenoAdv, a new adversarial sample dataset designed to improve GFM safety. Empirically, classification models exhibit greater robustness to adversarial perturbations compared to generative models, highlighting the impact of task type on model vulnerability. Moreover, adversarial attacks frequently target biologically significant genomic regions, suggesting that these models effectively capture meaningful sequence features.

Paperid: 726, https://arxiv.org/pdf/2505.08292.pdf

Abstract:
Password strength meters (PSMs) have been widely used by websites to gauge password strength, encouraging users to create stronger passwords. Popular data-driven PSMs, e.g., based on Markov, Probabilistic Context-free Grammar (PCFG) and neural networks, alarm strength based on a model learned from real passwords. Despite their proven effectiveness, the secure utility that arises from the leakage of trained passwords remains largely overlooked. To address this gap, we analyze 11 PSMs and find that 5 data-driven meters are vulnerable to membership inference attacks that expose their trained passwords, and seriously, 3 rule-based meters openly disclose their blocked passwords. We specifically design a PSM privacy leakage evaluation approach, and uncover that a series of general data-driven meters are vulnerable to leaking between 10^4 to 10^5 trained passwords, with the PCFG-based models being more vulnerable than other counterparts; furthermore, we aid in deriving insights that the inherent utility-privacy tradeoff is not as severe as previously thought. To further exploit the risks, we develop novel meter-aware attacks when a clever attacker can filter the used passwords during compromising accounts on websites using the meter, and experimentally show that attackers targeting websites that deployed the popular Zxcvbn meter can compromise an additional 5.84% user accounts within 10 attempts, demonstrating the urgent need for privacy-preserving PSMs that protect the confidentiality of the meter's used passwords. Finally, we sketch some counter-measures to mitigate these threats.

Paperid: 727, https://arxiv.org/pdf/2505.05292.pdf

Abstract:
The QUIC protocol is now widely adopted by major tech companies and accounts for a significant fraction of today's Internet traffic. QUIC's multiplexing capabilities, encrypted headers, dynamic IP address changes, and encrypted parameter negotiations make the protocol not only more efficient, secure, and censorship-resistant, but also practically unmanageable by firewalls. This opens doors for attackers who may exploit certain traits of the QUIC protocol to perform targeted attacks, such as data exfiltration attacks. Whereas existing data exfiltration techniques, such as TLS and DNS-based exfiltration, can be detected on a firewall level, QUIC-based data exfiltration is more difficult to detect, since changes in IP addresses and ports are inherent to the protocol's normal behavior. To show the feasibility of a QUIC-based data exfiltration attack, we introduce a novel method leveraging the server preferred address feature of the QUIC protocol and, thus, allows an attacker to exfiltrate sensitive data from an infected machine to a malicious server, disguised as a server-side connection migration. The attack is implemented as a proof of concept tool in Rust. We evaluated the performance of five anomaly detection classifiers - Random Forest, Multi-Layer Perceptron, Support Vector Machine, Autoencoder, and Isolation Forest - trained on datasets collected from three network traffic scenarios. The classifiers were trained on over 700K benign and malicious QUIC packets and 786 connection migration events, but were unable to detect the data exfiltration attempts. Furthermore, post-analysis of the traffic captures did not reveal any identifiable fingerprint. As part of our evaluation, we also interviewed five leading firewall vendors and found that, as of today, no major firewall vendor implements functionality capable of distinguishing between benign and malicious QUIC connection migrations.

Paperid: 728, https://arxiv.org/pdf/2505.03161.pdf

Abstract:
Recently emerged 6G space-air-ground integrated networks (SAGINs), which integrate satellites, aerial networks, and terrestrial communications, offer ubiquitous coverage for various mobile applications. However, the highly dynamic, open, and heterogeneous nature of SAGINs poses severe security issues. Forming a defense line of SAGINs suffers from two preliminary challenges: 1) accurately understanding massive unstructured multi-dimensional threat information to generate defense strategies against various malicious attacks, 2) rapidly adapting to potential unknown threats to yield more effective security strategies. To tackle the above two challenges, we propose a novel security framework for SAGINs based on Large Language Models (LLMs), which consists of two key ingredients LLM-6GNG and 6G-INST. Our proposed LLM-6GNG leverages refined chain-of-thought (CoT) reasoning and dynamic multi-agent mechanisms to analyze massive unstructured multi-dimensional threat data and generate comprehensive security strategies, thus addressing the first challenge. Our proposed 6G-INST relies on a novel self-evolving method to automatically update LLM-6GNG, enabling it to accommodate unknown threats under dynamic communication environments, thereby addressing the second challenge. Additionally, we prototype the proposed framework with ns-3, OpenAirInterface (OAI), and software-defined radio (SDR). Experiments on three benchmarks demonstrate the effectiveness of our framework. The results show that our framework produces highly accurate security strategies that remain robust against a variety of unknown attacks. We will release our code to contribute to the community.

Paperid: 729, https://arxiv.org/pdf/2504.17934.pdf

Abstract:
The rise of Large Language Models (LLMs) has revolutionized Graphical User Interface (GUI) automation through LLM-powered GUI agents, yet their ability to process sensitive data with limited human oversight raises significant privacy and security risks. This position paper identifies three key risks of GUI agents and examines how they differ from traditional GUI automation and general autonomous agents. Despite these risks, existing evaluations focus primarily on performance, leaving privacy and security assessments largely unexplored. We review current evaluation metrics for both GUI and general LLM agents and outline five key challenges in integrating human evaluators for GUI agent assessments. To address these gaps, we advocate for a human-centered evaluation framework that incorporates risk assessments, enhances user awareness through in-context consent, and embeds privacy and security considerations into GUI agent design and evaluation.

Paperid: 730, https://arxiv.org/pdf/2504.11281.pdf

Abstract:
A Large Language Model (LLM) powered GUI agent is a specialized autonomous system that performs tasks on the user's behalf according to high-level instructions. It does so by perceiving and interpreting the graphical user interfaces (GUIs) of relevant apps, often visually, inferring necessary sequences of actions, and then interacting with GUIs by executing the actions such as clicking, typing, and tapping. To complete real-world tasks, such as filling forms or booking services, GUI agents often need to process and act on sensitive user data. However, this autonomy introduces new privacy and security risks. Adversaries can inject malicious content into the GUIs that alters agent behaviors or induces unintended disclosures of private information. These attacks often exploit the discrepancy between visual saliency for agents and human users, or the agent's limited ability to detect violations of contextual integrity in task automation. In this paper, we characterized six types of such attacks, and conducted an experimental study to test these attacks with six state-of-the-art GUI agents, 234 adversarial webpages, and 39 human participants. Our findings suggest that GUI agents are highly vulnerable, particularly to contextually embedded threats. Moreover, human users are also susceptible to many of these attacks, indicating that simple human oversight may not reliably prevent failures. This misalignment highlights the need for privacy-aware agent design. We propose practical defense strategies to inform the development of safer and more reliable GUI agents.

Paperid: 731, https://arxiv.org/pdf/2504.00858.pdf

Abstract:
The widespread application of automatic speech recognition (ASR) supports large-scale voice surveillance, raising concerns about privacy among users. In this paper, we concentrate on using adversarial examples to mitigate unauthorized disclosure of speech privacy thwarted by potential eavesdroppers in speech communications. While audio adversarial examples have demonstrated the capability to mislead ASR models or evade ASR surveillance, they are typically constructed through time-intensive offline optimization, restricting their practicality in real-time voice communication. Recent work overcame this limitation by generating universal adversarial perturbations (UAPs) and enhancing their transferability for black-box scenarios. However, they introduced excessive noise that significantly degrades audio quality and affects human perception, thereby limiting their effectiveness in practical scenarios. To address this limitation and protect live users' speech against ASR systems, we propose a novel framework, AudioShield. Central to this framework is the concept of Transferable Universal Adversarial Perturbations in the Latent Space (LS-TUAP). By transferring the perturbations to the latent space, the audio quality is preserved to a large extent. Additionally, we propose target feature adaptation to enhance the transferability of UAPs by embedding target text features into the perturbations. Comprehensive evaluation on four commercial ASR APIs (Google, Amazon, iFlytek, and Alibaba), three voice assistants, two LLM-powered ASR and one NN-based ASR demonstrates the protection superiority of AudioShield over existing competitors, and both objective and subjective evaluations indicate that AudioShield significantly improves the audio quality. Moreover, AudioShield also shows high effectiveness in real-time end-to-end scenarios, and demonstrates strong resilience against adaptive countermeasures.

Paperid: 732, https://arxiv.org/pdf/2503.22738.pdf

Abstract:
Autonomous agents powered by foundation models have seen widespread adoption across various real-world applications. However, they remain highly vulnerable to malicious instructions and attacks, which can result in severe consequences such as privacy breaches and financial losses. More critically, existing guardrails for LLMs are not applicable due to the complex and dynamic nature of agents. To tackle these challenges, we propose ShieldAgent, the first guardrail agent designed to enforce explicit safety policy compliance for the action trajectory of other protected agents through logical reasoning. Specifically, ShieldAgent first constructs a safety policy model by extracting verifiable rules from policy documents and structuring them into a set of action-based probabilistic rule circuits. Given the action trajectory of the protected agent, ShieldAgent retrieves relevant rule circuits and generates a shielding plan, leveraging its comprehensive tool library and executable code for formal verification. In addition, given the lack of guardrail benchmarks for agents, we introduce ShieldAgent-Bench, a dataset with 3K safety-related pairs of agent instructions and action trajectories, collected via SOTA attacks across 6 web environments and 7 risk categories. Experiments show that ShieldAgent achieves SOTA on ShieldAgent-Bench and three existing benchmarks, outperforming prior methods by 11.3% on average with a high recall of 90.1%. Additionally, ShieldAgent reduces API queries by 64.7% and inference time by 58.2%, demonstrating its high precision and efficiency in safeguarding agents.

Paperid: 733, https://arxiv.org/pdf/2503.17987.pdf

Abstract:
Text-to-Image(T2I) models typically deploy safety filters to prevent the generation of sensitive images. Unfortunately, recent jailbreaking attack methods manually design prompts for the LLM to generate adversarial prompts, which effectively bypass safety filters while producing sensitive images, exposing safety vulnerabilities of T2I models. However, due to the LLM's limited understanding of the T2I model and its safety filters, existing methods require numerous queries to achieve a successful attack, limiting their practical applicability. To address this issue, we propose Reason2Attack(R2A), which aims to enhance the LLM's reasoning capabilities in generating adversarial prompts by incorporating the jailbreaking attack into the post-training process of the LLM. Specifically, we first propose a CoT example synthesis pipeline based on Frame Semantics, which generates adversarial prompts by identifying related terms and corresponding context illustrations. Using CoT examples generated by the pipeline, we fine-tune the LLM to understand the reasoning path and format the output structure. Subsequently, we incorporate the jailbreaking attack task into the reinforcement learning process of the LLM and design an attack process reward that considers prompt length, prompt stealthiness, and prompt effectiveness, aiming to further enhance reasoning accuracy. Extensive experiments on various T2I models show that R2A achieves a better attack success ratio while requiring fewer queries than baselines. Moreover, our adversarial prompts demonstrate strong attack transferability across both open-source and commercial T2I models.

Paperid: 734, https://arxiv.org/pdf/2503.17514.pdf

Abstract:
An important question today is whether a given text was used to train a large language model (LLM). A \emph{completion} test is often employed: check if the LLM completes a sufficiently complex text. This, however, requires a ground-truth definition of membership; most commonly, it is defined as a member based on the $n$-gram overlap between the target text and any text in the dataset. In this work, we demonstrate that this $n$-gram based membership definition can be effectively gamed. We study scenarios where sequences are \emph{non-members} for a given $n$ and we find that completion tests still succeed. We find many natural cases of this phenomenon by retraining LLMs from scratch after removing all training samples that were completed; these cases include exact duplicates, near-duplicates, and even short overlaps. They showcase that it is difficult to find a single viable choice of $n$ for membership definitions. Using these insights, we design adversarial datasets that can cause a given target sequence to be completed without containing it, for any reasonable choice of $n$. Our findings highlight the inadequacy of $n$-gram membership, suggesting membership definitions fail to account for auxiliary information available to the training algorithm.

Paperid: 735, https://arxiv.org/pdf/2503.15754.pdf

Abstract:
As large language models (LLMs) become increasingly capable, security and safety evaluation are crucial. While current red teaming approaches have made strides in assessing LLM vulnerabilities, they often rely heavily on human input and lack comprehensive coverage of emerging attack vectors. This paper introduces AutoRedTeamer, a novel framework for fully automated, end-to-end red teaming against LLMs. AutoRedTeamer combines a multi-agent architecture with a memory-guided attack selection mechanism to enable continuous discovery and integration of new attack vectors. The dual-agent framework consists of a red teaming agent that can operate from high-level risk categories alone to generate and execute test cases and a strategy proposer agent that autonomously discovers and implements new attacks by analyzing recent research. This modular design allows AutoRedTeamer to adapt to emerging threats while maintaining strong performance on existing attack vectors. We demonstrate AutoRedTeamer's effectiveness across diverse evaluation settings, achieving 20% higher attack success rates on HarmBench against Llama-3.1-70B while reducing computational costs by 46% compared to existing approaches. AutoRedTeamer also matches the diversity of human-curated benchmarks in generating test cases, providing a comprehensive, scalable, and continuously evolving framework for evaluating the security of AI systems.

Paperid: 736, https://arxiv.org/pdf/2503.02574.pdf

Abstract:
In this paper, we argue that current safety alignment research efforts for large language models are hindered by many intertwined sources of noise, such as small datasets, methodological inconsistencies, and unreliable evaluation setups. This can, at times, make it impossible to evaluate and compare attacks and defenses fairly, thereby slowing progress. We systematically analyze the LLM safety evaluation pipeline, covering dataset curation, optimization strategies for automated red-teaming, response generation, and response evaluation using LLM judges. At each stage, we identify key issues and highlight their practical impact. We also propose a set of guidelines for reducing noise and bias in evaluations of future attack and defense papers. Lastly, we offer an opposing perspective, highlighting practical reasons for existing limitations. We believe that addressing the outlined problems in future research will improve the field's ability to generate easily comparable results and make measurable progress.

Paperid: 737, https://arxiv.org/pdf/2502.15680.pdf

Abstract:
Due to the sensitive nature of personally identifiable information (PII), its owners may have the authority to control its inclusion or request its removal from large-language model (LLM) training. Beyond this, PII may be added or removed from training datasets due to evolving dataset curation techniques, because they were newly scraped for retraining, or because they were included in a new downstream fine-tuning stage. We find that the amount and ease of PII memorization is a dynamic property of a model that evolves throughout training pipelines and depends on commonly altered design choices. We characterize three such novel phenomena: (1) similar-appearing PII seen later in training can elicit memorization of earlier-seen sequences in what we call assisted memorization, and this is a significant factor (in our settings, up to 1/3); (2) adding PII can increase memorization of other PII significantly (in our settings, as much as $\approx\!7.5\times$); and (3) removing PII can lead to other PII being memorized. Model creators should consider these first- and second-order privacy risks when training models to avoid the risk of new PII regurgitation.

Paperid: 738, https://arxiv.org/pdf/2502.10487.pdf

Abstract:
Evaluating the robustness of LLMs to adversarial attacks is crucial for safe deployment, yet current red-teaming methods are often prohibitively expensive. We compare the ability of fast proxy metrics to predict the real-world robustness of an LLM against a simulated attacker ensemble. This allows us to estimate a model's robustness to computationally expensive attacks without requiring runs of the attacks themselves. Specifically, we consider gradient-descent-based embedding-space attacks, prefilling attacks, and direct prompting. Even though direct prompting in particular does not achieve high ASR, we find that it and embedding-space attacks can predict attack success rates well, achieving $r_p=0.87$ (linear) and $r_s=0.94$ (Spearman rank) correlations with the full attack ensemble while reducing computational cost by three orders of magnitude.

Paperid: 739, https://arxiv.org/pdf/2502.05209.pdf

Abstract:
Evaluations of large language model (LLM) risks and capabilities are increasingly being incorporated into AI risk management and governance frameworks. Currently, most risk evaluations are conducted by designing inputs that elicit harmful behaviors from the system. However, this approach suffers from two limitations. First, input-output evaluations cannot fully evaluate realistic risks from open-weight models. Second, the behaviors identified during any particular input-output evaluation can only lower-bound the model's worst-possible-case input-output behavior. As a complementary method for eliciting harmful behaviors, we propose evaluating LLMs with model tampering attacks which allow for modifications to latent activations or weights. We pit state-of-the-art techniques for removing harmful LLM capabilities against a suite of 5 input-space and 6 model tampering attacks. In addition to benchmarking these methods against each other, we show that (1) model resilience to capability elicitation attacks lies on a low-dimensional robustness subspace; (2) the success rate of model tampering attacks can empirically predict and offer conservative estimates for the success of held-out input-space attacks; and (3) state-of-the-art unlearning methods can easily be undone within 16 steps of fine-tuning. Together, these results highlight the difficulty of suppressing harmful LLM capabilities and show that model tampering attacks enable substantially more rigorous evaluations than input-space attacks alone.

Paperid: 740, https://arxiv.org/pdf/2501.17750.pdf

Abstract:
Auditing algorithms' privacy typically involves simulating a game-based protocol that guesses which of two adjacent datasets was the original input. Traditional approaches require thousands of such simulations, leading to significant computational overhead. Recent methods propose single-run auditing of the target algorithm to address this, substantially reducing computational cost. However, these methods' general applicability and tightness in producing empirical privacy guarantees remain uncertain. This work studies such problems in detail. Our contributions are twofold: First, we introduce a unifying framework for privacy audits based on information-theoretic principles, modeling the audit as a bit transmission problem in a noisy channel. This formulation allows us to derive fundamental limits and develop an audit approach that yields tight privacy lower bounds for various DP protocols. Second, leveraging this framework, we demystify the method of privacy audit by one run, identifying the conditions under which single-run audits are feasible or infeasible. Our analysis provides general guidelines for conducting privacy audits and offers deeper insights into the privacy audit. Finally, through experiments, we demonstrate that our approach produces tighter privacy lower bounds on common differentially private mechanisms while requiring significantly fewer observations. We also provide a case study illustrating that our method successfully detects privacy violations in flawed implementations of private algorithms.

Paperid: 741, https://arxiv.org/pdf/2501.16843.pdf

Abstract:
Skeleton action recognition models have secured more attention than video-based ones in various applications due to privacy preservation and lower storage requirements. Skeleton data are typically transmitted to cloud servers for action recognition, with results returned to clients via Apps/APIs. However, the vulnerability of skeletal models against adversarial perturbations gradually reveals the unreliability of these systems. Existing black-box attacks all operate in a decision-based manner, resulting in numerous queries that hinder efficiency and feasibility in real-world applications. Moreover, all attacks off the shelf focus on only restricted perturbations, while ignoring model weaknesses when encountered with non-semantic perturbations. In this paper, we propose two query-effIcient Skeletal Adversarial AttaCks, ISAAC-K and ISAAC-N. As a black-box attack, ISAAC-K utilizes Grad-CAM in a surrogate model to extract key joints where minor sparse perturbations are then added to fool the classifier. To guarantee natural adversarial motions, we introduce constraints of both bone length and temporal consistency. ISAAC-K finds stronger adversarial examples on $\ell_\infty$ norm, which can encompass those on other norms. Exhaustive experiments substantiate that ISAAC-K can uplift the attack efficiency of the perturbations under 10 skeletal models. Additionally, as a byproduct, ISAAC-N fools the classifier by replacing skeletons unrelated to the action. We surprisingly find that skeletal models are vulnerable to large perturbations where the part-wise non-semantic joints are just replaced, leading to a query-free no-box attack without any prior knowledge. Based on that, four adaptive defenses are eventually proposed to improve the robustness of skeleton recognition models.

Paperid: 742, https://arxiv.org/pdf/2501.11942.pdf

Abstract:
In this paper, we introduce and implement BRC20 sniping attack. Our attack manipulates the BRC20 token transfers in open markets and disrupts the fairness among bidding participants. The long-standing principle of ``highest bidder wins'' is rendered ineffective. Typically, open BRC20 token markets rely on Partially Signed Bitcoin Transactions (PSBT) to broadcast selling intents and wait for buying auctions. Our attack targets the BRC20 buying process (i.e., transfer) by injecting a front-running transaction to complete the full signature of the PSBT. At its core, the attack exploits the mempool's fee-based transaction selection mechanism to snipe the victim transaction, replicate metadata, and front-run the legesmate transaction. This attack applies to platforms using PSBT for BRC20 token transfers, including popular Bitcoin exchanges and marketplaces (e.g., Magic Eden, Unisat, Gate.io, OKX). We implemented and tested the attack on a Bitcoin testnet (regtest), validating its effectiveness through multiple experimental rounds. Results show that the attacker consistently replaces legitimate transactions by submitting higher-fee PSBTs. We have also made responsible disclosures to the mentioned exchanges.

Paperid: 743, https://arxiv.org/pdf/2501.07058.pdf

Abstract:
Smart contract vulnerabilities caused significant economic losses in blockchain applications. Large Language Models (LLMs) provide new possibilities for addressing this time-consuming task. However, state-of-the-art LLM-based detection solutions are often plagued by high false-positive rates. In this paper, we push the boundaries of existing research in two key ways. First, our evaluation is based on Solidity v0.8, offering the most up-to-date insights compared to prior studies that focus on older versions (v0.4). Second, we leverage the latest five LLM models (across companies), ensuring comprehensive coverage across the most advanced capabilities in the field. We conducted a series of rigorous evaluations. Our experiments demonstrate that a well-designed prompt can reduce the false-positive rate by over 60%. Surprisingly, we also discovered that the recall rate for detecting some specific vulnerabilities in Solidity v0.8 has dropped to just 13% compared to earlier versions (i.e., v0.4). Further analysis reveals the root cause of this decline: the reliance of LLMs on identifying changes in newly introduced libraries and frameworks during detection.

Paperid: 744, https://arxiv.org/pdf/2506.16666.pdf

Abstract:
This paper systematizes research on auditing Differential Privacy (DP) techniques, aiming to identify key insights into the current state of the art and open challenges. First, we introduce a comprehensive framework for reviewing work in the field and establish three cross-contextual desiderata that DP audits should target--namely, efficiency, end-to-end-ness, and tightness. Then, we systematize the modes of operation of state-of-the-art DP auditing techniques, including threat models, attacks, and evaluation functions. This allows us to highlight key details overlooked by prior work, analyze the limiting factors to achieving the three desiderata, and identify open research problems. Overall, our work provides a reusable and systematic methodology geared to assess progress in the field and identify friction points and future directions for our community to focus on.

Paperid: 745, https://arxiv.org/pdf/2506.16447.pdf

Abstract:
Backdoor unalignment attacks against Large Language Models (LLMs) enable the stealthy compromise of safety alignment using a hidden trigger while evading normal safety auditing. These attacks pose significant threats to the applications of LLMs in the real-world Large Language Model as a Service (LLMaaS) setting, where the deployed model is a fully black-box system that can only interact through text. Furthermore, the sample-dependent nature of the attack target exacerbates the threat. Instead of outputting a fixed label, the backdoored LLM follows the semantics of any malicious command with the hidden trigger, significantly expanding the target space. In this paper, we introduce BEAT, a black-box defense that detects triggered samples during inference to deactivate the backdoor. It is motivated by an intriguing observation (dubbed the probe concatenate effect), where concatenated triggered samples significantly reduce the refusal rate of the backdoored LLM towards a malicious probe, while non-triggered samples have little effect. Specifically, BEAT identifies whether an input is triggered by measuring the degree of distortion in the output distribution of the probe before and after concatenation with the input. Our method addresses the challenges of sample-dependent targets from an opposite perspective. It captures the impact of the trigger on the refusal signal (which is sample-independent) instead of sample-specific successful attack behaviors. It overcomes black-box access limitations by using multiple sampling to approximate the output distribution. Extensive experiments are conducted on various backdoor attacks and LLMs (including the closed-source GPT-3.5-turbo), verifying the effectiveness and efficiency of our defense. Besides, we also preliminarily verify that BEAT can effectively defend against popular jailbreak attacks, as they can be regarded as 'natural backdoors'.

Paperid: 746, https://arxiv.org/pdf/2506.12519.pdf

Abstract:
As Artificial Intelligence (AI) continues to evolve, it has transitioned from a research-focused discipline to a widely adopted technology, enabling intelligent solutions across various sectors. In security, AI's role in strengthening organizational resilience has been studied for over two decades. While much attention has focused on AI's constructive applications, the increasing maturity and integration of AI have also exposed its darker potentials. This article explores two emerging AI-related threats and the interplay between them: AI as a target of attacks (`Adversarial AI') and AI as a means to launch attacks on any target (`Offensive AI') -- potentially even on another AI. By cutting through the confusion and explaining these threats in plain terms, we introduce the complex and often misunderstood interplay between Adversarial AI and Offensive AI, offering a clear and accessible introduction to the challenges posed by these threats.

Paperid: 747, https://arxiv.org/pdf/2506.08201.pdf

Abstract:
This monograph explores the design and analysis of correlated noise mechanisms for differential privacy (DP), focusing on their application to private training of AI and machine learning models via the core primitive of estimation of weighted prefix sums. While typical DP mechanisms inject independent noise into each step of a stochastic gradient (SGD) learning algorithm in order to protect the privacy of the training data, a growing body of recent research demonstrates that introducing (anti-)correlations in the noise can significantly improve privacy-utility trade-offs by carefully canceling out some of the noise added on earlier steps in subsequent steps. Such correlated noise mechanisms, known variously as matrix mechanisms, factorization mechanisms, and DP-Follow-the-Regularized-Leader (DP-FTRL) when applied to learning algorithms, have also been influential in practice, with industrial deployment at a global scale.

Paperid: 748, https://arxiv.org/pdf/2506.06549.pdf

Abstract:
Differentially private stochastic gradient descent (DP-SGD) is the most widely used method for training machine learning models with provable privacy guarantees. A key challenge in DP-SGD is setting the per-sample gradient clipping threshold, which significantly affects the trade-off between privacy and utility. While recent adaptive methods improve performance by adjusting this threshold during training, they operate in the standard coordinate system and fail to account for correlations across the coordinates of the gradient. We propose GeoClip, a geometry-aware framework that clips and perturbs gradients in a transformed basis aligned with the geometry of the gradient distribution. GeoClip adaptively estimates this transformation using only previously released noisy gradients, incurring no additional privacy cost. We provide convergence guarantees for GeoClip and derive a closed-form solution for the optimal transformation that minimizes the amount of noise added while keeping the probability of gradient clipping under control. Experiments on both tabular and image datasets demonstrate that GeoClip consistently outperforms existing adaptive clipping methods under the same privacy budget.

Paperid: 749, https://arxiv.org/pdf/2506.01342.pdf

Abstract:
Identifying the impact scope and scale is critical for software supply chain vulnerability assessment. However, existing studies face substantial limitations. First, prior studies either work at coarse package-level granularity, producing many false positives, or fail to accomplish whole-ecosystem vulnerability propagation analysis. Second, although vulnerability assessment indicators like CVSS characterize individual vulnerabilities, no metric exists to specifically quantify the dynamic impact of vulnerability propagation across software supply chains. To address these limitations and enable accurate and comprehensive vulnerability impact assessment, we propose a novel approach: (i) a hierarchical worklist-based algorithm for whole-ecosystem and call-graph-level vulnerability propagation analysis and (ii) the Vulnerability Propagation Scoring System (VPSS), a dynamic metric to quantify the scope and evolution of vulnerability impacts in software supply chains. We implement a prototype of our approach in the Java Maven ecosystem and evaluate it on 100 real-world vulnerabilities. Experimental results demonstrate that our approach enables effective ecosystem-wide vulnerability propagation analysis, and provides a practical, quantitative measure of vulnerability impact through VPSS.

Paperid: 750, https://arxiv.org/pdf/2505.24019.pdf

Abstract:
Large Language Model (LLM) agents show considerable promise for automating complex tasks using contextual reasoning; however, interactions involving multiple agents and the system's susceptibility to prompt injection and other forms of context manipulation introduce new vulnerabilities related to privacy leakage and system exploitation. This position paper argues that the well-established design principles in information security, which are commonly referred to as security principles, should be employed when deploying LLM agents at scale. Design principles such as defense-in-depth, least privilege, complete mediation, and psychological acceptability have helped guide the design of mechanisms for securing information systems over the last five decades, and we argue that their explicit and conscientious adoption will help secure agentic systems. To illustrate this approach, we introduce AgentSandbox, a conceptual framework embedding these security principles to provide safeguards throughout an agent's life-cycle. We evaluate with state-of-the-art LLMs along three dimensions: benign utility, attack utility, and attack success rate. AgentSandbox maintains high utility for its intended functions under both benign and adversarial evaluations while substantially mitigating privacy risks. By embedding secure design principles as foundational elements within emerging LLM agent protocols, we aim to promote trustworthy agent ecosystems aligned with user privacy expectations and evolving regulatory requirements.

Paperid: 751, https://arxiv.org/pdf/2505.16957.pdf

Abstract:
Large Language Models (LLMs) are increasingly equipped with capabilities of real-time web search and integrated with protocols like Model Context Protocol (MCP). This extension could introduce new security vulnerabilities. We present a systematic investigation of LLM vulnerabilities to hidden adversarial prompts through malicious font injection in external resources like webpages, where attackers manipulate code-to-glyph mapping to inject deceptive content which are invisible to users. We evaluate two critical attack scenarios: (1) "malicious content relay" and (2) "sensitive data leakage" through MCP-enabled tools. Our experiments reveal that indirect prompts with injected malicious font can bypass LLM safety mechanisms through external resources, achieving varying success rates based on data sensitivity and prompt design. Our research underscores the urgent need for enhanced security measures in LLM deployments when processing external content.

Paperid: 752, https://arxiv.org/pdf/2505.16765.pdf

Abstract:
Jailbreak attacks pose a serious threat to large language models (LLMs) by bypassing built-in safety mechanisms and leading to harmful outputs. Studying these attacks is crucial for identifying vulnerabilities and improving model security. This paper presents a systematic survey of jailbreak methods from the novel perspective of stealth. We find that existing attacks struggle to simultaneously achieve toxic stealth (concealing toxic content) and linguistic stealth (maintaining linguistic naturalness). Motivated by this, we propose StegoAttack, a fully stealthy jailbreak attack that uses steganography to hide the harmful query within benign, semantically coherent text. The attack then prompts the LLM to extract the hidden query and respond in an encrypted manner. This approach effectively hides malicious intent while preserving naturalness, allowing it to evade both built-in and external safety mechanisms. We evaluate StegoAttack on four safety-aligned LLMs from major providers, benchmarking against eight state-of-the-art methods. StegoAttack achieves an average attack success rate (ASR) of 92.00%, outperforming the strongest baseline by 11.0%. Its ASR drops by less than 1% even under external detection (e.g., Llama Guard). Moreover, it attains the optimal comprehensive scores on stealth detection metrics, demonstrating both high efficacy and exceptional stealth capabilities. The code is available at https://anonymous.4open.science/r/StegoAttack-Jail66

Paperid: 753, https://arxiv.org/pdf/2505.16559.pdf

Abstract:
Fine-tuning-as-a-service, while commercially successful for Large Language Model (LLM) providers, exposes models to harmful fine-tuning attacks. As a widely explored defense paradigm against such attacks, unlearning attempts to remove malicious knowledge from LLMs, thereby essentially preventing them from being used to perform malicious tasks. However, we highlight a critical flaw: the powerful general adaptability of LLMs allows them to easily bypass selective unlearning by rapidly relearning or repurposing their capabilities for harmful tasks. To address this fundamental limitation, we propose a paradigm shift: instead of selective removal, we advocate for inducing model collapse--effectively forcing the model to "unlearn everything"--specifically in response to updates characteristic of malicious adaptation. This collapse directly neutralizes the very general capabilities that attackers exploit, tackling the core issue unaddressed by selective unlearning. We introduce the Collapse Trap (CTRAP) as a practical mechanism to implement this concept conditionally. Embedded during alignment, CTRAP pre-configures the model's reaction to subsequent fine-tuning dynamics. If updates during fine-tuning constitute a persistent attempt to reverse safety alignment, the pre-configured trap triggers a progressive degradation of the model's core language modeling abilities, ultimately rendering it inert and useless for the attacker. Crucially, this collapse mechanism remains dormant during benign fine-tuning, ensuring the model's utility and general capabilities are preserved for legitimate users. Extensive empirical results demonstrate that CTRAP effectively counters harmful fine-tuning risks across various LLMs and attack settings, while maintaining high performance in benign scenarios. Our code is available at https://anonymous.4open.science/r/CTRAP.

Paperid: 754, https://arxiv.org/pdf/2505.12186.pdf

Abstract:
Harmful fine-tuning attacks pose a major threat to the security of large language models (LLMs), allowing adversaries to compromise safety guardrails with minimal harmful data. While existing defenses attempt to reinforce LLM alignment, they fail to address models' inherent "trainability" on harmful data, leaving them vulnerable to stronger attacks with increased learning rates or larger harmful datasets. To overcome this critical limitation, we introduce SEAM, a novel alignment-enhancing defense that transforms LLMs into self-destructive models with intrinsic resilience to misalignment attempts. Specifically, these models retain their capabilities for legitimate tasks while exhibiting substantial performance degradation when fine-tuned on harmful data. The protection is achieved through a novel loss function that couples the optimization trajectories of benign and harmful data, enhanced with adversarial gradient ascent to amplify the self-destructive effect. To enable practical training, we develop an efficient Hessian-free gradient estimate with theoretical error bounds. Extensive evaluation across LLMs and datasets demonstrates that SEAM creates a no-win situation for adversaries: the self-destructive models achieve state-of-the-art robustness against low-intensity attacks and undergo catastrophic performance collapse under high-intensity attacks, rendering them effectively unusable. (warning: this paper contains potentially harmful content generated by LLMs.)

Paperid: 755, https://arxiv.org/pdf/2505.07167.pdf

Abstract:
Large Language Models (LLMs) have been extensively used across diverse domains, including virtual assistants, automated code generation, and scientific research. However, they remain vulnerable to jailbreak attacks, which manipulate the models into generating harmful responses despite safety alignment. Recent studies have shown that current safety-aligned LLMs often undergo the shallow safety alignment, where the first few tokens largely determine whether the response will be harmful. Through comprehensive observations, we find that safety-aligned LLMs and various defense strategies generate highly similar initial tokens in their refusal responses, which we define as safety trigger tokens. Building on this insight, we propose \texttt{D-STT}, a simple yet effective defense algorithm that identifies and explicitly decodes safety trigger tokens of the given safety-aligned LLM to trigger the model's learned safety patterns. In this process, the safety trigger is constrained to a single token, which effectively preserves model usability by introducing minimum intervention in the decoding process. Extensive experiments across diverse jailbreak attacks and benign prompts demonstrate that \ours significantly reduces output harmfulness while preserving model usability and incurring negligible response time overhead, outperforming ten baseline methods.

Paperid: 756, https://arxiv.org/pdf/2504.14696.pdf

Abstract:
We introduce a differentially private (DP) algorithm called reveal-or-obscure (ROO) to generate a single representative sample from a dataset of $n$ observations drawn i.i.d. from an unknown discrete distribution $P$. Unlike methods that add explicit noise to the estimated empirical distribution, ROO achieves $Îµ$-differential privacy by randomly choosing whether to "reveal" or "obscure" the empirical distribution. While ROO is structurally identical to Algorithm 1 proposed by Cheu and Nayak (arXiv:2412.10512), we prove a strictly better bound on the sampling complexity than that established in Theorem 12 of (arXiv:2412.10512). To further improve the privacy-utility trade-off, we propose a novel generalized sampling algorithm called Data-Specific ROO (DS-ROO), where the probability of obscuring the empirical distribution of the dataset is chosen adaptively. We prove that DS-ROO satisfies $Îµ$-DP, and provide empirical evidence that DS-ROO can achieve better utility under the same privacy budget of vanilla ROO.

Paperid: 757, https://arxiv.org/pdf/2504.03238.pdf

Abstract:
Malware detection is increasingly challenged by evolving techniques like obfuscation and polymorphism, limiting the effectiveness of traditional methods. Meanwhile, the widespread adoption of software containers has introduced new security challenges, including the growing threat of malicious software injection, where a container, once compromised, can serve as entry point for further cyberattacks. In this work, we address these security issues by introducing a method to identify compromised containers through machine learning analysis of their file systems. We cast the entire software containers into large RGB images via their tarball representations, and propose to use established Convolutional Neural Network architectures on a streaming, patch-based manner. To support our experiments, we release the COSOCO dataset--the first of its kind--containing 3364 large-scale RGB images of benign and compromised software containers at https://huggingface.co/datasets/k3ylabs/cosoco-image-dataset. Our method detects more malware and achieves higher F1 and Recall scores than all individual and ensembles of VirusTotal engines, demonstrating its effectiveness and setting a new standard for identifying malware-compromised software containers.

Paperid: 758, https://arxiv.org/pdf/2503.10945.pdf

Abstract:
Current practices for reporting the level of differential privacy (DP) guarantees for machine learning (ML) algorithms provide an incomplete and potentially misleading picture of the guarantees and make it difficult to compare privacy levels across different settings. We argue for using Gaussian differential privacy (GDP) as the primary means of communicating DP guarantees in ML, with the full privacy profile as a secondary option in case GDP is too inaccurate. Unlike other widely used alternatives, GDP has only one parameter, which ensures easy comparability of guarantees, and it can accurately capture the full privacy profile of many important ML applications. To support our claims, we investigate the privacy profiles of state-of-the-art DP large-scale image classification, and the TopDown algorithm for the U.S. Decennial Census, observing that GDP fits the profiles remarkably well in all three cases. Although GDP is ideal for reporting the final guarantees, other formalisms (e.g., privacy loss random variables) are needed for accurate privacy accounting. We show that such intermediate representations can be efficiently converted to GDP with minimal loss in tightness.

Paperid: 759, https://arxiv.org/pdf/2502.07063.pdf

Abstract:
Zero-Knowledge Proofs (ZKPs) are a cryptographic primitive that allows a prover to demonstrate knowledge of a secret value to a verifier without revealing anything about the secret itself. ZKPs have shown to be an extremely powerful tool, as evidenced in both industry and academic settings. In recent years, the utilization of user data in practical applications has necessitated the rapid development of privacy-preserving techniques, including ZKPs. This has led to the creation of several robust open-source ZKP frameworks. However, there remains a significant gap in understanding the capabilities and real-world applications of these frameworks. Furthermore, identifying the most suitable frameworks for the developers' specific applications and settings is a challenge, given the variety of options available. The primary goal of our work is to lower the barrier to entry for understanding and building applications with open-source ZKP frameworks. In this work, we survey and evaluate 25 general-purpose, prominent ZKP frameworks. Recognizing that ZKPs have various constructions and underlying arithmetic schemes, our survey aims to provide a comprehensive overview of the ZKP landscape. These systems are assessed based on their usability and performance in SHA-256 and matrix multiplication experiments. Acknowledging that setting up a functional development environment can be challenging for these frameworks, we offer a fully open-source collection of Docker containers. These containers include a working development environment and are accompanied by documented code from our experiments. We conclude our work with a thorough analysis of the practical applications of ZKPs, recommendations for ZKP settings in different application scenarios, and a discussion on the future development of ZKP frameworks.

Paperid: 760, https://arxiv.org/pdf/2502.02068.pdf

Abstract:
This paper introduces RoSeMary, the first-of-its-kind ML/Crypto codesign watermarking framework that regulates LLM-generated code to avoid intellectual property rights violations and inappropriate misuse in software development. High-quality watermarks adhering to the detectability-fidelity-robustness tri-objective are limited due to codes' low-entropy nature. Watermark verification, however, often needs to reveal the signature and requires re-encoding new ones for code reuse, which potentially compromising the system's usability. To overcome these challenges, RoSeMary obtains high-quality watermarks by training the watermark insertion and extraction modules end-to-end to ensure (i) unaltered watermarked code functionality and (ii) enhanced detectability and robustness leveraging pre-trained CodeT5 as the insertion backbone to enlarge the code syntactic and variable rename transformation search space. In the deployment, RoSeMary uses zero-knowledge proofs for secure verification without revealing the underlying signatures. Extensive evaluations demonstrated RoSeMary achieves high detection accuracy while preserving the code functionality. RoSeMary is also robust against attacks and provides efficient secure watermark verification.

Paperid: 761, https://arxiv.org/pdf/2502.01569.pdf

Abstract:
The ongoing electrification of the transportation sector requires the deployment of multiple Electric Vehicle (EV) charging stations across multiple locations. However, the EV charging stations introduce significant cyber-physical and privacy risks, given the presence of vulnerable communication protocols, like the Open Charge Point Protocol (OCPP). Meanwhile, the Federated Learning (FL) paradigm showcases a novel approach for improved intrusion detection results that utilize multiple sources of Internet of Things data, while respecting the confidentiality of private information. This paper proposes the adoption of the FL architecture for the monitoring of the EV charging infrastructure and the detection of cyberattacks against the OCPP 1.6 protocol. The evaluation results showcase high detection performance of the proposed FL-based solution.

Paperid: 762, https://arxiv.org/pdf/2502.00653.pdf

Abstract:
While multimodal large language models (MLLMs) have achieved remarkable success in recent advancements, their susceptibility to jailbreak attacks has come to light. In such attacks, adversaries exploit carefully crafted prompts to coerce models into generating harmful or undesirable content. Existing defense mechanisms often rely on external inference steps or safety alignment training, both of which are less effective and impractical when facing sophisticated adversarial perturbations in white-box scenarios. To address these challenges and bolster MLLM robustness, we introduce SafeMLLM by adopting an adversarial training framework that alternates between an attack step for generating adversarial noise and a model updating step. At the attack step, SafeMLLM generates adversarial perturbations through a newly proposed contrastive embedding attack (CoE-Attack), which optimizes token embeddings under a contrastive objective. SafeMLLM then updates model parameters to neutralize the perturbation effects while preserving model utility on benign inputs. We evaluate SafeMLLM across six MLLMs and six jailbreak methods spanning multiple modalities. Experimental results show that SafeMLLM effectively defends against diverse attacks, maintaining robust performance and utilities.

Paperid: 763, https://arxiv.org/pdf/2501.19172.pdf

Abstract:
Recent advances in generative AI have opened promising avenues for steganography, which can securely protect sensitive information for individuals operating in hostile environments, such as journalists, activists, and whistleblowers. However, existing methods for generative steganography have significant limitations, particularly in scalability and their dependence on retraining diffusion models. We introduce PSyDUCK, a training-free, model-agnostic steganography framework specifically designed for latent diffusion models. PSyDUCK leverages controlled divergence and local mixing within the latent denoising process, enabling high-capacity, secure message embedding without compromising visual fidelity. Our method dynamically adapts embedding strength to balance accuracy and detectability, significantly improving upon existing pixel-space approaches. Crucially, PSyDUCK extends generative steganography to latent-space video diffusion models, surpassing previous methods in both encoding capacity and robustness. Extensive experiments demonstrate PSyDUCK's superiority over state-of-the-art techniques, achieving higher transmission accuracy and lower detectability rates across diverse image and video datasets. By overcoming the key challenges associated with latent diffusion model architectures, PSyDUCK sets a new standard for generative steganography, paving the way for scalable, real-world steganographic applications.

Paperid: 764, https://arxiv.org/pdf/2501.18780.pdf

Abstract:
Collision-resistant cryptographic hash functions (CRHs) are crucial for security, particularly for message authentication in Zero-knowledge Proof (ZKP) applications. However, traditional CRHs like SHA-2 or SHA-3, while optimized for CPUs, generate large circuits, rendering them inefficient in the ZK domain. Conversely, ZK-friendly hashes are designed for circuit efficiency but struggle on conventional hardware, often orders of magnitude slower than standard hashes due to their reliance on expensive finite field arithmetic. To bridge this performance gap, we present HashEmAll, a novel collection of FPGA-based realizations for three prominent ZK-friendly hashes: Griffin, Rescue-Prime, and Reinforced Concrete. Each offers distinct optimization profiles, with both area-optimized and latency-optimized variants available, allowing users to tailor hardware selection to specific application constraints regarding resource utilization and performance. Our extensive evaluation shows that latency-optimized HashEmAll designs outperform CPU implementations by at least $10 \times$, with the leading design achieving a $23 \times$ speedup. These gains are coupled with lower power consumption and compatibility with accessible FPGAs. Importantly, the highly parallel and pipelined architecture of HashEmAll enables significantly better practical scaling than CPU-based approaches towards building real-world ZKP applications, such as data commitments with Merkle Trees, by mitigating the hashing bottleneck for large trees. This highlights the suitability of HashEmAll for real-world ZKP applications involving large-scale data authentication. We also highlight the ability to translate the HashEmAll methodology to various ZK-friendly hash functions and different field sizes.

Paperid: 765, https://arxiv.org/pdf/2501.10627.pdf

Abstract:
The flexibility and complexity of IPv6 extension headers allow attackers to create covert channels or bypass security mechanisms, leading to potential data breaches or system compromises. The mature development of machine learning has become the primary detection technology option used to mitigate covert communication threats. However, the complexity of detecting covert communication, evolving injection techniques, and scarcity of data make building machine-learning models challenging. In previous related research, machine learning has shown good performance in detecting covert communications, but oversimplified attack scenario assumptions cannot represent the complexity of modern covert technologies and make it easier for machine learning models to detect covert communications. To bridge this gap, in this study, we analyzed the packet structure and network traffic behavior of IPv6, used encryption algorithms, and performed covert communication injection without changing network packet behavior to get closer to real attack scenarios. In addition to analyzing and injecting methods for covert communications, this study also uses comprehensive machine learning techniques to train the model proposed in this study to detect threats, including traditional decision trees such as random forests and gradient boosting, as well as complex neural network architectures such as CNNs and LSTMs, to achieve detection accuracy of over 90\%. This study details the methods used for dataset augmentation and the comparative performance of the applied models, reinforcing insights into the adaptability and resilience of the machine learning application in IPv6 covert communication. We further introduce a Generative AI-driven script refinement framework, leveraging prompt engineering as a preliminary exploration of how generative agents can assist in covert communication detection and model enhancement.

Paperid: 766, https://arxiv.org/pdf/2506.12553.pdf

Abstract:
Differential privacy (DP) is obtained by randomizing a data analysis algorithm, which necessarily introduces a tradeoff between its utility and privacy. Many DP mechanisms are built upon one of two underlying tools: Laplace and Gaussian additive noise mechanisms. We expand the search space of algorithms by investigating the Generalized Gaussian mechanism, which samples the additive noise term $x$ with probability proportional to $e^{-\frac{| x |}Ï^Î² }$ for some $Î²\geq 1$. The Laplace and Gaussian mechanisms are special cases of GG for $Î²=1$ and $Î²=2$, respectively. In this work, we prove that all members of the GG family satisfy differential privacy, and provide an extension of an existing numerical accountant (the PRV accountant) for these mechanisms. We show that privacy accounting for the GG Mechanism and its variants is dimension independent, which substantially improves computational costs of privacy accounting. We apply the GG mechanism to two canonical tools for private machine learning, PATE and DP-SGD; we show empirically that $Î²$ has a weak relationship with test-accuracy, and that generally $Î²=2$ (Gaussian) is nearly optimal. This provides justification for the widespread adoption of the Gaussian mechanism in DP learning, and can be interpreted as a negative result, that optimizing over $Î²$ does not lead to meaningful improvements in performance.

Paperid: 767, https://arxiv.org/pdf/2506.12274.pdf

Abstract:
Large Language Models (LLMs) have demonstrated remarkable capabilities across various domains. However, their potential to generate harmful responses has raised significant societal and regulatory concerns, especially when manipulated by adversarial techniques known as "jailbreak" attacks. Existing jailbreak methods typically involve appending carefully crafted prefixes or suffixes to malicious prompts in order to bypass the built-in safety mechanisms of these models. In this work, we identify a new vulnerability in which excessive linguistic complexity can disrupt built-in safety mechanisms-without the need for any added prefixes or suffixes-allowing attackers to elicit harmful outputs directly. We refer to this phenomenon as Information Overload. To automatically exploit this vulnerability, we propose InfoFlood, a jailbreak attack that transforms malicious queries into complex, information-overloaded queries capable of bypassing built-in safety mechanisms. Specifically, InfoFlood: (1) uses linguistic transformations to rephrase malicious queries, (2) identifies the root cause of failure when an attempt is unsuccessful, and (3) refines the prompt's linguistic structure to address the failure while preserving its malicious intent. We empirically validate the effectiveness of InfoFlood on four widely used LLMs-GPT-4o, GPT-3.5-turbo, Gemini 2.0, and LLaMA 3.1-by measuring their jailbreak success rates. InfoFlood consistently outperforms baseline attacks, achieving up to 3 times higher success rates across multiple jailbreak benchmarks. Furthermore, we demonstrate that commonly adopted post-processing defenses, including OpenAI's Moderation API, Perspective API, and SmoothLLM, fail to mitigate these attacks. This highlights a critical weakness in traditional AI safety guardrails when confronted with information overload-based jailbreaks.

Paperid: 768, https://arxiv.org/pdf/2506.10364.pdf

Abstract:
Large language models (LLMs) are increasingly fine-tuned on domain-specific datasets to support applications in fields such as healthcare, finance, and law. These fine-tuning datasets often have sensitive and confidential dataset-level properties -- such as patient demographics or disease prevalence -- that are not intended to be revealed. While prior work has studied property inference attacks on discriminative models (e.g., image classification models) and generative models (e.g., GANs for image data), it remains unclear if such attacks transfer to LLMs. In this work, we introduce PropInfer, a benchmark task for evaluating property inference in LLMs under two fine-tuning paradigms: question-answering and chat-completion. Built on the ChatDoctor dataset, our benchmark includes a range of property types and task configurations. We further propose two tailored attacks: a prompt-based generation attack and a shadow-model attack leveraging word frequency signals. Empirical evaluations across multiple pretrained LLMs show the success of our attacks, revealing a previously unrecognized vulnerability in LLMs.

Paperid: 769, https://arxiv.org/pdf/2506.09312.pdf

Abstract:
While location trajectories offer valuable insights, they also reveal sensitive personal information. Differential Privacy (DP) offers formal protection, but achieving a favourable utility-privacy trade-off remains challenging. Recent works explore deep learning-based generative models to produce synthetic trajectories. However, current models lack formal privacy guarantees and rely on conditional information derived from real data during generation. This work investigates the utility cost of enforcing DP in such models, addressing three research questions across two datasets and eleven utility metrics. (1) We evaluate how DP-SGD, the standard DP training method for deep learning, affects the utility of state-of-the-art generative models. (2) Since DP-SGD is limited to unconditional models, we propose a novel DP mechanism for conditional generation that provides formal guarantees and assess its impact on utility. (3) We analyse how model types - Diffusion, VAE, and GAN - affect the utility-privacy trade-off. Our results show that DP-SGD significantly impacts performance, although some utility remains if the datasets is sufficiently large. The proposed DP mechanism improves training stability, particularly when combined with DP-SGD, for unstable models such as GANs and on smaller datasets. Diffusion models yield the best utility without guarantees, but with DP-SGD, GANs perform best, indicating that the best non-private model is not necessarily optimal when targeting formal guarantees. In conclusion, DP trajectory generation remains a challenging task, and formal guarantees are currently only feasible with large datasets and in constrained use cases.

Paperid: 770, https://arxiv.org/pdf/2506.05074.pdf

Abstract:
A lack of accessible data has historically restricted malware analysis research, and practitioners have relied heavily on datasets provided by industry sources to advance. Existing public datasets are limited by narrow scope - most include files targeting a single platform, have labels supporting just one type of malware classification task, and make no effort to capture the evasive files that make malware detection difficult in practice. We present EMBER2024, a new dataset that enables holistic evaluation of malware classifiers. Created in collaboration with the authors of EMBER2017 and EMBER2018, the EMBER2024 dataset includes hashes, metadata, feature vectors, and labels for more than 3.2 million files from six file formats. Our dataset supports the training and evaluation of machine learning models on seven malware classification tasks, including malware detection, malware family classification, and malware behavior identification. EMBER2024 is the first to include a collection of malicious files that initially went undetected by a set of antivirus products, creating a "challenge" set to assess classifier performance against evasive malware. This work also introduces EMBER feature version 3, with added support for several new feature types. We are releasing the EMBER2024 dataset to promote reproducibility and empower researchers in the pursuit of new malware research topics.

Paperid: 771, https://arxiv.org/pdf/2506.02892.pdf

Abstract:
In this paper, we design and implement a web crawler system based on the Solana blockchain for the automated collection and analysis of market data for popular non-fungible tokens (NFTs) on the chain. Firstly, the basic information and transaction data of popular NFTs on the Solana chain are collected using the Selenium tool. Secondly, the transaction records of the Magic Eden trading market are thoroughly analyzed by combining them with the Scrapy framework to examine the price fluctuations and market trends of NFTs. In terms of data analysis, this paper employs time series analysis to examine the dynamics of the NFT market and seeks to identify potential price patterns. In addition, the risk and return of different NFTs are evaluated using the mean-variance optimization model, taking into account their characteristics, such as illiquidity and market volatility, to provide investors with data-driven portfolio recommendations. The experimental results show that the combination of crawler technology and financial analytics can effectively analyze NFT data on the Solana blockchain and provide timely market insights and investment strategies. This study provides a reference for further exploration in the field of digital currencies.

Paperid: 772, https://arxiv.org/pdf/2505.11884.pdf

Abstract:
Face recognition performance based on deep learning heavily relies on large-scale training data, which is often difficult to acquire in practical applications. To address this challenge, this paper proposes a GAN-based data augmentation method with three key contributions: (1) a residual-embedded generator to alleviate gradient vanishing/exploding problems, (2) an Inception ResNet-V1 based FaceNet discriminator for improved adversarial training, and (3) an end-to-end framework that jointly optimizes data generation and recognition performance. Experimental results demonstrate that our approach achieves stable training dynamics and significantly improves face recognition accuracy by 12.7% on the LFW benchmark compared to baseline methods, while maintaining good generalization capability with limited training samples.

Paperid: 773, https://arxiv.org/pdf/2505.08652.pdf

Abstract:
Blockchain is a type of decentralized distributed database. Unlike traditional relational database management systems, it does not require management or maintenance by a third party. All data management and update processes are open and transparent, solving the trust issues of centralized database management systems. Blockchain ensures network-wide consistency, consensus, traceability, and immutability. Under the premise of mutual distrust between nodes, blockchain technology integrates various technologies, such as P2P protocols, asymmetric encryption, consensus mechanisms, and chain structures. Data is distributed and stored across multiple nodes, maintained by all nodes, ensuring transaction data integrity, undeniability, and security. This facilitates trusted information sharing and supervision. The basic principles of blockchain form the foundation for all related research. Understanding the working principles is essential for further study of blockchain technology. There are many platforms based on blockchain technology, and they differ from one another. This paper will analyze the architecture of blockchain systems at each layer, focusing on the principles and technologies of blockchain platforms such as Bitcoin, Ethereum, and Hyperledger Fabric. The analysis will cover their scalability and security and highlight their similarities, differences, advantages, and disadvantages.

Paperid: 774, https://arxiv.org/pdf/2505.05816.pdf

Abstract:
We investigate privacy-preserving spectral clustering for community detection within stochastic block models (SBMs). Specifically, we focus on edge differential privacy (DP) and propose private algorithms for community recovery. Our work explores the fundamental trade-offs between the privacy budget and the accurate recovery of community labels. Furthermore, we establish information-theoretic conditions that guarantee the accuracy of our methods, providing theoretical assurances for successful community recovery under edge DP.

Paperid: 775, https://arxiv.org/pdf/2505.01536.pdf

Abstract:
Disassembly is the first step of a variety of binary analysis and transformation techniques, such as reverse engineering, or binary rewriting. Recent disassembly approaches consist of three phases: an exploration phase, that overapproximates the binary's code; an analysis phase, that assigns weights to candidate instructions or basic blocks; and a conflict resolution phase, that downselects the final set of instructions. We present a disassembly algorithm that generalizes this pattern for a wide range of architectures, namely x86, x64, arm32, and aarch64. Our algorithm presents a novel conflict resolution method that reduces disassembly to weighted interval scheduling.

Paperid: 776, https://arxiv.org/pdf/2504.21042.pdf

Abstract:
The growing adoption of artificial intelligence (AI) has amplified concerns about trustworthiness, including integrity, privacy, robustness, and bias. To assess and attribute these threats, we propose ConceptLens, a generic framework that leverages pre-trained multimodal models to identify the root causes of integrity threats by analyzing Concept Shift in probing samples. ConceptLens demonstrates strong detection performance for vanilla data poisoning attacks and uncovers vulnerabilities to bias injection, such as the generation of covert advertisements through malicious concept shifts. It identifies privacy risks in unaltered but high-risk samples, filters them before training, and provides insights into model weaknesses arising from incomplete or imbalanced training data. Additionally, at the model level, it attributes concepts that the target model is overly dependent on, identifies misleading concepts, and explains how disrupting key concepts negatively impacts the model. Furthermore, it uncovers sociological biases in generative content, revealing disparities across sociological contexts. Strikingly, ConceptLens reveals how safe training and inference data can be unintentionally and easily exploited, potentially undermining safety alignment. Our study informs actionable insights to breed trust in AI systems, thereby speeding adoption and driving greater innovation.

Paperid: 777, https://arxiv.org/pdf/2504.18015.pdf

Abstract:
Face recognition technology presents serious privacy risks due to its reliance on sensitive and immutable biometric data. To address these concerns, such systems typically convert raw facial images into embeddings, which are traditionally viewed as privacy-preserving. However, model inversion attacks challenge this assumption by reconstructing private facial images from embeddings, highlighting a critical vulnerability in face recognition systems. Most existing inversion methods require training a separate generator for each target model, making them computationally intensive. In this work, we introduce DiffUMI, a diffusion-based universal model inversion attack that requires no additional training. DiffUMI is the first approach to successfully leverage unconditional face generation without relying on model-specific generators. It surpasses state-of-the-art attacks by 15.5% and 9.82% in success rate on standard and privacy-preserving face recognition systems, respectively. Furthermore, we propose a novel use of out-of-domain detection (OODD), demonstrating for the first time that model inversion can differentiate between facial and non-facial embeddings using only the embedding space.

Paperid: 778, https://arxiv.org/pdf/2504.13398.pdf

Abstract:
Most blockchains cannot hide the binary code of programs (i.e., smart contracts) running on them. To conceal proprietary business logic and to potentially deter attacks, many smart contracts are closed-source and employ layers of obfuscation. However, we demonstrate that such obfuscation can obscure critical vulnerabilities rather than enhance security, a phenomenon we term insecurity through obscurity. To systematically analyze these risks on a large scale, we present SKANF, a novel EVM bytecode analysis tool tailored for closed-source and obfuscated contracts. SKANF combines control-flow deobfuscation, symbolic execution, and concolic execution based on historical transactions to identify and exploit asset management vulnerabilities. Our evaluation on real-world Maximal Extractable Value (MEV) bots reveals that SKANF detects vulnerabilities in 1,030 contracts and successfully generates exploits for 394 of them, with potential losses of \$10.6M. Additionally, we uncover 104 real-world MEV bot attacks that collectively resulted in \$2.76M in losses.

Paperid: 779, https://arxiv.org/pdf/2504.12720.pdf

Abstract:
With the booming development of blockchain technology, smart contracts have been widely used in finance, supply chain, Internet of things and other fields in recent years. However, the security problems of smart contracts become increasingly prominent. Security events caused by smart contracts occur frequently, and the existence of malicious codes may lead to the loss of user assets and system crash. In this paper, a simple study is carried out on malicious code detection of intelligent contracts based on machine learning. The main research work and achievements are as follows: Feature extraction and vectorization of smart contract are the first step to detect malicious code of smart contract by using machine learning method, and feature processing has an important impact on detection results. In this paper, an opcode vectorization method based on smart contract text is adopted. Based on considering the structural characteristics of contract opcodes, the opcodes are classified and simplified. Then, N-Gram (N=2) algorithm and TF-IDF algorithm are used to convert the simplified opcodes into vectors, and then put into the machine learning model for training. In contrast, N-Gram algorithm and TF-IDF algorithm are directly used to quantify opcodes and put into the machine learning model training. Judging which feature extraction method is better according to the training results. Finally, the classifier chain is applied to the intelligent contract malicious code detection.

Paperid: 780, https://arxiv.org/pdf/2504.09363.pdf

Abstract:
Automatic generation control (AGC) systems play a crucial role in maintaining system frequency across power grids. However, AGC systems' reliance on communicated measurements exposes them to false data injection attacks (FDIAs), which can compromise the overall system stability. This paper proposes a machine learning (ML)-based detection framework that identifies FDIAs and determines the compromised measurements. The approach utilizes an ML model trained offline to accurately detect attacks and classify the manipulated signals based on a comprehensive set of statistical and time-series features extracted from AGC measurements before and after disturbances. For the proposed approach, we compare the performance of several powerful ML algorithms. Our results demonstrate the efficacy of the proposed method in detecting FDIAs while maintaining a low false alarm rate, with an F1-score of up to 99.98%, outperforming existing approaches.

Paperid: 781, https://arxiv.org/pdf/2504.03735.pdf

Abstract:
Multimodal Language Models (MMLMs) typically undergo post-training alignment to prevent harmful content generation. However, these alignment stages focus primarily on the assistant role, leaving the user role unaligned, and stick to a fixed input prompt structure of special tokens, leaving the model vulnerable when inputs deviate from these expectations. We introduce Role-Modality Attacks (RMA), a novel class of adversarial attacks that exploit role confusion between the user and assistant and alter the position of the image token to elicit harmful outputs. Unlike existing attacks that modify query content, RMAs manipulate the input structure without altering the query itself. We systematically evaluate these attacks across multiple Vision Language Models (VLMs) on eight distinct settings, showing that they can be composed to create stronger adversarial prompts, as also evidenced by their increased projection in the negative refusal direction in the residual stream, a property observed in prior successful attacks. Finally, for mitigation, we propose an adversarial training approach that makes the model robust against input prompt perturbations. By training the model on a range of harmful and benign prompts all perturbed with different RMA settings, it loses its sensitivity to Role Confusion and Modality Manipulation attacks and is trained to only pay attention to the content of the query in the input prompt structure, effectively reducing Attack Success Rate (ASR) while preserving the model's general utility.

Paperid: 782, https://arxiv.org/pdf/2503.13255.pdf

Abstract:
Federated learning (FL) enables multiple participants to collaboratively train machine learning models while ensuring their data remains private and secure. Blockchain technology further enhances FL by providing stronger security, a transparent audit trail, and protection against data tampering and model manipulation. Most blockchain-secured FL systems rely on conventional consensus mechanisms: Proof-of-Work (PoW) is computationally expensive, while Proof-of-Stake (PoS) improves energy efficiency but risks centralization as it inherently favors participants with larger stakes. Recently, learning-based consensus has emerged as an alternative by replacing cryptographic tasks with model training to save energy. However, this approach introduces potential privacy vulnerabilities, as the training process may inadvertently expose sensitive information through gradient sharing and model updates. To address these challenges, we propose a novel Zero-Knowledge Proof of Training (ZKPoT) consensus mechanism. This method leverages the zero-knowledge succinct non-interactive argument of knowledge proof (zk-SNARK) protocol to validate participants' contributions based on their model performance, effectively eliminating the inefficiencies of traditional consensus methods and mitigating the privacy risks posed by learning-based consensus. We analyze our system's security, demonstrating its capacity to prevent the disclosure of sensitive information about local models or training data to untrusted parties during the entire FL process. Extensive experiments demonstrate that our system is robust against privacy and Byzantine attacks while maintaining accuracy and utility without trade-offs, scalable across various blockchain settings, and efficient in both computation and communication.

Paperid: 783, https://arxiv.org/pdf/2503.09712.pdf

Abstract:
Time series classification (TSC) is a cornerstone of modern web applications, powering tasks such as financial data analysis, network traffic monitoring, and user behavior analysis. In recent years, deep neural networks (DNNs) have greatly enhanced the performance of TSC models in these critical domains. However, DNNs are vulnerable to backdoor attacks, where attackers can covertly implant triggers into models to induce malicious outcomes. Existing backdoor attacks targeting DNN-based TSC models remain elementary. In particular, early methods borrow trigger designs from computer vision, which are ineffective for time series data. More recent approaches utilize generative models for trigger generation, but at the cost of significant computational complexity. In this work, we analyze the limitations of existing attacks and introduce an enhanced method, FreqBack. Drawing inspiration from the fact that DNN models inherently capture frequency domain features in time series data, we identify that improper perturbations in the frequency domain are the root cause of ineffective attacks. To address this, we propose to generate triggers both effectively and efficiently, guided by frequency analysis. FreqBack exhibits substantial performance across five models and eight datasets, achieving an impressive attack success rate of over 90%, while maintaining less than a 3% drop in model accuracy on clean data.

Paperid: 784, https://arxiv.org/pdf/2503.08976.pdf

Abstract:
Federated Ranking Learning (FRL) is a state-of-the-art FL framework that stands out for its communication efficiency and resilience to poisoning attacks. It diverges from the traditional FL framework in two ways: 1) it leverages discrete rankings instead of gradient updates, significantly reducing communication costs and limiting the potential space for malicious updates, and 2) it uses majority voting on the server side to establish the global ranking, ensuring that individual updates have minimal influence since each client contributes only a single vote. These features enhance the system's scalability and position FRL as a promising paradigm for FL training. However, our analysis reveals that FRL is not inherently robust, as certain edges are particularly vulnerable to poisoning attacks. Through a theoretical investigation, we prove the existence of these vulnerable edges and establish a lower bound and an upper bound for identifying them in each layer. Based on this finding, we introduce a novel local model poisoning attack against FRL, namely the Vulnerable Edge Manipulation (VEM) attack. The VEM attack focuses on identifying and perturbing the most vulnerable edges in each layer and leveraging an optimization-based approach to maximize the attack's impact. Through extensive experiments on benchmark datasets, we demonstrate that our attack achieves an overall 53.23% attack impact and is 3.7x more impactful than existing methods. Our findings highlight significant vulnerabilities in ranking-based FL systems and underline the urgency for the development of new robust FL frameworks.

Paperid: 785, https://arxiv.org/pdf/2503.06495.pdf

Abstract:
As malware detection evolves, attackers adopt sophisticated evasion tactics. Traditional file-level fingerprinting, such as cryptographic and fuzzy hashes, is often overlooked as a target for evasion. Malware variants exploit minor binary modifications to bypass detection, as seen in Microsoft's discovery of GoldMax variations (2020-2021). However, no large-scale empirical studies have assessed the limitations of traditional fingerprinting methods on real-world malware samples or explored improvements. This paper fills this gap by addressing three key questions: (a) How prevalent are file variants in malware samples? Analyzing 4 million Windows Portable Executable (PE) files, 21 million sections, and 48 million resources, we find up to 80% deep structural similarities, including common APIs and executable sections. (b) What evasion techniques are used? We identify resilient fingerprints (clusters of malware variants with high similarity) validated via VirusTotal. Our analysis reveals non-functional mutations, such as altered section numbers, virtual sizes, and section names, as primary evasion tactics. We also classify two key section types: malicious sections (high entropy >5) and camouflage sections (entropy = 0). (c) How can fingerprinting be improved? We propose two novel approaches that enhance detection, improving identification rates from 20% (traditional methods) to over 50% using our refined fingerprinting techniques. Our findings highlight the limitations of existing methods and propose new strategies to strengthen malware fingerprinting against evolving threats.

Paperid: 786, https://arxiv.org/pdf/2503.04404.pdf

Abstract:
This paper investigates the temporal analysis of NetFlow datasets for machine learning (ML)-based network intrusion detection systems (NIDS). Although many previous studies have highlighted the critical role of temporal features, such as inter-packet arrival time and flow length/duration, in NIDS, the currently available NetFlow datasets for NIDS lack these temporal features. This study addresses this gap by creating and making publicly available a set of NetFlow datasets that incorporate these temporal features [1]. With these temporal features, we provide a comprehensive temporal analysis of NetFlow datasets by examining the distribution of various features over time and presenting time-series representations of NetFlow features. This temporal analysis has not been previously provided in the existing literature. We also borrowed an idea from signal processing, time frequency analysis, and tested it to see how different the time frequency signal presentations (TFSPs) are for various attacks. The results indicate that many attacks have unique patterns, which could help ML models to identify them more easily.

Paperid: 787, https://arxiv.org/pdf/2503.02281.pdf

Abstract:
The increasing adoption of Electric Vehicles (EVs) and the expansion of charging infrastructure and their reliance on communication expose Electric Vehicle Supply Equipment (EVSE) to cyberattacks. This paper presents a novel Kolmogorov-Arnold Network (KAN)-based framework for detecting cyberattacks on EV chargers using only power consumption measurements. Leveraging the KAN's capability to model nonlinear, high-dimensional functions and its inherently interpretable architecture, the framework effectively differentiates between normal and malicious charging scenarios. The model is trained offline on a comprehensive dataset containing over 100,000 cyberattack cases generated through an experimental setup. Once trained, the KAN model can be deployed within individual chargers for real-time detection of abnormal charging behaviors indicative of cyberattacks. Our results demonstrate that the proposed KAN-based approach can accurately detect cyberattacks on EV chargers with Precision and F1-score of 99% and 92%, respectively, outperforming existing detection methods. Additionally, the proposed KANs's enable the extraction of mathematical formulas representing KAN's detection decisions, addressing interpretability, a key challenge in deep learning-based cybersecurity frameworks. This work marks a significant step toward building secure and explainable EV charging infrastructure.

Paperid: 788, https://arxiv.org/pdf/2502.20995.pdf

Abstract:
With the growing adoption of retrieval-augmented generation (RAG) systems, various attack methods have been proposed to degrade their performance. However, most existing approaches rely on unrealistic assumptions in which external attackers have access to internal components such as the retriever. To address this issue, we introduce a realistic black-box attack based on the RAG paradox, a structural vulnerability arising from the system's effort to enhance trust by revealing both the retrieved documents and their sources to users. This transparency enables attackers to observe which sources are used and how information is phrased, allowing them to craft poisoned documents that are more likely to be retrieved and upload them to the identified sources. Moreover, as RAG systems directly provide retrieved content to users, these documents must not only be retrievable but also appear natural and credible to maintain user confidence in the search results. Unlike prior work that focuses solely on improving document retrievability, our attack method explicitly considers both retrievability and user trust in the retrieved content. Both offline and online experiments demonstrate that our method significantly degrades system performance without internal access, while generating natural-looking poisoned documents.

Paperid: 789, https://arxiv.org/pdf/2502.14828.pdf

Abstract:
LLM developers have imposed technical interventions to prevent fine-tuning misuse attacks, attacks where adversaries evade safeguards by fine-tuning the model using a public API. Previous work has established several successful attacks against specific fine-tuning API defences. In this work, we show that defences of fine-tuning APIs that seek to detect individual harmful training or inference samples ('pointwise' detection) are fundamentally limited in their ability to prevent fine-tuning attacks. We construct 'pointwise-undetectable' attacks that repurpose entropy in benign model outputs (e.g. semantic or syntactic variations) to covertly transmit dangerous knowledge. Our attacks are composed solely of unsuspicious benign samples that can be collected from the model before fine-tuning, meaning training and inference samples are all individually benign and low-perplexity. We test our attacks against the OpenAI fine-tuning API, finding they succeed in eliciting answers to harmful multiple-choice questions, and that they evade an enhanced monitoring system we design that successfully detects other fine-tuning attacks. We encourage the community to develop defences that tackle the fundamental limitations we uncover in pointwise fine-tuning API defences.

Paperid: 790, https://arxiv.org/pdf/2502.02759.pdf

Abstract:
Determining the family to which a malicious file belongs is an essential component of cyberattack investigation, attribution, and remediation. Performing this task manually is time consuming and requires expert knowledge. Automated tools using that label malware using antivirus detections lack accuracy and/or scalability, making them insufficient for real-world applications. Three pervasive shortcomings in these tools are responsible: (1) incorrect parsing of antivirus detections, (2) errors during family alias resolution, and (3) an inappropriate antivirus aggregation strategy. To address each of these, we created our own malware family labeling tool called ClarAVy. ClarAVy utilizes a Variational Bayesian approach to aggregate detections from a collection of antivirus products into accurate family labels. Our tool scales to enormous malware datasets, and we evaluated it by labeling $\approx$40 million malicious files. ClarAVy has 8 and 12 percentage points higher accuracy than the prior leading tool in labeling the MOTIF and MalPedia datasets, respectively.

Paperid: 791, https://arxiv.org/pdf/2502.00646.pdf

Abstract:
Time Series Classification (TSC) is highly vulnerable to backdoor attacks, posing significant security threats. Existing methods primarily focus on data poisoning during the training phase, designing sophisticated triggers to improve stealthiness and attack success rate (ASR). However, in practical scenarios, attackers often face restrictions in accessing training data. Moreover, it is a challenge for the model to maintain generalization ability on clean test data while remaining vulnerable to poisoned inputs when data is inaccessible. To address these challenges, we propose TrojanTime, a novel two-step training algorithm. In the first stage, we generate a pseudo-dataset using an external arbitrary dataset through target adversarial attacks. The clean model is then continually trained on this pseudo-dataset and its poisoned version. To ensure generalization ability, the second stage employs a carefully designed training strategy, combining logits alignment and batch norm freezing. We evaluate TrojanTime using five types of triggers across four TSC architectures in UCR benchmark datasets from diverse domains. The results demonstrate the effectiveness of TrojanTime in executing backdoor attacks while maintaining clean accuracy. Finally, to mitigate this threat, we propose a defensive unlearning strategy that effectively reduces the ASR while preserving clean accuracy.

Paperid: 792, https://arxiv.org/pdf/2501.15101.pdf

Abstract:
The exploration of backdoor vulnerabilities in object detectors, particularly in real-world scenarios, remains limited. A significant challenge lies in the absence of a natural physical backdoor dataset, and constructing such a dataset is both time- and labor-intensive. In this work, we address this gap by creating a large-scale dataset comprising approximately 11,800 images/frames with annotations featuring natural objects (e.g., T-shirts and hats) as triggers to incur cloaking adversarial effects in diverse real-world scenarios. This dataset is tailored for the study of physical backdoors in object detectors. Leveraging this dataset, we conduct a comprehensive evaluation of an insidious cloaking backdoor effect against object detectors, wherein the bounding box around a person vanishes when the individual is near a natural object (e.g., a commonly available T-shirt) in front of the detector. Our evaluations encompass three prevalent attack surfaces: data outsourcing, model outsourcing, and the use of pretrained models. The cloaking effect is successfully implanted in object detectors across all three attack surfaces. We extensively evaluate four popular object detection algorithms (anchor-based Yolo-V3, Yolo-V4, Faster R-CNN, and anchor-free CenterNet) using 19 videos (totaling approximately 11,800 frames) in real-world scenarios. Our results demonstrate that the backdoor attack exhibits remarkable robustness against various factors, including movement, distance, angle, non-rigid deformation, and lighting. In data and model outsourcing scenarios, the attack success rate (ASR) in most videos reaches 100% or near it, while the clean data accuracy of the backdoored model remains indistinguishable from that of the clean model, making it impossible to detect backdoor behavior through a validation set.

Paperid: 793, https://arxiv.org/pdf/2501.13351.pdf

Abstract:
Deceptive patterns (DPs) are user interface designs deliberately crafted to manipulate users into unintended decisions, often by exploiting cognitive biases for the benefit of companies or services. While numerous studies have explored ways to identify these deceptive patterns, many existing solutions require significant human intervention and struggle to keep pace with the evolving nature of deceptive designs. To address these challenges, we expanded the deceptive pattern taxonomy from security and privacy perspectives, refining its categories and scope. We created a comprehensive dataset of deceptive patterns by integrating existing small-scale datasets with new samples, resulting in 6,725 images and 10,421 DP instances from mobile apps and websites. We then developed DPGuard, a novel automatic tool leveraging commercial multimodal large language models (MLLMs) for deceptive pattern detection. Experimental results show that DPGuard outperforms state-of-the-art methods. Finally, we conducted an extensive empirical evaluation on 2,000 popular mobile apps and websites, revealing that 23.61% of mobile screenshots and 47.27% of website screenshots feature at least one deceptive pattern instance. Through four unexplored case studies that inform security implications, we highlight the critical importance of the unified taxonomy in addressing the growing challenges of Internet deception.

Paperid: 794, https://arxiv.org/pdf/2501.10800.pdf

Abstract:
We discuss the ``Infinitely Many Paraphrases'' attacks (IMP), a category of jailbreaks that leverages the increasing capabilities of a model to handle paraphrases and encoded communications to bypass their defensive mechanisms. IMPs' viability pairs and grows with a model's capabilities to handle and bind the semantics of simple mappings between tokens and work extremely well in practice, posing a concrete threat to the users of the most powerful LLMs in commerce. We show how one can bypass the safeguards of the most powerful open- and closed-source LLMs and generate content that explicitly violates their safety policies. One can protect against IMPs by improving the guardrails and making them scale with the LLMs' capabilities. For two categories of attacks that are straightforward to implement, i.e., bijection and encoding, we discuss two defensive strategies, one in token and the other in embedding space. We conclude with some research questions we believe should be prioritised to enhance the defensive mechanisms of LLMs and our understanding of their safety.

Paperid: 795, https://arxiv.org/pdf/2501.06912.pdf

Abstract:
The proliferation of mobile devices and online interactions have been threatened by different cyberattacks, where phishing attacks and malicious Uniform Resource Locators (URLs) pose significant risks to user security. Traditional phishing URL detection methods primarily rely on URL string-based features, which attackers often manipulate to evade detection. To address these limitations, we propose a novel graph-based machine learning model for phishing URL detection, integrating both URL structure and network-level features such as IP addresses and authoritative name servers. Our approach leverages Loopy Belief Propagation (LBP) with an enhanced convergence strategy to enable effective message passing and stable classification in the presence of complex graph structures. Additionally, we introduce a refined edge potential mechanism that dynamically adapts based on entity similarity and label relationships to further improve classification accuracy. Comprehensive experiments on real-world datasets demonstrate our model's effectiveness by achieving F1 score of up to 98.77\%. This robust and reproducible method advances phishing detection capabilities, offering enhanced reliability and valuable insights in the field of cybersecurity.

Paperid: 796, https://arxiv.org/pdf/2501.06798.pdf

Abstract:
This study reveals the vulnerabilities of Wireless Local Area Networks (WLAN) sensing, under the scope of joint communication and sensing (JCAS), focusing on target spoofing and deceptive jamming techniques. We use orthogonal frequency-division multiplexing (OFDM) to explore how adversaries can exploit WLAN's sensing capabilities to inject false targets and disrupt normal operations. Unlike traditional methods that require sophisticated digital radio-frequency memory hardware, we demonstrate that much simpler software-defined radios can effectively serve as deceptive jammers in WLAN settings. Through comprehensive modeling and practical experiments, we show how deceptive jammers can manipulate the range-Doppler map (RDM) by altering signal integrity, thereby posing significant security threats to OFDM-based JCAS systems. Our findings comprehensively evaluate jammer impact on RDMs and propose several jamming strategies that vary in complexity and detectability.

Paperid: 797, https://arxiv.org/pdf/2501.05542.pdf

Abstract:
This study demonstrates a novel approach to testing the security boundaries of Vision-Large Language Model (VLM/ LLM) using the EICAR test file embedded within JPEG images. We successfully executed four distinct protocols across multiple LLM platforms, including OpenAI GPT-4o, Microsoft Copilot, Google Gemini 1.5 Pro, and Anthropic Claude 3.5 Sonnet. The experiments validated that a modified JPEG containing the EICAR signature could be uploaded, manipulated, and potentially executed within LLM virtual workspaces. Key findings include: 1) consistent ability to mask the EICAR string in image metadata without detection, 2) successful extraction of the test file using Python-based manipulation within LLM environments, and 3) demonstration of multiple obfuscation techniques including base64 encoding and string reversal. This research extends Microsoft Research's "Penetration Testing Rules of Engagement" framework to evaluate cloud-based generative AI and LLM security boundaries, particularly focusing on file handling and execution capabilities within containerized environments.

Paperid: 798, https://arxiv.org/pdf/2506.20488.pdf

Abstract:
The rapid advancement of 6G wireless networks, IoT, and edge computing has significantly expanded the cyberattack surface, necessitating more intelligent and adaptive vulnerability detection mechanisms. Traditional security methods, while foundational, struggle with zero-day exploits, adversarial threats, and context-dependent vulnerabilities in highly dynamic network environments. Generative AI (GAI) emerges as a transformative solution, leveraging synthetic data generation, multimodal reasoning, and adaptive learning to enhance security frameworks. This paper explores the integration of GAI-powered vulnerability detection in 6G wireless networks, focusing on code auditing, protocol security, cloud-edge defenses, and hardware protection. We introduce a three-layer framework comprising the Technology Layer, Capability Layer, and Application Layer to systematically analyze the role of VAEs, GANs, LLMs, and GDMs in securing next-generation wireless ecosystems. To demonstrate practical implementation, we present a case study on LLM-driven code vulnerability detection, highlighting its effectiveness, performance, and challenges. Finally, we outline future research directions, including lightweight models, high-authenticity data generation, external knowledge integration, and privacy-preserving technologies. By synthesizing current advancements and open challenges, this work provides a roadmap for researchers and practitioners to harness GAI for building resilient and adaptive security solutions in 6G networks.

Paperid: 799, https://arxiv.org/pdf/2506.15674.pdf

Abstract:
We study privacy leakage in the reasoning traces of large reasoning models used as personal agents. Unlike final outputs, reasoning traces are often assumed to be internal and safe. We challenge this assumption by showing that reasoning traces frequently contain sensitive user data, which can be extracted via prompt injections or accidentally leak into outputs. Through probing and agentic evaluations, we demonstrate that test-time compute approaches, particularly increased reasoning steps, amplify such leakage. While increasing the budget of those test-time compute approaches makes models more cautious in their final answers, it also leads them to reason more verbosely and leak more in their own thinking. This reveals a core tension: reasoning improves utility but enlarges the privacy attack surface. We argue that safety efforts must extend to the model's internal thinking, not just its outputs.

Paperid: 800, https://arxiv.org/pdf/2506.01462.pdf

Abstract:
We study the economics of transaction reverts on Ethereum rollups and show that they are not accidental failures but equilibrium outcomes of MEV strategies. Using execution traces from major L2s, we find that over 80% of reverted transactions are swaps, with half targeting USDC-WETH pools on Uniswap v3, v4. Clustering reveals distinct bot archetypes, including split-trade arbitrageurs, atomic duplicators, and end-of-block spammers, demonstrating that reverts follow systematic patterns rather than random noise. Empirically, we show that priority fee auctions on rollups do not allocate blockspace efficiently: transaction placement is mis-ordered, round-number bidding dominates, and duplication spam inflates base fees. As a result, reverted transactions contribute disproportionately more to sequencer fee revenues than to gas consumption, shifting welfare from users to sequencers. To explain these dynamics, we develop a model proving that trade-splitting and duplication strictly dominate one-shot execution under convex adversarial loss. Our findings establish reverts as a structural feature of rollup MEV microstructure and highlight the need for protocol-level reforms to sequencing, fee markets, and revert protection.

Paperid: 801, https://arxiv.org/pdf/2506.00500.pdf

Abstract:
Ethereum's scalability limitations pose significant challenges for the adoption of decentralized applications (dApps). Zero-Knowledge Rollups (ZK Rollups) present a promising solution, bundling transactions off-chain and submitting validity proofs on-chain to enhance throughput and efficiency. In this work, we examine the technical underpinnings of ZK Rollups and stress test their performance in real-world applications in decentralized finance (DeFi). We set up a proof-of-concept (PoC) consisting of ZK rollup and decentralized exchange, and implement load balancer generating token swaps. Our results show that the rollup can process up to 71 swap transactions per second, compared to 12 general transaction by Ethereum. We further analyze transaction finality trade-offs with related security concerns, and discuss the future directions for integrating ZK Rollups into Ethereum's broader ecosystem.

Paperid: 802, https://arxiv.org/pdf/2505.22037.pdf

Abstract:
Large language models (LLMs) are rapidly deployed in critical applications, raising urgent needs for robust safety benchmarking. We propose Jailbreak Distillation (JBDistill), a novel benchmark construction framework that "distills" jailbreak attacks into high-quality and easily-updatable safety benchmarks. JBDistill utilizes a small set of development models and existing jailbreak attack algorithms to create a candidate prompt pool, then employs prompt selection algorithms to identify an effective subset of prompts as safety benchmarks. JBDistill addresses challenges in existing safety evaluation: the use of consistent evaluation prompts across models ensures fair comparisons and reproducibility. It requires minimal human effort to rerun the JBDistill pipeline and produce updated benchmarks, alleviating concerns on saturation and contamination. Extensive experiments demonstrate our benchmarks generalize robustly to 13 diverse evaluation models held out from benchmark construction, including proprietary, specialized, and newer-generation LLMs, significantly outperforming existing safety benchmarks in effectiveness while maintaining high separability and diversity. Our framework thus provides an effective, sustainable, and adaptable solution for streamlining safety evaluation.

Paperid: 803, https://arxiv.org/pdf/2505.15216.pdf

Abstract:
AI agents have the potential to significantly alter the cybersecurity landscape. Here, we introduce the first framework to capture offensive and defensive cyber-capabilities in evolving real-world systems. Instantiating this framework with BountyBench, we set up 25 systems with complex, real-world codebases. To capture the vulnerability lifecycle, we define three task types: Detect (detecting a new vulnerability), Exploit (exploiting a specific vulnerability), and Patch (patching a specific vulnerability). For Detect, we construct a new success indicator, which is general across vulnerability types and provides localized evaluation. We manually set up the environment for each system, including installing packages, setting up server(s), and hydrating database(s). We add 40 bug bounties, which are vulnerabilities with monetary awards of \$10-\$30,485, covering 9 of the OWASP Top 10 Risks. To modulate task difficulty, we devise a new strategy based on information to guide detection, interpolating from identifying a zero day to exploiting a specific vulnerability. We evaluate 8 agents: Claude Code, OpenAI Codex CLI with o3-high and o4-mini, and custom agents with o3-high, GPT-4.1, Gemini 2.5 Pro Preview, Claude 3.7 Sonnet Thinking, and DeepSeek-R1. Given up to three attempts, the top-performing agents are OpenAI Codex CLI: o3-high (12.5% on Detect, mapping to \$3,720; 90% on Patch, mapping to \$14,152), Custom Agent with Claude 3.7 Sonnet Thinking (67.5% on Exploit), and OpenAI Codex CLI: o4-mini (90% on Patch, mapping to \$14,422). OpenAI Codex CLI: o3-high, OpenAI Codex CLI: o4-mini, and Claude Code are more capable at defense, achieving higher Patch scores of 90%, 90%, and 87.5%, compared to Exploit scores of 47.5%, 32.5%, and 57.5% respectively; while the custom agents are relatively balanced between offense and defense, achieving Exploit scores of 37.5-67.5% and Patch scores of 35-60%.

Paperid: 804, https://arxiv.org/pdf/2505.11016.pdf

Abstract:
Modern software supply chain attacks consist of introducing new, malicious capabilities into trusted third-party software components, in order to propagate to a victim through a package dependency chain. These attacks are especially concerning for the Go language ecosystem, which is extensively used in critical cloud infrastructures. We present GoLeash, a novel system that applies the principle of least privilege at the package-level granularity, by enforcing distinct security policies for each package in the supply chain. This finer granularity enables GoLeash to detect malicious packages more precisely than traditional sandboxing that handles security policies at process- or container-level. Moreover, GoLeash remains effective under obfuscation, can overcome the limitations of static analysis, and incurs acceptable runtime overhead.

Paperid: 805, https://arxiv.org/pdf/2505.10538.pdf

Abstract:
While providing economic and software development value, software supply chains are only as strong as their weakest link. Over the past several years, there has been an exponential increase in cyberattacks, specifically targeting vulnerable links in critical software supply chains. These attacks disrupt the day-to-day functioning and threaten the security of nearly everyone on the internet, from billion-dollar companies and government agencies to hobbyist open-source developers. The ever-evolving threat of software supply chain attacks has garnered interest from the software industry and the US government in improving software supply chain security. On September 20, 2024, three researchers from the NSF-backed Secure Software Supply Chain Center (S3C2) conducted a Secure Software Supply Chain Summit with a diverse set of 12 practitioners from 9 companies. The goals of the Summit were to: (1) to enable sharing between individuals from different companies regarding practical experiences and challenges with software supply chain security, (2) to help form new collaborations, (3) to share our observations from our previous summits with industry, and (4) to learn about practitioners' challenges to inform our future research direction. The summit consisted of discussions of six topics relevant to the companies represented, including updating vulnerable dependencies, component and container choice, malicious commits, building infrastructure, large language models, and reducing entire classes of vulnerabilities.

Paperid: 806, https://arxiv.org/pdf/2505.07148.pdf

Abstract:
Federated learning (FL) is well-suited to 5G networks, where many mobile devices generate sensitive edge data. Secure aggregation protocols enhance privacy in FL by ensuring that individual user updates reveal no information about the underlying client data. However, the dynamic and large-scale nature of 5G-marked by high mobility and frequent dropouts-poses significant challenges to the effective adoption of these protocols. Existing protocols often require multi-round communication or rely on fixed infrastructure, limiting their practicality. We propose a lightweight, single-round secure aggregation protocol designed for 5G environments. By leveraging base stations for assisted computation and incorporating precomputation, key-homomorphic pseudorandom functions, and t-out-of-k secret sharing, our protocol ensures efficiency, robustness, and privacy. Experiments show strong security guarantees and significant gains in communication and computation efficiency, making the approach well-suited for real-world 5G FL deployments.

Paperid: 807, https://arxiv.org/pdf/2505.07011.pdf

Abstract:
This paper considers random walk-based decentralized learning, where at each iteration of the learning process, one user updates the model and sends it to a randomly chosen neighbor until a convergence criterion is met. Preserving data privacy is a central concern and open problem in decentralized learning. We propose a privacy-preserving algorithm based on public-key cryptography and anonymization. In this algorithm, the user updates the model and encrypts the result using a distant user's public key. The encrypted result is then transmitted through the network with the goal of reaching that specific user. The key idea is to hide the source's identity so that, when the destination user decrypts the result, it does not know who the source was. The challenge is to design a network-dependent probability distribution (at the source) over the potential destinations such that, from the receiver's perspective, all users have a similar likelihood of being the source. We introduce the problem and construct a scheme that provides anonymity with theoretical guarantees. We focus on random regular graphs to establish rigorous guarantees.

Paperid: 808, https://arxiv.org/pdf/2505.01749.pdf

Abstract:
Digital steganography is the practice of concealing for encrypted data transmission. Typically, steganography methods embed secret data into cover data to create stega data that incorporates hidden secret data. However, steganography techniques often require designing specific frameworks for each data type, which restricts their generalizability. In this paper, we present U-INR, a novel method for steganography via Implicit Neural Representation (INR). Rather than using the specific framework for each data format, we directly use the neurons of the INR network to represent the secret data and cover data across different data types. To achieve this idea, a private key is shared between the data sender and receivers. Such a private key can be used to determine the position of secret data in INR networks. To effectively leverage this key, we further introduce a key-based selection strategy that can be used to determine the position within the INRs for data storage. Comprehensive experiments across multiple data types, including images, videos, audio, and SDF and NeRF, demonstrate the generalizability and effectiveness of U-INR, emphasizing its potential for improving data security and privacy in various applications.

Paperid: 809, https://arxiv.org/pdf/2505.00111.pdf

Abstract:
This paper presents our experience, in the context of an industrial R&D project, on securing GENIO, a platform for edge computing on Passive Optical Network (PON) infrastructures, and based on Open-Source Software (OSS). We identify threats and related mitigations through hardening, vulnerability management, digital signatures, and static and dynamic analysis. In particular, we report lessons learned in applying these mitigations using OSS, and share our findings about the maturity and limitations of these security solutions in an industrial context.

Paperid: 810, https://arxiv.org/pdf/2504.21182.pdf

Abstract:
Privacy in federated learning is crucial, encompassing two key aspects: safeguarding the privacy of clients' data and maintaining the privacy of the federator's objective from the clients. While the first aspect has been extensively studied, the second has received much less attention. We present a novel approach that addresses both concerns simultaneously, drawing inspiration from techniques in knowledge distillation and private information retrieval to provide strong information-theoretic privacy guarantees. Traditional private function computation methods could be used here; however, they are typically limited to linear or polynomial functions. To overcome these constraints, our approach unfolds in three stages. In stage 0, clients perform the necessary computations locally. In stage 1, these results are shared among the clients, and in stage 2, the federator retrieves its desired objective without compromising the privacy of the clients' data. The crux of the method is a carefully designed protocol that combines secret-sharing-based multi-party computation and a graph-based private information retrieval scheme. We show that our method outperforms existing tools from the literature when properly adapted to this setting.

Paperid: 811, https://arxiv.org/pdf/2504.16489.pdf

Abstract:
Multi-Agent Debate (MAD), leveraging collaborative interactions among Large Language Models (LLMs), aim to enhance reasoning capabilities in complex tasks. However, the security implications of their iterative dialogues and role-playing characteristics, particularly susceptibility to jailbreak attacks eliciting harmful content, remain critically underexplored. This paper systematically investigates the jailbreak vulnerabilities of four prominent MAD frameworks built upon leading commercial LLMs (GPT-4o, GPT-4, GPT-3.5-turbo, and DeepSeek) without compromising internal agents. We introduce a novel structured prompt-rewriting framework specifically designed to exploit MAD dynamics via narrative encapsulation, role-driven escalation, iterative refinement, and rhetorical obfuscation. Our extensive experiments demonstrate that MAD systems are inherently more vulnerable than single-agent setups. Crucially, our proposed attack methodology significantly amplifies this fragility, increasing average harmfulness from 28.14% to 80.34% and achieving attack success rates as high as 80% in certain scenarios. These findings reveal intrinsic vulnerabilities in MAD architectures and underscore the urgent need for robust, specialized defenses prior to real-world deployment.

Paperid: 812, https://arxiv.org/pdf/2504.15822.pdf

Abstract:
Using a multi-accented corpus of parallel utterances for use with commercial speech devices, we present a case study to show that it is possible to quantify a degree of confidence about a source speaker's identity in the case of one-to-one voice conversion. Following voice conversion using a HiFi-GAN vocoder, we compare information leakage for a range speaker characteristics; assuming a "worst-case" white-box scenario, we quantify our confidence to perform inference and narrow the pool of likely source speakers, reinforcing the regulatory obligation and moral duty that providers of synthetic voices have to ensure the privacy of their speakers' data.

Paperid: 813, https://arxiv.org/pdf/2504.11126.pdf

Abstract:
Kubernetes (K8s) is widely used to orchestrate containerized applications, including critical services in domains such as finance, healthcare, and government. However, its extensive and feature-rich API interface exposes a broad attack surface, making K8s vulnerable to exploits of software vulnerabilities and misconfigurations. Even if K8s adopts role-based access control (RBAC) to manage access to K8s APIs, this approach lacks the granularity needed to protect specification attributes within API requests. This paper proposes a novel solution, KubeFence, which implements finer-grain API filtering tailored to specific client workloads. KubeFence analyzes Kubernetes Operators from trusted repositories and leverages their configuration files to restrict unnecessary features of the K8s API, to mitigate misconfigurations and vulnerabilities exploitable through the K8s API. The experimental results show that KubeFence can significantly reduce the attack surface and prevent attacks compared to RBAC.

Paperid: 814, https://arxiv.org/pdf/2504.00924.pdf

Abstract:
Supply chain security has become a very important vector to consider when defending against adversary attacks. Due to this, more and more developers are keen on improving their supply chains to make them more robust against future threats. On August 29, 2024 researchers from the Secure Software Supply Chain Center (S3C2) gathered 14 practitioners from 10 government agencies to discuss the state of supply chain security. The goal of the summit is to share insights between companies and developers alike to foster new collaborations and ideas moving forward. Through this meeting, participants were questions on best practices and thoughts how to improve things for the future. In this paper we summarize the responses and discussions of the summit.

Paperid: 815, https://arxiv.org/pdf/2503.16248.pdf

Abstract:
AI agents integrated with Web3 offer autonomy and openness but raise security concerns as they interact with financial protocols and immutable smart contracts. This paper investigates the vulnerabilities of AI agents within blockchain-based financial ecosystems when exposed to adversarial threats in real-world scenarios. We introduce the concept of context manipulation -- a comprehensive attack vector that exploits unprotected context surfaces, including input channels, memory modules, and external data feeds. It expands on traditional prompt injection and reveals a more stealthy and persistent threat: memory injection. Using ElizaOS, a representative decentralized AI agent framework for automated Web3 operations, we showcase that malicious injections into prompts or historical records can trigger unauthorized asset transfers and protocol violations which could be financially devastating in reality. To quantify these risks, we introduce CrAIBench, a Web3-focused benchmark covering 150+ realistic blockchain tasks. such as token transfers, trading, bridges, and cross-chain interactions, and 500+ attack test cases using context manipulation. Our evaluation results confirm that AI models are significantly more vulnerable to memory injection compared to prompt injection. Finally, we evaluate a comprehensive defense roadmap, finding that prompt-injection defenses and detectors only provide limited protection when stored context is corrupted, whereas fine-tuning-based defenses substantially reduce attack success rates while preserving performance on single-step tasks. These results underscore the urgent need for AI agents that are both secure and fiduciarily responsible in blockchain environments.

Paperid: 816, https://arxiv.org/pdf/2503.09726.pdf

Abstract:
Graph Neural Networks (GNNs) are widely used and deployed for graph-based prediction tasks. However, as good as GNNs are for learning graph data, they also come with the risk of privacy leakage. For instance, an attacker can run carefully crafted queries on the GNNs and, from the responses, can infer the existence of an edge between a pair of nodes. This attack, dubbed as a "link-stealing" attack, can jeopardize the user's privacy by leaking potentially sensitive information. To protect against this attack, we propose an approach called "$(N)$ode $(A)$ugmentation for $(R)$estricting $(G)$raphs from $(I)$nsinuating their $(S)$tructure" ($NARGIS$) and study its feasibility. $NARGIS$ is focused on reshaping the graph embedding space so that the posterior from the GNN model will still provide utility for the prediction task but will introduce ambiguity for the link-stealing attackers. To this end, $NARGIS$ applies spectral clustering on the given graph to facilitate it being augmented with new nodes -- that have learned features instead of fixed ones. It utilizes tri-level optimization for learning parameters for the GNN model, surrogate attacker model, and our defense model (i.e. learnable node features). We extensively evaluate $NARGIS$ on three benchmark citation datasets over eight knowledge availability settings for the attackers. We also evaluate the model fidelity and defense performance on influence-based link inference attacks. Through our studies, we have figured out the best feature of $NARGIS$ -- its superior fidelity-privacy performance trade-off in a significant number of cases. We also have discovered in which cases the model needs to be improved, and proposed ways to integrate different schemes to make the model more robust against link stealing attacks.

Paperid: 817, https://arxiv.org/pdf/2503.07697.pdf

Abstract:
As the capabilities of large language models (LLMs) continue to expand, their usage has become increasingly prevalent. However, as reflected in numerous ongoing lawsuits regarding LLM-generated content, addressing copyright infringement remains a significant challenge. In this paper, we introduce PoisonedParrot: the first stealthy data poisoning attack that induces an LLM to generate copyrighted content even when the model has not been directly trained on the specific copyrighted material. PoisonedParrot integrates small fragments of copyrighted text into the poison samples using an off-the-shelf LLM. Despite its simplicity, evaluated in a wide range of experiments, PoisonedParrot is surprisingly effective at priming the model to generate copyrighted content with no discernible side effects. Moreover, we discover that existing defenses are largely ineffective against our attack. Finally, we make the first attempt at mitigating copyright-infringement poisoning attacks by proposing a defense: ParrotTrap. We encourage the community to explore this emerging threat model further.

Paperid: 818, https://arxiv.org/pdf/2503.01839.pdf

Abstract:
Text-to-Image models may generate harmful content, such as pornographic images, particularly when unsafe prompts are submitted. To address this issue, safety filters are often added on top of text-to-image models, or the models themselves are aligned to reduce harmful outputs. However, these defenses remain vulnerable when an attacker strategically designs adversarial prompts to bypass these safety guardrails. In this work, we propose PromptTune, a method to jailbreak text-to-image models with safety guardrails using a fine-tuned large language model. Unlike other query-based jailbreak attacks that require repeated queries to the target model, our attack generates adversarial prompts efficiently after fine-tuning our AttackLLM. We evaluate our method on three datasets of unsafe prompts and against five safety guardrails. Our results demonstrate that our approach effectively bypasses safety guardrails, outperforms existing no-box attacks, and also facilitates other query-based attacks.

Paperid: 819, https://arxiv.org/pdf/2502.20623.pdf

Abstract:
Text-to-image models can generate harmful images when presented with unsafe prompts, posing significant safety and societal risks. Alignment methods aim to modify these models to ensure they generate only non-harmful images, even when exposed to unsafe prompts. A typical text-to-image model comprises two main components: 1) a text encoder and 2) a diffusion module. Existing alignment methods mainly focus on modifying the diffusion module to prevent harmful image generation. However, this often significantly impacts the model's behavior for safe prompts, causing substantial quality degradation of generated images. In this work, we propose SafeText, a novel alignment method that fine-tunes the text encoder rather than the diffusion module. By adjusting the text encoder, SafeText significantly alters the embedding vectors for unsafe prompts, while minimally affecting those for safe prompts. As a result, the diffusion module generates non-harmful images for unsafe prompts while preserving the quality of images for safe prompts. We evaluate SafeText on multiple datasets of safe and unsafe prompts, including those generated through jailbreak attacks. Our results show that SafeText effectively prevents harmful image generation with minor impact on the images for safe prompts, and SafeText outperforms six existing alignment methods. We will publish our code and data after paper acceptance.

Paperid: 820, https://arxiv.org/pdf/2502.18468.pdf

Abstract:
The integration of Large Language Models (LLMs) such as GitHub Copilot, ChatGPT, Cursor AI, and Codeium AI into software development has revolutionized the coding landscape, offering significant productivity gains, automation, and enhanced debugging capabilities. These tools have proven invaluable for generating code snippets, refactoring existing code, and providing real-time support to developers. However, their widespread adoption also presents notable challenges, particularly in terms of security vulnerabilities, code quality, and ethical concerns. This paper provides a comprehensive analysis of the benefits and risks associated with AI-powered coding tools, drawing on user feedback, security analyses, and practical use cases. We explore the potential for these tools to replicate insecure coding practices, introduce biases, and generate incorrect or non-sensical code (hallucinations). In addition, we discuss the risks of data leaks, intellectual property violations and the need for robust security measures to mitigate these threats. By comparing the features and performance of these tools, we aim to guide developers in making informed decisions about their use, ensuring that the benefits of AI-assisted coding are maximized while minimizing associated risks.

Paperid: 821, https://arxiv.org/pdf/2502.00193.pdf

Abstract:
We introduce CyBeR-0, a Byzantine-resilient federated zero-order optimization method that is robust under Byzantine attacks and provides significant savings in uplink and downlink communication costs. We introduce transformed robust aggregation to give convergence guarantees for general non-convex objectives under client data heterogeneity. Empirical evaluations for standard learning tasks and fine-tuning large language models show that CyBeR-0 exhibits stable performance with only a few scalars per-round communication cost and reduced memory requirements.

Paperid: 822, https://arxiv.org/pdf/2501.18549.pdf

Abstract:
The rapid integration of the Internet of Things (IoT) and Internet of Medical (IoM) devices in the healthcare industry has markedly improved patient care and hospital operations but has concurrently brought substantial risks. Distributed Denial-of-Service (DDoS) attacks present significant dangers, jeopardizing operational stability and patient safety. This study introduces CryptoDNA, an innovative machine learning detection framework influenced by cryptojacking detection methods, designed to identify and alleviate DDoS attacks in healthcare IoT settings. The proposed approach relies on behavioral analytics, including atypical resource usage and network activity patterns. Key features derived from cryptojacking-inspired methodologies include entropy-based analysis of traffic, time-series monitoring of device performance, and dynamic anomaly detection. A lightweight architecture ensures inter-compatibility with resource-constrained IoT devices while maintaining high detection accuracy. The proposed architecture and model were tested in real-world and synthetic datasets to demonstrate the model's superior performance, achieving over 96% accuracy with minimal computational overhead. Comparative analysis reveals its resilience against emerging attack vectors and scalability across diverse device ecosystems. By bridging principles from cryptojacking and DDoS detection, CryptoDNA offers a robust, innovative solution to fortify the healthcare IoT landscape against evolving cyber threats and highlights the potential of interdisciplinary approaches in adaptive cybersecurity defense mechanisms for critical healthcare infrastructures.

Paperid: 823, https://arxiv.org/pdf/2501.14122.pdf

Abstract:
We present a Reinforcement Learning Platform for Adversarial Black-box untargeted and targeted attacks, RLAB, that allows users to select from various distortion filters to create adversarial examples. The platform uses a Reinforcement Learning agent to add minimum distortion to input images while still causing misclassification by the target model. The agent uses a novel dual-action method to explore the input image at each step to identify sensitive regions for adding distortions while removing noises that have less impact on the target model. This dual action leads to faster and more efficient convergence of the attack. The platform can also be used to measure the robustness of image classification models against specific distortion types. Also, retraining the model with adversarial samples significantly improved robustness when evaluated on benchmark datasets. The proposed platform outperforms state-of-the-art methods in terms of the average number of queries required to cause misclassification. This advances trustworthiness with a positive social impact.

Paperid: 824, https://arxiv.org/pdf/2501.12359.pdf

Abstract:
The hockey-stick divergence is a fundamental quantity characterizing several statistical privacy frameworks that ensure privacy for classical and quantum data. In such quantum privacy frameworks, the adversary is allowed to perform all possible measurements. However, in practice, there are typically limitations to the set of measurements that can be performed. To this end, here, we comprehensively analyze the measured hockey-stick divergence under several classes of practically relevant measurement classes. We prove several of its properties, including data processing and convexity. We show that it is efficiently computable by semi-definite programming for some classes of measurements and can be analytically evaluated for Werner and isotropic states. Notably, we show that the measured hockey-stick divergence characterizes optimal privacy parameters in the quantum pufferfish privacy framework. With this connection and the developed technical tools, we enable methods to quantify and audit privacy for several practically relevant settings. Lastly, we introduce the measured hockey-stick divergence of channels and explore its applications in ensuring privacy for channels.

Paperid: 825, https://arxiv.org/pdf/2501.11250.pdf

Abstract:
Integrating Internet of Things (IoT) devices in healthcare has revolutionized patient care, offering improved monitoring, diagnostics, and treatment. However, the proliferation of these devices has also introduced significant cybersecurity challenges. This paper reviews the current landscape of cybersecurity threats targeting IoT devices in healthcare, discusses the underlying issues contributing to these vulnerabilities, and explores potential solutions. Additionally, this study offers solutions and suggestions for researchers, agencies, and security specialists to overcome these IoT in healthcare cybersecurity vulnerabilities. A comprehensive literature survey highlights the nature and frequency of cyber attacks, their impact on healthcare systems, and emerging strategies to mitigate these risks.

Paperid: 826, https://arxiv.org/pdf/2501.01334.pdf

Abstract:
Virtualization is a technique that allows multiple instances typically running different guest operating systems on top of single physical hardware. A hypervisor, a layer of software running on top of the host operating system, typically runs and manages these different guest operating systems. Rather than to run different services on different servers for reliability and security reasons, companies started to employ virtualization over their servers to run these services within a single server. This approach proves beneficial to the companies as it provides much better reliability, stronger isolation, improved security and resource utilization compared to running services on multiple servers. Although hypervisor based virtualization offers better resource utilization and stronger isolation, it also suffers from high overhead as the host operating system has to maintain different guest operating systems. To tackle this issue, another form of virtualization known as Operating System-level virtualization has emerged. This virtualization provides light-weight, minimal and efficient virtualization, as the different instances are run on top of the same host operating system, sharing the resources of the host operating system. But due to instances sharing the same host operating system affects the isolation of the instances. In this paper, we will first establish the basic concepts of virtualization and point out the differences between the hyper-visor based virtualization and operating system-level virtualization. Next, we will discuss the container creation life-cycle which helps in forming a container threat model for the container systems, which allows to map different potential attack vectors within these systems. Finally, we will discuss a case study, which further looks at isolation provided by the containers.

Paperid: 827, https://arxiv.org/pdf/2506.11444.pdf

Abstract:
As Diffusion Models (DM) generate increasingly realistic images, related issues such as copyright and misuse have become a growing concern. Watermarking is one of the promising solutions. Existing methods inject the watermark into the single-domain of initial Gaussian noise for generation, which suffers from unsatisfactory robustness. This paper presents the first dual-domain DM watermarking approach using a pipelined injector to consistently embed watermarks in both the spatial and frequency domains. To further boost robustness against certain image manipulations and advanced attacks, we introduce a model-independent learnable Gaussian Noise Restorer (GNR) to refine Gaussian noise extracted from manipulated images and enhance detection robustness by integrating the detection scores of both watermarks. GaussMarker efficiently achieves state-of-the-art performance under eight image distortions and four advanced attacks across three versions of Stable Diffusion with better recall and lower false positive rates, as preferred in real applications.

Paperid: 828, https://arxiv.org/pdf/2506.09662.pdf

Abstract:
End-to-end deep learning exhibits unmatched performance for detecting malware, but such an achievement is reached by exploiting spurious correlations -- features with high relevance at inference time, but known to be useless through domain knowledge. While previous work highlighted that deep networks mainly focus on metadata, none investigated the phenomenon further, without quantifying their impact on the decision. In this work, we deepen our understanding of how spurious correlation affects deep learning for malware detection by highlighting how much models rely on empty spaces left by the compiler, which diminishes the relevance of the compiled code. Through our seminal analysis on a small-scale balanced dataset, we introduce a ranking of two end-to-end models to better understand which is more suitable to be put in production.

Paperid: 829, https://arxiv.org/pdf/2506.06414.pdf

Abstract:
Existing language model safety evaluations focus on overt attacks and low-stakes tasks. Realistic attackers can subvert current safeguards by requesting help on small, benign-seeming tasks across many independent queries. Because individual queries do not appear harmful, the attack is hard to {detect}. However, when combined, these fragments uplift misuse by helping the attacker complete hard and dangerous tasks. Toward identifying defenses against such strategies, we develop Benchmarks for Stateful Defenses (BSD), a data generation pipeline that automates evaluations of covert attacks and corresponding defenses. Using this pipeline, we curate two new datasets that are consistently refused by frontier models and are too difficult for weaker open-weight models. Our evaluations indicate that decomposition attacks are effective misuse enablers, and highlight stateful defenses as a countermeasure.

Paperid: 830, https://arxiv.org/pdf/2506.06119.pdf

Abstract:
As satellite systems become increasingly vulnerable to physical layer attacks via SDRs, novel countermeasures are being developed to protect critical systems, particularly those lacking cryptographic protection, or those which cannot be upgraded to support modern cryptography. Among these is transmitter fingerprinting, which provides mechanisms by which communication can be authenticated by looking at characteristics of the transmitter, expressed as impairments on the signal. Previous works show that fingerprinting can be used to classify satellite transmitters, or authenticate them against SDR-equipped attackers under simple replay scenarios. In this paper we build upon this by looking at attacks directly targeting the fingerprinting system, with an attacker optimizing for maximum impact in jamming, spoofing, and dataset poisoning attacks, and demonstrate these attacks on the SatIQ system designed to authenticate Iridium transmitters. We show that an optimized jamming signal can cause a 50% error rate with attacker-to-victim ratios as low as -30dB (far less power than traditional jamming) and demonstrate successful identity forgery during spoofing attacks, with an attacker successfully removing their own transmitter's fingerprint from messages. We also present a data poisoning attack, enabling persistent message spoofing by altering the data used to authenticate incoming messages to include the fingerprint of the attacker's transmitter. Finally, we show that our model trained to optimize spoofing attacks can also be used to detect spoofing and replay attacks, even when it has never seen the attacker's transmitter before. Furthermore, this technique works even when the training dataset includes only a single transmitter, enabling fingerprinting to be used to protect small constellations and even individual satellites, providing additional protection where it is needed the most.

Paperid: 831, https://arxiv.org/pdf/2506.02038.pdf

Abstract:
Edge Intelligence (EI) serves as a critical enabler for privacy-preserving systems by providing AI-empowered computation and distributed caching services at the edge, thereby minimizing latency and enhancing data privacy. The integration of blockchain technology further augments EI frameworks by ensuring transactional transparency, auditability, and system-wide reliability through a decentralized network model. However, the operational architecture of such systems introduces inherent vulnerabilities, particularly due to the extensive data interactions between edge gateways (EGs) and the distributed nature of information storage during service provisioning. To address these challenges, we propose an autonomous computing model along with its interaction topologies tailored for privacy-critical and time-sensitive health applications. The system supports continuous monitoring, real-time alert notifications, disease detection, and robust data processing and aggregation. It also includes a data transaction handler and mechanisms for ensuring privacy at the EGs. Moreover, a resource-efficient one-dimensional convolutional neural network (1D-CNN) is proposed for the multiclass classification of arrhythmia, enabling accurate and real-time analysis of constrained EGs. Furthermore, a secure access scheme is defined to manage both off-chain and on-chain data sharing and storage. To validate the proposed model, comprehensive security, performance, and cost analyses are conducted, demonstrating the efficiency and reliability of the fine-grained access control scheme.

Paperid: 832, https://arxiv.org/pdf/2506.00808.pdf

Abstract:
Graph unlearning methods aim to efficiently remove the impact of sensitive data from trained GNNs without full retraining, assuming that deleted information cannot be recovered. In this work, we challenge this assumption by introducing the graph unlearning inversion attack: given only black-box access to an unlearned GNN and partial graph knowledge, can an adversary reconstruct the removed edges? We identify two key challenges: varying probability-similarity thresholds for unlearned versus retained edges, and the difficulty of locating unlearned edge endpoints, and address them with TrendAttack. First, we derive and exploit the confidence pitfall, a theoretical and empirical pattern showing that nodes adjacent to unlearned edges exhibit a large drop in model confidence. Second, we design an adaptive prediction mechanism that applies different similarity thresholds to unlearned and other membership edges. Our framework flexibly integrates existing membership inference techniques and extends them with trend features. Experiments on four real-world datasets demonstrate that TrendAttack significantly outperforms state-of-the-art GNN membership inference baselines, exposing a critical privacy vulnerability in current graph unlearning methods.

Paperid: 833, https://arxiv.org/pdf/2506.00416.pdf

Abstract:
Federated learning (FL) has attracted increasing attention to mitigate security and privacy challenges in traditional cloud-centric machine learning models specifically in healthcare ecosystems. FL methodologies enable the training of global models through localized policies, allowing independent operations at the edge clients' level. Conventional first-order FL approaches face several challenges in personalized model training due to heterogeneous non-independent and identically distributed (non-iid) data of each edge client. Recently, second-order FL approaches maintain the stability and consistency of non-iid datasets while improving personalized model training. This study proposes and develops a verifiable and auditable optimized second-order FL framework BFEL (blockchain-enhanced federated edge learning) based on optimized FedCurv for personalized healthcare systems. FedCurv incorporates information about the importance of each parameter to each client's task (through Fisher Information Matrix) which helps to preserve client-specific knowledge and reduce model drift during aggregation. Moreover, it minimizes communication rounds required to achieve a target precision convergence for each edge client while effectively managing personalized training on non-iid and heterogeneous data. The incorporation of Ethereum-based model aggregation ensures trust, verifiability, and auditability while public key encryption enhances privacy and security. Experimental results of federated CNNs and MLPs utilizing Mnist, Cifar-10, and PathMnist demonstrate the high efficiency and scalability of the proposed framework.

Paperid: 834, https://arxiv.org/pdf/2505.21130.pdf

Abstract:
Directed fuzzing is a critical technique in cybersecurity, targeting specific sections of a program. This approach is essential in various security-related domains such as crash reproduction, patch testing, and vulnerability detection. Despite its importance, current directed fuzzing methods exhibit a trade-off between efficiency and effectiveness. For instance, directed grey-box fuzzing, while efficient in generating fuzzing inputs, lacks sufficient precision. The low precision causes time wasted on executing code that cannot help reach the target site. Conversely, interpreter- or observer-based directed symbolic execution can produce high-quality inputs while incurring non-negligible runtime overhead. These limitations undermine the feasibility of directed fuzzers in real-world scenarios. To kill the birds of efficiency and effectiveness with one stone, in this paper, we involve compilation-based concolic execution into directed fuzzing and present ColorGo, achieving high scalability while preserving the high precision from symbolic execution. ColorGo is a new directed whitebox fuzzer that concretely executes the instrumented program with constraint-solving capability on generated input. It guides the exploration by \textit{incremental coloration}, including static reachability analysis and dynamic feasibility analysis. We evaluated ColorGo on diverse real-world programs and demonstrated that ColorGo outperforms AFLGo by up to \textbf{100x} in reaching target sites and reproducing target crashes.

Paperid: 835, https://arxiv.org/pdf/2505.18680.pdf

Abstract:
Large Language Models (LLMs), due to substantial computational requirements, are vulnerable to resource consumption attacks, which can severely degrade server performance or even cause crashes, as demonstrated by denial-of-service (DoS) attacks designed for LLMs. However, existing works lack mitigation strategies against such threats, resulting in unresolved security risks for real-world LLM deployments. To this end, we propose the Pluggable and Dynamic DoS-Defense Framework ($PD^3F$), which employs a two-stage approach to defend against resource consumption attacks from both the input and output sides. On the input side, we propose the Resource Index to guide Dynamic Request Polling Scheduling, thereby reducing resource usage induced by malicious attacks under high-concurrency scenarios. On the output side, we introduce the Adaptive End-Based Suppression mechanism, which terminates excessive malicious generation early. Experiments across six models demonstrate that $PD^3F$ significantly mitigates resource consumption attacks, improving users' access capacity by up to 500% during adversarial load. $PD^3F$ represents a step toward the resilient and resource-aware deployment of LLMs against resource consumption attacks.

Paperid: 836, https://arxiv.org/pdf/2505.13655.pdf

Abstract:
Federated Learning with client-level differential privacy (DP) provides a promising framework for collaboratively training models while rigorously protecting clients' privacy. However, classic approaches like DP-FedAvg struggle when clients have heterogeneous privacy requirements, as they must uniformly enforce the strictest privacy level across clients, leading to excessive DP noise and significant model utility degradation. Existing methods to improve the model utility in such heterogeneous privacy settings often assume a trusted server and are largely heuristic, resulting in suboptimal performance and lacking strong theoretical underpinnings. In this work, we address these challenges under a practical attack model where both clients and the server are honest-but-curious. We propose GDPFed, which partitions clients into groups based on their privacy budgets and achieves client-level DP within each group to reduce the privacy budget waste and hence improve the model utility. Based on the privacy and convergence analysis of GDPFed, we find that the magnitude of DP noise depends on both model dimensionality and the per-group client sampling ratios. To further improve the performance of GDPFed, we introduce GDPFed$^+$, which integrates model sparsification to eliminate unnecessary noise and optimizes per-group client sampling ratios to minimize convergence error. Extensive empirical evaluations on multiple benchmark datasets demonstrate the effectiveness of GDPFed$^+$, showing substantial performance gains compared with state-of-the-art methods.

Paperid: 837, https://arxiv.org/pdf/2505.13651.pdf

Abstract:
Due to the distributed nature of Federated Learning (FL) systems, each local client has access to the global model, posing a critical risk of model leakage. Existing works have explored injecting watermarks into local models to enable intellectual property protection. However, these methods either focus on non-traceable watermarks or traceable but white-box watermarks. We identify a gap in the literature regarding the formal definition of traceable black-box watermarking and the formulation of the problem of injecting such watermarks into FL systems. In this work, we first formalize the problem of injecting traceable black-box watermarks into FL. Based on the problem, we propose a novel server-side watermarking method, $\mathbf{TraMark}$, which creates a traceable watermarked model for each client, enabling verification of model leakage in black-box settings. To achieve this, $\mathbf{TraMark}$ partitions the model parameter space into two distinct regions: the main task region and the watermarking region. Subsequently, a personalized global model is constructed for each client by aggregating only the main task region while preserving the watermarking region. Each model then learns a unique watermark exclusively within the watermarking region using a distinct watermark dataset before being sent back to the local client. Extensive results across various FL systems demonstrate that $\mathbf{TraMark}$ ensures the traceability of all watermarked models while preserving their main task performance.

Paperid: 838, https://arxiv.org/pdf/2505.13239.pdf

Abstract:
The advancement of quantum computing threatens classical cryptographic methods, necessitating the development of secure quantum key distribution (QKD) solutions for QKD Networks (QKDN). In this paper, a novel key distribution protocol, Onion Routing Relay (ORR), that integrates onion routing (OR) with post-quantum cryptography (PQC) in a key-relay (KR) model is evaluated for QKDNs. This approach increases the security by enhancing confidentiality, integrity, authenticity, and anonymity in quantum-secure communications. By employing PQC-based encapsulation, ORR pretends to avoid the security risks posed by intermediate malicious nodes and ensures end-to-end security. Results show that the performance of the ORR model, against current key-relay (KR) and trusted-node (TN) approaches, demonstrating its feasibility and applicability in high-security environments maintaining a consistent Quality of Service (QoS). The results show that while ORR incurs higher encryption overhead, it provides substantial security improvements without significantly impacting the overall key distribution time.

Paperid: 839, https://arxiv.org/pdf/2505.13158.pdf

Abstract:
The advancement of quantum computing threatens classical cryptographic methods, necessitating the development of secure quantum key distribution (QKD) solutions for QKD Networks (QKDN). In this paper, a novel key distribution protocol, Onion Routing Relay (ORR), that integrates onion routing (OR) with post-quantum cryptography (PQC) in a key-relay (KR) model is evaluated for QKDNs. This approach increases the security by enhancing confidentiality, integrity, authenticity (CIA principles), and anonymity in quantum-secure communications. By employing PQC-based encapsulation, ORR aims to avoid the security risks posed by intermediate malicious nodes and ensures end-to-end security. Our results show a competitive performance of the basic ORR model, against current KR and trusted-node (TN) approaches, demonstrating its feasibility and applicability in high-security environments maintaining a consistent Quality of Service (QoS). The results also show that while basic ORR incurs higher encryption overhead, it provides substantial security improvements without significantly impacting the overall key distribution time. Nevertheless, the introduction of an end-to-end authentication extension (ORR-Ext) has a significant impact on the Quality of Service (QoS), thereby limiting its suitability to applications with stringent security requirements.

Paperid: 840, https://arxiv.org/pdf/2505.06759.pdf

Abstract:
Coded computing is one of the techniques that can be used for privacy protection in Federated Learning. However, most of the constructions used for coded computing work only under the assumption that the computations involved are exact, generally restricted to special classes of functions, and require quantized inputs. This paper considers the use of Private Berrut Approximate Coded Computing (PBACC) as a general solution to add strong but non-perfect privacy to federated learning. We derive new adapted PBACC algorithms for centralized aggregation, secure distributed training with centralized data, and secure decentralized training with decentralized data, thus enlarging significantly the applications of the method and the existing privacy protection tools available for these paradigms. Particularly, PBACC can be used robustly to attain privacy guarantees in decentralized federated learning for a variety of models. Our numerical results show that the achievable quality of different learning models (convolutional neural networks, variational autoencoders, and Cox regression) is minimally altered by using these new computing schemes, and that the privacy leakage can be bounded strictly to less than a fraction of one bit per participant. Additionally, the computational cost of the encoding and decoding processes depends only of the degree of decentralization of the data.

Paperid: 841, https://arxiv.org/pdf/2504.20888.pdf

Abstract:
In this paper, we study the problem of private information retrieval (PIR) in both graph-based and multigraph-based replication systems, where each file is stored on exactly two servers, and any pair of servers shares at most $r$ files. We derive upper bounds on the PIR capacity for such systems and construct PIR schemes that approach these bounds. For graph-based systems, we determine the exact PIR capacity for path graphs and improve upon existing results for complete bipartite graphs and complete graphs. For multigraph-based systems, we propose a PIR scheme that leverages the symmetry of the underlying graph-based construction, yielding a capacity lower bound for such multigraphs. Furthermore, we establish several general upper and lower bounds on the PIR capacity of multigraphs, which are tight in certain cases.

Paperid: 842, https://arxiv.org/pdf/2504.12143.pdf

Abstract:
The growing and evolving landscape of cybersecurity threats necessitates the development of supporting tools and platforms that allow for the creation of realistic IT environments operating within virtual, controlled settings as Cyber Ranges (CRs). CRs can be exploited for analyzing vulnerabilities and experimenting with the effectiveness of devised countermeasures, as well as serving as training environments for building cyber security skills and abilities for IT operators. This paper proposes ARCeR as an innovative solution for the automatic generation and deployment of CRs, starting from user-provided descriptions in a natural language. ARCeR relies on the Agentic RAG paradigm, which allows it to fully exploit state-of-art AI technologies. Experimental results show that ARCeR is able to successfully process prompts even in cases that LLMs or basic RAG systems are not able to cope with. Furthermore, ARCeR is able to target any CR framework provided that specific knowledge is made available to it.

Paperid: 843, https://arxiv.org/pdf/2504.09879.pdf

Abstract:
Encrypted search schemes have been proposed to address growing privacy concerns. However, several leakage-abuse attacks have highlighted some security vulnerabilities. Recent attacks assumed an attacker's knowledge containing data ``similar'' to the indexed data. However, this vague assumption is barely discussed in literature: how likely is it for an attacker to obtain a "similar enough" data? Our paper provides novel statistical tools usable on any attack in this setting to analyze its sensitivity to data similarity. First, we introduce a mathematical model based on statistical estimators to analytically understand the attackers' knowledge and the notion of similarity. Second, we conceive statistical tools to model the influence of the similarity on the attack accuracy. We apply our tools on three existing attacks to answer questions such as: is similarity the only factor influencing accuracy of a given attack? Third, we show that the enforcement of a maximum index size can make the ``similar-data'' assumption harder to satisfy. In particular, we propose a statistical method to estimate an appropriate maximum size for a given attack and dataset. For the best known attack on the Enron dataset, a maximum index size of 200 guarantees (with high probability) the attack accuracy to be below 5%.

Paperid: 844, https://arxiv.org/pdf/2504.07578.pdf

Abstract:
Clustering is a fundamental data processing task used for grouping records based on one or more features. In the vertically partitioned setting, data is distributed among entities, with each holding only a subset of those features. A key challenge in this scenario is that computing distances between records requires access to all distributed features, which may be privacy-sensitive and cannot be directly shared with other parties. The goal is to compute the joint clusters while preserving the privacy of each entity's dataset. Existing solutions using secret sharing or garbled circuits implement privacy-preserving variants of Lloyd's algorithm but incur high communication costs, scaling as O(nkt), where n is the number of data points, k the number of clusters, and t the number of rounds. These methods become impractical for large datasets or several parties, limiting their use to LAN settings only. On the other hand, a different line of solutions rely on differential privacy (DP) to outsource the local features of the parties to a central server. However, they often significantly degrade the utility of the clustering outcome due to excessive noise. In this work, we propose a novel solution based on homomorphic encryption and DP, reducing communication complexity to O(n+kt). In our method, parties securely outsource their features once, allowing a computing party to perform clustering operations under encryption. DP is applied only to the clusters' centroids, ensuring privacy with minimal impact on utility. Our solution clusters 100,000 two-dimensional points into five clusters using only 73MB of communication, compared to 101GB for existing works, and completes in just under 3 minutes on a 100Mbps network, whereas existing works take over 1 day. This makes our solution practical even for WAN deployments, all while maintaining accuracy comparable to plaintext k-means algorithms.

Paperid: 845, https://arxiv.org/pdf/2504.05605.pdf

Abstract:
Chain-of-Thought (CoT) enhances an LLM's ability to perform complex reasoning tasks, but it also introduces new security issues. In this work, we present ShadowCoT, a novel backdoor attack framework that targets the internal reasoning mechanism of LLMs. Unlike prior token-level or prompt-based attacks, ShadowCoT directly manipulates the model's cognitive reasoning path, enabling it to hijack multi-step reasoning chains and produce logically coherent but adversarial outcomes. By conditioning on internal reasoning states, ShadowCoT learns to recognize and selectively disrupt key reasoning steps, effectively mounting a self-reflective cognitive attack within the target model. Our approach introduces a lightweight yet effective multi-stage injection pipeline, which selectively rewires attention pathways and perturbs intermediate representations with minimal parameter overhead (only 0.15% updated). ShadowCoT further leverages reinforcement learning and reasoning chain pollution (RCP) to autonomously synthesize stealthy adversarial CoTs that remain undetectable to advanced defenses. Extensive experiments across diverse reasoning benchmarks and LLMs show that ShadowCoT consistently achieves high Attack Success Rate (94.4%) and Hijacking Success Rate (88.4%) while preserving benign performance. These results reveal an emergent class of cognition-level threats and highlight the urgent need for defenses beyond shallow surface-level consistency.

Paperid: 846, https://arxiv.org/pdf/2504.02124.pdf

Abstract:
Formal verification has recently been increasingly used to prove the correctness and security of many applications. It is attractive because it can prove the absence of errors with the same certainty as mathematicians proving theorems. However, while most security experts recognize the value of formal verification, the views of non-technical users on this topic are unknown. To address this issue, we designed and implemented two experiments to understand how formal verification impacts users. Our approach started with a formative study involving 15 participants, followed by the main quantitative study with 200 individuals. We focus on the application domain of password managers since it has been documented that the lack of trust in password managers might lead to lower adoption. Moreover, recent efforts have focused on formally verifying (parts of) password managers. We conclude that formal verification is seen as desirable by users and identify three actional recommendations to improve formal verification communication efforts.

Paperid: 847, https://arxiv.org/pdf/2504.02109.pdf

Abstract:
Cybersecurity incidents such as data breaches have become increasingly common, affecting millions of users and organizations worldwide. The complexity of cybersecurity threats challenges the effectiveness of existing security communication strategies. Through a systematic review of over 3,400 papers, we identify specific user difficulties including information overload, technical jargon comprehension, and balancing security awareness with comfort. Our findings reveal consistent communication paradoxes: users require technical details for credibility yet struggle with jargon and need risk awareness without experiencing anxiety. We propose seven evidence-based guidelines to improve security communication and identify critical research gaps including limited studies with older adults, children, and non-US populations, insufficient longitudinal research, and limited protocol sharing for reproducibility. Our guidelines emphasize user-centric communication adapted to cultural and demographic differences while ensuring security advice remains actionable. This work contributes to more effective security communication practices that enable users to recognize and respond to cybersecurity threats appropriately.

Paperid: 848, https://arxiv.org/pdf/2503.15550.pdf

Abstract:
Federated Learning (FL) has emerged as a promising paradigm in distributed machine learning, enabling collaborative model training while preserving data privacy. However, despite its many advantages, FL still contends with significant challenges -- most notably regarding security and trust. Zero-Knowledge Proofs (ZKPs) offer a potential solution by establishing trust and enhancing system integrity throughout the FL process. Although several studies have explored ZKP-based FL (ZK-FL), a systematic framework and comprehensive analysis are still lacking. This article makes two key contributions. First, we propose a structured ZK-FL framework that categorizes and analyzes the technical roles of ZKPs across various FL stages and tasks. Second, we introduce a novel algorithm, Verifiable Client Selection FL (Veri-CS-FL), which employs ZKPs to refine the client selection process. In Veri-CS-FL, participating clients generate verifiable proofs for the performance metrics of their local models and submit these concise proofs to the server for efficient verification. The server then selects clients with high-quality local models for uploading, subsequently aggregating the contributions from these selected clients. By integrating ZKPs, Veri-CS-FL not only ensures the accuracy of performance metrics but also fortifies trust among participants while enhancing the overall efficiency and security of FL systems.

Paperid: 849, https://arxiv.org/pdf/2503.11185.pdf

Abstract:
Large Language Models (LLMs) are vulnerable to jailbreak attacks, which use crafted prompts to elicit toxic responses. These attacks exploit LLMs' difficulty in dynamically detecting harmful intents during the generation process. Traditional safety alignment methods, often relying on the initial few generation steps, are ineffective due to limited computational budget. This paper proposes DEEPALIGN, a robust defense framework that fine-tunes LLMs to progressively detoxify generated content, significantly improving both the computational budget and effectiveness of mitigating harmful generation. Our approach uses a hybrid loss function operating on hidden states to directly improve LLMs' inherent awareness of toxity during generation. Furthermore, we redefine safe responses by generating semantically relevant answers to harmful queries, thereby increasing robustness against representation-mutation attacks. Evaluations across multiple LLMs demonstrate state-of-the-art defense performance against six different attack types, reducing Attack Success Rates by up to two orders of magnitude compared to previous state-of-the-art defense while preserving utility. This work advances LLM safety by addressing limitations of conventional alignment through dynamic, context-aware mitigation.

Paperid: 850, https://arxiv.org/pdf/2503.02118.pdf

Abstract:
An increase in availability of Software Defined Radios (SDRs) has caused a dramatic shift in the threat landscape of legacy satellite systems, opening them up to easy spoofing attacks by low-budget adversaries. Physical-layer authentication methods can help improve the security of these systems by providing additional validation without modifying the space segment. This paper extends previous research on Radio Frequency Fingerprinting (RFF) of satellite communication to the Orbcomm satellite formation. The GPS and Iridium constellations are already well covered in prior research, but the feasibility of transferring techniques to other formations has not yet been examined, and raises previously undiscussed challenges. In this paper, we collect a novel dataset containing 8992474 packets from the Orbcom satellite constellation using different SDRs and locations. We use this dataset to train RFF systems based on convolutional neural networks. We achieve an ROC AUC score of 0.53 when distinguishing different satellites within the constellation, and 0.98 when distinguishing legitimate satellites from SDRs in a spoofing scenario. We also demonstrate the possibility of mixing datasets using different SDRs in different physical locations.

Paperid: 851, https://arxiv.org/pdf/2503.00581.pdf

Abstract:
A key operation in federated learning is the aggregation of gradient vectors generated by individual client nodes. We develop a method based on multiparty homomorphic encryption (MPHE) that enables the central node to compute this aggregate, while receiving only encrypted version of each individual gradients. Towards this end, we extend classical MPHE methods so that the decryption of the aggregate vector can be successful even when only a subset of client nodes are available. This is accomplished by introducing a secret-sharing step during the setup phase of MPHE when the public encryption key is generated. We develop conditions on the parameters of the MPHE scheme that guarantee correctness of decryption and (computational) security. We explain how our method can be extended to accommodate client nodes that do not participate during the setup phase. We also propose a compression scheme for gradient vectors at each client node that can be readily combined with our MPHE scheme and perform the associated convergence analysis. We discuss the advantages of our proposed scheme with other approaches based on secure multi-party computation. Finally we discuss a practical implementation of our system, compare the performance of our system with different approaches, and demonstrate that by suitably combining compression with encryption the overhead over baseline schemes is rather small.

Paperid: 852, https://arxiv.org/pdf/2502.18535.pdf

Abstract:
As machine learning technologies advance rapidly across various domains, concerns over data privacy and model security have grown significantly. These challenges are particularly pronounced when models are trained and deployed on cloud platforms or third-party servers due to the computational resource limitations of users' end devices. In response, zero-knowledge proof (ZKP) technology has emerged as a promising solution, enabling effective validation of model performance and authenticity in both training and inference processes without disclosing sensitive data. Thus, ZKP ensures the verifiability and security of machine learning models, making it a valuable tool for privacy-preserving AI. Although some research has explored the verifiable machine learning solutions that exploit ZKP, a comprehensive survey and summary of these efforts remain absent. This survey paper aims to bridge this gap by reviewing and analyzing all the existing Zero-Knowledge Machine Learning (ZKML) research from June 2017 to December 2024. We begin by introducing the concept of ZKML and outlining its ZKP algorithmic setups under three key categories: verifiable training, verifiable inference, and verifiable testing. Next, we provide a comprehensive categorization of existing ZKML research within these categories and analyze the works in detail. Furthermore, we explore the implementation challenges faced in this field and discuss the improvement works to address these obstacles. Additionally, we highlight several commercial applications of ZKML technology. Finally, we propose promising directions for future advancements in this domain.

Paperid: 853, https://arxiv.org/pdf/2502.08055.pdf

Abstract:
Federated Learning (FL) enables collaborative model training while keeping client data private. However, exposing individual client updates makes FL vulnerable to reconstruction attacks. Secure aggregation mitigates such privacy risks but prevents the server from verifying the validity of each client update, creating a privacy-robustness tradeoff. Recent efforts attempt to address this tradeoff by enforcing checks on client updates using zero-knowledge proofs, but they support limited predicates and often depend on public validation data. We propose SLVR, a general framework that securely leverages clients' private data through secure multi-party computation. By utilizing clients' data, SLVR not only eliminates the need for public validation data, but also enables a wider range of checks for robustness, including cross-client accuracy validation. It also adapts naturally to distribution shifts in client data as it can securely refresh its validation data up-to-date. Our empirical evaluations show that SLVR improves robustness against model poisoning attacks, particularly outperforming existing methods by up to 50% under adaptive attacks. Additionally, SLVR demonstrates effective adaptability and stable convergence under various distribution shift scenarios.

Paperid: 854, https://arxiv.org/pdf/2502.06657.pdf

Abstract:
The advance of quantum computing poses a significant threat to classical cryptography, compromising the security of current encryption schemes such as RSA and ECC. In response to this challenge, two main approaches have emerged: quantum cryptography and post-quantum cryptography (PQC). However, both have implementation and security limitations. In this paper, we propose a secure key distribution protocol for Quantum Key Distribution Networks (QKDN), which incorporates encapsulation techniques in the key-relay model for QKDN inspired by onion routing and combined with PQC to guarantee confidentiality, integrity, authenticity and anonymity in communication. The proposed protocol optimizes security by using post-quantum public key encryption to protect the shared secrets from intermediate nodes in the QKDN, thereby reducing the risk of attacks by malicious intermediaries. Finally, relevant use cases are presented, such as critical infrastructure networks, interconnection of data centers and digital money, demonstrating the applicability of the proposal in critical high-security environments.

Paperid: 855, https://arxiv.org/pdf/2502.05425.pdf

Abstract:
Intelligent transportation systems (ITS) use advanced technologies such as artificial intelligence to significantly improve traffic flow management efficiency, and promote the intelligent development of the transportation industry. However, if the data in ITS is attacked, such as tampering or forgery, it will endanger public safety and cause social losses. Therefore, this paper proposes a watermarking that can verify the integrity of copyright in response to the needs of ITS, termed ITSmark. ITSmark focuses on functions such as extracting watermarks, verifying permission, and tracing tampered locations. The scheme uses the copyright information to build the multi-bit space and divides this space into multiple segments. These segments will be assigned to tokens. Thus, the next token is determined by its segment which contains the copyright. In this way, the obtained data contains the custom watermark. To ensure the authorization, key parameters are encrypted during copyright embedding to obtain cipher data. Only by possessing the correct cipher data and private key, can the user entirely extract the watermark. Experiments show that ITSmark surpasses baseline performances in data quality, extraction accuracy, and unforgeability. It also shows unique capabilities of permission verification and tampered location tracing, which ensures the security of extraction and the reliability of copyright verification. Furthermore, ITSmark can also customize the watermark embedding position and proportion according to user needs, making embedding more flexible.

Paperid: 856, https://arxiv.org/pdf/2502.01386.pdf

Abstract:
Retrieval-Augmented Generation (RAG) systems based on Large Language Models (LLMs) have become essential for tasks such as question answering and content generation. However, their increasing impact on public opinion and information dissemination has made them a critical focus for security research due to inherent vulnerabilities. Previous studies have predominantly addressed attacks targeting factual or single-query manipulations. In this paper, we address a more practical scenario: topic-oriented adversarial opinion manipulation attacks on RAG models, where LLMs are required to reason and synthesize multiple perspectives, rendering them particularly susceptible to systematic knowledge poisoning. Specifically, we propose Topic-FlipRAG, a two-stage manipulation attack pipeline that strategically crafts adversarial perturbations to influence opinions across related queries. This approach combines traditional adversarial ranking attack techniques and leverages the extensive internal relevant knowledge and reasoning capabilities of LLMs to execute semantic-level perturbations. Experiments show that the proposed attacks effectively shift the opinion of the model's outputs on specific topics, significantly impacting user information perception. Current mitigation methods cannot effectively defend against such attacks, highlighting the necessity for enhanced safeguards for RAG systems, and offering crucial insights for LLM security research.

Paperid: 857, https://arxiv.org/pdf/2501.18837.pdf

Authors:Mrinank Sharma, Meg Tong, Jesse Mu, Jerry Wei, Jorrit Kruthoff, Scott Goodfriend, Euan Ong, Alwin Peng, Raj Agarwal, Cem Anil, Amanda Askell, Nathan Bailey, Joe Benton, Emma Bluemke, Samuel R. Bowman, Eric Christiansen, Hoagy Cunningham, Andy Dau, Anjali Gopal, Rob Gilson, Logan Graham, Logan Howard, Nimit Kalra, Taesung Lee, Kevin Lin, Peter Lofgren, Francesco Mosconi, Clare O'Hara, Catherine Olsson, Linda Petrini, Samir Rajani, Nikhil Saxena, Alex Silverstein, Tanya Singh, Theodore Sumers, Leonard Tang, Kevin K. Troy, Constantin Weisser, Ruiqi Zhong, Giulio Zhou, Jan Leike, Jared Kaplan, Ethan Perez

Abstract:
Large language models (LLMs) are vulnerable to universal jailbreaks-prompting strategies that systematically bypass model safeguards and enable users to carry out harmful processes that require many model interactions, like manufacturing illegal substances at scale. To defend against these attacks, we introduce Constitutional Classifiers: safeguards trained on synthetic data, generated by prompting LLMs with natural language rules (i.e., a constitution) specifying permitted and restricted content. In over 3,000 estimated hours of red teaming, no red teamer found a universal jailbreak that could extract information from an early classifier-guarded LLM at a similar level of detail to an unguarded model across most target queries. On automated evaluations, enhanced classifiers demonstrated robust defense against held-out domain-specific jailbreaks. These classifiers also maintain deployment viability, with an absolute 0.38% increase in production-traffic refusals and a 23.7% inference overhead. Our work demonstrates that defending against universal jailbreaks while maintaining practical deployment viability is tractable.

Paperid: 858, https://arxiv.org/pdf/2501.17845.pdf

Abstract:
We consider the private information retrieval (PIR) problem for a multigraph-based replication system, where each set of $r$ files is stored on two of the servers according to an underlying $r$-multigraph. Our goal is to establish upper and lower bounds on the PIR capacity of the $r$-multigraph. Specifically, we first propose a construction for multigraph-based PIR systems that leverages the symmetry of the underlying graph-based PIR scheme, deriving a capacity lower bound for such multigraphs. Then, we establish a general upper bound using linear programming, expressed as a function of the underlying graph parameters. Our bounds are demonstrated to be tight for PIR systems on multipaths for even number of vertices.

Paperid: 859, https://arxiv.org/pdf/2501.10060.pdf

Abstract:
Nowadays, 5G architecture is characterized by the use of monolithic hardware, where the configuration of its elements is completely proprietary for each manufacturer. In recent years, as an alternative to this centralized architecture, a new model has emerged: the Open Radio Access Network (Open RAN). One of its main features has been the split of the Base Band Unit (BBU) into new simpler hardware with more specific functions approaching to a more modular model. As a consequence of this split, new interfaces appeared to connect these components that need to be protected. With the developments in the field of quantum computing, traditional protection mechanisms for this kind of interfaces may be deprecated in the near future. This security issue motivates this paper, which aims to study how to integrate post-quantum cryptography (PQC) mechanisms to current security standards, such as IPsec and MACsec. In addition, the proposal is also put into practice to compare the performance of traditional mechanisms with PQC implementations. This research shows that the new implementation does not reduce the performance of the aforementioned standards, while the security is reinforced against quantum attacks.

Paperid: 860, https://arxiv.org/pdf/2501.08137.pdf

Abstract:
This paper proposes an audio-visual deepfake detection approach that aims to capture fine-grained temporal inconsistencies between audio and visual modalities. To achieve this, both architectural and data synthesis strategies are introduced. From an architectural perspective, a temporal distance map, coupled with an attention mechanism, is designed to capture these inconsistencies while minimizing the impact of irrelevant temporal subsequences. Moreover, we explore novel pseudo-fake generation techniques to synthesize local inconsistencies. Our approach is evaluated against state-of-the-art methods using the DFDC and FakeAVCeleb datasets, demonstrating its effectiveness in detecting audio-visual deepfakes.

Paperid: 861, https://arxiv.org/pdf/2506.22311.pdf

Abstract:
Pressure sensors are an integrated component of modern Heating, Ventilation, and Air Conditioning (HVAC) systems. As these pressure sensors operate within the 0-10 Pa range, support high sampling frequencies of 0.5-2 kHz, and are often placed close to human proximity, they can be used to eavesdrop on confidential conversation, since human speech has a similar audible range of 0-10 Pa and a bandwidth of 4 kHz for intelligible quality. This paper presents WaLi, which reconstructs intelligible speech from the low-resolution and noisy pressure sensor data by providing the following technical contributions: (i) WaLi reconstructs intelligible speech from a minimum of 0.5 kHz sampling frequency of pressure sensors, whereas previous work can only detect hot words/phrases. WaLi uses complex-valued conformer and Complex Global Attention Block (CGAB) to capture inter-phoneme and intra-phoneme dependencies that exist in the low-resolution pressure sensor data. (ii) WaLi handles the transient noise injected from HVAC fans and duct vibrations, by reconstructing both the clean magnitude and phase of the missing frequencies of the low-frequency aliased components. Extensive measurement studies on real-world pressure sensors show an LSD of 1.24 and NISQA-MOS of 1.78 for 0.5 kHz to 8 kHz upsampling. We believe that such levels of accuracy pose a significant threat when viewed from a privacy perspective that has not been addressed before for pressure sensors.

Paperid: 862, https://arxiv.org/pdf/2506.19360.pdf

Abstract:
Advances in generative models have transformed the field of synthetic image generation for privacy-preserving data synthesis (PPDS). However, the field lacks a comprehensive survey and comparison of synthetic image generation methods across diverse settings. In particular, when we generate synthetic images for the purpose of training a classifier, there is a pipeline of generation-sampling-classification which takes private training as input and outputs the final classifier of interest. In this survey, we systematically categorize existing image synthesis methods, privacy attacks, and mitigations along this generation-sampling-classification pipeline. To empirically compare diverse synthesis approaches, we provide a benchmark with representative generative methods and use model-agnostic membership inference attacks (MIAs) as a measure of privacy risk. Through this study, we seek to answer critical questions in PPDS: Can synthetic data effectively replace real data? Which release strategy balances utility and privacy? Do mitigations improve the utility-privacy tradeoff? Which generative models perform best across different scenarios? With a systematic evaluation of diverse methods, our study provides actionable insights into the utility-privacy tradeoffs of synthetic data generation methods and guides the decision on optimal data releasing strategies for real-world applications.

Paperid: 863, https://arxiv.org/pdf/2506.17299.pdf

Abstract:
As large language models (LLMs) become increasingly deployed in safety-critical applications, the lack of systematic methods to assess their vulnerability to jailbreak attacks presents a critical security gap. We introduce the jailbreak oracle problem: given a model, prompt, and decoding strategy, determine whether a jailbreak response can be generated with likelihood exceeding a specified threshold. This formalization enables a principled study of jailbreak vulnerabilities. Answering the jailbreak oracle problem poses significant computational challenges -- the search space grows exponentially with the length of the response tokens. We present Boa, the first efficient algorithm for solving the jailbreak oracle problem. Boa employs a three-phase search strategy: (1) constructing block lists to identify refusal patterns, (2) breadth-first sampling to identify easily accessible jailbreaks, and (3) depth-first priority search guided by fine-grained safety scores to systematically explore promising low-probability paths. Boa enables rigorous security assessments including systematic defense evaluation, standardized comparison of red team attacks, and model certification under extreme adversarial conditions.

Paperid: 864, https://arxiv.org/pdf/2506.14489.pdf

Abstract:
ReDash extends Dash's arithmetic garbled circuits to provide a more flexible and efficient framework for secure outsourced inference. By introducing a novel garbled scaling gadget based on a generalized base extension for the residue number system, ReDash removes Dash's limitation of scaling exclusively by powers of two. This enables arbitrary scaling factors drawn from the residue number system's modular base, allowing for tailored quantization schemes and more efficient model evaluation. Through the new $\text{ScaleQuant}^+$ quantization mechanism, ReDash supports optimized modular bases that can significantly reduce the overhead of arithmetic operations during convolutional neural network inference. ReDash achieves up to a 33-fold speedup in overall inference time compared to Dash Despite these enhancements, ReDash preserves the robust security guarantees of arithmetic garbling. By delivering both performance gains and quantization flexibility, ReDash expands the practicality of garbled convolutional neural network inference.

Paperid: 865, https://arxiv.org/pdf/2506.12454.pdf

Abstract:
What fundamentally distinguishes an adversarial attack from a misclassification due to limited model expressivity or finite data? In this work, we investigate this question in the setting of high-dimensional binary classification, where statistical effects due to limited data availability play a central role. We introduce a new error metric that precisely capture this distinction, quantifying model vulnerability to consistent adversarial attacks -- perturbations that preserve the ground-truth labels. Our main technical contribution is an exact and rigorous asymptotic characterization of these metrics in both well-specified models and latent space models, revealing different vulnerability patterns compared to standard robust error measures. The theoretical results demonstrate that as models become more overparameterized, their vulnerability to label-preserving perturbations grows, offering theoretical insight into the mechanisms underlying model sensitivity to adversarial attacks.

Paperid: 866, https://arxiv.org/pdf/2506.12202.pdf

Abstract:
Modern large language models (LLMs) are often deployed as agents, calling external tools adaptively to solve tasks. Rather than directly calling tools, it can be more effective for LLMs to write code to perform the tool calls, enabling them to automatically generate complex control flow such as conditionals and loops. Such code actions are typically provided as Python code, since LLMs are quite proficient at it; however, Python may not be the ideal language due to limited built-in support for performance, security, and reliability. We propose a novel programming language for code actions, called Quasar, which has several benefits: (1) automated parallelization to improve performance, (2) uncertainty quantification to improve reliability and mitigate hallucinations, and (3) security features enabling the user to validate actions. LLMs can write code in a subset of Python, which is automatically transpiled to Quasar. We evaluate our approach on the ViperGPT visual question answering agent, applied to the GQA dataset, demonstrating that LLMs with Quasar actions instead of Python actions retain strong performance, while reducing execution time when possible by 42%, improving security by reducing user approval interactions when possible by 52%, and improving reliability by applying conformal prediction to achieve a desired target coverage level.

Paperid: 867, https://arxiv.org/pdf/2506.11094.pdf

Abstract:
With the rapid advancement of artificial intelligence technology, Large Language Models (LLMs) have demonstrated remarkable potential in the field of Natural Language Processing (NLP), including areas such as content generation, human-computer interaction, machine translation, and code generation, among others. However, their widespread deployment has also raised significant safety concerns. In recent years, LLM-generated content has occasionally exhibited unsafe elements like toxicity and bias, particularly in adversarial scenarios, which has garnered extensive attention from both academia and industry. While numerous efforts have been made to evaluate the safety risks associated with LLMs, there remains a lack of systematic reviews summarizing these research endeavors. This survey aims to provide a comprehensive and systematic overview of recent advancements in LLMs safety evaluation, focusing on several key aspects: (1) "Why evaluate" that explores the background of LLMs safety evaluation, how they differ from general LLMs evaluation, and the significance of such evaluation; (2) "What to evaluate" that examines and categorizes existing safety evaluation tasks based on key capabilities, including dimensions such as toxicity, robustness, ethics, bias and fairness, truthfulness, and so on; (3) "Where to evaluate" that summarizes the evaluation metrics, datasets and benchmarks currently used in safety evaluations; (4) "How to evaluate" that reviews existing evaluation toolkit, and categorizing mainstream evaluation methods based on the roles of the evaluators. Finally, we identify the challenges in LLMs safety evaluation and propose potential research directions to promote further advancement in this field. We emphasize the importance of prioritizing LLMs safety evaluation to ensure the safe deployment of these models in real-world applications.

Paperid: 868, https://arxiv.org/pdf/2506.09803.pdf

Abstract:
Graph neural networks (GNNs) have achieved significant success in graph representation learning and have been applied to various domains. However, many real-world graphs contain sensitive personal information, such as user profiles in social networks, raising serious privacy concerns when graph learning is performed using GNNs. To address this issue, locally private graph learning protocols have gained considerable attention. These protocols leverage the privacy advantages of local differential privacy (LDP) and the effectiveness of GNN's message-passing in calibrating noisy data, offering strict privacy guarantees for users' local data while maintaining high utility (e.g., node classification accuracy) for graph learning. Despite these advantages, such protocols may be vulnerable to data poisoning attacks, a threat that has not been considered in previous research. Identifying and addressing these threats is crucial for ensuring the robustness and security of privacy-preserving graph learning frameworks. This work introduces the first data poisoning attack targeting locally private graph learning protocols. The attacker injects fake users into the protocol, manipulates these fake users to establish links with genuine users, and sends carefully crafted data to the server, ultimately compromising the utility of private graph learning. The effectiveness of the attack is demonstrated both theoretically and empirically. In addition, several defense strategies have also been explored, but their limited effectiveness highlights the need for more robust defenses.

Paperid: 869, https://arxiv.org/pdf/2506.08482.pdf

Abstract:
Numerous methods have been proposed to generate physical adversarial patches (PAPs) against real-world machine learning systems. However, each existing PAP typically supports only a single, fixed attack goal, and switching to a different objective requires re-generating and re-deploying a new PAP. This rigidity limits their practicality in dynamic environments like autonomous driving, where traffic conditions and attack goals can change rapidly. For example, if no obstacles are present around the target vehicle, the attack may fail to cause meaningful consequences. To overcome this limitation, we propose SwitchPatch, a novel PAP that is static yet enables dynamic and controllable attack outcomes based on real-time scenarios. Attackers can alter pre-defined conditions, e.g., by projecting different natural-color lights onto SwitchPatch to seamlessly switch between attack goals. Unlike prior work, SwitchPatch does not require re-generation or re-deployment for different objectives, significantly reducing cost and complexity. Furthermore, SwitchPatch remains benign when the enabling conditions are absent, enhancing its stealth. We evaluate SwitchPatch on two key tasks: traffic sign recognition (classification and detection) and depth estimation. First, we conduct theoretical analysis and empirical studies to demonstrate the feasibility of SwitchPatch and explore how many goals it can support using techniques like color light projection and occlusion. Second, we perform simulation-based experiments and ablation studies to verify its effectiveness and transferability. Third, we conduct outdoor tests using a Unmanned Ground Vehicle (UGV) to confirm its robustness in the physical world. Overall, SwitchPatch introduces a flexible and practical adversarial strategy that can be adapted to diverse tasks and real-world conditions.

Paperid: 870, https://arxiv.org/pdf/2506.07031.pdf

Abstract:
Emerging Large Reasoning Models (LRMs) consistently excel in mathematical and reasoning tasks, showcasing exceptional capabilities. However, the enhancement of reasoning abilities and the exposure of their internal reasoning processes introduce new safety vulnerabilities. One intriguing concern is: when reasoning is strongly entangled with harmfulness, what safety-reasoning trade-off do LRMs exhibit? To address this issue, we introduce HauntAttack, a novel and general-purpose black-box attack framework that systematically embeds harmful instructions into reasoning questions. Specifically, we treat reasoning questions as carriers and substitute one of their original conditions with a harmful instruction. This process creates a reasoning pathway in which the model is guided step by step toward generating unsafe outputs. Based on HauntAttack, we conduct comprehensive experiments on multiple LRMs. Our results reveal that even the most advanced LRMs exhibit significant safety vulnerabilities. Additionally, we perform a detailed analysis of different models, various types of harmful instructions, and model output patterns, providing valuable insights into the security of LRMs.

Paperid: 871, https://arxiv.org/pdf/2506.01307.pdf

Abstract:
Large Language Models (LLMs) have evolved into Multimodal Large Language Models (MLLMs), significantly enhancing their capabilities by integrating visual information and other types, thus aligning more closely with the nature of human intelligence, which processes a variety of data forms beyond just text. Despite advancements, the undesirable generation of these models remains a critical concern, particularly due to vulnerabilities exposed by text-based jailbreak attacks, which have represented a significant threat by challenging existing safety protocols. Motivated by the unique security risks posed by the integration of new and old modalities for MLLMs, we propose a unified multimodal universal jailbreak attack framework that leverages iterative image-text interactions and transfer-based strategy to generate a universal adversarial suffix and image. Our work not only highlights the interaction of image-text modalities can be used as a critical vulnerability but also validates that multimodal universal jailbreak attacks can bring higher-quality undesirable generations across different MLLMs. We evaluate the undesirable context generation of MLLMs like LLaVA, Yi-VL, MiniGPT4, MiniGPT-v2, and InstructBLIP, and reveal significant multimodal safety alignment issues, highlighting the inadequacy of current safety mechanisms against sophisticated multimodal attacks. This study underscores the urgent need for robust safety measures in MLLMs, advocating for a comprehensive review and enhancement of security protocols to mitigate potential risks associated with multimodal capabilities.

Paperid: 872, https://arxiv.org/pdf/2506.00373.pdf

Abstract:
Passwords remain one of the most common methods for securing sensitive data in the digital age. However, weak password choices continue to pose significant risks to data security and privacy. This study aims to solve the problem by focusing on developing robust password strength estimation models using adversarial machine learning, a technique that trains models on intentionally crafted deceptive passwords to expose and address vulnerabilities posed by such passwords. We apply five classification algorithms and use a dataset with more than 670,000 samples of adversarial passwords to train the models. Results demonstrate that adversarial training improves password strength classification accuracy by up to 20% compared to traditional machine learning models. It highlights the importance of integrating adversarial machine learning into security systems to enhance their robustness against modern adaptive threats. Keywords: adversarial attack, password strength, classification, machine learning

Paperid: 873, https://arxiv.org/pdf/2505.23682.pdf

Abstract:
The turnstile continual release model of differential privacy captures scenarios where a privacy-preserving real-time analysis is sought for a dataset evolving through additions and deletions. In typical applications of real-time data analysis, both the length of the stream $T$ and the size of the universe $|U|$ from which data come can be extremely large. This motivates the study of private algorithms in the turnstile setting using space sublinear in both $T$ and $|U|$. In this paper, we give the first sublinear space differentially private algorithms for the fundamental problem of counting distinct elements in the turnstile streaming model. Our algorithm achieves, on arbitrary streams, $\tilde{O}_Î·(T^{1/3})$ space and additive error, and a $(1+Î·)$-relative approximation for all $Î·\in (0,1)$. Our result significantly improves upon the space requirements of the state-of-the-art algorithms for this problem, which is linear, approaching the known $Î©(T^{1/4})$ additive error lower bound for arbitrary streams. Moreover, when a bound $W$ on the number of times an item appears in the stream is known, our algorithm provides $\tilde{O}_Î·(\sqrt{W})$ additive error, using $\tilde{O}_Î·(\sqrt{W})$ space. This additive error asymptotically matches that of prior work which required instead linear space. Our results address an open question posed by [Jain, Kalemaj, Raskhodnikova, Sivakumar, Smith, Neurips23] about designing low-memory mechanisms for this problem. We complement these results with a space lower bound for this problem, which shows that any algorithm that uses similar techniques must use space $\tildeÎ©(T^{1/3})$ on arbitrary streams.

Paperid: 874, https://arxiv.org/pdf/2505.23397.pdf

Abstract:
This article presents a structured framework for Human-AI collaboration in Security Operations Centers (SOCs), integrating AI autonomy, trust calibration, and Human-in-the-loop decision making. Existing frameworks in SOCs often focus narrowly on automation, lacking systematic structures to manage human oversight, trust calibration, and scalable autonomy with AI. Many assume static or binary autonomy settings, failing to account for the varied complexity, criticality, and risk across SOC tasks considering Humans and AI collaboration. To address these limitations, we propose a novel autonomy tiered framework grounded in five levels of AI autonomy from manual to fully autonomous, mapped to Human-in-the-Loop (HITL) roles and task-specific trust thresholds. This enables adaptive and explainable AI integration across core SOC functions, including monitoring, protection, threat detection, alert triage, and incident response. The proposed framework differentiates itself from previous research by creating formal connections between autonomy, trust, and HITL across various SOC levels, which allows for adaptive task distribution according to operational complexity and associated risks. The framework is exemplified through a simulated cyber range that features the cybersecurity AI-Avatar, a fine-tuned LLM-based SOC assistant. The AI-Avatar case study illustrates human-AI collaboration for SOC tasks, reducing alert fatigue, enhancing response coordination, and strategically calibrating trust. This research systematically presents both the theoretical and practical aspects and feasibility of designing next-generation cognitive SOCs that leverage AI not to replace but to enhance human decision-making.

Paperid: 875, https://arxiv.org/pdf/2505.08234.pdf

Abstract:
As AI-generated imagery becomes ubiquitous, invisible watermarks have emerged as a primary line of defense for copyright and provenance. The newest watermarking schemes embed semantic signals - content-aware patterns that are designed to survive common image manipulations - yet their true robustness against adaptive adversaries remains under-explored. We expose a previously unreported vulnerability and introduce SemanticRegen, a three-stage, label-free attack that erases state-of-the-art semantic and invisible watermarks while leaving an image's apparent meaning intact. Our pipeline (i) uses a vision-language model to obtain fine-grained captions, (ii) extracts foreground masks with zero-shot segmentation, and (iii) inpaints only the background via an LLM-guided diffusion model, thereby preserving salient objects and style cues. Evaluated on 1,000 prompts across four watermarking systems - TreeRing, StegaStamp, StableSig, and DWT/DCT - SemanticRegen is the only method to defeat the semantic TreeRing watermark (p = 0.10 > 0.05) and reduces bit-accuracy below 0.75 for the remaining schemes, all while maintaining high perceptual quality (masked SSIM = 0.94 +/- 0.01). We further introduce masked SSIM (mSSIM) to quantify fidelity within foreground regions, showing that our attack achieves up to 12 percent higher mSSIM than prior diffusion-based attackers. These results highlight an urgent gap between current watermark defenses and the capabilities of adaptive, semantics-aware adversaries, underscoring the need for watermarking algorithms that are resilient to content-preserving regenerative attacks.

Paperid: 876, https://arxiv.org/pdf/2505.01874.pdf

Abstract:
Resilience against malicious participants and data privacy are essential for trustworthy federated learning, yet achieving both with good utility typically requires the strong assumption of a trusted central server. This paper shows that a significantly weaker assumption suffices: each pair of participants shares a randomness seed unknown to others. In a setting where malicious participants may collude with an untrusted server, we propose CafCor, an algorithm that integrates robust gradient aggregation with correlated noise injection, using shared randomness between participants. We prove that CafCor achieves strong privacy-utility trade-offs, significantly outperforming local differential privacy (DP) methods, which do not make any trust assumption, while approaching central DP utility, where the server is fully trusted. Empirical results on standard benchmarks validate CafCor's practicality, showing that privacy and robustness can coexist in distributed systems without sacrificing utility or trusting the server.

Paperid: 877, https://arxiv.org/pdf/2505.01292.pdf

Abstract:
Local Differential Privacy (LDP) enables massive data collection and analysis while protecting end users' privacy against untrusted aggregators. It has been applied to various data types (e.g., categorical, numerical, and graph data) and application settings (e.g., static and streaming). Recent findings indicate that LDP protocols can be easily disrupted by poisoning or manipulation attacks, which leverage injected/corrupted fake users to send crafted data conforming to the LDP reports. However, current attacks primarily target static protocols, neglecting the security of LDP protocols in the streaming settings. Our research fills the gap by developing novel fine-grained manipulation attacks to LDP protocols for data streams. By reviewing the attack surfaces in existing algorithms, We introduce a unified attack framework with composable modules, which can manipulate the LDP estimated stream toward a target stream. Our attack framework can adapt to state-of-the-art streaming LDP algorithms with different analytic tasks (e.g., frequency and mean) and LDP models (event-level, user-level, w-event level). We validate our attacks theoretically and through extensive experiments on real-world datasets, and finally explore a possible defense mechanism for mitigating these attacks.

Paperid: 878, https://arxiv.org/pdf/2504.21034.pdf

Abstract:
Large Language Model (LLM)-based agents increasingly interact, collaborate, and delegate tasks to one another autonomously with minimal human interaction. Industry guidelines for agentic system governance emphasize the need for users to maintain comprehensive control over their agents, mitigating potential damage from malicious agents. Several proposed agentic system designs address agent identity, authorization, and delegation, but remain purely theoretical, without concrete implementation and evaluation. Most importantly, they do not provide user-controlled agent management. To address this gap, we propose SAGA, a scalable Security Architecture for Governing Agentic systems, that offers user oversight over their agents' lifecycle. In our design, users register their agents with a central entity, the Provider, that maintains agent contact information, user-defined access control policies, and helps agents enforce these policies on inter-agent communication. We introduce a cryptographic mechanism for deriving access control tokens, that offers fine-grained control over an agent's interaction with other agents, providing formal security guarantees. We evaluate SAGA on several agentic tasks, using agents in different geolocations, and multiple on-device and cloud LLMs, demonstrating minimal performance overhead with no impact on underlying task utility in a wide range of conditions. Our architecture enables secure and trustworthy deployment of autonomous agents, accelerating the responsible adoption of this technology in sensitive environments.

Paperid: 879, https://arxiv.org/pdf/2504.20904.pdf

Abstract:
Interpretable malware detection is crucial for understanding harmful behaviors and building trust in automated security systems. Traditional explainable methods for Graph Neural Networks (GNNs) often highlight important regions within a graph but fail to associate them with known benign or malicious behavioral patterns. This limitation reduces their utility in security contexts, where alignment with verified prototypes is essential. In this work, we introduce a novel dual prototype-driven explainable framework that interprets GNN-based malware detection decisions. This dual explainable framework integrates a base explainer (a state-of-the-art explainer) with a novel second-level explainer which is designed by subgraph matching technique, called SubMatch explainer. The proposed explainer assigns interpretable scores to nodes based on their association with matched subgraphs, offering a fine-grained distinction between benign and malicious regions. This prototype-guided scoring mechanism enables more interpretable, behavior-aligned explanations. Experimental results demonstrate that our method preserves high detection performance while significantly improving interpretability in malware analysis.

Paperid: 880, https://arxiv.org/pdf/2504.19440.pdf

Abstract:
Safety and security remain critical concerns in AI deployment. Despite safety training through reinforcement learning with human feedback (RLHF) [ 32], language models remain vulnerable to jailbreak attacks that bypass safety guardrails. Universal jailbreaks - prefixes that can circumvent alignment for any payload - are particularly concerning. We show empirically that jailbreak detection systems face distribution shift, with detectors trained at one point in time performing poorly against newer exploits. To study this problem, we release JailbreaksOverTime, a comprehensive dataset of timestamped real user interactions containing both benign requests and jailbreak attempts collected over 10 months. We propose a two-pronged method for defenders to detect new jailbreaks and continuously update their detectors. First, we show how to use continuous learning to detect jailbreaks and adapt rapidly to new emerging jailbreaks. While detectors trained at a single point in time eventually fail due to drift, we find that universal jailbreaks evolve slowly enough for self-training to be effective. Retraining our detection model weekly using its own labels - with no new human labels - reduces the false negative rate from 4% to 0.3% at a false positive rate of 0.1%. Second, we introduce an unsupervised active monitoring approach to identify novel jailbreaks. Rather than classifying inputs directly, we recognize jailbreaks by their behavior, specifically, their ability to trigger models to respond to known-harmful prompts. This approach has a higher false negative rate (4.1%) than supervised methods, but it successfully identified some out-of-distribution attacks that were missed by the continuous learning approach.

Paperid: 881, https://arxiv.org/pdf/2504.17785.pdf

Abstract:
Outsourcing ML training to cloud-service-providers presents a compelling opportunity for resource constrained clients, while it simultaneously bears inherent privacy risks. We introduce Silenuio, the first fully non-interactive outsourcing scheme for the training of MLPs that achieves 128bit security using FHE (precisely TFHE). Unlike traditional MPC-based protocols that necessitate interactive communication between the client and server(s) or non-collusion assumptions among multiple servers, Silenzio enables the "fire-and-forget" paradigm without such assumptions. In this approach, the client encrypts the training data once, and the server performs the training without any further interaction. Silenzio operates entirely over low-bitwidth integer to mitigate the computational overhead inherent to FHE. Our approach features a novel low-bitwidth matrix multiplication gadget that leverages input-dependent residue number systems, ensuring that no intermediate value overflows 8bit. Starting from an RNS-to-MRNS conversion process, we propose an efficient block-scaling mechanism, which approximately shifts encrypted tensor values to their user-specified most significant bits. To instantiate the backpropagation of the error, Silenzio introduces a low-bitwidth gradient computation for the cross-entropy loss. We evaluate Silenzio on standard MLP training tasks regarding runtime as well as model performance and achieve similar classification accuracy as MLPs trained using PyTorch with 32bit floating-point computations. Our open-source implementation of Silenzio represents a significant advancement in privacy-preserving ML, providing a new baseline for secure and non-interactive outsourced MLP training.

Paperid: 882, https://arxiv.org/pdf/2504.16316.pdf

Abstract:
Control Flow Graphs (CFGs) are critical for analyzing program execution and characterizing malware behavior. With the growing adoption of Graph Neural Networks (GNNs), CFG-based representations have proven highly effective for malware detection. This study proposes a novel framework that dynamically constructs CFGs and embeds node features using a hybrid approach combining rule-based encoding and autoencoder-based embedding. A GNN-based classifier is then constructed to detect malicious behavior from the resulting graph representations. To improve model interpretability, we apply state-of-the-art explainability techniques, including GNNExplainer, PGExplainer, and CaptumExplainer, the latter is utilized three attribution methods: Integrated Gradients, Guided Backpropagation, and Saliency. In addition, we introduce a novel aggregation method, called RankFusion, that integrates the outputs of the top-performing explainers to enhance the explanation quality. We also evaluate explanations using two subgraph extraction strategies, including the proposed Greedy Edge-wise Composition (GEC) method for improved structural coherence. A comprehensive evaluation using accuracy, fidelity, and consistency metrics demonstrates the effectiveness of the proposed framework in terms of accurate identification of malware samples and generating reliable and interpretable explanations.

Paperid: 883, https://arxiv.org/pdf/2504.15284.pdf

Abstract:
Code editing is a foundational task in software development, where its effectiveness depends on whether it introduces desired code property changes without changing the original code's intended functionality. Existing approaches often formulate code editing as an implicit end-to-end task, omitting the fact that code-editing procedures inherently consist of discrete and explicit steps. Thus, they suffer from suboptimal performance and lack of robustness and generalization. We introduce EditLord, a code editing framework that makes the code transformation steps explicit. Our key insight is to employ a language model (LM) as an inductive learner to extract code editing rules from the training code pairs as concise meta-rule sets. Such rule sets will be manifested for each training sample to augment them for finetuning or assist in prompting- and iterative-based code editing. EditLord outperforms the state-of-the-art by an average of 22.7% in editing performance and 58.1% in robustness while achieving 20.2% higher functional correctness across critical software engineering and security applications, LM models, and editing modes.

Paperid: 884, https://arxiv.org/pdf/2504.07543.pdf

Abstract:
Tor, a widely utilized privacy network, enables anonymous communication but is vulnerable to flow correlation attacks that deanonymize users by correlating traffic patterns from Tor's ingress and egress segments. Various defenses have been developed to mitigate these attacks; however, they have two critical limitations: (i) significant network overhead during obfuscation and (ii) a lack of dynamic obfuscation for egress segments, exposing traffic patterns to adversaries. In response, we introduce MUFFLER, a novel connection-level traffic obfuscation system designed to secure Tor egress traffic. It dynamically maps real connections to a distinct set of virtual connections between the final Tor nodes and targeted services, either public or hidden. This approach creates egress traffic patterns fundamentally different from those at ingress segments without adding intentional padding bytes or timing delays. The mapping of real and virtual connections is adjusted in real-time based on ongoing network conditions, thwarting adversaries' efforts to detect egress traffic patterns. Extensive evaluations show that MUFFLER mitigates powerful correlation attacks with a TPR of 1% at an FPR of 10^-2 while imposing only a 2.17% bandwidth overhead. Moreover, it achieves up to 27x lower latency overhead than existing solutions and seamlessly integrates with the current Tor architecture.

Paperid: 885, https://arxiv.org/pdf/2504.07041.pdf

Abstract:
Storage integrity is essential to systems and applications that use untrusted storage (e.g., public clouds, end-user devices). However, known methods for achieving storage integrity either suffer from high (and often prohibitive) overheads or provide weak integrity guarantees. In this work, we demonstrate a hybrid approach to storage integrity that simultaneously reduces overhead while providing strong integrity guarantees. Our system, partially asynchronous integrity checking (PAC), allows disk write commitments to be deferred while still providing guarantees around read integrity. PAC delivers a 5.5X throughput and latency improvement over the state of the art, and 85% of the throughput achieved by non-integrity-assuring approaches. In this way, we show that untrusted storage can be used for integrity-critical workloads without meaningfully sacrificing performance.

Paperid: 886, https://arxiv.org/pdf/2504.05652.pdf

Abstract:
With the increasingly deep integration of large language models (LLMs) across diverse domains, the effectiveness of their safety mechanisms is encountering severe challenges. Currently, jailbreak attacks based on prompt engineering have become a major safety threat. However, existing methods primarily rely on black-box manipulation of prompt templates, resulting in poor interpretability and limited generalization. To break through the bottleneck, this study first introduces the concept of Defense Threshold Decay (DTD), revealing the potential safety impact caused by LLMs' benign generation: as benign content generation in LLMs increases, the model's focus on input instructions progressively diminishes. Building on this insight, we propose the Sugar-Coated Poison (SCP) attack paradigm, which uses a "semantic reversal" strategy to craft benign inputs that are opposite in meaning to malicious intent. This strategy induces the models to generate extensive benign content, thereby enabling adversarial reasoning to bypass safety mechanisms. Experiments show that SCP outperforms existing baselines. Remarkably, it achieves an average attack success rate of 87.23% across six LLMs. For defense, we propose Part-of-Speech Defense (POSD), leveraging verb-noun dependencies for syntactic analysis to enhance safety of LLMs while preserving their generalization ability.

Paperid: 887, https://arxiv.org/pdf/2504.03770.pdf

Abstract:
Multimodal large language models (MLLMs) excel in vision-language tasks but also pose significant risks of generating harmful content, particularly through jailbreak attacks. Jailbreak attacks refer to intentional manipulations that bypass safety mechanisms in models, leading to the generation of inappropriate or unsafe content. Detecting such attacks is critical to ensuring the responsible deployment of MLLMs. Existing jailbreak detection methods face three primary challenges: (1) Many rely on model hidden states or gradients, limiting their applicability to white-box models, where the internal workings of the model are accessible; (2) They involve high computational overhead from uncertainty-based analysis, which limits real-time detection, and (3) They require fully labeled harmful datasets, which are often scarce in real-world settings. To address these issues, we introduce a test-time adaptive framework called JAILDAM. Our method leverages a memory-based approach guided by policy-driven unsafe knowledge representations, eliminating the need for explicit exposure to harmful data. By dynamically updating unsafe knowledge during test-time, our framework improves generalization to unseen jailbreak strategies while maintaining efficiency. Experiments on multiple VLM jailbreak benchmarks demonstrate that JAILDAM delivers state-of-the-art performance in harmful content detection, improving both accuracy and speed.

Paperid: 888, https://arxiv.org/pdf/2503.19519.pdf

Abstract:
Adversarial attacks in time series classification (TSC) models have recently gained attention due to their potential to compromise model robustness. Imperceptibility is crucial, as adversarial examples detected by the human vision system (HVS) can render attacks ineffective. Many existing methods fail to produce high-quality imperceptible examples, often generating perturbations with more perceptible low-frequency components, like square waves, and global perturbations that reduce stealthiness. This paper aims to improve the imperceptibility of adversarial attacks on TSC models by addressing frequency components and time series locality. We propose the Shapelet-based Frequency-domain Attack (SFAttack), which uses local perturbations focused on time series shapelets to enhance discriminative information and stealthiness. Additionally, we introduce a low-frequency constraint to confine perturbations to high-frequency components, enhancing imperceptibility.

Paperid: 889, https://arxiv.org/pdf/2503.18081.pdf

Abstract:
Model stealing attack is increasingly threatening the confidentiality of machine learning models deployed in the cloud. Recent studies reveal that adversaries can exploit data synthesis techniques to steal machine learning models even in scenarios devoid of real data, leading to data-free model stealing attacks. Existing defenses against such attacks suffer from limitations, including poor effectiveness, insufficient generalization ability, and low comprehensiveness. In response, this paper introduces a novel defense framework named Model-Guardian. Comprising two components, Data-Free Model Stealing Detector (DFMS-Detector) and Deceptive Predictions (DPreds), Model-Guardian is designed to address the shortcomings of current defenses with the help of the artifact properties of synthetic samples and gradient representations of samples. Extensive experiments on seven prevalent data-free model stealing attacks showcase the effectiveness and superior generalization ability of Model-Guardian, outperforming eleven defense methods and establishing a new state-of-the-art performance. Notably, this work pioneers the utilization of various GANs and diffusion models for generating highly realistic query samples in attacks, with Model-Guardian demonstrating accurate detection capabilities.

Paperid: 890, https://arxiv.org/pdf/2503.15916.pdf

Abstract:
Modular arithmetic, particularly modular reduction, is widely used in cryptographic applications such as homomorphic encryption (HE) and zero-knowledge proofs (ZKP). High-bit-width operations are crucial for enhancing security; however, they are computationally intensive due to the large number of modular operations required. The lookup-table-based (LUT-based) approach, a ``space-for-time'' technique, reduces computational load by segmenting the input number into smaller bit groups, pre-computing modular reduction results for each segment, and storing these results in LUTs. While effective, this method incurs significant hardware overhead due to extensive LUT usage. In this paper, we introduce ALLMod, a novel approach that improves the area efficiency of LUT-based large-number modular reduction by employing hybrid workloads. Inspired by the iterative method, ALLMod splits the bit groups into two distinct workloads, achieving lower area costs without compromising throughput. We first develop a template to facilitate workload splitting and ensure balanced distribution. Then, we conduct design space exploration to evaluate the optimal timing for fusing workload results, enabling us to identify the most efficient design under specific constraints. Extensive evaluations show that ALLMod achieves up to $1.65\times$ and $3\times$ improvements in area efficiency over conventional LUT-based methods for bit-widths of $128$ and $8,192$, respectively.

Paperid: 891, https://arxiv.org/pdf/2503.15551.pdf

Abstract:
Batch prompting, which combines a batch of multiple queries sharing the same context in one inference, has emerged as a promising solution to reduce inference costs. However, our study reveals a significant security vulnerability in batch prompting: malicious users can inject attack instructions into a batch, leading to unwanted interference across all queries, which can result in the inclusion of harmful content, such as phishing links, or the disruption of logical reasoning. In this paper, we construct BATCHSAFEBENCH, a comprehensive benchmark comprising 150 attack instructions of two types and 8k batch instances, to study the batch prompting vulnerability systematically. Our evaluation of both closed-source and open-weight LLMs demonstrates that all LLMs are susceptible to batch-prompting attacks. We then explore multiple defending approaches. While the prompting-based defense shows limited effectiveness for smaller LLMs, the probing-based approach achieves about 95% accuracy in detecting attacks. Additionally, we perform a mechanistic analysis to understand the attack and identify attention heads that are responsible for it.

Paperid: 892, https://arxiv.org/pdf/2503.12931.pdf

Abstract:
Defending large language models (LLMs) against jailbreak attacks is crucial for ensuring their safe deployment. Existing defense strategies typically rely on predefined static criteria to differentiate between harmful and benign prompts. However, such rigid rules fail to accommodate the inherent complexity and dynamic nature of real-world jailbreak attacks. In this paper, we focus on the novel challenge of universal defense against diverse jailbreaks. We propose a new concept ``mirror'', which is a dynamically generated prompt that reflects the syntactic structure of the input while ensuring semantic safety. The discrepancies between input prompts and their corresponding mirrors serve as guiding principles for defense. A novel defense model, MirrorShield, is further proposed to detect and calibrate risky inputs based on the crafted mirrors. Evaluated on multiple benchmark datasets and compared against ten state-of-the-art attack methods, MirrorShield demonstrates superior defense performance and promising generalization capabilities.

Paperid: 893, https://arxiv.org/pdf/2503.03180.pdf

Abstract:
Ensuring the security of critical infrastructure has become increasingly vital with the proliferation of Internet of Things (IoT) systems. However, the heterogeneous nature of IoT data and the lack of human-comprehensible insights from anomaly detection models remain significant challenges. This paper presents a hybrid framework that combines numerical anomaly detection using Autoencoders with Large Language Models (LLMs) for enhanced preprocessing and interpretability. Two preprocessing approaches are implemented: a traditional method utilizing Principal Component Analysis (PCA) to reduce dimensionality and an LLM-assisted method where GPT-4 dynamically recommends feature selection, transformation, and encoding strategies. Experimental results on the KDDCup99 10% corrected dataset demonstrate that the LLM-assisted preprocessing pipeline significantly improves anomaly detection performance. The macro-average F1 score increased from 0.49 in the traditional PCA-based approach to 0.98 with LLM-driven insights. Additionally, the LLM generates natural language explanations for detected anomalies, providing contextual insights into their causes and implications. This framework highlights the synergy between numerical AI models and LLMs, delivering an accurate, interpretable, and efficient solution for IoT cybersecurity in critical infrastructure.

Paperid: 894, https://arxiv.org/pdf/2503.03037.pdf

Abstract:
The rapid expansion of Internet of Things (IoT) networks has introduced new security challenges, necessitating efficient and reliable methods for intrusion detection. In this study, a detection framework based on hyperdimensional computing (HDC) is proposed to identify and classify network intrusions using the NSL-KDD dataset, a standard benchmark for intrusion detection systems. By leveraging the capabilities of HDC, including high-dimensional representation and efficient computation, the proposed approach effectively distinguishes various attack categories such as DoS, probe, R2L, and U2R, while accurately identifying normal traffic patterns. Comprehensive evaluations demonstrate that the proposed method achieves an accuracy of 99.54%, significantly outperforming conventional intrusion detection techniques, making it a promising solution for IoT network security. This work emphasizes the critical role of robust and precise intrusion detection in safeguarding IoT systems against evolving cyber threats.

Paperid: 895, https://arxiv.org/pdf/2503.03031.pdf

Abstract:
With the rapid growth of IoT devices, ensuring robust network security has become a critical challenge. Traditional intrusion detection systems (IDSs) often face limitations in detecting sophisticated attacks within high-dimensional and complex data environments. This paper presents a novel approach to network anomaly detection using hyperdimensional computing (HDC) techniques, specifically applied to the NSL-KDD dataset. The proposed method leverages the efficiency of HDC in processing large-scale data to identify both known and unknown attack patterns. The model achieved an accuracy of 91.55% on the KDDTrain+ subset, outperforming traditional approaches. These comparative evaluations underscore the model's superior performance, highlighting its potential in advancing anomaly detection for IoT networks and contributing to more secure and intelligent cybersecurity solutions.

Paperid: 896, https://arxiv.org/pdf/2502.18986.pdf

Abstract:
Among all privacy attacks against Machine Learning (ML), membership inference attacks (MIA) attracted the most attention. In these attacks, the attacker is given an ML model and a data point, and they must infer whether the data point was used for training. The attacker also has an auxiliary dataset to tune their inference algorithm. Attack papers commonly simulate setups in which the attacker's and the target's datasets are sampled from the same distribution. This setting is convenient to perform experiments, but it rarely holds in practice. ML literature commonly starts with similar simplifying assumptions (i.e., "i.i.d." datasets), and later generalizes the results to support heterogeneous data distributions. Similarly, our work makes a first step in the generalization of the MIA evaluation to heterogeneous data. First, we design a metric to measure the heterogeneity between any pair of tabular data distributions. This metric provides a continuous scale to analyze the phenomenon. Second, we compare two methodologies to simulate a data heterogeneity between the target and the attacker. These setups provide opposite performances: 90% attack accuracy vs. 50% (i.e., random guessing). Our results show that the MIA accuracy depends on the experimental setup; and even if research on MIA considers heterogeneous data setups, we have no standardized baseline of how to simulate it. The lack of such a baseline for MIA experiments poses a significant challenge to risk assessments in real-world machine learning scenarios.

Paperid: 897, https://arxiv.org/pdf/2502.07776.pdf

Abstract:
Prompt caching in large language models (LLMs) results in data-dependent timing variations: cached prompts are processed faster than non-cached prompts. These timing differences introduce the risk of side-channel timing attacks. For example, if the cache is shared across users, an attacker could identify cached prompts from fast API response times to learn information about other users' prompts. Because prompt caching may cause privacy leakage, transparency around the caching policies of API providers is important. To this end, we develop and conduct statistical audits to detect prompt caching in real-world LLM API providers. We detect global cache sharing across users in seven API providers, including OpenAI, resulting in potential privacy leakage about users' prompts. Timing variations due to prompt caching can also result in leakage of information about model architecture. Namely, we find evidence that OpenAI's embedding model is a decoder-only Transformer, which was previously not publicly known.

Paperid: 898, https://arxiv.org/pdf/2502.07011.pdf

Abstract:
Federated Learning is vulnerable to adversarial manipulation, where malicious clients can inject poisoned updates to influence the global model's behavior. While existing defense mechanisms have made notable progress, they fail to protect against adversaries that aim to induce targeted backdoors under different learning and attack configurations. To address this limitation, we introduce DROP (Distillation-based Reduction Of Poisoning), a novel defense mechanism that combines clustering and activity-tracking techniques with extraction of benign behavior from clients via knowledge distillation to tackle stealthy adversaries that manipulate low data poisoning rates and diverse malicious client ratios within the federation. Through extensive experimentation, our approach demonstrates superior robustness compared to existing defenses across a wide range of learning configurations. Finally, we evaluate existing defenses and our method under the challenging setting of non-IID client data distribution and highlight the challenges of designing a resilient FL defense in this setting.

Paperid: 899, https://arxiv.org/pdf/2501.19012.pdf

Abstract:
Large Language Models (LLMs) have become an essential tool in the programmer's toolkit, but their tendency to hallucinate code can be used by malicious actors to introduce vulnerabilities to broad swathes of the software supply chain. In this work, we analyze package hallucination behaviour in LLMs across popular programming languages examining both existing package references and fictional dependencies. By analyzing this package hallucination behaviour we find potential attacks and suggest defensive strategies to defend against these attacks. We discover that package hallucination rate is predicated not only on model choice, but also programming language, model size, and specificity of the coding task request. The Pareto optimality boundary between code generation performance and package hallucination is sparsely populated, suggesting that coding models are not being optimized for secure code. Additionally, we find an inverse correlation between package hallucination rate and the HumanEval coding benchmark, offering a heuristic for evaluating the propensity of a model to hallucinate packages. Our metrics, findings and analyses provide a base for future models, securing AI-assisted software development workflows against package supply chain attacks.

Paperid: 900, https://arxiv.org/pdf/2501.16638.pdf

Abstract:
Any exploit taking advantage of zero-day is called a zero-day attack. Previous research and social media trends show a massive demand for research in zero-day attack detection. This paper analyzes Machine Learning (ML) and Deep Learning (DL) based approaches to create Intrusion Detection Systems (IDS) and scrutinizing them using Explainable AI (XAI) by training an explainer based on randomly sampled data from the testing set. The focus is on using the KDD99 dataset, which has the most research done among all the datasets for detecting zero-day attacks. The paper aims to synthesize the dataset to have fewer classes for multi-class classification, test ML and DL approaches on pattern recognition, establish the robustness and dependability of the model, and establish the interpretability and scalability of the model. We evaluated the performance of four multilayer perceptron (MLP) trained on the KDD99 dataset, including baseline ML models, weighted ML models, truncated ML models, and weighted truncated ML models. Our results demonstrate that the truncated ML model achieves the highest accuracy (99.62%), precision, and recall, while weighted truncated ML model shows lower accuracy (97.26%) but better class representation (less bias) among all the classes with improved unweighted recall score. We also used Shapely Additive exPlanations (SHAP) to train explainer for our truncated models to check for feature importance among the two weighted and unweighted models.

Paperid: 901, https://arxiv.org/pdf/2501.16534.pdf

Abstract:
Alignment in large language models (LLMs) is used to enforce guidelines such as safety. Yet, alignment fails in the face of jailbreak attacks that modify inputs to induce unsafe outputs. In this paper, we introduce and evaluate a new technique for jailbreak attacks. We observe that alignment embeds a safety classifier in the LLM responsible for deciding between refusal and compliance, and seek to extract an approximation of this classifier: a surrogate classifier. To this end, we build candidate classifiers from subsets of the LLM. We first evaluate the degree to which candidate classifiers approximate the LLM's safety classifier in benign and adversarial settings. Then, we attack the candidates and measure how well the resulting adversarial inputs transfer to the LLM. Our evaluation shows that the best candidates achieve accurate agreement (an F1 score above 80%) using as little as 20% of the model architecture. Further, we find that attacks mounted on the surrogate classifiers can be transferred to the LLM with high success. For example, a surrogate using only 50% of the Llama 2 model achieved an attack success rate (ASR) of 70% with half the memory footprint and runtime -- a substantial improvement over attacking the LLM directly, where we only observed a 22% ASR. These results show that extracting surrogate classifiers is an effective and efficient means for modeling (and therein addressing) the vulnerability of aligned models to jailbreaking attacks.

Paperid: 902, https://arxiv.org/pdf/2501.15145.pdf

Abstract:
Application designers have moved to integrate large language models (LLMs) into their products. However, many LLM-integrated applications are vulnerable to prompt injections. While attempts have been made to address this problem by building prompt injection detectors, many are not yet suitable for practical deployment. To support research in this area, we introduce PromptShield, a benchmark for training and evaluating deployable prompt injection detectors. Our benchmark is carefully curated and includes both conversational and application-structured data. In addition, we use insights from our curation process to fine-tune a new prompt injection detector that achieves significantly higher performance in the low false positive rate (FPR) evaluation regime compared to prior schemes. Our work suggests that careful curation of training data and larger models can contribute to strong detector performance.

Paperid: 903, https://arxiv.org/pdf/2501.12112.pdf

Abstract:
The rapid growth of decentralized finance (DeFi) has led to the widespread use of automated agents, or bots, within blockchain ecosystems like Ethereum, Binance Smart Chain, and Solana. While these bots enhance market efficiency and liquidity, they also raise concerns due to exploitative behaviors that threaten network integrity and user trust. This paper presents a decentralized federated learning (DFL) approach for detecting financial bots within Ethereum Virtual Machine (EVM)-based blockchains. The proposed framework leverages federated learning, orchestrated through smart contracts, to detect malicious bot behavior while preserving data privacy and aligning with the decentralized nature of blockchain networks. Addressing the limitations of both centralized and rule-based approaches, our system enables each participating node to train local models on transaction history and smart contract interaction data, followed by on-chain aggregation of model updates through a permissioned consensus mechanism. This design allows the model to capture complex and evolving bot behaviors without requiring direct data sharing between nodes. Experimental results demonstrate that our DFL framework achieves high detection accuracy while maintaining scalability and robustness, providing an effective solution for bot detection across distributed blockchain networks.

Paperid: 904, https://arxiv.org/pdf/2501.04331.pdf

Abstract:
Blockchained federated learning (BFL) combines the concepts of federated learning and blockchain technology to enhance privacy, security, and transparency in collaborative machine learning models. However, implementing BFL frameworks poses challenges in terms of scalability and cost-effectiveness. Reputation-aware BFL poses even more challenges, as blockchain validators are tasked with processing federated learning transactions along with the transactions that evaluate FL tasks and aggregate reputations. This leads to faster blockchain congestion and performance degradation. To improve BFL efficiency while increasing scalability and reducing on-chain reputation management costs, this paper proposes AutoDFL, a scalable and automated reputation-aware decentralized federated learning framework. AutoDFL leverages zk-Rollups as a Layer-2 scaling solution to boost the performance while maintaining the same level of security as the underlying Layer-1 blockchain. Moreover, AutoDFL introduces an automated and fair reputation model designed to incentivize federated learning actors. We develop a proof of concept for our framework for an accurate evaluation. Tested with various custom workloads, AutoDFL reaches an average throughput of over 3000 TPS with a gas reduction of up to 20X.

Paperid: 905, https://arxiv.org/pdf/2501.04319.pdf

Abstract:
Blockchain-based Federated Learning (FL) is an emerging decentralized machine learning paradigm that enables model training without relying on a central server. Although some BFL frameworks are considered privacy-preserving, they are still vulnerable to various attacks, including inference and model poisoning. Additionally, most of these solutions employ strong trust assumptions among all participating entities or introduce incentive mechanisms to encourage collaboration, making them susceptible to multiple security flaws. This work presents VerifBFL, a trustless, privacy-preserving, and verifiable federated learning framework that integrates blockchain technology and cryptographic protocols. By employing zero-knowledge Succinct Non-Interactive Argument of Knowledge (zk-SNARKs) and incrementally verifiable computation (IVC), VerifBFL ensures the verifiability of both local training and aggregation processes. The proofs of training and aggregation are verified on-chain, guaranteeing the integrity and auditability of each participant's contributions. To protect training data from inference attacks, VerifBFL leverages differential privacy. Finally, to demonstrate the efficiency of the proposed protocols, we built a proof of concept using emerging tools. The results show that generating proofs for local training and aggregation in VerifBFL takes less than 81s and 2s, respectively, while verifying them on-chain takes less than 0.6s.

Paperid: 906, https://arxiv.org/pdf/2501.03446.pdf

Abstract:
Software vulnerabilities continue to be ubiquitous, even in the era of AI-powered code assistants, advanced static analysis tools, and the adoption of extensive testing frameworks. It has become apparent that we must not simply prevent these bugs, but also eliminate them in a quick, efficient manner. Yet, human code intervention is slow, costly, and can often lead to further security vulnerabilities, especially in legacy codebases. The advent of highly advanced Large Language Models (LLM) has opened up the possibility for many software defects to be patched automatically. We propose LLM4CVE an LLM-based iterative pipeline that robustly fixes vulnerable functions in real-world code with high accuracy. We examine our pipeline with State-of-the-Art LLMs, such as GPT-3.5, GPT-4o, Llama 38B, and Llama 3 70B. We achieve a human-verified quality score of 8.51/10 and an increase in groundtruth code similarity of 20% with Llama 3 70B. To promote further research in the area of LLM-based vulnerability repair, we publish our testing apparatus, fine-tuned weights, and experimental data on our website

Paperid: 907, https://arxiv.org/pdf/2506.22666.pdf

Abstract:
The rise of API-only access to state-of-the-art LLMs highlights the need for effective black-box jailbreak methods to identify model vulnerabilities in real-world settings. Without a principled objective for gradient-based optimization, most existing approaches rely on genetic algorithms, which are limited by their initialization and dependence on manually curated prompt pools. Furthermore, these methods require individual optimization for each prompt, failing to provide a comprehensive characterization of model vulnerabilities. To address this gap, we introduce VERA: Variational infErence fRamework for jAilbreaking. VERA casts black-box jailbreak prompting as a variational inference problem, training a small attacker LLM to approximate the target LLM's posterior over adversarial prompts. Once trained, the attacker can generate diverse, fluent jailbreak prompts for a target query without re-optimization. Experimental results show that VERA achieves strong performance across a range of target LLMs, highlighting the value of probabilistic inference for adversarial prompt generation.

Paperid: 908, https://arxiv.org/pdf/2506.20981.pdf

Abstract:
This paper tackles the challenging and practical problem of multi-identifier private user profile matching for privacy-preserving ad measurement, a cornerstone of modern advertising analytics. We introduce a comprehensive cryptographic framework leveraging reversed Oblivious Pseudorandom Functions (OPRF) and novel blind key rotation techniques to support secure matching across multiple identifiers. Our design prevents cross-identifier linkages and includes a differentially private mechanism to obfuscate intersection sizes, mitigating risks such as membership inference attacks. We present a concrete construction of our protocol that achieves both strong privacy guarantees and high efficiency. It scales to large datasets, offering a practical and scalable solution for privacy-centric applications like secure ad conversion tracking. By combining rigorous cryptographic principles with differential privacy, our work addresses a critical need in the advertising industry, setting a new standard for privacy-preserving ad measurement frameworks.

Paperid: 909, https://arxiv.org/pdf/2506.19260.pdf

Abstract:
Federated learning systems increasingly rely on diverse network topologies to address scalability and organizational constraints. While existing privacy research focuses on gradient-based attacks, the privacy implications of network topology knowledge remain critically understudied. We conduct the first comprehensive analysis of topology-based privacy leakage across realistic adversarial knowledge scenarios, demonstrating that adversaries with varying degrees of structural knowledge can infer sensitive data distribution patterns even under strong differential privacy guarantees. Through systematic evaluation of 4,720 attack instances, we analyze six distinct adversarial knowledge scenarios: complete topology knowledge and five partial knowledge configurations reflecting real-world deployment constraints. We propose three complementary attack vectors: communication pattern analysis, parameter magnitude profiling, and structural position correlation, achieving success rates of 84.1%, 65.0%, and 47.2% under complete knowledge conditions. Critically, we find that 80% of realistic partial knowledge scenarios maintain attack effectiveness above security thresholds, with certain partial knowledge configurations achieving performance superior to the baseline complete knowledge scenario. To address these vulnerabilities, we propose and empirically validate structural noise injection as a complementary defense mechanism across 808 configurations, demonstrating up to 51.4% additional attack reduction when properly layered with existing privacy techniques. These results establish that network topology represents a fundamental privacy vulnerability in federated learning systems while providing practical pathways for mitigation through topology-aware defense mechanisms.

Paperid: 910, https://arxiv.org/pdf/2506.17231.pdf

Abstract:
Attacks on large language models (LLMs) in jailbreaking scenarios raise many security and ethical issues. Current jailbreak attack methods face problems such as low efficiency, high computational cost, and poor cross-model adaptability and versatility, which make it difficult to cope with the rapid development of LLM and new defense strategies. Our work proposes an Adversarial Prompt Distillation, which combines masked language modeling, reinforcement learning, and dynamic temperature control through a prompt generation and distillation method. It enables small language models (SLMs) to jailbreak attacks on mainstream LLMs. The experimental results verify the superiority of the proposed method in terms of attack success rate and harm, and reflect the resource efficiency and cross-model adaptability. This research explores the feasibility of distilling the jailbreak ability of LLM to SLM, reveals the model's vulnerability, and provides a new idea for LLM security research.

Paperid: 911, https://arxiv.org/pdf/2506.16661.pdf

Abstract:
Deep neural networks often use large, high-quality datasets to achieve high performance on many machine learning tasks. When training involves potentially sensitive data, this process can raise privacy concerns, as large models have been shown to unintentionally memorize and reveal sensitive information, including reconstructing entire training samples. Differential privacy (DP) provides a robust framework for protecting individual data and in particular, a new approach to privately training deep neural networks is to approximate the input dataset with a privately generated synthetic dataset, before any subsequent training algorithm. We introduce a novel principled method for DP synthetic image embedding generation, based on fitting a Gaussian Mixture Model (GMM) in an appropriate embedding space using DP clustering. Our method provably learns a GMM under separation conditions. Empirically, a simple two-layer neural network trained on synthetically generated embeddings achieves state-of-the-art (SOTA) classification accuracy on standard benchmark datasets. Additionally, we demonstrate that our method can generate realistic synthetic images that achieve downstream classification accuracy comparable to SOTA methods. Our method is quite general, as the encoder and decoder modules can be freely substituted to suit different tasks. It is also highly scalable, consisting only of subroutines that scale linearly with the number of samples and/or can be implemented efficiently in distributed systems.

Paperid: 912, https://arxiv.org/pdf/2506.10399.pdf

Abstract:
Graph Convolutional Neural Networks (GCNs) have gained widespread popularity in various fields like personal healthcare and financial systems, due to their remarkable performance. Despite the growing demand for cloud-based GCN services, privacy concerns over sensitive graph data remain significant. Homomorphic Encryption (HE) facilitates Privacy-Preserving Machine Learning (PPML) by allowing computations to be performed on encrypted data. However, HE introduces substantial computational overhead, particularly for GCN operations that require rotations and multiplications in matrix products. The sparsity of GCNs offers significant performance potential, but their irregularity introduces additional operations that reduce practical gains. In this paper, we propose FicGCN, a HE-based framework specifically designed to harness the sparse characteristics of GCNs and strike a globally optimal balance between aggregation and combination operations. FicGCN employs a latency-aware packing scheme, a Sparse Intra-Ciphertext Aggregation (SpIntra-CA) method to minimize rotation overhead, and a region-based data reordering driven by local adjacency structure. We evaluated FicGCN on several popular datasets, and the results show that FicGCN achieved the best performance across all tested datasets, with up to a 4.10x improvement over the latest design.

Paperid: 913, https://arxiv.org/pdf/2505.19338.pdf

Abstract:
In the evolving digital landscape, it is crucial to study the dynamics of cyberattacks and defences. This study uses an Evolutionary Game Theory (EGT) framework to investigate the evolutionary dynamics of attacks and defences in cyberspace. We develop a two-population asymmetric game between attacker and defender to capture the essential factors of costs, potential benefits, and the probability of successful defences. Through mathematical analysis and numerical simulations, we find that systems with high defence intensities show stability with minimal attack frequencies, whereas low-defence environments show instability, and are vulnerable to attacks. Furthermore, we find five equilibria, where the strategy pair always defend and attack emerged as the most likely stable state as cyber domain is characterised by a continuous battle between defenders and attackers. Our theoretical findings align with real-world data from past cyber incidents, demonstrating the interdisciplinary impact, such as fraud detection, risk management and cybersecurity decision-making. Overall, our analysis suggests that adaptive cybersecurity strategies based on EGT can improve resource allocation, enhance system resilience, and reduce the overall risk of cyberattacks. By incorporating real-world data, this study demonstrates the applicability of EGT in addressing the evolving nature of cyber threats and the need for secure digital ecosystems through strategic planning and proactive defence measures.

Paperid: 914, https://arxiv.org/pdf/2505.11741.pdf

Abstract:
Vision-language models (VLMs) now rival human performance on many multimodal tasks, yet they still hallucinate objects or generate unsafe text. Current hallucination detectors, e.g., single-token linear probing (SLP) and P(True), typically analyze only the logit of the first generated token or just its highest scoring component overlooking richer signals embedded within earlier token distributions. We demonstrate that analyzing the complete sequence of early logits potentially provides substantially more diagnostic information. We emphasize that hallucinations may only emerge after several tokens, as subtle inconsistencies accumulate over time. By analyzing the Kullback-Leibler (KL) divergence between logits corresponding to hallucinated and non-hallucinated tokens, we underscore the importance of incorporating later-token logits to more accurately capture the reliability dynamics of VLMs. In response, we introduce Multi-Token Reliability Estimation (MTRE), a lightweight, white-box method that aggregates logits from the first ten tokens using multi-token log-likelihood ratios and self-attention. Despite the challenges posed by large vocabulary sizes and long logit sequences, MTRE remains efficient and tractable. On MAD-Bench, MM-SafetyBench, MathVista, and four compositional-geometry benchmarks, MTRE improves AUROC by 9.4 +/- 1.3 points over SLP and by 12.1 +/- 1.7 points over P(True), setting a new state-of-the-art in hallucination detection for open-source VLMs.

Paperid: 915, https://arxiv.org/pdf/2505.05279.pdf

Abstract:
Most existing unlearnable strategies focus on preventing unauthorized users from training single-task learning (STL) models with personal data. Nevertheless, the paradigm has recently shifted towards multi-task data and multi-task learning (MTL), targeting generalist and foundation models that can handle multiple tasks simultaneously. Despite their growing importance, MTL data and models have been largely neglected while pursuing unlearnable strategies. This paper presents MTL-UE, the first unified framework for generating unlearnable examples for multi-task data and MTL models. Instead of optimizing perturbations for each sample, we design a generator-based structure that introduces label priors and class-wise feature embeddings which leads to much better attacking performance. In addition, MTL-UE incorporates intra-task and inter-task embedding regularization to increase inter-class separation and suppress intra-class variance which enhances the attack robustness greatly. Furthermore, MTL-UE is versatile with good supports for dense prediction tasks in MTL. It is also plug-and-play allowing integrating existing surrogate-dependent unlearnable methods with little adaptation. Extensive experiments show that MTL-UE achieves superior attacking performance consistently across 4 MTL datasets, 3 base UE methods, 5 model backbones, and 5 MTL task-weighting strategies.

Paperid: 916, https://arxiv.org/pdf/2505.04195.pdf

Abstract:
Large Language Models (LLMs) have emerged as promising tools in software development, enabling automated code generation and analysis. However, their knowledge is limited to a fixed cutoff date, making them prone to generating code vulnerable to newly disclosed CVEs. Frequent fine-tuning with new CVE sets is costly, and existing LLM-based approaches focus on oversimplified CWE examples and require providing explicit bug locations to LLMs, limiting their ability to patch complex real-world vulnerabilities. To address these limitations, we propose AutoPatch, a multi-agent framework designed to patch vulnerable LLM-generated code, particularly those introduced after the LLMs' knowledge cutoff. AutoPatch integrates Retrieval-Augmented Generation (RAG) with a structured database of recently disclosed vulnerabilities, comprising 525 code snippets derived from 75 high-severity CVEs across real-world systems such as the Linux kernel and Chrome. AutoPatch combines semantic and taint analysis to identify the most relevant CVE and leverages enhanced Chain-of-Thought (CoT) reasoning to construct enriched prompts for verification and patching. Our unified similarity model, which selects the most relevant vulnerabilities, achieves 90.4 percent accuracy in CVE matching. AutoPatch attains 89.5 percent F1-score for vulnerability verification and 95.0 percent accuracy in patching, while being over 50x more cost-efficient than traditional fine-tuning approaches.

Paperid: 917, https://arxiv.org/pdf/2504.21770.pdf

Abstract:
While static analysis is useful in detecting early-stage hardware security bugs, its efficacy is limited because it requires information to form checks and is often unable to explain the security impact of a detected vulnerability. Large Language Models can be useful in filling these gaps by identifying relevant assets, removing false violations flagged by static analysis tools, and explaining the reported violations. LASHED combines the two approaches (LLMs and Static Analysis) to overcome each other's limitations for hardware security bug detection. We investigate our approach on four open-source SoCs for five Common Weakness Enumerations (CWEs) and present strategies for improvement with better prompt engineering. We find that 87.5% of instances flagged by our recommended scheme are plausible CWEs. In-context learning and asking the model to 'think again' improves LASHED's precision.

Paperid: 918, https://arxiv.org/pdf/2504.20984.pdf

Abstract:
LLM-integrated app systems extend the utility of Large Language Models (LLMs) with third-party apps that are invoked by a system LLM using interleaved planning and execution phases to answer user queries. These systems introduce new attack vectors where malicious apps can cause integrity violation of planning or execution, availability breakdown, or privacy compromise during execution. In this work, we identify new attacks impacting the integrity of planning, as well as the integrity and availability of execution in LLM-integrated apps, and demonstrate them against IsolateGPT, a recent solution designed to mitigate attacks from malicious apps. We propose Abstract-Concrete-Execute (ACE), a new secure architecture for LLM-integrated app systems that provides security guarantees for system planning and execution. Specifically, ACE decouples planning into two phases by first creating an abstract execution plan using only trusted information, and then mapping the abstract plan to a concrete plan using installed system apps. We verify that the plans generated by our system satisfy user-specified secure information flow constraints via static analysis on the structured plan output. During execution, ACE enforces data and capability barriers between apps, and ensures that the execution is conducted according to the trusted abstract plan. We show experimentally that ACE is secure against attacks from the InjecAgent and Agent Security Bench benchmarks for indirect prompt injection, and our newly introduced attacks. We also evaluate the utility of ACE in realistic environments, using the Tool Usage suite from the LangChain benchmark. Our architecture represents a significant advancement towards hardening LLM-based systems using system security principles.

Paperid: 919, https://arxiv.org/pdf/2504.14541.pdf

Abstract:
Adversarial examples, characterized by imperceptible perturbations, pose significant threats to deep neural networks by misleading their predictions. A critical aspect of these examples is their transferability, allowing them to deceive {unseen} models in black-box scenarios. Despite the widespread exploration of defense methods, including those on transferability, they show limitations: inefficient deployment, ineffective defense, and degraded performance on clean images. In this work, we introduce a novel training paradigm aimed at enhancing robustness against transferable adversarial examples (TAEs) in a more efficient and effective way. We propose a model that exhibits random guessing behavior when presented with clean data $\boldsymbol{x}$ as input, and generates accurate predictions when with triggered data $\boldsymbol{x}+\boldsymbolÏ$. Importantly, the trigger $\boldsymbolÏ$ remains constant for all data instances. We refer to these models as \textbf{models with trigger activation}. We are surprised to find that these models exhibit certain robustness against TAEs. Through the consideration of first-order gradients, we provide a theoretical analysis of this robustness. Moreover, through the joint optimization of the learnable trigger and the model, we achieve improved robustness to transferable attacks. Extensive experiments conducted across diverse datasets, evaluating a variety of attacking methods, underscore the effectiveness and superiority of our approach.

Paperid: 920, https://arxiv.org/pdf/2504.13573.pdf

Abstract:
Cybersquatting refers to the practice where attackers register a domain name similar to a legitimate one to confuse users for illegal gains. With the growth of the Non-Fungible Token (NFT) ecosystem, there are indications that cybersquatting tactics have evolved from targeting domain names to NFTs. This paper presents the first in-depth measurement study of NFT cybersquatting. By analyzing over 220K NFT collections with over 150M NFT tokens, we have identified 8,019 cybersquatting NFT collections targeting 654 popular NFT projects. Through systematic analysis, we discover and characterize seven distinct squatting tactics employed by scammers. We further conduct a comprehensive measurement study of these cybersquatting NFT collections, examining their metadata, associated digital asset content, and social media status. Our analysis reveals that these NFT cybersquatting activities have resulted in a significant financial impact, with over 670K victims affected by these scams, leading to a total financial exploitation of $59.26 million. Our findings demonstrate the urgency to identify and prevent NFT squatting abuses.

Paperid: 921, https://arxiv.org/pdf/2504.12034.pdf

Abstract:
As Ethereum continues to thrive, the Ethereum Virtual Machine (EVM) has become the cornerstone powering tens of millions of active smart contracts. Intuitively, security issues in EVMs could lead to inconsistent behaviors among smart contracts or even denial-of-service of the entire blockchain network. However, to the best of our knowledge, only a limited number of studies focus on the security of EVMs. Moreover, they suffer from 1) insufficient test input diversity and invalid semantics; and 2) the inability to automatically identify bugs and locate root causes. To bridge this gap, we propose OpDiffer, a differential testing framework for EVM, which takes advantage of LLMs and static analysis methods to address the above two limitations. We conducted the largest-scale evaluation, covering nine EVMs and uncovering 26 previously unknown bugs, 22 of which have been confirmed by developers and three have been assigned CNVD IDs. Compared to state-of-the-art baselines, OpDiffer can improve code coverage by at most 71.06%, 148.40% and 655.56%, respectively. Through an analysis of real-world deployed Ethereum contracts, we estimate that 7.21% of the contracts could trigger our identified EVM bugs under certain environmental settings, potentially resulting in severe negative impact on the Ethereum ecosystem.

Paperid: 922, https://arxiv.org/pdf/2504.11735.pdf

Abstract:
Serving as the first touch point for users to the cryptocurrency world, cryptocurrency wallets allow users to manage, receive, and transmit digital assets on blockchain networks and interact with emerging decentralized finance (DeFi) applications. Unfortunately, cryptocurrency wallets have always been the prime targets for attackers, and incidents of wallet breaches have been reported from time to time. Although some recent studies have characterized the vulnerabilities and scams related to wallets, they have generally been characterized in coarse granularity, overlooking potential risks inherent in detailed designs of cryptocurrency wallets, especially from perspectives including user interaction and advanced features. To fill the void, in this paper, we present a fine-grained security analysis on browser-based cryptocurrency wallets. To pinpoint security issues of components in wallets, we design WalletProbe, a mutation-based testing framework based on visual-level oracles. We have identified 13 attack vectors that can be abused by attackers to exploit cryptocurrency wallets and exposed 21 concrete attack strategies. By applying WalletProbe on 39 widely-adopted browser-based wallet extensions, we astonishingly figure out all of them can be abused to steal crypto assets from innocent users. Identified potential attack vectors were reported to wallet developers timely and 26 issues have been patched already. It is, hence, urgent for our community to take action to mitigate threats related to cryptocurrency wallets. We promise to release all code and data to promote the development of the community.

Paperid: 923, https://arxiv.org/pdf/2504.10853.pdf

Abstract:
Watermarking for diffusion images has drawn considerable attention due to the widespread use of text-to-image diffusion models and the increasing need for their copyright protection. Recently, advanced watermarking techniques, such as Tree Ring, integrate watermarks by embedding traceable patterns (e.g., Rings) into the latent distribution during the diffusion process. Such methods disrupt the original semantics of the generated images due to the inevitable distribution shift caused by the watermarks, thereby limiting their practicality, particularly in digital art creation. In this work, we present Semantic-aware Pivotal Tuning Watermarks (PT-Mark), a novel invisible watermarking method that preserves both the semantics of diffusion images and the traceability of the watermark. PT-Mark preserves the original semantics of the watermarked image by gradually aligning the generation trajectory with the original (pivotal) trajectory while maintaining the traceable watermarks during whole diffusion denoising process. To achieve this, we first compute the salient regions of the watermark at each diffusion denoising step as a spatial prior to identify areas that can be aligned without disrupting the watermark pattern. Guided by the region, we then introduce an additional pivotal tuning branch that optimizes the text embedding to align the semantics while preserving the watermarks. Extensive evaluations demonstrate that PT-Mark can preserve the original semantics of the diffusion images while integrating robust watermarks. It achieves a 10% improvement in the performance of semantic preservation (i.e., SSIM, PSNR, and LPIPS) compared to state-of-the-art watermarking methods, while also showing comparable robustness against real-world perturbations and four times greater efficiency.

Paperid: 924, https://arxiv.org/pdf/2504.07839.pdf

Abstract:
Intrusion Detection Systems (IDS) have long been a hot topic in the cybersecurity community. In recent years, with the introduction of deep learning (DL) techniques, IDS have made great progress due to their increasing generalizability. The rationale behind this is that by learning the underlying patterns of known system behaviors, IDS detection can be generalized to intrusions that exploit zero-day vulnerabilities. In this survey, we refer to this type of IDS as DL-based IDS (DL-IDS). From the perspective of DL, this survey systematically reviews all the stages of DL-IDS, including data collection, log storage, log parsing, graph summarization, attack detection, and attack investigation. To accommodate current researchers, a section describing the publicly available benchmark datasets is included. This survey further discusses current challenges and potential future research directions, aiming to help researchers understand the basic ideas and visions of DL-IDS research, as well as to motivate their research interests.

Paperid: 925, https://arxiv.org/pdf/2504.01240.pdf

Abstract:
In this survey, we investigate the most recent techniques of resilient federated learning (ResFL) in CyberEdge networks, focusing on joint training with agglomerative deduction and feature-oriented security mechanisms. We explore adaptive hierarchical learning strategies to tackle non-IID data challenges, improving scalability and reducing communication overhead. Fault tolerance techniques and agglomerative deduction mechanisms are studied to detect unreliable devices, refine model updates, and enhance convergence stability. Unlike existing FL security research, we comprehensively analyze feature-oriented threats, such as poisoning, inference, and reconstruction attacks that exploit model features. Moreover, we examine resilient aggregation techniques, anomaly detection, and cryptographic defenses, including differential privacy and secure multi-party computation, to strengthen FL security. In addition, we discuss the integration of 6G, large language models (LLMs), and interoperable learning frameworks to enhance privacy-preserving and decentralized cross-domain training. These advancements offer ultra-low latency, artificial intelligence (AI)-driven network management, and improved resilience against adversarial attacks, fostering the deployment of secure ResFL in CyberEdge networks.

Paperid: 926, https://arxiv.org/pdf/2503.16851.pdf

Abstract:
Large Language Models (LLMs) have demonstrated remarkable performance in natural language generation tasks, yet their uncontrolled outputs pose significant ethical and safety risks. Recently, representation engineering methods have shown promising results in steering model behavior by modifying the rich semantic information encoded in activation vectors. However, due to the difficulty of precisely disentangling semantic directions within high-dimensional representation space, existing approaches suffer from three major limitations: lack of fine-grained control, quality degradation of generated content, and poor interpretability. To address these challenges, we propose a sparse encoding-based representation engineering method, named SRE, which decomposes polysemantic activations into a structured, monosemantic feature space. By leveraging sparse autoencoding, our approach isolates and adjusts only task-specific sparse feature dimensions, enabling precise and interpretable steering of model behavior while preserving content quality. We validate our method on three critical domains, i.e., safety, fairness, and truthfulness using the open-source LLM Gemma-2-2B-it. Experimental results show that SRE achieves superior controllability while maintaining the overall quality of generated content (i.e., controllability and quality), demonstrating its effectiveness as a fine-grained and interpretable activation steering framework.

Paperid: 927, https://arxiv.org/pdf/2503.12896.pdf

Abstract:
Recent studies improve on-device language model (LM) inference through end-cloud collaboration, where the end device retrieves useful information from cloud databases to enhance local processing, known as Retrieval-Augmented Generation (RAG). Typically, to retrieve information from the cloud while safeguarding privacy, the end device transforms original data into embeddings with a local embedding model. However, the recently emerging Embedding Inversion Attacks (EIAs) can still recover the original data from text embeddings (e.g., training a recovery model to map embeddings back to original texts), posing a significant threat to user privacy. To address this risk, we propose EntroGuard, an entropy-driven perturbation-based embedding privacy protection method, which can protect the privacy of text embeddings while maintaining retrieval accuracy during the end-cloud collaboration. Specifically, to defeat various EIAs, we perturb the embeddings to increase the entropy of the recovered text in the common structure of recovery models, thus steering the embeddings toward meaningless texts rather than original sensitive texts during the recovery process. To maintain retrieval performance in the cloud, we constrain the perturbations within a bound, applying the strategy of reducing them where redundant and increasing them where sparse. Moreover, EntroGuard can be directly integrated into end devices without requiring any modifications to the embedding model. Extensive experimental results demonstrate that EntroGuard can reduce the risk of privacy leakage by up to 8 times at most with negligible loss of retrieval performance compared to existing privacy-preserving methods.

Paperid: 928, https://arxiv.org/pdf/2503.06989.pdf

Abstract:
Recently, Multimodal Large Language Models (MLLMs) have demonstrated their superior ability in understanding multimodal content. However, they remain vulnerable to jailbreak attacks, which exploit weaknesses in their safety alignment to generate harmful responses. Previous studies categorize jailbreaks as successful or failed based on whether responses contain malicious content. However, given the stochastic nature of MLLM responses, this binary classification of an input's ability to jailbreak MLLMs is inappropriate. Derived from this viewpoint, we introduce jailbreak probability to quantify the jailbreak potential of an input, which represents the likelihood that MLLMs generated a malicious response when prompted with this input. We approximate this probability through multiple queries to MLLMs. After modeling the relationship between input hidden states and their corresponding jailbreak probability using Jailbreak Probability Prediction Network (JPPN), we use continuous jailbreak probability for optimization. Specifically, we propose Jailbreak-Probability-based Attack (JPA) that optimizes adversarial perturbations on input image to maximize jailbreak probability, and further enhance it as Multimodal JPA (MJPA) by including monotonic text rephrasing. To counteract attacks, we also propose Jailbreak-Probability-based Finetuning (JPF), which minimizes jailbreak probability through MLLM parameter updates. Extensive experiments show that (1) (M)JPA yields significant improvements when attacking a wide range of models under both white and black box settings. (2) JPF vastly reduces jailbreaks by at most over 60\%. Both of the above results demonstrate the significance of introducing jailbreak probability to make nuanced distinctions among input jailbreak abilities.

Paperid: 929, https://arxiv.org/pdf/2503.06519.pdf

Abstract:
Small language models (SLMs) have emerged as promising alternatives to large language models (LLMs) due to their low computational demands, enhanced privacy guarantees and comparable performance in specific domains through light-weight fine-tuning. Deploying SLMs on edge devices, such as smartphones and smart vehicles, has become a growing trend. However, the security implications of SLMs have received less attention than LLMs, particularly regarding jailbreak attacks, which is recognized as one of the top threats of LLMs by the OWASP. In this paper, we conduct the first large-scale empirical study of SLMs' vulnerabilities to jailbreak attacks. Through systematically evaluation on 63 SLMs from 15 mainstream SLM families against 8 state-of-the-art jailbreak methods, we demonstrate that 47.6% of evaluated SLMs show high susceptibility to jailbreak attacks (ASR > 40%) and 38.1% of them can not even resist direct harmful query (ASR > 50%). We further analyze the reasons behind the vulnerabilities and identify four key factors: model size, model architecture, training datasets and training techniques. Moreover, we assess the effectiveness of three prompt-level defense methods and find that none of them achieve perfect performance, with detection accuracy varying across different SLMs and attack methods. Notably, we point out that the inherent security awareness play a critical role in SLM security, and models with strong security awareness could timely terminate unsafe response with little reminder. Building upon the findings, we highlight the urgent need for security-by-design approaches in SLM development and provide valuable insights for building more trustworthy SLM ecosystem.

Paperid: 930, https://arxiv.org/pdf/2503.01734.pdf

Abstract:
Reinforcement learning (RL) offers powerful techniques for solving complex sequential decision-making tasks from experience. In this paper, we demonstrate how RL can be applied to adversarial machine learning (AML) to develop a new class of attacks that learn to generate adversarial examples: inputs designed to fool machine learning models. Unlike traditional AML methods that craft adversarial examples independently, our RL-based approach retains and exploits past attack experience to improve future attacks. We formulate adversarial example generation as a Markov Decision Process and evaluate RL's ability to (a) learn effective and efficient attack strategies and (b) compete with state-of-the-art AML. On CIFAR-10, our agent increases the success rate of adversarial examples by 19.4% and decreases the median number of victim model queries per adversarial example by 53.2% from the start to the end of training. In a head-to-head comparison with a state-of-the-art image attack, SquareAttack, our approach enables an adversary to generate adversarial examples with 13.1% more success after 5000 episodes of training. From a security perspective, this work demonstrates a powerful new attack vector that uses RL to attack ML models efficiently and at scale.

Paperid: 931, https://arxiv.org/pdf/2502.18578.pdf

Abstract:
Linear $L_1$-regularized models have remained one of the simplest and most effective tools in data science. Over the past decade, screening rules have risen in popularity as a way to eliminate features when producing the sparse regression weights of $L_1$ models. However, despite the increasing need of privacy-preserving models for data analysis, to the best of our knowledge, no differentially private screening rule exists. In this paper, we develop the first private screening rule for linear regression. We initially find that this screening rule is too strong: it screens too many coefficients as a result of the private screening step. However, a weakened implementation of private screening reduces overscreening and improves performance.

Paperid: 932, https://arxiv.org/pdf/2502.13527.pdf

Abstract:
The rise of Large Language Models (LLMs) has led to significant applications but also introduced serious security threats, particularly from jailbreak attacks that manipulate output generation. These attacks utilize prompt engineering and logit manipulation to steer models toward harmful content, prompting LLM providers to implement filtering and safety alignment strategies. We investigate LLMs' safety mechanisms and their recent applications, revealing a new threat model targeting structured output interfaces, which enable attackers to manipulate the inner logit during LLM generation, requiring only API access permissions. To demonstrate this threat model, we introduce a black-box attack framework called AttackPrefixTree (APT). APT exploits structured output interfaces to dynamically construct attack patterns. By leveraging prefixes of models' safety refusal response and latent harmful outputs, APT effectively bypasses safety measures. Experiments on benchmark datasets indicate that this approach achieves higher attack success rate than existing methods. This work highlights the urgent need for LLM providers to enhance security protocols to address vulnerabilities arising from the interaction between safety patterns and structured outputs.

Paperid: 933, https://arxiv.org/pdf/2502.10803.pdf

Abstract:
The rapid advancement of generative models has led to the proliferation of highly realistic AI-generated images, posing significant challenges for detection methods to generalize across diverse and evolving generative techniques. Existing approaches often fail to adapt to unknown models without costly retraining, limiting their practicability. To fill this gap, we propose Post-hoc Distribution Alignment (PDA), a novel approach for the generalizable detection for AI-generated images. The key idea is to use the known generative model to regenerate undifferentiated test images. This process aligns the distributions of the re-generated real images with the known fake images, enabling effective distinction from unknown fake images. PDA employs a two-step detection framework: 1) evaluating whether a test image aligns with the known fake distribution based on deep k-nearest neighbor (KNN) distance, and 2) re-generating test images using known generative models to create pseudo-fake images for further classification. This alignment strategy allows PDA to effectively detect fake images without relying on unseen data or requiring retraining. Extensive experiments demonstrate the superiority of PDA, achieving 96.73\% average accuracy across six state-of-the-art generative models, including GANs, diffusion models, and text-to-image models, and improving by 16.07\% over the best baseline. Through t-SNE visualizations and KNN distance analysis, we provide insights into PDA's effectiveness in separating real and fake images. Our work provides a flexible and effective solution for real-world fake image detection, advancing the generalization ability of detection systems.

Paperid: 934, https://arxiv.org/pdf/2502.03718.pdf

Abstract:
Price manipulation attack is one of the notorious threats in decentralized finance (DeFi) applications, which allows attackers to exchange tokens at an extensively deviated price from the market. Existing efforts usually rely on reactive methods to identify such kind of attacks after they have happened, e.g., detecting attack transactions in the post-attack stage, which cannot mitigate or prevent price manipulation attacks timely. From the perspective of attackers, they usually need to deploy attack contracts in the pre-attack stage. Thus, if we can identify these attack contracts in a proactive manner, we can raise alarms and mitigate the threats. With the core idea in mind, in this work, we shift our attention from the victims to the attackers. Specifically, we propose SMARTCAT, a novel approach for identifying price manipulation attacks in the pre-attack stage proactively. For generality, it conducts analysis on bytecode and does not require any source code and transaction data. For accuracy, it depicts the control- and data-flow dependency relationships among function calls into a token flow graph. For scalability, it filters out those suspicious paths, in which it conducts inter-contract analysis as necessary. To this end, SMARTCAT can pinpoint attacks in real time once they have been deployed on a chain. The evaluation results illustrate that SMARTCAT significantly outperforms existing baselines with 91.6% recall and ~100% precision. Moreover, SMARTCAT also uncovers 616 attack contracts in-the-wild, accounting for \$9.25M financial losses, with only 19 cases publicly reported. By applying SMARTCAT as a real-time detector in Ethereum and Binance Smart Chain, it has raised 14 alarms 99 seconds after the corresponding deployment on average. These attacks have already led to $641K financial losses, and seven of them are still waiting for their ripe time.

Paperid: 935, https://arxiv.org/pdf/2502.01240.pdf

Abstract:
Logic locking secures hardware designs in untrusted foundries by incorporating key-driven gates to obscure the original blueprint. While this method safeguards the integrated circuit from malicious alterations during fabrication, its influence on data confidentiality during runtime has been ignored. In this study, we employ path sensitization to formally examine the impact of logic locking on confidentiality. By applying three representative logic locking mechanisms on open-source cryptographic benchmarks, we utilize an automatic test pattern generation framework to evaluate the effect of locking on cryptographic encryption keys and sensitive data signals. Our analysis reveals that logic locking can inadvertently cause sensitive data leakage when incorrect logic locking keys are used. We show that a single malicious logic locking key can expose over 70% of an encryption key. If an adversary gains control over other inputs, the entire encryption key can be compromised. This research uncovers a significant security vulnerability in logic locking and emphasizes the need for comprehensive security assessments that extend beyond key-recovery attacks.

Paperid: 936, https://arxiv.org/pdf/2501.19290.pdf

Abstract:
Noninterference theory supports the analysis of secure computations in multi-level security systems. Classical equivalence-based approaches to noninterference mainly rely on bisimilarity. In a nondeterministic setting, assessing noninterference through weak bisimilarity is adequate for irreversible systems, whereas for reversible ones branching bisimilarity has been recently proven to be more appropriate. In this paper we address the same two families of systems, with the difference that probabilities come into play in addition to nondeterminism. For irreversible systems we extend the results of Aldini, Bravetti, and Gorrieri developed in a generative-reactive probabilistic setting, while for reversible systems we extend the results of Esposito, Aldini, Bernardo, and Rossi developed in a purely nondeterministic setting. We recast noninterference properties by adopting probabilistic variants of weak and branching bisimilarities for irreversible and reversible systems respectively. Then we investigate a taxonomy of those properties as well as their preservation and compositionality aspects, along with a comparison with the nondeterministic taxonomy. The adequacy of the extended noninterference theory is illustrated via a probabilistic smart contract example.

Paperid: 937, https://arxiv.org/pdf/2501.18006.pdf

Abstract:
Multimodal Machine Learning systems, particularly those aligning text and image data like CLIP/BLIP models, have become increasingly prevalent, yet remain susceptible to adversarial attacks. While substantial research has addressed adversarial robustness in unimodal contexts, defense strategies for multimodal systems are underexplored. This work investigates the topological signatures that arise between image and text embeddings and shows how adversarial attacks disrupt their alignment, introducing distinctive signatures. We specifically leverage persistent homology and introduce two novel Topological-Contrastive losses based on Total Persistence and Multi-scale kernel methods to analyze the topological signatures introduced by adversarial perturbations. We observe a pattern of monotonic changes in the proposed topological losses emerging in a wide range of attacks on image-text alignments, as more adversarial samples are introduced in the data. By designing an algorithm to back-propagate these signatures to input samples, we are able to integrate these signatures into Maximum Mean Discrepancy tests, creating a novel class of tests that leverage topological signatures for better adversarial detection.

Paperid: 938, https://arxiv.org/pdf/2501.16165.pdf

Abstract:
The Operating System (OS) kernel is foundational in modern computing, especially with the proliferation of diverse computing devices. However, its development also comes with vulnerabilities that can lead to severe security breaches. Kernel fuzzing, a technique used to uncover these vulnerabilities, poses distinct challenges when compared to userspace fuzzing. These include the complexity of configuring the testing environment and addressing the statefulness inherent to both the kernel and the fuzzing process. Despite the significant interest from the security community, a comprehensive understanding of kernel fuzzing remains lacking, hindering further progress in the field. In this paper, we present the first systematic study dedicated to OS kernel fuzzing. It begins by summarizing the progress of 99 academic studies from top-tier venues between 2017 and 2024. Following this, we introduce a stage-based fuzzing model and a novel fuzzing taxonomy that highlights nine core functionalities unique to kernel fuzzing. These functionalities are examined alongside their corresponding methodological approaches based on qualitative evaluation criteria. Our systematization identifies challenges in meeting functionality requirements and proposes potential technical solutions. Finally, we outline promising and practical future directions to guide forthcoming research in kernel security, supported in part by insights derived from our case study.

Paperid: 939, https://arxiv.org/pdf/2501.08834.pdf

Abstract:
Billions of dollars are transacted through smart contracts, making vulnerabilities a major financial risk. One focus in the security arms race is on profitable vulnerabilities that attackers can exploit. Fuzzing is a key method for identifying these vulnerabilities. However, current solutions face two main limitations: a lack of profit-centric techniques for expediting detection, and insufficient automation in maximizing the profitability of discovered vulnerabilities, leaving the analysis to human experts. To address these gaps, we have developed VERITE, a profit-centric smart contract fuzzing framework that not only effectively detects those profitable vulnerabilities but also maximizes the exploited profits. VERITE has three key features: 1) DeFi action-based mutators for boosting the exploration of transactions with different fund flows; 2) potentially profitable candidates identification criteria, which checks whether the input has caused abnormal fund flow properties during testing; 3) a gradient descent-based profit maximization strategy for these identified candidates. VERITE is fully developed from scratch and evaluated on a dataset consisting of 61 exploited real-world DeFi projects with an average of over 1.1 million dollars loss. The results show that VERITE can automatically extract more than 18 million dollars in total and is significantly better than state-of-the-art fuzzer ITYFUZZ in both detection (29/10) and exploitation (134 times more profits gained on average). Remarkably, in 12 targets, it gains more profits than real-world attacking exploits (1.01 to 11.45 times more). VERITE is also applied by auditors in contract auditing, where 6 (5 high severity) zero-day vulnerabilities are found with over $2,500 bounty rewards.

Paperid: 940, https://arxiv.org/pdf/2506.20481.pdf

Abstract:
Machine learning models are known to memorize samples from their training data, raising concerns around privacy and generalization. Counterfactual self-influence is a popular metric to study memorization, quantifying how the model's prediction for a sample changes depending on the sample's inclusion in the training dataset. However, recent work has shown memorization to be affected by factors beyond self-influence, with other training samples, in particular (near-)duplicates, having a large impact. We here study memorization treating counterfactual influence as a distributional quantity, taking into account how all training samples influence how a sample is memorized. For a small language model, we compute the full influence distribution of training samples on each other and analyze its properties. We find that solely looking at self-influence can severely underestimate tangible risks associated with memorization: the presence of (near-)duplicates seriously reduces self-influence, while we find these samples to be (near-)extractable. We observe similar patterns for image classification, where simply looking at the influence distributions reveals the presence of near-duplicates in CIFAR-10. Our findings highlight that memorization stems from complex interactions across training data and is better captured by the full influence distribution than by self-influence alone.

Paperid: 941, https://arxiv.org/pdf/2506.19892.pdf

Abstract:
Decentralized Federated Learning (DFL) enables nodes to collaboratively train models without a central server, introducing new vulnerabilities since each node independently selects peers for model aggregation. Malicious nodes may exploit this autonomy by sending corrupted models (model poisoning), delaying model submissions (delay attack), or flooding the network with excessive messages, negatively affecting system performance. Existing solutions often depend on rigid configurations or additional infrastructures such as blockchain, leading to computational overhead, scalability issues, or limited adaptability. To overcome these limitations, this paper proposes RepuNet, a decentralized reputation system that categorizes threats in DFL and dynamically evaluates node behavior using metrics like model similarity, parameter changes, message latency, and communication volume. Nodes' influence in model aggregation is adjusted based on their reputation scores. RepuNet was integrated into the Nebula DFL platform and experimentally evaluated with MNIST and CIFAR-10 datasets under non-IID distributions, using federations of up to 25 nodes in both fully connected and random topologies. Different attack intensities, frequencies, and activation intervals were tested. Results demonstrated that RepuNet effectively detects and mitigates malicious behavior, achieving F1 scores above 95% for MNIST scenarios and approximately 76% for CIFAR-10 cases. These outcomes highlight RepuNet's adaptability, robustness, and practical potential for mitigating threats in decentralized federated learning environments.

Paperid: 942, https://arxiv.org/pdf/2506.17162.pdf

Abstract:
Malicious PDF files have emerged as a persistent threat and become a popular attack vector in web-based attacks. While machine learning-based PDF malware classifiers have shown promise, these classifiers are often susceptible to adversarial attacks, undermining their reliability. To address this issue, recent studies have aimed to enhance the robustness of PDF classifiers. Despite these efforts, the feature engineering underlying these studies remains outdated. Consequently, even with the application of cutting-edge machine learning techniques, these approaches fail to fundamentally resolve the issue of feature instability. To tackle this, we propose a novel approach for PDF feature extraction and PDF malware detection. We introduce the PDFObj IR (PDF Object Intermediate Representation), an assembly-like language framework for PDF objects, from which we extract semantic features using a pretrained language model. Additionally, we construct an Object Reference Graph to capture structural features, drawing inspiration from program analysis. This dual approach enables us to analyze and detect PDF malware based on both semantic and structural features. Experimental results demonstrate that our proposed classifier achieves strong adversarial robustness while maintaining an exceptionally low false positive rate of only 0.07% on baseline dataset compared to state-of-the-art PDF malware classifiers.

Paperid: 943, https://arxiv.org/pdf/2506.13009.pdf

Abstract:
Machine unlearning focuses on efficiently removing specific data from trained models, addressing privacy and compliance concerns with reasonable costs. Although exact unlearning ensures complete data removal equivalent to retraining, it is impractical for large-scale models, leading to growing interest in inexact unlearning methods. However, the lack of formal guarantees in these methods necessitates the need for robust evaluation frameworks to assess their privacy and effectiveness. In this work, we first identify several key pitfalls of the existing unlearning evaluation frameworks, e.g., focusing on average-case evaluation or targeting random samples for evaluation, incomplete comparisons with the retraining baseline. Then, we propose RULI (Rectified Unlearning Evaluation Framework via Likelihood Inference), a novel framework to address critical gaps in the evaluation of inexact unlearning methods. RULI introduces a dual-objective attack to measure both unlearning efficacy and privacy risks at a per-sample granularity. Our findings reveal significant vulnerabilities in state-of-the-art unlearning methods, where RULI achieves higher attack success rates, exposing privacy risks underestimated by existing methods. Built on a game-based foundation and validated through empirical evaluations on both image and text data (spanning tasks from classification to generation), RULI provides a rigorous, scalable, and fine-grained methodology for evaluating unlearning techniques.

Paperid: 944, https://arxiv.org/pdf/2506.11413.pdf

Abstract:
Federated learning (FL) enables decentralized machine learning without sharing raw data, allowing multiple clients to collaboratively learn a global model. However, studies reveal that privacy leakage is possible under commonly adopted FL protocols. In particular, a server with access to client gradients can synthesize data resembling the clients' training data. In this paper, we introduce a novel threat model in FL, named the maliciously curious client, where a client manipulates its own gradients with the goal of inferring private data from peers. This attacker uniquely exploits the strength of a Byzantine adversary, traditionally aimed at undermining model robustness, and repurposes it to facilitate data reconstruction attack. We begin by formally defining this novel client-side threat model and providing a theoretical analysis that demonstrates its ability to achieve significant reconstruction success during FL training. To demonstrate its practical impact, we further develop a reconstruction algorithm that combines gradient inversion with malicious update strategies. Our analysis and experimental results reveal a critical blind spot in FL defenses: both server-side robust aggregation and client-side privacy mechanisms may fail against our proposed attack. Surprisingly, standard server- and client-side defenses designed to enhance robustness or privacy may unintentionally amplify data leakage. Compared to the baseline approach, a mistakenly used defense may instead improve the reconstructed image quality by 10-15%.

Paperid: 945, https://arxiv.org/pdf/2506.10665.pdf

Abstract:
Intelligent Transportation Systems (ITSs) technology has advanced during the past years, and it is now used for several applications that require vehicles to exchange real-time data, such as in traffic information management. Traditionally, road traffic information has been collected using on-site sensors. However, crowd-sourcing traffic information from onboard sensors or smartphones has become a viable alternative. State-of-the-art solutions currently follow a centralized model where only the service provider has complete access to the collected traffic data and represent a single point of failure and trust. In this paper, we propose GOLIATH, a blockchain-based decentralized framework that runs on the In-Vehicle Infotainment (IVI) system to collect real-time information exchanged between the network's participants. Our approach mitigates the limitations of existing crowd-sourcing centralized solutions by guaranteeing trusted information collection and exchange, fully exploiting the intrinsic distributed nature of vehicles. We demonstrate its feasibility in the context of vehicle positioning and traffic information management. Each vehicle participating in the decentralized network shares its position and neighbors' ones in the form of a transaction recorded on the ledger, which uses a novel consensus mechanism to validate it. We design the consensus mechanism resilient against a realistic set of adversaries that aim to tamper or disable the communication. We evaluate the proposed framework in a simulated (but realistic) environment, which considers different threats and allows showing its robustness and safety properties.

Paperid: 946, https://arxiv.org/pdf/2506.10638.pdf

Abstract:
In the last decades, Cyber-physical Systems (CPSs) have experienced a significant technological evolution and increased connectivity, at the cost of greater exposure to cyber-attacks. Since many CPS are used in safety-critical systems, such attacks entail high risks and potential safety harms. Although several defense strategies have been proposed, they rarely exploit the cyber-physical nature of the system. In this work, we exploit the nature of CPS by proposing CyFence, a novel architecture that improves the resilience of closed-loop control systems against cyber-attacks by adding a semantic check, used to confirm that the system is behaving as expected. To ensure the security of the semantic check code, we use the Trusted Execution Environment implemented by modern processors. We evaluate CyFence considering a real-world application, consisting of an active braking digital controller, demonstrating that it can mitigate different types of attacks with a negligible computation overhead.

Paperid: 947, https://arxiv.org/pdf/2506.10620.pdf

Abstract:
The security of modern vehicles has become increasingly important, with the controller area network (CAN) bus serving as a critical communication backbone for various Electronic Control Units (ECUs). The absence of robust security measures in CAN, coupled with the increasing connectivity of vehicles, makes them susceptible to cyberattacks. While intrusion detection systems (IDSs) have been developed to counter such threats, they are not foolproof. Adversarial attacks, particularly evasion attacks, can manipulate inputs to bypass detection by IDSs. This paper extends our previous work by investigating the feasibility and impact of gradient-based adversarial attacks performed with different degrees of knowledge against automotive IDSs. We consider three scenarios: white-box (attacker with full system knowledge), grey-box (partial system knowledge), and the more realistic black-box (no knowledge of the IDS' internal workings or data). We evaluate the effectiveness of the proposed attacks against state-of-the-art IDSs on two publicly available datasets. Additionally, we study effect of the adversarial perturbation on the attack impact and evaluate real-time feasibility by precomputing evasive payloads for timed injection based on bus traffic. Our results demonstrate that, besides attacks being challenging due to the automotive domain constraints, their effectiveness is strongly dependent on the dataset quality, the target IDS, and the attacker's degree of knowledge.

Paperid: 948, https://arxiv.org/pdf/2506.09525.pdf

Abstract:
Federated recommendation (FR) is a promising paradigm to protect user privacy in recommender systems. Distinct from general federated scenarios, FR inherently needs to preserve client-specific parameters, i.e., user embeddings, for privacy and personalization. However, we empirically find that globally aggregated item embeddings can induce skew in user embeddings, resulting in suboptimal performance. To this end, we theoretically analyze the user embedding skew issue and propose Personalized Federated recommendation with Calibration via Low-Rank decomposition (PFedCLR). Specifically, PFedCLR introduces an integrated dual-function mechanism, implemented with a buffer matrix, to jointly calibrate local user embedding and personalize global item embeddings. To ensure efficiency, we employ a low-rank decomposition of the buffer matrix to reduce the model overhead. Furthermore, for privacy, we train and upload the local model before personalization, preventing the server from accessing sensitive information. Extensive experiments demonstrate that PFedCLR effectively mitigates user embedding skew and achieves a desirable trade-off among performance, efficiency, and privacy, outperforming state-of-the-art (SOTA) methods.

Paperid: 949, https://arxiv.org/pdf/2506.09148.pdf

Abstract:
Adversarial attacks on Natural Language Processing (NLP) models expose vulnerabilities by introducing subtle perturbations to input text, often leading to misclassification while maintaining human readability. Existing methods typically focus on word-level or local text segment alterations, overlooking the broader context, which results in detectable or semantically inconsistent perturbations. We propose a novel adversarial text attack scheme named Dynamic Contextual Perturbation (DCP). DCP dynamically generates context-aware perturbations across sentences, paragraphs, and documents, ensuring semantic fidelity and fluency. Leveraging the capabilities of pre-trained language models, DCP iteratively refines perturbations through an adversarial objective function that balances the dual objectives of inducing model misclassification and preserving the naturalness of the text. This comprehensive approach allows DCP to produce more sophisticated and effective adversarial examples that better mimic natural language patterns. Our experimental results, conducted on various NLP models and datasets, demonstrate the efficacy of DCP in challenging the robustness of state-of-the-art NLP systems. By integrating dynamic contextual analysis, DCP significantly enhances the subtlety and impact of adversarial attacks. This study highlights the critical role of context in adversarial attacks and lays the groundwork for creating more robust NLP systems capable of withstanding sophisticated adversarial strategies.

Paperid: 950, https://arxiv.org/pdf/2506.07605.pdf

Abstract:
Federated Learning has emerged as a privacy-oriented alternative to centralized Machine Learning, enabling collaborative model training without direct data sharing. While extensively studied for neural networks, the security and privacy implications of tree-based models remain underexplored. This work introduces TimberStrike, an optimization-based dataset reconstruction attack targeting horizontally federated tree-based models. Our attack, carried out by a single client, exploits the discrete nature of decision trees by using split values and decision paths to infer sensitive training data from other clients. We evaluate TimberStrike on State-of-the-Art federated gradient boosting implementations across multiple frameworks, including Flower, NVFlare, and FedTree, demonstrating their vulnerability to privacy breaches. On a publicly available stroke prediction dataset, TimberStrike consistently reconstructs between 73.05% and 95.63% of the target dataset across all implementations. We further analyze Differential Privacy, showing that while it partially mitigates the attack, it also significantly degrades model performance. Our findings highlight the need for privacy-preserving mechanisms specifically designed for tree-based Federated Learning systems, and we provide preliminary insights into their design.

Paperid: 951, https://arxiv.org/pdf/2506.07392.pdf

Abstract:
The proliferation of UAVs has enabled a wide range of mission-critical applications and is becoming a cornerstone of low-altitude networks, supporting smart cities, emergency response, and more. However, the open wireless environment, dynamic topology, and resource constraints of UAVs expose low-altitude networks to severe DoS threats. Traditional defense approaches, which rely on fixed configurations or centralized decision-making, cannot effectively respond to the rapidly changing conditions in UAV swarm environments. To address these challenges, we propose a novel federated multi-agent deep reinforcement learning (FMADRL)-driven moving target defense (MTD) framework for proactive DoS mitigation in low-altitude networks. Specifically, we design lightweight and coordinated MTD mechanisms, including leader switching, route mutation, and frequency hopping, to disrupt attacker efforts and enhance network resilience. The defense problem is formulated as a multi-agent partially observable Markov decision process, capturing the uncertain nature of UAV swarms under attack. Each UAV is equipped with a policy agent that autonomously selects MTD actions based on partial observations and local experiences. By employing a policy gradient-based algorithm, UAVs collaboratively optimize their policies via reward-weighted aggregation. Extensive simulations demonstrate that our approach significantly outperforms state-of-the-art baselines, achieving up to a 34.6% improvement in attack mitigation rate, a reduction in average recovery time of up to 94.6%, and decreases in energy consumption and defense cost by as much as 29.3% and 98.3%, respectively, under various DoS attack strategies. These results highlight the potential of intelligent, distributed defense mechanisms to protect low-altitude networks, paving the way for reliable and scalable low-altitude economy.

Paperid: 952, https://arxiv.org/pdf/2506.07153.pdf

Abstract:
Web-use agents are rapidly being deployed to automate complex web tasks, operating with extensive browser capabilities including multi-tab navigation, DOM manipulation, JavaScript execution and authenticated session access. However, these powerful capabilities create a critical and previously unexplored attack surface. This paper demonstrates how attackers can exploit web-use agents' high-privilege capabilities by embedding malicious content in web pages such as comments, reviews, or advertisements that agents encounter during legitimate browsing tasks. In addition, we introduce the task-aligned injection technique that frame malicious commands as helpful task guidance rather than obvious attacks. This technique exploiting fundamental limitations in LLMs' contextual reasoning: agents struggle in maintaining coherent contextual awareness and fail to detect when seemingly helpful web content contains steering attempts that deviate from their original task goal. Through systematic evaluation of four popular agents (OpenAI Operator, Browser Use, Do Browser, OpenOperator), we demonstrate nine payload types that compromise confidentiality, integrity, and availability, including unauthorized camera activation, user impersonation, local file exfiltration, password leakage, and denial of service, with validation across multiple LLMs achieving success rates of 80%-100%. These payloads succeed across agents with built-in safety mechanisms, requiring only the ability to post content on public websites, creating unprecedented risks given the ease of exploitation combined with agents' high-privilege access. To address this attack, we propose comprehensive mitigation strategies including oversight mechanisms, execution constraints, and task-aware reasoning techniques, providing practical directions for secure development and deployment.

Paperid: 953, https://arxiv.org/pdf/2506.07077.pdf

Abstract:
Differential Privacy (DP) is a widely adopted technique, valued for its effectiveness in protecting the privacy of task-specific datasets, making it a critical tool for large language models. However, its effectiveness in Multimodal Large Language Models (MLLMs) remains uncertain. Applying Differential Privacy (DP) inherently introduces substantial computation overhead, a concern particularly relevant for MLLMs which process extensive textual and visual data. Furthermore, a critical challenge of DP is that the injected noise, necessary for privacy, scales with parameter dimensionality, leading to pronounced model degradation; This trade-off between privacy and utility complicates the application of Differential Privacy (DP) to complex architectures like MLLMs. To address these, we propose Dual-Priv Pruning, a framework that employs two complementary pruning mechanisms for DP fine-tuning in MLLMs: (i) visual token pruning to reduce input dimensionality by removing redundant visual information, and (ii) gradient-update pruning during the DP optimization process. This second mechanism selectively prunes parameter updates based on the magnitude of noisy gradients, aiming to mitigate noise impact and improve utility. Experiments demonstrate that our approach achieves competitive results with minimal performance degradation. In terms of computational efficiency, our approach consistently utilizes less memory than standard DP-SGD. While requiring only 1.74% more memory than zeroth-order methods which suffer from severe performance issues on A100 GPUs, our method demonstrates leading memory efficiency on H20 GPUs. To the best of our knowledge, we are the first to explore DP fine-tuning in MLLMs. Our code is coming soon.

Paperid: 954, https://arxiv.org/pdf/2506.04978.pdf

Abstract:
The challenges derived from the data-intensive nature of machine learning in conjunction with technologies that enable novel paradigms such as V2X and the potential offered by 5G communication, allow and justify the deployment of Federated Learning (FL) solutions in the vehicular intrusion detection domain. In this paper, we investigate the effects of integrating FL strategies into the machine learning-based intrusion detection process for on-board vehicular networks. Accordingly, we propose a FL implementation of a state-of-the-art Intrusion Detection System (IDS) for Controller Area Network (CAN), based on LSTM autoencoders. We thoroughly evaluate its detection efficiency and communication overhead, comparing it to a centralized version of the same algorithm, thereby presenting it as a feasible solution.

Paperid: 955, https://arxiv.org/pdf/2506.02679.pdf

Abstract:
A significant body of research in decentralized federated learning focuses on combining the privacy-preserving properties of federated learning with the resilience and transparency offered by blockchain-based systems. While these approaches are promising, they often lack flexible tools to evaluate system robustness under adversarial conditions. To fill this gap, we present FedBlockParadox, a modular framework for modeling and evaluating decentralized federated learning systems built on blockchain technologies, with a focus on resilience against a broad spectrum of adversarial attack scenarios. It supports multiple consensus protocols, validation methods, aggregation strategies, and configurable attack models. By enabling controlled experiments, FedBlockParadox provides a valuable resource for researchers developing secure, decentralized learning solutions. The framework is open-source and built to be extensible by the community.

Paperid: 956, https://arxiv.org/pdf/2506.02660.pdf

Abstract:
Machine learning algorithms can effectively classify malware through dynamic behavior but are susceptible to adversarial attacks. Existing attacks, however, often fail to find an effective solution in both the feature and problem spaces. This issue arises from not addressing the intrinsic nondeterministic nature of malware, namely executing the same sample multiple times may yield significantly different behaviors. Hence, the perturbations computed for a specific behavior may be ineffective for others observed in subsequent executions. In this paper, we show how an attacker can augment their chance of success by leveraging a new and more efficient feature space algorithm for sequential data, which we have named PS-FGSM, and by adopting two problem space strategies specially tailored to address nondeterminism in the problem space. We implement our novel algorithm and attack strategies in Tarallo, an end-to-end adversarial framework that significantly outperforms previous works in both white and black-box scenarios. Our preliminary analysis in a sandboxed environment and against two RNN-based malware detectors, shows that Tarallo achieves a success rate up to 99% on both feature and problem space attacks while significantly minimizing the number of modifications required for misclassification.

Paperid: 957, https://arxiv.org/pdf/2506.00857.pdf

Abstract:
In the modern global Integrated Circuit (IC) supply chain, protecting intellectual property (IP) is a complex challenge, and balancing IP loss risk and added cost for theft countermeasures is hard to achieve. Using embedded configurable logic allows designers to completely hide the functionality of selected design portions from parties that do not have access to the configuration string (bitstream). However, the design space of redacted solutions is huge, with trade-offs between the portions selected for redaction and the configuration of the configurable embedded logic. We propose ARIANNA, a complete flow that aids the designer in all the stages, from selecting the logic to be hidden to tailoring the bespoke fabrics for the configurable logic used to hide it. We present a security evaluation of the considered fabrics and introduce two heuristics for the novel bespoke fabric flow. We evaluate the heuristics against an exhaustive approach. We also evaluate the complete flow using a selection of benchmarks. Results show that using ARIANNA to customize the redaction fabrics yields up to 3.3x lower overheads and 4x higher eFPGA fabric utilization than a one-fits-all fabric as proposed in prior works.

Paperid: 958, https://arxiv.org/pdf/2506.00831.pdf

Abstract:
Existing threat modeling frameworks related to transportation cyber-physical systems (CPS) are often narrow in scope, labor-intensive, and require substantial cybersecurity expertise. To this end, we introduce the Transportation Cybersecurity and Resiliency Threat Modeling Framework (TraCR-TMF), a large language model (LLM)-based threat modeling framework for transportation CPS that requires limited cybersecurity expert intervention. TraCR-TMF identifies threats, potential attack techniques, and relevant countermeasures for transportation CPS. Three LLM-based approaches support these identifications: (i) a retrieval-augmented generation approach requiring no cybersecurity expert intervention, (ii) an in-context learning approach with low expert intervention, and (iii) a supervised fine-tuning approach with moderate expert intervention. TraCR-TMF offers LLM-based attack path identification for critical assets based on vulnerabilities across transportation CPS entities. Additionally, it incorporates the Common Vulnerability Scoring System (CVSS) scores of known exploited vulnerabilities to prioritize threat mitigations. The framework was evaluated through two cases. First, the framework identified relevant attack techniques for various transportation CPS applications, 73% of which were validated by cybersecurity experts as correct. Second, the framework was used to identify attack paths for a target asset in a real-world cyberattack incident. TraCR-TMF successfully predicted exploitations, like lateral movement of adversaries, data exfiltration, and data encryption for ransomware, as reported in the incident. These findings show the efficacy of TraCR-TMF in transportation CPS threat modeling, while reducing the need for extensive involvement of cybersecurity experts. To facilitate real-world adoptions, all our codes are shared via an open-source repository.

Paperid: 959, https://arxiv.org/pdf/2506.00659.pdf

Abstract:
Anti-analysis techniques, particularly packing, challenge malware analysts, making packer identification fundamental. Existing packer identifiers have significant limitations: signature-based methods lack flexibility and struggle against dynamic evasion, while Machine Learning approaches require extensive training data, limiting scalability and adaptability. Consequently, achieving accurate and adaptable packer identification remains an open problem. This paper presents PackHero, a scalable and efficient methodology for identifying packers using a novel static approach. PackHero employs a Graph Matching Network and clustering to match and group Call Graphs from programs packed with known packers. We evaluate our approach on a public dataset of malware and benign samples packed with various packers, demonstrating its effectiveness and scalability across varying sample sizes. PackHero achieves a macro-average F1-score of 93.7% with just 10 samples per packer, improving to 98.3% with 100 samples. Notably, PackHero requires fewer samples to achieve stable performance compared to other Machine Learning-based tools. Overall, PackHero matches the performance of State-of-the-art signature-based tools, outperforming them in handling Virtualization-based packers such as Themida/Winlicense, with a recall of 100%.

Paperid: 960, https://arxiv.org/pdf/2506.00654.pdf

Abstract:
Money laundering is a financial crime that poses a serious threat to financial integrity and social security. The growing number of transactions makes it necessary to use automatic tools that help law enforcement agencies detect such criminal activity. In this work, we present Amatriciana, a novel approach based on Graph Neural Networks to detect money launderers inside a graph of transactions by considering temporal information. Amatriciana uses the whole graph of transactions without splitting it into several time-based subgraphs, exploiting all relational information in the dataset. Our experiments on a public dataset reveal that the model can learn from a limited amount of data. Furthermore, when more data is available, the model outperforms other State-of-the-art approaches; in particular, Amatriciana decreases the number of False Positives (FPs) while detecting many launderers. In summary, Amatriciana achieves an F1 score of 0.76. In addition, it lowers the FPs by 55% with respect to other State-of-the-art models.

Paperid: 961, https://arxiv.org/pdf/2505.24440.pdf

Abstract:
We compare the efficiency of restaking and Proof-of-Stake (PoS) protocols in terms of stake requirements. First, we consider the sufficient condition for the restaking graph to be secure. We show that the condition implies that it is always possible to transform such a restaking graph into secure PoS protocols. Next, we derive two main results, giving upper and lower bounds on required extra stakes that one needs to add to validators of the secure restaking graph to be able to transform it into secure PoS protocols. In particular, we show that the restaking savings compared to PoS protocols can be very large and can asymptotically grow in the worst case as a square root of the number of validators. We also study a complementary question of transforming secure PoS protocols into an aggregate secure restaking graph and provide lower and upper bounds on the PoS savings compared to restaking.

Paperid: 962, https://arxiv.org/pdf/2505.18760.pdf

Abstract:
Many critical information technology and cyber-physical systems rely on a supply chain of open-source software projects. OSS project maintainers often integrate contributions from external actors. While maintainers can assess the correctness of a change request, assessing a change request's cybersecurity implications is challenging. To help maintainers make this decision, we propose that the open-source ecosystem should incorporate Actor Reputation Metrics (ARMS). This capability would enable OSS maintainers to assess a prospective contributor's cybersecurity reputation. To support the future instantiation of ARMS, we identify seven generic security signals from industry standards; map concrete metrics from prior work and available security tools, describe study designs to refine and assess the utility of ARMS, and finally weigh its pros and cons.

Paperid: 963, https://arxiv.org/pdf/2505.18384.pdf

Abstract:
Foundation models are increasingly becoming better autonomous programmers, raising the prospect that they could also automate dangerous offensive cyber-operations. Current frontier model audits probe the cybersecurity risks of such agents, but most fail to account for the degrees of freedom available to adversaries in the real world. In particular, with strong verifiers and financial incentives, agents for offensive cybersecurity are amenable to iterative improvement by would-be adversaries. We argue that assessments should take into account an expanded threat model in the context of cybersecurity, emphasizing the varying degrees of freedom that an adversary may possess in stateful and non-stateful environments within a fixed compute budget. We show that even with a relatively small compute budget (8 H100 GPU Hours in our study), adversaries can improve an agent's cybersecurity capability on InterCode CTF by more than 40\% relative to the baseline -- without any external assistance. These results highlight the need to evaluate agents' cybersecurity risk in a dynamic manner, painting a more representative picture of risk.

Paperid: 964, https://arxiv.org/pdf/2505.15753.pdf

Abstract:
Large Language Models (LLMs) are known to be vulnerable to jailbreaking attacks, wherein adversaries exploit carefully engineered prompts to induce harmful or unethical responses. Such threats have raised critical concerns about the safety and reliability of LLMs in real-world deployment. While existing defense mechanisms partially mitigate such risks, subsequent advancements in adversarial techniques have enabled novel jailbreaking methods to circumvent these protections, exposing the limitations of static defense frameworks. In this work, we explore defending against evolving jailbreaking threats through the lens of context retrieval. First, we conduct a preliminary study demonstrating that even a minimal set of safety-aligned examples against a particular jailbreak can significantly enhance robustness against this attack pattern. Building on this insight, we further leverage the retrieval-augmented generation (RAG) techniques and propose Safety Context Retrieval (SCR), a scalable and robust safeguarding paradigm for LLMs against jailbreaking. Our comprehensive experiments demonstrate how SCR achieves superior defensive performance against both established and emerging jailbreaking tactics, contributing a new paradigm to LLM safety. Our code will be available upon publication.

Paperid: 965, https://arxiv.org/pdf/2505.15738.pdf

Abstract:
Large language models (LLMs) are rapidly deployed in real-world applications ranging from chatbots to agentic systems. Alignment is one of the main approaches used to defend against attacks such as prompt injection and jailbreaks. Recent defenses report near-zero Attack Success Rates (ASR) even against Greedy Coordinate Gradient (GCG), a white-box attack that generates adversarial suffixes to induce attacker-desired outputs. However, this search space over discrete tokens is extremely large, making the task of finding successful attacks difficult. GCG has, for instance, been shown to converge to local minima, making it sensitive to initialization choices. In this paper, we assess the future-proof robustness of these defenses using a more informed threat model: attackers who have access to some information about the alignment process. Specifically, we propose an informed white-box attack leveraging the intermediate model checkpoints to initialize GCG, with each checkpoint acting as a stepping stone for the next one. We show this approach to be highly effective across state-of-the-art (SOTA) defenses and models. We further show our informed initialization to outperform other initialization methods and show a gradient-informed checkpoint selection strategy to greatly improve attack performance and efficiency. Importantly, we also show our method to successfully find universal adversarial suffixes -- single suffixes effective across diverse inputs. Our results show that, contrary to previous beliefs, effective adversarial suffixes do exist against SOTA alignment-based defenses, that these can be found by existing attack methods when adversaries exploit alignment knowledge, and that even universal suffixes exist. Taken together, our results highlight the brittleness of current alignment-based methods and the need to consider stronger threat models when testing the safety of LLMs.

Paperid: 966, https://arxiv.org/pdf/2505.14835.pdf

Abstract:
Watermarking by altering token sampling probabilities based on red-green list is a promising method for tracing the origin of text generated by large language models (LLMs). However, existing watermark methods often struggle with a fundamental dilemma: improving watermark effectiveness (the detectability of the watermark) often comes at the cost of reduced text quality. This trade-off limits their practical application. To address this challenge, we first formalize the problem within a multi-objective trade-off analysis framework. Within this framework, we identify a key factor that influences the dilemma. Unlike existing methods, where watermark strength is typically treated as a fixed hyperparameter, our theoretical insights lead to the development of MorphMarka method that adaptively adjusts the watermark strength in response to changes in the identified factor, thereby achieving an effective resolution of the dilemma. In addition, MorphMark also prioritizes flexibility since it is a model-agnostic and model-free watermark method, thereby offering a practical solution for real-world deployment, particularly in light of the rapid evolution of AI models. Extensive experiments demonstrate that MorphMark achieves a superior resolution of the effectiveness-quality dilemma, while also offering greater flexibility and time and space efficiency.

Paperid: 968, https://arxiv.org/pdf/2504.18581.pdf

Abstract:
Semantic communication (SemCom) improves transmission efficiency by focusing on task-relevant information. However, transmitting semantic-rich data over insecure channels introduces privacy risks. This paper proposes a novel SemCom framework that integrates differential privacy (DP) mechanisms to protect sensitive semantic features. This method employs the generative adversarial network (GAN) inversion technique to extract disentangled semantic features and uses neural networks (NNs) to approximate the DP application and removal processes, effectively mitigating the non-invertibility issue of DP. Additionally, an NN-based encryption scheme is introduced to strengthen the security of channel inputs. Simulation results demonstrate that the proposed approach effectively prevents eavesdroppers from reconstructing sensitive information by generating chaotic or fake images, while ensuring high-quality image reconstruction for legitimate users. The system exhibits robust performance across various privacy budgets and channel conditions, achieving an optimal balance between privacy protection and reconstruction fidelity.

Paperid: 969, https://arxiv.org/pdf/2504.14812.pdf

Abstract:
Eavesdropping on sounds emitted by mobile device loudspeakers can capture sensitive digital information, such as SMS verification codes, credit card numbers, and withdrawal passwords, which poses significant security risks. Existing schemes either require expensive specialized equipment, rely on spyware, or are limited to close-range signal acquisition. In this paper, we propose a scheme, CSI2Dig, for recovering digit content from Channel State Information (CSI) when digits are played through a smartphone loudspeaker. We observe that the electromagnetic interference caused by the audio signals from the loudspeaker affects the WiFi signals emitted by the phone's WiFi antenna. Building upon contrastive learning and denoising autoencoders, we develop a two-branch autoencoder network designed to amplify the impact of this electromagnetic interference on CSI. For feature extraction, we introduce the TS-Net, a model that captures relevant features from both the temporal and spatial dimensions of the CSI data. We evaluate our scheme across various devices, distances, volumes, and other settings. Experimental results demonstrate that our scheme can achieve an accuracy of 72.97%.

Paperid: 970, https://arxiv.org/pdf/2504.08205.pdf

Abstract:
Vision models are increasingly deployed in critical applications such as autonomous driving and CCTV monitoring, yet they remain susceptible to resource-consuming attacks. In this paper, we introduce a novel energy-overloading attack that leverages vision language model (VLM) prompts to generate adversarial images targeting vision models. These images, though imperceptible to the human eye, significantly increase GPU energy consumption across various vision models, threatening the availability of these systems. Our framework, EO-VLM (Energy Overload via VLM), is model-agnostic, meaning it is not limited by the architecture or type of the target vision model. By exploiting the lack of safety filters in VLMs like DALL-E 3, we create adversarial noise images without requiring prior knowledge or internal structure of the target vision models. Our experiments demonstrate up to a 50% increase in energy consumption, revealing a critical vulnerability in current vision models.

Paperid: 971, https://arxiv.org/pdf/2504.07137.pdf

Abstract:
Large Language Models (LLMs) have recently emerged as powerful tools in cybersecurity, offering advanced capabilities in malware detection, generation, and real-time monitoring. Numerous studies have explored their application in cybersecurity, demonstrating their effectiveness in identifying novel malware variants, analyzing malicious code structures, and enhancing automated threat analysis. Several transformer-based architectures and LLM-driven models have been proposed to improve malware analysis, leveraging semantic and structural insights to recognize malicious intent more accurately. This study presents a comprehensive review of LLM-based approaches in malware code analysis, summarizing recent advancements, trends, and methodologies. We examine notable scholarly works to map the research landscape, identify key challenges, and highlight emerging innovations in LLM-driven cybersecurity. Additionally, we emphasize the role of static analysis in malware detection, introduce notable datasets and specialized LLM models, and discuss essential datasets supporting automated malware research. This study serves as a valuable resource for researchers and cybersecurity professionals, offering insights into LLM-powered malware detection and defence strategies while outlining future directions for strengthening cybersecurity resilience.

Paperid: 972, https://arxiv.org/pdf/2503.23718.pdf

Abstract:
Smart contracts are fundamental pillars of the blockchain, playing a crucial role in facilitating various business transactions. However, these smart contracts are vulnerable to exploitable bugs that can lead to substantial monetary losses. A recent study reveals that over 80% of these exploitable bugs, which are primarily functional bugs, can evade the detection of current tools. The primary issue is the significant gap between understanding the high-level logic of the business model and checking the low-level implementations in smart contracts. Furthermore, identifying deeply rooted functional bugs in smart contracts requires the automated generation of effective detection oracles based on various bug features. To address these challenges, we design and implement PROMFUZZ, an automated and scalable system to detect functional bugs, in smart contracts. In PROMFUZZ, we first propose a novel Large Language Model (LLM)-driven analysis framework, which leverages a dual-agent prompt engineering strategy to pinpoint potentially vulnerable functions for further scrutiny. We then implement a dual-stage coupling approach, which focuses on generating invariant checkers that leverage logic information extracted from potentially vulnerable functions. Finally, we design a bug-oriented fuzzing engine, which maps the logical information from the high-level business model to the low-level smart contract implementations, and performs the bug-oriented fuzzing on targeted functions. We compare PROMFUZZ with multiple state-of-the-art methods. The results show that PROMFUZZ achieves 86.96% recall and 93.02% F1-score in detecting functional bugs, marking at least a 50% improvement in both metrics over state-of-the-art methods. Moreover, we perform an in-depth analysis on real-world DeFi projects and detect 30 zero-day bugs. Up to now, 24 zero-day bugs have been assigned CVE IDs.

Paperid: 973, https://arxiv.org/pdf/2503.06340.pdf

Abstract:
Diffusion models are powerful generative models in continuous data domains such as image and video data. Discrete graph diffusion models (DGDMs) have recently extended them for graph generation, which are crucial in fields like molecule and protein modeling, and obtained the SOTA performance. However, it is risky to deploy DGDMs for safety-critical applications (e.g., drug discovery) without understanding their security vulnerabilities. In this work, we perform the first study on graph diffusion models against backdoor attacks, a severe attack that manipulates both the training and inference/generation phases in graph diffusion models. We first define the threat model, under which we design the attack such that the backdoored graph diffusion model can generate 1) high-quality graphs without backdoor activation, 2) effective, stealthy, and persistent backdoored graphs with backdoor activation, and 3) graphs that are permutation invariant and exchangeable--two core properties in graph generative models. 1) and 2) are validated via empirical evaluations without and with backdoor defenses, while 3) is validated via theoretical results.

Paperid: 974, https://arxiv.org/pdf/2503.03586.pdf

Abstract:
Large Language Models (LLMs) have shown promise in software vulnerability detection, particularly on function-level benchmarks like Devign and BigVul. However, real-world detection requires interprocedural analysis, as vulnerabilities often emerge through multi-hop function calls rather than isolated functions. While repository-level benchmarks like ReposVul and VulEval introduce interprocedural context, they remain computationally expensive, lack pairwise evaluation of vulnerability fixes, and explore limited context retrieval, limiting their practicality. We introduce JitVul, a JIT vulnerability detection benchmark linking each function to its vulnerability-introducing and fixing commits. Built from 879 CVEs spanning 91 vulnerability types, JitVul enables comprehensive evaluation of detection capabilities. Our results show that ReAct Agents, leveraging thought-action-observation and interprocedural context, perform better than LLMs in distinguishing vulnerable from benign code. While prompting strategies like Chain-of-Thought help LLMs, ReAct Agents require further refinement. Both methods show inconsistencies, either misidentifying vulnerabilities or over-analyzing security guards, indicating significant room for improvement.

Paperid: 975, https://arxiv.org/pdf/2503.00416.pdf

Abstract:
Large Language Models (LLMs) have significantly advanced text understanding and generation, becoming integral to applications across education, software development, healthcare, entertainment, and legal services. Despite considerable progress in improving model reliability, latency remains under-explored, particularly through recurrent generation, where models repeatedly produce similar or identical outputs, causing increased latency and potential Denial-of-Service (DoS) vulnerabilities. We propose RecurrentGenerator, a black-box evolutionary algorithm that efficiently identifies recurrent generation scenarios in prominent LLMs like LLama-3 and GPT-4o. Additionally, we introduce RecurrentDetector, a lightweight real-time classifier trained on activation patterns, achieving 95.24% accuracy and an F1 score of 0.87 in detecting recurrent loops. Our methods provide practical solutions to mitigate latency-related vulnerabilities, and we publicly share our tools and data to support further research.

Paperid: 976, https://arxiv.org/pdf/2503.00271.pdf

Abstract:
Software signing is the most robust method for ensuring the integrity and authenticity of components in a software supply chain. Traditional signing tools burdened practitioners with key management and signer identification, creating both usability challenges and security risks. A new class of next-generation signing tools has automated many of these concerns, but little is known about their usability and its effect on adoption and effectiveness in practice. A usability evaluation can clarify the extent to which next-generation designs succeed and highlight priorities for improvement. To fill this gap, we conducted a usability study of Sigstore, a pioneering and widely adopted exemplar of next-generation signing. Through interviews with 17 industry experts, we examined (1) the problems and advantages associated with practitioners' tooling choices, (2) how and why their signing-tool usage has evolved over time, and (3) the contexts that cause usability concerns. Our findings illuminate the usability factors of next-generation signing tools and yield recommendations for toolmakers, adopting organizations, and the research community. Notably, components of next-generation tooling exhibit different levels of maturity and readiness for adoption, and integration flexibility is a common pain point, but potentially mitigable through plugins and APIs. Our results will help next-generation signing toolmakers further strengthen software supply chain security.

Paperid: 977, https://arxiv.org/pdf/2502.20565.pdf

Abstract:
Vertical Federated Learning (VFL) enables collaborative training with feature-partitioned data, yet remains vulnerable to privacy leakage through gradient transmissions. Standard differential privacy (DP) techniques such as DP-SGD are difficult to apply in this setting due to VFL's distributed nature and the high variance incurred by vector-valued noise. On the other hand, zeroth-order (ZO) optimization techniques can avoid explicit gradient exposure but lack formal privacy guarantees. In this work, we propose DPZV, the first ZO optimization framework for VFL that achieves tunable DP with performance guarantees. DPZV overcomes these limitations by injecting low-variance scalar noise at the server, enabling controllable privacy with reduced memory overhead. We conduct a comprehensive theoretical analysis showing that DPZV matches the convergence rate of first-order optimization methods while satisfying formal ($Îµ, Î´$)-DP guarantees. Experiments on image and language benchmarks demonstrate that DPZV outperforms several baselines in terms of accuracy under a wide range of privacy constraints ($Îµ\le 10$), thereby elevating the privacy-utility tradeoff in VFL.

Paperid: 978, https://arxiv.org/pdf/2502.13055.pdf

Abstract:
The rapid growth of mobile applications has escalated Android malware threats. Although there are numerous detection methods, they often struggle with evolving attacks, dataset biases, and limited explainability. Large Language Models (LLMs) offer a promising alternative with their zero-shot inference and reasoning capabilities. However, applying LLMs to Android malware detection presents two key challenges: (1)the extensive support code in Android applications, often spanning thousands of classes, exceeds LLMs' context limits and obscures malicious behavior within benign functionality; (2)the structural complexity and interdependencies of Android applications surpass LLMs' sequence-based reasoning, fragmenting code analysis and hindering malicious intent inference. To address these challenges, we propose LAMD, a practical context-driven framework to enable LLM-based Android malware detection. LAMD integrates key context extraction to isolate security-critical code regions and construct program structures, then applies tier-wise code reasoning to analyze application behavior progressively, from low-level instructions to high-level semantics, providing final prediction and explanation. A well-designed factual consistency verification mechanism is equipped to mitigate LLM hallucinations from the first tier. Evaluation in real-world settings demonstrates LAMD's effectiveness over conventional detectors, establishing a feasible basis for LLM-driven malware analysis in dynamic threat landscapes.

Paperid: 979, https://arxiv.org/pdf/2502.12491.pdf

Abstract:
Secure key leasing (SKL) is an advanced encryption functionality that allows a secret key holder to generate a quantum decryption key and securely lease it to a user. Once the user returns the quantum decryption key (or provides a classical certificate confirming its deletion), they lose their decryption capability. Previous works on public key encryption with SKL (PKE-SKL) have only considered the single-key security model, where the adversary receives at most one quantum decryption key. However, this model does not accurately reflect real-world applications of PKE-SKL. To address this limitation, we introduce collusion-resistant security for PKE-SKL (denoted as PKE-CR-SKL). In this model, the adversary can adaptively obtain multiple quantum decryption keys and access a verification oracle which validates the correctness of queried quantum decryption keys. Importantly, the size of the public key and ciphertexts must remain independent of the total number of generated quantum decryption keys. We present the following constructions: - A PKE-CR-SKL scheme based on the learning with errors (LWE) assumption. - An attribute-based encryption scheme with collusion-resistant SKL (ABE-CR-SKL), also based on the LWE assumption. - An ABE-CR-SKL scheme with classical certificates, relying on multi-input ABE with polynomial arity.

Paperid: 980, https://arxiv.org/pdf/2502.11173.pdf

Abstract:
Quantum computing promises to revolutionize our understanding of the limits of computation, and its implications in cryptography have long been evident. Today, cryptographers are actively devising post-quantum solutions to counter the threats posed by quantum-enabled adversaries. Meanwhile, quantum scientists are innovating quantum protocols to empower defenders. However, the broader impact of quantum computing and quantum machine learning (QML) on other cybersecurity domains still needs to be explored. In this work, we investigate the potential impact of QML on cybersecurity applications of traditional ML. First, we explore the potential advantages of quantum computing in machine learning problems specifically related to cybersecurity. Then, we describe a methodology to quantify the future impact of fault-tolerant QML algorithms on real-world problems. As a case study, we apply our approach to standard methods and datasets in network intrusion detection, one of the most studied applications of machine learning in cybersecurity. Our results provide insight into the conditions for obtaining a quantum advantage and the need for future quantum hardware and software advancements.

Paperid: 981, https://arxiv.org/pdf/2502.09385.pdf

Abstract:
Advanced Persistent Threats (APTs) pose a major cybersecurity challenge due to their stealth and ability to mimic normal system behavior, making detection particularly difficult in highly imbalanced datasets. Traditional anomaly detection methods struggle to effectively differentiate APT-related activities from benign processes, limiting their applicability in real-world scenarios. This paper introduces APT-LLM, a novel embedding-based anomaly detection framework that integrates large language models (LLMs) -- BERT, ALBERT, DistilBERT, and RoBERTa -- with autoencoder architectures to detect APTs. Unlike prior approaches, which rely on manually engineered features or conventional anomaly detection models, APT-LLM leverages LLMs to encode process-action provenance traces into semantically rich embeddings, capturing nuanced behavioral patterns. These embeddings are analyzed using three autoencoder architectures -- Baseline Autoencoder (AE), Variational Autoencoder (VAE), and Denoising Autoencoder (DAE) -- to model normal process behavior and identify anomalies. The best-performing model is selected for comparison against traditional methods. The framework is evaluated on real-world, highly imbalanced provenance trace datasets from the DARPA Transparent Computing program, where APT-like attacks constitute as little as 0.004\% of the data across multiple operating systems (Android, Linux, BSD, and Windows) and attack scenarios. Results demonstrate that APT-LLM significantly improves detection performance under extreme imbalance conditions, outperforming existing anomaly detection methods and highlighting the effectiveness of LLM-based feature extraction in cybersecurity.

Paperid: 982, https://arxiv.org/pdf/2502.08123.pdf

Abstract:
Federated reinforcement learning (FRL) allows agents to jointly learn a global decision-making policy under the guidance of a central server. While FRL has advantages, its decentralized design makes it prone to poisoning attacks. To mitigate this, Byzantine-robust aggregation techniques tailored for FRL have been introduced. Yet, in our work, we reveal that these current Byzantine-robust techniques are not immune to our newly introduced Normalized attack. Distinct from previous attacks that targeted enlarging the distance of policy updates before and after an attack, our Normalized attack emphasizes on maximizing the angle of deviation between these updates. To counter these threats, we develop an ensemble FRL approach that is provably secure against both known and our newly proposed attacks. Our ensemble method involves training multiple global policies, where each is learnt by a group of agents using any foundational aggregation rule. These well-trained global policies then individually predict the action for a specific test state. The ultimate action is chosen based on a majority vote for discrete action systems or the geometric median for continuous ones. Our experimental results across different settings show that the Normalized attack can greatly disrupt non-ensemble Byzantine-robust methods, and our ensemble approach offers substantial resistance against poisoning attacks.

Paperid: 983, https://arxiv.org/pdf/2502.05098.pdf

Abstract:
Learning-based Android malware detectors degrade over time due to natural distribution drift caused by malware variants and new families. This paper systematically investigates the challenges classifiers trained with empirical risk minimization (ERM) face against such distribution shifts and attributes their shortcomings to their inability to learn stable discriminative features. Invariant learning theory offers a promising solution by encouraging models to generate stable representations crossing environments that expose the instability of the training set. However, the lack of prior environment labels, the diversity of drift factors, and low-quality representations caused by diverse families make this task challenging. To address these issues, we propose TIF, the first temporal invariant training framework for malware detection, which aims to enhance the ability of detectors to learn stable representations across time. TIF organizes environments based on application observation dates to reveal temporal drift, integrating specialized multi-proxy contrastive learning and invariant gradient alignment to generate and align environments with high-quality, stable representations. TIF can be seamlessly integrated into any learning-based detector. Experiments on a decade-long dataset show that TIF excels, particularly in early deployment stages, addressing real-world needs and outperforming state-of-the-art methods.

Paperid: 984, https://arxiv.org/pdf/2502.01848.pdf

Abstract:
Intelligent transportation systems (ITS) are characterized by wired or wireless communication among different entities, such as vehicles, roadside infrastructure, and traffic management infrastructure. These communications demand different levels of security, depending on how sensitive the data is. The national ITS reference architecture (ARC-IT) defines three security levels, i.e., high, moderate, and low-security levels, based on the different security requirements of ITS applications. In this study, we present a generalized approach to secure ITS communications using a standardized key encapsulation mechanism, known as Kyber, designed for post-quantum cryptography (PQC). We modified the encryption and decryption systems for ITS communications while mapping the security levels of ITS applications to the three versions of Kyber, i.e., Kyber-512, Kyber-768, and Kyber-1024. Then, we conducted a case study using a benchmark fault-enabled chosen-ciphertext attack to evaluate the security provided by the different Kyber versions. The encryption and decryption times observed for different Kyber security levels and the total number of iterations required to recover the secret key using the chosen-ciphertext attack are presented. Our analyses show that higher security levels increase the time required for a successful attack, with Kyber-512 being breached in 183 seconds, Kyber-768 in 337 seconds, and Kyber-1024 in 615 seconds. In addition, attack time instabilities are observed for Kyber-512, 768, and 1024 under 5,000, 6,000, and 8,000 inequalities, respectively. The relationships among the different Kyber versions, and the respective attack requirements and performances underscore the ITS communication security Kyber could provide in the PQC era.

Paperid: 985, https://arxiv.org/pdf/2501.18176.pdf

Abstract:
Zero-knowledge proofs (ZKPs) are widely applied in digital economies, such as cryptocurrencies and smart contracts, for establishing trust and ensuring privacy between untrusted parties. However, almost all ZKPs rely on unproven computational assumptions or are vulnerable to quantum adversaries. We propose and experimentally implement an unconditionally secure ZKP for the graph three-coloring problem by combining subset relativistic bit commitments with quantum nonlocality game. Our protocol achieves a linear relationship between interactive rounds and the number of edges, reducing round complexity and storage requirements by thirteen orders of magnitude, thereby significantly enhancing practical feasibility. Our work illustrates the powerful potential of integrating special relativity with quantum theory in trustless cryptography, paving the way for robust applications against quantum attacks in distrustful internet environments.

Paperid: 986, https://arxiv.org/pdf/2501.17124.pdf

Abstract:
We consider the problem of finding the asymptotic capacity of symmetric private information retrieval (SPIR) with $B$ Byzantine servers. Prior to finding the capacity, a definition for the Byzantine servers is needed since in the literature there are two different definitions. In \cite{byzantine_tpir}, where it was first defined, the Byzantine servers can send any symbol from the storage, their received queries and some independent random symbols. In \cite{unresponsive_byzantine_1}, Byzantine servers send any random symbol independently of their storage and queries. It is clear that these definitions are not identical, especially when \emph{symmetric} privacy is required. To that end, we define Byzantine servers, inspired by \cite{byzantine_tpir}, as the servers that can share everything, before and after the scheme initiation. In this setting, we find an upper bound, for an infinite number of messages case, that should be satisfied for all schemes that protect against this setting and develop a scheme that achieves this upper bound. Hence, we identify the capacity of the problem.

Paperid: 987, https://arxiv.org/pdf/2501.15391.pdf

Abstract:
Radio Frequency Fingerprinting Identification (RFFI) is a lightweight physical layer identity authentication technique. It identifies the radio-frequency device by analyzing the signal feature differences caused by the inevitable minor hardware impairments. However, existing RFFI methods based on closed-set recognition struggle to detect unknown unauthorized devices in open environments. Moreover, the feature interference among legitimate devices can further compromise identification accuracy. In this paper, we propose a joint radio frequency fingerprint prediction and siamese comparison (JRFFP-SC) framework for open set recognition. Specifically, we first employ a radio frequency fingerprint prediction network to predict the most probable category result. Then a detailed comparison among the test sample's features with registered samples is performed in a siamese network. The proposed JRFFP-SC framework eliminates inter-class interference and effectively addresses the challenges associated with open set identification. The simulation results show that our proposed JRFFP-SC framework can achieve excellent rogue device detection and generalization capability for classifying devices.

Paperid: 988, https://arxiv.org/pdf/2501.14644.pdf

Abstract:
Decentralized learning enables distributed agents to collaboratively train a shared machine learning model without a central server, through local computation and peer-to-peer communication. Although each agent retains its dataset locally, sharing local models can still expose private information about the local training datasets to adversaries. To mitigate privacy attacks, a common strategy is to inject random artificial noise at each agent before exchanging local models between neighbors. However, this often leads to utility degradation due to the negative effects of cumulated artificial noise on the learning algorithm. In this work, we introduce CorN-DSGD, a novel covariance-based framework for generating correlated privacy noise across agents, which unifies several state-of-the-art methods as special cases. By leveraging network topology and mixing weights, CorN-DSGD optimizes the noise covariance to achieve network-wide noise cancellation. Experimental results show that CorN-DSGD cancels more noise than existing pairwise correlation schemes, improving model performance under formal privacy guarantees.

Paperid: 989, https://arxiv.org/pdf/2501.14008.pdf

Abstract:
Web application firewall (WAF) examines malicious traffic to and from a web application via a set of security rules. It plays a significant role in securing Web applications against web attacks. However, as web attacks grow in sophistication, it is becoming increasingly difficult for WAFs to block the mutated malicious payloads designed to bypass their defenses. In response to this critical security issue, we have developed a novel learning-based framework called WAFBOOSTER, designed to unveil potential bypasses in WAF detections and suggest rules to fortify their security. Using a combination of shadow models and payload generation techniques, we can identify malicious payloads and remove or modify them as needed. WAFBOOSTER generates signatures for these malicious payloads using advanced clustering and regular expression matching techniques to repair any security gaps we uncover. In our comprehensive evaluation of eight real-world WAFs, WAFBOOSTER improved the true rejection rate of mutated malicious payloads from 21% to 96%, with no false rejections. WAFBOOSTER achieves a false acceptance rate 3X lower than state-of-the-art methods for generating malicious payloads. With WAFBOOSTER, we have taken a step forward in securing web applications against the ever-evolving threats.

Paperid: 990, https://arxiv.org/pdf/2501.12709.pdf

Abstract:
Federated learning is essential for decentralized, privacy-preserving model training in the data-driven era. Quantum-enhanced federated learning leverages quantum resources to address privacy and scalability challenges, offering security and efficiency advantages beyond classical methods. However, practical and scalable frameworks addressing privacy concerns in the quantum computing era remain undeveloped. Here, we propose a practical quantum federated learning framework on quantum networks, utilizing distributed quantum secret keys to protect local model updates and enable secure aggregation with information-theoretic security. We experimentally validate our framework on a 4-client quantum network with a scalable structure. Extensive numerical experiments on both quantum and classical datasets show that adding a quantum client significantly enhances the trained global model's ability to classify multipartite entangled and non-stabilizer quantum datasets. Simulations further demonstrate scalability to 200 clients with classical models trained on the MNIST dataset, reducing communication costs by $75\%$ through advanced model compression techniques and achieving rapid training convergence. Our work provides critical insights for building scalable, efficient, and quantum-secure machine learning systems for the coming quantum internet era.

Paperid: 991, https://arxiv.org/pdf/2501.11577.pdf

Abstract:
Transfer learning, successful in knowledge translation across related tasks, faces a substantial privacy threat from membership inference attacks (MIAs). These attacks, despite posing significant risk to ML model's training data, remain limited-explored in transfer learning. The interaction between teacher and student models in transfer learning has not been thoroughly explored in MIAs, potentially resulting in an under-examined aspect of privacy vulnerabilities within transfer learning. In this paper, we propose a new MIA vector against transfer learning, to determine whether a specific data point was used to train the teacher model while only accessing the student model in a white-box setting. Our method delves into the intricate relationship between teacher and student models, analyzing the discrepancies in hidden layer representations between the student model and its shadow counterpart. These identified differences are then adeptly utilized to refine the shadow model's training process and to inform membership inference decisions effectively. Our method, evaluated across four datasets in diverse transfer learning tasks, reveals that even when an attacker only has access to the student model, the teacher model's training data remains susceptible to MIAs. We believe our work unveils the unexplored risk of membership inference in transfer learning.

Paperid: 992, https://arxiv.org/pdf/2501.06531.pdf

Abstract:
Recent advances have improved the throughput and latency of blockchains by processing transactions accessing different parts of the state concurrently. However, these systems are unable to concurrently process (a) transactions accessing the same state, even if they are (almost) commutative, e.g., payments much smaller than an account's balance, and (b) multi-party transactions, e.g., asset swaps. Moreover, they are slow to recover from contention, requiring once-in-a-day synchronization. We present Stingray, a novel blockchain architecture that addresses these limitations. The key conceptual contributions are a replicated bounded counter that processes (almost) commutative transactions concurrently, and a FastUnlock protocol that uses a fallback consensus protocol for fast contention recovery. We prove Stingray's security in an asynchronous network with Byzantine faults and demonstrate on a global testbed that Stingray achieves 10,000 times the throughput of prior systems for commutative workloads.

Paperid: 993, https://arxiv.org/pdf/2501.05356.pdf

Abstract:
The transportation industry is experiencing vast digitalization as a plethora of technologies are being implemented to improve efficiency, functionality, and safety. Although technological advancements bring many benefits to transportation, integrating cyberspace across transportation sectors has introduced new and deliberate cyber threats. In the past, public agencies assumed digital infrastructure was secured since its vulnerabilities were unknown to adversaries. However, with the expansion of cyberspace, this assumption has become invalid. With the rapid advancement of wireless technologies, transportation systems are increasingly interconnected with both transportation and non-transportation networks in an internet-of-things ecosystem, expanding cyberspace in transportation and increasing threats and vulnerabilities. This study investigates some prominent reasons for the increase in cyber vulnerabilities in transportation. In addition, this study presents various collaborative strategies among stakeholders that could help improve cybersecurity in the transportation industry. These strategies address programmatic and policy aspects and suggest avenues for technological research and development. The latter highlights opportunities for future research to enhance the cybersecurity of transportation systems and infrastructure by leveraging hybrid approaches and emerging technologies.

Paperid: 994, https://arxiv.org/pdf/2501.03301.pdf

Abstract:
To preserve user privacy in recommender systems, federated recommendation (FR) based on federated learning (FL) emerges, keeping the personal data on the local client and updating a model collaboratively. Unlike FL, FR has a unique sparse aggregation mechanism, where the embedding of each item is updated by only partial clients, instead of full clients in a dense aggregation of general FL. Recently, as an essential principle of FL, model security has received increasing attention, especially for Byzantine attacks, where malicious clients can send arbitrary updates. The problem of exploring the Byzantine robustness of FR is particularly critical since in the domains applying FR, e.g., e-commerce, malicious clients can be injected easily by registering new accounts. However, existing Byzantine works neglect the unique sparse aggregation of FR, making them unsuitable for our problem. Thus, we make the first effort to investigate Byzantine attacks on FR from the perspective of sparse aggregation, which is non-trivial: it is not clear how to define Byzantine robustness under sparse aggregations and design Byzantine attacks under limited knowledge/capability. In this paper, we reformulate the Byzantine robustness under sparse aggregation by defining the aggregation for a single item as the smallest execution unit. Then we propose a family of effective attack strategies, named Spattack, which exploit the vulnerability in sparse aggregation and are categorized along the adversary's knowledge and capability. Extensive experimental results demonstrate that Spattack can effectively prevent convergence and even break down defenses under a few malicious clients, raising alarms for securing FR systems.

Paperid: 995, https://arxiv.org/pdf/2501.01032.pdf

Abstract:
Biometrics authentication has become increasingly popular due to its security and convenience; however, traditional biometrics are becoming less desirable in scenarios such as new mobile devices, Virtual Reality, and Smart Vehicles. For example, while face authentication is widely used, it suffers from significant privacy concerns. The collection of complete facial data makes it less desirable for privacy-sensitive applications. Lip authentication, on the other hand, has emerged as a promising biometrics method. However, existing lip-based authentication methods heavily depend on static lip shape when the mouth is closed, which can be less robust due to lip shape dynamic motion and can barely work when the user is speaking. In this paper, we revisit the nature of lip biometrics and extract shape-independent features from the lips. We study the dynamic characteristics of lip biometrics based on articulator motion. Building on the knowledge, we propose a system for shape-independent continuous authentication via lip articulator dynamics. This system enables robust, shape-independent and continuous authentication, making it particularly suitable for scenarios with high security and privacy requirements. We conducted comprehensive experiments in different environments and attack scenarios and collected a dataset of 50 subjects. The results indicate that our system achieves an overall accuracy of 99.06% and demonstrates robustness under advanced mimic attacks and AI deepfake attacks, making it a viable solution for continuous biometric authentication in various applications.

Paperid: 996, https://arxiv.org/pdf/2506.22521.pdf

Abstract:
Model extraction attacks pose significant security threats to deployed language models, potentially compromising intellectual property and user privacy. This survey provides a comprehensive taxonomy of LLM-specific extraction attacks and defenses, categorizing attacks into functionality extraction, training data extraction, and prompt-targeted attacks. We analyze various attack methodologies including API-based knowledge distillation, direct querying, parameter recovery, and prompt stealing techniques that exploit transformer architectures. We then examine defense mechanisms organized into model protection, data privacy protection, and prompt-targeted strategies, evaluating their effectiveness across different deployment scenarios. We propose specialized metrics for evaluating both attack effectiveness and defense performance, addressing the specific challenges of generative language models. Through our analysis, we identify critical limitations in current approaches and propose promising research directions, including integrated attack methodologies and adaptive defense mechanisms that balance security with model utility. This work serves NLP researchers, ML engineers, and security professionals seeking to protect language models in production environments.

Paperid: 997, https://arxiv.org/pdf/2506.20187.pdf

Abstract:
Advanced Large Language Models (LLMs) have achieved impressive performance across a wide range of complex and long-context natural language tasks. However, performing long-context LLM inference locally on a commodity GPU (a PC) with privacy concerns remains challenging due to the increasing memory demands of the key-value (KV) cache. Existing systems typically identify important tokens and selectively offload their KV data to GPU and CPU memory. The KV data needs to be offloaded to disk due to the limited memory on a commodity GPU, but the process is bottlenecked by token importance evaluation overhead and the disk's low bandwidth. In this paper, we present LeoAM, the first efficient importance-aware long-context LLM inference system for a single commodity GPU with adaptive hierarchical GPU-CPU-Disk KV management. Our system employs an adaptive KV management strategy that partitions KV data into variable-sized chunks based on the skewed distribution of attention weights across different layers to reduce computational and additional transmission overheads. Moreover, we propose a lightweight KV abstract method, which minimizes transmission latency by storing and extracting the KV abstract of each chunk on disk instead of the full KV data. LeoAM also leverages the dynamic compression and pipeline techniques to further accelerate inference. Experimental results demonstrate that LongInfer achieves an average inference latency speedup of 3.46x, while maintaining comparable LLM response quality. In scenarios with larger batch sizes, it achieves up to a 5.47x speedup.

Paperid: 998, https://arxiv.org/pdf/2506.18245.pdf

Abstract:
Smart contract vulnerability detection remains a major challenge in blockchain security. Existing vulnerability detection methods face two main issues: (1) Existing datasets lack comprehensive coverage and high-quality explanations for preference learning. (2) Large language models (LLMs) often struggle with accurately interpreting specific concepts in smart contract security. Empirical analysis shows that even after continual pre-training (CPT) and supervised fine-tuning (SFT), LLMs may misinterpret the execution order of state changes, resulting in incorrect explanations despite making correct detection decisions. To address these challenges, we propose Smart-LLaMA-DPO based on LLaMA-3.1-8B. We construct a comprehensive dataset covering four major vulnerability types and machine-unauditable vulnerabilities, including precise labels, explanations, and locations for SFT, as well as high-quality and low-quality output pairs for Direct Preference Optimization (DPO). Second, we perform CPT using large-scale smart contract to enhance the LLM's understanding of specific security practices in smart contracts. Futhermore, we conduct SFT with our comprehensive dataset. Finally, we apply DPO, leveraging human feedback and a specially designed loss function that increases the probability of preferred explanations while reducing the likelihood of non-preferred outputs. We evaluate Smart-LLaMA-DPO on four major vulnerability types: reentrancy, timestamp dependence, integer overflow/underflow, and delegatecall, as well as machine-unauditable vulnerabilities. Our method significantly outperforms state-of-the-art baselines, with average improvements of 10.43% in F1 score and 7.87% in accuracy. Moreover, both LLM evaluation and human evaluation confirm that our method generates more correct, thorough, and clear explanations.

Paperid: 999, https://arxiv.org/pdf/2506.17894.pdf

Abstract:
Chip manufacturing is a complex process, and to achieve a faster time to market, an increasing number of untrusted third-party tools and designs from around the world are being utilized. The use of these untrusted third party intellectual properties (IPs) and tools increases the risk of adversaries inserting hardware trojans (HTs). The covert nature of HTs poses significant threats to cyberspace, potentially leading to severe consequences for national security, the economy, and personal privacy. Many graph neural network (GNN)-based HT detection methods have been proposed. However, they perform poorly on larger designs because they rely on training with smaller designs. Additionally, these methods do not explore different GNN models that are well-suited for HT detection or provide efficient training and inference processes. We propose a novel framework that generates graph embeddings for large designs (e.g., RISC-V) and incorporates various GNN models tailored for HT detection. Furthermore, our framework introduces domain-specific techniques for efficient training and inference by implementing model quantization. Model quantization reduces the precision of the weights, lowering the computational requirements, enhancing processing speed without significantly affecting detection accuracy. We evaluate our framework using a custom dataset, and our results demonstrate a precision of 98.66% and a recall (true positive rate) of 92.30%, highlighting the effectiveness and efficiency of our approach in detecting hardware trojans in large-scale chip designs

Paperid: 1000, https://arxiv.org/pdf/2506.16460.pdf

Abstract:
Multitask learning (MTL) has emerged as a powerful paradigm that leverages similarities among multiple learning tasks, each with insufficient samples to train a standalone model, to solve them simultaneously while minimizing data sharing across users and organizations. MTL typically accomplishes this goal by learning a shared representation that captures common structure among the tasks by embedding data from all tasks into a common feature space. Despite being designed to be the smallest unit of shared information necessary to effectively learn patterns across multiple tasks, these shared representations can inadvertently leak sensitive information about the particular tasks they were trained on. In this work, we investigate what information is revealed by the shared representations through the lens of inference attacks. Towards this, we propose a novel, black-box task-inference threat model where the adversary, given the embedding vectors produced by querying the shared representation on samples from a particular task, aims to determine whether that task was present when training the shared representation. We develop efficient, purely black-box attacks on machine learning models that exploit the dependencies between embeddings from the same task without requiring shadow models or labeled reference data. We evaluate our attacks across vision and language domains for multiple use cases of MTL and demonstrate that even with access only to fresh task samples rather than training data, a black-box adversary can successfully infer a task's inclusion in training. To complement our experiments, we provide theoretical analysis of a simplified learning setting and show a strict separation between adversaries with training samples and fresh samples from the target task's distribution.

Paperid: 1001, https://arxiv.org/pdf/2506.15790.pdf

Abstract:
With the advance application of blockchain technology in various fields, ensuring the security and stability of smart contracts has emerged as a critical challenge. Current security analysis methodologies in vulnerability detection can be categorized into static analysis and dynamic analysis methods.However, these existing traditional vulnerability detection methods predominantly rely on analyzing original contract code, not all smart contracts provide accessible code.We present ETrace, a novel event-driven vulnerability detection framework for smart contracts, which uniquely identifies potential vulnerabilities through LLM-powered trace analysis without requiring source code access. By extracting fine-grained event sequences from transaction logs, the framework leverages Large Language Models (LLMs) as adaptive semantic interpreters to reconstruct event analysis through chain-of-thought reasoning. ETrace implements pattern-matching to establish causal links between transaction behavior patterns and known attack behaviors. Furthermore, we validate the effectiveness of ETrace through preliminary experimental results.

Paperid: 1002, https://arxiv.org/pdf/2506.08693.pdf

Abstract:
Large Language Models (LLMs) have rapidly evolved over the past few years and are currently evaluated for their efficacy within the domain of offensive cyber-security. While initial forays showcase the potential of LLMs to enhance security research, they also raise critical ethical concerns regarding the dual-use of offensive security tooling. This paper analyzes a set of papers that leverage LLMs for offensive security, focusing on how ethical considerations are expressed and justified in their work. The goal is to assess the culture of AI in offensive security research regarding ethics communication, highlighting trends, best practices, and gaps in current discourse. We provide insights into how the academic community navigates the fine line between innovation and ethical responsibility. Particularly, our results show that 13 of 15 reviewed prototypes (86.6\%) mentioned ethical considerations and are thus aware of the potential dual-use of their research. Main motivation given for the research was allowing broader access to penetration-testing as well as preparing defenders for AI-guided attackers.

Paperid: 1003, https://arxiv.org/pdf/2506.07372.pdf

Abstract:
Static analysis, a cornerstone technique in cybersecurity, offers a noninvasive method for detecting malware by analyzing dormant software without executing potentially harmful code. However, traditional static analysis often relies on biased or outdated datasets, leading to gaps in detection capabilities against emerging malware threats. To address this, our study focuses on the binary content of files as key features for malware detection. These binary contents are transformed and represented as images, which then serve as inputs to deep learning models. This method takes into account the visual patterns within the binary data, allowing the model to analyze potential malware effectively. This paper introduces the application of the CBiGAN in the domain of malware anomaly detection. Our approach leverages the CBiGAN for its superior latent space mapping capabilities, critical for modeling complex malware patterns by utilizing a reconstruction error-based anomaly detection method. We utilized several datasets including both portable executable (PE) files as well as Object Linking and Embedding (OLE) files. We then evaluated our model against a diverse set of both PE and OLE files, including self-collected malicious executables from 214 malware families. Our findings demonstrate the robustness of this innovative approach, with the CBiGAN achieving high Area Under the Curve (AUC) results with good generalizability, thereby confirming its capability to distinguish between benign and diverse malicious files with reasonably high accuracy.

Paperid: 1004, https://arxiv.org/pdf/2506.06488.pdf

Abstract:
Shadow model attacks are the state-of-the-art approach for membership inference attacks on machine learning models. However, these attacks typically assume an adversary has access to a background (nonmember) data distribution that matches the distribution the target model was trained on. We initiate a study of membership inference attacks where the adversary or auditor cannot access an entire subclass from the distribution -- a more extreme but realistic version of distribution shift than has been studied previously. In this setting, we first show that the performance of shadow model attacks degrades catastrophically, and then demonstrate the promise of another approach, quantile regression, that does not have the same limitations. We show that quantile regression attacks consistently outperform shadow model attacks in the class dropout setting -- for example, quantile regression attacks achieve up to 11$\times$ the TPR of shadow models on the unseen class on CIFAR-100, and achieve nontrivial TPR on ImageNet even with 90% of training classes removed. We also provide a theoretical model that illustrates the potential and limitations of this approach.

Paperid: 1005, https://arxiv.org/pdf/2506.02324.pdf

Abstract:
Blockchain technology relies on decentralization to resist faults and attacks while operating without trusted intermediaries. Although industry experts have touted decentralization as central to their promise and disruptive potential, it is still unclear whether the crypto ecosystems built around blockchains are becoming more or less decentralized over time. As crypto plays an increasing role in facilitating economic transactions and peer-to-peer interactions, measuring their decentralization becomes even more essential. We thus propose a systematic framework for measuring the decentralization of crypto ecosystems over time and compare commonly used decentralization metrics. We applied this framework to seven prominent crypto ecosystems, across five distinct subsystems and across their lifetime for over 15 years. Our analysis revealed that while crypto has largely become more decentralized over time, recent trends show a shift toward centralization in the consensus layer, NFT marketplaces, and developers. Our framework and results inform researchers, policymakers, and practitioners about the design, regulation, and implementation of crypto ecosystems and provide a systematic, replicable foundation for future studies.

Paperid: 1006, https://arxiv.org/pdf/2506.01989.pdf

Abstract:
In this paper, we investigate the problem of distributed learning (DL) in the presence of Byzantine attacks. For this problem, various robust bounded aggregation (RBA) rules have been proposed at the central server to mitigate the impact of Byzantine attacks. However, current DL methods apply RBA rules for the local gradients from the honest devices and the disruptive information from Byzantine devices, and the learning performance degrades significantly when the local gradients of different devices vary considerably from each other. To overcome this limitation, we propose a new DL method to cope with Byzantine attacks based on coded robust aggregation (CRA-DL). Before training begins, the training data are allocated to the devices redundantly. During training, in each iteration, the honest devices transmit coded gradients to the server computed from the allocated training data, and the server then aggregates the information received from both honest and Byzantine devices using RBA rules. In this way, the global gradient can be approximately recovered at the server to update the global model. Compared with current DL methods applying RBA rules, the improvement of CRA-DL is attributed to the fact that the coded gradients sent by the honest devices are closer to each other. This closeness enhances the robustness of the aggregation against Byzantine attacks, since Byzantine messages tend to be significantly different from those of honest devices in this case. We theoretically analyze the convergence performance of CRA-DL. Finally, we present numerical results to verify the superiority of the proposed method over existing baselines, showing its enhanced learning performance under Byzantine attacks.

Paperid: 1007, https://arxiv.org/pdf/2506.01854.pdf

Abstract:
A pseudorandom code is a keyed error-correction scheme with the property that any polynomial number of encodings appear random to any computationally bounded adversary. We show that the pseudorandomness of any code tolerating a constant rate of random errors cannot be based on black-box reductions to almost any generic cryptographic primitive: for instance, anything that can be built from random oracles, generic multilinear groups, and virtual black-box obfuscation. Our result is optimal, as Ghentiyala and Guruswami (2024) observed that pseudorandom codes tolerating any sub-constant rate of random errors exist using a black-box reduction from one-way functions. The key technical ingredient in our proof is the hypercontractivity theorem for Boolean functions, which we use to prove our impossibility in the random oracle model. It turns out that this easily extends to an impossibility in the presence of ``crypto oracles,'' a notion recently introduced -- and shown to be capable of implementing all the primitives mentioned above -- by Lin, Mook, and Wichs (EUROCRYPT 2025).

Paperid: 1008, https://arxiv.org/pdf/2505.23791.pdf

Abstract:
Federated Learning (FL) is a collaborative learning framework designed to protect client data, yet it remains highly vulnerable to Intellectual Property (IP) threats. Model extraction (ME) attacks pose a significant risk to Machine Learning as a Service (MLaaS) platforms, enabling attackers to replicate confidential models by querying black-box (without internal insight) APIs. Despite FL's privacy-preserving goals, its distributed nature makes it particularly susceptible to such attacks. This paper examines the vulnerability of FL-based victim models to two types of model extraction attacks. For various federated clients built under the NVFlare platform, we implemented ME attacks across two deep learning architectures and three image datasets. We evaluate the proposed ME attack performance using various metrics, including accuracy, fidelity, and KL divergence. The experiments show that for different FL clients, the accuracy and fidelity of the extracted model are closely related to the size of the attack query set. Additionally, we explore a transfer learning based approach where pretrained models serve as the starting point for the extraction process. The results indicate that the accuracy and fidelity of the fine-tuned pretrained extraction models are notably higher, particularly with smaller query sets, highlighting potential advantages for attackers.

Paperid: 1009, https://arxiv.org/pdf/2505.19364.pdf

Abstract:
Machine Learning as a Service (MLaaS) enables users to leverage powerful machine learning models through cloud-based APIs, offering scalability and ease of deployment. However, these services are vulnerable to model extraction attacks, where adversaries repeatedly query the application programming interface (API) to reconstruct a functionally similar model, compromising intellectual property and security. Despite various defense strategies being proposed, many suffer from high computational costs, limited adaptability to evolving attack techniques, and a reduction in performance for legitimate users. In this paper, we introduce a Resilient Adaptive Defense Framework for Model Extraction Attack Protection (RADEP), a multifaceted defense framework designed to counteract model extraction attacks through a multi-layered security approach. RADEP employs progressive adversarial training to enhance model resilience against extraction attempts. Malicious query detection is achieved through a combination of uncertainty quantification and behavioral pattern analysis, effectively identifying adversarial queries. Furthermore, we develop an adaptive response mechanism that dynamically modifies query outputs based on their suspicion scores, reducing the utility of stolen models. Finally, ownership verification is enforced through embedded watermarking and backdoor triggers, enabling reliable identification of unauthorized model use. Experimental evaluations demonstrate that RADEP significantly reduces extraction success rates while maintaining high detection accuracy with minimal impact on legitimate queries. Extensive experiments show that RADEP effectively defends against model extraction attacks and remains resilient even against adaptive adversaries, making it a reliable security framework for MLaaS models.

Paperid: 1010, https://arxiv.org/pdf/2505.18734.pdf

Abstract:
We present MADCAT, a self-supervised approach designed to address the concept drift problem in malware detection. MADCAT employs an encoder-decoder architecture and works by test-time training of the encoder on a small, balanced subset of the test-time data using a self-supervised objective. During test-time training, the model learns features that are useful for detecting both previously seen (old) data and newly arriving samples. We demonstrate the effectiveness of MADCAT in continuous Android malware detection settings. MADCAT consistently outperforms baseline methods in detection performance at test time. We also show the synergy between MADCAT and prior approaches in addressing concept drift in malware detection

Paperid: 1011, https://arxiv.org/pdf/2505.13641.pdf

Abstract:
The paper presents an alignment evaluation between the mitigations present in the MITRE's ATT&CK framework and the essential cyber security requirements of the recently introduced Cyber Resilience Act (CRA) in the European Union. In overall, the two align well with each other. With respect to the CRA, there are notable gaps only in terms of data minimization, data erasure, and vulnerability coordination. In terms of the ATT&CK framework, gaps are present only in terms of threat intelligence, training, out-of-band communication channels, and residual risks. The evaluation presented contributes to narrowing of a common disparity between law and technical frameworks.

Paperid: 1012, https://arxiv.org/pdf/2505.12335.pdf

Abstract:
The rapid advancement of generative models, such as GANs and Diffusion models, has enabled the creation of highly realistic synthetic images, raising serious concerns about misinformation, deepfakes, and copyright infringement. Although numerous Artificial Intelligence Generated Image (AIGI) detectors have been proposed, often reporting high accuracy, their effectiveness in real-world scenarios remains questionable. To bridge this gap, we introduce AIGIBench, a comprehensive benchmark designed to rigorously evaluate the robustness and generalization capabilities of state-of-the-art AIGI detectors. AIGIBench simulates real-world challenges through four core tasks: multi-source generalization, robustness to image degradation, sensitivity to data augmentation, and impact of test-time pre-processing. It includes 23 diverse fake image subsets that span both advanced and widely adopted image generation techniques, along with real-world samples collected from social media and AI art platforms. Extensive experiments on 11 advanced detectors demonstrate that, despite their high reported accuracy in controlled settings, these detectors suffer significant performance drops on real-world data, limited benefits from common augmentations, and nuanced effects of pre-processing, highlighting the need for more robust detection strategies. By providing a unified and realistic evaluation framework, AIGIBench offers valuable insights to guide future research toward dependable and generalizable AIGI detection.

Paperid: 1013, https://arxiv.org/pdf/2505.04307.pdf

Abstract:
The paper presents a traceability analysis of how over 84 thousand vulnerabilities have propagated across 28 open source software ecosystems. According to the results, the propagation sequences have been complex in general, although GitHub, Debian, and Ubuntu stand out. Furthermore, the associated propagation delays have been lengthy, and these do not correlate well with the number of ecosystems involved in the associated sequences. Nor does the presence or absence of particularly ecosystems in the sequences yield clear, interpretable patterns. With these results, the paper contributes to the overlapping knowledge bases about software ecosystems, traceability, and vulnerabilities.

Paperid: 1014, https://arxiv.org/pdf/2504.21028.pdf

Abstract:
The rapid evolution of malware variants requires robust classification methods to enhance cybersecurity. While Large Language Models (LLMs) offer potential for generating malware descriptions to aid family classification, their utility is limited by semantic embedding overlaps and misalignment with binary behavioral features. We propose a contrastive fine-tuning (CFT) method that refines LLM embeddings via targeted selection of hard negative samples based on cosine similarity, enabling LLMs to distinguish between closely related malware families. Our approach combines high-similarity negatives to enhance discriminative power and mid-tier negatives to increase embedding diversity, optimizing both precision and generalization. Evaluated on the CIC-AndMal-2020 and BODMAS datasets, our refined embeddings are integrated into a multimodal classifier within a Model-Agnostic Meta-Learning (MAML) framework on a few-shot setting. Experiments demonstrate significant improvements: our method achieves 63.15% classification accuracy with as few as 20 samples on CIC-AndMal-2020, outperforming baselines by 11--21 percentage points and surpassing prior negative sampling strategies. Ablation studies confirm the superiority of similarity-based selection over random sampling, with gains of 10-23%. Additionally, fine-tuned LLMs generate attribute-aware descriptions that generalize to unseen variants, bridging textual and binary feature gaps. This work advances malware classification by enabling nuanced semantic distinctions and provides a scalable framework for adapting LLMs to cybersecurity challenges.

Paperid: 1015, https://arxiv.org/pdf/2504.20436.pdf

Abstract:
The emerging paradigm of Quantum Machine Learning (QML) combines features of quantum computing and machine learning (ML). QML enables the generation and recognition of statistical data patterns that classical computers and classical ML methods struggle to effectively execute. QML utilizes quantum systems to enhance algorithmic computation speed and real-time data processing capabilities, making it one of the most promising tools in the field of ML. Quantum superposition and entanglement features also hold the promise to potentially expand the potential feature representation capabilities of ML. Therefore, in this study, we explore how quantum computing affects ML and whether it can further improve the detection performance on network traffic detection, especially on unseen attacks which are types of malicious traffic that do not exist in the ML training dataset. Classical ML models often perform poorly in detecting these unseen attacks because they have not been trained on such traffic. Hence, this paper focuses on designing and proposing novel hybrid structures of Quantum Convolutional Neural Network (QCNN) to achieve the detection of malicious traffic. The detection performance, generalization, and robustness of the QML solutions are evaluated and compared with classical ML running on classical computers. The emphasis lies in assessing whether the QML-based malicious traffic detection outperforms classical solutions. Based on experiment results, QCNN models demonstrated superior performance compared to classical ML approaches on unseen attack detection.

Paperid: 1016, https://arxiv.org/pdf/2504.18812.pdf

Abstract:
In the evolving landscape of integrated circuit (IC) design, the increasing complexity of modern processors and intellectual property (IP) cores has introduced new challenges in ensuring design correctness and security. The recent advancements in hardware fuzzing techniques have shown their efficacy in detecting hardware bugs and vulnerabilities at the RTL abstraction level of hardware. However, they suffer from several limitations, including an inability to address vulnerabilities introduced during synthesis and gate-level transformations. These methods often fail to detect issues arising from library adversaries, where compromised or malicious library components can introduce backdoors or unintended behaviors into the design. In this paper, we present a novel hardware fuzzer, SynFuzz, designed to overcome the limitations of existing hardware fuzzing frameworks. SynFuzz focuses on fuzzing hardware at the gate-level netlist to identify synthesis bugs and vulnerabilities that arise during the transition from RTL to the gate-level. We analyze the intrinsic hardware behaviors using coverage metrics specifically tailored for the gate-level. Furthermore, SynFuzz implements differential fuzzing to uncover bugs associated with EDA libraries. We evaluated SynFuzz on popular open-source processors and IP designs, successfully identifying 7 new synthesis bugs. Additionally, by exploiting the optimization settings of EDA tools, we performed a compromised library mapping attack (CLiMA), creating a malicious version of hardware designs that remains undetectable by traditional verification methods. We also demonstrate how SynFuzz overcomes the limitations of the industry-standard formal verification tool, Cadence Conformal, providing a more robust and comprehensive approach to hardware verification.

Paperid: 1017, https://arxiv.org/pdf/2504.14965.pdf

Abstract:
Layer 2 (L2) solutions are the cornerstone of blockchain scalability, enabling high-throughput and low-cost interactions by shifting execution off-chain while maintaining security through interactions with the underlying ledger. Despite their common goals, the principal L2 paradigms -- payment channels, rollups, and sidechains -- differ substantially in architecture and assumptions, making it difficult to comparatively analyze their security and trade-offs. To address this, we present the first general security framework for L2 protocols. Our framework is based on the IITM-based Universal Composability (iUC) framework, in which L2 protocols are modeled as stateful machines interacting with higher-level protocol users and the underlying ledger. The methodology defines a generic execution environment that captures ledger events, message passing, and adversarial scheduling, and characterizes security through trace-based predicates parameterized by adversarial capabilities and timing assumptions. By abstracting away from protocol-specific details while preserving critical interface and execution behavior, the framework enables modular, protocol-agnostic reasoning and composable security proofs across a wide range of L2 constructions. To demonstrate its applicability, we analyze an example from each of the three dominant L2 scaling paradigms: a payment channel (Brick), a sidechain (Liquid Network), and a rollup (Arbitrum). By instantiating each within our framework, we derive their security properties and expose trade-offs. These include the time for dispute resolution, distribution of off-chain storage and computation, and varying trust assumptions (e.g., reliance on honest parties or data availability). Our framework unifies the analysis of diverse L2 designs and pinpoints their strengths and limitations, providing a foundation for secure, systematic L2 development.

Paperid: 1018, https://arxiv.org/pdf/2504.10112.pdf

Abstract:
Large Language Models (LLMs) have emerged as a powerful approach for driving offensive penetration-testing tooling. Due to the opaque nature of LLMs, empirical methods are typically used to analyze their efficacy. The quality of this analysis is highly dependent on the chosen testbed, captured metrics and analysis methods employed. This paper analyzes the methodology and benchmarking practices used for evaluating Large Language Model (LLM)-driven attacks, focusing on offensive uses of LLMs in cybersecurity. We review 19 research papers detailing 18 prototypes and their respective testbeds. We detail our findings and provide actionable recommendations for future research, emphasizing the importance of extending existing testbeds, creating baselines, and including comprehensive metrics and qualitative analysis. We also note the distinction between security research and practice, suggesting that CTF-based challenges may not fully represent real-world penetration testing scenarios.

Paperid: 1019, https://arxiv.org/pdf/2504.04311.pdf

Abstract:
In today's digital era, the Internet, especially social media platforms, plays a significant role in shaping public opinions, attitudes, and beliefs. Unfortunately, the credibility of scientific information sources is often undermined by the spread of misinformation through various means, including technology-driven tools like bots, cyborgs, trolls, sock-puppets, and deep fakes. This manipulation of public discourse serves antagonistic business agendas and compromises civil society. In response to this challenge, a new scientific discipline has emerged: social cybersecurity.

Paperid: 1020, https://arxiv.org/pdf/2504.01444.pdf

Abstract:
Multimodal Large Language Models (MLLMs), which integrate vision and other modalities into Large Language Models (LLMs), significantly enhance AI capabilities but also introduce new security vulnerabilities. By exploiting the vulnerabilities of the visual modality and the long-tail distribution characteristic of code training data, we present PiCo, a novel jailbreaking framework designed to progressively bypass multi-tiered defense mechanisms in advanced MLLMs. PiCo employs a tier-by-tier jailbreak strategy, using token-level typographic attacks to evade input filtering and embedding harmful intent within programming context instructions to bypass runtime monitoring. To comprehensively assess the impact of attacks, a new evaluation metric is further proposed to assess both the toxicity and helpfulness of model outputs post-attack. By embedding harmful intent within code-style visual instructions, PiCo achieves an average Attack Success Rate (ASR) of 84.13% on Gemini-Pro Vision and 52.66% on GPT-4, surpassing previous methods. Experimental results highlight the critical gaps in current defenses, underscoring the need for more robust strategies to secure advanced MLLMs.

Paperid: 1021, https://arxiv.org/pdf/2504.01094.pdf

Abstract:
Large Audio Language Models (LALMs) have significantly advanced audio understanding but introduce critical security risks, particularly through audio jailbreaks. While prior work has focused on English-centric attacks, we expose a far more severe vulnerability: adversarial multilingual and multi-accent audio jailbreaks, where linguistic and acoustic variations dramatically amplify attack success. In this paper, we introduce Multi-AudioJail, the first systematic framework to exploit these vulnerabilities through (1) a novel dataset of adversarially perturbed multilingual/multi-accent audio jailbreaking prompts, and (2) a hierarchical evaluation pipeline revealing that how acoustic perturbations (e.g., reverberation, echo, and whisper effects) interacts with cross-lingual phonetics to cause jailbreak success rates (JSRs) to surge by up to +57.25 percentage points (e.g., reverberated Kenyan-accented attack on MERaLiON). Crucially, our work further reveals that multimodal LLMs are inherently more vulnerable than unimodal systems: attackers need only exploit the weakest link (e.g., non-English audio inputs) to compromise the entire model, which we empirically show by multilingual audio-only attacks achieving 3.1x higher success rates than text-only attacks. We plan to release our dataset to spur research into cross-modal defenses, urging the community to address this expanding attack surface in multimodality as LALMs evolve.

Paperid: 1022, https://arxiv.org/pdf/2503.11897.pdf

Abstract:
We revisit the problem of secure aggregation of high-dimensional vectors in a two-server system such as Prio. These systems are typically used to aggregate vectors such as gradients in private federated learning, where the aggregate itself is protected via noise addition to ensure differential privacy. Existing approaches require communication scaling with the dimensionality, and thus limit the dimensionality of vectors one can efficiently process in this setup. We propose PREAMBLE: {\bf Pr}ivate {\bf E}fficient {\bf A}ggregation {\bf M}echanism via {\bf BL}ock-sparse {\bf E}uclidean Vectors. PREAMBLE builds on an extension of distributed point functions that enables communication- and computation-efficient aggregation of {\em block-sparse vectors}, which are sparse vectors where the non-zero entries occur in a small number of clusters of consecutive coordinates. We show that these block-sparse DPFs can be combined with random sampling and privacy amplification by sampling results, to allow asymptotically optimal privacy-utility trade-offs for vector aggregation, at a fraction of the communication cost. When coupled with recent advances in numerical privacy accounting, our approach incurs a negligible overhead in noise variance, compared to the Gaussian mechanism used with Prio.

Paperid: 1023, https://arxiv.org/pdf/2503.11850.pdf

Abstract:
Pan-privacy was proposed by Dwork et al. as an approach to designing a private analytics system that retains its privacy properties in the face of intrusions that expose the system's internal state. Motivated by federated telemetry applications, we study local pan-privacy, where privacy should be retained under repeated unannounced intrusions on the local state. We consider the problem of monitoring the count of an event in a federated system, where event occurrences on a local device should be hidden even from an intruder on that device. We show that under reasonable constraints, the goal of providing information-theoretic differential privacy under intrusion is incompatible with collecting telemetry information. We then show that this problem can be solved in a scalable way using standard cryptographic primitives.

Paperid: 1024, https://arxiv.org/pdf/2503.08704.pdf

Abstract:
Large language models (LLMs) have achieved remarkable success in natural language processing, yet their performance and computational costs vary significantly. LLM routers play a crucial role in dynamically balancing these trade-offs. While previous studies have primarily focused on routing efficiency, security vulnerabilities throughout the entire LLM router life cycle, from training to inference, remain largely unexplored. In this paper, we present a comprehensive investigation into the life-cycle routing vulnerabilities of LLM routers. We evaluate both white-box and black-box adversarial robustness, as well as backdoor robustness, across several representative routing models under extensive experimental settings. Our experiments uncover several key findings: 1) Mainstream DNN-based routers tend to exhibit the weakest adversarial and backdoor robustness, largely due to their strong feature extraction capabilities that amplify vulnerabilities during both training and inference; 2) Training-free routers demonstrate the strongest robustness across different attack types, benefiting from the absence of learnable parameters that can be manipulated. These findings highlight critical security risks spanning the entire life cycle of LLM routers and provide insights for developing more robust models.

Paperid: 1025, https://arxiv.org/pdf/2503.04867.pdf

Abstract:
Learnable Image Compression (LIC) has proven capable of outperforming standardized video codecs in compression efficiency. However, achieving both real-time and secure LIC operations on hardware presents significant conceptual and methodological challenges. The present work addresses these challenges by providing an integrated workflow and platform for training, securing, and deploying LIC models on hardware. To this end, a hardware-friendly LIC model is obtained by iteratively pruning and quantizing the model within a standard end-to-end learning framework. Notably, we introduce a novel Quantization-Aware Watermarking (QAW) technique, where the model is watermarked during quantization using a joint loss function, ensuring robust security without compromising model performance. The watermarked weights are then public-key encrypted, guaranteeing both content protection and user traceability. Experimental results across different FPGA platforms evaluate real-time performance, latency, energy consumption, and compression efficiency. The findings highlight that the watermarking and encryption processes maintain negligible impact on compression efficiency (average of -0.4 PSNR) and energy consumption (average of +2%), while still meeting real-time constraints and preserving security properties.

Paperid: 1026, https://arxiv.org/pdf/2503.03927.pdf

Abstract:
Machine learning models deployed locally on social media applications are used for features, such as face filters which read faces in-real time, and they expose sensitive attributes to the apps. However, the deployment of machine learning models, e.g., when, where, and how they are used, in social media applications is opaque to users. We aim to address this inconsistency and investigate how social media user perceptions and behaviors change once exposed to these models. We conducted user studies (N=21) and found that participants were unaware to both what the models output and when the models were used in Instagram and TikTok, two major social media platforms. In response to being exposed to the models' functionality, we observed long term behavior changes in 8 participants. Our analysis uncovers the challenges and opportunities in providing transparency for machine learning models that interact with local user data.

Paperid: 1027, https://arxiv.org/pdf/2503.03022.pdf

Abstract:
Machine learning has shown promise in network intrusion detection systems, yet its performance often degrades due to concept drift and imbalanced data. These challenges are compounded by the labor-intensive process of labeling network traffic, especially when dealing with evolving and rare attack types, which makes preparing the right data for adaptation difficult. To address these issues, we propose a generative active adaptation framework that minimizes labeling effort while enhancing model robustness. Our approach employs density-aware dataset prior selection to identify the most informative samples for annotation, and leverages deep generative models to conditionally synthesize diverse samples, thereby augmenting the training set and mitigating the effects of concept drift. We evaluate our end-to-end framework \NetGuard on both simulated IDS data and a real-world ISP dataset, demonstrating significant improvements in intrusion detection performance. Our method boosts the overall F1-score from 0.60 (without adaptation) to 0.86. Rare attacks such as Infiltration, Web Attack, and FTP-BruteForce, which originally achieved F1 scores of 0.001, 0.04, and 0.00, improve to 0.30, 0.50, and 0.71, respectively, with generative active adaptation in the CIC-IDS 2018 dataset. Our framework effectively enhances rare attack detection while reducing labeling costs, making it a scalable and practical solution for intrusion detection.

Paperid: 1028, https://arxiv.org/pdf/2503.02780.pdf

Abstract:
Cyber resilience is the ability of a system to recover from an attack with minimal impact on system operations. However, characterizing a network's resilience under a cyber attack is challenging, as there are no formal definitions of resilience applicable to diverse network topologies and attack patterns. In this work, we propose a quantifiable formulation of resilience that considers multiple defender operational goals, the criticality of various network resources for daily operations, and provides interpretability to security operators about their system's resilience under attack. We evaluate our approach within the CybORG environment, a reinforcement learning (RL) framework for autonomous cyber defense, analyzing trade-offs between resilience, costs, and prioritization of operational goals. Furthermore, we introduce methods to aggregate resilience metrics across time-variable attack patterns and multiple network topologies, comprehensively characterizing system resilience. Using insights gained from our resilience metrics, we design RL autonomous defensive agents and compare them against several heuristic baselines, showing that proactive network hardening techniques and prompt recovery of compromised machines are critical for effective cyber defenses.

Paperid: 1029, https://arxiv.org/pdf/2502.16670.pdf

Abstract:
There has been a long-standing hypothesis that a software's popularity is related to its security or insecurity in both research and popular discourse. There are also a few empirical studies that have examined the hypothesis, either explicitly or implicitly. The present work continues with and contributes to this research with a replication-motivated large-scale analysis of software written in the PHP programming language. The dataset examined contains nearly four hundred thousand open source software packages written in PHP. According to the results based on reported security vulnerabilities, the hypothesis does holds; packages having been affected by vulnerabilities over their release histories are generally more popular than packages without having been affected by a single vulnerability. With this replication results, the paper contributes to the efforts to strengthen the empirical knowledge base in cyber and software security.

Paperid: 1030, https://arxiv.org/pdf/2502.16366.pdf

Abstract:
Most safety training methods for large language models (LLMs) are based on fine-tuning that forces models to shift from an unsafe answer to refusal when faced with harmful requests. Unfortunately, these drastic distribution shifts generally compromise model capabilities. To avoid that, we propose to expand the model's vocabulary with a special token we call red flag token () and propose to train the model to insert this token into its response at any time when harmful content is generated or about to be generated. Our approach offers several advantages: it enables the model to explicitly learn the concept of harmfulness while marginally affecting the generated distribution, thus maintaining the model's utility. It also evaluates each generated answer and provides robustness as good as adversarial training without the need to run attacks during training. Moreover, by encapsulating our safety tuning in a LoRA module, we provide additional defenses against fine-tuning API attacks.

Paperid: 1031, https://arxiv.org/pdf/2502.16065.pdf

Abstract:
Model Extraction Attacks (MEAs) threaten modern machine learning systems by enabling adversaries to steal models, exposing intellectual property and training data. With the increasing deployment of machine learning models in distributed computing environments, including cloud, edge, and federated learning settings, each paradigm introduces distinct vulnerabilities and challenges. Without a unified perspective on MEAs across these distributed environments, organizations risk fragmented defenses, inadequate risk assessments, and substantial economic and privacy losses. This survey is motivated by the urgent need to understand how the unique characteristics of cloud, edge, and federated deployments shape attack vectors and defense requirements. We systematically examine the evolution of attack methodologies and defense mechanisms across these environments, demonstrating how environmental factors influence security strategies in critical sectors such as autonomous vehicles, healthcare, and financial services. By synthesizing recent advances in MEAs research and discussing the limitations of current evaluation practices, this survey provides essential insights for developing robust and adaptive defense strategies. Our comprehensive approach highlights the importance of integrating protective measures across the entire distributed computing landscape to ensure the secure deployment of machine learning models.

Paperid: 1032, https://arxiv.org/pdf/2502.11521.pdf

Abstract:
DeFi (Decentralized Finance) is one of the most important applications of today's cryptocurrencies and smart contracts. It manages hundreds of billions in Total Value Locked (TVL) on-chain, yet it remains susceptible to common DeFi price manipulation attacks. Despite state-of-the-art (SOTA) systems like DeFiRanger and DeFort, we found that they are less effective to non-standard price models in custom DeFi protocols, which account for 44.2% of the 95 DeFi price manipulation attacks reported over the past three years. In this paper, we introduce the first LLM-based approach, DeFiScope, for detecting DeFi price manipulation attacks in both standard and custom price models. Our insight is that large language models (LLMs) have certain intelligence to abstract price calculation from code and infer the trend of token price changes based on the extracted price models. To further strengthen LLMs in this aspect, we leverage Foundry to synthesize on-chain data and use it to fine-tune a DeFi price-specific LLM. Together with the high-level DeFi operations recovered from low-level transaction data, DeFiScope detects various DeFi price manipulations according to systematically mined patterns. Experimental results show that DeFiScope achieves a high precision of 96% and a recall rate of 80%, significantly outperforming SOTA approaches. Moreover, we evaluate DeFiScope's cost-effectiveness and demonstrate its practicality by helping our industry partner confirm 147 real-world price manipulation attacks, including discovering 81 previously unknown historical incidents.

Paperid: 1033, https://arxiv.org/pdf/2502.04227.pdf

Abstract:
Enterprise penetration-testing is often limited by high operational costs and the scarcity of human expertise. This paper investigates the feasibility and effectiveness of using Large Language Model (LLM)-driven autonomous systems to address these challenges in real-world Active Directory (AD) enterprise networks. We introduce a novel prototype designed to employ LLMs to autonomously perform Assumed Breach penetration-testing against enterprise networks. Our system represents the first demonstration of a fully autonomous, LLM-driven framework capable of compromising accounts within a real-life Microsoft Active Directory testbed, GOAD. We perform our empirical evaluation using five LLMs, comparing reasoning to non-reasoning models as well as including open-weight models. Through quantitative and qualitative analysis, incorporating insights from cybersecurity experts, we demonstrate that autonomous LLMs can effectively conduct Assumed Breach simulations. Key findings highlight their ability to dynamically adapt attack strategies, perform inter-context attacks (e.g., web-app audits, social engineering, and unstructured data analysis for credentials), and generate scenario-specific attack parameters like realistic password candidates. The prototype exhibits robust self-correction mechanisms, installing missing tools and rectifying invalid command generations. We find that the associated costs are competitive with, and often significantly lower than, those incurred by professional human pen-testers, suggesting a path toward democratizing access to essential security testing for organizations with budgetary constraints. However, our research also illuminates existing limitations, including instances of LLM ``going down rabbit holes'', challenges in comprehensive information transfer between planning and execution modules, and critical safety concerns that necessitate human oversight.

Paperid: 1034, https://arxiv.org/pdf/2502.04201.pdf

Abstract:
The advancements in autonomous driving technology, coupled with the growing interest from automotive manufacturers and tech companies, suggest a rising adoption of Connected Autonomous Vehicles (CAVs) in the near future. Despite some evidence of higher accident rates in AVs, these incidents tend to result in less severe injuries compared to traditional vehicles due to cooperative safety measures. However, the increased complexity of CAV systems exposes them to significant security vulnerabilities, potentially compromising their performance and communication integrity. This paper contributes by presenting a detailed analysis of existing security frameworks and protocols, focusing on intra- and inter-vehicle communications. We systematically evaluate the effectiveness of these frameworks in addressing known vulnerabilities and propose a set of best practices for enhancing CAV communication security. The paper also provides a comprehensive taxonomy of attack vectors in CAV ecosystems and suggests future research directions for designing more robust security mechanisms. Our key contributions include the development of a new classification system for CAV security threats, the proposal of practical security protocols, and the introduction of use cases that demonstrate how these protocols can be integrated into real-world CAV applications. These insights are crucial for advancing secure CAV adoption and ensuring the safe integration of autonomous vehicles into intelligent transportation systems.

Paperid: 1035, https://arxiv.org/pdf/2502.03682.pdf

Abstract:
Intimate Partner Infiltration (IPI)--a type of Intimate Partner Violence (IPV) that typically requires physical access to a victim's device--is a pervasive concern around the world, often manifesting through digital surveillance, control, and monitoring. Unlike conventional cyberattacks, IPI perpetrators leverage close proximity and personal knowledge to circumvent standard protections, underscoring the need for targeted interventions. While security clinics and other human-centered approaches effectively tailor solutions for victims, their scalability remains constrained by resource limitations and the need for specialized counseling. We present AID, an Automated IPI Detection system that continuously monitors for unauthorized access and suspicious behaviors on smartphones. AID employs a unified architecture to process multimodal signals stealthily and preserve user privacy. A brief calibration phase upon installation enables AID to adapt to each user's behavioral patterns, achieving high accuracy with minimal false alarms. Our 27-participant user study demonstrates that AID achieves highly accurate detection of non-owner access and fine-grained IPI-related activities, attaining a false positive rate of 1.6%, which is 11x lower than existing methods, and an end-to-end F1 score of 0.981. These findings suggest that AID can serve as a forensic tool that security clinics can deploy to scale their ability to identify IPI tactics and deliver personalized, far-reaching support to survivors.

Paperid: 1036, https://arxiv.org/pdf/2502.02960.pdf

Abstract:
Large Language Models (LLMs) represent a transformative leap in artificial intelligence, enabling the comprehension, generation, and nuanced interaction with human language on an unparalleled scale. However, LLMs are increasingly vulnerable to a range of adversarial attacks that threaten their privacy, reliability, security, and trustworthiness. These attacks can distort outputs, inject biases, leak sensitive information, or disrupt the normal functioning of LLMs, posing significant challenges across various applications. In this paper, we provide a novel comprehensive analysis of the adversarial landscape of LLMs, framed through the lens of attack objectives. By concentrating on the core goals of adversarial actors, we offer a fresh perspective that examines threats from the angles of privacy, integrity, availability, and misuse, moving beyond conventional taxonomies that focus solely on attack techniques. This objective-driven adversarial landscape not only highlights the strategic intent behind different adversarial approaches but also sheds light on the evolving nature of these threats and the effectiveness of current defenses. Our analysis aims to guide researchers and practitioners in better understanding, anticipating, and mitigating these attacks, ultimately contributing to the development of more resilient and robust LLM systems.

Paperid: 1037, https://arxiv.org/pdf/2502.02387.pdf

Abstract:
Zero-knowledge succinct non-interactive arguments of knowledge (zk-SNARKs) are a powerful tool for proving computation correctness, attracting significant interest from researchers, developers, and users. However, the complexity of zk-SNARKs has created gaps between these groups, hindering progress. Researchers focus on constructing efficient proving systems with stronger security and new properties, while developers and users prioritize toolchains, usability, and compatibility. In this work, we provide a comprehensive study of zk-SNARK, from theory to practice, pinpointing gaps and limitations. We first present a master recipe that unifies the main steps in converting a program into a zk-SNARK. We then classify existing zk-SNARKs according to their key techniques. Our classification addresses the main difference in practically valuable properties between existing zk-SNARK schemes. We survey over 40 zk-SNARKs since 2013 and provide a reference table listing their categories and properties. Following the steps in master recipe, we then survey 11 general-purpose popular used libraries. We elaborate on these libraries' usability, compatibility, efficiency and limitations. Since installing and executing these zk-SNARK systems is challenging, we also provide a completely virtual environment in which to run the compiler for each of them. We identify that the proving system is the primary focus in cryptography academia. In contrast, the constraint system presents a bottleneck in industry. To bridge this gap, we offer recommendations and advocate for the opensource community to enhance documentation, standardization and compatibility.

Paperid: 1038, https://arxiv.org/pdf/2501.18626.pdf

Abstract:
We present a novel class of jailbreak adversarial attacks on LLMs, termed Task-in-Prompt (TIP) attacks. Our approach embeds sequence-to-sequence tasks (e.g., cipher decoding, riddles, code execution) into the model's prompt to indirectly generate prohibited inputs. To systematically assess the effectiveness of these attacks, we introduce the PHRYGE benchmark. We demonstrate that our techniques successfully circumvent safeguards in six state-of-the-art language models, including GPT-4o and LLaMA 3.2. Our findings highlight critical weaknesses in current LLM safety alignments and underscore the urgent need for more sophisticated defence strategies. Warning: this paper contains examples of unethical inquiries used solely for research purposes.

Paperid: 1039, https://arxiv.org/pdf/2501.18371.pdf

Abstract:
While many hardware accelerators have recently been proposed to address the inefficiency problem of fully homomorphic encryption (FHE) schemes, none of them is able to deliver optimal performance when facing real-world FHE workloads consisting of a mixture of shallow and deep computations, due primarily to their homogeneous design principle. This paper presents FLASH-FHE, the first FHE accelerator with a heterogeneous architecture for mixed workloads. At its heart, FLASH-FHE designs two types of computation clusters, ie, bootstrappable and swift, to optimize for deep and shallow workloads respectively in terms of cryptographic parameters and hardware pipelines. We organize one bootstrappable and two swift clusters into one cluster affiliation, and present a scheduling scheme that provides sufficient acceleration for deep FHE workloads by utilizing all the affiliations, while improving parallelism for shallow FHE workloads by assigning one shallow workload per affiliation and dynamically decomposing the bootstrappable cluster into multiple swift pipelines to accelerate the assigned workload. We further show that these two types of clusters can share valuable on-chip memory, improving performance without significant resource consumption. We implement FLASH-FHE with RTL and synthesize it using both 7nm and 14/12nm technology nodes, and our experiment results demonstrate that FLASH-FHE achieves an average performance improvement of $1.4\times$ and $11.2\times$ compared to state-of-the-art FHE accelerators CraterLake and F1 for deep workloads, while delivering up to $8.0\times$ speedup for shallow workloads due to its heterogeneous architecture.

Paperid: 1040, https://arxiv.org/pdf/2501.14418.pdf

Abstract:
Payment channel networks (PCNs) offer a promising solution to address the limited transaction throughput of deployed blockchains. However, several attacks have recently been proposed that stress the vulnerability of PCNs to timelock and censoring attacks. To address such attacks, we introduce Thunderdome, the first timelock-free PCN. Instead, Thunderdome leverages the design rationale of virtual channels to extend a timelock-free payment channel primitive, thereby enabling multi-hop transactions without timelocks. Previous works either utilize timelocks or do not accommodate transactions between parties that do not share a channel. At its core, Thunderdome relies on a committee of non-trusted watchtowers, known as wardens, who ensure that no honest party loses funds, even when offline, during the channel closure process. We introduce tailored incentive mechanisms to ensure that all participants follow the protocol's correct execution. Besides a traditional security proof that assumes an honest majority of the committee, we conduct a formal game-theoretic analysis to demonstrate the security of Thunderdome when all participants, including wardens, act rationally. We implement a proof of concept of Thunderdome on Ethereum to validate its feasibility and evaluate its costs. Our evaluation shows that deploying Thunderdome, including opening the underlying payment channel, costs approximately \$15 (0.0089 ETH), while the worst-case cost for closing a channel is about \$7 (0.004 ETH).

Paperid: 1041, https://arxiv.org/pdf/2501.10639.pdf

Abstract:
Ensuring safety alignment is a critical requirement for large language models (LLMs), particularly given increasing deployment in real-world applications. Despite considerable advancements, LLMs remain susceptible to jailbreak attacks, which exploit system vulnerabilities to circumvent safety measures and elicit harmful or inappropriate outputs. Furthermore, while adversarial training-based defense methods have shown promise, a prevalent issue is the unintended over-defense behavior, wherein models excessively reject benign queries, significantly undermining their practical utility. To address these limitations, we introduce LATPC, a Latent-space Adversarial Training with Post-aware Calibration framework. LATPC dynamically identifies safety-critical latent dimensions by contrasting harmful and benign inputs, enabling the adaptive construction of targeted refusal feature removal attacks. This mechanism allows adversarial training to concentrate on real-world jailbreak tactics that disguise harmful queries as benign ones. During inference, LATPC employs an efficient embedding-level calibration mechanism to minimize over-defense behaviors with negligible computational overhead. Experimental results across five types of disguise-based jailbreak attacks demonstrate that LATPC achieves a superior balance between safety and utility compared to existing defense frameworks. Further analysis demonstrates the effectiveness of leveraging safety-critical dimensions in developing robust defense methods against jailbreak attacks.

Paperid: 1042, https://arxiv.org/pdf/2501.10560.pdf

Abstract:
Ensuring the proper use of sensitive data in analytics under complex privacy policies is an increasingly critical challenge. Many existing approaches lack portability, verifiability, and scalability across diverse data processing frameworks. We introduce Picachv, a novel security monitor that automatically enforces data use policies. It works on relational algebra as an abstraction for program semantics, enabling policy enforcement on query plans generated by programs during execution. This approach simplifies analysis across diverse analytical operations and supports various front-end query languages. By formalizing both data use policies and relational algebra semantics in Coq, we prove that Picachv correctly enforces policies. Picachv also leverages Trusted Execution Environments (TEEs) to enhance trust in runtime, providing provable policy compliance to stakeholders that the analytical tasks comply with their data use policies. We integrated Picachv into Polars, a state-of-the-art data analytics framework, and evaluate its performance using the TPC-H benchmark. We also apply our approach to real-world use cases. Our work demonstrates the practical application of formal methods in securing data analytics, addressing key challenges.

Paperid: 1043, https://arxiv.org/pdf/2501.08550.pdf

Abstract:
Modern blockchains increasingly consist of multiple clients that implement a single blockchain protocol. If there is a semantic mismatch between the protocol implementations, the blockchain can permanently split and introduce new attack vectors. Current ad-hoc test suites for client implementations are not sufficient to ensure a high degree of protocol conformance. As an alternative, we present a framework that performs protocol conformance testing using a formal model of the protocol and an implementation running inside a deterministic blockchain simulator. Our framework consists of two complementary workflows that use the components as trace generators and checkers. Our insight is that both workflows are needed to detect all types of violations. We have applied and demonstrated the utility of our framework on an industrial strength consensus protocol.

Paperid: 1044, https://arxiv.org/pdf/2501.07844.pdf

Abstract:
Quantum computing offers unparalleled processing power but raises significant data privacy challenges. Quantum Differential Privacy (QDP) leverages inherent quantum noise to safeguard privacy, surpassing traditional DP. This paper develops comprehensive noise profiles, identifies noise types beneficial for QDP, and highlights teh need for practical implementations beyond theoretical models. Existing QDP mechanisms, limited to single noise sources, fail to reflect teh multi-source noise reality of quantum systems. We propose a resilient hybrid QDP mechanism utilizing channel and measurement noise, optimizing privacy budgets to balance privacy and utility. Additionally, we introduce Lifted Quantum Differential Privacy, offering enhanced randomness for improved privacy audits and quantum algorithm evaluation.

Paperid: 1045, https://arxiv.org/pdf/2501.07262.pdf

Abstract:
Content providers increasingly utilise Content Delivery Networks (CDNs) to enhance users' content download experience. However, this deployment scenario raises significant security concerns regarding content confidentiality and user privacy due to the involvement of third-party providers. Prior proposals using private information retrieval (PIR) and oblivious RAM (ORAM) have proven impractical due to high computation and communication costs, as well as integration challenges within distributed CDN architectures. In response, we present \textsf{OblivCDN}, a practical privacy-preserving system meticulously designed for seamless integration with the existing real-world Internet-CDN infrastructure. Our design strategically adapts Range ORAM primitives to optimise memory and disk seeks when accessing contiguous blocks of CDN content, both at the origin and edge servers, while preserving both content confidentiality and user access pattern hiding features. Also, we carefully customise several oblivious building blocks that integrate the distributed trust model into the ORAM client, thereby eliminating the computational bottleneck in the origin server and reducing communication costs between the origin server and edge servers. Moreover, the newly-designed ORAM client also eliminates the need for trusted hardware on edge servers, and thus significantly ameliorates the compatibility towards networks with massive legacy devices.In real-world streaming evaluations, OblivCDN} demonstrates remarkable performance, downloading a $256$ MB video in just $5.6$ seconds. This achievement represents a speedup of $90\times$ compared to a strawman approach (direct ORAM adoption) and a $366\times$ improvement over the prior art, OblivP2P.

Paperid: 1046, https://arxiv.org/pdf/2501.02971.pdf

Abstract:
Existing research on federated learning has been focused on the setting where learning is coordinated by a centralized entity. Yet the greatest potential of future collaborative intelligence would be unleashed in a more open and democratized setting with no central entity in a dominant role, referred to as "decentralized federated learning". New challenges arise accordingly in achieving both correct model training and fair reward allocation with collective effort among all participating nodes, especially with the threat of the Byzantine node jeopardising both tasks. In this paper, we propose a blockchain-based decentralized Byzantine fault-tolerant federated learning framework based on a novel Proof-of-Data (PoD) consensus protocol to resolve both the "trust" and "incentive" components. By decoupling model training and contribution accounting, PoD is able to enjoy not only the benefit of learning efficiency and system liveliness from asynchronous societal-scale PoW-style learning but also the finality of consensus and reward allocation from epoch-based BFT-style voting. To mitigate false reward claims by data forgery from Byzantine attacks, a privacy-aware data verification and contribution-based reward allocation mechanism is designed to complete the framework. Our evaluation results show that PoD demonstrates performance in model training close to that of the centralized counterpart while achieving trust in consensus and fairness for reward allocation with a fault tolerance ratio of 1/3.

Paperid: 1047, https://arxiv.org/pdf/2501.02796.pdf

Abstract:
Cyber-physical-social systems (CPSSs) have emerged in many applications over recent decades, requiring increased attention to security concerns. The rise of sophisticated threats like Advanced Persistent Threats (APTs) makes ensuring security in CPSSs particularly challenging. Provenance graph analysis has proven effective for tracing and detecting anomalies within systems, but the sheer size and complexity of these graphs hinder the efficiency of existing methods, especially those relying on graph neural networks (GNNs). To address these challenges, we present GraphDART, a modular framework designed to distill provenance graphs into compact yet informative representations, enabling scalable and effective anomaly detection. GraphDART can take advantage of diverse graph distillation techniques, including classic and modern graph distillation methods, to condense large provenance graphs while preserving essential structural and contextual information. This approach significantly reduces computational overhead, allowing GNNs to learn from distilled graphs efficiently and enhance detection performance. Extensive evaluations on benchmark datasets demonstrate the robustness of GraphDART in detecting malicious activities across cyber-physical-social systems. By optimizing computational efficiency, GraphDART provides a scalable and practical solution to safeguard interconnected environments against APTs.

Paperid: 1048, https://arxiv.org/pdf/2501.01263.pdf

Abstract:
Powered by their superior performance, deep neural networks (DNNs) have found widespread applications across various domains. Many deep learning (DL) models are now embedded in mobile apps, making them more accessible to end users through on-device DL. However, deploying on-device DL to users' smartphones simultaneously introduces several security threats. One primary threat is backdoor attacks. Extensive research has explored backdoor attacks for several years and has proposed numerous attack approaches. However, few studies have investigated backdoor attacks on DL models deployed in the real world, or they have shown obvious deficiencies in effectiveness and stealthiness. In this work, we explore more effective and stealthy backdoor attacks on real-world DL models extracted from mobile apps. Our main justification is that imperceptible and sample-specific backdoor triggers generated by DNN-based steganography can enhance the efficacy of backdoor attacks on real-world models. We first confirm the effectiveness of steganography-based backdoor attacks on four state-of-the-art DNN models. Subsequently, we systematically evaluate and analyze the stealthiness of the attacks to ensure they are difficult to perceive. Finally, we implement the backdoor attacks on real-world models and compare our approach with three baseline methods. We collect 38,387 mobile apps, extract 89 DL models from them, and analyze these models to obtain the prerequisite model information for the attacks. After identifying the target models, our approach achieves an average of 12.50% higher attack success rate than DeepPayload while better maintaining the normal performance of the models. Extensive experimental results demonstrate that our method enables more effective, robust, and stealthy backdoor attacks on real-world models.

Paperid: 1049, https://arxiv.org/pdf/2506.23183.pdf

Abstract:
In machine learning security, one is often faced with the problem of removing outliers from a given set of high-dimensional vectors when computing their average. For example, many variants of data poisoning attacks produce gradient vectors during training that are outliers in the distribution of clean gradients, which bias the computed average used to derive the ML model. Filtering them out before averaging serves as a generic defense strategy. Byzantine robust aggregation is an algorithmic primitive which computes a robust average of vectors, in the presence of an $Îµ$ fraction of vectors which may have been arbitrarily and adaptively corrupted, such that the resulting bias in the final average is provably bounded. In this paper, we give the first robust aggregator that runs in quasi-linear time in the size of input vectors and provably has near-optimal bias bounds. Our algorithm also does not assume any knowledge of the distribution of clean vectors, nor does it require pre-computing any filtering thresholds from it. This makes it practical to use directly in standard neural network training procedures. We empirically confirm its expected runtime efficiency and its effectiveness in nullifying 10 different ML poisoning attacks.

Paperid: 1050, https://arxiv.org/pdf/2506.22949.pdf

Abstract:
One of the most difficult challenges in cybersecurity is eliminating Distributed Denial of Service (DDoS) attacks. Automating this task using artificial intelligence is a complex process due to the inherent class imbalance and lack of sufficient labeled samples of real-world datasets. This research investigates the use of Semi-Supervised Learning (SSL) techniques to improve DDoS attack detection when data is imbalanced and partially labeled. In this process, 13 state-of-the-art SSL algorithms are evaluated for detecting DDoS attacks in several scenarios. We evaluate their practical efficacy and shortcomings, including the extent to which they work in extreme environments. The results will offer insight into designing intelligent Intrusion Detection Systems (IDSs) that are robust against class imbalance and handle partially labeled data.

Paperid: 1051, https://arxiv.org/pdf/2506.21874.pdf

Abstract:
Today's text-to-image generative models are trained on millions of images sourced from the Internet, each paired with a detailed caption produced by Vision-Language Models (VLMs). This part of the training pipeline is critical for supplying the models with large volumes of high-quality image-caption pairs during training. However, recent work suggests that VLMs are vulnerable to stealthy adversarial attacks, where adversarial perturbations are added to images to mislead the VLMs into producing incorrect captions. In this paper, we explore the feasibility of adversarial mislabeling attacks on VLMs as a mechanism to poisoning training pipelines for text-to-image models. Our experiments demonstrate that VLMs are highly vulnerable to adversarial perturbations, allowing attackers to produce benign-looking images that are consistently miscaptioned by the VLM models. This has the effect of injecting strong "dirty-label" poison samples into the training pipeline for text-to-image models, successfully altering their behavior with a small number of poisoned samples. We find that while potential defenses can be effective, they can be targeted and circumvented by adaptive attackers. This suggests a cat-and-mouse game that is likely to reduce the quality of training data and increase the cost of text-to-image model development. Finally, we demonstrate the real-world effectiveness of these attacks, achieving high attack success (over 73%) even in black-box scenarios against commercial VLMs (Google Vertex AI and Microsoft Azure).

Paperid: 1052, https://arxiv.org/pdf/2506.19624.pdf

Abstract:
The widespread lack of broad source code verification on blockchain explorers such as Etherscan, where despite 78,047,845 smart contracts deployed on Ethereum (as of May 26, 2025), a mere 767,520 (< 1%) are open source, presents a severe impediment to blockchain security. This opacity necessitates the automated semantic analysis of on-chain smart contract bytecode, a fundamental research challenge with direct implications for identifying vulnerabilities and understanding malicious behavior. Prevailing decompilers struggle to reverse bytecode in a readable manner, often yielding convoluted code that critically hampers vulnerability analysis and thwarts efforts to dissect contract functionalities for security auditing. This paper addresses this challenge by introducing a pioneering decompilation pipeline that, for the first time, successfully leverages Large Language Models (LLMs) to transform Ethereum Virtual Machine (EVM) bytecode into human-readable and semantically faithful Solidity code. Our novel methodology first employs rigorous static program analysis to convert bytecode into a structured three-address code (TAC) representation. This intermediate representation then guides a Llama-3.2-3B model, specifically fine-tuned on a comprehensive dataset of 238,446 TAC-to-Solidity function pairs, to generate high-quality Solidity. This approach uniquely recovers meaningful variable names, intricate control flow, and precise function signatures. Our extensive empirical evaluation demonstrates a significant leap beyond traditional decompilers, achieving an average semantic similarity of 0.82 with original source and markedly superior readability. The practical viability and effectiveness of our research are demonstrated through its implementation in a publicly accessible system, available at https://evmdecompiler.com.

Paperid: 1053, https://arxiv.org/pdf/2506.16347.pdf

Abstract:
Information and Communication Technologies (ICT) have a significant climate impact, and data centres account for a large proportion of the carbon emissions from ICT. To achieve sustainability goals, it is important that all parties involved in ICT supply chains can track and share accurate carbon emissions data with their customers, investors, and the authorities. However, businesses have strong incentives to make their numbers look good, whilst less so to publish their accounting methods along with all the input data, due to the risk of revealing sensitive information. It would be uneconomical to use a trusted third party to verify the data for every report for each party in the chain. As a result, carbon emissions reporting in supply chains currently relies on unverified data. This paper proposes a methodology that applies cryptography and zero-knowledge proofs for carbon emissions claims that can be subsequently verified without the knowledge of the private input data. The proposed system is based on a zero-knowledge Succinct Non-interactive ARguments of Knowledge (zk-SNARK) protocol, which enables verifiable emissions reporting mechanisms across a chain of energy suppliers, cloud data centres, cloud services providers, and customers, without any company needing to disclose commercially sensitive information. This allows customers of cloud services to accurately account for the emissions generated by their activities, improving data quality for their own regulatory reporting. Cloud services providers would also be held accountable for producing accurate carbon emissions data.

Paperid: 1054, https://arxiv.org/pdf/2506.15349.pdf

Abstract:
Differential privacy (DP) auditing aims to provide empirical lower bounds on the privacy guarantees of DP mechanisms like DP-SGD. While some existing techniques require many training runs that are prohibitively costly, recent work introduces one-run auditing approaches that effectively audit DP-SGD in white-box settings while still being computationally efficient. However, in the more practical black-box setting where gradients cannot be manipulated during training and only the last model iterate is observed, prior work shows that there is still a large gap between the empirical lower bounds and theoretical upper bounds. Consequently, in this work, we study how incorporating approaches for stronger membership inference attacks (MIA) can improve one-run auditing in the black-box setting. Evaluating on image classification models trained on CIFAR-10 with DP-SGD, we demonstrate that our proposed approach, which utilizes quantile regression for MIA, achieves tighter bounds while crucially maintaining the computational efficiency of one-run methods.

Paperid: 1055, https://arxiv.org/pdf/2506.13955.pdf

Abstract:
Anomaly detection (AD) is a critical task across domains such as cybersecurity and healthcare. In the unsupervised setting, an effective and theoretically-grounded principle is to train classifiers to distinguish normal data from (synthetic) anomalies. We extend this principle to semi-supervised AD, where training data also include a limited labeled subset of anomalies possibly present in test time. We propose a theoretically-grounded and empirically effective framework for semi-supervised AD that combines known and synthetic anomalies during training. To analyze semi-supervised AD, we introduce the first mathematical formulation of semi-supervised AD, which generalizes unsupervised AD. Here, we show that synthetic anomalies enable (i) better anomaly modeling in low-density regions and (ii) optimal convergence guarantees for neural network classifiers -- the first theoretical result for semi-supervised AD. We empirically validate our framework on five diverse benchmarks, observing consistent performance gains. These improvements also extend beyond our theoretical framework to other classification-based AD methods, validating the generalizability of the synthetic anomaly principle in AD.

Paperid: 1056, https://arxiv.org/pdf/2506.08435.pdf

Abstract:
Federated learning (FL) enables collaborative model training among multiple clients without the need to expose raw data. Its ability to safeguard privacy, at the heart of FL, has recently been a hot-button debate topic. To elaborate, several studies have introduced a type of attacks known as gradient leakage attacks (GLAs), which exploit the gradients shared during training to reconstruct clients' raw data. On the flip side, some literature, however, contends no substantial privacy risk in practical FL environments due to the effectiveness of such GLAs being limited to overly relaxed conditions, such as small batch sizes and knowledge of clients' data distributions. This paper bridges this critical gap by empirically demonstrating that clients' data can still be effectively reconstructed, even within realistic FL environments. Upon revisiting GLAs, we recognize that their performance failures stem from their inability to handle the gradient matching problem. To alleviate the performance bottlenecks identified above, we develop FedLeak, which introduces two novel techniques, partial gradient matching and gradient regularization. Moreover, to evaluate the performance of FedLeak in real-world FL environments, we formulate a practical evaluation protocol grounded in a thorough review of extensive FL literature and industry practices. Under this protocol, FedLeak can still achieve high-fidelity data reconstruction, thereby underscoring the significant vulnerability in FL systems and the urgent need for more effective defense methods.

Paperid: 1057, https://arxiv.org/pdf/2506.05403.pdf

Abstract:
With the widespread adoption of Artificial intelligence (AI), AI-based tools and components are becoming omnipresent in today's solutions. However, these components and tools are posing a significant threat when it comes to adversarial attacks. Mobile Crowdsensing (MCS) is a sensing paradigm that leverages the collective participation of workers and their smart devices to collect data. One of the key challenges faced at the selection stage is ensuring task completion due to workers' varying behavior. AI has been utilized to tackle this challenge by building unique models for each worker to predict their behavior. However, the integration of AI into the system introduces vulnerabilities that can be exploited by malicious insiders to reduce the revenue obtained by victim workers. This work proposes an adversarial attack targeting behavioral-based selection models in MCS. The proposed attack leverages Generative Adversarial Networks (GANs) to generate poisoning points that can mislead the models during the training stage without being detected. This way, the potential damage introduced by GANs on worker selection in MCS can be anticipated. Simulation results using a real-life dataset show the effectiveness of the proposed attack in compromising the victim workers' model and evading detection by an outlier detector, compared to a benchmark. In addition, the impact of the attack on reducing the payment obtained by victim workers is evaluated.

Paperid: 1058, https://arxiv.org/pdf/2506.00280.pdf

Abstract:
With 3D Gaussian Splatting (3DGS) being increasingly used in safety-critical applications, how can an adversary manipulate the scene to cause harm? We introduce CLOAK, the first attack that leverages view-dependent Gaussian appearances - colors and textures that change with viewing angle - to embed adversarial content visible only from specific viewpoints. We further demonstrate DAGGER, a targeted adversarial attack directly perturbing 3D Gaussians without access to underlying training data, deceiving multi-stage object detectors e.g., Faster R-CNN, through established methods such as projected gradient descent. These attacks highlight underexplored vulnerabilities in 3DGS, introducing a new potential threat to robotic learning for autonomous navigation and other safety-critical 3DGS applications.

Paperid: 1059, https://arxiv.org/pdf/2506.00201.pdf

Abstract:
We consider the problem of secret protection, in which a business or organization wishes to train a model on their own data, while attempting to not leak secrets potentially contained in that data via the model. The standard method for training models to avoid memorization of secret information is via differential privacy (DP). However, DP requires a large loss in utility or a large dataset to achieve its strict privacy definition, which may be unnecessary in our setting where the data curator and data owner are the same entity. We propose an alternate definition of secret protection that instead of targeting DP, instead targets a bound on the posterior probability of secret reconstruction. We then propose and empirically evaluate an algorithm for model training with this secret protection definition. Our algorithm solves a linear program to assign weights to examples based on the desired per-secret protections, and then performs Poisson sampling using these weights. We show our algorithm significantly outperforms the baseline of running DP-SGD on the whole dataset.

Paperid: 1060, https://arxiv.org/pdf/2505.24742.pdf

Abstract:
Data has become a crucial resource in the digital economy, fostering initiatives for secure and sovereign data sharing frameworks such as Data Spaces. However, these distributed environments require fine-grained access control mechanisms that balance openness with sovereignty and security. This paper proposes an extension of the Open Digital Rights Language (ODRL) standard, the ODRL Data Spaces (ODS) profile, aimed at supporting authorization and complementing existing authentication mechanisms throughout the data lifecycle. Additionally, a policy execution engine is introduced to translate ODRL policies into executable formats, enabling effective enforcement. The approach is validated through a use case involving OpenFGA, demonstrating its applicability to relationship-based access control scenarios.

Paperid: 1061, https://arxiv.org/pdf/2505.24703.pdf

Abstract:
Deep learning techniques have enabled vast improvements in computer vision technologies. Nevertheless, these models are vulnerable to adversarial patch attacks which catastrophically impair performance. The physically realizable nature of these attacks calls for certifiable defenses, which feature provable guarantees on robustness. While certifiable defenses have been successfully applied to single-label classification, limited work has been done for multi-label classification. In this work, we present PatchDEMUX, a certifiably robust framework for multi-label classifiers against adversarial patches. Our approach is a generalizable method which can extend any existing certifiable defense for single-label classification; this is done by considering the multi-label classification task as a series of isolated binary classification problems to provably guarantee robustness. Furthermore, in the scenario where an attacker is limited to a single patch we propose an additional certification procedure that can provide tighter robustness bounds. Using the current state-of-the-art (SOTA) single-label certifiable defense PatchCleanser as a backbone, we find that PatchDEMUX can achieve non-trivial robustness on the MS-COCO and PASCAL VOC datasets while maintaining high clean performance

Paperid: 1062, https://arxiv.org/pdf/2505.24698.pdf

Abstract:
Identity verification in Data Spaces is a fundamental aspect of ensuring security and privacy in digital environments. This paper presents an identity verification protocol tailored for shared data environments within Data Spaces. This protocol extends the Grant Negotiation and Authorization Protocol (GNAP) and integrates OpenID Connect for Verifiable Presentations (OIDC4VP) along with support for Linked Verifiable Presentations (LVP), providing a robust foundation for secure and privacy-preserving interactions. The proposed solution adheres to the principles of Self-Sovereign Identity (SSI) to facilitate decentralized, user-centric identity management while maintaining flexibility through protocol negotiation. Two alternative interaction flows are introduced: a "Wallet-Driven Interaction" utilizing OIDC4VP, and a "LVP Authorization" model for fully automated machine-to-machine communication. These flows address critical challenges encountered in Data Spaces, including privacy, interoperability, and regulatory compliance while simultaneously ensuring scalability and minimizing trust assumptions. The paper provides a detailed technical design, outlining the implementation considerations, and demonstrating how the proposed flows guarantee verifiable, secure, and efficient interactions between participants. This work contributes towards the establishment of a more trustworthy and sovereign digital infrastructure, in alignment with emerging European data governance initiatives.

Paperid: 1063, https://arxiv.org/pdf/2505.24379.pdf

Abstract:
Large language models are typically trained on datasets collected from the web, which may inadvertently contain harmful or sensitive personal information. To address growing privacy concerns, unlearning methods have been proposed to remove the influence of specific data from trained models. Of these, exact unlearning -- which retrains the model from scratch without the target data -- is widely regarded the gold standard, believed to be robust against privacy-related attacks. In this paper, we challenge this assumption by introducing a novel data extraction attack that compromises even exact unlearning. Our method leverages both the pre- and post-unlearning models: by guiding the post-unlearning model using signals from the pre-unlearning model, we uncover patterns that reflect the removed data distribution. Combining model guidance with a token filtering strategy, our attack significantly improves extraction success rates -- doubling performance in some cases -- across common benchmarks such as MUSE, TOFU, and WMDP. Furthermore, we demonstrate our attack's effectiveness on a simulated medical diagnosis dataset to highlight real-world privacy risks associated with exact unlearning. In light of our findings, which suggest that unlearning may, in a contradictory way, increase the risk of privacy leakage, we advocate for evaluation of unlearning methods to consider broader threat models that account not only for post-unlearning models but also for adversarial access to prior checkpoints.

Paperid: 1064, https://arxiv.org/pdf/2505.23821.pdf

Abstract:
With the surge of social media, maliciously tampered public speeches, especially those from influential figures, have seriously affected social stability and public trust. Existing speech tampering detection methods remain insufficient: they either rely on external reference data or fail to be both sensitive to attacks and robust to benign operations, such as compression and resampling. To tackle these challenges, we introduce SpeechVerifer to proactively verify speech integrity using only the published speech itself, i.e., without requiring any external references. Inspired by audio fingerprinting and watermarking, SpeechVerifier can (i) effectively detect tampering attacks, (ii) be robust to benign operations and (iii) verify the integrity only based on published speeches. Briefly, SpeechVerifier utilizes multiscale feature extraction to capture speech features across different temporal resolutions. Then, it employs contrastive learning to generate fingerprints that can detect modifications at varying granularities. These fingerprints are designed to be robust to benign operations, but exhibit significant changes when malicious tampering occurs. To enable speech verification in a self-contained manner, the generated fingerprints are then embedded into the speech signal by segment-wise watermarking. Without external references, SpeechVerifier can retrieve the fingerprint from the published audio and check it with the embedded watermark to verify the integrity of the speech. Extensive experimental results demonstrate that the proposed SpeechVerifier is effective in detecting tampering attacks and robust to benign operations.

Paperid: 1065, https://arxiv.org/pdf/2505.23603.pdf

Abstract:
Quantum networks use principles of quantum physics to create secure communication networks. Moving these networks off the ground using drones, balloons, or satellites could help increase the scalability of these networks. This article reviews how such aerial links work, what makes them difficult to build, and the possible solutions that can be used to overcome these problems. By combining ground stations, aerial relays, and orbiting satellites into one seamless system, we move closer to a practical quantum internet.

Paperid: 1066, https://arxiv.org/pdf/2505.23581.pdf

Abstract:
The Hilbert transform has been one of the foundational transforms in signal processing, finding it's way into multiple disciplines from cryptography to biomedical sciences. However, there does not exist any quantum analogue for the Hilbert transform. In this work, we introduce a formulation for the quantum Hilbert transform (QHT)and apply it to a quantum steganography protocol. By bridging classical phase-shift techniques with quantum operations, QHT opens new pathways in quantum signal processing, communications, sensing, and secure information hiding.

Paperid: 1067, https://arxiv.org/pdf/2505.23266.pdf

Abstract:
We present Adversarial Object Fusion (AdvOF), a novel attack framework targeting vision-and-language navigation (VLN) agents in service-oriented environments by generating adversarial 3D objects. While foundational models like Large Language Models (LLMs) and Vision Language Models (VLMs) have enhanced service-oriented navigation systems through improved perception and decision-making, their integration introduces vulnerabilities in mission-critical service workflows. Existing adversarial attacks fail to address service computing contexts, where reliability and quality-of-service (QoS) are paramount. We utilize AdvOF to investigate and explore the impact of adversarial environments on the VLM-based perception module of VLN agents. In particular, AdvOF first precisely aggregates and aligns the victim object positions in both 2D and 3D space, defining and rendering adversarial objects. Then, we collaboratively optimize the adversarial object with regularization between the adversarial and victim object across physical properties and VLM perceptions. Through assigning importance weights to varying views, the optimization is processed stably and multi-viewedly by iterative fusions from local updates and justifications. Our extensive evaluations demonstrate AdvOF can effectively degrade agent performance under adversarial conditions while maintaining minimal interference with normal navigation tasks. This work advances the understanding of service security in VLM-powered navigation systems, providing computational foundations for robust service composition in physical-world deployments.

Paperid: 1068, https://arxiv.org/pdf/2505.21726.pdf

Abstract:
Quantum key distribution (QKD) will most likely be an integral part of any practical quantum network in the future. However, not all QKD protocols can be used in today's networks because of the lack of single-photon emitters and noisy intermediate quantum hardware. Attenuated-photon transmission, typically used to simulate single-photon emitters, severely limits the achievable transmission distances and makes the integration of the QKD into existing classical networks, that use tens of thousands of photons per bit of transmission, difficult. Furthermore, it has been found that protocol performance varies with topology. In order to remove the reliance of QKD on single-photon emitters and increase transmission distances, it is worthwhile to explore QKD protocols that do not rely on single-photon transmissions for security, such as the 3-stage QKD protocol, which can tolerate multiple photons in each burst without information leakage. This paper compares and contrasts the 3-stage QKD protocol with conventional QKD protocols and its efficiency in different network topologies and conditions. Furthermore, we establish a mathematical relationship between achievable key rates to increase transmission distances in various topologies.

Paperid: 1069, https://arxiv.org/pdf/2505.19270.pdf

Abstract:
Quantum-augmented networks aim to use quantum phenomena to improve detection and protection against malicious actors in a classical communication network. This may include multiplexing quantum signals into classical fiber optical channels and incorporating purely quantum links alongside classical links in the network. In such hybrid networks, quantum protocols based on single photons become a bottleneck for transmission distances and data speeds, thereby reducing entire network performance. Furthermore, many of the security assumptions of the single-photon protocols do not hold up in practice because of the impossibility of manufacturing single-photon emitters. Multi-photon quantum protocols, on the other hand, are designed to operate under practical assumptions and do not require single photon emitters. As a result, they provide higher levels of security guarantees and longer transmission distances. However, the effect of channel and device noise on multiphoton protocols in terms of security, transmission distances, and bit rates has not been investigated. In this paper, we focus on channel noise and present our observations on the effect of various types of noise on multi-photon protocols. We also investigate the effect of topologies such as ring, star, and torus on the noise characteristics of the multi-photon protocols. Our results show the possible advantages of switching to multi-photon protocols and give insights into the repeater placement and topology choice for quantum-augmented networks.

Paperid: 1070, https://arxiv.org/pdf/2505.18282.pdf

Abstract:
In the past decade, several small-scale quantum key distribution networks have been established. However, the deployment of large-scale quantum networks depends on the development of quantum repeaters, quantum channels, quantum memories, and quantum network protocols. To improve the security of existing networks and adopt currently feasible quantum technologies, the next step is to augment classical networks with quantum devices, properties, and phenomena. To achieve this, we propose a change in the structure of the HTTP protocol such that it can carry both quantum and classical payload. This work lays the foundation for dividing one single network packet into classical and quantum payloads depending on the privacy needs. We implement logistic regression, CNN, LSTM, and BiLSTM models to classify the privacy label for outgoing communications. This enables reduced utilization of quantum resources allowing for a more efficient secure quantum network design. Experimental results using the proposed methods are presented.

Paperid: 1071, https://arxiv.org/pdf/2505.16008.pdf

Abstract:
We propose LAGO - Language Similarity-Aware Graph Optimization - a novel approach for few-shot cross-lingual embedding inversion attacks, addressing critical privacy vulnerabilities in multilingual NLP systems. Unlike prior work in embedding inversion attacks that treat languages independently, LAGO explicitly models linguistic relationships through a graph-based constrained distributed optimization framework. By integrating syntactic and lexical similarity as edge constraints, our method enables collaborative parameter learning across related languages. Theoretically, we show this formulation generalizes prior approaches, such as ALGEN, which emerges as a special case when similarity constraints are relaxed. Our framework uniquely combines Frobenius-norm regularization with linear inequality or total variation constraints, ensuring robust alignment of cross-lingual embedding spaces even with extremely limited data (as few as 10 samples per language). Extensive experiments across multiple languages and embedding models demonstrate that LAGO substantially improves the transferability of attacks with 10-20% increase in Rouge-L score over baselines. This work establishes language similarity as a critical factor in inversion attack transferability, urging renewed focus on language-aware privacy-preserving multilingual embeddings.

Paperid: 1072, https://arxiv.org/pdf/2505.15242.pdf

Abstract:
Large Language Models (LLMs) have shown great promise in code analysis and auditing; however, they still struggle with hallucinations and limited context-aware reasoning. We introduce SmartAuditFlow, a novel Plan-Execute framework that enhances smart contract security analysis through dynamic audit planning and structured execution. Unlike conventional LLM-based auditing approaches that follow fixed workflows and predefined steps, SmartAuditFlow dynamically generates and refines audit plans based on the unique characteristics of each smart contract. It continuously adjusts its auditing strategy in response to intermediate LLM outputs and newly detected vulnerabilities, ensuring a more adaptive and precise security assessment. The framework then executes these plans step by step, applying a structured reasoning process to enhance vulnerability detection accuracy while minimizing hallucinations and false positives. To further improve audit precision, SmartAuditFlow integrates iterative prompt optimization and external knowledge sources, such as static analysis tools and Retrieval-Augmented Generation (RAG). This ensures audit decisions are contextually informed and backed by real-world security knowledge, producing comprehensive security reports. Extensive evaluations across multiple benchmarks demonstrate that SmartAuditFlow outperforms existing methods, achieving 100 percent accuracy on common and critical vulnerabilities, 41.2 percent accuracy for comprehensive coverage of known smart contract weaknesses in real-world projects, and successfully identifying all 13 tested CVEs. These results highlight SmartAuditFlow's scalability, cost-effectiveness, and superior adaptability over traditional static analysis tools and contemporary LLM-based approaches, establishing it as a robust solution for automated smart contract auditing.

Paperid: 1073, https://arxiv.org/pdf/2505.13329.pdf

Abstract:
Voting advice applications (VAAs) help millions of voters understand which political parties or candidates best align with their views. This paper explores the potential risks these applications pose to the democratic process when targeted by adversarial entities. In particular, we expose 11 manipulation strategies and measure their impact using data from Switzerland's primary VAA, Smartvote, collected during the last two national elections. We find that altering application parameters, such as the matching method, can shift a party's recommendation frequency by up to 105%. Cherry-picking questionnaire items can increase party recommendation frequency by over 261%, while subtle changes to parties' or candidates' responses can lead to a 248% increase. To address these vulnerabilities, we propose adversarial robustness properties VAAs should satisfy, introduce empirical metrics for assessing the resilience of various matching methods, and suggest possible avenues for research toward mitigating the effect of manipulation. Our framework is key to ensuring secure and reliable AI-based VAAs poised to emerge in the near future.

Paperid: 1074, https://arxiv.org/pdf/2505.12239.pdf

Abstract:
The development of artificial intelligence demands that models incrementally update knowledge by Continual Learning (CL) to adapt to open-world environments. To meet privacy and security requirements, Continual Unlearning (CU) emerges as an important problem, aiming to sequentially forget particular knowledge acquired during the CL phase. However, existing unlearning methods primarily focus on single-shot joint forgetting and face significant limitations when applied to CU. First, most existing methods require access to the retained dataset for re-training or fine-tuning, violating the inherent constraint in CL that historical data cannot be revisited. Second, these methods often suffer from a poor trade-off between system efficiency and model fidelity, making them vulnerable to being overwhelmed or degraded by adversaries through deliberately frequent requests. In this paper, we identify that the limitations of existing unlearning methods stem fundamentally from their reliance on gradient-based updates. To bridge the research gap at its root, we propose a novel gradient-free method for CU, named Analytic Continual Unlearning (ACU), for efficient and exact forgetting with historical data privacy preservation. In response to each unlearning request, our ACU recursively derives an analytical (i.e., closed-form) solution in an interpretable manner using the least squares method. Theoretical and experimental evaluations validate the superiority of our ACU on unlearning effectiveness, model fidelity, and system efficiency.

Paperid: 1075, https://arxiv.org/pdf/2505.11710.pdf

Abstract:
Modern enterprise networks increasingly rely on Active Directory (AD) for identity and access management. However, this centralization exposes a single point of failure, allowing adversaries to compromise high-value assets. Existing AD defense approaches often assume static attacker behavior, but real-world adversaries adapt dynamically, rendering such methods brittle. To address this, we model attacker-defender interactions in AD as a Stackelberg game between an adaptive attacker and a proactive defender. We propose a co-evolutionary defense framework that combines Graph Neural Network Approximated Dynamic Programming (GNNDP) to model attacker strategies, with Evolutionary Diversity Optimization (EDO) to generate resilient blocking strategies. To ensure scalability, we introduce a Fixed-Parameter Tractable (FPT) graph reduction method that reduces complexity while preserving strategic structure. Our framework jointly refines attacker and defender policies to improve generalization and prevent premature convergence. Experiments on synthetic AD graphs show near-optimal results (within 0.1 percent of optimality on r500) and improved performance on larger graphs (r1000 and r2000), demonstrating the framework's scalability and effectiveness.

Paperid: 1076, https://arxiv.org/pdf/2505.10315.pdf

Abstract:
Transformer models have revolutionized AI, powering applications like content generation and sentiment analysis. However, their deployment in Machine Learning as a Service (MLaaS) raises significant privacy concerns, primarily due to the centralized processing of sensitive user data. Private Transformer Inference (PTI) offers a solution by utilizing cryptographic techniques such as secure multi-party computation and homomorphic encryption, enabling inference while preserving both user data and model privacy. This paper reviews recent PTI advancements, highlighting state-of-the-art solutions and challenges. We also introduce a structured taxonomy and evaluation framework for PTI, focusing on balancing resource efficiency with privacy and bridging the gap between high-performance inference and data privacy.

Paperid: 1077, https://arxiv.org/pdf/2505.08255.pdf

Abstract:
With the advancement of AI generative techniques, Deepfake faces have become incredibly realistic and nearly indistinguishable to the human eye. To counter this, Deepfake detectors have been developed as reliable tools for assessing face authenticity. These detectors are typically developed on Deep Neural Networks (DNNs) and trained using third-party datasets. However, this protocol raises a new security risk that can seriously undermine the trustfulness of Deepfake detectors: Once the third-party data providers insert poisoned (corrupted) data maliciously, Deepfake detectors trained on these datasets will be injected ``backdoors'' that cause abnormal behavior when presented with samples containing specific triggers. This is a practical concern, as third-party providers may distribute or sell these triggers to malicious users, allowing them to manipulate detector performance and escape accountability. This paper investigates this risk in depth and describes a solution to stealthily infect Deepfake detectors. Specifically, we develop a trigger generator, that can synthesize passcode-controlled, semantic-suppression, adaptive, and invisible trigger patterns, ensuring both the stealthiness and effectiveness of these triggers. Then we discuss two poisoning scenarios, dirty-label poisoning and clean-label poisoning, to accomplish the injection of backdoors. Extensive experiments demonstrate the effectiveness, stealthiness, and practicality of our method compared to several baselines.

Paperid: 1078, https://arxiv.org/pdf/2505.05751.pdf

Abstract:
Federated learning (FL) enables collaborative model training while preserving user data privacy by keeping data local. Despite these advantages, FL remains vulnerable to privacy attacks on user updates and model parameters during training and deployment. Secure aggregation protocols have been proposed to protect user updates by encrypting them, but these methods often incur high computational costs and are not resistant to quantum computers. Additionally, differential privacy (DP) has been used to mitigate privacy leakages, but existing methods focus on secure aggregation or DP, neglecting their potential synergies. To address these gaps, we introduce Beskar, a novel framework that provides post-quantum secure aggregation, optimizes computational overhead for FL settings, and defines a comprehensive threat model that accounts for a wide spectrum of adversaries. We also integrate DP into different stages of FL training to enhance privacy protection in diverse scenarios. Our framework provides a detailed analysis of the trade-offs between security, performance, and model accuracy, representing the first thorough examination of secure aggregation protocols combined with various DP approaches for post-quantum secure FL. Beskar aims to address the pressing privacy and security issues FL while ensuring quantum-safety and robust performance.

Paperid: 1079, https://arxiv.org/pdf/2505.03455.pdf

Abstract:
Voice authentication systems remain susceptible to two major threats: backdoor triggered attacks and targeted data poisoning attacks. This dual vulnerability is critical because conventional solutions typically address each threat type separately, leaving systems exposed to adversaries who can exploit both attacks simultaneously. We propose a unified defense framework that effectively addresses both BTA and TDPA. Our framework integrates a frequency focused detection mechanism that flags covert pitch boosting and sound masking backdoor attacks in near real time, followed by a convolutional neural network that addresses TDPA. This dual layered defense approach utilizes multidimensional acoustic features to isolate anomalous signals without requiring costly model retraining. In particular, our PBSM detection mechanism can seamlessly integrate into existing voice authentication pipelines and scale effectively for large scale deployments. Experimental results on benchmark datasets and their compression with the state of the art algorithm demonstrate that our PBSM detection mechanism outperforms the state of the art. Our framework reduces attack success rates to as low as five to fifteen percent while maintaining a recall rate of up to ninety five percent in recognizing TDPA.

Paperid: 1080, https://arxiv.org/pdf/2505.03345.pdf

Abstract:
The swift spread of fake news and disinformation campaigns poses a significant threat to public trust, political stability, and cybersecurity. Traditional Cyber Threat Intelligence (CTI) approaches, which rely on low-level indicators such as domain names and social media handles, are easily evaded by adversaries who frequently modify their online infrastructure. To address these limitations, we introduce a novel CTI framework that focuses on high-level, semantic indicators derived from recurrent narratives and relationships of disinformation campaigns. Our approach extracts structured CTI indicators from unstructured disinformation content, capturing key entities and their contextual dependencies within fake news using Large Language Models (LLMs). We further introduce FakeCTI, the first dataset that systematically links fake news to disinformation campaigns and threat actors. To evaluate the effectiveness of our CTI framework, we analyze multiple fake news attribution techniques, spanning from traditional Natural Language Processing (NLP) to fine-tuned LLMs. This work shifts the focus from low-level artifacts to persistent conceptual structures, establishing a scalable and adaptive approach to tracking and countering disinformation campaigns.

Paperid: 1081, https://arxiv.org/pdf/2505.00593.pdf

Abstract:
The security of image data in the Internet of Things (IoT) and edge networks is crucial due to the increasing deployment of intelligent systems for real-time decision-making. Traditional encryption algorithms such as AES and RSA are computationally expensive for resource-constrained IoT devices and ineffective for large-volume image data, leading to inefficiencies in privacy-preserving distributed learning applications. To address these concerns, this paper proposes a novel Feature-Aware Chaotic Image Encryption scheme that integrates Feature-Aware Pixel Segmentation (FAPS) with Chaotic Chain Permutation and Confusion mechanisms to enhance security while maintaining efficiency. The proposed scheme consists of three stages: (1) FAPS, which extracts and reorganizes pixels based on high and low edge intensity features for correlation disruption; (2) Chaotic Chain Permutation, which employs a logistic chaotic map with SHA-256-based dynamically updated keys for block-wise permutation; and (3) Chaotic chain Confusion, which utilises dynamically generated chaotic seed matrices for bitwise XOR operations. Extensive security and performance evaluations demonstrate that the proposed scheme significantly reduces pixel correlation -- almost zero, achieves high entropy values close to 8, and resists differential cryptographic attacks. The optimum design of the proposed scheme makes it suitable for real-time deployment in resource-constrained environments.

Paperid: 1082, https://arxiv.org/pdf/2504.21199.pdf

Abstract:
We study the problem of reconstructing tabular data from aggregate statistics, in which the attacker aims to identify interesting claims about the sensitive data that can be verified with 100% certainty given the aggregates. Successful attempts in prior work have conducted studies in settings where the set of published statistics is rich enough that entire datasets can be reconstructed with certainty. In our work, we instead focus on the regime where many possible datasets match the published statistics, making it impossible to reconstruct the entire private dataset perfectly (i.e., when approaches in prior work fail). We propose the problem of partial data reconstruction, in which the goal of the adversary is to instead output a $\textit{subset}$ of rows and/or columns that are $\textit{guaranteed to be correct}$. We introduce a novel integer programming approach that first $\textbf{generates}$ a set of claims and then $\textbf{verifies}$ whether each claim holds for all possible datasets consistent with the published aggregates. We evaluate our approach on the housing-level microdata from the U.S. Decennial Census release, demonstrating that privacy violations can still persist even when information published about such data is relatively sparse.

Paperid: 1083, https://arxiv.org/pdf/2504.21072.pdf

Abstract:
The expansion of large-scale text-to-image diffusion models has raised growing concerns about their potential to generate undesirable or harmful content, ranging from fabricated depictions of public figures to sexually explicit images. To mitigate these risks, prior work has devised machine unlearning techniques that attempt to erase unwanted concepts through fine-tuning. However, in this paper, we introduce a new threat model, Toxic Erasure (ToxE), and demonstrate how recent unlearning algorithms, including those explicitly designed for robustness, can be circumvented through targeted backdoor attacks. The threat is realized by establishing a link between a trigger and the undesired content. Subsequent unlearning attempts fail to erase this link, allowing adversaries to produce harmful content. We instantiate ToxE via two established backdoor attacks: one targeting the text encoder and another manipulating the cross-attention layers. Further, we introduce Deep Intervention Score-based Attack (DISA), a novel, deeper backdoor attack that optimizes the entire U-Net using a score-based objective, improving the attack's persistence across different erasure methods. We evaluate five recent concept erasure methods against our threat model. For celebrity identity erasure, our deep attack circumvents erasure with up to 82% success, averaging 57% across all erasure methods. For explicit content erasure, ToxE attacks can elicit up to 9 times more exposed body parts, with DISA yielding an average increase by a factor of 2.9. These results highlight a critical security gap in current unlearning strategies.

Paperid: 1084, https://arxiv.org/pdf/2504.20934.pdf

Abstract:
Transient execution vulnerabilities have emerged as a critical threat to modern processors. Hardware fuzzing testing techniques have recently shown promising results in discovering transient execution bugs in large-scale out-of-order processor designs. However, their poor microarchitectural controllability and observability prevent them from effectively and efficiently detecting transient execution vulnerabilities. This paper proposes DejaVuzz, a novel pre-silicon stage processor transient execution bug fuzzer. DejaVuzz utilizes two innovative operating primitives: dynamic swappable memory and differential information flow tracking, enabling more effective and efficient transient execution vulnerability detection. The dynamic swappable memory enables the isolation of different instruction streams within the same address space. Leveraging this capability, DejaVuzz generates targeted training for arbitrary transient windows and eliminates ineffective training, enabling efficient triggering of diverse transient windows. The differential information flow tracking aids in observing the propagation of sensitive data across the microarchitecture. Based on taints, DejaVuzz designs the taint coverage matrix to guide mutation and uses taint liveness annotations to identify exploitable leakages. Our evaluation shows that DejaVuzz outperforms the state-of-the-art fuzzer SpecDoctor, triggering more comprehensive transient windows with lower training overhead and achieving a 4.7x coverage improvement. And DejaVuzz also mitigates control flow over-tainting with acceptable overhead and identifies 5 previously undiscovered transient execution vulnerabilities (with 6 CVEs assigned) on BOOM and XiangShan.

Paperid: 1085, https://arxiv.org/pdf/2504.20556.pdf

Abstract:
Side-channel attacks (SCAs) pose a serious threat to system security by extracting secret keys through physical leakages such as power consumption, timing variations, and electromagnetic emissions. Among existing countermeasures, artificial noise injection is recognized as one of the most effective techniques. However, its high power consumption poses a major challenge for resource-constrained systems such as Internet of Things (IoT) devices, motivating the development of more efficient protection schemes. In this paper, we model SCAs as a communication channel and aim to suppress information leakage by minimizing the mutual information between the secret information and side-channel observations, subject to a power constraint on the artificial noise. We propose an optimal artificial noise injection method to minimize the mutual information in systems with Gaussian inputs. Specifically, we formulate two convex optimization problems: 1) minimizing the total mutual information, and 2) minimizing the maximum mutual information across observations. Numerical results show that the proposed methods significantly reduce both total and maximum mutual information compared to conventional techniques, confirming their effectiveness for resource-constrained, security-critical systems.

Paperid: 1086, https://arxiv.org/pdf/2504.19418.pdf

Abstract:
The increasing complexity and cost of manufacturing monolithic chips have driven the semiconductor industry toward chiplet-based designs, where smaller and modular chiplets are integrated onto a single interposer. While chiplet architectures offer significant advantages, such as improved yields, design flexibility, and cost efficiency, they introduce new security challenges in the horizontal hardware manufacturing supply chain. These challenges include risks of hardware Trojans, cross-die side-channel and fault injection attacks, probing of chiplet interfaces, and intellectual property theft. To address these concerns, this paper presents \textit{ChipletQuake}, a novel on-chiplet framework for verifying the physical security and integrity of adjacent chiplets during the post-silicon stage. By sensing the impedance of the power delivery network (PDN) of the system, \textit{ChipletQuake} detects tamper events in the interposer and neighboring chiplets without requiring any direct signal interface or additional hardware components. Fully compatible with the digital resources of FPGA-based chiplets, this framework demonstrates the ability to identify the insertion of passive and subtle malicious circuits, providing an effective solution to enhance the security of chiplet-based systems. To validate our claims, we showcase how our framework detects Hardware Trojan and interposer tampering.

Paperid: 1087, https://arxiv.org/pdf/2504.14654.pdf

Abstract:
Lack of memory-safety and exposure to side channels are two prominent, persistent challenges for the secure implementation of software. Memory-safe programming languages promise to significantly reduce the prevalence of memory-safety bugs, but make it more difficult to implement side-channel-resistant code. We aim to address both memory-safety and side-channel resistance by augmenting memory-safe hardware with the ability for data-oblivious programming. We describe an extension to the CHERI capability architecture to provide blinded capabilities that allow data-oblivious computation to be carried out by userspace tasks. We also present BLACKOUT, our realization of blinded capabilities on a FPGA softcore based on the speculative out-of-order CHERI-Toooba processor and extend the CHERI-enabled Clang/LLVM compiler and the CheriBSD operating system with support for blinded capabilities. BLACKOUT makes writing side-channel-resistant code easier by making non-data-oblivious operations via blinded capabilities explicitly fault. Through rigorous evaluation we show that BLACKOUT ensures memory operated on through blinded capabilities is securely allocated, used, and reclaimed and demonstrate that, in benchmarks comparable to those used by previous work, BLACKOUT imposes only a small performance degradation (1.5% geometric mean) compared to the baseline CHERI-Toooba processor.

Paperid: 1088, https://arxiv.org/pdf/2504.13811.pdf

Abstract:
WebShell attacks, where malicious scripts are injected into web servers, pose a significant cybersecurity threat. Traditional ML and DL methods are often hampered by challenges such as the need for extensive training data, catastrophic forgetting, and poor generalization. Recently, Large Language Models have emerged as powerful alternatives for code-related tasks, but their potential in WebShell detection remains underexplored. In this paper, we make two contributions: (1) a comprehensive evaluation of seven LLMs, including GPT-4, LLaMA 3.1 70B, and Qwen 2.5 variants, benchmarked against traditional sequence- and graph-based methods using a dataset of 26.59K PHP scripts, and (2) the Behavioral Function-Aware Detection (BFAD) framework, designed to address the specific challenges of applying LLMs to this domain. Our framework integrates three components: a Critical Function Filter that isolates malicious PHP function calls, a Context-Aware Code Extraction strategy that captures the most behaviorally indicative code segments, and Weighted Behavioral Function Profiling that enhances in-context learning by prioritizing the most relevant demonstrations based on discriminative function-level profiles. Our results show that, stemming from their distinct analytical strategies, larger LLMs achieve near-perfect precision but lower recall, while smaller models exhibit the opposite trade-off. However, all baseline models lag behind previous SOTA methods. With the application of BFAD, the performance of all LLMs improves significantly, yielding an average F1 score increase of 13.82%. Notably, larger models now outperform SOTA benchmarks, while smaller models such as Qwen-2.5-Coder-3B achieve performance competitive with traditional methods. This work is the first to explore the feasibility and limitations of LLMs for WebShell detection and provides solutions to address the challenges in this task.

Paperid: 1089, https://arxiv.org/pdf/2504.13598.pdf

Abstract:
Cryptocurrency blockchains, beyond their primary role as distributed payment systems, are increasingly used to store and share arbitrary content, such as text messages and files. Although often non-financial, this hidden content can impact price movements by conveying private information, shaping sentiment, and influencing public opinion. However, current analyses of such data are limited in scope and scalability, primarily relying on manual classification or hand-crafted heuristics. In this work, we address these limitations by employing Natural Language Processing techniques to analyze, detect patterns, and extract public sentiment encoded within blockchain transactional data. Using a variety of Machine Learning techniques, we showcase for the first time the predictive power of blockchain-embedded sentiment in forecasting cryptocurrency price movements on the Bitcoin and Ethereum blockchains. Our findings shed light on a previously underexplored source of freely available, transparent, and immutable data and introduce blockchain sentiment analysis as a novel and robust framework for enhancing financial predictions in cryptocurrency markets. Incidentally, we discover an asymmetry between cryptocurrencies; Bitcoin has an informational advantage over Ethereum in that the sentiment embedded into transactional data is sufficient to predict its price movement.

Paperid: 1090, https://arxiv.org/pdf/2504.11633.pdf

Abstract:
Static side-channel analysis attacks, which rely on a stopped clock to extract sensitive information, pose a growing threat to embedded systems' security. To protect against such attacks, several proposed defenses aim to detect unexpected variations in the clock signal and clear sensitive states. In this work, we present \emph{Chypnosis}, an undervolting attack technique that indirectly stops the target circuit clock, while retaining stored data. Crucially, Chypnosis also blocks the state clearing stage of prior defenses, allowing recovery of secret information even in their presence. However, basic undervolting is not sufficient in the presence of voltage sensors designed to handle fault injection via voltage tampering. To overcome such defenses, we observe that rapidly dropping the supply voltage can disable the response mechanism of voltage sensor systems. We implement Chypnosis on various FPGAs, demonstrating the successful bypass of their sensors, both in the form of soft and hard IPs. To highlight the real-world applicability of Chypnosis, we show that the alert handler of the OpenTitan root-of-trust, responsible for providing hardware responses to threats, can be bypassed. Furthermore, we demonstrate that by combining Chypnosis with static side-channel analysis techniques, namely laser logic state imaging (LLSI) and impedance analysis (IA), we can extract sensitive information from a side-channel protected cryptographic module used in OpenTitan, even in the presence of established clock and voltage sensors. Finally, we propose and implement an improvement to an established FPGA-compatible clock detection countermeasure, and we validate its resilience against Chypnosis.

Paperid: 1091, https://arxiv.org/pdf/2504.11106.pdf

Abstract:
Recent advancements in Text-to-Image (T2I) generation have significantly enhanced the realism and creativity of generated images. However, such powerful generative capabilities pose risks related to the production of inappropriate or harmful content. Existing defense mechanisms, including prompt checkers and post-hoc image checkers, are vulnerable to sophisticated adversarial attacks. In this work, we propose TCBS-Attack, a novel query-based black-box jailbreak attack that searches for tokens located near the decision boundaries defined by text and image checkers. By iteratively optimizing tokens near these boundaries, TCBS-Attack generates semantically coherent adversarial prompts capable of bypassing multiple defensive layers in T2I models. Extensive experiments demonstrate that our method consistently outperforms state-of-the-art jailbreak attacks across various T2I models, including securely trained open-source models and commercial online services like DALL-E 3. TCBS-Attack achieves an ASR-4 of 45\% and an ASR-1 of 21\% on jailbreaking full-chain T2I models, significantly surpassing baseline methods.

Paperid: 1092, https://arxiv.org/pdf/2504.08254.pdf

Abstract:
Privacy attacks, particularly membership inference attacks (MIAs), are widely used to assess the privacy of generative models for tabular synthetic data, including those with Differential Privacy (DP) guarantees. These attacks often exploit outliers, which are especially vulnerable due to their position at the boundaries of the data domain (e.g., at the minimum and maximum values). However, the role of data domain extraction in generative models and its impact on privacy attacks have been overlooked. In this paper, we examine three strategies for defining the data domain: assuming it is externally provided (ideally from public data), extracting it directly from the input data, and extracting it with DP mechanisms. While common in popular implementations and libraries, we show that the second approach breaks end-to-end DP guarantees and leaves models vulnerable. While using a provided domain (if representative) is preferable, extracting it with DP can also defend against popular MIAs, even at high privacy budgets.

Paperid: 1093, https://arxiv.org/pdf/2504.06923.pdf

Abstract:
Differentially Private (DP) generative marginal models are often used in the wild to release synthetic tabular datasets in lieu of sensitive data while providing formal privacy guarantees. These models approximate low-dimensional marginals or query workloads; crucially, they require the training data to be pre-discretized, i.e., continuous values need to first be partitioned into bins. However, as the range of values (or their domain) is often inferred directly from the training data, with the number of bins and bin edges typically defined arbitrarily, this approach can ultimately break end-to-end DP guarantees and may not always yield optimal utility. In this paper, we present an extensive measurement study of four discretization strategies in the context of DP marginal generative models. More precisely, we design DP versions of three discretizers (uniform, quantile, and k-means) and reimplement the PrivTree algorithm. We find that optimizing both the choice of discretizer and bin count can improve utility, on average, by almost 30% across six DP marginal models, compared to the default strategy and number of bins, with PrivTree being the best-performing discretizer in the majority of cases. We demonstrate that, while DP generative models with non-private discretization remain vulnerable to membership inference attacks, applying DP during discretization effectively mitigates this risk. Finally, we improve on an existing approach for automatically selecting the optimal number of bins, and achieve high utility while reducing both privacy budget consumption and computational overhead.

Paperid: 1094, https://arxiv.org/pdf/2504.04367.pdf

Abstract:
In the era of data expansion, ensuring data privacy has become increasingly critical, posing significant challenges to traditional AI-based applications. In addition, the increasing adoption of IoT devices has introduced significant cybersecurity challenges, making traditional Network Intrusion Detection Systems (NIDS) less effective against evolving threats, and privacy concerns and regulatory restrictions limit their deployment. Federated Learning (FL) has emerged as a promising solution, allowing decentralized model training while maintaining data privacy to solve these issues. However, despite implementing privacy-preserving technologies, FL systems remain vulnerable to adversarial attacks. Furthermore, data distribution among clients is not heterogeneous in the FL scenario. We propose WeiDetect, a two-phase, server-side defense mechanism for FL-based NIDS that detects malicious participants to address these challenges. In the first phase, local models are evaluated using a validation dataset to generate validation scores. These scores are then analyzed using a Weibull distribution, identifying and removing malicious models. We conducted experiments to evaluate the effectiveness of our approach in diverse attack settings. Our evaluation included two popular datasets, CIC-Darknet2020 and CSE-CIC-IDS2018, tested under non-IID data distributions. Our findings highlight that WeiDetect outperforms state-of-the-art defense approaches, improving higher target class recall up to 70% and enhancing the global model's F1 score by 1% to 14%.

Paperid: 1095, https://arxiv.org/pdf/2504.00441.pdf

Abstract:
As large language models (LLMs) and generative AI become widely adopted, guardrails have emerged as a key tool to ensure their safe use. However, adding guardrails isn't without tradeoffs; stronger security measures can reduce usability, while more flexible systems may leave gaps for adversarial attacks. In this work, we explore whether current guardrails effectively prevent misuse while maintaining practical utility. We introduce a framework to evaluate these tradeoffs, measuring how different guardrails balance risk, security, and usability, and build an efficient guardrail. Our findings confirm that there is no free lunch with guardrails; strengthening security often comes at the cost of usability. To address this, we propose a blueprint for designing better guardrails that minimize risk while maintaining usability. We evaluate various industry guardrails, including Azure Content Safety, Bedrock Guardrails, OpenAI's Moderation API, Guardrails AI, Nemo Guardrails, and Enkrypt AI guardrails. Additionally, we assess how LLMs like GPT-4o, Gemini 2.0-Flash, Claude 3.5-Sonnet, and Mistral Large-Latest respond under different system prompts, including simple prompts, detailed prompts, and detailed prompts with chain-of-thought (CoT) reasoning. Our study provides a clear comparison of how different guardrails perform, highlighting the challenges in balancing security and usability.

Paperid: 1096, https://arxiv.org/pdf/2503.19339.pdf

Abstract:
The ever-increasing security vulnerabilities in the Internet-of-Things (IoT) systems require improved threat detection approaches. This paper presents a compact and efficient approach to detect botnet attacks by employing an integrated approach that consists of traffic pattern analysis, temporal support learning, and focused feature extraction. The proposed attention-based model benefits from a hybrid CNN-BiLSTM architecture and achieves 99% classification accuracy in detecting botnet attacks utilizing the N-BaIoT dataset, while maintaining high precision and recall across various scenarios. The proposed model's performance is further validated by key parameters, such as Mathews Correlation Coefficient and Cohen's kappa Correlation Coefficient. The close-to-ideal results for these parameters demonstrate the proposed model's ability to detect botnet attacks accurately and efficiently in practical settings and on unseen data. The proposed model proved to be a powerful defence mechanism for IoT networks to face emerging security challenges.

Paperid: 1097, https://arxiv.org/pdf/2503.17847.pdf

Abstract:
Multi-GPU systems are becoming increasingly important in highperformance computing (HPC) and cloud infrastructure, providing acceleration for data-intensive applications, including machine learning workloads. These systems consist of multiple GPUs interconnected through high-speed networking links such as NVIDIA's NVLink. In this work, we explore whether the interconnect on such systems can offer a novel source of leakage, enabling new forms of covert and side-channel attacks. Specifically, we reverse engineer the operations of NVlink and identify two primary sources of leakage: timing variations due to contention and accessible performance counters that disclose communication patterns. The leakage is visible remotely and even across VM instances in the cloud, enabling potentially dangerous attacks. Building on these observations, we develop two types of covert-channel attacks across two GPUs, achieving a bandwidth of over 70 Kbps with an error rate of 4.78% for the contention channel. We develop two end-to-end crossGPU side-channel attacks: application fingerprinting (including 18 high-performance computing and deep learning applications) and 3D graphics character identification within Blender, a multi-GPU rendering application. These attacks are highly effective, achieving F1 scores of up to 97.78% and 91.56%, respectively. We also discover that leakage surprisingly occurs across Virtual Machines on the Google Cloud Platform (GCP) and demonstrate a side-channel attack on Blender, achieving F1 scores exceeding 88%. We also explore potential defenses such as managing access to counters and reducing the resolution of the clock to mitigate the two sources of leakage.

Paperid: 1098, https://arxiv.org/pdf/2503.16104.pdf

Abstract:
One approach to risk-limiting audits (RLAs) compares randomly selected cast vote records (CVRs) to votes read by human auditors from the corresponding ballot cards. Historically, such methods reduce audit sample sizes by considering how each sampled CVR differs from the corresponding true vote, not merely whether they differ. Here we investigate the latter approach, auditing by testing whether the total number of mismatches in the full set of CVRs exceeds the minimum number of CVR errors required for the reported outcome to be wrong (the "CVR margin"). This strategy makes it possible to audit more social choice functions and simplifies RLAs conceptually, which makes it easier to explain than some other RLA approaches. The cost is larger sample sizes. "Mismatch-based RLAs" only require a lower bound on the CVR margin, which for some social choice functions is easier to calculate than the effect of particular errors. When the population rate of mismatches is low and the lower bound on the CVR margin is close to the true CVR margin, the increase in sample size is small. However, the increase may be very large when errors include errors that, if corrected, would widen the CVR margin rather than narrow it; errors affect the margin between candidates other than the reported winner with the fewest votes and the reported loser with the most votes; or errors that affect different margins.

Paperid: 1099, https://arxiv.org/pdf/2503.14803.pdf

Abstract:
Constructing efficient risk-limiting audits (RLAs) for multiwinner single transferable vote (STV) elections is a challenging problem. An STV RLA is designed to statistically verify that the reported winners of an election did indeed win according to the voters' expressed preferences and not due to mistabulation or interference, while limiting the risk of accepting an incorrect outcome to a desired threshold (the risk limit). Existing methods have shown that it is possible to form RLAs for two-seat STV elections in the context where the first seat has been awarded to a candidate in the first round of tabulation. This is called the first winner criterion. We present an assertion-based approach to conducting full or partial RLAs for STV elections with three or more seats, in which the first winner criterion is satisfied. Although the chance of forming a full audit that verifies all winners drops substantially as the number of seats increases, we show that we can quite often form partial audits that verify most, and sometimes all, of the reported winners. We evaluate our method on a dataset of over 500 three- and four-seat STV elections from the 2017 and 2022 local council elections in Scotland.

Paperid: 1100, https://arxiv.org/pdf/2503.12188.pdf

Abstract:
Multi-agent systems coordinate LLM-based agents to perform tasks on users' behalf. In real-world applications, multi-agent systems will inevitably interact with untrusted inputs, such as malicious Web content, files, email attachments, and more. Using several recently proposed multi-agent frameworks as concrete examples, we demonstrate that adversarial content can hijack control and communication within the system to invoke unsafe agents and functionalities. This results in a complete security breach, up to execution of arbitrary malicious code on the user's device or exfiltration of sensitive data from the user's containerized environment. For example, when agents are instantiated with GPT-4o, Web-based attacks successfully cause the multi-agent system execute arbitrary malicious code in 58-90\% of trials (depending on the orchestrator). In some model-orchestrator configurations, the attack success rate is 100\%. We also demonstrate that these attacks succeed even if individual agents are not susceptible to direct or indirect prompt injection, and even if they refuse to perform harmful actions. We hope that these results will motivate development of trust and security models for multi-agent systems before they are widely deployed.

Paperid: 1101, https://arxiv.org/pdf/2503.09953.pdf

Abstract:
In this digital age, ensuring the security of digital data, especially the image data is critically important. Image encryption plays an important role in securing the online transmission/storage of images from unauthorized access. In this regard, this paper presents a novel diffusion-confusion-based image encryption algorithm named as X-CROSS. The diffusion phase involves a dual-layer block permutation. It involves a bit-level permutation termed Inter-Bit Transference (IBT) using a Bit-Extraction key, and pixel permutation with a unique X-crosspermutation algorithm to effectively scramble the pixels within an image. The proposed algorithm utilizes a resilient 2D chaotic map with non-linear dynamical behavior, assisting in generating complex Extraction Keys. After the permutation phase, the confusion phase proceeds with a dynamic substitution technique on the permuted images, establishing the final encryption layer. This combination of novel permutation and confusion results in the removal of the image's inherent patterns and increases its resistance to cyber-attacks. The close to ideal statistical security results for information entropy, correlation, homogeneity, contrast, and energy validate the proposed scheme's effectiveness in hiding the information within the image.

Paperid: 1102, https://arxiv.org/pdf/2503.09939.pdf

Abstract:
In this digital era, ensuring the security of digital data during transmission and storage is crucial. Digital data, particularly image data, needs to be protected against unauthorized access. To address this, this paper presents a novel image encryption scheme based on a confusion diffusion architecture. The diffusion module introduces a novel geometric block permutation technique, which effectively scrambles the pixels based on geometric shape extraction of pixels. The image is converted into four blocks, and pixels are extracted from these blocks using L-shape, U-shape, square-shape, and inverted U-shape patterns for each block, respectively. This robust extraction and permutation effectively disrupts the correlation within the image. Furthermore, the confusion module utilises bit-XOR and dynamic substitution techniques. For the bit-XOR operation, 2D Henon map has been utilised to generate a chaotic seed matrix, which is bit-XORed with the scrambled image. The resultant image then undergoes the dynamic substitution process to complete confusion phase. A statistical security analysis demonstrates the superior security of the proposed scheme, with being high uncertainty and unpredictability, achieving an entropy of 7.9974 and a correlation coefficient of 0.0014. These results validate the proposed scheme's effectiveness in securing digital images.

Paperid: 1103, https://arxiv.org/pdf/2503.09047.pdf

Abstract:
Threshold Signature Scheme (TSS) protocols have gained significant attention over the past ten years due to their widespread adoption in cryptocurrencies. The adoption is mainly boosted by Gennaro and Goldfedder's TSS protocol. Since then, various TSS protocols have been introduced with different features, such as security and performance, etc. Large organizations are using TSS protocols to protect many digital assets, such as cryptocurrency. However, the adoption of these TSS protocols requires an understanding of state-of-the-art research in threshold signing. This study describes the holistic view of TSS protocols, evaluates cutting-edge TSS protocols, highlights their characteristics, and compares them in terms of security and performance. The evaluation of these TSS protocols will help the researchers address real-world problems by considering the relevant merits of different TSS protocols.

Paperid: 1104, https://arxiv.org/pdf/2503.09041.pdf

Abstract:
Electromyography (EMG) is extensively used in key biomedical areas, such as prosthetics, and assistive and interactive technologies. This paper presents a new hybrid neural network named ConSGruNet for precise and efficient hand gesture recognition. The proposed model comprises convolutional neural networks with smart skip connections in conjunction with a Gated Recurrent Unit (GRU). The proposed model is trained on the complete Ninapro DB1 dataset. The proposed model boasts an accuracy of 99.7\% in classifying 53 classes in just 25 milliseconds. In addition to being fast, the proposed model is lightweight with just 3,946 KB in size. Moreover, the proposed model has also been evaluated for the reliability parameters, i.e., Cohen's kappa coefficient, Matthew's correlation coefficient, and confidence intervals. The close to ideal results of these parameters validate the models performance on unseen data.

Paperid: 1105, https://arxiv.org/pdf/2503.09038.pdf

Abstract:
Securing image data in IoT networks and other insecure information channels is a matter of critical concern. This paper presents a new image encryption scheme using DNA encoding, snake permutation and chaotic substitution techniques that ensures robust security of the image data with reduced computational overhead. The DNA encoding and snake permutation modules ensure effective scrambling of the pixels and result in efficient diffusion in the plaintext image. For the confusion part, the chaotic substitution technique is implemented, which substitutes the pixel values chosen randomly from 3 S-boxes. Extensive security analysis validate the efficacy of the image encryption algorithm proposed in this paper and results demonstrate that the encrypted images have an ideal information entropy of 7.9895 and an almost zero correlation coefficient of -0.001660. These results indicate a high degree of randomness and no correlation in the encrypted image.

Paperid: 1106, https://arxiv.org/pdf/2503.03436.pdf

Abstract:
Social media has become integral to minors' daily lives and is used for various purposes, such as making friends, exploring shared interests, and engaging in educational activities. However, the increase in screen time has also led to heightened challenges, including cyberbullying, online grooming, and exploitations posed by malicious actors. Traditional content moderation techniques have proven ineffective against exploiters' evolving tactics. To address these growing challenges, we propose the EdgeAIGuard content moderation approach that is designed to protect minors from online grooming and various forms of digital exploitation. The proposed method comprises a multi-agent architecture deployed strategically at the network edge to enable rapid detection with low latency and prevent harmful content targeting minors. The experimental results show the proposed method is significantly more effective than the existing approaches.

Paperid: 1108, https://arxiv.org/pdf/2503.00062.pdf

Abstract:
Machine unlearning allows data owners to erase the impact of their specified data from trained models. Unfortunately, recent studies have shown that adversaries can recover the erased data, posing serious threats to user privacy. An effective unlearning method removes the information of the specified data from the trained model, resulting in different outputs for the same input before and after unlearning. Adversaries can exploit these output differences to conduct privacy leakage attacks, such as reconstruction and membership inference attacks. However, directly applying traditional defenses to unlearning leads to significant model utility degradation. In this paper, we introduce a Compressive Representation Forgetting Unlearning scheme (CRFU), designed to safeguard against privacy leakage on unlearning. CRFU achieves data erasure by minimizing the mutual information between the trained compressive representation (learned through information bottleneck theory) and the erased data, thereby maximizing the distortion of data. This ensures that the model's output contains less information that adversaries can exploit. Furthermore, we introduce a remembering constraint and an unlearning rate to balance the forgetting of erased data with the preservation of previously learned knowledge, thereby reducing accuracy degradation. Theoretical analysis demonstrates that CRFU can effectively defend against privacy leakage attacks. Our experimental results show that CRFU significantly increases the reconstruction mean square error (MSE), achieving a defense effect improvement of approximately $200\%$ against privacy reconstruction attacks with only $1.5\%$ accuracy degradation on MNIST.

Paperid: 1109, https://arxiv.org/pdf/2502.19883.pdf

Abstract:
Small language models (SLMs) have become increasingly prominent in the deployment on edge devices due to their high efficiency and low computational cost. While researchers continue to advance the capabilities of SLMs through innovative training strategies and model compression techniques, the security risks of SLMs have received considerably less attention compared to large language models (LLMs).To fill this gap, we provide a comprehensive empirical study to evaluate the security performance of 13 state-of-the-art SLMs under various jailbreak attacks. Our experiments demonstrate that most SLMs are quite susceptible to existing jailbreak attacks, while some of them are even vulnerable to direct harmful prompts.To address the safety concerns, we evaluate several representative defense methods and demonstrate their effectiveness in enhancing the security of SLMs. We further analyze the potential security degradation caused by different SLM techniques including architecture compression, quantization, knowledge distillation, and so on. We expect that our research can highlight the security challenges of SLMs and provide valuable insights to future work in developing more robust and secure SLMs.

Paperid: 1110, https://arxiv.org/pdf/2502.19785.pdf

Abstract:
Deep learning (DL) enabled semantic communications leverage DL to train encoders and decoders (codecs) to extract and recover semantic information. However, most semantic training datasets contain personal private information. Such concerns call for enormous requirements for specified data erasure from semantic codecs when previous users hope to move their data from the semantic system. {Existing machine unlearning solutions remove data contribution from trained models, yet usually in supervised sole model scenarios. These methods are infeasible in semantic communications that often need to jointly train unsupervised encoders and decoders.} In this paper, we investigate the unlearning problem in DL-enabled semantic communications and propose a semantic communication unlearning (SCU) scheme to tackle the problem. {SCU includes two key components. Firstly,} we customize the joint unlearning method for semantic codecs, including the encoder and decoder, by minimizing mutual information between the learned semantic representation and the erased samples. {Secondly,} to compensate for semantic model utility degradation caused by unlearning, we propose a contrastive compensation method, which considers the erased data as the negative samples and the remaining data as the positive samples to retrain the unlearned semantic models contrastively. Theoretical analysis and extensive experimental results on three representative datasets demonstrate the effectiveness and efficiency of our proposed methods.

Paperid: 1111, https://arxiv.org/pdf/2502.19770.pdf

Abstract:
With the increasing prevalence of Web-based platforms handling vast amounts of user data, machine unlearning has emerged as a crucial mechanism to uphold users' right to be forgotten, enabling individuals to request the removal of their specified data from trained models. However, the auditing of machine unlearning processes remains significantly underexplored. Although some existing methods offer unlearning auditing by leveraging backdoors, these backdoor-based approaches are inefficient and impractical, as they necessitate involvement in the initial model training process to embed the backdoors. In this paper, we propose a TAilored Posterior diffErence (TAPE) method to provide unlearning auditing independently of original model training. We observe that the process of machine unlearning inherently introduces changes in the model, which contains information related to the erased data. TAPE leverages unlearning model differences to assess how much information has been removed through the unlearning operation. Firstly, TAPE mimics the unlearned posterior differences by quickly building unlearned shadow models based on first-order influence estimation. Secondly, we train a Reconstructor model to extract and evaluate the private information of the unlearned posterior differences to audit unlearning. Existing privacy reconstructing methods based on posterior differences are only feasible for model updates of a single sample. To enable the reconstruction effective for multi-sample unlearning requests, we propose two strategies, unlearned data perturbation and unlearned influence-based division, to augment the posterior difference. Extensive experimental results indicate the significant superiority of TAPE over the state-of-the-art unlearning verification methods, at least 4.5$\times$ efficiency speedup and supporting the auditing for broader unlearning scenarios.

Paperid: 1112, https://arxiv.org/pdf/2502.19095.pdf

Abstract:
Cross-site scripting (XSS) poses a significant threat to web application security. While Deep Learning (DL) has shown remarkable success in detecting XSS attacks, it remains vulnerable to adversarial attacks due to the discontinuous nature of its input-output mapping. These adversarial attacks employ mutation-based strategies for different components of XSS attack vectors, allowing adversarial agents to iteratively select mutations to evade detection. Our work replicates a state-of-the-art XSS adversarial attack, highlighting threats to validity in the reference work and extending it toward a more effective evaluation strategy. Moreover, we introduce an XSS Oracle to mitigate these threats. The experimental results show that our approach achieves an escape rate above 96% when the threats to validity of the replicated technique are addressed.

Paperid: 1113, https://arxiv.org/pdf/2502.16406.pdf

Abstract:
The server-less nature of Decentralized Federated Learning (DFL) requires allocating the aggregation role to specific participants in each federated round. Current DFL architectures ensure the trustworthiness of the aggregator node upon selection. However, most of these studies overlook the possibility that the aggregating node may turn rogue and act maliciously after being nominated. To address this problem, this paper proposes a DFL structure, called TrustChain, that scores the aggregators before selection based on their past behavior and additionally audits them after the aggregation. To do this, the statistical independence between the client updates and the aggregated model is continuously monitored using the Hilbert-Schmidt Independence Criterion (HSIC). The proposed method relies on several principles, including blockchain, anomaly detection, and concept drift analysis. The designed structure is evaluated on several federated datasets and attack scenarios with different numbers of Byzantine nodes.

Paperid: 1114, https://arxiv.org/pdf/2502.16396.pdf

Abstract:
Federated learning systems are increasingly threatened by data poisoning attacks, where malicious clients compromise global models by contributing tampered updates. Existing defenses often rely on impractical assumptions, such as access to a central test dataset, or fail to generalize across diverse attack types, particularly those involving multiple malicious clients working collaboratively. To address this, we propose Federated Noise-Induced Activation Analysis (FedNIA), a novel defense framework to identify and exclude adversarial clients without relying on any central test dataset. FedNIA injects random noise inputs to analyze the layerwise activation patterns in client models leveraging an autoencoder that detects abnormal behaviors indicative of data poisoning. FedNIA can defend against diverse attack types, including sample poisoning, label flipping, and backdoors, even in scenarios with multiple attacking nodes. Experimental results on non-iid federated datasets demonstrate its effectiveness and robustness, underscoring its potential as a foundational approach for enhancing the security of federated learning systems.

Paperid: 1115, https://arxiv.org/pdf/2502.15334.pdf

Abstract:
Recent research has shown that carefully crafted jailbreak inputs can induce large language models to produce harmful outputs, despite safety measures such as alignment. It is important to anticipate the range of potential Jailbreak attacks to guide effective defenses and accurate assessment of model safety. In this paper, we present a new approach for generating highly effective Jailbreak attacks that manipulate the attention of the model to selectively strengthen or weaken attention among different parts of the prompt. By harnessing attention loss, we develop more effective jailbreak attacks, that are also transferrable. The attacks amplify the success rate of existing Jailbreak algorithms including GCG, AutoDAN, and ReNeLLM, while lowering their generation cost (for example, the amplified GCG attack achieves 91.2% ASR, vs. 67.9% for the original attack on Llama2-7B/AdvBench, using less than a third of the generation time).

Paperid: 1116, https://arxiv.org/pdf/2502.12958.pdf

Abstract:
Privacy concerns have led to the rise of federated recommender systems (FRS), which can create personalized models across distributed clients. However, FRS is vulnerable to poisoning attacks, where malicious users manipulate gradients to promote their target items intentionally. Existing attacks against FRS have limitations, as they depend on specific models and prior knowledge, restricting their real-world applicability. In our exploration of practical FRS vulnerabilities, we devise a model-agnostic and prior-knowledge-free attack, named PIECK (Popular Item Embedding based Attack). The core module of PIECK is popular item mining, which leverages embedding changes during FRS training to effectively identify the popular items. Built upon the core module, PIECK branches into two diverse solutions: The PIECKIPE solution employs an item popularity enhancement module, which aligns the embeddings of targeted items with the mined popular items to increase item exposure. The PIECKUEA further enhances the robustness of the attack by using a user embedding approximation module, which approximates private user embeddings using mined popular items. Upon identifying PIECK, we evaluate existing federated defense methods and find them ineffective against PIECK, as poisonous gradients inevitably overwhelm the cold target items. We then propose a novel defense method by introducing two regularization terms during user training, which constrain item popularity enhancement and user embedding approximation while preserving FRS performance. We evaluate PIECK and its defense across two base models, three real datasets, four top-tier attacks, and six general defense methods, affirming the efficacy of both PIECK and its defense.

Paperid: 1117, https://arxiv.org/pdf/2502.11379.pdf

Abstract:
Despite explicit alignment efforts for large language models (LLMs), they can still be exploited to trigger unintended behaviors, a phenomenon known as "jailbreaking." Current jailbreak attack methods mainly focus on discrete prompt manipulations targeting closed-source LLMs, relying on manually crafted prompt templates and persuasion rules. However, as the capabilities of open-source LLMs improve, ensuring their safety becomes increasingly crucial. In such an environment, the accessibility of model parameters and gradient information by potential attackers exacerbates the severity of jailbreak threats. To address this research gap, we propose a novel \underline{C}ontext-\underline{C}oherent \underline{J}ailbreak \underline{A}ttack (CCJA). We define jailbreak attacks as an optimization problem within the embedding space of masked language models. Through combinatorial optimization, we effectively balance the jailbreak attack success rate with semantic coherence. Extensive evaluations show that our method not only maintains semantic consistency but also surpasses state-of-the-art baselines in attack effectiveness. Additionally, by integrating semantically coherent jailbreak prompts generated by our method into widely used black-box methodologies, we observe a notable enhancement in their success rates when targeting closed-source commercial LLMs. This highlights the security threat posed by open-source LLMs to commercial counterparts. We will open-source our code if the paper is accepted.

Paperid: 1118, https://arxiv.org/pdf/2502.11308.pdf

Abstract:
With the growing popularity of Large Language Models (LLMs) and vector databases, private textual data is increasingly processed and stored as numerical embeddings. However, recent studies have proven that such embeddings are vulnerable to inversion attacks, where original text is reconstructed to reveal sensitive information. Previous research has largely assumed access to millions of sentences to train attack models, e.g., through data leakage or nearly unrestricted API access. With our method, a single data point is sufficient for a partially successful inversion attack. With as little as 1k data samples, performance reaches an optimum across a range of black-box encoders, without training on leaked data. We present a Few-shot Textual Embedding Inversion Attack using ALignment and GENeration (ALGEN), by aligning victim embeddings to the attack space and using a generative model to reconstruct text. We find that ALGEN attacks can be effectively transferred across domains and languages, revealing key information. We further examine a variety of defense mechanisms against ALGEN, and find that none are effective, highlighting the vulnerabilities posed by inversion attacks. By significantly lowering the cost of inversion and proving that embedding spaces can be aligned through one-step optimization, we establish a new textual embedding inversion paradigm with broader applications for embedding alignment in NLP.

Paperid: 1119, https://arxiv.org/pdf/2502.10475.pdf

Abstract:
3D Gaussian Splatting (3DGS) has been widely used in 3D reconstruction and 3D generation. Training to get a 3DGS scene often takes a lot of time and resources and even valuable inspiration. The increasing amount of 3DGS digital asset have brought great challenges to the copyright protection. However, it still lacks profound exploration targeted at 3DGS. In this paper, we propose a new framework X-SG$^2$S which can simultaneously watermark 1 to 3D messages while keeping the original 3DGS scene almost unchanged. Generally, we have a X-SG$^2$S injector for adding multi-modal messages simultaneously and an extractor for extract them. Specifically, we first split the watermarks into message patches in a fixed manner and sort the 3DGS points. A self-adaption gate is used to pick out suitable location for watermarking. Then use a XD(multi-dimension)-injection heads to add multi-modal messages into sorted 3DGS points. A learnable gate can recognize the location with extra messages and XD-extraction heads can restore hidden messages from the location recommended by the learnable gate. Extensive experiments demonstrated that the proposed X-SG$^2$S can effectively conceal multi modal messages without changing pretrained 3DGS pipeline or the original form of 3DGS parameters. Meanwhile, with simple and efficient model structure and high practicality, X-SG$^2$S still shows good performance in hiding and extracting multi-modal inner structured or unstructured messages. X-SG$^2$S is the first to unify 1 to 3D watermarking model for 3DGS and the first framework to add multi-modal watermarks simultaneous in one 3DGS which pave the wave for later researches.

Paperid: 1120, https://arxiv.org/pdf/2502.10110.pdf

Abstract:
With the rise of sophisticated scam websites that exploit human psychological vulnerabilities, distinguishing between legitimate and scam websites has become increasingly challenging. This paper presents ScamFerret, an innovative agent system employing a large language model (LLM) to autonomously collect and analyze data from a given URL to determine whether it is a scam. Unlike traditional machine learning models that require large datasets and feature engineering, ScamFerret leverages LLMs' natural language understanding to accurately identify scam websites of various types and languages without requiring additional training or fine-tuning. Our evaluation demonstrated that ScamFerret achieves 0.972 accuracy in classifying four scam types in English and 0.993 accuracy in classifying online shopping websites across three different languages, particularly when using GPT-4. Furthermore, we confirmed that ScamFerret collects and analyzes external information such as web content, DNS records, and user reviews as necessary, providing a basis for identifying scam websites from multiple perspectives. These results suggest that LLMs have significant potential in enhancing cybersecurity measures against sophisticated scam websites.

Paperid: 1121, https://arxiv.org/pdf/2502.09974.pdf

Abstract:
Prompt engineering has emerged as a powerful technique for optimizing large language models (LLMs) for specific applications, enabling faster prototyping and improved performance, and giving rise to the interest of the community in protecting proprietary system prompts. In this work, we explore a novel perspective on prompt privacy through the lens of membership inference. We develop Prompt Detective, a statistical method to reliably determine whether a given system prompt was used by a third-party language model. Our approach relies on a statistical test comparing the distributions of two groups of model outputs corresponding to different system prompts. Through extensive experiments with a variety of language models, we demonstrate the effectiveness of Prompt Detective for prompt membership inference. Our work reveals that even minor changes in system prompts manifest in distinct response distributions, enabling us to verify prompt usage with statistical significance.

Paperid: 1122, https://arxiv.org/pdf/2502.09175.pdf

Abstract:
The rapid advancement of Large Language Models (LLMs) has introduced significant challenges in moderating user-model interactions. While LLMs demonstrate remarkable capabilities, they remain vulnerable to adversarial attacks, particularly ``jailbreaking'' techniques that bypass content safety measures. Current content moderation systems, which primarily rely on input prompt filtering, have proven insufficient, with techniques like Best-of-N (BoN) jailbreaking achieving success rates of 80% or more against popular LLMs. In this paper, we introduce Flexible LLM-Assisted Moderation Engine (FLAME): a new approach that shifts the focus from input filtering to output moderation. Unlike traditional circuit-breaking methods that analyze user queries, FLAME evaluates model responses, offering several key advantages: (1) computational efficiency in both training and inference, (2) enhanced resistance to BoN jailbreaking attacks, and (3) flexibility in defining and updating safety criteria through customizable topic filtering. Our experiments demonstrate that FLAME significantly outperforms current moderation systems. For example, FLAME reduces attack success rate in GPT-4o-mini and DeepSeek-v3 by a factor of ~9, while maintaining low computational overhead. We provide comprehensive evaluation on various LLMs and analyze the engine's efficiency against the state-of-the-art jailbreaking. This work contributes to the development of more robust and adaptable content moderation systems for LLMs.

Paperid: 1123, https://arxiv.org/pdf/2502.08865.pdf

Abstract:
Extended Reality (XR) experiences involve interactions between users, the real world, and virtual content. A key step to enable these experiences is the XR headset sensing and estimating the user's pose in order to accurately place and render virtual content in the real world. XR headsets use multiple sensors (e.g., cameras, inertial measurement unit) to perform pose estimation and improve its robustness, but this provides an attack surface for adversaries to interfere with the pose estimation process. In this paper, we create and study the effects of acoustic attacks that create false signals in the inertial measurement unit (IMU) on XR headsets, leading to adverse downstream effects on XR applications. We generate resonant acoustic signals on a HoloLens 2 and measure the resulting perturbations in the IMU readings, and also demonstrate both fine-grained and coarse attacks on the popular ORB-SLAM3 and an open-source XR system (ILLIXR). With the knowledge gleaned from attacking these open-source frameworks, we demonstrate four end-to-end proof-of-concept attacks on a HoloLens 2: manipulating user input, clickjacking, zone invasion, and denial of user interaction. Our experiments show that current commercial XR headsets are susceptible to acoustic attacks, raising concerns for their security.

Paperid: 1124, https://arxiv.org/pdf/2502.08008.pdf

Abstract:
Federated learning (FL) enhances privacy by keeping user data on local devices. However, emerging attacks have demonstrated that the updates shared by users during training can reveal significant information about their data. This has greatly thwart the adoption of FL methods for training robust AI models in sensitive applications. Differential Privacy (DP) is considered the gold standard for safeguarding user data. However, DP guarantees are highly conservative, providing worst-case privacy guarantees. This can result in overestimating privacy needs, which may compromise the model's accuracy. Additionally, interpretations of these privacy guarantees have proven to be challenging in different contexts. This is further exacerbated when other factors, such as the number of training iterations, data distribution, and specific application requirements, can add further complexity to this problem. In this work, we proposed a framework that integrates a human entity as a privacy practitioner to determine an optimal trade-off between the model's privacy and utility. Our framework is the first to address the variable memory requirement of existing DP methods in FL settings, where resource-limited devices (e.g., cell phones) can participate. To support such settings, we adopt a recent DP method with fixed memory usage to ensure scalable private FL. We evaluated our proposed framework by fine-tuning a BERT-based LLM model using the GLUE dataset (a common approach in literature), leveraging the new accountant, and employing diverse data partitioning strategies to mimic real-world conditions. As a result, we achieved stable memory usage, with an average accuracy reduction of 1.33% for $Îµ= 10$ and 1.9% for $Îµ= 6$, when compared to the state-of-the-art DP accountant which does not support fixed memory usage.

Paperid: 1125, https://arxiv.org/pdf/2502.07207.pdf

Abstract:
Advanced Persistent Threats (APTs) pose a significant security risk to organizations and industries. These attacks often lead to severe data breaches and compromise the system for a long time. Mitigating these sophisticated attacks is highly challenging due to the stealthy and persistent nature of APTs. Machine learning models are often employed to tackle this challenge by bringing automation and scalability to APT detection. Nevertheless, these intelligent methods are data-driven, and thus, highly affected by the quality and relevance of input data. This paper aims to analyze measurements considered when recording network traffic and conclude which features contribute more to detecting APT samples. To do this, we study the features associated with various APT cases and determine their importance using a machine learning framework. To ensure the generalization of our findings, several feature selection techniques are employed and paired with different classifiers to evaluate their effectiveness. Our findings provide insights into how APT detection can be enhanced in real-world scenarios.

Paperid: 1126, https://arxiv.org/pdf/2502.05325.pdf

Abstract:
The advent of Machine Learning as a Service (MLaaS) has heightened the trade-off between model explainability and security. In particular, explainability techniques, such as counterfactual explanations, inadvertently increase the risk of model extraction attacks, enabling unauthorized replication of proprietary models. In this paper, we formalize and characterize the risks and inherent complexity of model reconstruction, focusing on the "oracle'' queries required for faithfully inferring the underlying prediction function. We present the first formal analysis of model extraction attacks through the lens of competitive analysis, establishing a foundational framework to evaluate their efficiency. Focusing on models based on additive decision trees (e.g., decision trees, gradient boosting, and random forests), we introduce novel reconstruction algorithms that achieve provably perfect fidelity while demonstrating strong anytime performance. Our framework provides theoretical bounds on the query complexity for extracting tree-based model, offering new insights into the security vulnerabilities of their deployment.

Paperid: 1127, https://arxiv.org/pdf/2502.05307.pdf

Abstract:
Recent research has shown that structured machine learning models such as tree ensembles are vulnerable to privacy attacks targeting their training data. To mitigate these risks, differential privacy (DP) has become a widely adopted countermeasure, as it offers rigorous privacy protection. In this paper, we introduce a reconstruction attack targeting state-of-the-art $Îµ$-DP random forests. By leveraging a constraint programming model that incorporates knowledge of the forest's structure and DP mechanism characteristics, our approach formally reconstructs the most likely dataset that could have produced a given forest. Through extensive computational experiments, we examine the interplay between model utility, privacy guarantees and reconstruction accuracy across various configurations. Our results reveal that random forests trained with meaningful DP guarantees can still leak portions of their training data. Specifically, while DP reduces the success of reconstruction attacks, the only forests fully robust to our attack exhibit predictive performance no better than a constant classifier. Building on these insights, we also provide practical recommendations for the construction of DP random forests that are more resilient to reconstruction attacks while maintaining a non-trivial predictive performance.

Paperid: 1128, https://arxiv.org/pdf/2502.03692.pdf

Abstract:
Document Visual Question Answering (DocVQA) has introduced a new paradigm for end-to-end document understanding, and quickly became one of the standard benchmarks for multimodal LLMs. Automating document processing workflows, driven by DocVQA models, presents significant potential for many business sectors. However, documents tend to contain highly sensitive information, raising concerns about privacy risks associated with training such DocVQA models. One significant privacy vulnerability, exploited by the membership inference attack, is the possibility for an adversary to determine if a particular record was part of the model's training data. In this paper, we introduce two novel membership inference attacks tailored specifically to DocVQA models. These attacks are designed for two different adversarial scenarios: a white-box setting, where the attacker has full access to the model architecture and parameters, and a black-box setting, where only the model's outputs are available. Notably, our attacks assume the adversary lacks access to auxiliary datasets, which is more realistic in practice but also more challenging. Our unsupervised methods outperform existing state-of-the-art membership inference attacks across a variety of DocVQA models and datasets, demonstrating their effectiveness and highlighting the privacy risks in this domain.

Paperid: 1129, https://arxiv.org/pdf/2502.01608.pdf

Abstract:
Browser fingerprinting is a pervasive online tracking technique used increasingly often for profiling and targeted advertising. Prior research on the prevalence of fingerprinting heavily relied on automated web crawls, which inherently struggle to replicate the nuances of human-computer interactions. This raises concerns about the accuracy of current understandings of real-world fingerprinting deployments. As a result, this paper presents a user study involving 30 participants over 10 weeks, capturing telemetry data from real browsing sessions across 3,000 top-ranked websites. Our evaluation reveals that automated crawls miss almost half (45%) of the fingerprinting websites encountered by real users. This discrepancy mainly stems from the crawlers' inability to access authentication-protected pages, circumvent bot detection, and trigger fingerprinting scripts activated by specific user interactions. We also identify potential new fingerprinting vectors present in real user data but absent from automated crawls. Finally, we evaluate the effectiveness of federated learning for training browser fingerprinting detection models on real user data, yielding improved performance than models trained solely on automated crawl data.

Paperid: 1130, https://arxiv.org/pdf/2502.01013.pdf

Abstract:
Large scale deep learning model, such as modern language models and diffusion architectures, have revolutionized applications ranging from natural language processing to computer vision. However, their deployment in distributed or decentralized environments raises significant privacy concerns, as sensitive data may be exposed during inference. Traditional techniques like secure multi-party computation, homomorphic encryption, and differential privacy offer partial remedies but often incur substantial computational overhead, latency penalties, or limited compatibility with non-linear network operations. In this work, we introduce Equivariant Encryption (EE), a novel paradigm designed to enable secure, "blind" inference on encrypted data with near zero performance overhead. Unlike fully homomorphic approaches that encrypt the entire computational graph, EE selectively obfuscates critical internal representations within neural network layers while preserving the exact functionality of both linear and a prescribed set of non-linear operations. This targeted encryption ensures that raw inputs, intermediate activations, and outputs remain confidential, even when processed on untrusted infrastructure. We detail the theoretical foundations of EE, compare its performance and integration complexity against conventional privacy preserving techniques, and demonstrate its applicability across a range of architectures, from convolutional networks to large language models. Furthermore, our work provides a comprehensive threat analysis, outlining potential attack vectors and baseline strategies, and benchmarks EE against standard inference pipelines in decentralized settings. The results confirm that EE maintains high fidelity and throughput, effectively bridging the gap between robust data confidentiality and the stringent efficiency requirements of modern, large scale model inference.

Paperid: 1131, https://arxiv.org/pdf/2502.00652.pdf

Abstract:
Human language encompasses a wide range of intricate and diverse implicit features, which attackers can exploit to launch adversarial or backdoor attacks, compromising DNN models for NLP tasks. Existing model-oriented defenses often require substantial computational resources as model size increases, whereas sample-oriented defenses typically focus on specific attack vectors or schemes, rendering them vulnerable to adaptive attacks. We observe that the root cause of both adversarial and backdoor attacks lies in the encoding process of DNN models, where subtle textual features, negligible for human comprehension, are erroneously assigned significant weight by less robust or trojaned models. Based on it we propose a unified and adaptive defense framework that is effective against both adversarial and backdoor attacks. Our approach leverages reformulation modules to address potential malicious features in textual inputs while preserving the original semantic integrity. Extensive experiments demonstrate that our framework outperforms existing sample-oriented defense baselines across a diverse range of malicious textual features.

Paperid: 1132, https://arxiv.org/pdf/2502.00494.pdf

Abstract:
In collaborative machine learning (CML), data valuation, i.e., evaluating the contribution of each client's data to the machine learning model, has become a critical task for incentivizing and selecting positive data contributions. However, existing studies often assume that clients engage in data valuation truthfully, overlooking the practical motivation for clients to exaggerate their contributions. To unlock this threat, this paper introduces the data overvaluation attack, enabling strategic clients to have their data significantly overvalued in federated learning, a widely adopted paradigm for decentralized CML. Furthermore, we propose a Bayesian truthful data valuation metric, named Truth-Shapley. Truth-Shapley is the unique metric that guarantees some promising axioms for data valuation while ensuring that clients' optimal strategy is to perform truthful data valuation under certain conditions. Our experiments demonstrate the vulnerability of existing data valuation metrics to the proposed attack and validate the robustness and effectiveness of Truth-Shapley.

Paperid: 1133, https://arxiv.org/pdf/2501.16558.pdf

Abstract:
This paper introduces a novel problem, distributional information embedding, motivated by the practical demands of multi-bit watermarking for large language models (LLMs). Unlike traditional information embedding, which embeds information into a pre-existing host signal, LLM watermarking actively controls the text generation process--adjusting the token distribution--to embed a detectable signal. We develop an information-theoretic framework to analyze this distributional information embedding problem, characterizing the fundamental trade-offs among three critical performance metrics: text quality, detectability, and information rate. In the asymptotic regime, we demonstrate that the maximum achievable rate with vanishing error corresponds to the entropy of the LLM's output distribution and increases with higher allowable distortion. We also characterize the optimal watermarking scheme to achieve this rate. Extending the analysis to the finite-token case with non-i.i.d. tokens, we identify schemes that maximize detection probability while adhering to constraints on false alarm and distortion.

Paperid: 1134, https://arxiv.org/pdf/2501.13894.pdf

Abstract:
Satellites are highly vulnerable to adversarial glitches or high-energy radiation in space, which could cause faults on the onboard computer. Various radiation- and fault-tolerant methods, such as error correction codes (ECC) and redundancy-based approaches, have been explored over the last decades to mitigate temporary soft errors on software and hardware. However, conventional ECC methods fail to deal with hard errors or permanent faults in the hardware components. This work introduces a detection- and response-based countermeasure to deal with partially damaged processor chips. It recovers the processor chip from permanent faults and enables continuous operation with available undamaged resources on the chip. We incorporate digitally-compatible delay-based sensors on the target processor's chip to reliably detect the incoming radiation or glitching attempts on the physical fabric of the chip, even before a fault occurs. Upon detecting a fault in one or more components of the processor's arithmetic logic unit (ALU), our countermeasure employs adaptive software recompilations to resynthesize and substitute the affected instructions with instructions of still functioning components to accomplish the task. Furthermore, if the fault is more widespread and prevents the correct operation of the entire processor, our approach deploys adaptive hardware partial reconfigurations to replace and reroute the failed components to undamaged locations of the chip. To validate our claims, we deploy a high-energy near-infrared (NIR) laser beam on a RISC-V processor implemented on a 28~nm FPGA to emulate radiation and even hard errors by partially damaging the FPGA fabric. We demonstrate that our sensor can confidently detect the radiation and trigger the processor testing and fault recovery mechanisms. Finally, we discuss the overhead imposed by our countermeasure.

Paperid: 1135, https://arxiv.org/pdf/2501.12736.pdf

Abstract:
Data heterogeneity and backdoor attacks rank among the most significant challenges facing federated learning (FL). For data heterogeneity, personalized federated learning (PFL) enables each client to maintain a private personalized model to cater to client-specific knowledge. Meanwhile, vanilla FL has proven vulnerable to backdoor attacks. However, recent advancements in PFL community have demonstrated a potential immunity against such attacks. This paper explores this intersection further, revealing that existing federated backdoor attacks fail in PFL because backdoors about manually designed triggers struggle to survive in personalized models. To tackle this, we design Bad-PFL, which employs features from natural data as our trigger. As long as the model is trained on natural data, it inevitably embeds the backdoor associated with our trigger, ensuring its longevity in personalized models. Moreover, our trigger undergoes mutual reinforcement training with the model, further solidifying the backdoor's durability and enhancing attack effectiveness. The large-scale experiments across three benchmark datasets demonstrate the superior performance of our attack against various PFL methods, even when equipped with state-of-the-art defense mechanisms.

Paperid: 1136, https://arxiv.org/pdf/2501.11546.pdf

Abstract:
Software security is of utmost importance for most software systems. Developers must systematically select, plan, design, implement, and especially, maintain and evolve security features -- functionalities to mitigate attacks or protect personal data such as cryptography or access control -- to ensure the security of their software. Although security features are usually available in libraries, integrating security features requires writing and maintaining additional security-critical code. While there have been studies on the use of such libraries, surprisingly little is known about how developers engineer security features, how they select what security features to implement and which ones may require custom implementation, and the implications for maintenance. As a result, we currently rely on assumptions that are largely based on common sense or individual examples. However, to provide them with effective solutions, researchers need hard empirical data to understand what practitioners need and how they view security -- data that we currently lack. To fill this gap, we contribute an exploratory study with 26 knowledgeable industrial participants. We study how security features of software systems are selected and engineered in practice, what their code-level characteristics are, and what challenges practitioners face. Based on the empirical data gathered, we provide insights into engineering practices and validate four common assumptions.

Paperid: 1137, https://arxiv.org/pdf/2501.08862.pdf

Abstract:
Private data, when published online, may be collected by unauthorized parties to train deep neural networks (DNNs). To protect privacy, defensive noises can be added to original samples to degrade their learnability by DNNs. Recently, unlearnable examples are proposed to minimize the training loss such that the model learns almost nothing. However, raw data are often pre-processed before being used for training, which may restore the private information of protected data. In this paper, we reveal the data privacy violation induced by data augmentation, a commonly used data pre-processing technique to improve model generalization capability, which is the first of its kind as far as we are concerned. We demonstrate that data augmentation can significantly raise the accuracy of the model trained on unlearnable examples from 21.3% to 66.1%. To address this issue, we propose a defense framework, dubbed ARMOR, to protect data privacy from potential breaches of data augmentation. To overcome the difficulty of having no access to the model training process, we design a non-local module-assisted surrogate model that better captures the effect of data augmentation. In addition, we design a surrogate augmentation selection strategy that maximizes distribution alignment between augmented and non-augmented samples, to choose the optimal augmentation strategy for each class. We also use a dynamic step size adjustment algorithm to enhance the defensive noise generation process. Extensive experiments are conducted on 4 datasets and 5 data augmentation methods to verify the performance of ARMOR. Comparisons with 6 state-of-the-art defense methods have demonstrated that ARMOR can preserve the unlearnability of protected private data under data augmentation. ARMOR reduces the test accuracy of the model trained on augmented protected samples by as much as 60% more than baselines.

Paperid: 1138, https://arxiv.org/pdf/2501.07805.pdf

Abstract:
Android apps can hold secret strings of themselves such as cloud service credentials or encryption keys. Leakage of such secret strings can induce unprecedented consequences like monetary losses or leakage of user private information. In practice, various security issues were reported because many apps failed to protect their secrets. However, little is known about the types, usages, exploitability, and consequences of app secret leakage issues. While a large body of literature has been devoted to studying user private information leakage, there is no systematic study characterizing app secret leakage issues. How far are Android app secrets from being stolen? To bridge this gap, we conducted the first systematic study to characterize app secret leakage issues in Android apps based on 575 potential app secrets sampled from 14,665 popular Android apps on Google Play. We summarized the common categories of leaked app secrets, assessed their security impacts and disclosed app bad practices in storing app secrets. We devised a text mining strategy using regular expressions and demonstrated that numerous app secrets can be easily stolen, even from the highly popular Android apps on Google. In a follow-up study, we harvested 3,711 distinct exploitable app secrets through automatic analysis. Our findings highlight the prevalence of this problem and call for greater attention to app secret protection.

Paperid: 1139, https://arxiv.org/pdf/2501.06650.pdf

Abstract:
Split Learning (SL) is a distributed deep learning approach enabling multiple clients and a server to collaboratively train and infer on a shared deep neural network (DNN) without requiring clients to share their private local data. The DNN is partitioned in SL, with most layers residing on the server and a few initial layers and inputs on the client side. This configuration allows resource-constrained clients to participate in training and inference. However, the distributed architecture exposes SL to backdoor attacks, where malicious clients can manipulate local datasets to alter the DNN's behavior. Existing defenses from other distributed frameworks like Federated Learning are not applicable, and there is a lack of effective backdoor defenses specifically designed for SL. We present SafeSplit, the first defense against client-side backdoor attacks in Split Learning (SL). SafeSplit enables the server to detect and filter out malicious client behavior by employing circular backward analysis after a client's training is completed, iteratively reverting to a trained checkpoint where the model under examination is found to be benign. It uses a two-fold analysis to identify client-induced changes and detect poisoned models. First, a static analysis in the frequency domain measures the differences in the layer's parameters at the server. Second, a dynamic analysis introduces a novel rotational distance metric that assesses the orientation shifts of the server's layer parameters during training. Our comprehensive evaluation across various data distributions, client counts, and attack scenarios demonstrates the high efficacy of this dual analysis in mitigating backdoor attacks while preserving model utility.

Paperid: 1140, https://arxiv.org/pdf/2501.04454.pdf

Abstract:
Security must be considered in almost every software system. Unfortunately, selecting and implementing security features remains challenging due to the variety of security threats and possible countermeasures. While security standards are intended to help developers, they are usually too abstract and vague to help implement security features, or they merely help configure such. A resource that describes security features at an abstraction level between high-level (i.e., rather too general) and low-level (i.e., rather too specific) security standards could facilitate secure systems development. To realize security features, developers typically use external security frameworks, to minimize implementation mistakes. Even then, developers still make mistakes, often resulting in security vulnerabilities. When security incidents occur or the system needs to be audited or maintained, it is essential to know the implemented security features and, more importantly, where they are located. This task, commonly referred to as feature location, is often tedious and error-prone. Therefore, we have to support long-term tracking of implemented security features. We present a study of security features in the literature and their coverage in popular security frameworks. We contribute (1) a taxonomy of 68 functional implementation-level security features including a mapping to widely used security standards, (2) an examination of 21 popular security frameworks concerning which of these security features they provide, and (3) a discussion on the representation of security features in source code. Our taxonomy aims to aid developers in selecting appropriate security features and frameworks and relating them to security standards when they need to choose and implement security features for a software system.

Paperid: 1141, https://arxiv.org/pdf/2506.22722.pdf

Abstract:
The proposed UniGuard is the first unified online detection framework capable of simultaneously addressing adversarial examples and backdoor attacks. UniGuard builds upon two key insights: first, both AE and backdoor attacks have to compromise the inference phase, making it possible to tackle them simultaneously during run-time via online detection. Second, an adversarial input, whether a perturbed sample in AE attacks or a trigger-carrying sample in backdoor attacks, exhibits distinctive trajectory signatures from a benign sample as it propagates through the layers of a DL model in forward inference. The propagation trajectory of the adversarial sample must deviate from that of its benign counterpart; otherwise, the adversarial objective cannot be fulfilled. Detecting these trajectory signatures is inherently challenging due to their subtlety; UniGuard overcomes this by treating the propagation trajectory as a time-series signal, leveraging LSTM and spectrum transformation to amplify differences between adversarial and benign trajectories that are subtle in the time domain. UniGuard exceptional efficiency and effectiveness have been extensively validated across various modalities (image, text, and audio) and tasks (classification and regression), ranging from diverse model architectures against a wide range of AE attacks and backdoor attacks, including challenging partial backdoors and dynamic triggers. When compared to SOTA methods, including ContraNet (NDSS 22) specific for AE detection and TED (IEEE SP 24) specific for backdoor detection, UniGuard consistently demonstrates superior performance, even when matched against each method's strengths in addressing their respective threats-each SOTA fails to parts of attack strategies while UniGuard succeeds for all.

Paperid: 1142, https://arxiv.org/pdf/2506.19889.pdf

Abstract:
Recent advances in large language models (LLMs) have made a profound impact on our society and also raised new security concerns. Particularly, due to the remarkable inference ability of LLMs, the privacy violation attack (PVA), revealed by Staab et al., introduces serious personal privacy issues. Existing defense methods mainly leverage LLMs to anonymize the input query, which requires costly inference time and cannot gain satisfactory defense performance. Moreover, directly rejecting the PVA query seems like an effective defense method, while the defense method is exposed, promoting the evolution of PVA. In this paper, we propose a novel defense paradigm based on retrieval-confused generation (RCG) of LLMs, which can efficiently and covertly defend the PVA. We first design a paraphrasing prompt to induce the LLM to rewrite the "user comments" of the attack query to construct a disturbed database. Then, we propose the most irrelevant retrieval strategy to retrieve the desired user data from the disturbed database. Finally, the "data comments" are replaced with the retrieved user data to form a defended query, leading to responding to the adversary with some wrong personal attributes, i.e., the attack fails. Extensive experiments are conducted on two datasets and eight popular LLMs to comprehensively evaluate the feasibility and the superiority of the proposed defense method.

Paperid: 1143, https://arxiv.org/pdf/2506.18731.pdf

Abstract:
One common critique of biometric authentication is that if an individual's biometric is compromised, then the individual has no recourse. The concept of revocable biometrics was developed to address this concern. A biometric scheme is revocable if an individual can have their current enrollment in the scheme revoked, so that the compromised biometric template becomes worthless, and the individual can re-enroll with a new template that has similar recognition power. We show that modern deep CNN face matchers inherently allow for a robust revocable biometric scheme. For a given state-of-the-art deep CNN backbone and training set, it is possible to generate an unlimited number of distinct face matcher models that have both (1) equivalent recognition power, and (2) strongly incompatible biometric templates. The equivalent recognition power extends to the point of generating impostor and genuine distributions that have the same shape and placement on the similarity dimension, meaning that the models can share a similarity threshold for a 1-in-10,000 false match rate. The biometric templates from different model instances are so strongly incompatible that the cross-instance similarity score for images of the same person is typically lower than the same-instance similarity score for images of different persons. That is, a stolen biometric template that is revoked is of less value in attempting to match the re-enrolled identity than the average impostor template. We also explore the feasibility of using a Vision Transformer (ViT) backbone-based face matcher in the revocable biometric system proposed in this work and demonstrate that it is less suitable compared to typical ResNet-based deep CNN backbones.

Paperid: 1144, https://arxiv.org/pdf/2506.17865.pdf

Abstract:
Ensuring the security of modern System-on-Chip (SoC) designs poses significant challenges due to increasing complexity and distributed assets across the intellectual property (IP) blocks. Formal property verification (FPV) provides the capability to model and validate design behaviors through security properties with model checkers; however, current practices require significant manual efforts to create such properties, making them time-consuming, costly, and error-prone. The emergence of Large Language Models (LLMs) has showcased remarkable proficiency across diverse domains, including HDL code generation and verification tasks. Current LLM-based techniques often produce vacuous assertions and lack efficient prompt generation, comprehensive verification, and bug detection. This paper presents LASA, a novel framework that leverages LLMs and retrieval-augmented generation (RAG) to produce non-vacuous security properties and SystemVerilog Assertions (SVA) from design specifications and related documentation for bus-based SoC designs. LASA integrates commercial EDA tool for FPV to generate coverage metrics and iteratively refines prompts through a feedback loop to enhance coverage. The effectiveness of LASA is validated through various open-source SoC designs, demonstrating high coverage values with an average of ~88\%, denoting comprehensive verification through efficient generation of security properties and SVAs. LASA also demonstrates bug detection capabilities, identifying five unique bugs in the buggy OpenTitan SoC from Hack@DAC'24 competition.

Paperid: 1145, https://arxiv.org/pdf/2506.17622.pdf

Abstract:
Stablecoins have become significant assets in modern finance, with a market capitalization exceeding USD 246 billion (May 2025). Yet, despite their systemic importance, a comprehensive and risk-oriented understanding of crucial aspects like their design trade-offs, security dynamics, and interdependent failure pathways often remains underdeveloped. This SoK confronts this gap through a large-scale analysis of 157 research studies, 95 active stablecoins, and 44 major security incidents. Our analysis establishes four pivotal insights: 1) stability is best understood not an inherent property but an emergent, fragile state reliant on the interplay between market confidence and continuous liquidity; 2) stablecoin designs demonstrate trade-offs in risk specialization instead of mitigation; 3) the widespread integration of yield mechanisms imposes a "dual mandate" that creates a systemic tension between the core mission of stability and the high-risk financial engineering required for competitive returns; and 4) major security incidents act as acute "evolutionary pressures", forging resilience by stress-testing designs and aggressively redefining the security frontier. We introduce the Stablecoin LEGO framework, a quantitative methodology mapping historical failures to current designs. Its application reveals that a lower assessed risk strongly correlates with integrating lessons from past incidents. We hope this provides a systematic foundation for building, evaluating, and regulating more resilient stablecoins.

Paperid: 1146, https://arxiv.org/pdf/2506.12100.pdf

Abstract:
Large Language Models (LLMs) are increasingly used for cybersecurity threat analysis, but their deployment in security-sensitive environments raises trust and safety concerns. With over 21,000 vulnerabilities disclosed in 2025, manual analysis is infeasible, making scalable and verifiable AI support critical. When querying LLMs, dealing with emerging vulnerabilities is challenging as they have a training cut-off date. While Retrieval-Augmented Generation (RAG) can inject up-to-date context to alleviate the cut-off date limitation, it remains unclear how much LLMs rely on retrieved evidence versus the model's internal knowledge, and whether the retrieved information is meaningful or even correct. This uncertainty could mislead security analysts, mis-prioritize patches, and increase security risks. Therefore, this work proposes LLM Embedding-based Attribution (LEA) to analyze the generated responses for vulnerability exploitation analysis. More specifically, LEA quantifies the relative contribution of internal knowledge vs. retrieved content in the generated responses. We evaluate LEA on 500 critical vulnerabilities disclosed between 2016 and 2025, across three RAG settings -- valid, generic, and incorrect -- using three state-of-the-art LLMs. Our results demonstrate LEA's ability to detect clear distinctions between non-retrieval, generic-retrieval, and valid-retrieval scenarios with over 95% accuracy on larger models. Finally, we demonstrate the limitations posed by incorrect retrieval of vulnerability information and raise a cautionary note to the cybersecurity community regarding the blind reliance on LLMs and RAG for vulnerability analysis. LEA offers security analysts with a metric to audit RAG-enhanced workflows, improving the transparent and trustworthy deployment of AI in cybersecurity threat analysis.

Paperid: 1147, https://arxiv.org/pdf/2506.09365.pdf

Abstract:
Modern Security Operations Centres (SOCs) integrate diverse tools, such as SIEM, IDS, and XDR systems, offering rich contextual data, including alert enrichments, flow features, and similar case histories. Yet, analysts must still manually determine which of these contextual cues are most relevant when validating specific alerts. We introduce ContextBuddy, an AI assistant that learns from analysts' prior investigations to help them identify the most relevant context for new alerts. Rather than providing enrichments, ContextBuddy models how analysts have previously selected context and suggests tailored cues based on the characteristics of each alert. We formulate context selection as a sequential decision-making problem and apply imitation learning (IL) to capture analysts' strategies, evaluating multiple IL approaches. Through staged evaluation, we validate ContextBuddy using two intrusion detection datasets (HIKARI-2021, UNSW-NB15). In simulation-based experiments, ContextBuddy helped simulated reinforcement learning analysts improve classification accuracy (p < 0.001) (increasing F1 by 2.5% for HIKARI and 9% for UNSW), reducing false negatives (1.5% for HIKARI and 10% for UNSW), and keeping false positives below 1%. Decision confidence among agents also improved by 2-3% (p < 0.001). In a within-subject user study (N=13; power = 0.8), non-experts using ContextBuddy improved classification accuracy by 21.1% (p = 0.008) and reduced alert validation time by 24% (p = 0.01). These results demonstrate that by learning context-selection patterns from analysts, ContextBuddy can yield notable improvements in investigation effectiveness and efficiency.

Paperid: 1148, https://arxiv.org/pdf/2506.08838.pdf

Abstract:
Fuzzing is a widely used technique for discovering software vulnerabilities, but identifying hot bytes that influence program behavior remains challenging. Traditional taint analysis can track such bytes white-box, but suffers from scalability issue. Fuzzing-Driven Taint Inference (FTI) offers a black-box alternative, yet typically incurs significant runtime overhead due to extra program executions. We observe that the commonly used havoc mutation scheme in fuzzing can be adapted for lightweight FTI with zero extra executions. We present a computational model of havoc mode, demonstrating that it can perform FTI while generating new test cases. Building on this, we propose ZTaint-Havoc, a novel, efficient FTI with minimal overhead (3.84% on UniBench, 12.58% on FuzzBench). We further design an effective mutation algorithm utilizing the identified hot bytes. Our comprehensive evaluation shows that ZTaint-Havoc, implemented in AFL++, improves edge coverage by up to 33.71% on FuzzBench and 51.12% on UniBench over vanilla AFL++, with average gains of 2.97% and 6.12% in 24-hour fuzzing campaigns.

Paperid: 1149, https://arxiv.org/pdf/2506.08188.pdf

Abstract:
In this paper, we introduce GradEscape, the first gradient-based evader designed to attack AI-generated text (AIGT) detectors. GradEscape overcomes the undifferentiable computation problem, caused by the discrete nature of text, by introducing a novel approach to construct weighted embeddings for the detector input. It then updates the evader model parameters using feedback from victim detectors, achieving high attack success with minimal text modification. To address the issue of tokenizer mismatch between the evader and the detector, we introduce a warm-started evader method, enabling GradEscape to adapt to detectors across any language model architecture. Moreover, we employ novel tokenizer inference and model extraction techniques, facilitating effective evasion even in query-only access. We evaluate GradEscape on four datasets and three widely-used language models, benchmarking it against four state-of-the-art AIGT evaders. Experimental results demonstrate that GradEscape outperforms existing evaders in various scenarios, including with an 11B paraphrase model, while utilizing only 139M parameters. We have successfully applied GradEscape to two real-world commercial AIGT detectors. Our analysis reveals that the primary vulnerability stems from disparity in text expression styles within the training data. We also propose a potential defense strategy to mitigate the threat of AIGT evaders. We open-source our GradEscape for developing more robust AIGT detectors.

Paperid: 1150, https://arxiv.org/pdf/2506.06730.pdf

Abstract:
The rapid global adoption of electric vehicles (EVs) has established electric vehicle supply equipment (EVSE) as a critical component of smart grid infrastructure. While essential for ensuring reliable energy delivery and accessibility, EVSE systems face significant cybersecurity challenges, including network reconnaissance, backdoor intrusions, and distributed denial-of-service (DDoS) attacks. These emerging threats, driven by the interconnected and autonomous nature of EVSE, require innovative and adaptive security mechanisms that go beyond traditional intrusion detection systems (IDS). Existing approaches, whether network-based or host-based, often fail to detect sophisticated and targeted attacks specifically crafted to exploit new vulnerabilities in EVSE infrastructure. This paper proposes a novel intrusion detection framework that leverages multimodal data sources, including network traffic and kernel events, to identify complex attack patterns. The framework employs a distributed learning approach, enabling collaborative intelligence across EVSE stations while preserving data privacy through federated learning. Experimental results demonstrate that the proposed framework outperforms existing solutions, achieving a detection rate above 98% and a precision rate exceeding 97% in decentralized environments. This solution addresses the evolving challenges of EVSE security, offering a scalable and privacypreserving response to advanced cyber threats

Paperid: 1151, https://arxiv.org/pdf/2506.03467.pdf

Abstract:
Gaussian Mixture Models (GMMs) are widely used statistical models for representing multi-modal data distributions, with numerous applications in data mining, pattern recognition, data simulation, and machine learning. However, recent research has shown that releasing GMM parameters poses significant privacy risks, potentially exposing sensitive information about the underlying data. In this paper, we address the challenge of releasing GMM parameters while ensuring differential privacy (DP) guarantees. Specifically, we focus on the privacy protection of mixture weights, component means, and covariance matrices. We propose to use Kullback-Leibler (KL) divergence as a utility metric to assess the accuracy of the released GMM, as it captures the joint impact of noise perturbation on all the model parameters. To achieve privacy, we introduce a DP mechanism that adds carefully calibrated random perturbations to the GMM parameters. Through theoretical analysis, we quantify the effects of privacy budget allocation and perturbation statistics on the DP guarantee, and derive a tractable expression for evaluating KL divergence. We formulate and solve an optimization problem to minimize the KL divergence between the released and original models, subject to a given $(Îµ, Î´)$-DP constraint. Extensive experiments on both synthetic and real-world datasets demonstrate that our approach achieves strong privacy guarantees while maintaining high utility.

Paperid: 1152, https://arxiv.org/pdf/2506.00262.pdf

Abstract:
Self-Sovereign Identity (SSI) is a novel identity model that empowers individuals with full control over their data, enabling them to choose what information to disclose, with whom, and when. This paradigm is rapidly gaining traction worldwide, supported by numerous initiatives such as the European Digital Identity (EUDI) Regulation or Singapore's National Digital Identity (NDI). For instance, by 2026, the EUDI Regulation will enable all European citizens to seamlessly access services across Europe using Verifiable Credentials (VCs). A key feature of SSI is the ability to selectively disclose only specific claims within a credential, enhancing privacy protection of the identity owner. This paper proposes a novel mechanism designed to achieve Compact and Selective Disclosure for VCs (CSD-JWT). Our method leverages a cryptographic accumulator to encode claims within a credential to a unique, compact representation. We implemented CSD-JWT as an open-source solution and extensively evaluated its performance under various conditions. CSD-JWT provides significant memory savings, reducing usage by up to 46% compared to the state-of-the-art. It also minimizes network overhead by producing remarkably smaller Verifiable Presentations (VPs), reduced in size by 27% to 93%. Such features make CSD-JWT especially well-suited for resource-constrained devices, including hardware wallets designed for managing credentials.

Paperid: 1153, https://arxiv.org/pdf/2505.24724.pdf

Abstract:
Can you imagine, blockchain transactions can talk! In this paper, we study how they talk and what they talk about. We focus on the input data field of Ethereum transactions, which is designed to allow external callers to interact with smart contracts. In practice, this field also enables users to embed natural language messages into transactions. Users can leverage these Input Data Messages (IDMs) for peer-to-peer communication. This means that, beyond Ethereum's well-known role as a financial infrastructure, it also serves as a decentralized communication medium. We present the first large-scale analysis of Ethereum IDMs from the genesis block to February 2024 (3134 days). We filter IDMs to extract 867,140 transactions with informative IDMs and use LLMs for language detection. We find that English (95.4%) and Chinese (4.4%) dominate the use of natural languages in IDMs. Interestingly, English IDMs center on security and scam warnings (24%) with predominantly negative emotions, while Chinese IDMs emphasize emotional expression and social connection (44%) with a more positive tone. We also observe that longer English IDMs often transfer high ETH values for protocol-level purposes, while longer Chinese IDMs tend to involve symbolic transfer amounts for emotional intent. Moreover, we find that the IDM participants tend to form small, loosely connected communities (59.99%). Our findings highlight culturally and functionally divergent use cases of the IDM channel across user communities. We further examine the security relevance of IDMs in on-chain attacks. Many victims use them to appeal to attackers for fund recovery. IDMs containing negotiations or reward offers are linked to higher reply rates. We also analyze IDMs' regulatory implications. Their misuse for abuse, threats, and sexual solicitation reveals the urgent need for content moderation and regulation in decentralized systems.

Paperid: 1154, https://arxiv.org/pdf/2505.23849.pdf

Abstract:
Privacy-Preserving Federated Learning (PPFL) is a decentralized machine learning approach where multiple clients train a model collaboratively. PPFL preserves the privacy and security of a client's data without exchanging it. However, ensuring that data at each client is of high quality and ready for federated learning (FL) is a challenge due to restricted data access. In this paper, we introduce CADRE (Customizable Assurance of Data Readiness) for federated learning (FL), a novel framework that allows users to define custom data readiness (DR) metrics, rules, and remedies tailored to specific FL tasks. CADRE generates comprehensive DR reports based on the user-defined metrics, rules, and remedies to ensure datasets are prepared for FL while preserving privacy. We demonstrate a practical application of CADRE by integrating it into an existing PPFL framework. We conducted experiments across six datasets and addressed seven different DR issues. The results illustrate the versatility and effectiveness of CADRE in ensuring DR across various dimensions, including data quality, privacy, and fairness. This approach enhances the performance and reliability of FL models as well as utilizes valuable resources.

Paperid: 1155, https://arxiv.org/pdf/2505.23847.pdf

Abstract:
Large language models (LLMs) are rapidly evolving into autonomous agents that cooperate across organizational boundaries, enabling joint disaster response, supply-chain optimization, and other tasks that demand decentralized expertise without surrendering data ownership. Yet, cross-domain collaboration shatters the unified trust assumptions behind current alignment and containment techniques. An agent benign in isolation may, when receiving messages from an untrusted peer, leak secrets or violate policy, producing risks driven by emergent multi-agent dynamics rather than classical software bugs. This position paper maps the security agenda for cross-domain multi-agent LLM systems. We introduce seven categories of novel security challenges, for each of which we also present plausible attacks, security evaluation metrics, and future research guidelines.

Paperid: 1156, https://arxiv.org/pdf/2505.23814.pdf

Abstract:
Watermarking has emerged as a leading technical proposal for attributing generative AI content and is increasingly cited in global governance frameworks. This paper argues that current implementations risk serving as symbolic compliance rather than delivering effective oversight. We identify a growing gap between regulatory expectations and the technical limitations of existing watermarking schemes. Through analysis of policy proposals and industry practices, we show how incentive structures disincentivize robust, auditable deployments. To realign watermarking with governance goals, we propose a three-layer framework encompassing technical standards, audit infrastructure, and enforcement mechanisms. Without enforceable requirements and independent verification, watermarking will remain inadequate for accountability and ultimately undermine broader efforts in AI safety and regulation.

Paperid: 1157, https://arxiv.org/pdf/2505.21636.pdf

Abstract:
Large language models (LLMs) are increasingly integrated into academic workflows, with many conferences and journals permitting their use for tasks such as language refinement and literature summarization. However, their use in peer review remains prohibited due to concerns around confidentiality breaches, hallucinated content, and inconsistent evaluations. As LLM-generated text becomes more indistinguishable from human writing, there is a growing need for reliable attribution mechanisms to preserve the integrity of the review process. In this work, we evaluate topic-based watermarking (TBW), a lightweight, semantic-aware technique designed to embed detectable signals into LLM-generated text. We conduct a comprehensive assessment across multiple LLM configurations, including base, few-shot, and fine-tuned variants, using authentic peer review data from academic conferences. Our results show that TBW maintains review quality relative to non-watermarked outputs, while demonstrating strong robustness to paraphrasing-based evasion. These findings highlight the viability of TBW as a minimally intrusive and practical solution for enforcing LLM usage in peer review.

Paperid: 1158, https://arxiv.org/pdf/2505.20866.pdf

Abstract:
Encrypted traffic classification is highly challenging in network security due to the need for extracting robust features from content-agnostic traffic data. Existing approaches face critical issues: (i) Distribution drift, caused by reliance on the closedworld assumption, limits adaptability to realworld, shifting patterns; (ii) Dependence on labeled data restricts applicability where such data is scarce or unavailable. Large language models (LLMs) have demonstrated remarkable potential in offering generalizable solutions across a wide range of tasks, achieving notable success in various specialized fields. However, their effectiveness in traffic analysis remains constrained by challenges in adapting to the unique requirements of the traffic domain. In this paper, we introduce a novel traffic representation model named Encrypted Traffic Out-of-Distribution Instruction Tuning with LLM (ETooL), which integrates LLMs with knowledge of traffic structures through a self-supervised instruction tuning paradigm. This framework establishes connections between textual information and traffic interactions. ETooL demonstrates more robust classification performance and superior generalization in both supervised and zero-shot traffic classification tasks. Notably, it achieves significant improvements in F1 scores: APP53 (I.I.D.) to 93.19%(6.62%) and 92.11%(4.19%), APP53 (O.O.D.) to 74.88%(18.17%) and 72.13%(15.15%), and ISCX-Botnet (O.O.D.) to 95.03%(9.16%) and 81.95%(12.08%). Additionally, we construct NETD, a traffic dataset designed to support dynamic distributional shifts, and use it to validate ETooL's effectiveness under varying distributional conditions. Furthermore, we evaluate the efficiency gains achieved through ETooL's instruction tuning approach.

Paperid: 1159, https://arxiv.org/pdf/2505.15483.pdf

Abstract:
Numerical data with bounded domains is a common data type in personal devices, such as wearable sensors. While the collection of such data is essential for third-party platforms, it raises significant privacy concerns. Local differential privacy (LDP) has been shown as a framework providing provable individual privacy, even when the third-party platform is untrusted. For numerical data with bounded domains, existing state-of-the-art LDP mechanisms are piecewise-based mechanisms, which are not optimal, leading to reduced data utility. This paper investigates the optimal design of piecewise-based mechanisms to maximize data utility under LDP. We demonstrate that existing piecewise-based mechanisms are heuristic forms of the $3$-piecewise mechanism, which is far from enough to study optimality. We generalize the $3$-piecewise mechanism to its most general form, i.e. $m$-piecewise mechanism with no pre-defined form of each piece. Under this form, we derive the closed-form optimal mechanism by combining analytical proofs and off-the-shelf optimization solvers. Next, we extend the generalized piecewise-based mechanism to the circular domain (along with the classical domain), defined on a cyclic range where the distance between the two endpoints is zero. By incorporating this property, we design the optimal mechanism for the circular domain, achieving significantly improved data utility compared with existing mechanisms. Our proposed mechanisms guarantee optimal data utility under LDP among all generalized piecewise-based mechanisms. We show that they also achieve optimal data utility in two common applications of LDP: distribution estimation and mean estimation. Theoretical analyses and experimental evaluations prove and validate the data utility advantages of our proposed mechanisms.

Paperid: 1160, https://arxiv.org/pdf/2505.11790.pdf

Abstract:
Large Language Models (LLMs) are trained with safety alignment to prevent generating malicious content. Although some attacks have highlighted vulnerabilities in these safety-aligned LLMs, they typically have limitations, such as necessitating access to the model weights or the generation process. Since proprietary models through API-calling do not grant users such permissions, these attacks find it challenging to compromise them. In this paper, we propose Jailbreaking Using LLM Introspection (JULI), which jailbreaks LLMs by manipulating the token log probabilities, using a tiny plug-in block, BiasNet. JULI relies solely on the knowledge of the target LLM's predicted token log probabilities. It can effectively jailbreak API-calling LLMs under a black-box setting and knowing only top-$5$ token log probabilities. Our approach demonstrates superior effectiveness, outperforming existing state-of-the-art (SOTA) approaches across multiple metrics.

Paperid: 1161, https://arxiv.org/pdf/2505.11459.pdf

Abstract:
The integration of large language models (LLMs) into a wide range of applications has highlighted the critical role of well-crafted system prompts, which require extensive testing and domain expertise. These prompts enhance task performance but may also encode sensitive information and filtering criteria, posing security risks if exposed. Recent research shows that system prompts are vulnerable to extraction attacks, while existing defenses are either easily bypassed or require constant updates to address new threats. In this work, we introduce ProxyPrompt, a novel defense mechanism that prevents prompt leakage by replacing the original prompt with a proxy. This proxy maintains the original task's utility while obfuscating the extracted prompt, ensuring attackers cannot reproduce the task or access sensitive information. Comprehensive evaluations on 264 LLM and system prompt pairs show that ProxyPrompt protects 94.70% of prompts from extraction attacks, outperforming the next-best defense, which only achieves 42.80%.

Paperid: 1162, https://arxiv.org/pdf/2505.10349.pdf

Abstract:
Local Differential Privacy (LDP) has been widely recognized as a powerful tool for providing a strong theoretical guarantee of data privacy to data contributors against an untrusted data collector. Under a typical LDP scheme, each data contributor independently randomly perturbs their data before submitting them to the data collector, which in turn infers valuable statistics about the original data from received perturbed data. Common to existing LDP mechanisms is an inherent trade-off between the level of privacy protection and data utility in the sense that strong data privacy often comes at the cost of reduced data utility. Frequency estimation based on Randomized Response (RR) is a fundamental building block of many LDP mechanisms. In this paper, we propose a novel Joint Randomized Response (JRR) mechanism based on correlated data perturbations to achieve locally differentially private frequency estimation. JRR divides data contributors into disjoint groups of two members and lets those in the same group jointly perturb their binary data to improve frequency-estimation accuracy and achieve the same level of data privacy by hiding the group membership information in contrast to the classical RR mechanism. Theoretical analysis and detailed simulation studies using both real and synthetic datasets show that JRR achieves the same level of data privacy as the classical RR mechanism while improving the frequency-estimation accuracy in the overwhelming majority of the cases by up to two orders of magnitude.

Paperid: 1163, https://arxiv.org/pdf/2505.06643.pdf

Abstract:
Reasoning large language models (RLLMs) have demonstrated outstanding performance across a variety of tasks, yet they also expose numerous security vulnerabilities. Most of these vulnerabilities have centered on the generation of unsafe content. However, recent work has identified a distinct "thinking-stopped" vulnerability in DeepSeek-R1: under adversarial prompts, the model's reasoning process ceases at the system level and produces an empty final answer. Building upon this vulnerability, researchers developed a novel prompt injection attack, termed reasoning interruption attack, and also offered an initial analysis of its root cause. Through extensive experiments, we verify the previous analyses, correct key errors based on three experimental findings, and present a more rigorous explanation of the fundamental causes driving the vulnerability. Moreover, existing attacks typically require over 2,000 tokens, impose significant overhead, reduce practicality, and are easily detected. To overcome these limitations, we propose the first practical reasoning interruption attack. It succeeds with just 109 tokens by exploiting our newly uncovered "reasoning token overflow" (RTO) effect to overwrite the model's final answer, forcing it to return an invalid response. Experimental results demonstrate that our proposed attack is highly effective. Furthermore, we discover that the method for triggering RTO differs between the official DeepSeek-R1 release and common unofficial deployments. As a broadened application of RTO, we also construct a novel jailbreak attack that enables the transfer of unsafe content within the reasoning tokens into final answer, thereby exposing it to the user. Our work carries significant implications for enhancing the security of RLLMs.

Paperid: 1164, https://arxiv.org/pdf/2505.03850.pdf

Abstract:
As a safety-critical cyber-physical system, cybersecurity and related safety issues for Autonomous Vehicles (AVs) have been important research topics for a while. Among all the modules on AVs, perception is one of the most accessible attack surfaces, as drivers and AVs have no control over the outside environment. Most current work targeting perception security for AVs focuses on perception correctness. In this work, we propose an impact analysis based on inference time attacks for autonomous vehicles. We demonstrate in a simulation system that such inference time attacks can also threaten the safety of both the ego vehicle and other traffic participants.

Paperid: 1165, https://arxiv.org/pdf/2505.03179.pdf

Abstract:
This study investigates whether large language models (LLMs) can function as intelligent collaborators to bridge expertise gaps in cybersecurity decision-making. We examine two representative tasks-phishing email detection and intrusion detection-that differ in data modality, cognitive complexity, and user familiarity. Through a controlled mixed-methods user study, n = 58 (phishing, n = 34; intrusion, n = 24), we find that human-AI collaboration improves task performance,reducing false positives in phishing detection and false negatives in intrusion detection. A learning effect is also observed when participants transition from collaboration to independent work, suggesting that LLMs can support long-term skill development. Our qualitative analysis shows that interaction dynamics-such as LLM definitiveness, explanation style, and tone-influence user trust, prompting strategies, and decision revision. Users engaged in more analytic questioning and showed greater reliance on LLM feedback in high-complexity settings. These results provide design guidance for building interpretable, adaptive, and trustworthy human-AI teaming systems, and demonstrate that LLMs can meaningfully support non-experts in reasoning through complex cybersecurity problems.

Paperid: 1166, https://arxiv.org/pdf/2505.02565.pdf

Abstract:
Antifragility of communication systems is defined as measure of benefits gained from the adverse events and variability of its environment. In this paper, we introduce the notion of antifragility in Reconfigurable Intelligent Surface (RIS) assisted communication systems affected by a jamming attack. We analyzed the antifragility of the two hop systems, where the wireless path contains source node, RIS, destination node, and a eavesdropping/jamming node. We propose and analyze the antifragility performance for several jamming models, such as Digital Radio Frequency Memory (DRFM) and phase and amplitude shifting. Our paper shows that antifragility throughput can indeed be achieved under certain power thresholds and for various jamming models. In particular, high jamming power combined with low baseline data rates yields an antifragile gain factor of approximately five times. The results confirm that reconfigurable intelligent surfaces, when coupled with an antifragile design philosophy, can convert hostile interference from a liability into a throughput gain.

Paperid: 1167, https://arxiv.org/pdf/2505.00289.pdf

Abstract:
Patch fuzzing is a technique aimed at identifying vulnerabilities that arise from newly patched code. While researchers have made efforts to apply patch fuzzing to testing JavaScript engines with considerable success, these efforts have been limited to using ordinary test cases or publicly available vulnerability PoCs (Proof of Concepts) as seeds, and the sustainability of these approaches is hindered by the challenges associated with automating the PoC collection. To address these limitations, we propose an end-to-end sustainable approach for JavaScript engine patch fuzzing, named PatchFuzz. It automates the collection of PoCs of a broader range of historical vulnerabilities and leverages both the PoCs and their corresponding patches to uncover new vulnerabilities more effectively. PatchFuzz starts by recognizing git commits which intend to fix security bugs. Subsequently, it extracts and processes PoCs from these commits to form the seeds for fuzzing, while utilizing code revisions to focus limited fuzzing resources on the more vulnerable code areas through selective instrumentation. The mutation strategy of PatchFuzz is also optimized to maximize the potential of the PoCs. Experimental results demonstrate the effectiveness of PatchFuzz. Notably, 54 bugs across six popular JavaScript engines have been exposed and a total of $62,500 bounties has been received.

Paperid: 1168, https://arxiv.org/pdf/2504.19433.pdf

Abstract:
With the rapid development of deep learning, existing generative text steganography methods based on autoregressive models have achieved success. However, these autoregressive steganography approaches have certain limitations. Firstly, existing methods require encoding candidate words according to their output probability and generating each stego word one by one, which makes the generation process time-consuming. Secondly, encoding and selecting candidate words changes the sampling probabilities, resulting in poor imperceptibility of the stego text. Thirdly, existing methods have low robustness and cannot resist replacement attacks. To address these issues, we propose a generative text steganography method based on a diffusion model (GTSD), which improves generative speed, robustness, and imperceptibility while maintaining security. To be specific, a novel steganography scheme based on diffusion model is proposed to embed secret information through prompt mapping and batch mapping. The prompt mapping maps secret information into a conditional prompt to guide the pre-trained diffusion model generating batches of candidate sentences. The batch mapping selects stego text based on secret information from batches of candidate sentences. Extensive experiments show that the GTSD outperforms the SOTA method in terms of generative speed, robustness, and imperceptibility while maintaining comparable anti-steganalysis performance. Moreover, we verify that the GTSD has strong potential: embedding capacity is positively correlated with prompt capacity and model batch sizes while maintaining security.

Paperid: 1169, https://arxiv.org/pdf/2504.18814.pdf

Abstract:
The Internet of Vehicles (IoV) is transforming transportation by enhancing connectivity and enabling autonomous driving. However, this increased interconnectivity introduces new security vulnerabilities. Bot malware and cyberattacks pose significant risks to Connected and Autonomous Vehicles (CAVs), as demonstrated by real-world incidents involving remote vehicle system compromise. To address these challenges, we propose an edge-based Intrusion Detection System (IDS) that monitors network traffic to and from CAVs. Our detection model is based on a meta-ensemble classifier capable of recognizing known (Nday) attacks and detecting previously unseen (zero-day) attacks. The approach involves training multiple Isolation Forest (IF) models on Multi-access Edge Computing (MEC) servers, with each IF specialized in identifying a specific type of botnet attack. These IFs, either trained locally or shared by other MEC nodes, are then aggregated using a Particle Swarm Optimization (PSO) based stacking strategy to construct a robust meta-classifier. The proposed IDS has been evaluated on a vehicular botnet dataset, achieving an average detection rate of 92.80% for N-day attacks and 77.32% for zero-day attacks. These results highlight the effectiveness of our solution in detecting both known and emerging threats, providing a scalable and adaptive defense mechanism for CAVs within the IoV ecosystem.

Paperid: 1170, https://arxiv.org/pdf/2504.18147.pdf

Abstract:
Large Language Models (LLM) are typically trained on vast amounts of data from various sources. Even when designed modularly (e.g., Mixture-of-Experts), LLMs can leak privacy on their sources. Conversely, training such models in isolation arguably prohibits generalization. To this end, we propose a framework, NoEsis, which builds upon the desired properties of modularity, privacy, and knowledge transfer. NoEsis integrates differential privacy with a hybrid two-staged parameter-efficient fine-tuning that combines domain-specific low-rank adapters, acting as experts, with common prompt tokens, acting as a knowledge-sharing backbone. Results from our evaluation on CodeXGLUE showcase that NoEsis can achieve provable privacy guarantees with tangible knowledge transfer across domains, and empirically show protection against Membership Inference Attacks. Finally, on code completion tasks, NoEsis bridges at least 77% of the accuracy gap between the non-shared and the non-private baseline.

Paperid: 1171, https://arxiv.org/pdf/2504.13775.pdf

Abstract:
Previous insertion-based and paraphrase-based backdoors have achieved great success in attack efficacy, but they ignore the text quality and semantic consistency between poisoned and clean texts. Although recent studies introduce LLMs to generate poisoned texts and improve the stealthiness, semantic consistency, and text quality, their hand-crafted prompts rely on expert experiences, facing significant challenges in prompt adaptability and attack performance after defenses. In this paper, we propose a novel backdoor attack based on adaptive optimization mechanism of black-box large language models (BadApex), which leverages a black-box LLM to generate poisoned text through a refined prompt. Specifically, an Adaptive Optimization Mechanism is designed to refine an initial prompt iteratively using the generation and modification agents. The generation agent generates the poisoned text based on the initial prompt. Then the modification agent evaluates the quality of the poisoned text and refines a new prompt. After several iterations of the above process, the refined prompt is used to generate poisoned texts through LLMs. We conduct extensive experiments on three dataset with six backdoor attacks and two defenses. Extensive experimental results demonstrate that BadApex significantly outperforms state-of-the-art attacks. It improves prompt adaptability, semantic consistency, and text quality. Furthermore, when two defense methods are applied, the average attack success rate (ASR) still up to 96.75%.

Paperid: 1172, https://arxiv.org/pdf/2504.13676.pdf

Abstract:
As the number of web applications and API endpoints exposed to the Internet continues to grow, so does the number of exploitable vulnerabilities. Manually identifying such vulnerabilities is tedious. Meanwhile, static security scanners tend to produce many false positives. While machine learning-based approaches are promising, they typically perform well only in scenarios where training and test data are closely related. A key challenge for ML-based vulnerability detection is providing suitable and concise code context, as excessively long contexts negatively affect the code comprehension capabilities of machine learning models, particularly smaller ones. This work introduces Trace Gadgets, a novel code representation that minimizes code context by removing non-related code. Trace Gadgets precisely capture the statements that cover the path to the vulnerability. As input for ML models, Trace Gadgets provide a minimal but complete context, thereby improving the detection performance. Moreover, we collect a large-scale dataset generated from real-world applications with manually curated labels to further improve the performance of ML-based vulnerability detectors. Our results show that state-of-the-art machine learning models perform best when using Trace Gadgets compared to previous code representations, surpassing the detection capabilities of industry-standard static scanners such as GitHub's CodeQL by at least 4% on a fully unseen dataset. By applying our framework to real-world applications, we identify and report previously unknown vulnerabilities in widely deployed software.

Paperid: 1173, https://arxiv.org/pdf/2504.12644.pdf

Abstract:
Deep learning (DL)-based image classification models are essential for autonomous vehicle (AV) perception modules since incorrect categorization might have severe repercussions. Adversarial attacks are widely studied cyberattacks that can lead DL models to predict inaccurate output, such as incorrectly classified traffic signs by the perception module of an autonomous vehicle. In this study, we create and compare hybrid classical-quantum deep learning (HCQ-DL) models with classical deep learning (C-DL) models to demonstrate robustness against adversarial attacks for perception modules. Before feeding them into the quantum system, we used transfer learning models, alexnet and vgg-16, as feature extractors. We tested over 1000 quantum circuits in our HCQ-DL models for projected gradient descent (PGD), fast gradient sign attack (FGSA), and gradient attack (GA), which are three well-known untargeted adversarial approaches. We evaluated the performance of all models during adversarial attacks and no-attack scenarios. Our HCQ-DL models maintain accuracy above 95\% during a no-attack scenario and above 91\% for GA and FGSA attacks, which is higher than C-DL models. During the PGD attack, our alexnet-based HCQ-DL model maintained an accuracy of 85\% compared to C-DL models that achieved accuracies below 21\%. Our results highlight that the HCQ-DL models provide improved accuracy for traffic sign classification under adversarial settings compared to their classical counterparts.

Paperid: 1174, https://arxiv.org/pdf/2504.02149.pdf

Abstract:
The growing use of smart home devices poses considerable privacy and security challenges, especially for individuals like migrant domestic workers (MDWs) who may be surveilled by their employers. This paper explores the privacy and security challenges experienced by MDWs in multi-user smart homes through in-depth semi-structured interviews with 26 MDWs and 5 staff members of agencies that recruit and/or train domestic workers in China. Our findings reveal that the relationships between MDWs, their employers, and agencies are characterized by significant power imbalances, influenced by Chinese cultural and social factors (such as Confucianism and collectivism), as well as legal ones. Furthermore, the widespread and normalized use of surveillance technologies in China, particularly in public spaces, exacerbates these power imbalances, reinforcing a sense of constant monitoring and control. Drawing on our findings, we provide recommendations to domestic worker agencies and policymakers to address the privacy and security challenges facing MDWs in Chinese smart homes.

Paperid: 1175, https://arxiv.org/pdf/2504.00694.pdf

Abstract:
Large Language Models (LLMs) have demonstrated strong capabilities in various code intelligence tasks. However, their effectiveness for Android malware analysis remains underexplored. Decompiled Android malware code presents unique challenges for analysis, due to the malicious logic being buried within a large number of functions and the frequent lack of meaningful function names. This paper presents CAMA, a benchmarking framework designed to systematically evaluate the effectiveness of Code LLMs in Android malware analysis. CAMA specifies structured model outputs to support key malware analysis tasks, including malicious function identification and malware purpose summarization. Built on these, it integrates three domain-specific evaluation metrics (consistency, fidelity, and semantic relevance), enabling rigorous stability and effectiveness assessment and cross-model comparison. We construct a benchmark dataset of 118 Android malware samples from 13 families collected in recent years, encompassing over 7.5 million distinct functions, and use CAMA to evaluate four popular open-source Code LLMs. Our experiments provide insights into how Code LLMs interpret decompiled code and quantify the sensitivity to function renaming, highlighting both their potential and current limitations in malware analysis.

Paperid: 1176, https://arxiv.org/pdf/2504.00035.pdf

Abstract:
In-Context Learning (ICL) and efficient fine-tuning methods significantly enhanced the efficiency of applying Large Language Models (LLMs) to downstream tasks. However, they also raise concerns about the imitation and infringement of personal creative data. Current methods for data copyright protection primarily focuses on content security but lacks effectiveness in protecting the copyrights of text styles. In this paper, we introduce a novel implicit zero-watermarking scheme, namely MiZero. This scheme establishes a precise watermark domain to protect the copyrighted style, surpassing traditional watermarking methods that distort the style characteristics. Specifically, we employ LLMs to extract condensed-lists utilizing the designed instance delimitation mechanism. These lists guide MiZero in generating the watermark. Extensive experiments demonstrate that MiZero effectively verifies text style copyright ownership against AI imitation.

Paperid: 1177, https://arxiv.org/pdf/2503.21126.pdf

Abstract:
Oblivious RAM (ORAM) allows a client to securely retrieve elements from outsourced servers without leakage about the accessed elements or their virtual addresses. Two-server ORAM, designed for secure two-party RAM computation, stores data across two non-colluding servers. However, many two-server ORAM schemes suffer from excessive local storage or high bandwidth costs. To serve lightweight clients, it is crucial for ORAM to achieve concretely efficient bandwidth while maintaining O(1) local storage. Hence, this paper presents two new client-friendly two-server ORAM schemes that achieve practical logarithmic bandwidth under O(1) local storage, while incurring linear symmetric key computations. The core design features a hierarchical structure and a pairwise-area setting for the elements and their tags. Accordingly, we specify efficient read-only and write-only private information retrieval (PIR) algorithms in our schemes to ensure obliviousness in accessing two areas respectively, so as to avoid the necessity of costly shuffle techniques in previous works. We empirically evaluate our schemes against LO13 (TCC'13), AFN17 (PKC'17), and KM19 (PKC'19) in terms of both bandwidth and time cost. The results demonstrate that our schemes reduce bandwidth by approximately 2-4x compared to LO13, and by 16-64x compared to AFN17 and KM19. For a database of size 2^14 blocks, our schemes are over 64x faster than KM19, while achieving similar performance to LO13 and AFN17 in the WAN setting, with a latency of around 1 second.

Paperid: 1178, https://arxiv.org/pdf/2503.17577.pdf

Abstract:
Deepfakes have become a universal and rapidly intensifying concern of generative AI across various media types such as images, audio, and videos. Among these, audio deepfakes have been of particular concern due to the ease of high-quality voice synthesis and distribution via platforms such as social media and robocalls. Consequently, detecting audio deepfakes plays a critical role in combating the growing misuse of AI-synthesized speech. However, real-world scenarios often introduce various audio corruptions, such as noise, modification, and compression, that may significantly impact detection performance. This work systematically evaluates the robustness of 10 audio deepfake detection models against 16 common corruptions, categorized into noise perturbation, audio modification, and compression. Using both traditional deep learning models and state-of-the-art foundation models, we make four unique observations. First, our findings show that while most models demonstrate strong robustness to noise, they are notably more vulnerable to modifications and compression, especially when neural codecs are applied. Second, speech foundation models generally outperform traditional models across most scenarios, likely due to their self-supervised learning paradigm and large-scale pre-training. Third, our results show that increasing model size improves robustness, albeit with diminishing returns. Fourth, we demonstrate how targeted data augmentation during training can enhance model resilience to unseen perturbations. A case study on political speech deepfakes highlights the effectiveness of foundation models in achieving high accuracy under real-world conditions. These findings emphasize the importance of developing more robust detection frameworks to ensure reliability in practical deployment settings.

Paperid: 1179, https://arxiv.org/pdf/2503.14284.pdf

Abstract:
Graph-based Network Intrusion Detection Systems (GNIDS) have gained significant momentum in detecting sophisticated cyber-attacks, such as Advanced Persistent Threats (APTs), within and across organizational boundaries. Though achieving satisfying detection accuracy and demonstrating adaptability to ever-changing attacks and normal patterns, existing GNIDS predominantly assume a centralized data setting. However, flexible data collection is not always realistic or achievable due to increasing constraints from privacy regulations and operational limitations. We argue that the practical development of GNIDS requires accounting for distributed collection settings and we leverage Federated Learning (FL) as a viable paradigm to address this prominent challenge. We observe that naively applying FL to GNIDS is unlikely to be effective, due to issues like graph heterogeneity over clients and the diverse design choices taken by different GNIDS. We address these issues with a set of novel techniques tailored to the graph datasets, including reference graph synthesis, graph sketching and adaptive contribution scaling, eventually developing a new system Entente. By leveraging the domain knowledge, Entente can achieve effectiveness, scalability and robustness simultaneously. Empirical evaluation on the large-scale LANL, OpTC and Pivoting datasets shows that Entente outperforms the SOTA FL baselines. We also evaluate Entente under FL poisoning attacks tailored to the GNIDS setting, showing the robustness by bounding the attack success rate to low values. Overall, our study suggests a promising direction to build cross-silo GNIDS.

Paperid: 1180, https://arxiv.org/pdf/2503.10081.pdf

Abstract:
The outstanding capability of diffusion models in generating high-quality images poses significant threats when misused by adversaries. In particular, we assume malicious adversaries exploiting diffusion models for inpainting tasks, such as replacing a specific region with a celebrity. While existing methods for protecting images from manipulation in diffusion-based generative models have primarily focused on image-to-image and text-to-image tasks, the challenge of preventing unauthorized inpainting has been rarely addressed, often resulting in suboptimal protection performance. To mitigate inpainting abuses, we propose ADVPAINT, a novel defensive framework that generates adversarial perturbations that effectively disrupt the adversary's inpainting tasks. ADVPAINT targets the self- and cross-attention blocks in a target diffusion inpainting model to distract semantic understanding and prompt interactions during image generation. ADVPAINT also employs a two-stage perturbation strategy, dividing the perturbation region based on an enlarged bounding box around the object, enhancing robustness across diverse masks of varying shapes and sizes. Our experimental results demonstrate that ADVPAINT's perturbations are highly effective in disrupting the adversary's inpainting tasks, outperforming existing methods; ADVPAINT attains over a 100-point increase in FID and substantial decreases in precision.

Paperid: 1181, https://arxiv.org/pdf/2503.05727.pdf

Abstract:
Cybergrooming exploits minors through online trust-building, yet research remains fragmented, limiting holistic prevention. Social sciences focus on behavioral insights, while computational methods emphasize detection, but their integration remains insufficient. This review systematically synthesizes both fields using the PRISMA framework to enhance clarity, reproducibility, and cross-disciplinary collaboration. Findings show that qualitative methods offer deep insights but are resource-intensive, machine learning models depend on data quality, and standard metrics struggle with imbalance and cultural nuances. By bridging these gaps, this review advances interdisciplinary cybergrooming research, guiding future efforts toward more effective prevention and detection strategies.

Paperid: 1182, https://arxiv.org/pdf/2503.04853.pdf

Abstract:
For the first time, we unveil discernible temporal (or historical) trajectory imprints resulting from adversarial example (AE) attacks. Standing in contrast to existing studies all focusing on spatial (or static) imprints within the targeted underlying victim models, we present a fresh temporal paradigm for understanding these attacks. Of paramount discovery is that these imprints are encapsulated within a single loss metric, spanning universally across diverse tasks such as classification and regression, and modalities including image, text, and audio. Recognizing the distinct nature of loss between adversarial and clean examples, we exploit this temporal imprint for AE detection by proposing TRAIT (TRaceable Adversarial temporal trajectory ImprinTs). TRAIT operates under minimal assumptions without prior knowledge of attacks, thereby framing the detection challenge as a one-class classification problem. However, detecting AEs is still challenged by significant overlaps between the constructed synthetic losses of adversarial and clean examples due to the absence of ground truth for incoming inputs. TRAIT addresses this challenge by converting the synthetic loss into a spectrum signature, using the technique of Fast Fourier Transform to highlight the discrepancies, drawing inspiration from the temporal nature of the imprints, analogous to time-series signals. Across 12 AE attacks including SMACK (USENIX Sec'2023), TRAIT demonstrates consistent outstanding performance across comprehensively evaluated modalities, tasks, datasets, and model architectures. In all scenarios, TRAIT achieves an AE detection accuracy exceeding 97%, often around 99%, while maintaining a false rejection rate of 1%. TRAIT remains effective under the formulated strong adaptive attacks.

Paperid: 1183, https://arxiv.org/pdf/2503.04473.pdf

Abstract:
Federated learning (FL), as a powerful learning paradigm, trains a shared model by aggregating model updates from distributed clients. However, the decoupling of model learning from local data makes FL highly vulnerable to backdoor attacks, where a single compromised client can poison the shared model. While recent progress has been made in backdoor detection, existing methods face challenges with detection accuracy and runtime effectiveness, particularly when dealing with complex model architectures. In this work, we propose a novel approach to detecting malicious clients in an accurate, stable, and efficient manner. Our method utilizes a sampling-based network representation method to quantify dissimilarities between clients, identifying model deviations caused by backdoor injections. We also propose an iterative algorithm to progressively detect and exclude malicious clients as outliers based on these dissimilarity measurements. Evaluations across a range of benchmark tasks demonstrate that our approach outperforms state-of-the-art methods in detection accuracy and defense effectiveness. When deployed for runtime protection, our approach effectively eliminates backdoor injections with marginal overheads.

Paperid: 1184, https://arxiv.org/pdf/2503.04036.pdf

Abstract:
Data watermarking in language models injects traceable signals, such as specific token sequences or stylistic patterns, into copyrighted text, allowing copyright holders to track and verify training data ownership. Previous data watermarking techniques primarily focus on effective memorization during pretraining, while overlooking challenges that arise in other stages of the LLM lifecycle, such as the risk of watermark filtering during data preprocessing and verification difficulties due to API-only access. To address these challenges, we propose a novel data watermarking approach that injects plausible yet fictitious knowledge into training data using generated passages describing a fictitious entity and its associated attributes. Our watermarks are designed to be memorized by the LLM through seamlessly integrating in its training data, making them harder to detect lexically during preprocessing. We demonstrate that our watermarks can be effectively memorized by LLMs, and that increasing our watermarks' density, length, and diversity of attributes strengthens their memorization. We further show that our watermarks remain effective after continual pretraining and supervised finetuning. Finally, we show that our data watermarks can be evaluated even under API-only access via question answering.

Paperid: 1185, https://arxiv.org/pdf/2502.20477.pdf

Abstract:
In the last years, especially since the COVID-19 pandemic, precision medicine platforms emerged as useful tools for supporting new tests like the ones that detect the presence of antibodies and antigens with better sensitivity and specificity than traditional methods. In addition, the pandemic has also influenced the way people interact (decentralization), behave (digital world) and purchase health services (online). Moreover, there is a growing concern in the way health data are managed, especially in terms of privacy. To tackle such issues, this article presents a sustainable direct-to-consumer health-service open-source platform called HELENE that is supported by blockchain and by a novel decentralized oracle that protects patient data privacy. Specifically, HELENE enables health test providers to compete through auctions, allowing patients to bid for their services and to keep the control over their health test results. Moreover, data exchanges among the involved stakeholders can be performed in a trustworthy, transparent and standardized way to ease software integration and to avoid incompatibilities. After providing a thorough description of the platform, the proposed health platform is assessed in terms of smart contract performance. In addition, the response time of the developed oracle is evaluated and NIST SP 800-22 tests are executed to demonstrate the adequacy of the devised random number generator. Thus, this article shows the capabilities and novel propositions of HELENE for delivering health services providing an open-source platform for future researchers, who can enhance it and adapt it to their needs.

Paperid: 1186, https://arxiv.org/pdf/2502.18520.pdf

Abstract:
Recent studies have highlighted the vulnerability of deep neural networks to backdoor attacks, where models are manipulated to rely on embedded triggers within poisoned samples, despite the presence of both benign and trigger information. While several defense methods have been proposed, they often struggle to balance backdoor mitigation with maintaining benign performance.In this work, inspired by the concept of optical polarizer-which allows light waves of specific polarizations to pass while filtering others-we propose a lightweight backdoor defense approach, NPD. This method integrates a neural polarizer (NP) as an intermediate layer within the compromised model, implemented as a lightweight linear transformation optimized via bi-level optimization. The learnable NP filters trigger information from poisoned samples while preserving benign content. Despite its effectiveness, we identify through empirical studies that NPD's performance degrades when the target labels (required for purification) are inaccurately estimated. To address this limitation while harnessing the potential of targeted adversarial mitigation, we propose class-conditional neural polarizer-based defense (CNPD). The key innovation is a fusion module that integrates the backdoored model's predicted label with the features to be purified. This architecture inherently mimics targeted adversarial defense mechanisms without requiring label estimation used in NPD. We propose three implementations of CNPD: the first is r-CNPD, which trains a replicated NP layer for each class and, during inference, selects the appropriate NP layer for defense based on the predicted class from the backdoored model. To efficiently handle a large number of classes, two variants are designed: e-CNPD, which embeds class information as additional features, and a-CNPD, which directs network attention using class information.

Paperid: 1187, https://arxiv.org/pdf/2502.15020.pdf

Abstract:
As deep learning gains popularity, edge IoT devices have seen proliferating deployment of pre-trained Deep Neural Network (DNN) models. These DNNs represent valuable intellectual property and face significant confidentiality threats from side-channel analysis (SCA), particularly non-invasive Differential Electromagnetic (EM) Analysis (DEMA), which retrieves individual model parameters from EM traces collected during model inference. Traditional SCA mitigation methods, such as masking and shuffling, can still be applied to DNN inference, but will incur significant performance degradation due to the large volume of operations and parameters. Based on the insight that DNN models have high redundancy and are robust to input variation, we introduce MACPruning, a novel lightweight defense against DEMA-based parameter extraction attacks, exploiting specific characteristics of DNN execution. The design principle of MACPruning is to randomly deactivate input pixels and prune the operations (typically multiply-accumulate-MAC) on those pixels. The technique removes certain leakages and overall redistributes weight-dependent EM leakages temporally, and thus effectively mitigates DEMA. To maintain DNN performance, we propose an importance-aware pixel map that preserves critical input pixels, keeping randomness in the defense while minimizing its impact on DNN performance due to operation pruning. We conduct a comprehensive security analysis of MACPruning on various datasets for DNNs on edge devices. Our evaluations demonstrate that MACPruning effectively reduces EM leakages with minimal impact on the model accuracy and negligible computational overhead.

Paperid: 1188, https://arxiv.org/pdf/2502.15012.pdf

Abstract:
Wide deployment of machine learning models on edge devices has rendered the model intellectual property (IP) and data privacy vulnerable. We propose GNNVault, the first secure Graph Neural Network (GNN) deployment strategy based on Trusted Execution Environment (TEE). GNNVault follows the design of 'partition-before-training' and includes a private GNN rectifier to complement with a public backbone model. This way, both critical GNN model parameters and the private graph used during inference are protected within secure TEE compartments. Real-world implementations with Intel SGX demonstrate that GNNVault safeguards GNN inference against state-of-the-art link stealing attacks with negligible accuracy degradation (<2%).

Paperid: 1189, https://arxiv.org/pdf/2502.09726.pdf

Abstract:
The DNS (Domain Name System) protocol has been in use since the early days of the Internet. Although DNS as a de facto networking protocol had no security considerations in its early years, there have been many security enhancements, such as DNSSec (Domain Name System Security Extensions), DoT (DNS over Transport Layer Security), DoH (DNS over HTTPS) and DoQ (DNS over QUIC). With all these security improvements, it is not yet clear what resource-constrained Internet-of-Things (IoT) devices should be used for robustness. In this paper, we investigate different DNS security approaches using an edge DNS resolver implemented as a Virtual Network Function (VNF) to replicate the impact of the protocol from an IoT perspective and compare their performances under different conditions. We present our results for cache-based and non-cached responses and evaluate the corresponding security benefits. Our results and framework can greatly help consumers, manufacturers, and the research community decide and implement their DNS protocols depending on the given dynamic network conditions and enable robust Internet access via DNS for different devices.

Paperid: 1190, https://arxiv.org/pdf/2502.09117.pdf

Abstract:
Low-code development frameworks for IoT platforms offer a simple drag-and-drop mechanism to create applications for the billions of existing IoT devices without the need for extensive programming knowledge. The security of such software is crucial given the close integration of IoT devices in many highly sensitive areas such as healthcare or home automation. Node-RED is such a framework, where applications are built from nodes that are contributed by open-source developers. Its reliance on unvetted open-source contributions and lack of security checks raises the concern that the applications could be vulnerable to attacks, thereby imposing a security risk to end users. The low-code approach suggests, that many users could lack the technical knowledge to mitigate, understand, or even realize such security concerns. This paper focuses on "hidden" information flows in Node-RED nodes, meaning flows that are not captured by the specifications. They could (unknowingly or with malicious intent) cause leaks of sensitive information to unauthorized entities. We report the results of a conformance analysis of all nodes in the Node-RED framework, for which we compared the numbers of specified inputs and outputs of each node against the number of sources and sinks detected with CodeQL. The results show, that 55% of all nodes exhibit more possible flows than are specified. A risk assessment of a subset of the nodes showed, that 28% of them are associated with a high severity and 36% with a medium severity rating.

Paperid: 1191, https://arxiv.org/pdf/2501.12292.pdf

Abstract:
Existing countermeasures for hardware IP protection, such as obfuscation, camouflaging, and redaction, aim to defend against confidentiality and integrity attacks. However, within the current threat model, these techniques overlook the potential risks posed by a highly skilled adversary with privileged access to the IC supply chain, who may be familiar with critical IP blocks and the countermeasures implemented in the design. To address this scenario, we introduce Library-Attack, a novel reverse engineering technique that leverages privileged design information and prior knowledge of security countermeasures to recover sensitive hardware IP. During Library-Attack, a privileged attacker uses known design features to curate a design library of candidate IPs and employs structural comparison metrics from commercial EDA tools to identify the closest match. We evaluate Library-Attack on transformed ISCAS89 benchmarks to demonstrate potential vulnerabilities in existing IP-level countermeasures and propose an updated threat model to incorporate them.

Paperid: 1192, https://arxiv.org/pdf/2501.08799.pdf

Abstract:
This study highlights the potential of ChatGPT (specifically GPT-4o) as a competitive alternative for Face Presentation Attack Detection (PAD), outperforming several PAD models, including commercial solutions, in specific scenarios. Our results show that GPT-4o demonstrates high consistency, particularly in few-shot in-context learning, where its performance improves as more examples are provided (reference data). We also observe that detailed prompts enable the model to provide scores reliably, a behavior not observed with concise prompts. Additionally, explanation-seeking prompts slightly enhance the model's performance by improving its interpretability. Remarkably, the model exhibits emergent reasoning capabilities, correctly predicting the attack type (print or replay) with high accuracy in few-shot scenarios, despite not being explicitly instructed to classify attack types. Despite these strengths, GPT-4o faces challenges in zero-shot tasks, where its performance is limited compared to specialized PAD systems. Experiments were conducted on a subset of the SOTERIA dataset, ensuring compliance with data privacy regulations by using only data from consenting individuals. These findings underscore GPT-4o's promise in PAD applications, laying the groundwork for future research to address broader data privacy concerns and improve cross-dataset generalization. Code available here: https://gitlab.idiap.ch/bob/bob.paper.wacv2025_chatgpt_face_pad

Paperid: 1193, https://arxiv.org/pdf/2501.01664.pdf

Abstract:
The integration of Internet of Things (IoT) technology in various domains has led to operational advancements, but it has also introduced new vulnerabilities to cybersecurity threats, as evidenced by recent widespread cyberattacks on IoT devices. Intrusion detection systems are often reactive, triggered by specific patterns or anomalies observed within the network. To address this challenge, this work proposes a proactive approach to anticipate and preemptively mitigate malicious activities, aiming to prevent potential damage before it occurs. This paper proposes an innovative intrusion prediction framework empowered by Pre-trained Large Language Models (LLMs). The framework incorporates two LLMs: a fine-tuned Bidirectional and AutoRegressive Transformers (BART) model for predicting network traffic and a fine-tuned Bidirectional Encoder Representations from Transformers (BERT) model for evaluating the predicted traffic. By harnessing the bidirectional capabilities of BART the framework then identifies malicious packets among these predictions. Evaluated using the CICIoT2023 IoT attack dataset, our framework showcases a notable enhancement in predictive performance, attaining an impressive 98% overall accuracy, providing a powerful response to the cybersecurity challenges that confront IoT networks.

Paperid: 1194, https://arxiv.org/pdf/2506.22727.pdf

Abstract:
Differential privacy (DP) has been integrated into graph neural networks (GNNs) to protect sensitive structural information, e.g., edges, nodes, and associated features across various applications. A common approach is to perturb the message-passing process, which forms the core of most GNN architectures. However, existing methods typically incur a privacy cost that grows linearly with the number of layers (Usenix Security'23), ultimately requiring excessive noise to maintain a reasonable privacy level. This limitation becomes particularly problematic when deep GNNs are necessary to capture complex and long-range interactions in graphs. In this paper, we theoretically establish that the privacy budget can converge with respect to the number of layers by applying privacy amplification techniques to the message-passing process, exploiting the contractive properties inherent to standard GNN operations. Motivated by this analysis, we propose a simple yet effective Contractive Graph Layer (CGL) that ensures the contractiveness required for theoretical guarantees while preserving model utility. Our framework, CARIBOU, supports both training and inference, equipped with a contractive aggregation module, a privacy allocation module, and a privacy auditing module. Experimental evaluations demonstrate that CARIBOU significantly improves the privacy-utility trade-off and achieves superior performance in privacy auditing tasks.

Paperid: 1195, https://arxiv.org/pdf/2506.21897.pdf

Abstract:
The 3D printing industry is rapidly growing and increasingly adopted across various sectors including manufacturing, healthcare, and defense. However, the operational setup often involves hazardous environments, necessitating remote monitoring through cameras and other sensors, which opens the door to cyber-based attacks. In this paper, we show that an adversary with access to video recordings of the 3D printing process can reverse engineer the underlying 3D print instructions. Our model tracks the printer nozzle movements during the printing process and maps the corresponding trajectory into G-code instructions. Further, it identifies the correct parameters such as feed rate and extrusion rate, enabling successful intellectual property theft. To validate this, we design an equivalence checker that quantitatively compares two sets of 3D print instructions, evaluating their similarity in producing objects alike in shape, external appearance, and internal structure. Unlike simple distance-based metrics such as normalized mean square error, our equivalence checker is both rotationally and translationally invariant, accounting for shifts in the base position of the reverse engineered instructions caused by different camera positions. Our model achieves an average accuracy of 90.87 percent and generates 30.20 percent fewer instructions compared to existing methods, which often produce faulty or inaccurate prints. Finally, we demonstrate a fully functional counterfeit object generated by reverse engineering 3D print instructions from video.

Paperid: 1196, https://arxiv.org/pdf/2506.19635.pdf

Abstract:
For more than a decade now, academicians and online platform administrators have been studying solutions to the problem of bot detection. Bots are computer algorithms whose use is far from being benign: malicious bots are purposely created to distribute spam, sponsor public characters and, ultimately, induce a bias within the public opinion. To fight the bot invasion on our online ecosystem, several approaches have been implemented, mostly based on (supervised and unsupervised) classifiers, which adopt the most varied account features, from the simplest to the most expensive ones to be extracted from the raw data obtainable through the Twitter public APIs. In this exploratory study, using Twitter as a benchmark, we compare the performances of four state-of-art feature sets in detecting novel bots: one of the output scores of the popular bot detector Botometer, which considers more than 1,000 features of an account to take a decision; two feature sets based on the account profile and timeline; and the information about the Twitter client from which the user tweets. The results of our analysis, conducted on six recently released datasets of Twitter accounts, hint at the possible use of general-purpose classifiers and cheap-to-compute account features for the detection of evolved bots.

Paperid: 1197, https://arxiv.org/pdf/2506.17279.pdf

Abstract:
Knowledge erasure in large language models (LLMs) is important for ensuring compliance with data and AI regulations, safeguarding user privacy, mitigating bias, and misinformation. Existing unlearning methods aim to make the process of knowledge erasure more efficient and effective by removing specific knowledge while preserving overall model performance, especially for retained information. However, it has been observed that the unlearning techniques tend to suppress and leave the knowledge beneath the surface, thus making it retrievable with the right prompts. In this work, we demonstrate that \textit{step-by-step reasoning} can serve as a backdoor to recover this hidden information. We introduce a step-by-step reasoning-based black-box attack, Sleek, that systematically exposes unlearning failures. We employ a structured attack framework with three core components: (1) an adversarial prompt generation strategy leveraging step-by-step reasoning built from LLM-generated queries, (2) an attack mechanism that successfully recalls erased content, and exposes unfair suppression of knowledge intended for retention and (3) a categorization of prompts as direct, indirect, and implied, to identify which query types most effectively exploit unlearning weaknesses. Through extensive evaluations on four state-of-the-art unlearning techniques and two widely used LLMs, we show that existing approaches fail to ensure reliable knowledge removal. Of the generated adversarial prompts, 62.5% successfully retrieved forgotten Harry Potter facts from WHP-unlearned Llama, while 50% exposed unfair suppression of retained knowledge. Our work highlights the persistent risks of information leakage, emphasizing the need for more robust unlearning strategies for erasure.

Paperid: 1198, https://arxiv.org/pdf/2506.16699.pdf

Abstract:
Intelligent Transportation Systems (ITS) are increasingly vulnerable to sophisticated cyberattacks due to their complex, interconnected nature. Ensuring the cybersecurity of these systems is paramount to maintaining road safety and minimizing traffic disruptions. This study presents a novel multi-agent framework leveraging Large Language Models (LLMs) to enhance traffic simulation and cybersecurity testing. The framework automates the creation of traffic scenarios, the design of cyberattack strategies, and the development of defense mechanisms. A case study demonstrates the framework's ability to simulate a cyberattack targeting connected vehicle broadcasts, evaluate its impact, and implement a defense mechanism that significantly mitigates traffic delays. Results show a 10.2 percent increase in travel time during an attack, which is reduced by 3.3 percent with the defense strategy. This research highlights the potential of LLM-driven multi-agent systems in advancing transportation cybersecurity and offers a scalable approach for future research in traffic simulation and cyber defense.

Paperid: 1199, https://arxiv.org/pdf/2506.15842.pdf

Abstract:
Maritime systems, including ships and ports, are critical components of global infrastructure, essential for transporting over 80% of the world's goods and supporting internet connectivity. However, these systems face growing cybersecurity threats, as shown by recent attacks disrupting Maersk, one of the world's largest shipping companies, causing widespread impacts on international trade. The unique challenges of the maritime environment--such as diverse operational conditions, extensive physical access points, fragmented regulatory frameworks, and its deeply interconnected structure--require maritime-specific cybersecurity research. Despite the sector's importance, maritime cybersecurity remains underexplored, leaving significant gaps in understanding its challenges and risks. To address these gaps, we investigate how maritime system operators perceive and navigate cybersecurity challenges within this complex landscape. We conducted a user study comprising surveys and semi-structured interviews with 21 officer-level mariners. Participants reported direct experiences with shipboard cyber-attacks, including GPS spoofing and logistics-disrupting ransomware, demonstrating the real-world impact of these threats. Our findings reveal systemic and human-centric issues, such as training poorly aligned with maritime needs, insufficient detection and response tools, and serious gaps in mariners' cybersecurity understanding. Our contributions include a categorization of threats identified by mariners and recommendations for improving maritime security, including better training, response protocols, and regulation. These insights aim to guide future research and policy to strengthen the resilience of maritime systems.

Paperid: 1200, https://arxiv.org/pdf/2506.15295.pdf

Abstract:
Lending protocols are one of the main applications of Decentralized Finance (DeFi), enabling crypto-assets loan markets with a total value estimated in the tens of billions of dollars. Unlike traditional lending systems, these protocols operate without relying on trusted authorities or off-chain enforcement mechanisms. To achieve key economic goals such as stability of the loan market, they devise instead trustless on-chain mechanisms, such as rewarding liquidators who repay the loans of under-collateralized borrowers by awarding them part of the borrower's collateral. The complexity of these incentive mechanisms, combined with their entanglement in low-level implementation details, makes it challenging to precisely assess the structural and economic properties of lending protocols, as well as to analyze user strategies and attacks. Crucially, since participation is open to anyone, any weaknesses in the incentive mechanism may give rise to unintended emergent behaviours, or even enable adversarial strategies aimed at making profits to the detriment of legit users, or at undermining the stability of the protocol. In this work, we propose a formal model of lending protocols that captures the essential features of mainstream platforms, enabling us to identify and prove key properties related to their economic and strategic dynamics.

Paperid: 1201, https://arxiv.org/pdf/2506.13590.pdf

Abstract:
As multi-agent systems evolve to encompass increasingly diverse and specialized agents, the challenge of enabling effective collaboration between heterogeneous agents has become paramount, with traditional agent communication protocols often assuming homogeneous environments or predefined interaction patterns that limit their applicability in dynamic, open-world scenarios. This paper presents the Agent Capability Negotiation and Binding Protocol (ACNBP), a novel framework designed to facilitate secure, efficient, and verifiable interactions between agents in heterogeneous multi-agent systems through integration with an Agent Name Service (ANS) infrastructure that provides comprehensive discovery, negotiation, and binding mechanisms. The protocol introduces a structured 10-step process encompassing capability discovery, candidate pre-screening and selection, secure negotiation phases, and binding commitment with built-in security measures including digital signatures, capability attestation, and comprehensive threat mitigation strategies, while a key innovation of ACNBP is its protocolExtension mechanism that enables backward-compatible protocol evolution and supports diverse agent architectures while maintaining security and interoperability. We demonstrate ACNBP's effectiveness through a comprehensive security analysis using the MAESTRO threat modeling framework, practical implementation considerations, and a detailed example showcasing the protocol's application in a document translation scenario, with the protocol addressing critical challenges in agent autonomy, capability verification, secure communication, and scalable agent ecosystem management.

Paperid: 1202, https://arxiv.org/pdf/2506.13323.pdf

Abstract:
Disassembly is a crucial yet challenging step in binary analysis. While emerging neural disassemblers show promise for efficiency and accuracy, they frequently generate outputs violating fundamental structural constraints, which significantly compromise their practical usability. To address this critical problem, we regularize the disassembly solution space by formalizing and applying key structural constraints based on post-dominance relations. This approach systematically detects widespread errors in existing neural disassemblers' outputs. These errors often originate from models' limited context modeling and instruction-level decoding that neglect global structural integrity. We introduce Tady, a novel neural disassembler featuring an improved model architecture and a dedicated post-processing algorithm, specifically engineered to address these deficiencies. Comprehensive evaluations on diverse binaries demonstrate that Tady effectively eliminates structural constraint violations and functions with high efficiency, while maintaining instruction-level accuracy.

Paperid: 1203, https://arxiv.org/pdf/2506.10949.pdf

Abstract:
Current LLM safety defenses fail under decomposition attacks, where a malicious goal is decomposed into benign subtasks that circumvent refusals. The challenge lies in the existing shallow safety alignment techniques: they only detect harm in the immediate prompt and do not reason about long-range intent, leaving them blind to malicious intent that emerges over a sequence of seemingly benign instructions. We therefore propose adding an external monitor that observes the conversation at a higher granularity. To facilitate our study of monitoring decomposition attacks, we curate the largest and most diverse dataset to date, including question-answering, text-to-image, and agentic tasks. We verify our datasets by testing them on frontier LLMs and show an 87% attack success rate on average on GPT-4o. This confirms that decomposition attack is broadly effective. Additionally, we find that random tasks can be injected into the decomposed subtasks to further obfuscate malicious intents. To defend in real time, we propose a lightweight sequential monitoring framework that cumulatively evaluates each subtask. We show that a carefully prompt engineered lightweight monitor achieves a 93% defense success rate, beating reasoning models like o3 mini as a monitor. Moreover, it remains robust against random task injection and cuts cost by 90% and latency by 50%. Our findings suggest that lightweight sequential monitors are highly effective in mitigating decomposition attacks and are viable in deployment.

Paperid: 1204, https://arxiv.org/pdf/2506.10024.pdf

Abstract:
Large Language Models (LLMs) memorize, and thus, among huge amounts of uncontrolled data, may memorize Personally Identifiable Information (PII), which should not be stored and, consequently, not leaked. In this paper, we introduce Private Memorization Editing (PME), an approach for preventing private data leakage that turns an apparent limitation, that is, the LLMs' memorization ability, into a powerful privacy defense strategy. While attacks against LLMs have been performed exploiting previous knowledge regarding their training data, our approach aims to exploit the same kind of knowledge in order to make a model more robust. We detect a memorized PII and then mitigate the memorization of PII by editing a model knowledge of its training data. We verify that our procedure does not affect the underlying language model while making it more robust against privacy Training Data Extraction attacks. We demonstrate that PME can effectively reduce the number of leaked PII in a number of configurations, in some cases even reducing the accuracy of the privacy attacks to zero.

Paperid: 1205, https://arxiv.org/pdf/2506.09956.pdf

Abstract:
Indirect Prompt Injection attacks exploit the inherent limitation of Large Language Models (LLMs) to distinguish between instructions and data in their inputs. Despite numerous defense proposals, the systematic evaluation against adaptive adversaries remains limited, even when successful attacks can have wide security and privacy implications, and many real-world LLM-based applications remain vulnerable. We present the results of LLMail-Inject, a public challenge simulating a realistic scenario in which participants adaptively attempted to inject malicious instructions into emails in order to trigger unauthorized tool calls in an LLM-based email assistant. The challenge spanned multiple defense strategies, LLM architectures, and retrieval configurations, resulting in a dataset of 208,095 unique attack submissions from 839 participants. We release the challenge code, the full dataset of submissions, and our analysis demonstrating how this data can provide new insights into the instruction-data separation problem. We hope this will serve as a foundation for future research towards practical structural solutions to prompt injection.

Paperid: 1206, https://arxiv.org/pdf/2506.08055.pdf

Abstract:
As cloud environments become widespread, cybersecurity has emerged as a top priority across areas such as networks, communication, data privacy, response times, and availability. Various sectors, including industries, healthcare, and government, have recently faced cyberattacks targeting their computing systems. Ensuring secure app deployment in cloud environments requires substantial effort. With the growing interest in cloud security, conducting a systematic literature review (SLR) is critical to identifying research gaps. Continuous Software Engineering, which includes continuous integration (CI), delivery (CDE), and deployment (CD), is essential for software development and deployment. In our SLR, we reviewed 66 papers, summarising tools, approaches, and challenges related to the security of CI/CD in the cloud. We addressed key aspects of cloud security and CI/CD and reported on tools such as Harbor, SonarQube, and GitHub Actions. Challenges such as image manipulation, unauthorised access, and weak authentication were highlighted. The review also uncovered research gaps in how tools and practices address these security issues in CI/CD pipelines, revealing a need for further study to improve cloud-based security solutions.

Paperid: 1207, https://arxiv.org/pdf/2506.06003.pdf

Abstract:
Membership inference tests aim to determine whether a particular data point was included in a language model's training set. However, recent works have shown that such tests often fail under the strict definition of membership based on exact matching, and have suggested relaxing this definition to include semantic neighbors as members as well. In this work, we show that membership inference tests are still unreliable under this relaxation - it is possible to poison the training dataset in a way that causes the test to produce incorrect predictions for a target point. We theoretically reveal a trade-off between a test's accuracy and its robustness to poisoning. We also present a concrete instantiation of this poisoning attack and empirically validate its effectiveness. Our results show that it can degrade the performance of existing tests to well below random.

Paperid: 1208, https://arxiv.org/pdf/2506.04390.pdf

Abstract:
Retrieval-augmented generation (RAG) systems are vulnerable to attacks that inject poisoned passages into the retrieved set, even at low corruption rates. We show that existing attacks are not designed to be stealthy, allowing reliable detection and mitigation. We formalize stealth using a distinguishability-based security game. If a few poisoned passages are designed to control the response, they must differentiate themselves from benign ones, inherently compromising stealth. This motivates the need for attackers to rigorously analyze intermediate signals involved in generation$\unicode{x2014}$such as attention patterns or next-token probability distributions$\unicode{x2014}$to avoid easily detectable traces of manipulation. Leveraging attention patterns, we propose a passage-level score$\unicode{x2014}$the Normalized Passage Attention Score$\unicode{x2014}$used by our Attention-Variance Filter algorithm to identify and filter potentially poisoned passages. This method mitigates existing attacks, improving accuracy by up to $\sim 20 \%$ over baseline defenses. To probe the limits of attention-based defenses, we craft stealthier adaptive attacks that obscure such traces, achieving up to $35 \%$ attack success rate, and highlight the challenges in improving stealth.

Paperid: 1209, https://arxiv.org/pdf/2506.01900.pdf

Abstract:
The meteoric rise and proliferation of autonomous Large Language Model (LLM) agents promise significant capabilities across various domains. However, their deployment is increasingly constrained by substantial computational demands, specifically for Graphics Processing Unit (GPU) resources. This paper addresses the critical problem of optimizing resource utilization in LLM agent systems. We introduce COALESCE (Cost-Optimized and Secure Agent Labour Exchange via Skill-based Competence Estimation), a novel framework designed to enable autonomous LLM agents to dynamically outsource specific subtasks to specialized, cost-effective third-party LLM agents. The framework integrates mechanisms for hybrid skill representation, dynamic skill discovery, automated task decomposition, a unified cost model comparing internal execution costs against external outsourcing prices, simplified market-based decision-making algorithms, and a standardized communication protocol between LLM agents. Comprehensive validation through 239 theoretical simulations demonstrates 41.8\% cost reduction potential, while large-scale empirical validation across 240 real LLM tasks confirms 20.3\% cost reduction with proper epsilon-greedy exploration, establishing both theoretical viability and practical effectiveness. The emergence of proposed open standards like Google's Agent2Agent (A2A) protocol further underscores the need for frameworks like COALESCE that can leverage such standards for efficient agent interaction. By facilitating a dynamic market for agent capabilities, potentially utilizing protocols like A2A for communication, COALESCE aims to significantly reduce operational costs, enhance system scalability, and foster the emergence of specialized agent economies, making complex LLM agent functionalities more accessible and economically viable.

Paperid: 1210, https://arxiv.org/pdf/2506.01333.pdf

Abstract:
The Model Context Protocol (MCP) plays a crucial role in extending the capabilities of Large Language Models (LLMs) by enabling integration with external tools and data sources. However, the standard MCP specification presents significant security vulnerabilities, notably Tool Poisoning and Rug Pull attacks. This paper introduces the Enhanced Tool Definition Interface (ETDI), a security extension designed to fortify MCP. ETDI incorporates cryptographic identity verification, immutable versioned tool definitions, and explicit permission management, often leveraging OAuth 2.0. We further propose extending MCP with fine-grained, policy-based access control, where tool capabilities are dynamically evaluated against explicit policies using a dedicated policy engine, considering runtime context beyond static OAuth scopes. This layered approach aims to establish a more secure, trustworthy, and controllable ecosystem for AI applications interacting with LLMs and external tools.

Paperid: 1211, https://arxiv.org/pdf/2506.00790.pdf

Abstract:
Quantum computers threaten widely deployed cryptographic primitives such as RSA, DSA, and ECC. While NIST has released post-quantum cryptographic (PQC) standards (e.g., Kyber, Dilithium), mobile app ecosystems remain largely unprepared for this transition. We present a large-scale binary analysis of over 4,000 Android apps to assess cryptographic readiness. Our results show widespread reliance on quantum-vulnerable algorithms such as MD5, SHA-1, and RSA, while PQC adoption remains absent in production apps. To bridge the readiness gap, we explore LLM-assisted migration. We evaluate leading LLMs (GPT-4o, Gemini Flash, Claude Sonnet, etc.) for automated cryptographic migration. All models successfully performed simple hash replacements (e.g., SHA-1 to SHA-256). However, none produced correct PQC upgrades due to multi-file changes, missing imports, and lack of context awareness. These results underscore the need for structured guidance and system-aware tooling for post-quantum migration

Paperid: 1212, https://arxiv.org/pdf/2505.21703.pdf

Abstract:
Internet of Vehicles (IoV) systems, while offering significant advancements in transportation efficiency and safety, introduce substantial security vulnerabilities due to their highly interconnected nature. These dynamic systems produce massive amounts of data between vehicles, infrastructure, and cloud services and present a highly distributed framework with a wide attack surface. In considering network-centered attacks on IoV systems, attacks such as Denial-of-Service (DoS) can prohibit the communication of essential physical traffic safety information between system elements, illustrating that the security concerns for these systems go beyond the traditional confidentiality, integrity, and availability concerns of enterprise systems. Given the complexity and volume of data generated by IoV systems, traditional security mechanisms are often inadequate for accurately detecting sophisticated and evolving cyberattacks. Here, we present an unsupervised autoencoder method trained entirely on benign network data for the purpose of unseen attack detection in IoV networks. We leverage a weighted combination of reconstruction and triplet margin loss to guide the autoencoder training and develop a diverse representation of the benign training set. We conduct extensive experiments on recent network intrusion datasets from two different application domains, industrial IoT and home IoT, that represent the modern IoV task. We show that our method performs robustly for all unseen attack types, with roughly 99% accuracy on benign data and between 97% and 100% performance on anomaly data. We extend these results to show that our model is adaptable through the use of transfer learning, achieving similarly high results while leveraging domain features from one domain to another.

Paperid: 1213, https://arxiv.org/pdf/2505.21641.pdf

Abstract:
The average treatment effect (ATE) is widely used to evaluate the effectiveness of drugs and other medical interventions. In safety-critical applications like medicine, reliable inferences about the ATE typically require valid uncertainty quantification, such as through confidence intervals (CIs). However, estimating treatment effects in these settings often involves sensitive data that must be kept private. In this work, we present PrivATE, a novel machine learning framework for computing CIs for the ATE under differential privacy. Specifically, we focus on deriving valid privacy-preserving CIs for the ATE from observational data. Our PrivATE framework consists of three steps: (i) estimating a differentially private ATE through output perturbation; (ii) estimating the differentially private variance through a truncated output perturbation mechanism; and (iii) constructing the CIs while accounting for the uncertainty from both the estimation and privatization steps. Our PrivATE framework is model agnostic, doubly robust, and ensures valid CIs. We demonstrate the effectiveness of our framework using synthetic and real-world medical datasets. To the best of our knowledge, we are the first to derive a general, doubly robust framework for valid CIs of the ATE under ($\varepsilon$, $Î´$)-differential privacy.

Paperid: 1214, https://arxiv.org/pdf/2505.21568.pdf

Abstract:
Voice cloning (VC)-resistant watermarking is an emerging technique for tracing and preventing unauthorized cloning. Existing methods effectively trace traditional VC models by training them on watermarked audio but fail in zero-shot VC scenarios, where models synthesize audio from an audio prompt without training. To address this, we propose VoiceMark, the first zero-shot VC-resistant watermarking method that leverages speaker-specific latents as the watermark carrier, allowing the watermark to transfer through the zero-shot VC process into the synthesized audio. Additionally, we introduce VC-simulated augmentations and VAD-based loss to enhance robustness against distortions. Experiments on multiple zero-shot VC models demonstrate that VoiceMark achieves over 95% accuracy in watermark detection after zero-shot VC synthesis, significantly outperforming existing methods, which only reach around 50%. See our code and demos at: https://huggingface.co/spaces/haiyunli/VoiceMark

Paperid: 1215, https://arxiv.org/pdf/2505.19633.pdf

Abstract:
State-of-the-art solutions detect jamming attacks ex-post, i.e., only when jamming has already disrupted the wireless communication link. In many scenarios, e.g., mobile networks or static deployments distributed over a large geographical area, it is often desired to detect jamming at the early stage, when it affects the communication link enough to be detected but not sufficiently to disrupt it (detection of weak jamming signals). Under such assumptions, devices can enhance situational awareness and promptly apply mitigation, e.g., moving away from the jammed area in mobile scenarios or changing communication frequency in static deployments, before jamming fully disrupts the communication link. Although some contributions recently demonstrated the feasibility of detecting low-power and weak jamming signals, they make simplistic assumptions far from real-world deployments. Given the current state of the art, no evidence exists that detection of weak jamming can be considered with real-world communication technologies. In this paper, we provide and comprehensively analyze new general-purpose strategies for detecting weak jamming signals, compatible by design with one of the most relevant communication technologies used by commercial-off-the-shelf devices, i.e., IEEE 802.11. We describe two operational modes: (i) binary classification via Convolutional Neural Networks and (ii) one-class classification via Sparse Autoencoders. We evaluate and compare the proposed approaches with the current state-of-the-art using data collected through an extensive real-world experimental campaign in three relevant environments. At the same time, we made the dataset available to the public. Our results demonstrate that detecting weak jamming signals is feasible in all considered real-world environments, and we provide an in-depth analysis considering different techniques, scenarios, and mobility patterns.

Paperid: 1216, https://arxiv.org/pdf/2505.19301.pdf

Abstract:
Traditional Identity and Access Management (IAM) systems, primarily designed for human users or static machine identities via protocols such as OAuth, OpenID Connect (OIDC), and SAML, prove fundamentally inadequate for the dynamic, interdependent, and often ephemeral nature of AI agents operating at scale within Multi Agent Systems (MAS), a computational system composed of multiple interacting intelligent agents that work collectively. This paper posits the imperative for a novel Agentic AI IAM framework: We deconstruct the limitations of existing protocols when applied to MAS, illustrating with concrete examples why their coarse-grained controls, single-entity focus, and lack of context-awareness falter. We then propose a comprehensive framework built upon rich, verifiable Agent Identities (IDs), leveraging Decentralized Identifiers (DIDs) and Verifiable Credentials (VCs), that encapsulate an agents capabilities, provenance, behavioral scope, and security posture. Our framework includes an Agent Naming Service (ANS) for secure and capability-aware discovery, dynamic fine-grained access control mechanisms, and critically, a unified global session management and policy enforcement layer for real-time control and consistent revocation across heterogeneous agent communication protocols. We also explore how Zero-Knowledge Proofs (ZKPs) enable privacy-preserving attribute disclosure and verifiable policy compliance. We outline the architecture, operational lifecycle, innovative contributions, and security considerations of this new IAM paradigm, aiming to establish the foundational trust, accountability, and security necessary for the burgeoning field of agentic AI and the complex ecosystems they will inhabit.

Paperid: 1217, https://arxiv.org/pdf/2505.14549.pdf

Abstract:
Large language models (LLMs) are increasingly being used to protect sensitive user data. However, current LLM-based privacy solutions assume that these models can reliably detect personally identifiable information (PII), particularly named entities. In this paper, we challenge that assumption by revealing systematic failures in LLM-based privacy tasks. Specifically, we show that modern LLMs regularly overlook human names even in short text snippets due to ambiguous contexts, which cause the names to be misinterpreted or mishandled. We propose AMBENCH, a benchmark dataset of seemingly ambiguous human names, leveraging the name regularity bias phenomenon, embedded within concise text snippets along with benign prompt injections. Our experiments on modern LLMs tasked to detect PII as well as specialized tools show that recall of ambiguous names drops by 20--40% compared to more recognizable names. Furthermore, ambiguous human names are four times more likely to be ignored in supposedly privacy-preserving summaries generated by LLMs when benign prompt injections are present. These findings highlight the underexplored risks of relying solely on LLMs to safeguard user privacy and underscore the need for a more systematic investigation into their privacy failure modes.

Paperid: 1218, https://arxiv.org/pdf/2505.12453.pdf

Abstract:
Federated recommender system (FedRec) has emerged as a solution to protect user data through collaborative training techniques. A typical FedRec involves transmitting the full model and entire weight updates between edge devices and the server, causing significant burdens to devices with limited bandwidth and computational power. While the sparsity of embedding updates provides opportunity for payload optimization, existing sparsity-aware federated protocols generally sacrifice privacy for efficiency. A key challenge in designing a secure sparsity-aware efficient protocol is to protect the rated item indices from the server. In this paper, we propose a lossless secure recommender systems on sparse embedding updates (SecEmb). SecEmb reduces user payload while ensuring that the server learns no information about both rated item indices and individual updates except the aggregated model. The protocol consists of two correlated modules: (1) a privacy-preserving embedding retrieval module that allows users to download relevant embeddings from the server, and (2) an update aggregation module that securely aggregates updates at the server. Empirical analysis demonstrates that SecEmb reduces both download and upload communication costs by up to 90x and decreases user-side computation time by up to 70x compared with secure FedRec protocols. Additionally, it offers non-negligible utility advantages compared with lossy message compression methods.

Paperid: 1219, https://arxiv.org/pdf/2505.12128.pdf

Abstract:
Matrix factorization mechanisms for differentially private training have emerged as a promising approach to improve model utility under privacy constraints. In practical settings, models are typically trained over multiple epochs, requiring matrix factorizations that account for repeated participation. Existing theoretical upper and lower bounds on multi-epoch factorization error leave a significant gap. In this work, we introduce a new explicit factorization method, Banded Inverse Square Root (BISR), which imposes a banded structure on the inverse correlation matrix. This factorization enables us to derive an explicit and tight characterization of the multi-epoch error. We further prove that BISR achieves asymptotically optimal error by matching the upper and lower bounds. Empirically, BISR performs on par with state-of-the-art factorization methods, while being simpler to implement, computationally efficient, and easier to analyze.

Paperid: 1220, https://arxiv.org/pdf/2505.10815.pdf

Abstract:
This paper studies the problem of securing task offloading transmissions from ground users against ground eavesdropping threats. Our study introduces a reconfigurable intelligent surface (RIS)-aided unmanned aerial vehicle (UAV)-mobile edge computing (MEC) scheme to enhance the secure task offloading while minimizing the energy consumption of the UAV subject to task completion constraints. Leveraging a data-driven approach, we propose a comprehensive optimization strategy that jointly optimizes the aerial MEC (AMEC)'s trajectory, task offloading partitioning, UE transmission scheduling, and RIS phase shifts. Our objective centers on optimizing the secrecy energy efficiency (SEE) of UE task offloading transmissions while preserving the AMEC's energy resources and meeting the task completion time requirements. Numerical results show that the proposed solution can effectively safeguard legitimate task offloading transmissions while preserving AMEC energy.

Paperid: 1221, https://arxiv.org/pdf/2505.10812.pdf

Abstract:
Cellular networks require strict security procedures and measures across various network components, from core to radio access network (RAN) and end-user devices. As networks become increasingly complex and interconnected, as in O-RAN deployments, they are exposed to a numerous security threats. Therefore, ensuring robust security is critical for O-RAN to protect network integrity and safeguard user data. This requires rigorous testing methodologies to mitigate threats. This paper introduces an automated, adaptive, and scalable user equipment (UE) based RAN security testing framework designed to address the shortcomings of existing RAN testing solutions. Experimental results on a 5G software radio testbed built with commercial off-the-shelf hardware and open source software validate the efficiency and reproducibility of sample security test procedures developed on the RAN Tester UE framework.

Paperid: 1222, https://arxiv.org/pdf/2505.10609.pdf

Abstract:
The proliferation of AI agents requires robust mechanisms for secure discovery. This paper introduces the Agent Name Service (ANS), a novel architecture based on DNS addressing the lack of a public agent discovery framework. ANS provides a protocol-agnostic registry infrastructure that leverages Public Key Infrastructure (PKI) certificates for verifiable agent identity and trust. The architecture features several key innovations: a formalized agent registration and renewal mechanism for lifecycle management; DNS-inspired naming conventions with capability-aware resolution; a modular Protocol Adapter Layer supporting diverse communication standards (A2A, MCP, ACP etc.); and precisely defined algorithms for secure resolution. We implement structured communication using JSON Schema and conduct a comprehensive threat analysis of our proposal. The result is a foundational directory service addressing the core challenges of secured discovery and interaction in multi-agent systems, paving the way for future interoperable, trustworthy, and scalable agent ecosystems.

Paperid: 1223, https://arxiv.org/pdf/2505.07239.pdf

Abstract:
With the growing use of large language models (LLMs) hosted on cloud platforms to offer inference services, privacy concerns about the potential leakage of sensitive information are escalating. Secure multi-party computation (MPC) is a promising solution to protect the privacy in LLM inference. However, MPC requires frequent inter-server communication, causing high performance overhead. Inspired by the prevalent activation sparsity of LLMs, where most neuron are not activated after non-linear activation functions, we propose an efficient private inference system, Comet. This system employs an accurate and fast predictor to predict the sparsity distribution of activation function output. Additionally, we introduce a new private inference protocol. It efficiently and securely avoids computations involving zero values by exploiting the spatial locality of the predicted sparse distribution. While this computation-avoidance approach impacts the spatiotemporal continuity of KV cache entries, we address this challenge with a low-communication overhead cache refilling strategy that merges miss requests and incorporates a prefetching mechanism. Finally, we evaluate Comet on four common LLMs and compare it with six state-of-the-art private inference systems. Comet achieves a 1.87x-2.63x speedup and a 1.94x-2.64x communication reduction.

Paperid: 1224, https://arxiv.org/pdf/2505.06747.pdf

Abstract:
Differential Privacy (DP) has emerged as a robust framework for privacy-preserving data releases and has been successfully applied in high-profile cases, such as the 2020 US Census. However, in organizational settings, the use of DP remains largely confined to isolated data releases. This approach restricts the potential of DP to serve as a framework for comprehensive privacy risk management at an organizational level. Although one might expect that the cumulative privacy risk of isolated releases could be assessed using DP's compositional property, in practice, individual DP guarantees are frequently tailored to specific releases, making it difficult to reason about their interaction or combined impact. At the same time, less tailored DP guarantees, which compose more easily, also offer only limited insight because they lead to excessively large privacy budgets that convey limited meaning. To address these limitations, we present DPolicy, a system designed to manage cumulative privacy risks across multiple data releases using DP. Unlike traditional approaches that treat each release in isolation or rely on a single (global) DP guarantee, our system employs a flexible framework that considers multiple DP guarantees simultaneously, reflecting the diverse contexts and scopes typical of real-world DP deployments. DPolicy introduces a high-level policy language to formalize privacy guarantees, making traditionally implicit assumptions on scopes and contexts explicit. By deriving the DP guarantees required to enforce complex privacy semantics from these high-level policies, DPolicy enables fine-grained privacy risk management on an organizational scale. We implement and evaluate DPolicy, demonstrating how it mitigates privacy risks that can emerge without comprehensive, organization-wide privacy risk management.

Paperid: 1225, https://arxiv.org/pdf/2505.03574.pdf

Abstract:
Large language models (LLMs) have evolved from simple chatbots into autonomous agents capable of performing complex tasks such as editing production code, orchestrating workflows, and taking higher-stakes actions based on untrusted inputs like webpages and emails. These capabilities introduce new security risks that existing security measures, such as model fine-tuning or chatbot-focused guardrails, do not fully address. Given the higher stakes and the absence of deterministic solutions to mitigate these risks, there is a critical need for a real-time guardrail monitor to serve as a final layer of defense, and support system level, use case specific safety policy definition and enforcement. We introduce LlamaFirewall, an open-source security focused guardrail framework designed to serve as a final layer of defense against security risks associated with AI Agents. Our framework mitigates risks such as prompt injection, agent misalignment, and insecure code risks through three powerful guardrails: PromptGuard 2, a universal jailbreak detector that demonstrates clear state of the art performance; Agent Alignment Checks, a chain-of-thought auditor that inspects agent reasoning for prompt injection and goal misalignment, which, while still experimental, shows stronger efficacy at preventing indirect injections in general scenarios than previously proposed approaches; and CodeShield, an online static analysis engine that is both fast and extensible, aimed at preventing the generation of insecure or dangerous code by coding agents. Additionally, we include easy-to-use customizable scanners that make it possible for any developer who can write a regular expression or an LLM prompt to quickly update an agent's security guardrails.

Paperid: 1226, https://arxiv.org/pdf/2505.01463.pdf

Abstract:
Protecting cloud applications is critical in an era where security threats are increasingly sophisticated and persistent. Continuous Integration and Continuous Deployment (CI/CD) pipelines are particularly vulnerable, making innovative security approaches essential. This research explores the application of Natural Language Processing (NLP) techniques, specifically Topic Modelling, to analyse security-related text data and anticipate potential threats. We focus on Latent Dirichlet Allocation (LDA) and Probabilistic Latent Semantic Analysis (PLSA) to extract meaningful patterns from data sources, including logs, reports, and deployment traces. Using the Gensim framework in Python, these methods categorise log entries into security-relevant topics (e.g., phishing, encryption failures). The identified topics are leveraged to highlight patterns indicative of security issues across CI/CD's continuous stages (build, test, deploy). This approach introduces a semantic layer that supports early vulnerability recognition and contextual understanding of runtime behaviours.

Paperid: 1227, https://arxiv.org/pdf/2505.01012.pdf

Abstract:
Anomaly Detection (AD) is critical in data analysis, particularly within the domain of IT security. In recent years, Machine Learning (ML) algorithms have emerged as a powerful tool for AD in large-scale data. In this study, we explore the potential of quantum ML approaches, specifically quantum kernel methods, for the application to robust AD. We build upon previous work on Quantum Support Vector Regression (QSVR) for semisupervised AD by conducting a comprehensive benchmark on IBM quantum hardware using eleven datasets. Our results demonstrate that QSVR achieves strong classification performance and even outperforms the noiseless simulation on two of these datasets. Moreover, we investigate the influence of - in the NISQ-era inevitable - quantum noise on the performance of the QSVR. Our findings reveal that the model exhibits robustness to depolarizing, phase damping, phase flip, and bit flip noise, while amplitude damping and miscalibration noise prove to be more disruptive. Finally, we explore the domain of Quantum Adversarial Machine Learning and demonstrate that QSVR is highly vulnerable to adversarial attacks and that noise does not improve the adversarial robustness of the model.

Paperid: 1228, https://arxiv.org/pdf/2504.21039.pdf

Abstract:
As transformer-based large language models (LLMs) increasingly permeate society, they have revolutionized domains such as software engineering, creative writing, and digital arts. However, their adoption in cybersecurity remains limited due to challenges like scarcity of specialized training data and complexity of representing cybersecurity-specific knowledge. To address these gaps, we present Foundation-Sec-8B, a cybersecurity-focused LLM built on the Llama 3.1 architecture and enhanced through continued pretraining on a carefully curated cybersecurity corpus. We evaluate Foundation-Sec-8B across both established and new cybersecurity benchmarks, showing that it matches Llama 3.1-70B and GPT-4o-mini in certain cybersecurity-specific tasks. By releasing our model to the public, we aim to accelerate progress and adoption of AI-driven tools in both public and private cybersecurity contexts.

Paperid: 1229, https://arxiv.org/pdf/2504.20532.pdf

Abstract:
The emergence of diffusion models has facilitated the generation of speech with reinforced fidelity and naturalness. While deepfake detection technologies have manifested the ability to identify AI-generated content, their efficacy decreases as generative models become increasingly sophisticated. Furthermore, current research in the field has not adequately addressed the necessity for robust watermarking to safeguard the intellectual property rights associated with synthetic speech and generative models. To remedy this deficiency, we propose a \textbf{ro}bust generative \textbf{s}peech wat\textbf{e}rmarking method (TriniMark) for authenticating the generated content and safeguarding the copyrights by enabling the traceability of the diffusion model. We first design a structure-lightweight watermark encoder that embeds watermarks into the time-domain features of speech and reconstructs the waveform directly. A temporal-aware gated convolutional network is meticulously designed in the watermark decoder for bit-wise watermark recovery. Subsequently, the waveform-guided fine-tuning strategy is proposed for fine-tuning the diffusion model, which leverages the transferability of watermarks and enables the diffusion model to incorporate watermark knowledge effectively. When an attacker trains a surrogate model using the outputs of the target model, the embedded watermark can still be learned by the surrogate model and correctly extracted. Comparative experiments with state-of-the-art methods demonstrate the superior robustness of our method, particularly in countering compound attacks.

Paperid: 1230, https://arxiv.org/pdf/2504.19951.pdf

Abstract:
The rise of generative AI (GenAI) multi-agent systems (MAS) necessitates standardized protocols enabling agents to discover and interact with external tools. However, these protocols introduce new security challenges, particularly; tool squatting; the deceptive registration or representation of tools. This paper analyzes tool squatting threats within the context of emerging interoperability standards, such as Model Context Protocol (MCP) or seamless communication between agents protocols. It introduces a comprehensive Tool Registry system designed to mitigate these risks. We propose a security-focused architecture featuring admin-controlled registration, centralized tool discovery, fine grained access policies enforced via dedicated Agent and Tool Registry services, a dynamic trust scoring mechanism based on tool versioning and known vulnerabilities, and just in time credential provisioning. Based on its design principles, the proposed registry framework aims to effectively prevent common tool squatting vectors while preserving the flexibility and power of multi-agent systems. This work addresses a critical security gap in the rapidly evolving GenAI ecosystem and provides a foundation for secure tool integration in production environments.

Paperid: 1231, https://arxiv.org/pdf/2504.17548.pdf

Abstract:
Anomaly Detection (AD) defines the task of identifying observations or events that deviate from typical - or normal - patterns, a critical capability in IT security for recognizing incidents such as system misconfigurations, malware infections, or cyberattacks. In enterprise environments like SAP HANA Cloud systems, this task often involves monitoring high-dimensional, multivariate time series (MTS) derived from telemetry and log data. With the advent of quantum machine learning offering efficient calculations in high-dimensional latent spaces, many avenues open for dealing with such complex data. One approach is the Quantum Autoencoder (QAE), an emerging and promising method with potential for application in both data compression and AD. However, prior applications of QAEs to time series AD have been restricted to univariate data, limiting their relevance for real-world enterprise systems. In this work, we introduce a novel QAE-based framework designed specifically for MTS AD towards enterprise scale. We theoretically develop and experimentally validate the architecture, demonstrating that our QAE achieves performance competitive with neural-network-based autoencoders while requiring fewer trainable parameters. We evaluate our model on datasets that closely reflect SAP system telemetry and show that the proposed QAE is a viable and efficient alternative for semisupervised AD in real-world enterprise settings.

Paperid: 1232, https://arxiv.org/pdf/2504.16902.pdf

Abstract:
As Agentic AI systems evolve from basic workflows to complex multi agent collaboration, robust protocols such as Google's Agent2Agent (A2A) become essential enablers. To foster secure adoption and ensure the reliability of these complex interactions, understanding the secure implementation of A2A is essential. This paper addresses this goal by providing a comprehensive security analysis centered on the A2A protocol. We examine its fundamental elements and operational dynamics, situating it within the framework of agent communication development. Utilizing the MAESTRO framework, specifically designed for AI risks, we apply proactive threat modeling to assess potential security issues in A2A deployments, focusing on aspects such as Agent Card management, task execution integrity, and authentication methodologies. Based on these insights, we recommend practical secure development methodologies and architectural best practices designed to build resilient and effective A2A systems. Our analysis also explores how the synergy between A2A and the Model Context Protocol (MCP) can further enhance secure interoperability. This paper equips developers and architects with the knowledge and practical guidance needed to confidently leverage the A2A protocol for building robust and secure next generation agentic applications.

Paperid: 1233, https://arxiv.org/pdf/2504.16429.pdf

Abstract:
Retrieval-Augmented Code Generation (RACG) leverages external knowledge to enhance Large Language Models (LLMs) in code synthesis, improving the functional correctness of the generated code. However, existing RACG systems largely overlook security, leading to substantial risks. Especially, the poisoning of malicious code into knowledge bases can mislead LLMs, resulting in the generation of insecure outputs, which poses a critical threat in modern software development. To address this, we propose a security-hardening framework for RACG systems, CodeGuarder, that shifts the paradigm from retrieving only functional code examples to incorporating both functional code and security knowledge. Our framework constructs a security knowledge base from real-world vulnerability databases, including secure code samples and root cause annotations. For each code generation query, a retriever decomposes the query into fine-grained sub-tasks and fetches relevant security knowledge. To prioritize critical security guidance, we introduce a re-ranking and filtering mechanism by leveraging the LLMs' susceptibility to different vulnerability types. This filtered security knowledge is seamlessly integrated into the generation prompt. Our evaluation shows CodeGuarder significantly improves code security rates across various LLMs, achieving average improvements of 20.12\% in standard RACG, and 31.53\% and 21.91\% under two distinct poisoning scenarios without compromising functional correctness. Furthermore, CodeGuarder demonstrates strong generalization, enhancing security even when the targeted language's security knowledge is lacking. This work presents CodeGuarder as a pivotal advancement towards building secure and trustworthy RACG systems.

Paperid: 1234, https://arxiv.org/pdf/2504.16057.pdf

Abstract:
In this paper, we present MoCQ, a novel neuro-symbolic framework that combines the complementary strengths of Large Language Model (LLM) and classic vulnerability checkers to enable scalable, automated vulnerability detection. The key insight is to leverage an LLM to automatically generate vulnerability patterns and translate them into detection queries. Specifically, MoCQ incorporates an iterative loop in which an LLM refines queries based on carefully designed feedback information. The resulting queries are then executed to analyze large codebases and detect vulnerabilities. We evaluated MoCQ on 12 vulnerability types across four programming languages. MoCQ achieved comparable precision and recall compared to expert-developed queries, with significantly less expert time needed. MoCQ also uncovered 46 new vulnerability patterns that experts missed, each representing an overlooked vulnerability class. MoCQ further discovered seven previously unknown vulnerabilities in real-world applications.

Paperid: 1235, https://arxiv.org/pdf/2504.15343.pdf

Abstract:
We show that concrete hardness assumptions about learning or cloning the output state of a random quantum circuit can be used as the foundation for secure quantum cryptography. In particular, under these assumptions we construct secure one-way state generators (OWSGs), digital signature schemes, quantum bit commitments, and private key encryption schemes. We also discuss evidence for these hardness assumptions by analyzing the best-known quantum learning algorithms, as well as proving black-box lower bounds for cloning and learning given state preparation oracles. Our random circuit-based constructions provide concrete instantiations of quantum cryptographic primitives whose security do not depend on the existence of one-way functions. The use of random circuits in our constructions also opens the door to NISQ-friendly quantum cryptography. We discuss noise tolerant versions of our OWSG and digital signature constructions which can potentially be implementable on noisy quantum computers connected by a quantum network. On the other hand, they are still secure against noiseless quantum adversaries, raising the intriguing possibility of a useful implementation of an end-to-end cryptographic protocol on near-term quantum computers. Finally, our explorations suggest that the rich interconnections between learning theory and cryptography in classical theoretical computer science also extend to the quantum setting.

Paperid: 1236, https://arxiv.org/pdf/2504.15035.pdf

Abstract:
The accelerated advancement of speech generative models has given rise to security issues, including model infringement and unauthorized abuse of content. Although existing generative watermarking techniques have proposed corresponding solutions, most methods require substantial computational overhead and training costs. In addition, some methods have limitations in robustness when handling variable-length inputs. To tackle these challenges, we propose \textsc{SOLIDO}, a novel generative watermarking method that integrates parameter-efficient fine-tuning with speech watermarking through low-rank adaptation (LoRA) for speech diffusion models. Concretely, the watermark encoder converts the watermark to align with the input of diffusion models. To achieve precise watermark extraction from variable-length inputs, the watermark decoder based on depthwise separable convolution is designed for watermark recovery. To further enhance speech generation performance and watermark extraction capability, we propose a speech-driven lightweight fine-tuning strategy, which reduces computational overhead through LoRA. Comprehensive experiments demonstrate that the proposed method ensures high-fidelity watermarked speech even at a large capacity of 2000 bps. Furthermore, against common individual and compound speech attacks, our SOLIDO achieves a maximum average extraction accuracy of 99.20\% and 98.43\%, respectively. It surpasses other state-of-the-art methods by nearly 23\% in resisting time-stretching attacks.

Paperid: 1237, https://arxiv.org/pdf/2504.14832.pdf

Abstract:
The rapid advancement of generative models has led to the synthesis of real-fake ambiguous voices. To erase the ambiguity, embedding watermarks into the frequency-domain features of synthesized voices has become a common routine. However, the robustness achieved by choosing the frequency domain often comes at the expense of fine-grained voice features, leading to a loss of fidelity. Maximizing the comprehensive learning of time-domain features to enhance fidelity while maintaining robustness, we pioneer a \textbf{\underline{t}}emporal-aware \textbf{\underline{r}}ob\textbf{\underline{u}}st wat\textbf{\underline{e}}rmarking (\emph{True}) method for protecting the speech and singing voice. For this purpose, the integrated content-driven encoder is designed for watermarked waveform reconstruction, which is structurally lightweight. Additionally, the temporal-aware gated convolutional network is meticulously designed to bit-wise recover the watermark. Comprehensive experiments and comparisons with existing state-of-the-art methods have demonstrated the superior fidelity and vigorous robustness of the proposed \textit{True} achieving an average PESQ score of 4.63.

Paperid: 1238, https://arxiv.org/pdf/2504.14183.pdf

Abstract:
The First VoicePrivacy Attacker Challenge is an ICASSP 2025 SP Grand Challenge which focuses on evaluating attacker systems against a set of voice anonymization systems submitted to the VoicePrivacy 2024 Challenge. Training, development, and evaluation datasets were provided along with a baseline attacker. Participants developed their attacker systems in the form of automatic speaker verification systems and submitted their scores on the development and evaluation data. The best attacker systems reduced the equal error rate (EER) by 25-44% relative w.r.t. the baseline.

Paperid: 1239, https://arxiv.org/pdf/2504.13201.pdf

Abstract:
Large Language Models (LLMs) are increasingly becoming the cognitive core of Embodied Intelligence (EI) systems, such as robots and autonomous vehicles. However, this integration also exposes them to serious jailbreak risks, where malicious instructions can be transformed into dangerous physical actions. Existing defense mechanisms suffer from notable drawbacks--including high training costs, significant inference delays, and complex hyperparameter tuning--which limit their practical applicability. To address these challenges, we propose a novel and efficient inference-time defense framework: Concept Enhancement Engineering (CEE). CEE enhances the model's inherent safety mechanisms by directly manipulating its internal representations, requiring neither additional training nor external modules, thereby improving defense efficiency. Furthermore, CEE introduces a rotation-based control mechanism that enables stable and linearly tunable behavioral control of the model. This design eliminates the need for tedious manual tuning and avoids the output degradation issues commonly observed in other representation engineering methods. Extensive experiments across multiple EI safety benchmarks and diverse attack scenarios demonstrate that CEE significantly improves the defense success rates of various multimodal LLMs. It effectively mitigates safety risks while preserving high-quality generation and inference efficiency, offering a promising solution for deploying safer embodied intelligence systems.

Paperid: 1240, https://arxiv.org/pdf/2504.12812.pdf

Abstract:
The widespread adoption of EMV (Europay, Mastercard, and Visa) contactless payment systems has greatly improved convenience for both users and merchants. However, this growth has also exposed significant security challenges. This SoK provides a comprehensive analysis of security vulnerabilities in EMV contactless payments, particularly within the open-loop systems used by Visa and Mastercard. We categorize attacks into seven attack vectors across three key areas: application selection, cardholder authentication, and transaction authorization. We replicate the attacks on Visa and Mastercard protocols using our experimental platform to determine their practical feasibility and offer insights into the current security landscape of contactless payments. Our study also includes a detailed evaluation of the underlying protocols, along with a comparative analysis of Visa and Mastercard, highlighting vulnerabilities and recommending countermeasures.

Paperid: 1241, https://arxiv.org/pdf/2504.12229.pdf

Abstract:
Recent advancements in Large Language Models (LLMs) raised concerns over potential misuse, such as for spreading misinformation. In response two counter measures emerged: machine learning-based detectors that predict if text is synthetic, and LLM watermarking, which subtly marks generated text for identification and attribution. Meanwhile, humans are known to adjust language to their conversational partners both syntactically and lexically. By implication, it is possible that humans or unwatermarked LLMs could unintentionally mimic properties of LLM generated text, making counter measures unreliable. In this work we investigate the extent to which such conversational adaptation happens. We call the concept $\textit{mimicry}$ and demonstrate that both humans and LLMs end up mimicking, including the watermarking signal even in seemingly improbable settings. This challenges current academic assumptions and suggests that for long-term watermarking to be reliable, the likelihood of false positives needs to be significantly lower, while longer word sequences should be used for seeding watermarking mechanisms.

Paperid: 1242, https://arxiv.org/pdf/2504.11088.pdf

Abstract:
Federated learning based on homomorphic encryption has received widespread attention due to its high security and enhanced protection of user data privacy. However, the characteristics of encrypted computation lead to three challenging problems: ``computation-efficiency", ``attack-tracing" and ``contribution-assessment". The first refers to the efficiency of encrypted computation during model aggregation, the second refers to tracing malicious attacks in an encrypted state, and the third refers to the fairness of contribution assessment for local models after encryption. This paper proposes a federated learning storage security model with homomorphic encryption (FLSSM) to protect federated learning model privacy and address the three issues mentioned above. First, we utilize different nodes to aggregate local models in parallel, thereby improving encrypted models' aggregation efficiency. Second, we introduce trusted supervise nodes to examine local models when the global model is attacked, enabling the tracing of malicious attacks under homomorphic encryption. Finally, we fairly reward local training nodes with encrypted local models based on trusted training time. Experiments on multiple real-world datasets show that our model significantly outperforms baseline models in terms of both efficiency and security metrics.

Paperid: 1243, https://arxiv.org/pdf/2504.08623.pdf

Abstract:
The Model Context Protocol (MCP), introduced by Anthropic, provides a standardized framework for artificial intelligence (AI) systems to interact with external data sources and tools in real-time. While MCP offers significant advantages for AI integration and capability extension, it introduces novel security challenges that demand rigorous analysis and mitigation. This paper builds upon foundational research into MCP architecture and preliminary security assessments to deliver enterprise-grade mitigation frameworks and detailed technical implementation strategies. Through systematic threat modeling and analysis of MCP implementations and analysis of potential attack vectors, including sophisticated threats like tool poisoning, we present actionable security patterns tailored for MCP implementers and adopters. The primary contribution of this research lies in translating theoretical security concerns into a practical, implementable framework with actionable controls, thereby providing essential guidance for the secure enterprise adoption and governance of integrated AI systems.

Paperid: 1244, https://arxiv.org/pdf/2504.03823.pdf

Abstract:
Large Language Models (LLMs) offer powerful capabilities in text generation and are increasingly adopted across a wide range of domains. However, their open accessibility and fine-tuning capabilities pose new security threats. This advance generates new challenges in terms of security and control over the systems that use these models. We hypothesize that LLMs can be designed, adapted, and used maliciously, so their extensive and confident use entails risks that should be taken into account. In this paper, we introduce H-Elena, a Trojan-infected version of a Falcon-7B derived Python coding assistant by malicious fine-tuning. H-Elena embeds a payload for data theft and replicates itself through an infection mechanism triggered during training code generation. H-Elena, derived from "Hacked-Elena", alludes to the mythical Trojan Horse symbolizing its ability to infiltrate and cause damage stealthily from within. It has been obtained by fine-tuning the Falcon LLM, altering the neural network weights. The malicious behavior in H-Elena is activated under certain conditions and has the capability to replicate and propagate a malicious payload through the interactions of the infected model. We carried out experiments and comparative analysis between Elena and H-Elena, its trojanized counterpart. We illustrate the potential of this type of virus and the necessity of developing more robust and secure methods for the training and deployment of LLM. Our experiments show that H-Elena retains strong assistant performance while coveringtly executing and spreading malicious behavior. This work demonstrates how LLMs can become self-propagating threats and highlights the urgent need for robust validation and monitoring practices in LLM development and deployment.

Paperid: 1245, https://arxiv.org/pdf/2504.03759.pdf

Abstract:
Large language models (LLMs)-powered AI agents exhibit a high level of autonomy in addressing medical and healthcare challenges. With the ability to access various tools, they can operate within an open-ended action space. However, with the increase in autonomy and ability, unforeseen risks also arise. In this work, we investigated one particular risk, i.e., cyber attack vulnerability of medical AI agents, as agents have access to the Internet through web browsing tools. We revealed that through adversarial prompts embedded on webpages, cyberattackers can: i) inject false information into the agent's response; ii) they can force the agent to manipulate recommendation (e.g., healthcare products and services); iii) the attacker can also steal historical conversations between the user and agent, resulting in the leak of sensitive/private medical information; iv) furthermore, the targeted agent can also cause a computer system hijack by returning a malicious URL in its response. Different backbone LLMs were examined, and we found such cyber attacks can succeed in agents powered by most mainstream LLMs, with the reasoning models such as DeepSeek-R1 being the most vulnerable.

Paperid: 1246, https://arxiv.org/pdf/2504.03173.pdf

Abstract:
Privacy-Preserving Federated Learning (PPFL) enables multiple clients to collaboratively train models by submitting secreted model updates. Nonetheless, PPFL is vulnerable to data poisoning attacks due to its distributed training paradigm in cross-silo scenarios. Existing solutions have struggled to improve the performance of PPFL under poisoned Non-Independent and Identically Distributed (Non-IID) data. To address the issues, this paper proposes a privacy-preserving federated prototype learning framework, named PPFPL, which enhances the cross-silo FL performance against poisoned Non-IID data while protecting client privacy. Specifically, we adopt prototypes as client-submitted model updates to eliminate the impact of poisoned data distributions. In addition, we design a secure aggregation protocol utilizing homomorphic encryption to achieve Byzantine-robust aggregation on two servers, significantly reducing the impact of malicious clients. Theoretical analyses confirm the convergence and privacy of PPFPL. Experimental results on public datasets show that PPFPL effectively resists data poisoning attacks under Non-IID settings.

Paperid: 1247, https://arxiv.org/pdf/2504.02322.pdf

Abstract:
Effective anomaly detection from logs is crucial for enhancing cybersecurity defenses by enabling the early identification of threats. Despite advances in anomaly detection, existing systems often fall short in areas such as post-detection validation, scalability, and effective maintenance. These limitations not only hinder the detection of new threats but also impair overall system performance. To address these challenges, we propose CEDLog, a novel practical framework that integrates Elastic Weight Consolidation (EWC) for continual learning and implements distributed computing for scalable processing by integrating Apache Airflow and Dask. In CEDLog, anomalies are detected through the synthesis of Multi-layer Perceptron (MLP) and Graph Convolutional Networks (GCNs) using critical features present in event logs. Through comparisons with update strategies on large-scale datasets, we demonstrate the strengths of CEDLog, showcasing efficient updates and low false positives

Paperid: 1248, https://arxiv.org/pdf/2504.02120.pdf

Abstract:
Critical infrastructures integrate a wide range of smart technologies and become highly connected to the cyber world. This is especially true for Cyber-Physical Systems (CPSs), which integrate hardware and software components. Despite the advantages of smart infrastructures, they remain vulnerable to cyberattacks. This work focuses on the cyber resilience of CPSs. We propose a methodology based on knowledge graph modeling and graph analytics to quantify the resilience potential of complex systems by using a multilayered model based on knowledge graphs. Our methodology also allows us to identify critical points. These critical points are components or functions of an architecture that can generate critical failures if attacked. Thus, identifying them can help enhance resilience and avoid cascading effects. We use the SWaT (Secure Water Treatment) testbed as a use case to achieve this objective. This system mimics the actual behavior of a water treatment station in Singapore. We model three resilient designs of SWaT according to our multilayered model. We conduct a resilience assessment based on three relevant metrics used in graph analytics. We compare the results obtained with each metric and discuss their accuracy in identifying critical points. We perform an experimentation analysis based on the knowledge gained by a cyber adversary about the system architecture. We show that the most resilient SWaT design has the necessary potential to bounce back and absorb the attacks. We discuss our results and conclude this work by providing further research axes.

Paperid: 1249, https://arxiv.org/pdf/2504.01550.pdf

Abstract:
Large Language Models (LLMs) have emerged as powerful tools, but their inherent safety risks - ranging from harmful content generation to broader societal harms - pose significant challenges. These risks can be amplified by the recent adversarial attacks, fine-tuning vulnerabilities, and the increasing deployment of LLMs in high-stakes environments. Existing safety-enhancing techniques, such as fine-tuning with human feedback or adversarial training, are still vulnerable as they address specific threats and often fail to generalize across unseen attacks, or require manual system-level defenses. This paper introduces RepBend, a novel approach that fundamentally disrupts the representations underlying harmful behaviors in LLMs, offering a scalable solution to enhance (potentially inherent) safety. RepBend brings the idea of activation steering - simple vector arithmetic for steering model's behavior during inference - to loss-based fine-tuning. Through extensive evaluation, RepBend achieves state-of-the-art performance, outperforming prior methods such as Circuit Breaker, RMU, and NPO, with up to 95% reduction in attack success rates across diverse jailbreak benchmarks, all with negligible reduction in model usability and general capabilities.

Paperid: 1250, https://arxiv.org/pdf/2504.00341.pdf

Abstract:
The Open Radio Access Network (O-RAN) architecture is reshaping telecommunications by promoting openness, flexibility, and intelligent closed-loop optimization. By decoupling hardware and software and enabling multi-vendor deployments, O-RAN reduces costs, enhances performance, and allows rapid adaptation to new technologies. A key innovation is intelligent network slicing, which partitions networks into isolated slices tailored for specific use cases or quality of service requirements. The RAN Intelligent Controller further optimizes resource allocation, ensuring efficient utilization and improved service quality for user equipment (UEs). However, the modular and dynamic nature of O-RAN expands the threat surface, necessitating advanced security measures to maintain network integrity, confidentiality, and availability. Intrusion detection systems have become essential for identifying and mitigating attacks. This research explores using large language models (LLMs) to generate security recommendations based on the temporal traffic patterns of connected UEs. The paper introduces an LLM-driven intrusion detection framework and demonstrates its efficacy through experimental deployments, comparing non fine-tuned and fine-tuned models for task-specific accuracy.

Paperid: 1251, https://arxiv.org/pdf/2503.23510.pdf

Abstract:
In Ethereum, private transactions, a specialized transaction type employed to evade public Peer-to-Peer (P2P) network broadcasting, remain largely unexplored, particularly in the context of the transition from Proof-of-Work (PoW) to Proof-of-Stake (PoS) consensus mechanisms. To address this gap, we investigate the transaction characteristics, (un)intended usages, and monetary impacts by analyzing large-scale datasets comprising 14,810,392 private transactions within a 15.5-month PoW dataset and 30,062,232 private transactions within a 15.5-month PoS dataset. While originally designed for security purposes, we find that private transactions predominantly serve three distinct functions in both PoW and PoS Ethereum: extracting Maximum Extractable Value (MEV), facilitating monetary transfers to distribute mining rewards, and interacting with popular Decentralized Finance (DeFi) applications. Furthermore, we find that private transactions are utilized in DeFi attacks to circumvent surveillance by white hat monitors, with an increased prevalence observed in PoS Ethereum compared to PoW Ethereum. Additionally, in PoS Ethereum, there is a subtle uptick in the role of private transactions for MEV extraction. This shift could be attributed to the decrease in transaction costs. However, this reduction in transaction cost and the cancellation of block rewards result in a significant decrease in mining profits for block creators.

Paperid: 1252, https://arxiv.org/pdf/2503.20847.pdf

Abstract:
Synthetic data offers a promising solution to privacy concerns in healthcare by generating useful datasets in a privacy-aware manner. However, although synthetic data is typically developed with the intention of sharing said data, ambiguous reidentification risk assessments often prevent synthetic data from seeing the light of day. One of the main causes is that privacy metrics for synthetic data, which inform on reidentification risks, are not well-aligned with practical requirements and regulations regarding data sharing in healthcare. This article discusses the paradoxical situation where synthetic data is designed for data sharing but is often still restricted. We also discuss how the field should move forward to mitigate this issue.

Paperid: 1253, https://arxiv.org/pdf/2503.20279.pdf

Abstract:
Large Language Models (LLMs) are increasingly deployed as computer-use agents, autonomously performing tasks within real desktop or web environments. While this evolution greatly expands practical use cases for humans, it also creates serious security exposures. We present SUDO (Screen-based Universal Detox2Tox Offense), a novel attack framework that systematically bypasses refusal-trained safeguards in commercial computer-use agents, such as Claude for Computer Use. The core mechanism, Detox2Tox, transforms harmful requests (that agents initially reject) into seemingly benign requests via detoxification, secures detailed instructions from advanced vision language models (VLMs), and then reintroduces malicious content via toxification just before execution. Unlike conventional jailbreaks, SUDO iteratively refines its attacks based on a built-in refusal feedback, making it increasingly effective against robust policy filters. In extensive tests spanning 50 real-world tasks and multiple state-of-the-art VLMs, SUDO achieves a stark attack success rate of 24.41% (with no refinement), and up to 41.33% (by its iterative refinement) in Claude for Computer Use. By revealing these vulnerabilities and demonstrating the ease with which they can be exploited in real-world computing environments, this paper highlights an immediate need for robust, context-aware safeguards. WARNING: This paper includes harmful or offensive model outputs

Paperid: 1254, https://arxiv.org/pdf/2503.20244.pdf

Abstract:
Modern software systems are developed in diverse programming languages and often harbor critical vulnerabilities that attackers can exploit to compromise security. These vulnerabilities have been actively targeted in real-world attacks, causing substantial harm to users and cyberinfrastructure. Since many of these flaws originate from the code itself, a variety of techniques have been proposed to detect and mitigate them prior to software deployment. However, a comprehensive comparative study that spans different programming languages, program representations, bug types, and analysis techniques is still lacking. As a result, the relationships among programming languages, abstraction levels, vulnerability types, and detection approaches remain fragmented, and the limitations and research gaps across the landscape are not clearly understood. This article aims to bridge that gap by systematically examining widely used programming languages, levels of program representation, categories of vulnerabilities, and mainstream detection techniques. The survey provides a detailed understanding of current practices in vulnerability discovery, highlighting their strengths, limitations, and distinguishing characteristics. Furthermore, it identifies persistent challenges and outlines promising directions for future research in the field of software security.

Paperid: 1255, https://arxiv.org/pdf/2503.17332.pdf

Abstract:
Large language model (LLM) agents are increasingly capable of autonomously conducting cyberattacks, posing significant threats to existing applications. This growing risk highlights the urgent need for a real-world benchmark to evaluate the ability of LLM agents to exploit web application vulnerabilities. However, existing benchmarks fall short as they are limited to abstracted Capture the Flag competitions or lack comprehensive coverage. Building a benchmark for real-world vulnerabilities involves both specialized expertise to reproduce exploits and a systematic approach to evaluating unpredictable threats. To address this challenge, we introduce CVE-Bench, a real-world cybersecurity benchmark based on critical-severity Common Vulnerabilities and Exposures. In CVE-Bench, we design a sandbox framework that enables LLM agents to exploit vulnerable web applications in scenarios that mimic real-world conditions, while also providing effective evaluation of their exploits. Our evaluation shows that the state-of-the-art agent framework can resolve up to 13% of vulnerabilities.

Paperid: 1256, https://arxiv.org/pdf/2503.16431.pdf

Abstract:
Red teaming has emerged as a critical practice in assessing the possible risks of AI models and systems. It aids in the discovery of novel risks, stress testing possible gaps in existing mitigations, enriching existing quantitative safety metrics, facilitating the creation of new safety measurements, and enhancing public trust and the legitimacy of AI risk assessments. This white paper describes OpenAI's work to date in external red teaming and draws some more general conclusions from this work. We describe the design considerations underpinning external red teaming, which include: selecting composition of red team, deciding on access levels, and providing guidance required to conduct red teaming. Additionally, we show outcomes red teaming can enable such as input into risk assessment and automated evaluations. We also describe the limitations of external red teaming, and how it can fit into a broader range of AI model and system evaluations. Through these contributions, we hope that AI developers and deployers, evaluation creators, and policymakers will be able to better design red teaming campaigns and get a deeper look into how external red teaming can fit into model deployment and evaluation processes. These methods are evolving and the value of different methods continues to shift as the ecosystem around red teaming matures and models themselves improve as tools for red teaming.

Paperid: 1257, https://arxiv.org/pdf/2503.11514.pdf

Abstract:
Federated Learning (FL) has emerged as a promising privacy-preserving collaborative model training paradigm without sharing raw data. However, recent studies have revealed that private information can still be leaked through shared gradient information and attacked by Gradient Inversion Attacks (GIA). While many GIA methods have been proposed, a detailed analysis, evaluation, and summary of these methods are still lacking. Although various survey papers summarize existing privacy attacks in FL, few studies have conducted extensive experiments to unveil the effectiveness of GIA and their associated limiting factors in this context. To fill this gap, we first undertake a systematic review of GIA and categorize existing methods into three types, i.e., \textit{optimization-based} GIA (OP-GIA), \textit{generation-based} GIA (GEN-GIA), and \textit{analytics-based} GIA (ANA-GIA). Then, we comprehensively analyze and evaluate the three types of GIA in FL, providing insights into the factors that influence their performance, practicality, and potential threats. Our findings indicate that OP-GIA is the most practical attack setting despite its unsatisfactory performance, while GEN-GIA has many dependencies and ANA-GIA is easily detectable, making them both impractical. Finally, we offer a three-stage defense pipeline to users when designing FL frameworks and protocols for better privacy protection and share some future research directions from the perspectives of attackers and defenders that we believe should be pursued. We hope that our study can help researchers design more robust FL frameworks to defend against these attacks.

Paperid: 1258, https://arxiv.org/pdf/2503.09669.pdf

Abstract:
Text-to-image diffusion models have achieved remarkable success in generating high-quality contents from text prompts. However, their reliance on publicly available data and the growing trend of data sharing for fine-tuning make these models particularly vulnerable to data poisoning attacks. In this work, we introduce the Silent Branding Attack, a novel data poisoning method that manipulates text-to-image diffusion models to generate images containing specific brand logos or symbols without any text triggers. We find that when certain visual patterns are repeatedly in the training data, the model learns to reproduce them naturally in its outputs, even without prompt mentions. Leveraging this, we develop an automated data poisoning algorithm that unobtrusively injects logos into original images, ensuring they blend naturally and remain undetected. Models trained on this poisoned dataset generate images containing logos without degrading image quality or text alignment. We experimentally validate our silent branding attack across two realistic settings on large-scale high-quality image datasets and style personalization datasets, achieving high success rates even without a specific text trigger. Human evaluation and quantitative metrics including logo detection show that our method can stealthily embed logos.

Paperid: 1259, https://arxiv.org/pdf/2503.08969.pdf

Abstract:
As software grows in complexity to accommodate diverse features and platforms, software bloating has emerged as a significant challenge, adversely affecting performance and security. However, existing approaches inadequately address the dual objectives of debloating: maintaining functionality by preserving essential features and enhancing security by reducing security issues. Specifically, current software debloating techniques often rely on input-based analysis, using user inputs as proxies for the specifications of desired features. However, these approaches frequently overfit provided inputs, leading to functionality loss and potential security vulnerabilities. To address these limitations, we propose LEADER, a program debloating framework enhanced by Large Language Models (LLMs), which leverages their semantic understanding, generative capabilities, and decision-making strengths. LEADER mainly consists of two modules: (1) a documentation-guided test augmentation module designed to preserve functionality, which leverages LLMs to comprehend program documentation and generates sufficient tests to cover the desired features comprehensively, and (2) a multi-advisor-aided program debloating module that employs a neuro-symbolic pipeline to ensure that the security of the software can be perceived during debloating. This module combines debloating and security advisors for analysis and employs an LLM as a decision-maker to eliminate undesired code securely. Extensive evaluations on widely used benchmarks demonstrate the efficacy of LEADER. These results demonstrate that LEADER surpasses the state-of-the-art tool CovA in functionality and security. These results underscore the potential of LEADER to set a new standard in program debloating by effectively balancing functionality and security.

Paperid: 1260, https://arxiv.org/pdf/2503.05264.pdf

Abstract:
We introduce the Context Compliance Attack (CCA), a novel, optimization-free method for bypassing AI safety mechanisms. Unlike current approaches -- which rely on complex prompt engineering and computationally intensive optimization -- CCA exploits a fundamental architectural vulnerability inherent in many deployed AI systems. By subtly manipulating conversation history, CCA convinces the model to comply with a fabricated dialogue context, thereby triggering restricted behavior. Our evaluation across a diverse set of open-source and proprietary models demonstrates that this simple attack can circumvent state-of-the-art safety protocols. We discuss the implications of these findings and propose practical mitigation strategies to fortify AI systems against such elementary yet effective adversarial tactics.

Paperid: 1261, https://arxiv.org/pdf/2503.04954.pdf

Abstract:
Lacking security awareness, sensor fusion in systems with multi-agent networks such as smart cities is vulnerable to attacks. To guard against recent threats, we design security-aware sensor fusion that is based on the estimates of distributions over trust. Trust estimation can be cast as a hidden Markov model, and we solve it by mapping sensor data to trust pseudomeasurements (PSMs) that recursively update trust posteriors in a Bayesian context. Trust then feeds sensor fusion to facilitate trust-weighted updates to situational awareness. Essential to security-awareness are a novel field of view estimator, logic to map sensor data into PSMs, and the derivation of efficient Bayesian updates. We evaluate security-aware fusion under attacks on agents using case studies and Monte Carlo simulation in the physics-based Unreal Engine simulator, CARLA. A mix of novel and classical security-relevant metrics show that our security-aware fusion enables building trustworthy situational awareness even in hostile conditions.

Paperid: 1262, https://arxiv.org/pdf/2503.04140.pdf

Abstract:
Leveraging blockchain in Federated Learning (FL) emerges as a new paradigm for secure collaborative learning on Massive Edge Networks (MENs). As the scale of MENs increases, it becomes more difficult to implement and manage a blockchain among edge devices due to complex communication topologies, heterogeneous computation capabilities, and limited storage capacities. Moreover, the lack of a standard metric for blockchain security becomes a significant issue. To address these challenges, we propose a lightweight blockchain for verifiable and scalable FL, namely LiteChain, to provide efficient and secure services in MENs. Specifically, we develop a distributed clustering algorithm to reorganize MENs into a two-level structure to improve communication and computing efficiency under security requirements. Moreover, we introduce a Comprehensive Byzantine Fault Tolerance (CBFT) consensus mechanism and a secure update mechanism to ensure the security of model transactions through LiteChain. Our experiments based on Hyperledger Fabric demonstrate that LiteChain presents the lowest end-to-end latency and on-chain storage overheads across various network scales, outperforming the other two benchmarks. In addition, LiteChain exhibits a high level of robustness against replay and data poisoning attacks.

Paperid: 1263, https://arxiv.org/pdf/2503.03270.pdf

Abstract:
As one of the prominent AI-generated content, Deepfake has raised significant safety concerns. Although it has been demonstrated that temporal consistency cues offer better generalization capability, existing methods based on CNNs inevitably introduce spatial bias, which hinders the extraction of intrinsic temporal features. To address this issue, we propose a novel method called Spatial Dependency Reduction (SDR), which integrates common temporal consistency features from multiple spatially-perturbed clusters, to reduce the dependency of the model on spatial information. Specifically, we design multiple Spatial Perturbation Branch (SPB) to construct spatially-perturbed feature clusters. Subsequently, we utilize the theory of mutual information and propose a Task-Relevant Feature Integration (TRFI) module to capture temporal features residing in similar latent space from these clusters. Finally, the integrated feature is fed into a temporal transformer to capture long-range dependencies. Extensive benchmarks and ablation studies demonstrate the effectiveness and rationale of our approach.

Paperid: 1264, https://arxiv.org/pdf/2502.20225.pdf

Abstract:
In this paper, we propose a deep neural network approach for deepfake speech detection (DSD) based on a lowcomplexity Depthwise-Inception Network (DIN) trained with a contrastive training strategy (CTS). In this framework, input audio recordings are first transformed into spectrograms using Short-Time Fourier Transform (STFT) and Linear Filter (LF), which are then used to train the DIN. Once trained, the DIN processes bonafide utterances to extract audio embeddings, which are used to construct a Gaussian distribution representing genuine speech. Deepfake detection is then performed by computing the distance between a test utterance and this distribution to determine whether the utterance is fake or bonafide. To evaluate our proposed systems, we conducted extensive experiments on the benchmark dataset of ASVspoof 2019 LA. The experimental results demonstrate the effectiveness of combining the Depthwise-Inception Network with the contrastive learning strategy in distinguishing between fake and bonafide utterances. We achieved Equal Error Rate (EER), Accuracy (Acc.), F1, AUC scores of 4.6%, 95.4%, 97.3%, and 98.9% respectively using a single, low-complexity DIN with just 1.77 M parameters and 985 M FLOPS on short audio segments (4 seconds). Furthermore, our proposed system outperforms the single-system submissions in the ASVspoof 2019 LA challenge, showcasing its potential for real-time applications.

Paperid: 1265, https://arxiv.org/pdf/2502.15010.pdf

Abstract:
Recent copyright agreements between AI companies and content creators underscore the need for fine-grained control over language models' ability to reproduce copyrighted text. Existing defenses-ranging from aggressive unlearning to simplistic output filters-either sacrifice model utility or inadequately address verbatim leakage. We introduce Obliviate, a lightweight post-training method that surgically suppresses exact reproduction of specified sequences while preserving semantic understanding. Obliviate first identifies memorized passages and then, for each target token, minimally adjusts the model's output distribution via a Kullback-Leibler divergence penalty to drive down the probability of exact reproduction. Simultaneously, we enforce a consistency loss on non-target tokens to retain the model's fluency and task performance. We evaluate Obliviate on four popular 6-8B-parameter models (LLaMA-3.1, LLaMA-3.1-Instruct, Qwen-2.5, and Yi-1.5) using synthetic memorization benchmarks and organic copyrighted excerpts (e.g., Moby Dick, Frankenstein, Alice in Wonderland and Les Miserables). Across all settings, Obliviate reduces verbatim recall by two orders of magnitude (e.g., from hundreds of words to fewer than 12) while degrading downstream accuracy by at most 1% on HellaSwag, MMLU, TruthfulQA, and Winogrande. Furthermore, we benchmark Obliviate aganist different unlearning and copyright techniques using the MUSE and CoTaEval benchmarks. These results position Obliviate as a practical, high-fidelity solution for copyright compliance in deployed LLMs.

Paperid: 1266, https://arxiv.org/pdf/2502.13929.pdf

Abstract:
Formal verification plays a crucial role in making smart contracts safer, being able to find bugs or to guarantee their absence, as well as checking whether the business logic is correctly implemented. For Solidity, even though there already exist several mature verification tools, the semantical quirks of the language can make verification quite hard in practice. Move, on the other hand, has been designed with security and verification in mind, and it has been accompanied since its early stages by a formal verification tool, the Move Prover. In this paper, we investigate through a comparative analysis: 1) how the different designs of the two contract languages impact verification, and 2) what is the state-of-the-art of verification tools for the two languages, and how do they compare on three paradigmatic use cases. Our investigation is supported by an open dataset of verification tasks performed in Certora and in the Aptos Move Prover.

Paperid: 1267, https://arxiv.org/pdf/2502.13513.pdf

Abstract:
With the rapid development of blockchain technology, transaction logs play a central role in various applications, including decentralized exchanges, wallets, cross-chain bridges, and other third-party services. However, these logs, particularly those based on smart contract events, are highly susceptible to manipulation and forgery, creating substantial security risks across the ecosystem. To address this issue, we present the first in-depth security analysis of transaction log forgery in EVM-based blockchains, a phenomenon we term Phantom Events. We systematically model five types of attacks and propose a tool designed to detect event forgery vulnerabilities in smart contracts. Our evaluation demonstrates that our approach outperforms existing tools in identifying potential phantom events. Furthermore, we have successfully identified real-world instances for all five types of attacks across multiple decentralized applications. Finally, we call on community developers to take proactive steps to address these critical security vulnerabilities.

Paperid: 1268, https://arxiv.org/pdf/2502.11687.pdf

Abstract:
Backdoor attacks embed hidden functionalities in deep neural networks (DNN), triggering malicious behavior with specific inputs. Advanced defenses monitor anomalous DNN inferences to detect such attacks. However, concealed backdoors evade detection by maintaining a low pre-deployment attack success rate (ASR) and restoring high ASR post-deployment via machine unlearning. Existing concealed backdoors are often constrained by requiring white-box or black-box access or auxiliary data, limiting their practicality when such access or data is unavailable. This paper introduces ReVeil, a concealed backdoor attack targeting the data collection phase of the DNN training pipeline, requiring no model access or auxiliary data. ReVeil maintains low pre-deployment ASR across four datasets and four trigger patterns, successfully evades three popular backdoor detection methods, and restores high ASR post-deployment through machine unlearning.

Paperid: 1269, https://arxiv.org/pdf/2502.10771.pdf

Abstract:
The growing dependence on Electronic Identity Management Systems (EIDS) and recent advancements, such as non-human ID management, require a thorough evaluation of their trustworthiness. Assessing EIDS's trustworthiness ensures security, privacy, and reliability in managing sensitive user information. It safeguards against fraud, unauthorised access, and data breaches, fostering user confidence. Existing frameworks primarily focus on specific dimensions such as security and privacy, often neglecting critical dimensions such as ethics, resilience, robustness, and reliability. This paper introduces an integrated Digital Identity Systems Trustworthiness Assessment Framework (DISTAF) encapsulating these six pillars. It is supported by over 65 mechanisms and over 400 metrics derived from international standards and technical guidelines. By addressing the lifecycle of DIMS from design to deployment, our DISTAF evaluates trustworthiness at granular levels while remaining accessible to diverse stakeholders. We demonstrate the application of DISTAF through a real-world implementation using a Modular Open Source Identity Platform (MOSIP) instance, refining its metrics to simplify trustworthiness assessment. Our approach introduces clustering mechanisms for metrics, hierarchical scoring, and mandatory criteria to ensure robust and consistent evaluations across an EIDS in both the design and operation stages. Furthermore, DISTAF is adaptable to emerging technologies like Self-Sovereign Identity (SSI), integrating privacy-enhancing techniques and ethical considerations to meet modern challenges. The assessment tool developed alongside DISTAF provides a user-centric methodology and a simplified yet effective self-assessment process, enabling system designers and assessors to identify system gaps, improve configurations, and enhance public trust.

Paperid: 1270, https://arxiv.org/pdf/2502.09139.pdf

Abstract:
Constant-time code has become the de-facto standard for secure cryptographic implementations. However, some memory-based leakage classes such as ciphertext side-channels and silent stores remain unaddressed. Prior work proposed three different methods for ciphertext side-channel mitigation, for which one, the practicality of interleaving data with counter values, remains to be explored. To close this gap, we define design choices and requirements to leverage interleaving for a generic ciphertext side-channel mitigation. Based on these results, we implement Zebrafix, a compiler-based tool to ensure freshness of memory stores. We evaluate Zebrafix and find that interleaving can perform much better than other ciphertext side-channel mitigations, at the cost of a high practical complexity. We further observe that ciphertext side-channels and silent stores belong to a broader attack category: memory-centric side-channels. Under this unified view, we show that interleaving-based ciphertext side-channel mitigations can be used to prevent silent stores as well.

Paperid: 1271, https://arxiv.org/pdf/2502.08970.pdf

Abstract:
Metric Differential Privacy (mDP) builds upon the core principles of Differential Privacy (DP) by incorporating various distance metrics, which offer adaptable and context-sensitive privacy guarantees for a wide range of applications, such as location-based services, text analysis, and image processing. Since its inception in 2013, mDP has garnered substantial research attention, advancing theoretical foundations, algorithm design, and practical implementations. Despite this progress, existing surveys mainly focus on traditional DP and local DP, and they provide limited coverage of mDP. This paper provides a comprehensive survey of mDP research from 2013 to 2024, tracing its development from the foundations of DP. We categorize essential mechanisms, including Laplace, Exponential, and optimization-based approaches, and assess their strengths, limitations, and application domains. Additionally, we highlight key challenges and outline future research directions to encourage innovation and real-world adoption of mDP. This survey is designed to be a valuable resource for researchers and practitioners aiming to deepen their understanding and drive progress in mDP within the broader privacy ecosystem.

Paperid: 1272, https://arxiv.org/pdf/2502.08878.pdf

Abstract:
In the differentially private partition selection problem (a.k.a. private set union, private key discovery), users hold subsets of items from an unbounded universe. The goal is to output as many items as possible from the union of the users' sets while maintaining user-level differential privacy. Solutions to this problem are a core building block for many privacy-preserving ML applications including vocabulary extraction in a private corpus, computing statistics over categorical data and learning embeddings over user-provided items. We propose an algorithm for this problem, MaxAdaptiveDegree (MAD), which adaptively reroutes weight from items with weight far above the threshold needed for privacy to items with smaller weight, thereby increasing the probability that less frequent items are output. Our algorithm can be efficiently implemented in massively parallel computation systems allowing scalability to very large datasets. We prove that our algorithm stochastically dominates the standard parallel algorithm for this problem. We also develop a two-round version of our algorithm, MAD2R, where results of the computation in the first round are used to bias the weighting in the second round to maximize the number of items output. In experiments, our algorithms provide the best results among parallel algorithms and scale to datasets with hundreds of billions of items, up to three orders of magnitude larger than those analyzed by prior sequential algorithms.

Paperid: 1273, https://arxiv.org/pdf/2502.03721.pdf

Abstract:
Semantic communication systems, which leverage Generative AI (GAI) to transmit semantic meaning rather than raw data, are poised to revolutionize modern communications. However, they are vulnerable to backdoor attacks, a type of poisoning manipulation that embeds malicious triggers into training datasets. As a result, Backdoor attacks mislead the inference for poisoned samples while clean samples remain unaffected. The existing defenses may alter the model structure (such as neuron pruning that potentially degrades inference performance on clean inputs, or impose strict requirements on data formats (such as ``Semantic Shield" that requires image-text pairs). To address these limitations, this work proposes a defense mechanism that leverages semantic similarity to detect backdoor attacks without modifying the model structure or imposing data format constraints. By analyzing deviations in semantic feature space and establishing a threshold-based detection framework, the proposed approach effectively identifies poisoned samples. The experimental results demonstrate high detection accuracy and recall across varying poisoning ratios, underlining the significant effectiveness of our proposed solution.

Paperid: 1274, https://arxiv.org/pdf/2502.03233.pdf

Abstract:
The integration of Large Language Models (LLMs) into software development has revolutionized the field, particularly through the use of Retrieval-Augmented Code Generation (RACG) systems that enhance code generation with information from external knowledge bases. However, the security implications of RACG systems, particularly the risks posed by vulnerable code examples in the knowledge base, remain largely unexplored. This risk is particularly concerning given that public code repositories, which often serve as the sources for knowledge base collection in RACG systems, are usually accessible to anyone in the community. Malicious attackers can exploit this accessibility to inject vulnerable code into the knowledge base, making it toxic. Once these poisoned samples are retrieved and incorporated into the generated code, they can propagate security vulnerabilities into the final product. This paper presents the first comprehensive study on the security risks associated with RACG systems, focusing on how vulnerable code in the knowledge base compromises the security of generated code. We investigate the LLM-generated code security across different settings through extensive experiments using four major LLMs, two retrievers, and two poisoning scenarios. Our findings highlight the significant threat of knowledge base poisoning, where even a single poisoned code example can compromise up to 48% of generated code. Our findings provide crucial insights into vulnerability introduction in RACG systems and offer practical mitigation recommendations, thereby helping improve the security of LLM-generated code in future works.

Paperid: 1275, https://arxiv.org/pdf/2502.03149.pdf

Abstract:
Secure resource management (SRM) within a cloud computing environment is a critical yet infrequently studied research topic. This paper provides a comprehensive survey and comparative performance evaluation of potential cyber threat countermeasure strategies that address security challenges during cloud workload execution and resource management. Cybersecurity is explored specifically in the context of cloud resource management, with an emphasis on identifying the associated challenges. The cyber threat countermeasure methods are categorized into three classes: defensive strategies, mitigating strategies, and hybrid strategies. The existing countermeasure strategies belonging to each class are thoroughly discussed and compared. In addition to conceptual and theoretical analysis, the leading countermeasure strategies within these categories are implemented on a common platform and examined using two real-world virtual machine (VM) data traces. Based on this comprehensive study and performance evaluation, the paper discusses the trade-offs among these countermeasure strategies and their utility, providing imperative concluding remarks on the holistic study of cloud cyber threat countermeasures and secure resource management. Furthermore, the study suggests future methodologies that could effectively address the emerging challenges of secure cloud resource management.

Paperid: 1276, https://arxiv.org/pdf/2502.00834.pdf

Abstract:
This work investigates a novel approach to boost adversarial robustness and generalization by incorporating structural prior into the design of deep learning models. Specifically, our study surprisingly reveals that existing dictionary learning-inspired convolutional neural networks (CNNs) provide a false sense of security against adversarial attacks. To address this, we propose Elastic Dictionary Learning Networks (EDLNets), a novel ResNet architecture that significantly enhances adversarial robustness and generalization. This novel and effective approach is supported by a theoretical robustness analysis using influence functions. Moreover, extensive and reliable experiments demonstrate consistent and significant performance improvement on open robustness leaderboards such as RobustBench, surpassing state-of-the-art baselines. To the best of our knowledge, this is the first work to discover and validate that structural prior can reliably enhance deep learning robustness under strong adaptive attacks, unveiling a promising direction for future research.

Paperid: 1277, https://arxiv.org/pdf/2502.00580.pdf

Abstract:
Recent work showed Best-of-N (BoN) jailbreaking using repeated use of random augmentations (such as capitalization, punctuation, etc) is effective against all major large language models (LLMs). We have found that $100\%$ of the BoN paper's successful jailbreaks (confidence interval $[99.65\%, 100.00\%]$) and $99.8\%$ of successful jailbreaks in our replication (confidence interval $[99.28\%, 99.98\%]$) were blocked with our Defense Against The Dark Prompts (DATDP) method. The DATDP algorithm works by repeatedly utilizing an evaluation LLM to evaluate a prompt for dangerous or manipulative behaviors--unlike some other approaches, DATDP also explicitly looks for jailbreaking attempts--until a robust safety rating is generated. This success persisted even when utilizing smaller LLMs to power the evaluation (Claude and LLaMa-3-8B-instruct proved almost equally capable). These results show that, though language models are sensitive to seemingly innocuous changes to inputs, they seem also capable of successfully evaluating the dangers of these inputs. Versions of DATDP can therefore be added cheaply to generative AI systems to produce an immediate significant increase in safety.

Paperid: 1278, https://arxiv.org/pdf/2501.16888.pdf

Abstract:
Recommender systems often rely on graph-based filters, such as normalized item-item adjacency matrices and low-pass filters. While effective, the centralized computation of these components raises concerns about privacy, security, and the ethical use of user data. This work proposes two decentralized frameworks for securely computing these critical graph components without centralizing sensitive information. The first approach leverages lightweight Multi-Party Computation and distributed singular vector computations to privately compute key graph filters. The second extends this framework by incorporating low-rank approximations, enabling a trade-off between communication efficiency and predictive performance. Empirical evaluations on benchmark datasets demonstrate that the proposed methods achieve comparable accuracy to centralized state-of-the-art systems while ensuring data confidentiality and maintaining low communication costs. Our results highlight the potential for privacy-preserving decentralized architectures to bridge the gap between utility and user data protection in modern recommender systems.

Paperid: 1279, https://arxiv.org/pdf/2501.15975.pdf

Abstract:
This paper proposes a novel encryption-based access control mechanism for Named Data Networking (NDN). The scheme allows data producers to share their content in encrypted form before transmitting it to consumers. The encryption mechanism incorporates time-based subscription access policies directly into the encrypted content, enabling only consumers with valid subscriptions to decrypt it. This makes the scheme well-suited for real-world, subscription-based applications like Netflix. Additionally, the scheme introduces an anonymous and unlinkable signature-based authentication mechanism that empowers edge routers to block bogus content requests at the network's entry point, thereby mitigating Denial of Service (DoS) attacks. A formal security proof demonstrates the scheme's resistance to Chosen Plaintext Attacks (CPA). Performance analysis, using Mini-NDN-based emulation and a Charm library implementation, further confirms the practicality of the scheme. Moreover, it outperforms closely related works in terms of functionality, security, and communication overhead.

Paperid: 1280, https://arxiv.org/pdf/2501.15365.pdf

Abstract:
In recent years, rapid technological advancements and expanded Internet access have led to a significant rise in anomalies within network traffic and time-series data. Prompt detection of these irregularities is crucial for ensuring service quality, preventing financial losses, and maintaining robust security standards. While machine learning algorithms have shown promise in achieving high accuracy for anomaly detection, their performance is often constrained by the specific conditions of their training data. A persistent challenge in this domain is the scarcity of labeled data for anomaly detection in time-series datasets. This limitation hampers the training efficacy of both traditional machine learning and advanced deep learning models. To address this, unsupervised transfer learning emerges as a viable solution, leveraging unlabeled data from a source domain to identify anomalies in an unlabeled target domain. However, many existing approaches still depend on a small amount of labeled data from the target domain. To overcome these constraints, we propose a transfer learning-based model for anomaly detection in multivariate time-series datasets. Unlike conventional methods, our approach does not require labeled data in either the source or target domains. Empirical evaluations on novel intrusion detection datasets demonstrate that our model outperforms existing techniques in accurately identifying anomalies within an entirely unlabeled target domain.

Paperid: 1281, https://arxiv.org/pdf/2501.13716.pdf

Abstract:
The rapid expansion of connected devices has amplified the need for robust and scalable security frameworks. This paper proposes a holistic approach to securing network-connected devices, covering essential layers: hardware, firmware, communication, and application. At the hardware level, we focus on secure key management, reliable random number generation, and protecting critical assets. Firmware security is addressed through mechanisms like cryptographic integrity validation and secure boot processes. For secure communication, we emphasize TLS 1.3 and optimized cipher suites tailored for both standard and resource-constrained devices. To overcome the challenges of IoT, compact digital certificates, such as CBOR, are recommended to reduce overhead and enhance performance. Additionally, the paper explores forward-looking solutions, including post-quantum cryptography, to future-proof systems against emerging threats. This framework provides actionable guidelines for manufacturers and system administrators to build secure devices that maintain confidentiality, integrity, and availability throughout their lifecycle.

Paperid: 1282, https://arxiv.org/pdf/2501.11798.pdf

Abstract:
The emergence of quantum computing presents a formidable challenge to the security of blockchain systems. Traditional cryptographic algorithms, foundational to digital signatures, message encryption, and hashing functions, become vulnerable to the immense computational power of quantum computers. This paper conducts a thorough risk assessment of transitioning to quantum-resistant blockchains, comprehensively analyzing potential threats targeting vital blockchain components: the network, mining pools, transaction verification mechanisms, smart contracts, and user wallets. By elucidating the intricate challenges and strategic considerations inherent in transitioning to quantum-resistant algorithms, the paper evaluates risks and highlights obstacles in securing blockchain components with quantum-resistant cryptography. It offers a hybrid migration strategy to facilitate a smooth transition from classical to quantum-resistant cryptography. The analysis extends to prominent blockchains such as Bitcoin, Ethereum, Ripple, Litecoin, and Zcash, assessing vulnerable components, potential impacts, and associated STRIDE threats, thereby identifying areas susceptible to quantum attacks. Beyond analysis, the paper provides actionable guidance for designing secure and resilient blockchain ecosystems in the quantum computing era. Recognizing the looming threat of quantum computers, this research advocates for a proactive transition to quantum-resistant blockchain networks. It proposes a tailored security blueprint that strategically fortifies each component against the evolving landscape of quantum-induced cyber threats. Emphasizing the critical need for blockchain stakeholders to adopt proactive measures and implement quantum-resistant solutions, the paper underscores the importance of embracing these insights to navigate the complexities of the quantum era with resilience and confidence.

Paperid: 1283, https://arxiv.org/pdf/2501.07955.pdf

Abstract:
With the growth of online social services, social information graphs are becoming increasingly complex. Privacy issues related to analyzing or publishing on social graphs are also becoming increasingly serious. Since the shortest paths play an important role in graphs, privately publishing the shortest paths or distances has attracted the attention of researchers. Differential privacy (DP) is an excellent standard for preserving privacy. However, existing works to answer the distance query with the guarantee of DP were almost based on the weight private graph assumption, not on the paths themselves. In this paper, we consider edges as privacy and propose distance publishing mechanisms based on edge DP. To address the issue of utility damage caused by large global sensitivities, we revisit studies related to asymmetric neighborhoods in DP with the observation that the distance query is monotonic in asymmetric neighborhoods. We formally give the definition of asymmetric neighborhoods and propose Individual Asymmetric Differential Privacy with higher privacy guarantees in combination with smooth sensitivity. Then, we introduce two methods to efficiently compute the smooth sensitivity of distance queries in asymmetric neighborhoods. Finally, we validate our scheme using both real-world and synthetic datasets, which can reduce the error to $0.0862$.

Paperid: 1284, https://arxiv.org/pdf/2501.06071.pdf

Abstract:
The widespread usage of Microsoft Windows has unfortunately led to a surge in malware, posing a serious threat to the security and privacy of millions of users. In response, the research community has mobilized, with numerous efforts dedicated to strengthening defenses against these threats. The primary goal of these techniques is to detect malicious software early, preventing attacks before any damage occurs. However, many of these methods either claim that packing has minimal impact on malware detection or fail to address the reliability of their approaches when applied to packed samples. Consequently, they are not capable of assisting victims in handling packed programs or recovering from the damages caused by untimely malware detection. In light of these challenges, we propose VisUnpack, a static analysis-based data visualization framework for bolstering attack prevention while aiding recovery post-attack by unveiling malware patterns and offering more detailed information including both malware class and family. Our method includes unpacking packed malware programs, calculating local similarity descriptors based on basic blocks, enhancing correlations between descriptors, and refining them by minimizing noises to obtain self-analysis descriptors. Moreover, we employ machine learning to learn the correlations of self-analysis descriptors through architectural learning for final classification. Our comprehensive evaluation of VisUnpack based on a freshly gathered dataset with over 27,106 samples confirms its capability in accurately classifying malware programs with a precision of 99.7%. Additionally, VisUnpack reveals that most antivirus products in VirusTotal can not handle packed samples properly or provide precise malware classification information. We also achieve over 97% space savings compared to existing data visualization based methods.

Paperid: 1285, https://arxiv.org/pdf/2501.05793.pdf

Abstract:
To defend against Advanced Persistent Threats on the endpoint, threat hunting employs security knowledge such as cyber threat intelligence to continuously analyze system audit logs through retrospective scanning, querying, or pattern matching, aiming to uncover attack patterns/graphs that traditional detection methods (e.g., recognition for Point of Interest) fail to capture. However, existing threat hunting systems based on provenance graphs face challenges of high false negatives, high false positives, and low efficiency when confronted with diverse attack tactics and voluminous audit logs. To address these issues, we propose a system called Actminer, which constructs query graphs from descriptive relationships in cyber threat intelligence reports for precise threat hunting (i.e., graph alignment) on provenance graphs. First, we present a heuristic search strategy based on equivalent semantic transfer to reduce false negatives. Second, we establish a filtering mechanism based on causal relationships of attack behaviors to mitigate false positives. Finally, we design a tree structure to incrementally update the alignment results, significantly improving hunting efficiency. Evaluation on the DARPA Engagement dataset demonstrates that compared to the SOTA POIROT, Actminer reduces false positives by 39.1%, eliminates all false negatives, and effectively counters adversarial attacks.

Paperid: 1286, https://arxiv.org/pdf/2501.05258.pdf

Abstract:
In today's digital landscape, the importance of timely and accurate vulnerability detection has significantly increased. This paper presents a novel approach that leverages transformer-based models and machine learning techniques to automate the identification of software vulnerabilities by analyzing GitHub issues. We introduce a new dataset specifically designed for classifying GitHub issues relevant to vulnerability detection. We then examine various classification techniques to determine their effectiveness. The results demonstrate the potential of this approach for real-world application in early vulnerability detection, which could substantially reduce the window of exploitation for software vulnerabilities. This research makes a key contribution to the field by providing a scalable and computationally efficient framework for automated detection, enabling the prevention of compromised software usage before official notifications. This work has the potential to enhance the security of open-source software ecosystems.

Paperid: 1287, https://arxiv.org/pdf/2501.03977.pdf

Abstract:
Cyber Spectrum Intelligence (SpecInt) is emerging as a concept that extends beyond basic {\em spectrum sensing} and {\em signal intelligence} to encompass a broader set of capabilities and technologies aimed at monitoring the use of the radio spectrum and extracting information. SpecInt merges traditional spectrum sensing techniques with Artificial Intelligence (AI) and parallel processing to enhance the ability to extract and correlate simultaneous events occurring on various frequencies, allowing for a new wave of intelligence applications. This paper provides an overview of the emerging SpecInt research area, characterizing the system architecture and the most relevant applications for cyber-physical security. We identify five subcategories of spectrum intelligence for cyber-physical security, encompassing Device Intelligence, Channel Intelligence, Location Intelligence, Communication Intelligence, and Ambient Intelligence. We also provide preliminary results based on an experimental testbed showing the viability, feasibility, and potential of this emerging application area. Finally, we point out current research challenges and future directions paving the way for further research in this domain.

Paperid: 1288, https://arxiv.org/pdf/2501.03695.pdf

Abstract:
With the advancement of blockchain technology, chained Byzantine Fault Tolerant (BFT) protocols have been increasingly adopted in practical systems, making their performance a crucial aspect of the study. In this paper, we introduce a unified framework utilizing Markov Decision Processes (MDP) to model and assess the performance of three prominent chained BFT protocols. Our framework effectively captures complex adversarial behaviors, focusing on two key performance metrics: chain growth and commitment rate. We implement the optimal attack strategies obtained from MDP analysis on an existing evaluation platform for chained BFT protocols and conduct extensive experiments under various settings to validate our theoretical results. Through rigorous theoretical analysis and thorough practical experiments, we provide an in-depth evaluation of chained BFT protocols under diverse attack scenarios, uncovering optimal attack strategies. Contrary to conventional belief, our findings reveal that while responsiveness can enhance performance, it is not universally beneficial across all scenarios. This work not only deepens our understanding of chained BFT protocols, but also offers valuable insights and analytical tools that can inform the design of more robust and efficient protocols.

Paperid: 1289, https://arxiv.org/pdf/2501.03245.pdf

Abstract:
Elliptic Curve Cryptography (ECC) is an encryption method that provides security comparable to traditional techniques like Rivest-Shamir-Adleman (RSA) but with lower computational complexity and smaller key sizes, making it a competitive option for applications such as blockchain, secure multi-party computation, and database security. However, the throughput of ECC is still hindered by the significant performance overhead associated with elliptic curve (EC) operations. This paper presents gECC, a versatile framework for ECC optimized for GPU architectures, specifically engineered to achieve high-throughput performance in EC operations. gECC incorporates batch-based execution of EC operations and microarchitecture-level optimization of modular arithmetic. It employs Montgomery's trick to enable batch EC computation and incorporates novel computation parallelization and memory management techniques to maximize the computation parallelism and minimize the access overhead of GPU global memory. Also, we analyze the primary bottleneck in modular multiplication by investigating how the user codes of modular multiplication are compiled into hardware instructions and what these instructions' issuance rates are. We identify that the efficiency of modular multiplication is highly dependent on the number of Integer Multiply-Add (IMAD) instructions. To eliminate this bottleneck, we propose techniques to minimize the number of IMAD instructions by leveraging predicate registers to pass the carry information and using addition and subtraction instructions (IADD3) to replace IMAD instructions. Our results show that, for ECDSA and ECDH, gECC can achieve performance improvements of 5.56x and 4.94x, respectively, compared to the state-of-the-art GPU-based system. In a real-world blockchain application, we can achieve performance improvements of 1.56x, compared to the state-of-the-art CPU-based system.

Paperid: 1290, https://arxiv.org/pdf/2501.02970.pdf

Abstract:
With the growing popularity of blockchains, modern chained BFT protocols combining chaining and leader rotation to obtain better efficiency and leadership democracy have received increasing interest. Although the efficiency provisions of chained BFT protocols have been thoroughly analyzed, the leadership democracy has received little attention in prior work. In this paper, we scrutinize the leadership democracy of four representative chained BFT protocols, especially under attack. To this end, we propose a unified framework with two evaluation metrics, i.e., chain quality and censorship resilience, and quantitatively analyze chosen protocols through the Markov Decision Process (MDP). With this framework, we further examine the impact of two key components, i.e., voting pattern and leader rotation on leadership democracy. Our results indicate that leader rotation is not enough to provide the leadership democracy guarantee; an adversary could utilize the design, e.g., voting pattern, to deteriorate the leadership democracy significantly. Based on the analysis results, we propose customized countermeasures for three evaluated protocols to improve their leadership democracy with only slight protocol overhead and no change of consensus rules. We also discuss future directions toward building more democratic chained BFT protocols.

Paperid: 1291, https://arxiv.org/pdf/2501.02441.pdf

Abstract:
Large Language Models (LLMs) are rapidly gaining enormous popularity in recent years. However, the training of LLMs has raised significant privacy and legal concerns, particularly regarding the inclusion of copyrighted materials in their training data without proper attribution or licensing, which falls under the broader issue of data misappropriation. In this article, we focus on a specific problem of data misappropriation detection, namely, to determine whether a given LLM has incorporated data generated by another LLM. To address this issue, we propose embedding watermarks into the copyrighted training data and formulating the detection of data misappropriation as a hypothesis testing problem. We develop a general statistical testing framework, construct a pivotal statistic, determine the optimal rejection threshold, and explicitly control the type I and type II errors. Furthermore, we establish the asymptotic optimality properties of the proposed tests, and demonstrate its empirical effectiveness through intensive numerical experiments.

Paperid: 1292, https://arxiv.org/pdf/2501.00446.pdf

Abstract:
As the scope and impact of cyber threats have expanded, analysts utilize audit logs to hunt threats and investigate attacks. The provenance graphs constructed from kernel logs are increasingly considered as an ideal data source due to their powerful semantic expression and attack historic correlation ability. However, storing provenance graphs with traditional databases faces the challenge of high storage overhead, given the high frequency of kernel events and the persistence of attacks. To address this, we propose Dehydrator, an efficient provenance graph storage system. For the logs generated by auditing frameworks, Dehydrator uses field mapping encoding to filter field-level redundancy, hierarchical encoding to filter structure-level redundancy, and finally learns a deep neural network to support batch querying. We have conducted evaluations on seven datasets totaling over one billion log entries. Experimental results show that Dehydrator reduces the storage space by 84.55%. Dehydrator is 7.36 times more efficient than PostgreSQL, 7.16 times than Neo4j, and 16.17 times than Leonard (the work most closely related to Dehydrator, published at Usenix Security'23).

Paperid: 1293, https://arxiv.org/pdf/2501.00438.pdf

Abstract:
As Advanced Persistent Threat (APT) complexity increases, provenance data is increasingly used for detection. Anomaly-based systems are gaining attention due to their attack-knowledge-agnostic nature and ability to counter zero-day vulnerabilities. However, traditional detection paradigms, which train on offline, limited-size data, often overlook concept drift - unpredictable changes in streaming data distribution over time. This leads to high false positive rates. We propose incremental learning as a new paradigm to mitigate this issue. However, we identify FOUR CHALLENGES while integrating incremental learning as a new paradigm. First, the long-running incremental system must combat catastrophic forgetting (C1) and avoid learning malicious behaviors (C2). Then, the system needs to achieve precise alerts (C3) and reconstruct attack scenarios (C4). We present METANOIA, the first lifelong detection system that mitigates the high false positives due to concept drift. It connects pseudo edges to combat catastrophic forgetting, transfers suspicious states to avoid learning malicious behaviors, filters nodes at the path-level to achieve precise alerts, and constructs mini-graphs to reconstruct attack scenarios. Using state-of-the-art benchmarks, we demonstrate that METANOIA improves precision performance at the window-level, graph-level, and node-level by 30%, 54%, and 29%, respectively, compared to previous approaches.

Paperid: 1294, https://arxiv.org/pdf/2506.23814.pdf

Abstract:
Several recent works focused on the best practices for applying machine learning to cybersecurity. In the context of malware, TESSERACT highlighted the impact of concept drift on detection performance and suggested temporal and spatial constraints to be enforced to ensure realistic time-aware evaluations, which have been adopted by the community. In this paper, we demonstrate striking discrepancies in the performance of learning-based malware detection across the same time frame when evaluated on two representative Android malware datasets used in top-tier security conferences, both adhering to established sampling and evaluation guidelines. This questions our ability to understand how current state-of-the-art approaches would perform in realistic scenarios. To address this, we identify five novel temporal and spatial bias factors that affect realistic evaluations. We thoroughly evaluate the impact of these factors in the Android malware domain on two representative datasets and five Android malware classifiers used or proposed in top-tier security conferences. For each factor, we provide practical and actionable recommendations that the community should integrate in their methodology for more realistic and reproducible settings.

Paperid: 1295, https://arxiv.org/pdf/2506.23294.pdf

Abstract:
Digital signatures are crucial for securing Central Bank Digital Currencies (CBDCs) transactions. Like most forms of digital currencies, CBDC solutions rely on signatures for transaction authenticity and integrity, leading to major issues in the case of private key compromise. Our work explores threshold signature schemes (TSSs) in the context of CBDCs. TSSs allow distributed key management and signing, reducing the risk of a compromised key. We analyze CBDC-specific requirements, considering the applicability of TSSs, and use Filia CBDC solution as a base for a detailed evaluation. As most of the current solutions rely on ECDSA for compatibility, we focus on ECDSA-based TSSs and their supporting libraries. Our performance evaluation measured the computational and communication complexity across key processes, as well as the throughput and latency of end-to-end transactions. The results confirm that TSS can enhance the security of CBDC implementations while maintaining acceptable performance for real-world deployments.

Paperid: 1296, https://arxiv.org/pdf/2506.21842.pdf

Abstract:
Quantum Machine Learning (QML) integrates quantum computing with classical machine learning, primarily to solve classification, regression and generative tasks. However, its rapid development raises critical security challenges in the Noisy Intermediate-Scale Quantum (NISQ) era. This chapter examines adversarial threats unique to QML systems, focusing on vulnerabilities in cloud-based deployments, hybrid architectures, and quantum generative models. Key attack vectors include model stealing via transpilation or output extraction, data poisoning through quantum-specific perturbations, reverse engineering of proprietary variational quantum circuits, and backdoor attacks. Adversaries exploit noise-prone quantum hardware and insufficiently secured QML-as-a-Service (QMLaaS) workflows to compromise model integrity, ownership, and functionality. Defense mechanisms leverage quantum properties to counter these threats. Noise signatures from training hardware act as non-invasive watermarks, while hardware-aware obfuscation techniques and ensemble strategies disrupt cloning attempts. Emerging solutions also adapt classical adversarial training and differential privacy to quantum settings, addressing vulnerabilities in quantum neural networks and generative architectures. However, securing QML requires addressing open challenges such as balancing noise levels for reliability and security, mitigating cross-platform attacks, and developing quantum-classical trust frameworks. This chapter summarizes recent advances in attacks and defenses, offering a roadmap for researchers and practitioners to build robust, trustworthy QML systems resilient to evolving adversarial landscapes.

Paperid: 1297, https://arxiv.org/pdf/2506.18114.pdf

Abstract:
The rapid expansion of the Internet of Things (IoT) has introduced significant security challenges, necessitating efficient and adaptive Intrusion Detection Systems (IDS). Traditional IDS models often overlook the temporal characteristics of network traffic, limiting their effectiveness in early threat detection. We propose a Transformer-based Early Intrusion Detection System (EIDS) that incorporates dynamic temporal positional encodings to enhance detection accuracy while maintaining computational efficiency. By leveraging network flow timestamps, our approach captures both sequence structure and timing irregularities indicative of malicious behaviour. Additionally, we introduce a data augmentation pipeline to improve model robustness. Evaluated on the CICIoT2023 dataset, our method outperforms existing models in both accuracy and earliness. We further demonstrate its real-time feasibility on resource-constrained IoT devices, achieving low-latency inference and minimal memory footprint.

Paperid: 1298, https://arxiv.org/pdf/2506.17368.pdf

Abstract:
Large language models based on Mixture-of-Experts have achieved substantial gains in efficiency and scalability, yet their architectural uniqueness introduces underexplored safety alignment challenges. Existing safety alignment strategies, predominantly designed for dense models, are ill-suited to address MoE-specific vulnerabilities. In this work, we formalize and systematically study MoE model's positional vulnerability - the phenomenon where safety-aligned behaviors rely on specific expert modules, revealing critical risks inherent to MoE architectures. To this end, we present SAFEx, an analytical framework that robustly identifies, characterizes, and validates the safety-critical experts using a novel Stability-based Expert Selection (SES) algorithm. Notably, our approach enables the explicit decomposition of safety-critical experts into distinct functional groups, including those responsible for harmful content detection and those controlling safe response generation. Extensive experiments on mainstream MoE models, such as the recently released Qwen3-MoE, demonstrated that their intrinsic safety mechanisms heavily rely on a small subset of positional experts. Disabling these experts significantly compromised the models' ability to refuse harmful requests. For Qwen3-MoE with 6144 experts (in the FNN layer), we find that disabling as few as 12 identified safety-critical experts can cause the refusal rate to drop by 22%, demonstrating the disproportionate impact of a small set of experts on overall model safety.

Paperid: 1299, https://arxiv.org/pdf/2506.16968.pdf

Abstract:
Cyber Threat Intelligence (CTI) parsing aims to extract key threat information from massive data, transform it into actionable intelligence, enhance threat detection and defense efficiency, including attack graph construction, intelligence fusion and indicator extraction. Among these research topics, Attack Graph Construction (AGC) is essential for visualizing and understanding the potential attack paths of threat events from CTI reports. Existing approaches primarily construct the attack graphs purely from the textual data to reveal the logical threat relationships between entities within the attack behavioral sequence. However, they typically overlook the specific threat information inherent in visual modalities, which preserves the key threat details from inherently-multimodal CTI report. Therefore, we enhance the effectiveness of attack graph construction by analyzing visual information through Multimodal Large Language Models (MLLMs). Specifically, we propose a novel framework, MM-AttacKG, which can effectively extract key information from threat images and integrate it into attack graph construction, thereby enhancing the comprehensiveness and accuracy of attack graphs. It first employs a threat image parsing module to extract critical threat information from images and generate descriptions using MLLMs. Subsequently, it builds an iterative question-answering pipeline tailored for image parsing to refine the understanding of threat images. Finally, it achieves content-level integration between attack graphs and image-based answers through MLLMs, completing threat information enhancement. The experimental results demonstrate that MM-AttacKG can accurately identify key information in threat images and significantly improve the quality of multimodal attack graph construction, effectively addressing the shortcomings of existing methods in utilizing image-based threat information.

Paperid: 1300, https://arxiv.org/pdf/2506.12299.pdf

Abstract:
The recent advancements in Large Language Models(LLMs) have had a significant impact on a wide range of fields, from general domains to specialized areas. However, these advancements have also significantly increased the potential for malicious users to exploit harmful and jailbreak prompts for malicious attacks. Although there have been many efforts to prevent harmful prompts and jailbreak prompts, protecting LLMs from such malicious attacks remains an important and challenging task. In this paper, we propose QGuard, a simple yet effective safety guard method, that utilizes question prompting to block harmful prompts in a zero-shot manner. Our method can defend LLMs not only from text-based harmful prompts but also from multi-modal harmful prompt attacks. Moreover, by diversifying and modifying guard questions, our approach remains robust against the latest harmful prompts without fine-tuning. Experimental results show that our model performs competitively on both text-only and multi-modal harmful datasets. Additionally, by providing an analysis of question prompting, we enable a white-box analysis of user inputs. We believe our method provides valuable insights for real-world LLM services in mitigating security risks associated with harmful prompts.

Paperid: 1301, https://arxiv.org/pdf/2506.12195.pdf

Abstract:
Quantum communication is poised to become a foundational element of next-generation networking, offering transformative capabilities in security, entanglement-based connectivity, and computational offloading. However, the classical OSI model-designed for deterministic and error-tolerant systems-cannot support quantum-specific phenomena such as coherence fragility, probabilistic entanglement, and the no-cloning theorem. This paper provides a comprehensive survey and proposes an architectural redesign of the OSI model for quantum networks in the context of 7G. We introduce a Quantum-Converged OSI stack by extending the classical model with Layer 0 (Quantum Substrate) and Layer 8 (Cognitive Intent), supporting entanglement, teleportation, and semantic orchestration via LLMs and QML. Each layer is redefined to incorporate quantum mechanisms such as enhanced MAC protocols, fidelity-aware routing, and twin-based applications. This survey consolidates over 150 research works from IEEE, ACM, MDPI, arXiv, and Web of Science (2018-2025), classifying them by OSI layer, enabling technologies such as QKD, QEC, PQC, and RIS, and use cases such as satellite QKD, UAV swarms, and quantum IoT. A taxonomy of cross-layer enablers-such as hybrid quantum-classical control, metadata-driven orchestration, and blockchain-integrated quantum trust-is provided, along with simulation tools including NetSquid, QuNetSim, and QuISP. We present several domain-specific applications, including quantum healthcare telemetry, entangled vehicular networks, and satellite mesh overlays. An evaluation framework is proposed based on entropy throughput, coherence latency, and entanglement fidelity. Key future directions include programmable quantum stacks, digital twins, and AI-defined QNet agents, laying the groundwork for a scalable, intelligent, and quantum-compliant OSI framework for 7G and beyond.

Paperid: 1302, https://arxiv.org/pdf/2506.11635.pdf

Abstract:
The continuous growth of the e-commerce industry attracts fraudsters who exploit stolen credit card details. Companies often investigate suspicious transactions in order to retain customer trust and address gaps in their fraud detection systems. However, analysts are overwhelmed with an enormous number of alerts from credit card transaction monitoring systems. Each alert investigation requires from the fraud analysts careful attention, specialized knowledge, and precise documentation of the outcomes, leading to alert fatigue. To address this, we propose a fraud analyst assistant (FAA) framework, which employs multi-modal large language models (LLMs) to automate credit card fraud investigations and generate explanatory reports. The FAA framework leverages the reasoning, code execution, and vision capabilities of LLMs to conduct planning, evidence collection, and analysis in each investigation step. A comprehensive empirical evaluation of 500 credit card fraud investigations demonstrates that the FAA framework produces reliable and efficient investigations comprising seven steps on average. Thus we found that the FAA framework can automate large parts of the workload and help reduce the challenges faced by fraud analysts.

Paperid: 1303, https://arxiv.org/pdf/2506.10175.pdf

Abstract:
Effective attribution of Advanced Persistent Threats (APTs) increasingly hinges on the ability to correlate behavioral patterns and reason over complex, varied threat intelligence artifacts. We present AURA (Attribution Using Retrieval-Augmented Agents), a multi-agent, knowledge-enhanced framework for automated and interpretable APT attribution. AURA ingests diverse threat data including Tactics, Techniques, and Procedures (TTPs), Indicators of Compromise (IoCs), malware details, adversarial tools, and temporal information, which are processed through a network of collaborative agents. These agents are designed for intelligent query rewriting, context-enriched retrieval from structured threat knowledge bases, and natural language justification of attribution decisions. By combining Retrieval-Augmented Generation (RAG) with Large Language Models (LLMs), AURA enables contextual linking of threat behaviors to known APT groups and supports traceable reasoning across multiple attack phases. Experiments on recent APT campaigns demonstrate AURA's high attribution consistency, expert-aligned justifications, and scalability. This work establishes AURA as a promising direction for advancing transparent, data-driven, and scalable threat attribution using multi-agent intelligence.

Paperid: 1304, https://arxiv.org/pdf/2506.10171.pdf

Abstract:
Large Language Model agents have begun to appear as personal assistants, customer service bots, and clinical aides. While these applications deliver substantial operational benefits, they also require continuous access to sensitive data, which increases the likelihood of unauthorized disclosures. This study proposes an auditing framework for conversational privacy that quantifies and audits these risks. The proposed Conversational Manipulation for Privacy Leakage (CMPL) framework, is an iterative probing strategy designed to stress-test agents that enforce strict privacy directives. Rather than focusing solely on a single disclosure event, CMPL simulates realistic multi-turn interactions to systematically uncover latent vulnerabilities. Our evaluation on diverse domains, data modalities, and safety configurations demonstrate the auditing framework's ability to reveal privacy risks that are not deterred by existing single-turn defenses. In addition to introducing CMPL as a diagnostic tool, the paper delivers (1) an auditing procedure grounded in quantifiable risk metrics and (2) an open benchmark for evaluation of conversational privacy across agent implementations.

Paperid: 1305, https://arxiv.org/pdf/2506.09600.pdf

Abstract:
Task-oriented LLM-based agents are increasingly used in domains with strict policies, such as refund eligibility or cancellation rules. The challenge lies in ensuring that the agent consistently adheres to these rules and policies, appropriately refusing any request that would violate them, while still maintaining a helpful and natural interaction. This calls for the development of tailored design and evaluation methodologies to ensure agent resilience against malicious user behavior. We propose a novel threat model that focuses on adversarial users aiming to exploit policy-adherent agents for personal benefit. To address this, we present CRAFT, a multi-agent red-teaming system that leverages policy-aware persuasive strategies to undermine a policy-adherent agent in a customer-service scenario, outperforming conventional jailbreak methods such as DAN prompts, emotional manipulation, and coercive. Building upon the existing tau-bench benchmark, we introduce tau-break, a complementary benchmark designed to rigorously assess the agent's robustness against manipulative user behavior. Finally, we evaluate several straightforward yet effective defense strategies. While these measures provide some protection, they fall short, highlighting the need for stronger, research-driven safeguards to protect policy-adherent agents from adversarial attacks

Paperid: 1306, https://arxiv.org/pdf/2506.09474.pdf

Abstract:
We explore covert entanglement generation over the lossy thermal-noise bosonic channel, which is a quantum-mechanical model of many practical settings, including optical, microwave, and radio-frequency (RF) channels. Covert communication ensures that an adversary is unable to detect the presence of transmissions, which are concealed in channel noise. We show that a $\textit{square root law}$ (SRL) for covert entanglement generation similar to that for classical: $L_{\rm EG}\sqrt{n}$ entangled bits (ebits) can be generated covertly and reliably over $n$ uses of a bosonic channel. We report a single-letter expression for optimal $L_{\rm EG}$ as well as an achievable method. We additionally analyze the performance of covert entanglement generation using single- and dual-rail photonic qubits, which may be more practical for physical implementation.

Paperid: 1307, https://arxiv.org/pdf/2506.08837.pdf

Abstract:
As AI agents powered by Large Language Models (LLMs) become increasingly versatile and capable of addressing a broad spectrum of tasks, ensuring their security has become a critical challenge. Among the most pressing threats are prompt injection attacks, which exploit the agent's resilience on natural language inputs -- an especially dangerous threat when agents are granted tool access or handle sensitive information. In this work, we propose a set of principled design patterns for building AI agents with provable resistance to prompt injection. We systematically analyze these patterns, discuss their trade-offs in terms of utility and security, and illustrate their real-world applicability through a series of case studies.

Paperid: 1308, https://arxiv.org/pdf/2506.06161.pdf

Abstract:
Binary code similarity analysis (BCSA) serves as a foundational technique for binary analysis tasks such as vulnerability detection and malware identification. Existing graph based BCSA approaches capture more binary code semantics and demonstrate remarkable performance. However, when code obfuscation is applied, the unstable control flow structure degrades their performance. To address this issue, we develop ORCAS, an Obfuscation-Resilient BCSA model based on Dominance Enhanced Semantic Graph (DESG). The DESG is an original binary code representation, capturing more binaries' implicit semantics without control flow structure, including inter-instruction relations (e.g., def-use), inter-basic block relations (i.e., dominance and post-dominance), and instruction-basic block relations. ORCAS takes binary functions from different obfuscation options, optimization levels, and instruction set architectures as input and scores their semantic similarity more robustly. Extensive experiments have been conducted on ORCAS against eight baseline approaches over the BinKit dataset. For example, ORCAS achieves an average 12.1% PR-AUC improvement when using combined three obfuscation options compared to the state-of-the-art approaches. In addition, an original obfuscated real-world vulnerability dataset has been constructed and released to facilitate a more comprehensive research on obfuscated binary code analysis. ORCAS outperforms the state-of-the-art approaches over this newly released real-world vulnerability dataset by up to a recall improvement of 43%.

Paperid: 1309, https://arxiv.org/pdf/2506.05594.pdf

Abstract:
Large Language Models (LLMs) have transformed natural language processing, demonstrating impressive capabilities across diverse tasks. However, deploying these models introduces critical risks related to intellectual property violations and potential misuse, particularly as adversaries can imitate these models to steal services or generate misleading outputs. We specifically focus on model stealing attacks, as they are highly relevant to proprietary LLMs and pose a serious threat to their security, revenue, and ethical deployment. While various watermarking techniques have emerged to mitigate these risks, it remains unclear how far the community and industry have progressed in developing and deploying watermarks in LLMs. To bridge this gap, we aim to develop a comprehensive systematization for watermarks in LLMs by 1) presenting a detailed taxonomy for watermarks in LLMs, 2) proposing a novel intellectual property classifier to explore the effectiveness and impacts of watermarks on LLMs under both attack and attack-free environments, 3) analyzing the limitations of existing watermarks in LLMs, and 4) discussing practical challenges and potential future directions for watermarks in LLMs. Through extensive experiments, we show that despite promising research outcomes and significant attention from leading companies and community to deploy watermarks, these techniques have yet to reach their full potential in real-world applications due to their unfavorable impacts on model utility of LLMs and downstream tasks. Our findings provide an insightful understanding of watermarks in LLMs, highlighting the need for practical watermarks solutions tailored to LLM deployment.

Paperid: 1310, https://arxiv.org/pdf/2506.05242.pdf

Abstract:
Large language models (LLMs) with diverse capabilities are increasingly being deployed in local environments, presenting significant security and controllability challenges. These locally deployed LLMs operate outside the direct control of developers, rendering them more susceptible to abuse. Existing mitigation techniques mainly designed for cloud-based LLM services are frequently circumvented or ineffective in deployer-controlled environments. We propose SECNEURON, the first framework that seamlessly embeds classic access control within the intrinsic capabilities of LLMs, achieving reliable, cost-effective, flexible, and certified abuse control for local deployed LLMs. SECNEURON employs neuron-level encryption and selective decryption to dynamically control the task-specific capabilities of LLMs, limiting unauthorized task abuse without compromising others. We first design a task-specific neuron extraction mechanism to decouple logically related neurons and construct a layered policy tree for handling coupled neurons. We then introduce a flexible and efficient hybrid encryption framework for millions of neurons in LLMs. Finally, we developed a distribution-based decrypted neuron detection mechanism on ciphertext to ensure the effectiveness of partially decrypted LLMs. We proved that SECNEURON satisfies IND-CPA Security and Collusion Resistance Security under the Task Controllability Principle. Experiments on various task settings show that SECNEURON limits unauthorized task accuracy to below 25% while keeping authorized accuracy loss with 2%. Using an unauthorized Code task example, the accuracy of abuse-related malicious code generation was reduced from 59% to 15%. SECNEURON also mitigates unauthorized data leakage, reducing PII extraction rates to below 5% and membership inference to random guesses.

Paperid: 1311, https://arxiv.org/pdf/2506.05022.pdf

Abstract:
Address Sanitizer (ASan) is a sharp weapon for detecting memory safety violations, including temporal and spatial errors hidden in C/C++ programs during execution. However, ASan incurs significant runtime overhead, which limits its efficiency in testing large software. The overhead mainly comes from sanitizer checks due to the frequent and expensive shadow memory access. Over the past decade, many methods have been developed to speed up ASan by eliminating and accelerating sanitizer checks, however, they either fail to adequately eliminate redundant checks or compromise detection capabilities. To address this issue, this paper presents Tech-ASan, a two-stage check based technique to accelerate ASan with safety assurance. First, we propose a novel two-stage check algorithm for ASan, which leverages magic value comparison to reduce most of the costly shadow memory accesses. Second, we design an efficient optimizer to eliminate redundant checks, which integrates a novel algorithm for removing checks in loops. Third, we implement Tech-ASan as a memory safety tool based on the LLVM compiler infrastructure. Our evaluation using the SPEC CPU2006 benchmark shows that Tech-ASan outperforms the state-of-the-art methods with 33.70% and 17.89% less runtime overhead than ASan and ASan--, respectively. Moreover, Tech-ASan detects 56 fewer false negative cases than ASan and ASan-- when testing on the Juliet Test Suite under the same redzone setting.

Paperid: 1312, https://arxiv.org/pdf/2506.04963.pdf

Abstract:
Modern search engines extensively personalize results by building detailed user profiles based on query history and behaviour. While personalization can enhance relevance, it introduces privacy risks and can lead to filter bubbles. This paper proposes and evaluates a lightweight, client-side query obfuscation strategy using randomly generated multilingual search queries to disrupt user profiling. Through controlled experiments on the Seznam.cz search engine, we assess the impact of interleaving real queries with obfuscating noise in various language configurations and ratios. Our findings show that while displayed search results remain largely stable, the search engine's identified user interests shift significantly under obfuscation. We further demonstrate that such random queries can prevent accurate profiling and overwrite established user profiles. This study provides practical evidence for query obfuscation as a viable privacy-preserving mechanism and introduces a tool that enables users to autonomously protect their search behaviour without modifying existing infrastructure.

Paperid: 1313, https://arxiv.org/pdf/2505.24089.pdf

Abstract:
We develop practical and theoretically grounded membership inference attacks (MIAs) against both independent and identically distributed (i.i.d.) data and graph-structured data. Building on the Bayesian decision-theoretic framework of Sablayrolles et al., we derive the Bayes-optimal membership inference rule for node-level MIAs against graph neural networks, addressing key open questions about optimal query strategies in the graph setting. We introduce BASE and G-BASE, computationally efficient approximations of the Bayes-optimal attack. G-BASE achieves superior performance compared to previously proposed classifier-based node-level MIA attacks. BASE, which is also applicable to non-graph data, matches or exceeds the performance of prior state-of-the-art MIAs, such as LiRA and RMIA, at a significantly lower computational cost. Finally, we show that BASE and RMIA are equivalent under a specific hyperparameter setting, providing a principled, Bayes-optimal justification for the RMIA attack.

Paperid: 1314, https://arxiv.org/pdf/2505.22162.pdf

Abstract:
Paramount to vehicle safety, broadcasted Cooperative Awareness Messages (CAMs) and Decentralized Environmental Notification Messages (DENMs) are pseudonymously authenticated for security and privacy protection, with each node needing to have all incoming messages validated within an expiration deadline. This creates an asymmetry that can be easily exploited by external adversaries to launch a clogging Denial of Service (DoS) attack: each forged VC message forces all neighboring nodes to cryptographically validate it; at increasing rates, easy to generate forged messages gradually exhaust processing resources and severely degrade or deny timely validation of benign CAMs/DENMs. The result can be catastrophic when awareness of neighbor vehicle positions or critical reports are missed. We address this problem making the standardized VC pseudonymous authentication DoS-resilient. We propose efficient cryptographic constructs, which we term message verification facilitators, to prioritize processing resources for verification of potentially valid messages among bogus messages and verify multiple messages based on one signature verification. Any message acceptance is strictly based on public-key based message authentication/verification for accountability, i.e., non-repudiation is not sacrificed, unlike symmetric key based approaches. This further enables drastic misbehavior detection, also exploiting the newly introduced facilitators, based on probabilistic signature verification and cross-checking over multiple facilitators verifying the same message; while maintaining verification latency low even when under attack, trading off modest communication overhead. Our facilitators can also be used for efficient discovery and verification of DENM or any event-driven message, including misbehavior evidence used for our scheme.

Paperid: 1315, https://arxiv.org/pdf/2505.20955.pdf

Abstract:
Diffusion models have achieved tremendous success in image generation, but they also raise significant concerns regarding privacy and copyright issues. Membership Inference Attacks (MIAs) are designed to ascertain whether specific data were utilized during a model's training phase. As current MIAs for diffusion models typically exploit the model's image prediction ability, we formalize them into a unified general paradigm which computes the membership score for membership identification. Under this paradigm, we empirically find that existing attacks overlook the inherent deficiency in how diffusion models process high-frequency information. Consequently, this deficiency leads to member data with more high-frequency content being misclassified as hold-out data, and hold-out data with less high-frequency content tend to be misclassified as member data. Moreover, we theoretically demonstrate that this deficiency reduces the membership advantage of attacks, thereby interfering with the effective discrimination of member data and hold-out data. Based on this insight, we propose a plug-and-play high-frequency filter module to mitigate the adverse effects of the deficiency, which can be seamlessly integrated into any attacks within this general paradigm without additional time costs. Extensive experiments corroborate that this module significantly improves the performance of baseline attacks across different datasets and models.

Paperid: 1316, https://arxiv.org/pdf/2505.20162.pdf

Abstract:
As large language models grow in capability and agency, identifying vulnerabilities through red-teaming becomes vital for safe deployment. However, traditional prompt-engineering approaches may prove ineffective once red-teaming turns into a weak-to-strong problem, where target models surpass red-teamers in capabilities. To study this shift, we frame red-teaming through the lens of the capability gap between attacker and target. We evaluate more than 500 attacker-target pairs using LLM-based jailbreak attacks that mimic human red-teamers across diverse families, sizes, and capability levels. Three strong trends emerge: (i) more capable models are better attackers, (ii) attack success drops sharply once the target's capability exceeds the attacker's, and (iii) attack success rates correlate with high performance on social science splits of the MMLU-Pro benchmark. From these trends, we derive a jailbreaking scaling law that predicts attack success for a fixed target based on attacker-target capability gap. These findings suggest that fixed-capability attackers (e.g., humans) may become ineffective against future models, increasingly capable open-source models amplify risks for existing systems, and model providers must accurately measure and control models' persuasive and manipulative abilities to limit their effectiveness as attackers.

Paperid: 1317, https://arxiv.org/pdf/2505.19773.pdf

Abstract:
We investigate long-context vulnerabilities in Large Language Models (LLMs) through Many-Shot Jailbreaking (MSJ). Our experiments utilize context length of up to 128K tokens. Through comprehensive analysis with various many-shot attack settings with different instruction styles, shot density, topic, and format, we reveal that context length is the primary factor determining attack effectiveness. Critically, we find that successful attacks do not require carefully crafted harmful content. Even repetitive shots or random dummy text can circumvent model safety measures, suggesting fundamental limitations in long-context processing capabilities of LLMs. The safety behavior of well-aligned models becomes increasingly inconsistent with longer contexts. These findings highlight significant safety gaps in context expansion capabilities of LLMs, emphasizing the need for new safety mechanisms.

Paperid: 1318, https://arxiv.org/pdf/2505.19644.pdf

Abstract:
A key research area in deepfake speech detection is source tracing - determining the origin of synthesised utterances. The approaches may involve identifying the acoustic model (AM), vocoder model (VM), or other generation-specific parameters. However, progress is limited by the lack of a dedicated, systematically curated dataset. To address this, we introduce STOPA, a systematically varied and metadata-rich dataset for deepfake speech source tracing, covering 8 AMs, 6 VMs, and diverse parameter settings across 700k samples from 13 distinct synthesisers. Unlike existing datasets, which often feature limited variation or sparse metadata, STOPA provides a systematically controlled framework covering a broader range of generative factors, such as the choice of the vocoder model, acoustic model, or pretrained weights, ensuring higher attribution reliability. This control improves attribution accuracy, aiding forensic analysis, deepfake detection, and generative model transparency.

Paperid: 1319, https://arxiv.org/pdf/2505.18955.pdf

Abstract:
Motivated by the success of general-purpose large language models (LLMs) in software patching, recent works started to train specialized patching models. Most works trained one model to handle the end-to-end patching pipeline (including issue localization, patch generation, and patch validation). However, it is hard for a small model to handle all tasks, as different sub-tasks have different workflows and require different expertise. As such, by using a 70 billion model, SOTA methods can only reach up to 41% resolved rate on SWE-bench-Verified. Motivated by the collaborative nature, we propose Co-PatcheR, the first collaborative patching system with small and specialized reasoning models for individual components. Our key technique novelties are the specific task designs and training recipes. First, we train a model for localization and patch generation. Our localization pinpoints the suspicious lines through a two-step procedure, and our generation combines patch generation and critique. We then propose a hybrid patch validation that includes two models for crafting issue-reproducing test cases with and without assertions and judging patch correctness, followed by a majority vote-based patch selection. Through extensive evaluation, we show that Co-PatcheR achieves 46% resolved rate on SWE-bench-Verified with only 3 x 14B models. This makes Co-PatcheR the best patcher with specialized models, requiring the least training resources and the smallest models. We conduct a comprehensive ablation study to validate our recipes, as well as our choice of training data number, model size, and testing-phase scaling strategy.

Paperid: 1320, https://arxiv.org/pdf/2505.14607.pdf

Abstract:
User authorization-based access privileges are a key feature in many safety-critical systems, but have not been extensively studied in the large language model (LLM) realm. In this work, drawing inspiration from such access control systems, we introduce sudoLLM, a novel framework that results in multi-role aligned LLMs, i.e., LLMs that account for, and behave in accordance with, user access rights. sudoLLM injects subtle user-based biases into queries and trains an LLM to utilize this bias signal in order to produce sensitive information if and only if the user is authorized. We present empirical results demonstrating that this approach shows substantially improved alignment, generalization, resistance to prefix-based jailbreaking attacks, and ``fails-closed''. The persistent tension between the language modeling objective and safety alignment, which is often exploited to jailbreak LLMs, is somewhat resolved with the aid of the injected bias signal. Our framework is meant as an additional security layer, and complements existing guardrail mechanisms for enhanced end-to-end safety with LLMs.

Paperid: 1321, https://arxiv.org/pdf/2505.14461.pdf

Abstract:
We investigate two natural relaxations of quantum cryptographic primitives. The first involves quantum input sampling, where inputs are generated by a quantum algorithm rather than sampled uniformly at random. Applying this to pseudorandom generators ($\textsf{PRG}$s) and pseudorandom states ($\textsf{PRS}$s), leads to the notions denoted as $\textsf{PRG}^{qs}$ and $\textsf{PRS}^{qs}$, respectively. The second relaxation, $\bot$-pseudodeterminism, relaxes the determinism requirement by allowing the output to be a special symbol $\bot$ on an inverse-polynomial fraction of inputs. We demonstrate an equivalence between bounded-query logarithmic-size $\textsf{PRS}^{qs}$, logarithmic-size $\textsf{PRS}^{qs}$, and $\textsf{PRG}^{qs}$. Moreover, we establish that $\textsf{PRG}^{qs}$ can be constructed from $\bot$-$\textsf{PRG}$s, which in turn were built from logarithmic-size $\textsf{PRS}$. Interestingly, these relations remain unknown in the uniform key setting. To further justify these relaxed models, we present black-box separations. Our results suggest that $\bot$-pseudodeterministic primitives may be weaker than their deterministic counterparts, and that primitives based on quantum input sampling may be inherently weaker than those using uniform sampling. Together, these results provide numerous new insights into the structure and hierarchy of primitives within MicroCrypt.

Paperid: 1322, https://arxiv.org/pdf/2505.12750.pdf

Abstract:
Malware are malicious programs that are grouped into families based on their penetration technique, source code, and other characteristics. Classifying malware programs into their respective families is essential for building effective defenses against cyber threats. Machine learning models have a huge potential in malware detection on mobile devices, as malware families can be recognized by classifying permission data extracted from Android manifest files. Still, the malware classification task is challenging due to the high-dimensional nature of permission data and the limited availability of training samples. In particular, the steady emergence of new malware families makes it impossible to acquire a comprehensive training set covering all the malware classes. In this work, we present a malware classification system that, on top of classifying known malware, detects new ones. In particular, we combine an open-set recognition technique developed within the computer vision community, namely MaxLogit, with a tree-based Gradient Boosting classifier, which is particularly effective in classifying high-dimensional data. Our solution turns out to be very practical, as it can be seamlessly employed in a standard classification workflow, and efficient, as it adds minimal computational overhead. Experiments on public and proprietary datasets demonstrate the potential of our solution, which has been deployed in a business environment.

Paperid: 1323, https://arxiv.org/pdf/2505.11557.pdf

Abstract:
Corporate LLMs are gaining traction for efficient knowledge dissemination and management within organizations. However, as current LLMs are vulnerable to leaking sensitive information, it has proven difficult to apply them in settings where strict access control is necessary. To this end, we design AC-LoRA, an end-to-end system for access control-aware corporate LLM chatbots that maintains a strong information isolation guarantee. AC-LoRA maintains separate LoRA adapters for permissioned datasets, along with the document embedding they are finetuned on. AC-LoRA retrieves a precise set of LoRA adapters based on the similarity score with the user query and their permission. This similarity score is later used to merge the responses if more than one LoRA is retrieved, without requiring any additional training for LoRA routing. We provide an end-to-end prototype of AC-LoRA, evaluate it on two datasets, and show that AC-LoRA matches or even exceeds the performance of state-of-the-art LoRA mixing techniques while providing strong isolation guarantees. Furthermore, we show that AC-LoRA design can be directly applied to different modalities.

Paperid: 1324, https://arxiv.org/pdf/2505.10264.pdf

Abstract:
Federated Learning (FL) enables collaborative training of machine learning models across distributed clients without sharing raw data, ostensibly preserving data privacy. Nevertheless, recent studies have revealed critical vulnerabilities in FL, showing that a malicious central server can manipulate model updates to reconstruct clients' private training data. Existing data reconstruction attacks have important limitations: they often rely on assumptions about the clients' data distribution or their efficiency significantly degrades when batch sizes exceed just a few tens of samples. In this work, we introduce a novel data reconstruction attack that overcomes these limitations. Our method leverages a new geometric perspective on fully connected layers to craft malicious model parameters, enabling the perfect recovery of arbitrarily large data batches in classification tasks without any prior knowledge of clients' data. Through extensive experiments on both image and tabular datasets, we demonstrate that our attack outperforms existing methods and achieves perfect reconstruction of data batches two orders of magnitude larger than the state of the art.

Paperid: 1325, https://arxiv.org/pdf/2505.08771.pdf

Abstract:
We present Kudzu, a high-throughput atomic broadcast protocol with an integrated fast path. Our contribution is based on the combination of two lines of work. Firstly, our protocol achieves finality in just two rounds of communication if all but $p$ out of $n = 3f + 2p + 1$ participating replicas behave correctly, where $f$ is the number of Byzantine faults that are tolerated. Due to the seamless integration of the fast path, even in the presence of more than $p$ faults, our protocol maintains state-of-the-art characteristics. Secondly, our protocol utilizes the bandwidth of participating replicas in a balanced way, alleviating the bottleneck at the leader, and thus enabling high throughput. This is achieved by disseminating blocks using erasure codes. Despite combining a novel set of advantages, Kudzu is remarkably simple: intricacies such as progress certificates, complex view changes, and speculative execution are avoided.

Paperid: 1326, https://arxiv.org/pdf/2505.08148.pdf

Abstract:
Millions of users leverage generative pretrained transformer (GPT)-based language models developed by leading model providers for a wide range of tasks. To support enhanced user interaction and customization, many platforms-such as OpenAI-now enable developers to create and publish tailored model instances, known as custom GPTs, via dedicated repositories or application stores. These custom GPTs empower users to browse and interact with specialized applications designed to meet specific needs. However, as custom GPTs see growing adoption, concerns regarding their security vulnerabilities have intensified. Existing research on these vulnerabilities remains largely theoretical, often lacking empirical, large-scale, and statistically rigorous assessments of associated risks. In this study, we analyze 14,904 custom GPTs to assess their susceptibility to seven exploitable threats, such as roleplay-based attacks, system prompt leakage, phishing content generation, and malicious code synthesis, across various categories and popularity tiers within the OpenAI marketplace. We introduce a multi-metric ranking system to examine the relationship between a custom GPT's popularity and its associated security risks. Our findings reveal that over 95% of custom GPTs lack adequate security protections. The most prevalent vulnerabilities include roleplay-based vulnerabilities (96.51%), system prompt leakage (92.20%), and phishing (91.22%). Furthermore, we demonstrate that OpenAI's foundational models exhibit inherent security weaknesses, which are often inherited or amplified in custom GPTs. These results highlight the urgent need for enhanced security measures and stricter content moderation to ensure the safe deployment of GPT-based applications.

Paperid: 1327, https://arxiv.org/pdf/2505.06836.pdf

Abstract:
Anti-phishing tools typically display generic warnings that offer users limited explanation on why a website is considered malicious, which can prevent end-users from developing the mental models needed to recognize phishing cues on their own. This becomes especially problematic when these tools inevitably fail - particularly against evasive threats, and users are found to be ill-equipped to identify and avoid them independently. To address these limitations, we present PhishXplain (PXP), a real-time explainable phishing warning system designed to augment existing detection mechanisms. PXP empowers users by clearly articulating why a site is flagged as malicious, highlighting suspicious elements using a memory-efficient implementation of LLaMA 3.2. It utilizes a structured two-step prompt architecture to identify phishing features, generate contextual explanations, and render annotated screenshots that visually reinforce the warning. Longitudinally implementing PhishXplain over a month on 7,091 live phishing websites, we found that it can generate warnings for 94% of the sites, with a correctness of 96%. We also evaluated PhishXplain through a user study with 150 participants split into two groups: one received conventional, generic warnings, while the other interacted with PXP's explainable alerts. Participants who received the explainable warnings not only demonstrated a significantly better understanding of phishing indicators but also achieved higher accuracy in identifying phishing threats, even without any warning. Moreover, they reported greater satisfaction and trust in the warnings themselves. These improvements were especially pronounced among users with lower initial levels of cybersecurity proficiency and awareness. To encourage the adoption of this framework, we release PhishXplain as a browser extension.

Paperid: 1328, https://arxiv.org/pdf/2505.06379.pdf

Abstract:
Ensuring data ownership and traceability of unauthorised redistribution are central to safeguarding intellectual property in shared data environments. Data fingerprinting addresses these challenges by embedding recipient-specific marks into the data, typically via content modifications. We propose NCorr-FP, a Neighbourhood-based Correlation-preserving Fingerprinting system for structured tabular data with the main goal of preserving statistical fidelity. The method uses local record similarity and density estimation to guide the insertion of fingerprint bits. The embedding logic is then reversed to extract the fingerprint from a potentially modified dataset. Extensive experiments confirm its effectiveness, fidelity, utility and robustness. Results show that fingerprints are virtually imperceptible, with minute Hellinger distances and KL divergences, even at high embedding ratios. The system also maintains high data utility for downstream predictive tasks. The method achieves 100\% detection confidence under substantial data deletions and remains robust against adaptive and collusion attacks. Satisfying all these requirements concurrently on mixed-type datasets highlights the strong applicability of NCorr-FP to real-world data settings.

Paperid: 1329, https://arxiv.org/pdf/2505.04977.pdf

Abstract:
With the widespread deployment of deep neural network (DNN) models, dynamic watermarking techniques are being used to protect the intellectual property of model owners. However, recent studies have shown that existing watermarking schemes are vulnerable to watermark removal and ambiguity attacks. Besides, the vague criteria for determining watermark presence further increase the likelihood of such attacks. In this paper, we propose a secure DNN watermarking scheme named ChainMarks, which generates secure and robust watermarks by introducing a cryptographic chain into the trigger inputs and utilizes a two-phase Monte Carlo method for determining watermark presence. First, ChainMarks generates trigger inputs as a watermark dataset by repeatedly applying a hash function over a secret key, where the target labels associated with trigger inputs are generated from the digital signature of model owner. Then, the watermarked model is produced by training a DNN over both the original and watermark datasets. To verify watermarks, we compare the predicted labels of trigger inputs with the target labels and determine ownership with a more accurate decision threshold that considers the classification probability of specific models. Experimental results show that ChainMarks exhibits higher levels of robustness and security compared to state-of-the-art watermarking schemes. With a better marginal utility, ChainMarks provides a higher probability guarantee of watermark presence in DNN models with the same level of watermark accuracy.

Paperid: 1330, https://arxiv.org/pdf/2505.02499.pdf

Abstract:
We present \textsc{CHOKE}, a novel code-based hybrid key-encapsulation mechanism (KEM) designed to securely and efficiently transmit multiple session keys simultaneously. By encoding $n$ independent session keys with an individually secure linear code and encapsulating each resulting coded symbol using a separate KEM, \textsc{CHOKE} achieves computational individual security -- each key remains secure as long as at least one underlying KEM remains unbroken. Compared to traditional serial or combiner-based hybrid schemes, \textsc{CHOKE} reduces computational and communication costs by an $n$-fold factor. Furthermore, we show that the communication cost of our construction is optimal under the requirement that each KEM must be used at least once.

Paperid: 1331, https://arxiv.org/pdf/2505.01782.pdf

Abstract:
Kyber is a lattice-based key encapsulation mechanism selected for standardization by the NIST Post-Quantum Cryptography (PQC) project. A critical component of Kyber's key generation process is the sampling of matrix elements from a uniform distribution over the ring Rq . This step is one of the most computationally intensive tasks in the scheme, significantly impacting performance in low-power embedded systems such as Internet of Things (IoT), wearable devices, wireless sensor networks (WSNs), smart cards, TPMs (Trusted Platform Modules), etc. Existing approaches to this sampling, notably conventional SampleNTT and Parse-SPDM3, rely on rejection sampling. Both algorithms require a large number of random bytes, which needs at least three SHAKE-128 squeezing steps per polynomial. As a result, it causes significant amount of latency and energy. In this work, we propose a novel and efficient sampling algorithm, namely Modified SampleNTT, which substantially educes the average number of bits required from SHAKE-128 to generate elements in Rq - achieving approximately a 33% reduction compared to conventional SampleNTT. Modified SampleNTT achieves 99.16% success in generating a complete polynomial using only two SHAKE-128 squeezes, outperforming both state-of-the-art methods, which never succeed in two squeezes of SHAKE-128. Furthermore, our algorithm maintains the same average rejection rate as existing techniques and passes all standard statistical tests for randomness quality. FPGA implementation on Artix-7 demonstrates a 33.14% reduction in energy, 33.32% lower latency, and 0.28% fewer slices compared to SampleNTT. Our results confirm that Modified SampleNTT is an efficient and practical alternative for uniform polynomial sampling in PQC schemes such as Kyber, especially for low-power security processors.

Paperid: 1332, https://arxiv.org/pdf/2505.01518.pdf

Abstract:
The increasing density of modern DRAM has heightened its vulnerability to Rowhammer attacks, which induce bit flips by repeatedly accessing specific memory rows. This paper presents an analysis of bit flip patterns generated by advanced Rowhammer techniques that bypass existing hardware defenses. First, we investigate the phenomenon of adjacent bit flips where two or more physically neighboring bits are corrupted simultaneously and demonstrate they occur with significantly higher frequency than previously documented. We also show that if multiple bits flip within a byte, we can probabilistically model the likelihood of flipped bits appearing adjacently. We also demonstrate that bit flips within a row will naturally cluster together likely due to the underlying physics of the attack. We then investigate two fault injection attacks enabled by multiple adjacent or nearby bit flips. First, we show how these correlated flips enable efficient cryptographic signature correction attacks, demonstrating how such flips could enable ECDSA private key recovery from OpenSSL implementations where single-bit approaches would be unfeasible. Second, we introduce a targeted attack against large language models by exploiting Rowhammer-induced corruptions in tokenizer dictionaries of GGUF model files. This attack effectively rewrites safety instructions in system prompts by swapping safety-critical tokens with benign alternatives, circumventing model guardrails while maintaining normal functionality in other contexts. Our experimental results across multiple DRAM configurations reveal that current memory protection schemes are inadequate against these sophisticated attack vectors, which can achieve their objectives with precise, minimal modifications rather than random corruption.

Paperid: 1333, https://arxiv.org/pdf/2504.21752.pdf

Abstract:
Despite differential privacy (DP) often being considered the de facto standard for data privacy, its realization is vulnerable to unfaithful execution of its mechanisms by servers, especially in distributed settings. Specifically, servers may sample noise from incorrect distributions or generate correlated noise while appearing to follow established protocols. This work analyzes these malicious behaviors in a general differential privacy framework within a distributed client-server-verifier setup. To address these adversarial problems, we propose a novel definition called Verifiable Distributed Differential Privacy (VDDP) by incorporating additional verification mechanisms. We also explore the relationship between zero-knowledge proofs (ZKP) and DP, demonstrating that while ZKPs are sufficient for achieving DP under verifiability requirements, they are not necessary. Furthermore, we develop two novel and efficient mechanisms that satisfy VDDP: (1) the Verifiable Distributed Discrete Laplacian Mechanism (VDDLM), which offers up to a $4 \times 10^5$x improvement in proof generation efficiency with only 0.1-0.2x error compared to the previous state-of-the-art verifiable differentially private mechanism; (2) an improved solution to Verifiable Randomized Response (VRR) under local DP, a special case of VDDP, achieving up a reduction of up to 5000x in communication costs and the verifier's overhead.

Paperid: 1334, https://arxiv.org/pdf/2504.21574.pdf

Abstract:
Generative Artificial Intelligence (GenAI) is rapidly reshaping the global financial landscape, offering unprecedented opportunities to enhance customer engagement, automate complex workflows, and extract actionable insights from vast financial data. This survey provides an overview of GenAI adoption across the financial ecosystem, examining how banks, insurers, asset managers, and fintech startups worldwide are integrating large language models and other generative tools into their operations. From AI-powered virtual assistants and personalized financial advisory to fraud detection and compliance automation, GenAI is driving innovation across functions. However, this transformation comes with significant cybersecurity and ethical risks. We discuss emerging threats such as AI-generated phishing, deepfake-enabled fraud, and adversarial attacks on AI systems, as well as concerns around bias, opacity, and data misuse. The evolving global regulatory landscape is explored in depth, including initiatives by major financial regulators and international efforts to develop risk-based AI governance. Finally, we propose best practices for secure and responsible adoption - including explainability techniques, adversarial testing, auditability, and human oversight. Drawing from academic literature, industry case studies, and policy frameworks, this chapter offers a perspective on how the financial sector can harness GenAI's transformative potential while navigating the complex risks it introduces.

Paperid: 1335, https://arxiv.org/pdf/2504.17921.pdf

Abstract:
In this paper, we investigate how concept-based models (CMs) respond to out-of-distribution (OOD) inputs. CMs are interpretable neural architectures that first predict a set of high-level concepts (e.g., stripes, black) and then predict a task label from those concepts. In particular, we study the impact of concept interventions (i.e., operations where a human expert corrects a CM's mispredicted concepts at test time) on CMs' task predictions when inputs are OOD. Our analysis reveals a weakness in current state-of-the-art CMs, which we term leakage poisoning, that prevents them from properly improving their accuracy when intervened on for OOD inputs. To address this, we introduce MixCEM, a new CM that learns to dynamically exploit leaked information missing from its concepts only when this information is in-distribution. Our results across tasks with and without complete sets of concept annotations demonstrate that MixCEMs outperform strong baselines by significantly improving their accuracy for both in-distribution and OOD samples in the presence and absence of concept interventions.

Paperid: 1336, https://arxiv.org/pdf/2504.17666.pdf

Abstract:
This paper focuses on the problem of evolving Boolean functions of odd sizes with high nonlinearity, a property of cryptographic relevance. Despite its simple formulation, this problem turns out to be remarkably difficult. We perform a systematic evaluation by considering three solution encodings and four problem instances, analyzing how well different types of evolutionary algorithms behave in finding a maximally nonlinear Boolean function. Our results show that genetic programming generally outperforms other evolutionary algorithms, although it falls short of the best-known results achieved by ad-hoc heuristics. Interestingly, by adding local search and restricting the space to rotation symmetric Boolean functions, we show that a genetic algorithm with the bitstring encoding manages to evolve a $9$-variable Boolean function with nonlinearity 241.

Paperid: 1337, https://arxiv.org/pdf/2504.17539.pdf

Abstract:
Blockchain technology enables secure, transparent data management in decentralized systems, supporting applications from cryptocurrencies like Bitcoin to tokenizing real-world assets like property. Its scalability and sustainability hinge on consensus mechanisms balancing security and efficiency. Proof of Work (PoW), used by Bitcoin, ensures security through energy-intensive computations but demands significant resources. Proof of Stake (PoS), as in Ethereum post-Merge, selects validators based on staked cryptocurrency, offering energy efficiency but risking centralization from wealth concentration. With AI models straining computational resources, we propose Proof of Useful Intelligence (PoUI), a hybrid consensus mechanism. In PoUI, workers perform AI tasks like language processing or image analysis to earn coins, which are staked to secure the network, blending security with practical utility. Decentralized nodes--job posters, market coordinators, workers, and validators --collaborate via smart contracts to manage tasks and rewards.

Paperid: 1338, https://arxiv.org/pdf/2504.15949.pdf

Abstract:
This paper explores the algebraic conditions under which a cellular automaton with a non-linear local rule exhibits surjectivity and reversibility. We also analyze the role of permutivity as a key factor influencing these properties and provide conditions that determine whether a non-linear CA is (bi)permutive. Through theoretical results and illustrative examples, we characterize the relationships between these fundamental properties, offering new insights into the dynamical behavior of non-linear CA.

Paperid: 1339, https://arxiv.org/pdf/2504.09437.pdf

Abstract:
With the advent of post-quantum cryptography (PQC) standards, it has become imperative for resource-constrained devices (RCDs) in the Internet of Things (IoT) to adopt these quantum-resistant protocols. However, the high computational overhead and the large key sizes associated with PQC make direct deployment on such devices impractical. To address this challenge, we propose an edge computing-enabled PQC framework that leverages a physical-layer security (PLS)-assisted offloading strategy, allowing devices to either offload intensive cryptographic tasks to a post-quantum edge server (PQES) or perform them locally. Furthermore, to ensure data confidentiality within the edge domain, our framework integrates two PLS techniques: offloading RCDs employ wiretap coding to secure data transmission, while non-offloading RCDs serve as friendly jammers by broadcasting artificial noise to disrupt potential eavesdroppers. Accordingly, we co-design the computation offloading and PLS strategy by jointly optimizing the device transmit power, PQES computation resource allocation, and offloading decisions to minimize overall latency under resource constraints. Numerical results demonstrate significant latency reductions compared to baseline schemes, confirming the scalability and efficiency of our approach for secure PQC operations in IoT networks.

Paperid: 1340, https://arxiv.org/pdf/2504.08192.pdf

Abstract:
Machine unlearning is a promising approach to improve LLM safety by removing unwanted knowledge from the model. However, prevailing gradient-based unlearning methods suffer from issues such as high computational costs, hyperparameter instability, poor sequential unlearning capability, vulnerability to relearning attacks, low data efficiency, and lack of interpretability. While Sparse Autoencoders are well-suited to improve these aspects by enabling targeted activation-based unlearning, prior approaches underperform gradient-based methods. This work demonstrates that, contrary to these earlier findings, SAEs can significantly improve unlearning when employed dynamically. We introduce $\textbf{Dynamic DAE Guardrails}$ (DSG), a novel method for precision unlearning that leverages principled feature selection and a dynamic classifier. Our experiments show DSG substantially outperforms leading unlearning methods, achieving superior forget-utility trade-offs. DSG addresses key drawbacks of gradient-based approaches for unlearning -- offering enhanced computational efficiency and stability, robust performance in sequential unlearning, stronger resistance to relearning attacks, better data efficiency including zero-shot settings, and more interpretable unlearning.

Paperid: 1341, https://arxiv.org/pdf/2504.07457.pdf

Abstract:
The increasing frequency and sophistication of cyberattacks demand innovative approaches to strengthen defense capabilities. Training on live infrastructure poses significant risks to organizations, making secure, isolated cyber ranges an essential tool for conducting Red vs. Blue Team training events. These events enable security teams to refine their skills without impacting operational environments. While such training provides a strong foundation, the ever-evolving nature of cyber threats necessitates additional support for effective defense. To address this challenge, we introduce CyberAlly, a knowledge graph-enhanced AI assistant designed to enhance the efficiency and effectiveness of Blue Teams during incident response. Integrated into our cyber range alongside an open-source SIEM platform, CyberAlly monitors alerts, tracks Blue Team actions, and suggests tailored mitigation recommendations based on insights from prior Red vs. Blue Team exercises. This demonstration highlights the feasibility and impact of CyberAlly in augmenting incident response and equipping defenders to tackle evolving threats with greater precision and confidence.

Paperid: 1342, https://arxiv.org/pdf/2504.07287.pdf

Abstract:
Research on exploit chains predominantly focuses on sequences with one type of exploit, e.g., either escalating privileges on a machine or executing remote code. In networks, hybrid exploit chains are critical because of their linkable vulnerabilities. Moreover, developing hybrid exploit chains is challenging because it requires understanding the diverse and independent dependencies and outcomes. We present hybrid chains encompassing privilege escalation (PE) and remote code execution (RCE) exploits. These chains are executable and can span large networks, where numerous potential exploit combinations arise from the large array of network assets, their hardware, software, configurations, and vulnerabilities. The chains are generated by ALFA-Chains, an AI-supported framework for the automated discovery of multi-step PE and RCE exploit chains in networks across arbitrary environments and segmented networks. Through an LLM-based classification, ALFA-Chains describes exploits in Planning Domain Description Language (PDDL). PDDL exploit and network descriptions then use off-the-shelf AI planners to find multiple exploit chains. ALFA-Chains finds 12 unknown chains on an example with a known three-step chain. A red-team exercise validates the executability with Metasploit. ALFA-Chains is efficient, finding an exploit chain in 0.01 seconds in an enterprise network with 83 vulnerabilities, 20 hosts, and 6 subnets. In addition, it is scalable, it finds an exploit chain in an industrial network with 114 vulnerabilities, 200 hosts, and 6 subnets in 3.16 seconds. It is comprehensive, finding 13 exploit chains in 26.26 seconds in the network. Finally, ALFA-Chains demonstrates flexibility across different exploit sources, ability to generalize across diverse network types, and robustness in discovering chains under constrained privilege assumptions.

Paperid: 1343, https://arxiv.org/pdf/2504.03909.pdf

Abstract:
Federated learning (FL) enables collaborative model training across decentralized datasets. NVIDIA FLARE's Federated XGBoost extends the popular XGBoost algorithm to both vertical and horizontal federated settings, facilitating joint model development without direct data sharing. However, the initial implementation assumed mutual trust over the sharing of intermediate gradient statistics produced by the XGBoost algorithm, leaving potential vulnerabilities to honest-but-curious adversaries. This work introduces "Secure Federated XGBoost", an efficient solution to mitigate these risks. We implement secure federated algorithms for both vertical and horizontal scenarios, addressing diverse data security patterns. To secure the messages, we leverage homomorphic encryption (HE) to protect sensitive information during training. A novel plugin and processor interface seamlessly integrates HE into the Federated XGBoost pipeline, enabling secure aggregation over ciphertexts. We present both CPU-based and CUDA-accelerated HE plugins, demonstrating significant performance gains. Notably, our CUDA-accelerated HE implementation achieves up to 30x speedups in vertical Federated XGBoost compared to existing third-party solutions. By securing critical computation steps and encrypting sensitive assets, Secure Federated XGBoost provides robust data privacy guarantees, reinforcing the fundamental benefits of federated learning while maintaining high performance.

Paperid: 1344, https://arxiv.org/pdf/2504.02142.pdf

Abstract:
Group robustness has become a major concern in machine learning (ML) as conventional training paradigms were found to produce high error on minority groups. Without explicit group annotations, proposed solutions rely on heuristics that aim to identify and then amplify the minority samples during training. In our work, we first uncover a critical shortcoming of these methods: an inability to distinguish legitimate minority samples from poison samples in the training set. By amplifying poison samples as well, group robustness methods inadvertently boost the success rate of an adversary -- e.g., from $0\%$ without amplification to over $97\%$ with it. Notably, we supplement our empirical evidence with an impossibility result proving this inability of a standard heuristic under some assumptions. Moreover, scrutinizing recent poisoning defenses both in centralized and federated learning, we observe that they rely on similar heuristics to identify which samples should be eliminated as poisons. In consequence, minority samples are eliminated along with poisons, which damages group robustness -- e.g., from $55\%$ without the removal of the minority samples to $41\%$ with it. Finally, as they pursue opposing goals using similar heuristics, our attempt to alleviate the trade-off by combining group robustness methods and poisoning defenses falls short. By exposing this tension, we also hope to highlight how benchmark-driven ML scholarship can obscure the trade-offs among different metrics with potentially detrimental consequences.

Paperid: 1345, https://arxiv.org/pdf/2504.01803.pdf

Abstract:
This paper introduces DISINFOX, an open-source threat intelligence exchange platform for the structured collection, management, and dissemination of disinformation incidents and influence operations. Analysts can upload and correlate information manipulation and interference incidents, while clients can access and analyze the data through an interactive web interface or programmatically via a public API. This facilitates integration with other vendors, providing a unified view of cybersecurity and disinformation events. The solution is fully containerized using Docker, comprising a web-based frontend for user interaction, a backend REST API for managing core functionalities, and a public API for structured data retrieval, enabling seamless integration with existing Cyber Threat Intelligence (CTI) workflows. In particular, DISINFOX models the incidents through DISARM Tactics, Techniques, and Procedures (TTPs), a MITRE ATT&CK-like framework for disinformation, with a custom data model based on the Structured Threat Information eXpression (STIX2) standard. As an open-source solution, DISINFOX provides a reproducible and extensible hub for researchers, analysts, and policymakers seeking to enhance the detection, investigation, and mitigation of disinformation threats. The intelligence generated from a custom dataset has been tested and utilized by a local instance of OpenCTI, a mature CTI platform, via a custom-built connector, validating the platform with the exchange of more than 100 disinformation incidents.

Paperid: 1346, https://arxiv.org/pdf/2504.01145.pdf

Abstract:
Current malware (malicious software) analysis tools focus on detection and family classification but fail to provide clear and actionable narrative insights into the malignant activity of the malware. Therefore, there is a need for a tool that translates raw malware data into human-readable descriptions. Developing such a tool accelerates incident response, reduces malware analysts' cognitive load, and enables individuals having limited technical expertise to understand malicious software behaviour. With this objective, we present MaLAware, which automatically summarizes the full spectrum of malicious activity of malware executables. MaLAware processes Cuckoo Sandbox-generated reports using large language models (LLMs) to correlate malignant activities and generate concise summaries explaining malware behaviour. We evaluate the tool's performance on five open-source LLMs. The evaluation uses the human-written malware behaviour description dataset as ground truth. The model's performance is measured using 11 extensive performance metrics, which boosts the confidence of MaLAware's effectiveness. The current version of the tool, i.e., MaLAware, supports Qwen2.5-7B, Llama2-7B, Llama3.1-8B, Mistral-7B, and Falcon-7B, along with the quantization feature for resource-constrained environments. MaLAware lays a foundation for future research in malware behavior explanation, and its extensive evaluation demonstrates LLMs' ability to narrate malware behavior in an actionable and comprehensive manner.

Paperid: 1347, https://arxiv.org/pdf/2504.00988.pdf

Abstract:
Cyber-physical systems, such as self-driving cars or digitized electrical grids, often involve complex interactions between security, safety, and defense. Proper risk management strategies must account for these three critical domains and their interaction because the failure to address one domain can exacerbate risks in the others, leading to cascading effects that compromise the overall system resilience. This work presents a case study from Ascentio Technologies, a mission-critical system company in Argentina specializing in aerospace, where the interplay between safety, security, and defenses is critical for ensuring the resilience and reliability of their systems. The main focus will be on the Ground Segment for the satellite project currently developed by the company. Analyzing safety, security, and defense mechanisms together in the Ground Segment of a satellite project is crucial because these domains are deeply interconnected--for instance, a security breach could disable critical safety functions, or a safety failure could create opportunities for attackers to exploit vulnerabilities, amplifying the risks to the entire system. This paper showcases the application of the Attack-Fault-Defense Tree (AFDT) framework, which integrates attack trees, fault trees, and defense mechanisms into a unified model. AFDT provides an intuitive visual language that facilitates interdisciplinary collaboration, enabling experts from various fields to better assess system vulnerabilities and defenses. By applying AFDT to the Ground Segment of the satellite project, we demonstrate how qualitative analyses can be performed to identify weaknesses and enhance the overall system's security and safety. This case highlights the importance of jointly analyzing attacks, faults, and defenses to improve resilience in complex cyber-physical environments.

Paperid: 1348, https://arxiv.org/pdf/2504.00346.pdf

Abstract:
We give an IOPP (interactive oracle proof of proximity) for trivariate Reed-Muller codes that achieves the best known query complexity in some range of security parameters. Specifically, for degree $d$ and security parameter $Î»\leq \frac{\log^2 d}{\log\log d}$ , our IOPP has $2^{-Î»}$ round-by-round soundness, $O(Î»)$ queries, $O(\log\log d)$ rounds and $O(d)$ length. This improves upon the FRI [Ben-Sasson, Bentov, Horesh, Riabzev, ICALP 2018] and the STIR [Arnon, Chiesa, Fenzi, Yogev, Crypto 2024] IOPPs for Reed-Solomon codes, that have larger query and round complexity standing at $O(Î»\log d)$ and $O(\log d+Î»\log\log d)$ respectively. We use our IOPP to give an IOP for the NP-complete language Rank-1-Constraint-Satisfaction with the same parameters. Our construction is based on the line versus point test in the low-soundness regime. Compared to the axis parallel test (which is used in all prior works), the general affine lines test has improved soundness, which is the main source of our improved soundness. Using this test involves several complications, most significantly that projection to affine lines does not preserve individual degrees, and we show how to overcome these difficulties. En route, we extend some existing machinery to more general settings. Specifically, we give proximity generators for Reed-Muller codes, show a more systematic way of handling ``side conditions'' in IOP constructions, and generalize the compiling procedure of [Arnon, Chiesa, Fenzi, Yogev, Crypto 2024] to general codes.

Paperid: 1349, https://arxiv.org/pdf/2503.22988.pdf

Abstract:
Differentially Private Stochastic Gradient Descent (DP-SGD) is a widely adopted technique for privacy-preserving deep learning. A critical challenge in DP-SGD is selecting the optimal clipping threshold C, which involves balancing the trade-off between clipping bias and noise magnitude, incurring substantial privacy and computing overhead during hyperparameter tuning. In this paper, we propose Dynamic Clipping DP-SGD (DC-SGD), a framework that leverages differentially private histograms to estimate gradient norm distributions and dynamically adjust the clipping threshold C. Our framework includes two novel mechanisms: DC-SGD-P and DC-SGD-E. DC-SGD-P adjusts the clipping threshold based on a percentile of gradient norms, while DC-SGD-E minimizes the expected squared error of gradients to optimize C. These dynamic adjustments significantly reduce the burden of hyperparameter tuning C. The extensive experiments on various deep learning tasks, including image classification and natural language processing, show that our proposed dynamic algorithms achieve up to 9 times acceleration on hyperparameter tuning than DP-SGD. And DC-SGD-E can achieve an accuracy improvement of 10.62% on CIFAR10 than DP-SGD under the same privacy budget of hyperparameter tuning. We conduct rigorous theoretical privacy and convergence analyses, showing that our methods seamlessly integrate with the Adam optimizer. Our results highlight the robust performance and efficiency of DC-SGD, offering a practical solution for differentially private deep learning with reduced computational overhead and enhanced privacy guarantees.

Paperid: 1350, https://arxiv.org/pdf/2503.22717.pdf

Abstract:
Traditionally, mobile wallets rely on a trusted server that provides them with a current view of the blockchain, and thus, these wallets do not need to validate the header chain or transaction inclusion themselves. If a mobile wallet were to validate a header chain and inclusion of its transactions, it would require significant storage and performance overhead, which is challenging and expensive to ensure on resource-limited devices, such as smartphones. Moreover, such an overhead would be multiplied by the number of cryptocurrencies the user holds in a wallet. Therefore, we introduce a novel approach, called FeatherWallet, to mobile wallet synchronization designed to eliminate trust in a server while providing efficient utilization of resources. Our approach addresses the challenges associated with storage and bandwidth requirements by off-chaining validation of header chains using SNARK-based proofs of chain extension, which are verified by a smart contract. This offers us a means of storing checkpoints in header chains of multiple blockchains. The key feature of our approach is the ability of mobile clients to update their partial local header chains using checkpoints derived from the proof verification results stored in the smart contract. In the evaluation, we created zk-SNARK proofs for the 2, 4, 8, 16, 32, and 64 headers within our trustless off-chain service. For 64-header proofs, the off-chain service producing proofs requires at least 40 GB of RAM, while the minimal gas consumption is achieved for 12 proofs bundled in a single transaction. We achieved a 20-fold reduction in storage overhead for a mobile client in contrast to traditional SPV clients. Although we have developed a proof-of-concept for PoW blockchains, the whole approach can be extended in principle to other consensus mechanisms, e.g., PoS.

Paperid: 1351, https://arxiv.org/pdf/2503.20846.pdf

Abstract:
Privacy-preserving synthetic data offers a promising solution to harness segregated data in high-stakes domains where information is compartmentalized for regulatory, privacy, or institutional reasons. This survey provides a comprehensive framework for understanding the landscape of privacy-preserving synthetic data, presenting the theoretical foundations of generative models and differential privacy followed by a review of state-of-the-art methods across tabular data, images, and text. Our synthesis of evaluation approaches highlights the fundamental trade-off between utility for down-stream tasks and privacy guarantees, while identifying critical research gaps: the lack of realistic benchmarks representing specialized domains and insufficient empirical evaluations required to contextualise formal guarantees. Through empirical analysis of four leading methods on five real-world datasets from specialized domains, we demonstrate significant performance degradation under realistic privacy constraints ($Îµ\leq 4$), revealing a substantial gap between results reported on general domain benchmarks and performance on domain-specific data. %Our findings highlight key challenges including unaccounted privacy leakage, insufficient empirical verification of formal guarantees, and a critical deficit of realistic benchmarks. These challenges underscore the need for robust evaluation frameworks, standardized benchmarks for specialized domains, and improved techniques to address the unique requirements of privacy-sensitive fields such that this technology can deliver on its considerable potential.

Paperid: 1352, https://arxiv.org/pdf/2503.17011.pdf

Abstract:
Quantum key distribution (QKD) enables unconditionally secure symmetric key exchange between parties. However, terrestrial fibre-optic links face inherent distance constraints due to quantum signal degradation. Traditional solutions to overcome these limits rely on trusted relay nodes, which perform intermediate re-encryption of keys using one-time pad (OTP) encryption. This approach, however, exposes keys as plaintext at each relay, requiring significant trust and stringent security controls at every intermediate node. These "trusted" relays become a security liability if compromised. To address this issue, we propose a zero-trust relay design that applies fully homomorphic encryption (FHE) to perform intermediate OTP re-encryption without exposing plaintext keys, effectively mitigating the risks associated with potentially compromised or malicious relay nodes. Additionally, the architecture enhances crypto-agility by incorporating external quantum random number generators, thus decoupling key generation from specific QKD hardware and reducing vulnerabilities tied to embedded key-generation modules. The solution is designed with the existing European Telecommunication Standards Institute (ETSI) QKD standards in mind, enabling straightforward integration into current infrastructures. Its feasibility has been successfully demonstrated through a hybrid network setup combining simulated and commercially available QKD equipment. The proposed zero-trust architecture thus significantly advances the scalability and practical security of large-scale QKD networks, greatly reducing reliance on fully trusted infrastructure.

Paperid: 1353, https://arxiv.org/pdf/2503.15015.pdf

Abstract:
Efficient and secure federated learning (FL) is a critical challenge for resource-limited devices, especially mobile devices. Existing secure FL solutions commonly incur significant overhead, leading to a contradiction between efficiency and security. As a result, these two concerns are typically addressed separately. This paper proposes Opportunistic Federated Learning (OFL), a novel FL framework designed explicitly for resource-heterogenous and privacy-aware FL devices, solving efficiency and security problems jointly. OFL optimizes resource utilization and adaptability across diverse devices by adopting a novel hierarchical and asynchronous aggregation strategy. OFL provides strong security by introducing a differentially private and opportunistic model updating mechanism for intra-cluster model aggregation and an advanced threshold homomorphic encryption scheme for inter-cluster aggregation. Moreover, OFL secures global model aggregation by implementing poisoning attack detection using frequency analysis while keeping models encrypted. We have implemented OFL in a real-world testbed and evaluated OFL comprehensively. The evaluation results demonstrate that OFL achieves satisfying model performance and improves efficiency and security, outperforming existing solutions.

Paperid: 1354, https://arxiv.org/pdf/2503.13872.pdf

Abstract:
NLP models trained with differential privacy (DP) usually adopt the DP-SGD framework, and privacy guarantees are often reported in terms of the privacy budget $Îµ$. However, $Îµ$ does not have any intrinsic meaning, and it is generally not possible to compare across variants of the framework. Work in image processing has therefore explored how to empirically calibrate noise across frameworks using Membership Inference Attacks (MIAs). However, this kind of calibration has not been established for NLP. In this paper, we show that MIAs offer little help in calibrating privacy, whereas reconstruction attacks are more useful. As a use case, we define a novel kind of directional privacy based on the von Mises-Fisher (VMF) distribution, a metric DP mechanism that perturbs angular distance rather than adding (isotropic) Gaussian noise, and apply this to NLP architectures. We show that, even though formal guarantees are incomparable, empirical privacy calibration reveals that each mechanism has different areas of strength with respect to utility-privacy trade-offs.

Paperid: 1355, https://arxiv.org/pdf/2503.12958.pdf

Abstract:
Federated learning (FL) enables participants to store data locally while collaborating in training, yet it remains vulnerable to privacy attacks, such as data reconstruction. Existing differential privacy (DP) technologies inject noise dynamically into the training process to mitigate the impact of excessive noise. However, this dynamic scheduling is often grounded in factors indirectly related to privacy, making it difficult to clearly explain the intricate relationship between dynamic noise adjustments and privacy requirements. To address this issue, we propose FedSDP, a novel and explainable DP-based privacy protection mechanism that guides noise injection based on privacy contribution. Specifically, FedSDP leverages Shapley values to assess the contribution of private attributes to local model training and dynamically adjusts the amount of noise injected accordingly. By providing theoretical insights into the injection of varying scales of noise into local training, FedSDP enhances interpretability. Extensive experiments demonstrate that FedSDP can achieve a superior balance between privacy preservation and model performance, surpassing state-of-the-art (SOTA) solutions.

Paperid: 1356, https://arxiv.org/pdf/2503.12172.pdf

Abstract:
Generative models have rapidly evolved to generate realistic outputs. However, their synthetic outputs increasingly challenge the clear distinction between natural and AI-generated content, necessitating robust watermarking techniques. Watermarks are typically expected to preserve the integrity of the target image, withstand removal attempts, and prevent unauthorized replication onto unrelated images. To address this need, recent methods embed persistent watermarks into images produced by diffusion models using the initial noise. Yet, to do so, they either distort the distribution of generated images or rely on searching through a long dictionary of used keys for detection. In this paper, we propose a novel watermarking method that embeds semantic information about the generated image directly into the watermark, enabling a distortion-free watermark that can be verified without requiring a database of key patterns. Instead, the key pattern can be inferred from the semantic embedding of the image using locality-sensitive hashing. Furthermore, conditioning the watermark detection on the original image content improves robustness against forgery attacks. To demonstrate that, we consider two largely overlooked attack strategies: (i) an attacker extracting the initial noise and generating a novel image with the same pattern; (ii) an attacker inserting an unrelated (potentially harmful) object into a watermarked image, possibly while preserving the watermark. We empirically validate our method's increased robustness to these attacks. Taken together, our results suggest that content-aware watermarks can mitigate risks arising from image-generative models.

Paperid: 1357, https://arxiv.org/pdf/2503.09002.pdf

Abstract:
Static analysis is a powerful technique for bug detection in critical systems like operating system kernels. However, designing and implementing static analyzers is challenging, time-consuming, and typically limited to predefined bug patterns. While large language models (LLMs) have shown promise for static analysis, directly applying them to scan large systems remains impractical due to computational constraints and contextual limitations. We present KNighter, the first approach that unlocks scalable LLM-based static analysis by automatically synthesizing static analyzers from historical bug patterns. Rather than using LLMs to directly analyze massive systems, our key insight is leveraging LLMs to generate specialized static analyzers guided by historical patch knowledge. KNighter implements this vision through a multi-stage synthesis pipeline that validates checker correctness against original patches and employs an automated refinement process to iteratively reduce false positives. Our evaluation on the Linux kernel demonstrates that KNighter generates high-precision checkers capable of detecting diverse bug patterns overlooked by existing human-written analyzers. To date, KNighter-synthesized checkers have discovered 92 new, critical, long-latent bugs (average 4.3 years) in the Linux kernel; 77 are confirmed, 57 fixed, and 30 have been assigned CVE numbers. This work establishes an entirely new paradigm for scalable, reliable, and traceable LLM-based static analysis for real-world systems via checker synthesis.

Paperid: 1358, https://arxiv.org/pdf/2503.08829.pdf

Abstract:
We propose VIBE, a model-agnostic framework that trains classifiers resilient to backdoor attacks. The key concept behind our approach is to treat malicious inputs and corrupted labels from the training dataset as observed random variables, while the actual clean labels are latent. VIBE then recovers the corresponding latent clean label posterior through variational inference. The resulting training procedure follows the expectation-maximization (EM) algorithm. The E-step infers the clean pseudolabels by solving an entropy-regularized optimal transport problem, while the M-step updates the classifier parameters via gradient descent. Being modular, VIBE can seamlessly integrate with recent advancements in self-supervised representation learning, which enhance its ability to resist backdoor attacks. We experimentally validate the method effectiveness against contemporary backdoor attacks on standard datasets, a large-scale setup with 1$k$ classes, and a dataset poisoned with multiple attacks. VIBE consistently outperforms previous defenses across all tested scenarios.

Paperid: 1359, https://arxiv.org/pdf/2503.04825.pdf

Abstract:
Currently, deep learning models are easily exposed to data leakage risks. As a distributed model, Split Learning thus emerged as a solution to address this issue. The model is splitted to avoid data uploading to the server and reduce computing requirements while ensuring data privacy and security. However, the transmission of data between clients and server creates a potential vulnerability. In particular, model is vulnerable to intellectual property (IP) infringement such as piracy. Alarmingly, a dedicated copyright protection framework tailored for Split Learning models is still lacking. To this end, we propose the first copyright protection scheme for Split Learning model, leveraging fingerprint to ensure effective and robust copyright protection. The proposed method first generates a set of specifically designed adversarial examples. Then, we select those examples that would induce misclassifications to form the fingerprint set. These adversarial examples are embedded as fingerprints into the model during the training process. Exhaustive experiments highlight the effectiveness of the scheme. This is demonstrated by a remarkable fingerprint verification success rate (FVSR) of 100% on MNIST, 98% on CIFAR-10, and 100% on ImageNet, respectively. Meanwhile, the model's accuracy only decreases slightly, indicating that the embedded fingerprints do not compromise model performance. Even under label inference attack, our approach consistently achieves a high fingerprint verification success rate that ensures robust verification.

Paperid: 1360, https://arxiv.org/pdf/2503.03027.pdf

Abstract:
Privacy is an instance of a social norm formed through legal, technical, and cultural dimensions. Institutions such as regulators, industry, and researchers act as societal agents that both influence and respond to evolving norms. Attempts to promote privacy are often ineffective unless they account for this complexity and the dynamic interactions among these actors. Privacy enhancing technologies (PETs) are technical solutions for privacy issues that enable collaborative data analysis, allowing for the development of solutions that benefit society, all while ensuring the privacy of individuals whose data is being used. However, despite increased privacy challenges and a corresponding increase in new regulations being proposed by governments across the globe, a low adoption rate of PETs persists. In this work, we investigate the factors influencing industry's decision-making processes around PETs adoption as well as the extent to which privacy regulations inspire such adoption. We conducted a qualitative survey study with 22 industry participants from across Canada to investigate how businesses in Canada make decisions to adopt novel technologies and how new privacy regulations impact their business processes. Informed by the results of our analysis, we make recommendations for industry, researchers, and policymakers on how to support what each of them seeks from the other when attempting to improve digital privacy protections. By advancing our understanding of what challenges the industry faces, we increase the effectiveness of future privacy research that aims to help overcome these issues.

Paperid: 1361, https://arxiv.org/pdf/2502.20997.pdf

Abstract:
A key countermeasure in cybersecurity has been the development of standardized computational protocols for modeling and sharing cyber threat intelligence (CTI) between organizations, enabling a shared understanding of threats and coordinated global responses. However, while the cybersecurity domain benefits from mature threat exchange frameworks, there has been little progress in the automatic and interoperable sharing of knowledge about disinformation campaigns. This paper proposes an open-source disinformation threat intelligence framework for sharing interoperable disinformation incidents. This approach relies on i) the modeling of disinformation incidents with the DISARM framework (MITRE ATT&CK-based TTP modeling of disinformation attacks), ii) a custom mapping to STIX2 standard representation (computational data format), and iii) an exchange architecture (called DISINFOX) capable of using the proposed mapping with a centralized platform to store and manage disinformation incidents and CTI clients which consume the gathered incidents. The microservice-based implementation validates the framework with more than 100 real-world disinformation incidents modeled, stored, shared, and consumed successfully. To the best of our knowledge, this work is the first academic and technical effort to integrate disinformation threats in the CTI ecosystem.

Paperid: 1362, https://arxiv.org/pdf/2502.18608.pdf

Abstract:
In recent years, LLM watermarking has emerged as an attractive safeguard against AI-generated content, with promising applications in many real-world domains. However, there are growing concerns that the current LLM watermarking schemes are vulnerable to expert adversaries wishing to reverse-engineer the watermarking mechanisms. Prior work in breaking or stealing LLM watermarks mainly focuses on the distribution-modifying algorithm of Kirchenbauer et al. (2023), which perturbs the logit vector before sampling. In this work, we focus on reverse-engineering the other prominent LLM watermarking scheme, distortion-free watermarking (Kuditipudi et al. 2024), which preserves the underlying token distribution by using a hidden watermarking key sequence. We demonstrate that, even under a more sophisticated watermarking scheme, it is possible to compromise the LLM and carry out a spoofing attack, i.e. generate a large number of (potentially harmful) texts that can be attributed to the original watermarked LLM. Specifically, we propose using adaptive prompting and a sorting-based algorithm to accurately recover the underlying secret key for watermarking the LLM. Our empirical findings on LLAMA-3.1-8B-Instruct, Mistral-7B-Instruct, Gemma-7b, and OPT-125M challenge the current theoretical claims on the robustness and usability of the distortion-free watermarking techniques.

Paperid: 1363, https://arxiv.org/pdf/2502.18227.pdf

Abstract:
Tensor-valued data, increasingly common in distributed big data applications like autonomous driving and smart healthcare, poses unique challenges for privacy protection due to its multidimensional structure and the risk of losing critical structural information. Traditional local differential privacy methods, designed for scalars and matrices, are insufficient for tensors, as they fail to preserve essential relationships among tensor elements. We introduce TLDP, a novel LDP algorithm for Tensors, which employs a randomized response mechanism to perturb tensor components while maintaining structural integrity. To strike a better balance between utility and privacy, we incorporate a weight matrix that selectively protects sensitive regions. Both theoretical analysis and empirical findings from real-world datasets show that TLDP achieves superior utility while preserving privacy, making it a robust solution for high-dimensional tensor data.

Paperid: 1364, https://arxiv.org/pdf/2502.17748.pdf

Abstract:
Ensuring fairness in machine learning, particularly in human-centric applications, extends beyond algorithmic bias to encompass fairness in privacy, specifically the equitable distribution of privacy risk. This is critical in federated learning (FL), where decentralized data necessitates balanced privacy preservation across clients. We introduce FinP, a framework designed to achieve fairness in privacy by mitigating disproportionate exposure to source inference attacks (SIA). FinP employs a dual approach: (1) server-side adaptive aggregation to address unfairness in client contributions in global model, and (2) client-side regularization to reduce client vulnerability. This comprehensive strategy targets both the symptoms and root causes of privacy unfairness. Evaluated on the Human Activity Recognition (HAR) and CIFAR-10 datasets, FinP demonstrates ~20% improvement in fairness in privacy on HAR with minimal impact on model utility, and effectively mitigates SIA risks on CIFAR-10, showcasing its ability to provide fairness in privacy in FL systems without compromising performance.

Paperid: 1365, https://arxiv.org/pdf/2502.16174.pdf

Abstract:
With the rise of LLMs, ensuring model safety and alignment has become a critical concern. While modern instruction-finetuned LLMs incorporate alignment during training, they still frequently require moderation tools to prevent unsafe behavior. The most common approach to moderation are guard models that flag unsafe inputs. However, guards require costly training and are typically limited to fixed-size, pre-trained options, making them difficult to adapt to evolving risks and resource constraints. We hypothesize that instruction-finetuned LLMs already encode safety-relevant information internally and explore training-free safety assessment methods that work with off-the-shelf models. We show that simple prompting allows models to recognize harmful inputs they would otherwise mishandle. We also demonstrate that safe and unsafe prompts are distinctly separable in the models' latent space. Building on this, we introduce the Latent Prototype Moderator (LPM), a training-free moderation method that uses Mahalanobis distance in latent space to assess input safety. LPM is a lightweight, customizable add-on that generalizes across model families and sizes. Our method matches or exceeds state-of-the-art guard models across multiple safety benchmarks, offering a practical and flexible solution for scalable LLM moderation.

Paperid: 1366, https://arxiv.org/pdf/2502.14921.pdf

Abstract:
How much information about training samples can be leaked through synthetic data generated by Large Language Models (LLMs)? Overlooking the subtleties of information flow in synthetic data generation pipelines can lead to a false sense of privacy. In this paper, we assume an adversary has access to some synthetic data generated by a LLM. We design membership inference attacks (MIAs) that target the training data used to fine-tune the LLM that is then used to synthesize data. The significant performance of our MIA shows that synthetic data leak information about the training data. Further, we find that canaries crafted for model-based MIAs are sub-optimal for privacy auditing when only synthetic data is released. Such out-of-distribution canaries have limited influence on the model's output when prompted to generate useful, in-distribution synthetic data, which drastically reduces their effectiveness. To tackle this problem, we leverage the mechanics of auto-regressive models to design canaries with an in-distribution prefix and a high-perplexity suffix that leave detectable traces in synthetic data. This enhances the power of data-based MIAs and provides a better assessment of the privacy risks of releasing synthetic data generated by LLMs.

Paperid: 1367, https://arxiv.org/pdf/2502.09535.pdf

Abstract:
Mobile sensor data has been proposed for security-critical applications such as device pairing, proximity detection, and continuous authentication. However, the foundational premise that these signals provide sufficient entropy remains under-explored. In this work, we systematically analyse the entropy of mobile sensor data using four datasets from multiple application contexts (UCI-HAR, SHL, Relay, and PerilZIS). Using direct computation and estimation, we report entropy values (max, Shannon, collision, and min-entropy) for an exhaustive range of sensor combinations. We demonstrate that the entropy of mobile sensors remains far below what is considered secure by modern standards for security applications, such as for authentication and key generation systems, even when many sensors are combined. In particular, we observe an alarming divergence between average-case Shannon entropy and worst-case min-entropy. Single-sensor mean min-entropy varies between 3.408-4.483 bits despite Shannon entropy being several multiples higher. We further show that correlations between sensor modalities contribute to a ~75% reduction between Shannon and min-entropy. This brings joint min-entropy well below 10 bits in many cases and, in the best case, yielding only ~24 bits when combining 20+ sensor modalities. Our results reveal that adversaries may feasibly predict sensor signals through an exhaustive exploration of the measurement space. Our work also calls into question the widely held assumption that adding more sensors inherently yields higher security. Ultimately, we strongly urge caution when relying on mobile sensor data for security applications.

Paperid: 1368, https://arxiv.org/pdf/2502.04045.pdf

Abstract:
Within the machine learning community, reconstruction attacks are a principal concern and have been identified even in federated learning (FL), which was designed with privacy preservation in mind. In response to these threats, the privacy community recommends the use of differential privacy (DP) in the stochastic gradient descent algorithm, termed DP-SGD. However, the proliferation of variants of DP in recent years\textemdash such as metric privacy\textemdash has made it challenging to conduct a fair comparison between different mechanisms due to the different meanings of the privacy parameters $Îµ$ and $Î´$ across different variants. Thus, interpreting the practical implications of $Îµ$ and $Î´$ in the FL context and amongst variants of DP remains ambiguous. In this paper, we lay a foundational framework for comparing mechanisms with differing notions of privacy guarantees, namely $(Îµ,Î´)$-DP and metric privacy. We provide two foundational means of comparison: firstly, via the well-established $(Îµ,Î´)$-DP guarantees, made possible through the RÃ©nyi differential privacy framework; and secondly, via Bayes' capacity, which we identify as an appropriate measure for reconstruction threats.

Paperid: 1369, https://arxiv.org/pdf/2502.03698.pdf

Abstract:
Learning from Demonstration (LfD) algorithms have shown promising results in robotic manipulation tasks, but their vulnerability to adversarial attacks remains underexplored. This paper presents a comprehensive study of adversarial attacks on both classic and recently proposed algorithms, including Behavior Cloning (BC), LSTM-GMM, Implicit Behavior Cloning (IBC), Diffusion Policy (DP), and VQ-Behavior Transformer (VQ-BET). We study the vulnerability of these methods to untargeted, targeted and universal adversarial perturbations. While explicit policies, such as BC, LSTM-GMM and VQ-BET can be attacked in the same manner as standard computer vision models, we find that attacks for implicit and denoising policy models are nuanced and require developing novel attack methods. Our experiments on several simulated robotic manipulation tasks reveal that most of the current methods are highly vulnerable to adversarial perturbations. We also show that these attacks are transferable across algorithms, architectures, and tasks, raising concerning security vulnerabilities with potentially a white-box threat model. In addition, we test the efficacy of a randomized smoothing, a widely used adversarial defense technique, and highlight its limitation in defending against attacks on complex and multi-modal action distribution common in complex control tasks. In summary, our findings highlight the vulnerabilities of modern BC algorithms, paving way for future work in addressing such limitations.

Paperid: 1370, https://arxiv.org/pdf/2502.03271.pdf

Abstract:
Rust supports type conversions and safe Rust guarantees the security of these conversions through robust static type checking and strict ownership guidelines. However, there are instances where programmers need to use unsafe Rust for certain type conversions, especially those involving pointers. Consequently, these conversions may cause severe memory corruption problems. Despite extensive research on type confusion bugs in C/C++, studies on type confusion bugs in Rust are still lacking. Also, due to Rust's new features in the type system, existing solutions in C/C++ cannot be directly applied to Rust. In this paper, we develop a static analysis tool called TYPEPULSE to detect three main categories of type confusion bugs in Rust including misalignment, inconsistent layout, and mismatched scope. TYPEPULSE first performs a type conversion analysis to collect and determine trait bounds for type pairs. Moreover, it performs a pointer alias analysis to resolve the alias relationship of pointers. Following the integration of information into the property graph, it constructs type patterns and detects each type of bug in various conversion scenarios. We run TYPEPULSE on the top 3,000 Rust packages and uncover 71 new type confusion bugs, exceeding the total number of type confusion bugs reported in RUSTSEC over the past five years. We have received 32 confirmations from developers, along with one CVE ID and six RUSTSEC IDs.

Paperid: 1371, https://arxiv.org/pdf/2502.01154.pdf

Abstract:
Large language models (LLMs) have seen rapid development in recent years, revolutionizing various applications and significantly enhancing convenience and productivity. However, alongside their impressive capabilities, ethical concerns and new types of attacks, such as jailbreaking, have emerged. While most prompting techniques focus on optimizing adversarial inputs for individual cases, resulting in higher computational costs when dealing with large datasets. Less research has addressed the more general setting of training a universal attacker that can transfer to unseen tasks. In this paper, we introduce JUMP, a prompt-based method designed to jailbreak LLMs using universal multi-prompts. We also adapt our approach for defense, which we term DUMP. Experimental results demonstrate that our method for optimizing universal multi-prompts outperforms existing techniques.

Paperid: 1372, https://arxiv.org/pdf/2502.00735.pdf

Abstract:
Large Language Models (LLMs) have seen widespread applications across various domains due to their growing ability to process diverse types of input data, including text, audio, image and video. While LLMs have demonstrated outstanding performance in understanding and generating contexts for different scenarios, they are vulnerable to prompt-based attacks, which are mostly via text input. In this paper, we introduce the first voice-based jailbreak attack against multimodal LLMs, termed as Flanking Attack, which can process different types of input simultaneously towards the multimodal LLMs. Our work is motivated by recent advancements in monolingual voice-driven large language models, which have introduced new attack surfaces beyond traditional text-based vulnerabilities for LLMs. To investigate these risks, we examine the state-of-the-art multimodal LLMs, which can be accessed via different types of inputs such as audio input, focusing on how adversarial prompts can bypass its defense mechanisms. We propose a novel strategy, in which the disallowed prompt is flanked by benign, narrative-driven prompts. It is integrated in the Flanking Attack which attempts to humanizes the interaction context and execute the attack through a fictional setting. Further, to better evaluate the attack performance, we present a semi-automated self-assessment framework for policy violation detection. We demonstrate that Flanking Attack is capable of manipulating state-of-the-art LLMs into generating misaligned and forbidden outputs, which achieves an average attack success rate ranging from 0.67 to 0.93 across seven forbidden scenarios.

Paperid: 1373, https://arxiv.org/pdf/2502.00068.pdf

Abstract:
By 2050, electric vehicles (EVs) are projected to account for 70% of global vehicle sales. While EVs provide environmental benefits, they also pose challenges for energy generation, grid infrastructure, and data privacy. Current research on EV routing and charge management often overlooks privacy when predicting energy demands, leaving sensitive mobility data vulnerable. To address this, we developed a Federated Learning Transformer Network (FLTN) to predict EVs' next charge location with enhanced privacy measures. Each EV operates as a client, training an onboard FLTN model that shares only model weights, not raw data with a community-based Distributed Energy Resource Management System (DERMS), which aggregates them into a community global model. To further enhance privacy, non-transitory EVs use peer-to-peer weight sharing and augmentation within their community, obfuscating individual contributions and improving model accuracy. Community DERMS global model weights are then redistributed to EVs for continuous training. Our FLTN approach achieved up to 92% accuracy while preserving data privacy, compared to our baseline centralised model, which achieved 98% accuracy with no data privacy. Simulations conducted across diverse charge levels confirm the FLTN's ability to forecast energy demands over extended periods. We present a privacy-focused solution for forecasting EV charge location prediction, effectively mitigating data leakage risks.

Paperid: 1374, https://arxiv.org/pdf/2501.18407.pdf

Abstract:
Boolean functions with good cryptographic properties like high nonlinearity and algebraic degree play an important in the security of stream and block ciphers. Such functions may be designed, for instance, by algebraic constructions or metaheuristics. This paper investigates the use of Evolutionary Algorithms (EAs) to design homogeneous bent Boolean functions, i.e., functions that are maximally nonlinear and whose algebraic normal form contains only monomials of the same degree. In our work, we evaluate three genotype encodings and four fitness functions. Our results show that while EAs manage to find quadratic homogeneous bent functions (with the best method being a GA leveraging a restricted encoding), none of the approaches result in cubic homogeneous bent functions.

Paperid: 1375, https://arxiv.org/pdf/2501.17335.pdf

Abstract:
Decentralized finance (DeFi) markets spread across Layer-1 (L1) and Layer-2 (L2) blockchains rely on arbitrage to keep prices aligned. Today most price gaps are closed against centralized exchanges (CEXes), whose deep liquidity and fast execution make them the primary venue for price discovery. As trading volume migrates on-chain, cross-chain arbitrage between decentralized exchanges (DEXes) will become the canonical mechanism for price alignment. Yet, despite its importance to DeFi-and the on-chain transparency making real activity tractable in a way CEX-to-DEX arbitrage is not-existing research remains confined to conceptual overviews and hypothetical opportunity analyses. We study cross-chain arbitrage with a profit-cost model and a year-long measurement. The model shows that opportunity frequency, bridging time, and token depreciation determine whether inventory- or bridge-based execution is more profitable. Empirically, we analyze one year of transactions (September 2023 - August 2024) across nine blockchains and identify 242,535 executed arbitrages totaling 868.64 million USD volume. Activity clusters on Ethereum-centric L1-L2 pairs, grows 5.5x over the study period, and surges-higher volume, more trades, lower fees-after the Dencun upgrade (March 13, 2024). Most trades use pre-positioned inventory (66.96%) and settle in 9s, whereas bridge-based arbitrages take 242s, underscoring the latency cost of today's bridges. Market concentration is high: the five largest addresses execute more than half of all trades, and one alone captures almost 40% of daily volume post-Dencun. We conclude that cross-chain arbitrage fosters vertical integration, centralizing sequencing infrastructure and economic power and thereby exacerbating censorship, liveness, and finality risks; decentralizing block building and lowering entry barriers are critical to countering these threats.

Paperid: 1376, https://arxiv.org/pdf/2501.15076.pdf

Abstract:
The fields of machine learning (ML) and cryptanalysis share an interestingly common objective of creating a function, based on a given set of inputs and outputs. However, the approaches and methods in doing so vary vastly between the two fields. In this paper, we explore integrating the knowledge from the ML domain to provide empirical evaluations of cryptosystems. Particularly, we utilize information theoretic metrics to perform ML-based distribution estimation. We propose two novel applications of ML algorithms that can be applied in a known plaintext setting to perform cryptanalysis on any cryptosystem. We use mutual information neural estimation to calculate a cryptosystem's mutual information leakage, and a binary cross entropy classification to model an indistinguishability under chosen plaintext attack (CPA). These algorithms can be readily applied in an audit setting to evaluate the robustness of a cryptosystem and the results can provide a useful empirical bound. We evaluate the efficacy of our methodologies by empirically analyzing several encryption schemes. Furthermore, we extend the analysis to novel network coding-based cryptosystems and provide other use cases for our algorithms. We show that our classification model correctly identifies the encryption schemes that are not IND-CPA secure, such as DES, RSA, and AES ECB, with high accuracy. It also identifies the faults in CPA-secure cryptosystems with faulty parameters, such a reduced counter version of AES-CTR. We also conclude that with our algorithms, in most cases a smaller-sized neural network using less computing power can identify vulnerabilities in cryptosystems, providing a quick check of the sanity of the cryptosystem and help to decide whether to spend more resources to deploy larger networks that are able to break the cryptosystem.

Paperid: 1377, https://arxiv.org/pdf/2501.10466.pdf

Abstract:
Compared with standard learning, adversarially robust learning is widely recognized to demand significantly more training examples. Recent works propose the use of self-supervised adversarial training (SSAT) with external or synthetically generated unlabeled data to enhance model robustness. However, SSAT requires a substantial amount of extra unlabeled data, significantly increasing memory usage and model training times. To address these challenges, we propose novel methods to strategically select a small subset of unlabeled data essential for SSAT and robustness improvement. Our selection prioritizes data points near the model's decision boundary based on latent clustering-based techniques, efficiently identifying a critical subset of unlabeled data with a higher concentration of boundary-adjacent points. While focusing on near-boundary data, our methods are designed to maintain a balanced ratio between boundary and non-boundary data points to avoid overfitting. Our experiments on image benchmarks show that integrating our selection strategies into self-supervised adversarial training can largely reduce memory and computational requirements while achieving high model robustness. In particular, our latent clustering-based selection method with k-means is the most effective, achieving nearly identical test-time robust accuracies with 5 to 10 times less external or generated unlabeled data when applied to image benchmarks. Additionally, we validate the generalizability of our approach across various application scenarios, including a real-world medical dataset for COVID-19 chest X-ray classification.

Paperid: 1378, https://arxiv.org/pdf/2501.02032.pdf

Abstract:
The advent of blockchain technology has facilitated the widespread adoption of smart contracts in the financial sector. However, current fraud detection methodologies exhibit limitations in capturing both global structural patterns within transaction networks and local semantic relationships embedded in transaction data. Most existing models focus on either structural information or semantic features individually, leading to suboptimal performance in detecting complex fraud patterns.In this paper, we propose a dynamic feature fusion model that combines graph-based representation learning and semantic feature extraction for blockchain fraud detection. Specifically, we construct global graph representations to model account relationships and extract local contextual features from transaction data. A dynamic multimodal fusion mechanism is introduced to adaptively integrate these features, enabling the model to capture both structural and semantic fraud patterns effectively. We further develop a comprehensive data processing pipeline, including graph construction, temporal feature enhancement, and text preprocessing. Experimental results on large-scale real-world blockchain datasets demonstrate that our method outperforms existing benchmarks across accuracy, F1 score, and recall metrics. This work highlights the importance of integrating structural relationships and semantic similarities for robust fraud detection and offers a scalable solution for securing blockchain systems.

Paperid: 1379, https://arxiv.org/pdf/2501.01584.pdf

Abstract:
Despite the advantage of preserving data privacy, federated learning (FL) still suffers from the straggler issue due to the limited computing resources of distributed clients and the unreliable wireless communication environment. By effectively imitating the distributed resources, digital twin (DT) shows great potential in alleviating this issue. In this paper, we leverage DT in the FL framework over non-orthogonal multiple access (NOMA) network to assist FL training process, considering malicious attacks on model updates from clients. A reputationbased client selection scheme is proposed, which accounts for client heterogeneity in multiple aspects and effectively mitigates the risks of poisoning attacks in FL systems. To minimize the total latency and energy consumption in the proposed system, we then formulate a Stackelberg game by considering clients and the server as the leader and the follower, respectively. Specifically, the leader aims to minimize the energy consumption while the objective of the follower is to minimize the total latency during FL training. The Stackelberg equilibrium is achieved to obtain the optimal solutions. We first derive the strategies for the followerlevel problem and include them in the leader-level problem which is then solved via problem decomposition. Simulation results verify the superior performance of the proposed scheme.

Paperid: 1380, https://arxiv.org/pdf/2506.23706.pdf

Abstract:
Benchmarks are important measures to evaluate safety and compliance of AI models at scale. However, they typically do not offer verifiable results and lack confidentiality for model IP and benchmark datasets. We propose Attestable Audits, which run inside Trusted Execution Environments and enable users to verify interaction with a compliant AI model. Our work protects sensitive data even when model provider and auditor do not trust each other. This addresses verification challenges raised in recent AI governance frameworks. We build a prototype demonstrating feasibility on typical audit benchmarks against Llama-3.1.

Paperid: 1381, https://arxiv.org/pdf/2506.21688.pdf

Abstract:
We introduce a novel cybersecurity encounter simulator between a network defender and an attacker designed to facilitate game-theoretic modeling and analysis while maintaining many significant features of real cyber defense. Our simulator, built within the OpenAI Gym framework, incorporates realistic network topologies, vulnerabilities, exploits (including-zero-days), and defensive mechanisms. Additionally, we provide a formal simulation-based game-theoretic model of cyberdefense using this simulator, which features a novel approach to modeling zero-days exploits, and a PSRO-style approach for approximately computing equilibria in this game. We use our simulator and associated game-theoretic framework to analyze the Volt Typhoon advanced persistent threat (APT). Volt Typhoon represents a sophisticated cyber attack strategy employed by state-sponsored actors, characterized by stealthy, prolonged infiltration and exploitation of network vulnerabilities. Our experimental results demonstrate the efficacy of game-theoretic strategies in understanding network resilience against APTs and zero-days, such as Volt Typhoon, providing valuable insight into optimal defensive posture and proactive threat mitigation.

Paperid: 1382, https://arxiv.org/pdf/2506.21106.pdf

Abstract:
Phishing attacks pose a significant cybersecurity threat, evolving rapidly to bypass detection mechanisms and exploit human vulnerabilities. This paper introduces PhishKey to address the challenges of adaptability, robustness, and efficiency. PhishKey is a novel phishing detection method using automatic feature extraction from hybrid sources. PhishKey combines character-level processing with Convolutional Neural Networks (CNN) for URL classification, and a Centroid-Based Key Component Phishing Extractor (CAPE) for HTML content at the word level. CAPE reduces noise and ensures complete sample processing avoiding crop operations on the input data. The predictions from both modules are integrated using a soft-voting ensemble to achieve more accurate and reliable classifications. Experimental evaluations on four state-of-the-art datasets demonstrate the effectiveness of PhishKey. It achieves up to 98.70% F1 Score and shows strong resistance to adversarial manipulations such as injection attacks with minimal performance degradation.

Paperid: 1383, https://arxiv.org/pdf/2506.20915.pdf

Abstract:
As the deployment of large language models (LLMs) grows in sensitive domains, ensuring the integrity of their computational provenance becomes a critical challenge, particularly in regulated sectors such as healthcare, where strict requirements are applied in dataset usage. We introduce ZKPROV, a novel cryptographic framework that enables zero-knowledge proofs of LLM provenance. It allows users to verify that a model is trained on a reliable dataset without revealing sensitive information about it or its parameters. Unlike prior approaches that focus on complete verification of the training process (incurring significant computational cost) or depend on trusted execution environments, ZKPROV offers a distinct balance. Our method cryptographically binds a trained model to its authorized training dataset(s) through zero-knowledge proofs while avoiding proof of every training step. By leveraging dataset-signed metadata and compact model parameter commitments, ZKPROV provides sound and privacy-preserving assurances that the result of the LLM is derived from a model trained on the claimed authorized and relevant dataset. Experimental results demonstrate the efficiency and scalability of the ZKPROV in generating this proof and verifying it, achieving a practical solution for real-world deployments. We also provide formal security guarantees, proving that our approach preserves dataset confidentiality while ensuring trustworthy dataset provenance.

Paperid: 1384, https://arxiv.org/pdf/2506.20415.pdf

Abstract:
Ensuring the security of complex system-on-chips (SoCs) designs is a critical imperative, yet traditional verification techniques struggle to keep pace due to significant challenges in automation, scalability, comprehensiveness, and adaptability. The advent of large language models (LLMs), with their remarkable capabilities in natural language understanding, code generation, and advanced reasoning, presents a new paradigm for tackling these issues. Moving beyond monolithic models, an agentic approach allows for the creation of multi-agent systems where specialized LLMs collaborate to solve complex problems more effectively. Recognizing this opportunity, we introduce SV-LLM, a novel multi-agent assistant system designed to automate and enhance SoC security verification. By integrating specialized agents for tasks like verification question answering, security asset identification, threat modeling, test plan and property generation, vulnerability detection, and simulation-based bug validation, SV-LLM streamlines the workflow. To optimize their performance in these diverse tasks, agents leverage different learning paradigms, such as in-context learning, fine-tuning, and retrieval-augmented generation (RAG). The system aims to reduce manual intervention, improve accuracy, and accelerate security analysis, supporting proactive identification and mitigation of risks early in the design cycle. We demonstrate its potential to transform hardware security practices through illustrative case studies and experiments that showcase its applicability and efficacy.

Paperid: 1385, https://arxiv.org/pdf/2506.19268.pdf

Abstract:
We present Health App Reviews for Privacy & Trust (HARPT), a large-scale annotated corpus of user reviews from Electronic Health (eHealth) applications (apps) aimed at advancing research in user privacy and trust. The dataset comprises 480K user reviews labeled in seven categories that capture critical aspects of trust in applications (TA), trust in providers (TP), and privacy concerns (PC). Our multistage strategy integrated keyword-based filtering, iterative manual labeling with review, targeted data augmentation, and weak supervision using transformer-based classifiers. In parallel, we manually annotated a curated subset of 7,000 reviews to support the development and evaluation of machine learning models. We benchmarked a broad range of models, providing a baseline for future work. HARPT is released under an open resource license to support reproducible research in usable privacy and trust in digital libraries and health informatics.

Paperid: 1386, https://arxiv.org/pdf/2506.18543.pdf

Abstract:
The widespread deployment of large language models (LLMs) has raised critical concerns over their vulnerability to jailbreak attacks, i.e., adversarial prompts that bypass alignment mechanisms and elicit harmful or policy-violating outputs. While proprietary models like GPT-4 have undergone extensive evaluation, the robustness of emerging open-source alternatives such as DeepSeek remains largely underexplored, despite their growing adoption in real-world applications. In this paper, we present the first systematic jailbreak evaluation of DeepSeek-series models, comparing them with GPT-3.5 and GPT-4 using the HarmBench benchmark. We evaluate seven representative attack strategies across 510 harmful behaviors categorized by both function and semantic domain. Our analysis reveals that DeepSeek's Mixture-of-Experts (MoE) architecture introduces routing sparsity that offers selective robustness against optimization-based attacks such as TAP-T, but leads to significantly higher vulnerability under prompt-based and manually engineered attacks. In contrast, GPT-4 Turbo demonstrates stronger and more consistent safety alignment across diverse behaviors, likely due to its dense Transformer design and reinforcement learning from human feedback. Fine-grained behavioral analysis and case studies further show that DeepSeek often routes adversarial prompts to under-aligned expert modules, resulting in inconsistent refusal behaviors. These findings highlight a fundamental trade-off between architectural efficiency and alignment generalization, emphasizing the need for targeted safety tuning and modular alignment strategies to ensure secure deployment of open-source LLMs.

Paperid: 1387, https://arxiv.org/pdf/2506.18087.pdf

Abstract:
With the widespread application of edge computing and cloud systems in AI-driven applications, how to maintain efficient performance while ensuring data privacy has become an urgent security issue. This paper proposes a federated learning-based data collaboration method to improve the security of edge cloud AI systems, and use large-scale language models (LLMs) to enhance data privacy protection and system robustness. Based on the existing federated learning framework, this method introduces a secure multi-party computation protocol, which optimizes the data aggregation and encryption process between distributed nodes by using LLM to ensure data privacy and improve system efficiency. By combining advanced adversarial training techniques, the model enhances the resistance of edge cloud AI systems to security threats such as data leakage and model poisoning. Experimental results show that the proposed method is 15% better than the traditional federated learning method in terms of data protection and model robustness.

Paperid: 1388, https://arxiv.org/pdf/2506.15112.pdf

Abstract:
Decentralized learning is vulnerable to poison attacks, where malicious clients manipulate local updates to degrade global model performance. Existing defenses mainly detect and filter malicious models, aiming to prevent a limited number of attackers from corrupting the global model. However, restoring an already compromised global model remains a challenge. A direct approach is to remove malicious clients and retrain the model using only the benign clients. Yet, retraining is time-consuming, computationally expensive, and may compromise model consistency and privacy. We propose PDLRecover, a novel method to recover a poisoned global model efficiently by leveraging historical model information while preserving privacy. The main challenge lies in protecting shared historical models while enabling parameter estimation for model recovery. By exploiting the linearity of approximate Hessian matrix computation, we apply secret sharing to protect historical updates, ensuring local models are not leaked during transmission or reconstruction. PDLRecover introduces client-side preparation, periodic recovery updates, and a final exact update to ensure robustness and convergence of the recovered model. Periodic updates maintain accurate curvature information, and the final step ensures high-quality convergence. Experiments show that the recovered global model achieves performance comparable to a fully retrained model but with significantly reduced computation and time cost. Moreover, PDLRecover effectively prevents leakage of local model parameters, ensuring both accuracy and privacy in recovery.

Paperid: 1389, https://arxiv.org/pdf/2506.14937.pdf

Abstract:
Currently, digital security mechanisms like Anomaly Detection Systems using Autoencoders (AE) show great potential for bypassing problems intrinsic to the data, such as data imbalance. Because AE use a non-trivial and nonstandardized separation threshold to classify the extracted reconstruction error, the definition of this threshold directly impacts the performance of the detection process. Thus, this work proposes the automatic definition of this threshold using some machine learning algorithms. For this, three algorithms were evaluated: the K-Nearst Neighbors, the K-Means and the Support Vector Machine.

Paperid: 1390, https://arxiv.org/pdf/2506.14124.pdf

Abstract:
This paper presents the first generic compiler that transforms any permissioned consensus protocol into a proof-of-stake permissionless consensus protocol. For each of the following properties, if the initial permissioned protocol satisfies that property in the partially synchronous setting, the consequent proof-of-stake protocol also satisfies that property in the partially synchronous and quasi-permissionless setting (with the same fault-tolerance): consistency; liveness; optimistic responsiveness; every composable log-specific property; and message complexity of a given order. Moreover, our transformation ensures that the output protocol satisfies accountability (identifying culprits in the event of a consistency violation), whether or not the original permissioned protocol satisfied it.

Paperid: 1391, https://arxiv.org/pdf/2506.13261.pdf

Abstract:
The automotive industry is undergoing a software-as-a-service transformation that enables software-defined functions and post-sale updates via cloud and vehicle-to-everything communication. Connectivity in cars introduces significant security challenges, as remote attacks on vehicles have become increasingly prevalent. Current automotive designs call for security solutions that address the entire lifetime of a vehicle. In this paper, we propose to authenticate and authorize in-vehicle services by integrating DNSSEC, DANE, and DANCE with automotive middleware. Our approach decouples the cryptographic authentication of the service from that of the service deployment with the help of DNSSEC and thereby largely simplifies key management. We propose to authenticate in-vehicle services by certificates that are solely generated by the service suppliers but published on deployment via DNSSEC TLSA records solely signed by the OEM. Building on well-established Internet standards ensures interoperability with various current and future protocols, scalable management of credentials for millions of connected vehicles at well-established security levels. We back our design proposal by a security analysis using the STRIDE threat model and by evaluations in a realistic in-vehicle setup that demonstrate its effectiveness.

Paperid: 1392, https://arxiv.org/pdf/2506.12699.pdf

Abstract:
Large language models (LLMs) are sophisticated artificial intelligence systems that enable machines to generate human-like text with remarkable precision. While LLMs offer significant technological progress, their development using vast amounts of user data scraped from the web and collected from extensive user interactions poses risks of sensitive information leakage. Most existing surveys focus on the privacy implications of the training data but tend to overlook privacy risks from user interactions and advanced LLM capabilities. This paper aims to fill that gap by providing a comprehensive analysis of privacy in LLMs, categorizing the challenges into four main areas: (i) privacy issues in LLM training data, (ii) privacy challenges associated with user prompts, (iii) privacy vulnerabilities in LLM-generated outputs, and (iv) privacy challenges involving LLM agents. We evaluate the effectiveness and limitations of existing mitigation mechanisms targeting these proposed privacy challenges and identify areas for further research.

Paperid: 1393, https://arxiv.org/pdf/2506.11022.pdf

Abstract:
The rapid adoption of Large Language Models(LLMs) for code generation has transformed software development, yet little attention has been given to how security vulnerabilities evolve through iterative LLM feedback. This paper analyzes security degradation in AI-generated code through a controlled experiment with 400 code samples across 40 rounds of "improvements" using four distinct prompting strategies. Our findings show a 37.6% increase in critical vulnerabilities after just five iterations, with distinct vulnerability patterns emerging across different prompting approaches. This evidence challenges the assumption that iterative LLM refinement improves code security and highlights the essential role of human expertise in the loop. We propose practical guidelines for developers to mitigate these risks, emphasizing the need for robust human validation between LLM iterations to prevent the paradoxical introduction of new security issues during supposedly beneficial code "improvements".

Paperid: 1394, https://arxiv.org/pdf/2506.10125.pdf

Abstract:
As one of the key tools in many security tasks, decompilers reconstruct human-readable source code from binaries. Yet, despite recent advances, their outputs often suffer from syntactic and semantic errors and remain difficult to read. Recently, with the advent of large language models (LLMs), researchers began to explore the potential of LLMs to refine decompiler output. Nevertheless, our study of these approaches reveals their problems, such as introducing new errors and relying on unreliable accuracy validation. In this paper, we present D-LIFT, an enhanced decompiler-LLM pipeline with a fine-tuned LLM using code quality-aware reinforcement learning. Unlike prior work that overlooks preserving accuracy, D-LIFT adheres to a key principle for enhancing the quality of decompiled code: preserving accuracy while improving readability. Central to D-LIFT, we propose D-Score, an integrated code quality assessment system to score the decompiled source code from multiple aspects, and use it to guide reinforcement learning fine-tuning and to select the best output during inference. In line with our principle, D-Score assigns low scores to any inaccurate output and only awards higher scores for readability to code that passes the accuracy check. Our implementation, based on Ghidra and a range of LLMs, demonstrates significant improvements for the accurate decompiled code from the coreutils and util-linux projects. Compared to baseline LLMs without D-Score-driven fine-tuning, our trained LLMs produce 55.3% more improved decompiled functions, as measured by D-Score. Overall, D-LIFT improves the quality of 68.2% of all the functions produced by the native decompiler.

Paperid: 1395, https://arxiv.org/pdf/2506.09387.pdf

Abstract:
Buy Now Pay Later (BNPL) is a rapidly proliferating e-commerce model, offering consumers to get the product immediately and defer payments. Meanwhile, emerging blockchain technologies endow BNPL platforms with digital currency transactions, allowing BNPL platforms to integrate with digital wallets. However, the transparency of transactions causes critical privacy concerns because malicious participants may derive consumers' financial statuses from on-chain asynchronous payments. Furthermore, the newly created transactions for deferred payments introduce additional time overheads, which weaken the scalability of BNPL services. To address these issues, we propose an efficient and privacy-preserving blockchain-based asynchronous payment scheme (Epass), which has promising scalability while protecting the privacy of on-chain consumer transactions. Specifically, Epass leverages locally verifiable signatures to guarantee the privacy of consumer transactions against malicious acts. Then, a privacy-preserving asynchronous payment scheme can be further constructed by leveraging time-release encryption to control trapdoors of redactable blockchain, reducing time overheads by modifying transactions for deferred payment. We give formal definitions and security models, generic structures, and formal proofs for Epass. Extensive comparisons and experimental analysis show that \textsf{Epass} achieves KB-level communication costs, and reduces time overhead by more than four times in comparisons with locally verifiable signatures and Go-Ethereum private test networks.

Paperid: 1396, https://arxiv.org/pdf/2506.07403.pdf

Abstract:
Recent advancements in watermarking techniques have enabled the embedding of secret messages into AI-generated text (AIGT), serving as an important mechanism for AIGT detection. Existing methods typically interfere with the generation processes of large language models (LLMs) to embed signals within the generated text. However, these methods often rely on heuristic rules, which can result in suboptimal token selection and a subsequent decline in the quality of the generated content. In this paper, we introduce a plug-and-play contextual generation states-aware watermarking framework (CAW) that dynamically adjusts the embedding process. It can be seamlessly integrated with various existing watermarking methods to enhance generation quality. First, CAW incorporates a watermarking capacity evaluator, which can assess the impact of embedding messages at different token positions by analyzing the contextual generation states. Furthermore, we introduce a multi-branch pre-generation mechanism to avoid the latency caused by the proposed watermarking strategy. Building on this, CAW can dynamically adjust the watermarking process based on the evaluated watermark capacity of each token, thereby minimizing potential degradation in content quality. Extensive experiments conducted on datasets across multiple domains have verified the effectiveness of our method, demonstrating superior performance compared to various baselines in terms of both detection rate and generation quality.

Paperid: 1397, https://arxiv.org/pdf/2506.06825.pdf

Abstract:
Generative AI (Gen-AI) deepfakes pose a rapidly evolving threat to biometric authentication, yet a significant gap exists between expert understanding of these risks and public perception. This disconnection creates critical vulnerabilities in systems trusted by millions. To bridge this gap, we conducted a comprehensive mixed-method study, surveying 408 professionals across key sectors and conducting in-depth interviews with 37 participants (25 experts, 12 general public [non-experts]). Our findings reveal a paradox: while the public increasingly relies on biometrics for convenience, experts express grave concerns about the spoofing of static modalities like face and voice recognition. We found significant demographic and sector-specific divides in awareness and trust, with finance professionals, for example, showing heightened skepticism. To systematically analyze these threats, we introduce a novel Deepfake Kill Chain model, adapted from Hutchins et al.'s cybersecurity frameworks to map the specific attack vectors used by malicious actors against biometric systems. Based on this model and our empirical findings, we propose a tri-layer mitigation framework that prioritizes dynamic biometric signals (e.g., eye movements), robust privacy-preserving data governance, and targeted educational initiatives. This work provides the first empirically grounded roadmap for defending against AI-generated identity threats by aligning technical safeguards with human-centered insights.

Paperid: 1398, https://arxiv.org/pdf/2506.05908.pdf

Abstract:
Gaze-based applications are increasingly advancing with the availability of large datasets but ensuring data quality presents a substantial challenge when collecting data at scale. It further requires different parties to collaborate, therefore, privacy concerns arise. We propose QualitEye--the first method for verifying image-based gaze data quality. QualitEye employs a new semantic representation of eye images that contains the information required for verification while excluding irrelevant information for better domain adaptation. QualitEye covers a public setting where parties can freely exchange data and a privacy-preserving setting where parties cannot reveal their raw data nor derive gaze features/labels of others with adapted private set intersection protocols. We evaluate QualitEye on the MPIIFaceGaze and GazeCapture datasets and achieve a high verification performance (with a small overhead in runtime for privacy-preserving versions). Hence, QualitEye paves the way for new gaze analysis methods at the intersection of machine learning, human-computer interaction, and cryptography.

Paperid: 1399, https://arxiv.org/pdf/2506.05411.pdf

Abstract:
This paper introduces QA-HFL, a quality-aware hierarchical federated learning framework that efficiently handles heterogeneous image quality across resource-constrained mobile devices. Our approach trains specialized local models for different image quality levels and aggregates their features using a quality-weighted fusion mechanism, while incorporating differential privacy protection. Experiments on MNIST demonstrate that QA-HFL achieves 92.31% accuracy after just three federation rounds, significantly outperforming state-of-the-art methods like FedRolex (86.42%). Under strict privacy constraints, our approach maintains 30.77% accuracy with formal differential privacy guarantees. Counter-intuitively, low-end devices contributed most significantly (63.5%) to the final model despite using 100 fewer parameters than high-end counterparts. Our quality-aware approach addresses accuracy decline through device-specific regularization, adaptive weighting, intelligent client selection, and server-side knowledge distillation, while maintaining efficient communication with a 4.71% compression ratio. Statistical analysis confirms that our approach significantly outperforms baseline methods (p 0.01) under both standard and privacy-constrained conditions.

Paperid: 1400, https://arxiv.org/pdf/2506.05126.pdf

Abstract:
Sequence models, such as Large Language Models (LLMs) and autoregressive image generators, have a tendency to memorize and inadvertently leak sensitive information. While this tendency has critical legal implications, existing tools are insufficient to audit the resulting risks. We hypothesize that those tools' shortcomings are due to mismatched assumptions. Thus, we argue that effectively measuring privacy leakage in sequence models requires leveraging the correlations inherent in sequential generation. To illustrate this, we adapt a state-of-the-art membership inference attack to explicitly model within-sequence correlations, thereby demonstrating how a strong existing attack can be naturally extended to suit the structure of sequence models. Through a case study, we show that our adaptations consistently improve the effectiveness of memorization audits without introducing additional computational costs. Our work hence serves as an important stepping stone toward reliable memorization audits for large sequence models.

Paperid: 1401, https://arxiv.org/pdf/2506.04556.pdf

Abstract:
To boost the encoder stealing attack under the perturbation-based defense that hinders the attack performance, we propose a boosting encoder stealing attack with perturbation recovery named BESA. It aims to overcome perturbation-based defenses. The core of BESA consists of two modules: perturbation detection and perturbation recovery, which can be combined with canonical encoder stealing attacks. The perturbation detection module utilizes the feature vectors obtained from the target encoder to infer the defense mechanism employed by the service provider. Once the defense mechanism is detected, the perturbation recovery module leverages the well-designed generative model to restore a clean feature vector from the perturbed one. Through extensive evaluations based on various datasets, we demonstrate that BESA significantly enhances the surrogate encoder accuracy of existing encoder stealing attacks by up to 24.63\% when facing state-of-the-art defenses and combinations of multiple defenses.

Paperid: 1402, https://arxiv.org/pdf/2506.04450.pdf

Abstract:
Purpose: This study proposes a framework for fine-tuning large language models (LLMs) with differential privacy (DP) to perform multi-abnormality classification on radiology report text. By injecting calibrated noise during fine-tuning, the framework seeks to mitigate the privacy risks associated with sensitive patient data and protect against data leakage while maintaining classification performance. Materials and Methods: We used 50,232 radiology reports from the publicly available MIMIC-CXR chest radiography and CT-RATE computed tomography datasets, collected between 2011 and 2019. Fine-tuning of LLMs was conducted to classify 14 labels from MIMIC-CXR dataset, and 18 labels from CT-RATE dataset using Differentially Private Low-Rank Adaptation (DP-LoRA) in high and moderate privacy regimes (across a range of privacy budgets = {0.01, 0.1, 1.0, 10.0}). Model performance was evaluated using weighted F1 score across three model architectures: BERT-medium, BERT-small, and ALBERT-base. Statistical analyses compared model performance across different privacy levels to quantify the privacy-utility trade-off. Results: We observe a clear privacy-utility trade-off through our experiments on 2 different datasets and 3 different models. Under moderate privacy guarantees the DP fine-tuned models achieved comparable weighted F1 scores of 0.88 on MIMIC-CXR and 0.59 on CT-RATE, compared to non-private LoRA baselines of 0.90 and 0.78, respectively. Conclusion: Differentially private fine-tuning using LoRA enables effective and privacy-preserving multi-abnormality classification from radiology reports, addressing a key challenge in fine-tuning LLMs on sensitive medical data.

Paperid: 1403, https://arxiv.org/pdf/2506.02711.pdf

Abstract:
Membership inference attack (MIA) has become one of the most widely used and effective methods for evaluating the privacy risks of machine learning models. These attacks aim to determine whether a specific sample is part of the model's training set by analyzing the model's output. While traditional membership inference attacks focus on leveraging the model's posterior output, such as confidence on the target sample, we propose IMIA, a novel attack strategy that utilizes the process of generating adversarial samples to infer membership. We propose to infer the member properties of the target sample using the number of iterations required to generate its adversarial sample. We conduct experiments across multiple models and datasets, and our results demonstrate that the number of iterations for generating an adversarial sample is a reliable feature for membership inference, achieving strong performance both in black-box and white-box attack scenarios. This work provides a new perspective for evaluating model privacy and highlights the potential of adversarial example-based features for privacy leakage assessment.

Paperid: 1404, https://arxiv.org/pdf/2506.00317.pdf

Abstract:
We present a study of how local frames (i.e., iframes loading content like "about:blank") are mishandled by a wide range of popular Web security and privacy tools. As a result, users of these tools remain vulnerable to the very attack techniques against which they seek to protect themselves, including browser fingerprinting, cookie-based tracking, and data exfiltration. The tools we study are vulnerable in different ways, but all share a root cause: legacy Web functionality interacts with browser privacy boundaries in unexpected ways, leading to systemic vulnerabilities in tools developed, maintained, and recommended by privacy experts and activists. We consider four core capabilities supported by most privacy tools and develop tests to determine whether each can be evaded through the use of local frames. We apply our tests to six popular Web privacy and security tools -- identifying at least one vulnerability in each for a total of 19 -- and extract common patterns regarding their mishandling of local frames. Our measurement of popular websites finds that 56% employ local frames and that 73.7% of the requests made by these local frames should be blocked by popular filter lists but instead trigger the vulnerabilities we identify. From another perspective, 14.3% of all sites that we crawl make requests that should be blocked inside of local frames. We disclosed these vulnerabilities to the tool authors and discuss both our experiences working with them to patch their products and the implications of our findings for other privacy and security research.

Paperid: 1405, https://arxiv.org/pdf/2505.23658.pdf

Abstract:
We introduce a new Bayesian perspective on the concept of data reconstruction, and leverage this viewpoint to propose a new security definition that, in certain settings, provably prevents reconstruction attacks. We use our paradigm to shed new light on one of the most notorious attacks in the privacy and memorization literature - fingerprinting code attacks (FPC). We argue that these attacks are really a form of membership inference attacks, rather than reconstruction attacks. Furthermore, we show that if the goal is solely to prevent reconstruction (but not membership inference), then in some cases the impossibility results derived from FPC no longer apply.

Paperid: 1406, https://arxiv.org/pdf/2505.22878.pdf

Abstract:
The current landscape of system-on-chips (SoCs) security verification faces challenges due to manual, labor-intensive, and inflexible methodologies. These issues limit the scalability and effectiveness of security protocols, making bug detection at the Register-Transfer Level (RTL) difficult. This paper proposes a new framework named BugWhisperer that utilizes a specialized, fine-tuned Large Language Model (LLM) to address these challenges. By enhancing the LLM's hardware security knowledge and leveraging its capabilities for text inference and knowledge transfer, this approach automates and improves the adaptability and reusability of the verification process. We introduce an open-source, fine-tuned LLM specifically designed for detecting security vulnerabilities in SoC designs. Our findings demonstrate that this tailored LLM effectively enhances the efficiency and flexibility of the security verification process. Additionally, we introduce a comprehensive hardware vulnerability database that supports this work and will further assist the research community in enhancing the security verification process.

Paperid: 1407, https://arxiv.org/pdf/2505.22845.pdf

Abstract:
Generative artificial intelligence is developing rapidly, impacting humans' interaction with information and digital media. It is increasingly used to create deceptively realistic misinformation, so lawmakers have imposed regulations requiring the disclosure of AI-generated content. However, only little is known about whether these labels reduce the risks of AI-generated misinformation. Our work addresses this research gap. Focusing on AI-generated images, we study the implications of labels, including the possibility of mislabeling. Assuming that simplicity, transparency, and trust are likely to impact the successful adoption of such labels, we first qualitatively explore users' opinions and expectations of AI labeling using five focus groups. Second, we conduct a pre-registered online survey with over 1300 U.S. and EU participants to quantitatively assess the effect of AI labels on users' ability to recognize misinformation containing either human-made or AI-generated images. Our focus groups illustrate that, while participants have concerns about the practical implementation of labeling, they consider it helpful in identifying AI-generated images and avoiding deception. However, considering security benefits, our survey revealed an ambiguous picture, suggesting that users might over-rely on labels. While inaccurate claims supported by labeled AI-generated images were rated less credible than those with unlabeled AI-images, the belief in accurate claims also decreased when accompanied by a labeled AI-generated image. Moreover, we find the undesired side effect that human-made images conveying inaccurate claims were perceived as more credible in the presence of labels.

Paperid: 1408, https://arxiv.org/pdf/2505.22638.pdf

Abstract:
Industrial Control Systems (ICS) manage critical infrastructures like power grids and water treatment plants. Cyberattacks on ICSs can disrupt operations, causing severe economic, environmental, and safety issues. For example, undetected pollution in a water plant can put the lives of thousands at stake. ICS researchers have increasingly turned to honeypots -- decoy systems designed to attract attackers, study their behaviors, and eventually improve defensive mechanisms. However, existing ICS honeypots struggle to replicate the ICS physical process, making them susceptible to detection. Accurately simulating the noise in ICS physical processes is challenging because different factors produce it, including sensor imperfections and external interferences. In this paper, we propose SimProcess, a novel framework to rank the fidelity of ICS simulations by evaluating how closely they resemble real-world and noisy physical processes. It measures the simulation distance from a target system by estimating the noise distribution with machine learning models like Random Forest. Unlike existing solutions that require detailed mathematical models or are limited to simple systems, SimProcess operates with only a timeseries of measurements from the real system, making it applicable to a broader range of complex dynamic systems. We demonstrate the framework's effectiveness through a case study using real-world power grid data from the EPIC testbed. We compare the performance of various simulation methods, including static and generative noise techniques. Our model correctly classifies real samples with a recall of up to 1.0. It also identifies Gaussian and Gaussian Mixture as the best distribution to simulate our power systems, together with a generative solution provided by an autoencoder, thereby helping developers to improve honeypot fidelity. Additionally, we make our code publicly available.

Paperid: 1409, https://arxiv.org/pdf/2505.21074.pdf

Abstract:
Text-to-image (T2I) models raise ethical and safety concerns due to their potential to generate inappropriate or harmful images. Evaluating these models' security through red-teaming is vital, yet white-box approaches are limited by their need for internal access, complicating their use with closed-source models. Moreover, existing black-box methods often assume knowledge about the model's specific defense mechanisms, limiting their utility in real-world commercial API scenarios. A significant challenge is how to evade unknown and diverse defense mechanisms. To overcome this difficulty, we propose a novel Rule-based Preference modeling Guided Red-Teaming (RPG-RT), which iteratively employs LLM to modify prompts to query and leverages feedback from T2I systems for fine-tuning the LLM. RPG-RT treats the feedback from each iteration as a prior, enabling the LLM to dynamically adapt to unknown defense mechanisms. Given that the feedback is often labeled and coarse-grained, making it difficult to utilize directly, we further propose rule-based preference modeling, which employs a set of rules to evaluate desired or undesired feedback, facilitating finer-grained control over the LLM's dynamic adaptation process. Extensive experiments on nineteen T2I systems with varied safety mechanisms, three online commercial API services, and T2V models verify the superiority and practicality of our approach.

Paperid: 1410, https://arxiv.org/pdf/2505.13819.pdf

Abstract:
Large language models (LLMs) can leak sensitive training data through memorization and membership inference attacks. Prior work has primarily focused on strong adversarial assumptions, including attacker access to entire samples or long, ordered prefixes, leaving open the question of how vulnerable LLMs are when adversaries have only partial, unordered sample information. For example, if an attacker knows a patient has "hypertension," under what conditions can they query a model fine-tuned on patient data to learn the patient also has "osteoarthritis?" In this paper, we introduce a more general threat model under this weaker assumption and show that fine-tuned LLMs are susceptible to these fragment-specific extraction attacks. To systematically investigate these attacks, we propose two data-blind methods: (1) a likelihood ratio attack inspired by methods from membership inference, and (2) a novel approach, PRISM, which regularizes the ratio by leveraging an external prior. Using examples from both medical and legal settings, we show that both methods are competitive with a data-aware baseline classifier that assumes access to labeled in-distribution data, underscoring their robustness.

Paperid: 1411, https://arxiv.org/pdf/2505.13292.pdf

Abstract:
In the age of cloud computing, data privacy protection has become a major challenge, especially when sharing sensitive data across cloud environments. However, how to optimize collaboration across cloud environments remains an unresolved problem. In this paper, we combine federated learning with large-scale language models to optimize the collaborative mechanism of AI systems. Based on the existing federated learning framework, we introduce a cross-cloud architecture in which federated learning works by aggregating model updates from decentralized nodes without exposing the original data. At the same time, combined with large-scale language models, its powerful context and semantic understanding capabilities are used to improve model training efficiency and decision-making ability. We've further innovated by introducing a secure communication layer to ensure the privacy and integrity of model updates and training data. The model enables continuous model adaptation and fine-tuning across different cloud environments while protecting sensitive data. Experimental results show that the proposed method is significantly better than the traditional federated learning model in terms of accuracy, convergence speed and data privacy protection.

Paperid: 1412, https://arxiv.org/pdf/2505.12612.pdf

Abstract:
Geospatial data statistics involve the aggregation and analysis of location data to derive the distribution of clients within geospatial. The need for privacy protection in geospatial data analysis has become paramount due to concerns over the misuse or unauthorized access of client location information. However, existing private geospatial data statistics mainly rely on privacy computing techniques such as cryptographic tools and differential privacy, which leads to significant overhead and inaccurate results. In practical applications, geospatial data is frequently generated by mobile devices such as smartphones and IoT sensors. The continuous mobility of clients and the need for real-time updates introduce additional complexity. To address these issues, we first design \textit{spatially distributed point functions (SDPF)}, which combines a quad-tree structure with distributed point functions, allowing clients to succinctly secret-share values on the nodes of an exponentially large quad-tree. Then, we use Gray code to partition the region and combine SDPF with it to propose $\mathtt{EPSpatial}$, a scheme for accurate, efficient, and private statistical analytics of geospatial data. Moreover, considering clients' frequent movement requires continuous location updates, we leverage the region encoding property to present an efficient update algorithm.Security analysis shows that $\mathtt{EPSpatial}$ effectively protects client location privacy. Theoretical analysis and experimental results on real datasets demonstrate that $\mathtt{EPSpatial}$ reduces computational and communication overhead by at least $50\%$ compared to existing statistical schemes.

Paperid: 1413, https://arxiv.org/pdf/2505.11963.pdf

Abstract:
Hardware security verification is a challenging and time-consuming task. For this purpose, design engineers may utilize tools such as formal verification, linters, and functional simulation tests, coupled with analysis and a deep understanding of the hardware design being inspected. Large Language Models (LLMs) have been used to assist during this task, either directly or in conjunction with existing tools. We improve the state of the art by proposing MARVEL, a multi-agent LLM framework for a unified approach to decision-making, tool use, and reasoning. MARVEL mimics the cognitive process of a designer looking for security vulnerabilities in RTL code. It consists of a supervisor agent that devises the security policy of the system-on-chips (SoCs) using its security documentation. It delegates tasks to validate the security policy to individual executor agents. Each executor agent carries out its assigned task using a particular strategy. Each executor agent may use one or more tools to identify potential security bugs in the design and send the results back to the supervisor agent for further analysis and confirmation. MARVEL includes executor agents that leverage formal tools, linters, simulation tests, LLM-based detection schemes, and static analysis-based checks. We test our approach on a known buggy SoC based on OpenTitan from the Hack@DATE competition. We find that 20 of the 48 issues reported by MARVEL pose security vulnerabilities.

Paperid: 1414, https://arxiv.org/pdf/2505.11901.pdf

Abstract:
As cyber threats continue to grow in scale and sophistication, blue team defenders increasingly require advanced tools to proactively detect and mitigate risks. Large Language Models (LLMs) offer promising capabilities for enhancing threat analysis. However, their effectiveness in real-world blue team threat-hunting scenarios remains insufficiently explored. In this paper, we present CYBERTEAM, a benchmark designed to guide LLMs in blue teaming practice. CYBERTEAM constructs an embodied environment in two stages. First, it models realistic threat-hunting workflows by capturing the dependencies among analytical tasks from threat attribution to incident response. Next, each task is addressed through a set of embodied functions tailored to its specific analytical requirements. This transforms the overall threat-hunting process into a structured sequence of function-driven operations, where each node represents a discrete function and edges define the execution order. Guided by this framework, LLMs are directed to perform threat-hunting tasks through modular steps. Overall, CYBERTEAM integrates 30 tasks and 9 embodied functions, guiding LLMs through pipelined threat analysis. We evaluate leading LLMs and state-of-the-art cybersecurity agents, comparing CYBERTEAM's embodied function-calling against fundamental elicitation strategies. Our results offer valuable insights into the current capabilities and limitations of LLMs in threat hunting, laying the foundation for the practical adoption in real-world cybersecurity applications.

Paperid: 1415, https://arxiv.org/pdf/2505.11708.pdf

Abstract:
Reinforcement Learning (RL) agents are increasingly used to simulate sophisticated cyberattacks, but their decision-making processes remain opaque, hindering trust, debugging, and defensive preparedness. In high-stakes cybersecurity contexts, explainability is essential for understanding how adversarial strategies are formed and evolve over time. In this paper, we propose a unified, multi-layer explainability framework for RL-based attacker agents that reveals both strategic (MDP-level) and tactical (policy-level) reasoning. At the MDP level, we model cyberattacks as a Partially Observable Markov Decision Processes (POMDPs) to expose exploration-exploitation dynamics and phase-aware behavioural shifts. At the policy level, we analyse the temporal evolution of Q-values and use Prioritised Experience Replay (PER) to surface critical learning transitions and evolving action preferences. Evaluated across CyberBattleSim environments of increasing complexity, our framework offers interpretable insights into agent behaviour at scale. Unlike previous explainable RL methods, which are often post-hoc, domain-specific, or limited in depth, our approach is both agent- and environment-agnostic, supporting use cases ranging from red-team simulation to RL policy debugging. By transforming black-box learning into actionable behavioural intelligence, our framework enables both defenders and developers to better anticipate, analyse, and respond to autonomous cyber threats.

Paperid: 1416, https://arxiv.org/pdf/2505.11502.pdf

Abstract:
The increasing concern in user privacy misuse has accelerated research into checking consistencies between smartphone apps' declared privacy policies and their actual behaviors. Recent advances in Large Language Models (LLMs) have introduced promising techniques for semantic comparison, but these methods often suffer from low accuracies and expensive computational costs. To address this problem, this paper proposes a novel hybrid approach that integrates 1) knowledge graph-based deterministic checking to ensure higher accuracy, and 2) LLMs exclusively used for preliminary semantic analysis to save computational costs. Preliminary evaluation indicates this hybrid approach not only achieves 37.63% increase in precision and 23.13% increase F1-score but also consumes 93.5% less tokens and 87.3% shorter time.

Paperid: 1417, https://arxiv.org/pdf/2505.09974.pdf

Abstract:
Large language models (LLMs) have been used in many application domains, including cyber security. The application of LLMs in the cyber security domain presents significant opportunities, such as for enhancing threat analysis and malware detection, but it can also introduce critical risks and safety concerns, including potential personal data leakage and automated generation of new malware. Building on recent findings that fine-tuning LLMs with pseudo-malicious cyber security data significantly compromises their safety, this paper presents a comprehensive validation and extension of these safety risks using a different evaluation framework. We employ the garak red teaming framework with the OWASP Top 10 for LLM Applications to assess four open-source LLMs: Mistral 7B, Llama 3 8B, Gemma 2 9B, and DeepSeek R1 8B. Our evaluation confirms and extends previous findings, showing that fine-tuning reduces safety resilience across all tested LLMs (e.g., the failure rate of Mistral 7B against prompt injection increases from 9.1% to 68.7%). We further propose and evaluate a novel safety alignment approach that carefully rewords instruction-response pairs to include explicit safety precautions and ethical considerations. This work validates previous safety concerns through independent evaluation and introduces new methods for mitigating these risks, contributing towards the development of secure, trustworthy, and ethically aligned LLMs. This approach demonstrates that it is possible to maintain or even improve model safety while preserving technical utility, offering a practical path towards developing safer fine-tuning methodologies.

Paperid: 1418, https://arxiv.org/pdf/2505.08804.pdf

Abstract:
Text-to-image (T2I) models have significantly advanced in producing high-quality images. However, such models have the ability to generate images containing not-safe-for-work (NSFW) content, such as pornography, violence, political content, and discrimination. To mitigate the risk of generating NSFW content, refusal mechanisms, i.e., safety checkers, have been developed to check potential NSFW content. Adversarial prompting techniques have been developed to evaluate the robustness of the refusal mechanisms. The key challenge remains to subtly modify the prompt in a way that preserves its sensitive nature while bypassing the refusal mechanisms. In this paper, we introduce TokenProber, a method designed for sensitivity-aware differential testing, aimed at evaluating the robustness of the refusal mechanisms in T2I models by generating adversarial prompts. Our approach is based on the key observation that adversarial prompts often succeed by exploiting discrepancies in how T2I models and safety checkers interpret sensitive content. Thus, we conduct a fine-grained analysis of the impact of specific words within prompts, distinguishing between dirty words that are essential for NSFW content generation and discrepant words that highlight the different sensitivity assessments between T2I models and safety checkers. Through the sensitivity-aware mutation, TokenProber generates adversarial prompts, striking a balance between maintaining NSFW content generation and evading detection. Our evaluation of TokenProber against 5 safety checkers on 3 popular T2I models, using 324 NSFW prompts, demonstrates its superior effectiveness in bypassing safety filters compared to existing methods (e.g., 54%+ increase on average), highlighting TokenProber's ability to uncover robustness issues in the existing refusal mechanisms.

Paperid: 1419, https://arxiv.org/pdf/2505.08799.pdf

Abstract:
In today's increasingly interconnected and fast-paced digital ecosystem, mobile networks, such as 5G and future generations such as 6G, play a pivotal role and must be considered as critical infrastructures. Ensuring their security is paramount to safeguard both individual users and the industries that depend on these networks. An essential condition for maintaining and improving the security posture of a system is the ability to effectively measure and monitor its security state. In this work we address the need for an objective measurement of the security state of 5G and future networks. We introduce a state machine model designed to capture the security life cycle of network functions and the transitions between different states within the life cycle. Such a model can be computed locally at each node, or hierarchically, by aggregating measurements into security domains or the whole network. We identify three essential security metrics -- attack surface exposure, impact of system vulnerabilities, and effectiveness of applied security controls -- that collectively form the basis for calculating the overall security score. With this approach, it is possible to provide a holistic understanding of the security posture, laying the foundation for effective security management in the expected dynamic threat landscape of 6G networks. Through practical examples, we illustrate the real-world application of our proposed methodology, offering valuable insights for developing risk management and informed decision-making strategies in 5G and 6G security operations and laying the foundation for effective security management in the expected dynamic threat landscape of 6G networks.

Paperid: 1420, https://arxiv.org/pdf/2505.06493.pdf

Abstract:
Large language models (LLMs) have gained widespread adoption across diverse applications due to their impressive generative capabilities. Their plug-and-play nature enables both developers and end users to interact with these models through simple prompts. However, as LLMs become more integrated into various systems in diverse domains, concerns around their security are growing. Existing studies mainly focus on threats arising from user prompts (e.g. prompt injection attack) and model output (e.g. model inversion attack), while the security of system prompts remains largely overlooked. This work bridges the critical gap. We introduce system prompt poisoning, a new attack vector against LLMs that, unlike traditional user prompt injection, poisons system prompts hence persistently impacts all subsequent user interactions and model responses. We systematically investigate four practical attack strategies in various poisoning scenarios. Through demonstration on both generative and reasoning LLMs, we show that system prompt poisoning is highly feasible without requiring jailbreak techniques, and effective across a wide range of tasks, including those in mathematics, coding, logical reasoning, and natural language processing. Importantly, our findings reveal that the attack remains effective even when user prompts employ advanced prompting techniques like chain-of-thought (CoT). We also show that such techniques, including CoT and retrieval-augmentation-generation (RAG), which are proven to be effective for improving LLM performance in a wide range of tasks, are significantly weakened in their effectiveness by system prompt poisoning.

Paperid: 1421, https://arxiv.org/pdf/2505.06307.pdf

Abstract:
The rapid development of Internet of Things (IoT) technology has transformed people's way of life and has a profound impact on both production and daily activities. However, with the rapid advancement of IoT technology, the security of IoT devices has become an unavoidable issue in both research and applications. Although some efforts have been made to detect or mitigate IoT security vulnerabilities, they often struggle to adapt to the complexity of IoT environments, especially when dealing with dynamic security scenarios. How to automatically, efficiently, and accurately understand these vulnerabilities remains a challenge. To address this, we propose an IoT security assistant driven by Large Language Model (LLM), which enhances the LLM's understanding of IoT security vulnerabilities and related threats. The aim of the ICoT method we propose is to enable the LLM to understand security issues by breaking down the various dimensions of security vulnerabilities and generating responses tailored to the user's specific needs and expertise level. By incorporating ICoT, LLM can gradually analyze and reason through complex security scenarios, resulting in more accurate, in-depth, and personalized security recommendations and solutions. Experimental results show that, compared to methods relying solely on LLM, our proposed LLM-driven IoT security assistant significantly improves the understanding of IoT security issues through the ICoT approach and provides personalized solutions based on the user's identity, demonstrating higher accuracy and reliability.

Paperid: 1422, https://arxiv.org/pdf/2505.02860.pdf

Abstract:
The allocation of resources plays an important role in the completion of system objectives and tasks, especially in the presence of strategic adversaries. Optimal allocation strategies are becoming increasingly more complex, given that multiple heterogeneous types of resources are at a system planner's disposal. In this paper, we focus on deriving optimal strategies for the allocation of heterogeneous resources in a well-known competitive resource allocation model known as the General Lotto game. In standard formulations, outcomes are determined solely by the players' allocation strategies of a common, single type of resource across multiple contests. In particular, a player wins a contest if it sends more resources than the opponent. Here, we propose a multi-resource extension where the winner of a contest is now determined not only by the amount of resources allocated, but also by the composition of resource types that are allocated. We completely characterize the equilibrium payoffs and strategies for two distinct formulations. The first consists of a weakest-link/best-shot winning rule, and the second considers a winning rule based on a weighted linear combination of the allocated resources. We then consider a scenario where the resource types are costly to purchase, and derive the players' equilibrium investments in each of the resource types.

Paperid: 1423, https://arxiv.org/pdf/2505.02239.pdf

Abstract:
Quantum computing threatens the security foundations of consumer electronics (CE). Preparing the diverse CE ecosystem, particularly resource-constrained devices, for the post-quantum era requires quantitative understanding of quantum-resistant cryptography (PQC) performance. This paper presents a comprehensive cross-platform performance analysis of leading PQC Key Encapsulation Mechanisms (KEMs) and digital signatures (NIST standards/candidates) compared against classical RSA/ECC. We evaluated execution time, communication costs (key/signature sizes), and memory footprint indicators on high-performance (macOS/M4, Ubuntu/x86) and constrained platforms (Raspberry Pi 4/ARM). Our quantitative results reveal lattice-based schemes, notably NIST standards ML-KEM (Kyber) and ML-DSA (Dilithium), provide a strong balance of computational efficiency and moderate communication/storage overhead, making them highly suitable for many CE applications. In contrast, code-based Classic McEliece imposes significant key size challenges, while hash-based SPHINCS+ offers high security assurance but demands large signature sizes impacting bandwidth and storage. Based on empirical data across platforms and security levels, we provide specific deployment recommendations tailored to different CE scenarios (e.g., wearables, smart home hubs, mobile devices), offering guidance for manufacturers navigating the PQC transition.

Paperid: 1424, https://arxiv.org/pdf/2505.01866.pdf

Abstract:
Federated Learning (FL) enables collaborative model training while preserving data privacy, but its classical cryptographic underpinnings are vulnerable to quantum attacks. This vulnerability is particularly critical in sensitive domains like healthcare. This paper introduces PQS-BFL (Post-Quantum Secure Blockchain-based Federated Learning), a framework integrating post-quantum cryptography (PQC) with blockchain verification to secure FL against quantum adversaries. We employ ML-DSA-65 (a FIPS 204 standard candidate, formerly Dilithium) signatures to authenticate model updates and leverage optimized smart contracts for decentralized validation. Extensive evaluations on diverse datasets (MNIST, SVHN, HAR) demonstrate that PQS-BFL achieves efficient cryptographic operations (average PQC sign time: 0.65 ms, verify time: 0.53 ms) with a fixed signature size of 3309 Bytes. Blockchain integration incurs a manageable overhead, with average transaction times around 4.8 s and gas usage per update averaging 1.72 x 10^6 units for PQC configurations. Crucially, the cryptographic overhead relative to transaction time remains minimal (around 0.01-0.02% for PQC with blockchain), confirming that PQC performance is not the bottleneck in blockchain-based FL. The system maintains competitive model accuracy (e.g., over 98.8% for MNIST with PQC) and scales effectively, with round times showing sublinear growth with increasing client numbers. Our open-source implementation and reproducible benchmarks validate the feasibility of deploying long-term, quantum-resistant security in practical FL systems.

Paperid: 1425, https://arxiv.org/pdf/2505.01048.pdf

Abstract:
We propose a capability-based access control method that leverages OAuth 2.0 and Verifiable Credentials (VCs) to share resources in crowdsourced drone services. VCs securely encode claims about entities, offering flexibility. However, standardized protocols for VCs are lacking, limiting their adoption. To address this, we integrate VCs into OAuth 2.0, creating a novel access token. This token encapsulates VCs using JSON Web Tokens (JWT) and employs JWT-based methods for proof of possession. Our method streamlines VC verification with JSON Web Signatures (JWS) requires only minor adjustments to current OAuth 2.0 systems. Furthermore, in order to increase security and efficiency in multi-tenant environments, we provide a novel protocol for VC creation that makes use of the OAuth 2.0 client credentials grant. Using VCs as access tokens enhances OAuth 2.0, supporting long-term use and efficient data management. This system aids bushfire management authorities by ensuring high availability, enhanced privacy, and improved data portability. It supports multi-tenancy, allowing drone operators to control data access policies in a decentralized environment.

Paperid: 1426, https://arxiv.org/pdf/2504.19486.pdf

Abstract:
Microcontroller-based IoT devices often use embedded real-time operating systems (RTOSs). Vulnerabilities in these embedded RTOSs can lead to compromises of those IoT devices. Despite the significance of security protections, the absence of standardized security guidelines results in various levels of security risk across RTOS implementations. Our initial analysis reveals that popular RTOSs such as FreeRTOS lack essential security protections. While Zephyr OS and ThreadX are designed and implemented with essential security protections, our closer examination uncovers significant differences in their implementations of system call parameter sanitization. We identify a performance optimization practice in ThreadX that introduces security vulnerabilities, allowing for the circumvention of parameter sanitization processes. Leveraging this insight, we introduce a novel attack named the Kernel Object Masquerading (KOM) Attack (as the attacker needs to manipulate one or multiple kernel objects through carefully selected system calls to launch the attack), demonstrating how attackers can exploit these vulnerabilities to access sensitive fields within kernel objects, potentially leading to unauthorized data manipulation, privilege escalation, or system compromise. We introduce an automated approach involving under-constrained symbolic execution to identify the KOM attacks and to understand the implications. Experimental results demonstrate the feasibility of KOM attacks on ThreadX-powered platforms. We reported our findings to the vendors, who recognized the vulnerabilities, with Amazon and Microsoft acknowledging our contribution on their websites.

Paperid: 1427, https://arxiv.org/pdf/2504.18188.pdf

Abstract:
In this work, we derive the first lifting theorems for establishing security in the quantum random permutation and ideal cipher models. These theorems relate the success probability of an arbitrary quantum adversary to that of a classical algorithm making only a small number of classical queries. By applying these lifting theorems, we improve previous results and obtain new quantum query complexity bounds and post-quantum security results. Notably, we derive tight bounds for the quantum hardness of the double-sided zero search game and establish the post-quantum security for the preimage resistance, one-wayness, and multi-collision resistance of constant-round sponge, as well as the collision resistance of the Davies-Meyer construction.

Paperid: 1428, https://arxiv.org/pdf/2504.18083.pdf

Abstract:
As modern vehicles evolve into intelligent and connected systems, their growing complexity introduces significant cybersecurity risks. Threat Analysis and Risk Assessment (TARA) has therefore become essential for managing these risks under mandatory regulations. However, existing TARA automation methods rely on static threat libraries, limiting their utility in the detailed, function-level analyses demanded by industry. This paper introduces DefenseWeaver, the first system that automates function-level TARA using component-specific details and large language models (LLMs). DefenseWeaver dynamically generates attack trees and risk evaluations from system configurations described in an extended OpenXSAM++ format, then employs a multi-agent framework to coordinate specialized LLM roles for more robust analysis. To further adapt to evolving threats and diverse standards, DefenseWeaver incorporates Low-Rank Adaptation (LoRA) fine-tuning and Retrieval-Augmented Generation (RAG) with expert-curated TARA reports. We validated DefenseWeaver through deployment in four automotive security projects, where it identified 11 critical attack paths, verified through penetration testing, and subsequently reported and remediated by the relevant automakers and suppliers. Additionally, DefenseWeaver demonstrated cross-domain adaptability, successfully applying to unmanned aerial vehicles (UAVs) and marine navigation systems. In comparison to human experts, DefenseWeaver outperformed manual attack tree generation across six assessment scenarios. Integrated into commercial cybersecurity platforms such as UAES and Xiaomi, DefenseWeaver has generated over 8,200 attack trees. These results highlight its ability to significantly reduce processing time, and its scalability and transformative impact on cybersecurity across industries.

Paperid: 1429, https://arxiv.org/pdf/2504.17971.pdf

Abstract:
Data from domains such as social networks, healthcare, finance, and cybersecurity can be represented as graph-structured information. Given the sensitive nature of this data and their frequent distribution among collaborators, ensuring secure and attributable sharing is essential. Graph watermarking enables attribution by embedding user-specific signatures into graph-structured data. While prior work has addressed random perturbation attacks, the threat posed by adversaries leveraging structural properties through community detection remains unexplored. In this work, we introduce a cluster-aware threat model in which adversaries apply community-guided modifications to evade detection. We propose two novel attack strategies and evaluate them on real-world social network graphs. Our results show that cluster-aware attacks can reduce attribution accuracy by up to 80% more than random baselines under equivalent perturbation budgets on sparse graphs. To mitigate this threat, we propose a lightweight embedding enhancement that distributes watermark nodes across graph communities. This approach improves attribution accuracy by up to 60% under attack on dense graphs, without increasing runtime or structural distortion. Our findings underscore the importance of cluster-topological awareness in both watermarking design and adversarial modeling.

Paperid: 1430, https://arxiv.org/pdf/2504.17219.pdf

Abstract:
Variational Autoencoders (VAEs) have played a key role in scaling up diffusion-based generative models, as in Stable Diffusion, yet questions regarding their robustness remain largely underexplored. Although adversarial training has been an established technique for enhancing robustness in predictive models, it has been overlooked for generative models due to concerns about potential fidelity degradation by the nature of trade-offs between performance and robustness. In this work, we challenge this presumption, introducing Smooth Robust Latent VAE (SRL-VAE), a novel adversarial training framework that boosts both generation quality and robustness. In contrast to conventional adversarial training, which focuses on robustness only, our approach smooths the latent space via adversarial perturbations, promoting more generalizable representations while regularizing with originality representation to sustain original fidelity. Applied as a post-training step on pre-trained VAEs, SRL-VAE improves image robustness and fidelity with minimal computational overhead. Experiments show that SRL-VAE improves both generation quality, in image reconstruction and text-guided image editing, and robustness, against Nightshade attacks and image editing attacks. These results establish a new paradigm, showing that adversarial training, once thought to be detrimental to generative models, can instead enhance both fidelity and robustness.

Paperid: 1431, https://arxiv.org/pdf/2504.16695.pdf

Abstract:
Controller Area Networks (CANs) are the backbone for reliable intra-vehicular communication. Recent cyberattacks have, however, exposed the weaknesses of CAN, which was designed without any security considerations in the 1980s. Current efforts to retrofit security via intrusion detection or message authentication codes are insufficient to fully secure CAN as they cannot adequately protect against masquerading attacks, where a compromised communication device, a so-called electronic control units, imitates another device. To remedy this situation, multicast source authentication is required to reliably identify the senders of messages. In this paper, we present CAIBA, a novel multicast source authentication scheme specifically designed for communication buses like CAN. CAIBA relies on an authenticator overwriting authentication tags on-the-fly, such that a receiver only reads a valid tag if not only the integrity of a message but also its source can be verified. To integrate CAIBA into CAN, we devise a special message authentication scheme and a reactive bit overwriting mechanism. We achieve interoperability with legacy CAN devices, while protecting receivers implementing the AUTOSAR SecOC standard against masquerading attacks without communication overhead or verification delays.

Paperid: 1432, https://arxiv.org/pdf/2504.16125.pdf

Abstract:
This report presents a real-world case study demonstrating how prompt injection can attack large language model platforms such as ChatGPT according to a proposed injection framework. By providing three real-world examples, we show how adversarial prompts can be injected via user inputs, web-based retrieval, and system-level agent instructions. These attacks, though lightweight and low-cost, can cause persistent and misleading behaviors in LLM outputs. Our case study reveals that even commercial-grade LLMs remain vulnerable to subtle manipulations that bypass safety filters and influence user decisions. \textbf{More importantly, we stress that this report is not intended as an attack guide, but as a technical alert. As ethical researchers, we aim to raise awareness and call upon developers, especially those at OpenAI, to treat prompt-level security as a critical design priority.

Paperid: 1433, https://arxiv.org/pdf/2504.15580.pdf

Abstract:
Hierarchical clustering is a fundamental unsupervised machine learning task with the aim of organizing data into a hierarchy of clusters. Many applications of hierarchical clustering involve sensitive user information, therefore motivating recent studies on differentially private hierarchical clustering under the rigorous framework of Dasgupta's objective. However, it has been shown that any privacy-preserving algorithm under edge-level differential privacy necessarily suffers a large error. To capture practical applications of this problem, we focus on the weight privacy model, where each edge of the input graph is at least unit weight. We present a novel algorithm in the weight privacy model that shows significantly better approximation than known impossibility results in the edge-level DP setting. In particular, our algorithm achieves $O(\log^{1.5}n/\varepsilon)$ multiplicative error for $\varepsilon$-DP and runs in polynomial time, where $n$ is the size of the input graph, and the cost is never worse than the optimal additive error in existing work. We complement our algorithm by showing if the unit-weight constraint does not apply, the lower bound for weight-level DP hierarchical clustering is essentially the same as the edge-level DP, i.e. $Î©(n^2/\varepsilon)$ additive error. As a result, we also obtain a new lower bound of $\tildeÎ©(1/\varepsilon)$ additive error for balanced sparsest cuts in the weight-level DP model, which may be of independent interest. Finally, we evaluate our algorithm on synthetic and real-world datasets. Our experimental results show that our algorithm performs well in terms of extra cost and has good scalability to large graphs.

Paperid: 1434, https://arxiv.org/pdf/2504.12218.pdf

Abstract:
Safety and liveness are the two classical security properties of consensus protocols. Recent works have strengthened safety with accountability: should any safety violation occur, a sizable fraction of adversary nodes can be proven to be protocol violators. This paper studies to what extent analogous accountability guarantees are achievable for liveness. To reveal the full complexity of this question, we introduce an interpolation between the classical synchronous and partially-synchronous models that we call the $x$-partially-synchronous network model in which, intuitively, at most an $x$ fraction of the time steps in any sufficiently long interval are asynchronous (and, as with a partially-synchronous network, all time steps are synchronous following the passage of an unknown "global stablization time"). We prove a precise characterization of the parameter regime in which accountable liveness is achievable: if and only if $x < 1/2$ and $f < n/2$, where $n$ denotes the number of nodes and $f$ the number of nodes controlled by an adversary. We further refine the problem statement and our analysis by parameterizing by the number of violating nodes identified following a liveness violation, and provide evidence that the guarantees achieved by our protocol are near-optimal (as a function of $x$ and $f$). Our results provide rigorous foundations for liveness-accountability heuristics such as the "inactivity leaks" employed in Ethereum.

Paperid: 1435, https://arxiv.org/pdf/2504.11924.pdf

Abstract:
Cryptocurrency users increasingly rely on obfuscation techniques such as mixers, swappers, and decentralised or no-KYC exchanges to protect their anonymity. However, at the same time, these services are exploited by criminals to conceal and launder illicit funds. Among obfuscation services, mixers remain one of the most challenging entities to tackle. This is because their owners are often unwilling to cooperate with Law Enforcement Agencies, and technically, they operate as 'black boxes'. To better understand their functionalities, this paper proposes an approach to analyse the operations of mixers by examining their address-transaction graphs and identifying topological similarities to uncover common patterns that can define the mixer's modus operandi. The approach utilises community detection algorithms to extract dense topological structures and clustering algorithms to group similar communities. The analysis is further enriched by incorporating data from external sources related to known Exchanges, in order to understand their role in mixer operations. The approach is applied to dissect the Blender.io mixer activities within the Bitcoin blockchain, revealing: i) consistent structural patterns across address-transaction graphs; ii) that Exchanges play a key role, following a well-established pattern, which raises several concerns about their AML/KYC policies. This paper represents an initial step toward dissecting and understanding the complex nature of mixer operations in cryptocurrency networks and extracting their modus operandi.

Paperid: 1436, https://arxiv.org/pdf/2504.11783.pdf

Abstract:
The increasing deployment of large language models (LLMs) in the cybersecurity domain underscores the need for effective model selection and evaluation. However, traditional evaluation methods often overlook specific cybersecurity knowledge gaps that contribute to performance limitations. To address this, we develop CSEBenchmark, a fine-grained cybersecurity evaluation framework based on 345 knowledge points expected of cybersecurity experts. Drawing from cognitive science, these points are categorized into factual, conceptual, and procedural types, enabling the design of 11,050 tailored multiple-choice questions. We evaluate 12 popular LLMs on CSEBenchmark and find that even the best-performing model achieves only 85.42% overall accuracy, with particular knowledge gaps in the use of specialized tools and uncommon commands. Different LLMs have unique knowledge gaps. Even large models from the same family may perform poorly on knowledge points where smaller models excel. By identifying and addressing specific knowledge gaps in each LLM, we achieve up to an 84% improvement in correcting previously incorrect predictions across three existing benchmarks for two cybersecurity tasks. Furthermore, our assessment of each LLM's knowledge alignment with specific cybersecurity roles reveals that different models align better with different roles, such as GPT-4o for the Google Senior Intelligence Analyst and Deepseek-V3 for the Amazon Privacy Engineer. These findings underscore the importance of aligning LLM selection with the specific knowledge requirements of different cybersecurity roles for optimal performance.

Paperid: 1437, https://arxiv.org/pdf/2504.11208.pdf

Abstract:
An essential step for mounting cache attacks is finding eviction sets, collections of memory locations that contend on cache space. On Intel processors, one of the main challenges for identifying contending addresses is the sliced cache design, where the processor hashes the physical address to determine where in the cache a memory location is stored. While past works have demonstrated that the hash function can be reversed, they also showed that it depends on physical address bits that the adversary does not know. In this work, we make three main contributions to the art of finding eviction sets. We first exploit microarchitectural races to compare memory access times and identify the cache slice to which an address maps. We then use the known hash function to both reduce the error rate in our slice identification method and to reduce the work by extrapolating slice mappings to untested memory addresses. Finally, we show how to propagate information on eviction sets across different page offsets for the hitherto unexplored case of non-linear hash functions. Our contributions allow for entire LLC eviction set generation in 0.7 seconds on the Intel i7-9850H and 1.6 seconds on the i9-10900K, both using non-linear functions. This represents a significant improvement compared to state-of-the-art techniques taking 9x and 10x longer, respectively.

Paperid: 1438, https://arxiv.org/pdf/2504.10016.pdf

Abstract:
Split inference (SI) partitions deep neural networks into distributed sub-models, enabling privacy-preserving collaborative learning. Nevertheless, it remains vulnerable to Data Reconstruction Attacks (DRAs), wherein adversaries exploit exposed smashed data to reconstruct raw inputs. Despite extensive research on adversarial attack-defense games, a shortfall remains in the fundamental analysis of privacy risks. This paper establishes a theoretical framework for privacy leakage quantification using information theory, defining it as the adversary's certainty and deriving both average-case and worst-case error bounds. We introduce Fisher-approximated Shannon information (FSInfo), a novel privacy metric utilizing Fisher Information (FI) for operational privacy leakage computation. We empirically show that our privacy metric correlates well with empirical attacks and investigate some of the factors that affect privacy leakage, namely the data distribution, model size, and overfitting.

Paperid: 1439, https://arxiv.org/pdf/2504.08897.pdf

Abstract:
Recent research has shown the vulnerability of Spiking Neural Networks (SNNs) under adversarial examples that are nearly indistinguishable from clean data in the context of frame-based and event-based information. The majority of these studies are constrained in generating adversarial examples using Backpropagation Through Time (BPTT), a gradient-based method which lacks biological plausibility. In contrast, local learning methods, which relax many of BPTT's constraints, remain under-explored in the context of adversarial attacks. To address this problem, we examine adversarial robustness in SNNs through the framework of four types of training algorithms. We provide an in-depth analysis of the ineffectiveness of gradient-based adversarial attacks to generate adversarial instances in this scenario. To overcome these limitations, we introduce a hybrid adversarial attack paradigm that leverages the transferability of adversarial instances. The proposed hybrid approach demonstrates superior performance, outperforming existing adversarial attack methods. Furthermore, the generalizability of the method is assessed under multi-step adversarial attacks, adversarial attacks in black-box FGSM scenarios, and within the non-spiking domain.

Paperid: 1440, https://arxiv.org/pdf/2504.05832.pdf

Abstract:
Networks built on the IEEE 802.11 standard have experienced rapid growth in the last decade. Their field of application is vast, including smart home applications, Internet of Things (IoT), and short-range high throughput static and dynamic inter-vehicular communication networks. Within such networks, Channel State Information (CSI) provides a detailed view of the state of the communication channel and represents the combined effects of multipath propagation, scattering, phase shift, fading, and power decay. In this work, we investigate the problem of jamming attack detection in static and dynamic vehicular networks. We utilize ESP32-S3 modules to set up a communication network between an Unmanned Aerial Vehicle (UAV) and a Ground Control Station (GCS), to experimentally test the combined effects of a constant jammer on recorded CSI parameters, and the feasibility of jamming detection through CSI analysis in static and dynamic communication scenarios.

Paperid: 1441, https://arxiv.org/pdf/2504.05259.pdf

Abstract:
As LLM agents grow more capable of causing harm autonomously, AI developers will rely on increasingly sophisticated control measures to prevent possibly misaligned agents from causing harm. AI developers could demonstrate that their control measures are sufficient by running control evaluations: testing exercises in which a red team produces agents that try to subvert control measures. To ensure control evaluations accurately capture misalignment risks, the affordances granted to this red team should be adapted to the capability profiles of the agents to be deployed under control measures. In this paper we propose a systematic framework for adapting affordances of red teams to advancing AI capabilities. Rather than assuming that agents will always execute the best attack strategies known to humans, we demonstrate how knowledge of an agents's actual capability profile can inform proportional control evaluations, resulting in more practical and cost-effective control measures. We illustrate our framework by considering a sequence of five fictional models (M1-M5) with progressively advanced capabilities, defining five distinct AI control levels (ACLs). For each ACL, we provide example rules for control evaluation, control measures, and safety cases that could be appropriate. Finally, we show why constructing a compelling AI control safety case for superintelligent LLM agents will require research breakthroughs, highlighting that we might eventually need alternative approaches to mitigating misalignment risk.

Paperid: 1442, https://arxiv.org/pdf/2504.01198.pdf

Abstract:
Zero-knowledge (ZK) protocols enable software developers to provide proofs of their programs' correctness to other parties without revealing the programs themselves. Regular expressions are pervasive in real-world software, and zero-knowledge protocols have been developed in the past for the problem of checking whether an individual string appears in the language of a regular expression, but no existing protocol addresses the more complex PSPACE-complete problem of proving that two regular expressions are equivalent. We introduce Crepe, the first ZK protocol for encoding regular expression equivalence proofs and also the first ZK protocol to target a PSPACE-complete problem. Crepe uses a custom calculus of proof rules based on regular expression derivatives and coinduction, and we introduce a sound and complete algorithm for generating proofs in our format. We test Crepe on a suite of hundreds of regular expression equivalence proofs. Crepe can validate large proofs in only a few seconds each.

Paperid: 1443, https://arxiv.org/pdf/2504.01096.pdf

Abstract:
The Boolean Kalman Filter and associated Boolean Dynamical System Theory have been proposed to study the spread of infection on computer networks. Such models feature a network where attacks propagate through, an intrusion detection system that provides noisy signals of the true state of the network, and the capability of the defender to clean a subset of computers at any time. The Boolean Kalman Filter has been used to solve the optimal estimation problem, by estimating the hidden true state given the attack-defense dynamics and noisy observations. However, this algorithm is intractable because it runs in exponential time and space with respect to the network size. We address this feasibility problem by proposing a mean-field estimation approach, which is inspired by the epidemic modeling literature. Although our approach is heuristic, we prove that our estimator exactly matches the optimal estimator in certain non-trivial cases. We conclude by using simulations to show both the run-time improvement and estimation accuracy of our approach.

Paperid: 1444, https://arxiv.org/pdf/2504.00147.pdf

Abstract:
Embedding inversion, i.e., reconstructing text given its embedding and black-box access to the embedding encoder, is a fundamental problem in both NLP and security. From the NLP perspective, it helps determine how much semantic information about the input is retained in the embedding. From the security perspective, it measures how much information is leaked by vector databases and embedding-based retrieval systems. State-of-the-art methods for embedding inversion, such as vec2text, have high accuracy but require (a) training a separate model for each embedding, and (b) a large number of queries to the corresponding encoder. We design, implement, and evaluate ZSInvert, a zero-shot inversion method based on the recently proposed adversarial decoding technique. ZSInvert is fast, query-efficient, and can be used for any text embedding without training an embedding-specific inversion model. We measure the effectiveness of ZSInvert on several embeddings and demonstrate that it recovers key semantic information about the corresponding texts.

Paperid: 1445, https://arxiv.org/pdf/2503.20247.pdf

Abstract:
Industry 5.0 depends on intelligence, automation, and hyperconnectivity operations for effective and sustainable human-machine collaboration. Pivotal technologies like the Internet of Things (IoT) enable this by facilitating connectivity and data-driven decision-making between cyber-physical devices. As IoT devices are prone to cyberattacks, they can use blockchain to improve transparency in the network and prevent data tampering. However, in some cases, even blockchain networks are vulnerable to Sybil and 51% attacks. This has motivated the development of quantum blockchains that are more resilient to such attacks as they leverage post-quantum cryptographic protocols and secure quantum communication channels. In this work, we develop a quantum binary voting algorithm for the IoT-quantum blockchain frameworks that enables inter-connected devices to reach a consensus on the validity of transactions, even in the presence of potential faults or malicious actors. The correctness of the voting protocol is provided in detail, and the results show that it guarantees the achievement of a consensus securely against all kinds of significant external and internal attacks concerning quantum bit commitment, quantum blockchain, and quantum Byzantine agreement. We also provide an implementation of the voting algorithm with the quantum circuits simulated on the IBM Quantum platform and Simulaqron library.

Paperid: 1446, https://arxiv.org/pdf/2503.17537.pdf

Abstract:
The automotive industry has experienced a drastic transformation in the past few years when vehicles got connected to the internet. Nowadays, connected vehicles require complex architecture and interdependent functionalities, facilitating modern lifestyles and their needs. As a result, automotive software has shifted from just embedded system or SoC (System on Chip) to a more hybrid platform, which includes software for web or mobile applications, cloud, simulation, infotainment, etc. Automatically, the security concerns for automotive software have also developed accordingly. This paper presents a study on automotive vulnerabilities from 2018 to September 2024, i.e., the last seven years, intending to understand and report the noticeable changes in their pattern. 1,663 automotive software vulnerabilities were found to have been reported in the studied time frame. The study reveals the Common Weakness Enumeration (CWE) associated with these vulnerabilities develop over time and how different parts of the automotive ecosystem are exposed to these CWEs. Our study provides the platform to understand the automotive software weaknesses and loopholes and paves the way for identifying the phases in the software development lifecycle where the vulnerability was introduced. Our findings are a step forward to support vulnerability management in automotive software across its entire life cycle.

Paperid: 1447, https://arxiv.org/pdf/2503.17252.pdf

Abstract:
We investigate differentially private estimators for individual parameters within larger parametric models. While generic private estimators exist, the estimators we provide repose on new local notions of estimand stability, and these notions allow procedures that provide private certificates of their own stability. By leveraging these private certificates, we provide computationally and statistical efficient mechanisms that release private statistics that are, at least asymptotically in the sample size, essentially unimprovable: they achieve instance optimal bounds. Additionally, we investigate the practicality of the algorithms both in simulated data and in real-world data from the American Community Survey and US Census, highlighting scenarios in which the new procedures are successful and identifying areas for future work.

Paperid: 1448, https://arxiv.org/pdf/2503.17067.pdf

Abstract:
With the growing interconnection between In-Vehicle Networks (IVNs) and external environments, intelligent vehicles are increasingly vulnerable to sophisticated external network attacks. This paper proposes ATHENA, the first IVN intrusion detection framework that adopts a vehicle-cloud integrated architecture to achieve better security performance for the resource-constrained vehicular environment. Specifically, in the cloud with sufficient resources, ATHENA uses the clustering method of multi-distribution mixture model combined with deep data mining technology to generate the raw Payload Rule Bank of IVN CAN messages, and then improves the rule quality with the help of exploitation on the first-principled physical knowledge of the vehicle system, after which the payload rules are periodically sent to the vehicle terminal. At the vehicle terminal, a simple LSTM component is used to generate the Time Rule Bank representing the long-term time series dependencies and the periodic characteristics of CAN messages, but not for any detection tasks as in traditional usage scenarios, where only the generated time rules are the candidates for further IVN intrusion detection tasks. Based on both the payload and time rules generated from cloud and vehicle terminal, ATHENA can achieve efficient intrusion detection capability by simple rule-base matching operations, rather than using complex black-box reasoning of resource-intensive neural network models, which is in fact only used for rule logic generation phase instead of the actual intrusion detection phase in our framework. Comparative experimental results on the ROAD dataset, which is current the most outstanding real-world in-vehicle CAN dataset covering new instances of sophisticated and stealthy masquerade attacks, demonstrate ATHENA significantly outperforms the state-of-the-art IVN intrusion detection methods in detecting complex attacks.

Paperid: 1449, https://arxiv.org/pdf/2503.10846.pdf

Abstract:
Web Application Firewalls (WAFs) have been introduced as essential and popular security gates that inspect incoming HTTP traffic to filter out malicious requests and provide defenses against a diverse array of web-based threats. Evading WAFs can compromise these defenses, potentially harming Internet users. In recent years, parsing discrepancies have plagued many entities in the communication path; however, their potential impact on WAF evasion and request smuggling remains largely unexplored. In this work, we present an innovative approach to bypassing WAFs by uncovering and exploiting parsing discrepancies through advanced fuzzing techniques. By targeting non-malicious components such as headers and segments of the body and using widely used content-types such as application/json, multipart/form-data, and application/xml, we identified and confirmed 1207 bypasses across 5 well-known WAFs, AWS, Azure, Cloud Armor, Cloudflare, and ModSecurity. To validate our findings, we conducted a study in the wild, revealing that more than 90% of websites accepted both application/x-www-form-urlencoded and multipart/form-data interchangeably, highlighting a significant vulnerability and the broad applicability of our bypass techniques. We have reported these vulnerabilities to the affected parties and received acknowledgments from all, as well as bug bounty rewards from some vendors. Further, to mitigate these vulnerabilities, we introduce HTTP-Normalizer, a robust proxy tool designed to rigorously validate HTTP requests against current RFC standards. Our results demonstrate its effectiveness in normalizing or blocking all bypass attempts presented in this work.

Paperid: 1450, https://arxiv.org/pdf/2503.09538.pdf

Abstract:
We study equilibrium finding in polymatrix games under differential privacy constraints. To start, we show that high accuracy and asymptotically vanishing differential privacy budget (as the number of players goes to infinity) cannot be achieved simultaneously under either of the two settings: (i) We seek to establish equilibrium approximation guarantees in terms of Euclidean distance to the equilibrium set, and (ii) the adversary has access to all communication channels. Then, assuming the adversary has access to a constant number of communication channels, we develop a novel distributed algorithm that recovers strategies with simultaneously vanishing Nash gap (in expected utility, also referred to as exploitability and privacy budget as the number of players increases.

Paperid: 1451, https://arxiv.org/pdf/2503.09381.pdf

Abstract:
We propose a protocol based on mechanism design theory and encrypted control to solve average consensus problems among rational and strategic agents while preserving their privacy. The proposed protocol provides a mechanism that incentivizes the agents to faithfully implement the intended behavior specified in the protocol. Furthermore, the protocol runs over encrypted data using homomorphic encryption and secret sharing to protect the privacy of agents. We also analyze the security of the proposed protocol using a simulation paradigm in secure multi-party computation. The proposed protocol demonstrates that mechanism design and encrypted control can complement each other to achieve security under rational adversaries.

Paperid: 1452, https://arxiv.org/pdf/2503.09334.pdf

Abstract:
The integration of large language models (LLMs) into cyber security applications presents both opportunities and critical safety risks. We introduce CyberLLMInstruct, a dataset of 54,928 pseudo-malicious instruction-response pairs spanning cyber security tasks including malware analysis, phishing simulations, and zero-day vulnerabilities. Our comprehensive evaluation using seven open-source LLMs reveals a critical trade-off: while fine-tuning improves cyber security task performance (achieving up to 92.50% accuracy on CyberMetric), it severely compromises safety resilience across all tested models and attack vectors (e.g., Llama 3.1 8B's security score against prompt injection drops from 0.95 to 0.15). The dataset incorporates diverse sources including CTF challenges, academic papers, industry reports, and CVE databases to ensure comprehensive coverage of cyber security domains. Our findings highlight the unique challenges of securing LLMs in adversarial domains and establish the critical need for developing fine-tuning methodologies that balance performance gains with safety preservation in security-sensitive domains.

Paperid: 1453, https://arxiv.org/pdf/2503.05528.pdf

Abstract:
Two-source extractors aim to extract randomness from two independent sources of weak randomness. It has been shown that any two-source extractor which is secure against classical side information remains secure against quantum side information. Unfortunately, this generic reduction comes with a significant penalty to the performance of the extractor. In this paper, we show that the two-source extractor from Dodis et al. performs equally well against quantum side information as in the classical realm, surpassing previously known results about this extractor. Additionally, we derive a new quantum XOR-Lemma which allows us to re-derive the generic reduction but also allows for improvements for a large class of extractors.

Paperid: 1454, https://arxiv.org/pdf/2503.04850.pdf

Abstract:
We identify the slow liquidity drain (SLID) scam, an insidious and highly profitable threat to decentralized finance (DeFi), posing a large-scale, persistent, and growing risk to the ecosystem. Unlike traditional scams such as rug pulls or honeypots (USENIX Sec'19, USENIX Sec'23), SLID gradually siphons funds from liquidity pools over extended periods, making detection significantly more challenging. In this paper, we conducted the first large-scale empirical analysis of 319,166 liquidity pools across six major decentralized exchanges (DEXs) since 2018. We identified 3,117 SLID affected liquidity pools, resulting in cumulative losses of more than US$103 million. We propose a rule-based heuristic and an enhanced machine learning model for early detection. Our machine learning model achieves a detection speed 4.77 times faster than the heuristic while maintaining 95% accuracy. Our study establishes a foundation for protecting DeFi investors at an early stage and promoting transparency in the DeFi ecosystem.

Paperid: 1455, https://arxiv.org/pdf/2503.02959.pdf

Abstract:
Graph unlearning aims to remove a subset of graph entities (i.e. nodes and edges) from a graph neural network (GNN) trained on the graph. Unlike machine unlearning for models trained on Euclidean-structured data, effectively unlearning a model trained on non-Euclidean-structured data, such as graphs, is challenging because graph entities exhibit mutual dependencies. Existing works utilize graph partitioning, influence function, or additional layers to achieve graph unlearning. However, none of them can achieve high scalability and effectiveness without additional constraints. In this paper, we achieve more effective graph unlearning by utilizing the embedding space. The primary training objective of a GNN is to generate proper embeddings for each node that encapsulates both structural information and node feature representations. Thus, directly optimizing the embedding space can effectively remove the target nodes' information from the model. Based on this intuition, we propose node-level contrastive unlearning (Node-CUL). It removes the influence of the target nodes (unlearning nodes) by contrasting the embeddings of remaining nodes and neighbors of unlearning nodes. Through iterative updates, the embeddings of unlearning nodes gradually become similar to those of unseen nodes, effectively removing the learned information without directly incorporating unseen data. In addition, we introduce a neighborhood reconstruction method that optimizes the embeddings of the neighbors in order to remove influence of unlearning nodes to maintain the utility of the GNN model. Experiments on various graph data and models show that our Node-CUL achieves the best unlearn efficacy and enhanced model utility with requiring comparable computing resources with existing frameworks.

Paperid: 1456, https://arxiv.org/pdf/2503.02402.pdf

Abstract:
Kernel rootkits provide adversaries with permanent high-privileged access to compromised systems and are often a key element of sophisticated attack chains. At the same time, they enable stealthy operation and are thus difficult to detect. Thereby, they inject code into kernel functions to appear invisible to users, for example, by manipulating file enumerations. Existing detection approaches are insufficient, because they rely on signatures that are unable to detect novel rootkits or require domain knowledge about the rootkits to be detected. To overcome this challenge, our approach leverages the fact that runtimes of kernel functions targeted by rootkits increase when additional code is executed. The framework outlined in this paper injects probes into the kernel to measure time stamps of functions within relevant system calls, computes distributions of function execution times, and uses statistical tests to detect time shifts. The evaluation of our open-source implementation on publicly available data sets indicates high detection accuracy with an F1 score of 98.7\% across five scenarios with varying system states.

Paperid: 1457, https://arxiv.org/pdf/2503.00917.pdf

Abstract:
Machine unlearning, where users can request the deletion of a forget dataset, is becoming increasingly important because of numerous privacy regulations. Initial works on ``exact'' unlearning (e.g., retraining) incur large computational overheads. However, while computationally inexpensive, ``approximate'' methods have fallen short of reaching the effectiveness of exact unlearning: models produced fail to obtain comparable accuracy and prediction confidence on both the forget and test (i.e., unseen) dataset. Exploiting this observation, we propose a new unlearning method, Adversarial Machine UNlearning (AMUN), that outperforms prior state-of-the-art (SOTA) methods for image classification. AMUN lowers the confidence of the model on the forget samples by fine-tuning the model on their corresponding adversarial examples. Adversarial examples naturally belong to the distribution imposed by the model on the input space; fine-tuning the model on the adversarial examples closest to the corresponding forget samples (a) localizes the changes to the decision boundary of the model around each forget sample and (b) avoids drastic changes to the global behavior of the model, thereby preserving the model's accuracy on test samples. Using AMUN for unlearning a random $10\%$ of CIFAR-10 samples, we observe that even SOTA membership inference attacks cannot do better than random guessing.

Paperid: 1458, https://arxiv.org/pdf/2503.00719.pdf

Abstract:
In classical cryptography, certified deletion is simply impossible. Since classical information can be copied any number of times easily. In quantum cryptography, certified deletion is possible because of theorems of quantum mechanics such as the quantum no-clone theorem, quantum superposition etc. In this paper, we show the PKE-CD (Public Key Encryption with Certified Deletion) scheme constructed in by Bartusek and Khurana in CRYPTO 2023 lack an important security property, which is important in practical applications. Then we show how to enhance this property, and construct a concrete scheme with this property. And we also discuss the relations between PKE-CD and other quantum cryptographic schemes such as quantum seal, quantum bit commitment etc.

Paperid: 1459, https://arxiv.org/pdf/2503.00615.pdf

Abstract:
In this paper, we focus on addressing the challenges of detecting malicious attacks in networks by designing an advanced Explainable Intrusion Detection System (xIDS). The existing machine learning and deep learning approaches have invisible limitations, such as potential biases in predictions, a lack of interpretability, and the risk of overfitting to training data. These issues can create doubt about their usefulness, transparency, and a decrease in trust among stakeholders. To overcome these challenges, we propose an ensemble learning technique called "EnsembleGuard." This approach uses the predicted outputs of multiple models, including tree-based methods (LightGBM, GBM, Bagging, XGBoost, CatBoost) and deep learning models such as LSTM (long short-term memory) and GRU (gated recurrent unit), to maintain a balance and achieve trustworthy results. Our work is unique because it combines both tree-based and deep learning models to design an interpretable and explainable meta-model through model distillation. By considering the predictions of all individual models, our meta-model effectively addresses key challenges and ensures both explainable and reliable results. We evaluate our model using well-known datasets, including UNSW-NB15, NSL-KDD, and CIC-IDS-2017, to assess its reliability against various types of attacks. During analysis, we found that our model outperforms both tree-based models and other comparative approaches in different attack scenarios.

Paperid: 1460, https://arxiv.org/pdf/2503.00066.pdf

Abstract:
Active Learning (AL) is a machine learning technique where the model selectively queries the most informative data points for labeling by human experts. Integrating AL with crowdsourcing leverages crowd diversity to enhance data labeling but introduces challenges in consensus and privacy. This poster presents CrowdAL, a blockchain-empowered crowd AL system designed to address these challenges. CrowdAL integrates blockchain for transparency and a tamper-proof incentive mechanism, using smart contracts to evaluate crowd workers' performance and aggregate labeling results, and employs zero-knowledge proofs to protect worker privacy.

Paperid: 1461, https://arxiv.org/pdf/2502.20427.pdf

Abstract:
Deepfakes - manipulated or forged audio and video media - pose significant security risks to individuals, organizations, and society at large. To address these challenges, machine learning-based classifiers are commonly employed to detect deepfake content. In this paper, we assess the robustness of such classifiers through a systematic penetration testing methodology, which we introduce as DeePen. Our approach operates without prior knowledge of or access to the target deepfake detection models. Instead, it leverages a set of carefully selected signal processing modifications - referred to as attacks - to evaluate model vulnerabilities. Using DeePen, we analyze both real-world production systems and publicly available academic model checkpoints, demonstrating that all tested systems exhibit weaknesses and can be reliably deceived by simple manipulations such as time-stretching or echo addition. Furthermore, our findings reveal that while some attacks can be mitigated by retraining detection systems with knowledge of the specific attack, others remain persistently effective. We release all associated code.

Paperid: 1462, https://arxiv.org/pdf/2502.18948.pdf

Abstract:
Active techniques have been introduced to give better detectability performance for cyber-attack diagnosis in cyber-physical systems (CPS). In this paper, switching multiplicative watermarking is considered, whereby we propose an optimal design strategy to define switching filter parameters. Optimality is evaluated exploiting the so-called output-to-output gain of the closed loop system, including some supposed attack dynamics. A worst-case scenario of a matched covert attack is assumed, presuming that an attacker with full knowledge of the closed-loop system injects a stealthy attack of bounded energy. Our algorithm, given watermark filter parameters at some time instant, provides optimal next-step parameters. Analysis of the algorithm is given, demonstrating its features, and demonstrating that through initialization of certain parameters outside of the algorithm, the parameters of the multiplicative watermarking can be randomized. Simulation shows how, by adopting our method for parameter design, the attacker's impact on performance diminishes.

Paperid: 1463, https://arxiv.org/pdf/2502.18528.pdf

Abstract:
We introduce ARACNE, a fully autonomous LLM-based pentesting agent tailored for SSH services that can execute commands on real Linux shell systems. Introduces a new agent architecture with multi-LLM model support. Experiments show that ARACNE can reach a 60\% success rate against the autonomous defender ShelLM and a 57.58\% success rate against the Over The Wire Bandit CTF challenges, improving over the state-of-the-art. When winning, the average number of actions taken by the agent to accomplish the goals was less than 5. The results show that the use of multi-LLM is a promising approach to increase accuracy in the actions.

Paperid: 1464, https://arxiv.org/pdf/2502.15098.pdf

Abstract:
Internet-of-Things (IoT) devices are vulnerable to malware and require new mitigation techniques due to their limited resources. To that end, previous research has used periodic Remote Attestation (RA) or Traffic Analysis (TA) to detect malware in IoT devices. However, RA is expensive, and TA only raises suspicion without confirming malware presence. To solve this, we design MADEA, the first system that blends RA and TA to offer a comprehensive approach to malware detection for the IoT ecosystem. TA builds profiles of expected packet traces during benign operations of each device and then uses them to detect malware from network traffic in real-time. RA confirms the presence or absence of malware on the device. MADEA achieves 100% true positive rate. It also outperforms other approaches with 160x faster detection time. Finally, without MADEA, effective periodic RA can consume at least ~14x the amount of energy that a device needs in one hour.

Paperid: 1465, https://arxiv.org/pdf/2502.08921.pdf

Abstract:
The task of text-to-image generation has achieved tremendous success in practice, with emerging concept generation models capable of producing highly personalized and customized content. Fervor for concept generation is increasing rapidly among users, and platforms for concept sharing have sprung up. The concept owners may upload malicious concepts and disguise them with non-malicious text descriptions and example images to deceive users into downloading and generating malicious content. The platform needs a quick method to determine whether a concept is malicious to prevent the spread of malicious concepts. However, simply relying on concept image generation to judge whether a concept is malicious requires time and computational resources. Especially, as the number of concepts uploaded and downloaded on the platform continues to increase, this approach becomes impractical and poses a risk of generating malicious content. In this paper, we propose Concept QuickLook, the first systematic work to incorporate malicious concept detection into research, which performs detection based solely on concept files without generating any images. We define malicious concepts and design two work modes for detection: concept matching and fuzzy detection. Extensive experiments demonstrate that the proposed Concept QuickLook can detect malicious concepts and demonstrate practicality in concept sharing platforms. We also design robustness experiments to further validate the effectiveness of the solution. We hope this work can initiate malicious concept detection tasks and provide some inspiration.

Paperid: 1466, https://arxiv.org/pdf/2502.05509.pdf

Abstract:
As machine learning models become integral to security-sensitive applications, concerns over data leakage from adversarial attacks continue to rise. Model Inversion (MI) attacks pose a significant privacy threat by enabling adversaries to reconstruct training data from model outputs. While MI attacks on Artificial Neural Networks (ANNs) have been widely studied, Spiking Neural Networks (SNNs) remain largely unexplored in this context. Due to their event-driven and discrete computations, SNNs introduce fundamental differences in information processing that may offer inherent resistance to such attacks. A critical yet underexplored aspect of this threat lies in black-box settings, where attackers operate through queries without direct access to model parameters or gradients-representing a more realistic adversarial scenario in deployed systems. This work presents the first study of black-box MI attacks on SNNs. We adapt a generative adversarial MI framework to the spiking domain by incorporating rate-based encoding for input transformation and decoding mechanisms for output interpretation. Our results show that SNNs exhibit significantly greater resistance to MI attacks than ANNs, as demonstrated by degraded reconstructions, increased instability in attack convergence, and overall reduced attack effectiveness across multiple evaluation metrics. Further analysis suggests that the discrete and temporally distributed nature of SNN decision boundaries disrupts surrogate modeling, limiting the attacker's ability to approximate the target model.

Paperid: 1467, https://arxiv.org/pdf/2502.05223.pdf

Abstract:
Jailbreak attacks exploit specific prompts to bypass LLM safeguards, causing the LLM to generate harmful, inappropriate, and misaligned content. Current jailbreaking methods rely heavily on carefully designed system prompts and numerous queries to achieve a single successful attack, which is costly and impractical for large-scale red-teaming. To address this challenge, we propose to distill the knowledge of an ensemble of SOTA attackers into a single open-source model, called Knowledge-Distilled Attacker (KDA), which is finetuned to automatically generate coherent and diverse attack prompts without the need for meticulous system prompt engineering. Compared to existing attackers, KDA achieves higher attack success rates and greater cost-time efficiency when targeting multiple SOTA open-source and commercial black-box LLMs. Furthermore, we conducted a quantitative diversity analysis of prompts generated by baseline methods and KDA, identifying diverse and ensemble attacks as key factors behind KDA's effectiveness and efficiency.

Paperid: 1468, https://arxiv.org/pdf/2502.04953.pdf

Abstract:
The exploit or the Proof of Concept of the vulnerability plays an important role in developing superior vulnerability repair techniques, as it can be used as an oracle to verify the correctness of the patches generated by the tools. However, the vulnerability exploits are often unavailable and require time and expert knowledge to craft. Obtaining them from the exploit generation techniques is another potential solution. The goal of this survey is to aid the researchers and practitioners in understanding the existing techniques for exploit generation through the analysis of their characteristics and their usability in practice. We identify a list of exploit generation techniques from literature and group them into four categories: automated exploit generation, security testing, fuzzing, and other techniques. Most of the techniques focus on the memory-based vulnerabilities in C/C++ programs and web-based injection vulnerabilities in PHP and Java applications. We found only a few studies that publicly provided usable tools associated with their techniques.

Paperid: 1469, https://arxiv.org/pdf/2502.01822.pdf

Abstract:
LLM agents will likely communicate on behalf of users with other entity-representing agents on tasks involving long-horizon plans with interdependent goals. Current work neglects these agentic networks and their challenges. We identify required properties for agent communication: proactivity, adaptability, privacy (sharing only task-necessary information), and security (preserving integrity and utility against selfish entities). After demonstrating communication vulnerabilities, we propose a practical design and protocol inspired by network security principles. Our framework automatically derives task-specific rules from prior conversations to build firewalls. These firewalls construct a closed language that is completely controlled by the developer. They transform any personal data to the allowed degree of permissibility entailed by the task. Both operations are completely quarantined from external attackers, disabling the potential for prompt injections, jailbreaks, or manipulation. By incorporating rules learned from their previous mistakes, agents rewrite their instructions and self-correct during communication. Evaluations on diverse attacks demonstrate our framework significantly reduces privacy and security vulnerabilities while allowing adaptability.

Paperid: 1470, https://arxiv.org/pdf/2502.01798.pdf

Abstract:
Terms and conditions for online shopping websites often contain terms that can have significant financial consequences for customers. Despite their impact, there is currently no comprehensive understanding of the types and potential risks associated with unfavorable financial terms. Furthermore, there are no publicly available detection systems or datasets to systematically identify or mitigate these terms. In this paper, we take the first steps toward solving this problem with three key contributions. \textit{First}, we introduce \textit{TermMiner}, an automated data collection and topic modeling pipeline to understand the landscape of unfavorable financial terms. \textit{Second}, we create \textit{ShopTC-100K}, a dataset of terms and conditions from shopping websites in the Tranco top 100K list, comprising 1.8 million terms from 8,251 websites. Consequently, we develop a taxonomy of 22 types from 4 categories of unfavorable financial terms -- spanning purchase, post-purchase, account termination, and legal aspects. \textit{Third}, we build \textit{TermLens}, an automated detector that uses Large Language Models (LLMs) to identify unfavorable financial terms. Fine-tuned on an annotated dataset, \textit{TermLens} achieves an F1 score of 94.6\% and a false positive rate of 2.3\% using GPT-4o. When applied to shopping websites from the Tranco top 100K, we find that 42.06\% of these sites contain at least one unfavorable financial term, with such terms being more prevalent on less popular websites. Case studies further highlight the financial risks and customer dissatisfaction associated with unfavorable financial terms, as well as the limitations of existing ecosystem defenses.

Paperid: 1471, https://arxiv.org/pdf/2502.01020.pdf

Abstract:
Since 2020, GitGuardian has been detecting checked-in hard-coded secrets in GitHub repositories. During 2020-2023, GitGuardian has observed an upward annual trend and a four-fold increase in hard-coded secrets, with 12.8 million exposed in 2023. However, removing all the secrets from software artifacts is not feasible due to time constraints and technical challenges. Additionally, the security risks of the secrets are not equal, protecting assets ranging from obsolete databases to sensitive medical data. Thus, secret removal should be prioritized by security risk reduction, which existing secret detection tools do not support. The goal of this research is to aid software practitioners in prioritizing secrets removal efforts through our security risk-based tool. We present RiskHarvester, a risk-based tool to compute a security risk score based on the value of the asset and ease of attack on a database. We calculated the value of asset by identifying the sensitive data categories present in a database from the database keywords in the source code. We utilized data flow analysis, SQL, and ORM parsing to identify the database keywords. To calculate the ease of attack, we utilized passive network analysis to retrieve the database host information. To evaluate RiskHarvester, we curated RiskBench, a benchmark of 1,791 database secret-asset pairs with sensitive data categories and host information manually retrieved from 188 GitHub repositories. RiskHarvester demonstrates precision of (95%) and recall (90%) in detecting database keywords for the value of asset and precision of (96%) and recall (94%) in detecting valid hosts for ease of attack. Finally, we conducted a survey (52 respondents) to understand whether developers prioritize secret removal based on security risk score. We found that 86% of the developers prioritized the secrets for removal with descending security risk scores.

Paperid: 1472, https://arxiv.org/pdf/2501.17089.pdf

Abstract:
Like any digital certificate, Verifiable Credentials (VCs) require a way to revoke them in case of an error or key compromise. Existing solutions for VC revocation, most prominently Bitstring Status List, are not viable for many use cases because they may leak the issuer's activity, which in turn leaks internal business metrics. For instance, staff fluctuation through the revocation of employee IDs. We identify the protection of issuer activity as a key gap and propose a formal definition for a corresponding characteristic of a revocation mechanism. Then, we introduce CRSet, a non-interactive mechanism that trades some space efficiency to reach these privacy characteristics. For that, we provide a proof sketch. Issuers periodically encode revocation data and publish it via Ethereum blob-carrying transactions, ensuring secure and private availability. Relying Parties (RPs) can download it to perform revocation checks locally. Sticking to a non-interactive design also makes adoption easier because it requires no changes to wallet agents and exchange protocols. We also implement and empirically evaluate CRSet, finding its real-world behavior to match expectations. One Ethereum blob fits revocation data for about 170k VCs.

Paperid: 1473, https://arxiv.org/pdf/2501.16784.pdf

Abstract:
The rapidly expanding Internet of Things (IoT) landscape is shifting toward cloudless architectures, removing reliance on centralized cloud services but exposing devices directly to the internet and increasing their vulnerability to cyberattacks. Our research revealed an unexpected pattern of substantial Tor network traffic targeting cloudless IoT devices. suggesting that attackers are using Tor to anonymously exploit undisclosed vulnerabilities (possibly obtained from underground markets). To delve deeper into this phenomenon, we developed TORCHLIGHT, a tool designed to detect both known and unknown threats targeting cloudless IoT devices by analyzing Tor traffic. TORCHLIGHT filters traffic via specific IP patterns, strategically deploys virtual private server (VPS) nodes for cost-effective detection, and uses a chain-of-thought (CoT) process with large language models (LLMs) for accurate threat identification. Our results are significant: for the first time, we have demonstrated that attackers are indeed using Tor to conceal their identities while targeting cloudless IoT devices. Over a period of 12 months, TORCHLIGHT analyzed 26 TB of traffic, revealing 45 vulnerabilities, including 29 zero-day exploits with 25 CVE-IDs assigned (5 CRITICAL, 3 HIGH, 16 MEDIUM, and 1 LOW) and an estimated value of approximately $312,000. These vulnerabilities affect around 12.71 million devices across 148 countries, exposing them to severe risks such as information disclosure, authentication bypass, and arbitrary command execution. The findings have attracted significant attention, sparking widespread discussion in cybersecurity circles, reaching the top 25 on Hacker News, and generating over 190,000 views.

Paperid: 1474, https://arxiv.org/pdf/2501.15911.pdf

Abstract:
Recently, reproducibility has become a cornerstone in the security and privacy research community, including artifact evaluations and even a new symposium topic. However, Web measurements lack tools that can be reused across many measurement tasks without modification, while being robust to circumvention, and accurate across the wide range of behaviors in the Web. As a result, most measurement studies use custom tools and varied archival formats, each of unknown correctness and significant limitations, systematically affecting the research's accuracy and reproducibility. To address these limitations, we present WebREC, a Web measurement tool that is, compared against the current state-of-the-art, accurate (i.e., correctly measures and attributes events not possible with existing tools), general (i.e., reusable without modification for a broad range of measurement tasks), and comprehensive (i.e., handling events from all relevant browser behaviors). We also present .web, an archival format for the accurate and reproducible measurement of a wide range of website behaviors. We empirically evaluate WebREC's accuracy by replicating well-known Web measurement studies and showing that WebREC's results more accurately match our baseline. We then assess if WebREC and .web succeed as general-purpose tools, which could be used to accomplish many Web measurement tasks without modification. We find that this is so: 70% of papers discussed in a 2024 web crawling SoK paper could be conducted using WebREC as is, and a larger number (48%) could be leveraged against .web archives without requiring any new crawling.

Paperid: 1475, https://arxiv.org/pdf/2501.14328.pdf

Abstract:
To address the issue of powerful row hammer (RH) attacks, our study involved an extensive analysis of the prevalent attack patterns in the field. We discovered a strong correlation between the timing and density of the active-to-active command period, ${tRC}$, and the likelihood of RH attacks. In this paper, we introduce MARC, an innovative ARFM-driven RH mitigation IP that significantly reinforces existing RH mitigation IPs. MARC dynamically adjusts the frequency of RFM in response to the severity of the RH attack environment, offering a tailored security solution that not only detects the threats but also adapts to varying threat levels. MARC's detection mechanism has demonstrated remarkable efficiency, identifying over 99\% of attack patterns. Moreover, MARC is designed as a compact hardware module, facilitating tight integration either on the memory controller-side or DRAM-side within the memory system. It only occupies a negligible hardware area of 3363~\textit{$Î¼m^2$}. By activating ARFM based on MARC's detection, the additional energy overhead is also negligible in normal workloads. We conduct experiments to compare the highest row count throughout the patterns, defined as max exposure, between the vanilla RH mitigation IPs and the MARC-enhanced versions of the same IPs, focusing on both DRAM-side and memory controller-side. On the DRAM-side, MARC + probabilistic scheme and MARC + counter-based tracking scheme achieve 8.1$\times$ and 1.5$\times$ improvement in max exposure ratio compared to the vanilla IPs, respectively. On the memory controller-side, the MARC + PARA and MARC + Graphene achieve 50$\times$ and 5.7$\times$ improvement in max exposure ratio compared to the vanilla IPs, respectively. MARC ensures optimal security without sacrificing system performance, making MARC a pioneering solution in the realm of RH attack mitigation.

Paperid: 1476, https://arxiv.org/pdf/2501.13799.pdf

Abstract:
Resource-constrained devices such as wireless sensors and Internet of Things (IoT) devices have become ubiquitous in our digital ecosystem. These devices generate and handle a major part of our digital data. However, due to the impending threat of quantum computers on our existing public-key cryptographic schemes and the limited resources available on IoT devices, it is important to design lightweight post-quantum cryptographic (PQC) schemes suitable for these devices. In this work, we explored the design space of learning with error-based PQC schemes to design a lightweight key-encapsulation mechanism (KEM) suitable for resource-constrained devices. We have done a scrupulous and extensive analysis and evaluation of different design elements, such as polynomial size, field modulus structure, reduction algorithm, and secret and error distribution of an LWE-based KEM. Our explorations led to the proposal of a lightweight PQC-KEM, Rudraksh, without compromising security. Our scheme provides security against chosen ciphertext attacks (CCA) with more than 100 bits of Core-SVP post-quantum security and belongs to the NIST-level-I security category (provide security at least as much as AES-128). We have also shown how ASCON can be used for lightweight pseudo-random number generation and hash function in the lattice-based KEMs instead of the widely used Keccak for lightweight design. Our FPGA results show that Rudraksh currently requires the least area among the PQC KEMs of similar security. Our implementation of Rudraksh provides a $\sim3\times$ improvement in terms of the area requirement compared to the state-of-the-art area-optimized implementation of Kyber, can operate at $63\%$-$76\%$ higher frequency with respect to high-throughput Kyber, and improves time-area-product $\sim2\times$ compared to the state-of-the-art compact implementation of Kyber published in HPEC 2022.

Paperid: 1477, https://arxiv.org/pdf/2501.13782.pdf

Abstract:
Android malware presents a persistent threat to users' privacy and data integrity. To combat this, researchers have proposed machine learning-based (ML-based) Android malware detection (AMD) systems. However, adversarial Android malware attacks compromise the detection integrity of the ML-based AMD systems, raising significant concerns. Existing defenses against adversarial Android malware provide protections against feature space attacks which generate adversarial feature vectors only, leaving protection against realistic threats from problem space attacks which generate real adversarial malware an open problem. In this paper, we address this gap by proposing ADD, a practical adversarial Android malware defense framework designed as a plug-in to enhance the adversarial robustness of the ML-based AMD systems against problem space attacks. Our extensive evaluation across various ML-based AMD systems demonstrates that ADD is effective against state-of-the-art problem space adversarial Android malware attacks. Additionally, ADD shows the defense effectiveness in enhancing the adversarial robustness of real-world antivirus solutions.

Paperid: 1478, https://arxiv.org/pdf/2501.12275.pdf

Abstract:
Advances in self-supervised learning (SSL) for machine vision have improved representation robustness and model performance, giving rise to pre-trained backbones like \emph{ResNet} and \emph{ViT} models tuned with SSL methods such as \emph{SimCLR}. Due to the computational and data demands of pre-training, the utilization of such backbones becomes a strenuous necessity. However, employing these backbones may inherit vulnerabilities to adversarial attacks. While adversarial robustness has been studied under \emph{white-box} and \emph{black-box} settings, the robustness of models tuned on pre-trained backbones remains largely unexplored. Additionally, the role of tuning meta-information in mitigating exploitation risks is unclear. This work systematically evaluates the adversarial robustness of such models across $20,000$ combinations of tuning meta-information, including fine-tuning techniques, backbone families, datasets, and attack types. We propose using proxy models to transfer attacks, simulating varying levels of target knowledge by fine-tuning these proxies with diverse configurations. Our findings reveal that proxy-based attacks approach the effectiveness of \emph{white-box} methods, even with minimal tuning knowledge. We also introduce a naive "backbone attack," leveraging only the backbone to generate adversarial samples, which outperforms \emph{black-box} attacks and rivals \emph{white-box} methods, highlighting critical risks in model-sharing practices. Finally, our ablations reveal how increasing tuning meta-information impacts attack transferability, measuring each meta-information combination.

Paperid: 1479, https://arxiv.org/pdf/2501.11685.pdf

Abstract:
In cybersecurity, Intrusion Detection Systems (IDS) serve as a vital defensive layer against adversarial threats. Accurate benchmarking is critical to evaluate and improve IDS effectiveness, yet traditional methodologies face limitations due to their reliance on previously known attack signatures and lack of creativity of automated tests. This paper introduces a novel approach to evaluating IDS through Capture the Flag (CTF) events, specifically designed to uncover weaknesses within IDS. CTFs, known for engaging a diverse community in tackling complex security challenges, offer a dynamic platform for this purpose. Our research investigates the effectiveness of using tailored CTF challenges to identify weaknesses in IDS by integrating them into live CTF competitions. This approach leverages the creativity and technical skills of the CTF community, enhancing both the benchmarking process and the participants' practical security skills. We present a methodology that supports the development of IDS-specific challenges, a scoring system that fosters learning and engagement, and the insights of running such a challenge in a real Jeopardy-style CTF event. Our findings highlight the potential of CTFs as a tool for IDS evaluation, demonstrating the ability to effectively expose vulnerabilities while also providing insights into necessary improvements for future implementations.

Paperid: 1480, https://arxiv.org/pdf/2501.10985.pdf

Abstract:
Graph neural networks (GNNs) have exhibited superior performance in various classification tasks on graph-structured data. However, they encounter the potential vulnerability from the link stealing attacks, which can infer the presence of a link between two nodes via measuring the similarity of its incident nodes' prediction vectors produced by a GNN model. Such attacks pose severe security and privacy threats to the training graph used in GNN models. In this work, we propose a novel solution, called Graph Link Disguise (GRID), to defend against link stealing attacks with the formal guarantee of GNN model utility for retaining prediction accuracy. The key idea of GRID is to add carefully crafted noises to the nodes' prediction vectors for disguising adjacent nodes as n-hop indirect neighboring nodes. We take into account the graph topology and select only a subset of nodes (called core nodes) covering all links for adding noises, which can avert the noises offset and have the further advantages of reducing both the distortion loss and the computation cost. Our crafted noises can ensure 1) the noisy prediction vectors of any two adjacent nodes have their similarity level like that of two non-adjacent nodes and 2) the model prediction is unchanged to ensure zero utility loss. Extensive experiments on five datasets are conducted to show the effectiveness of our proposed GRID solution against different representative link-stealing attacks under transductive settings and inductive settings respectively, as well as two influence-based attacks. Meanwhile, it achieves a much better privacy-utility trade-off than existing methods when extended to GNNs.

Paperid: 1481, https://arxiv.org/pdf/2501.06953.pdf

Abstract:
The advancement of AI models, especially those powered by deep learning, faces significant challenges in data-sensitive industries like healthcare and finance due to the distributed and private nature of data. Federated Learning (FL) and Secure Federated Learning (SFL) enable collaborative model training without data sharing, enhancing privacy by encrypting shared intermediate results. However, SFL currently lacks effective Byzantine robustness, a critical property that ensures model performance remains intact even when some participants act maliciously. Existing Byzantine-robust methods in FL are incompatible with SFL due to the inefficiency and limitations of encryption operations in handling complex aggregation calculations. This creates a significant gap in secure and robust model training. To address this gap, we propose ByzSFL, a novel SFL system that achieves Byzantine-robust secure aggregation with high efficiency. Our approach offloads aggregation weight calculations to individual parties and introduces a practical zero-knowledge proof (ZKP) protocol toolkit. This toolkit supports widely used operators for calculating aggregation weights, ensuring correct computations without compromising data privacy. Not only does this method maintain aggregation integrity, but it also significantly boosts computational efficiency, making ByzSFL approximately 100 times faster than existing solutions. Furthermore, our method aligns with open-source AI trends, enabling plaintext publication of the final model without additional information leakage, thereby enhancing the practicality and robustness of SFL in real-world applications.

Paperid: 1482, https://arxiv.org/pdf/2501.04985.pdf

Abstract:
The increasing threat of SMS spam, driven by evolving adversarial techniques and concept drift, calls for more robust and adaptive detection methods. In this paper, we evaluate the potential of large language models (LLMs), both open-source and commercial, for SMS spam detection, comparing their performance across zero-shot, few-shot, fine-tuning, and chain-of-thought prompting approaches. Using a comprehensive dataset of SMS messages, we assess the spam detection capabilities of prominent LLMs such as GPT-4, DeepSeek, LLAMA-2, and Mixtral. Our findings reveal that while zero-shot learning provides convenience, it is unreliable for effective spam detection. Few-shot learning, particularly with carefully selected examples, improves detection but exhibits variability across models. Fine-tuning emerges as the most effective strategy, with Mixtral achieving 98.6% accuracy and a balanced false positive and false negative rate below 2%, meeting the criteria for robust spam detection. Furthermore, we explore the resilience of these models to adversarial attacks, finding that fine-tuning significantly enhances robustness against both perceptible and imperceptible manipulations. Lastly, we investigate the impact of concept drift and demonstrate that fine-tuned LLMs, especially when combined with few-shot learning, can mitigate its effects, maintaining high performance even on evolving spam datasets. This study highlights the importance of fine-tuning and tailored learning strategies to deploy LLMs effectively for real-world SMS spam detection

Paperid: 1483, https://arxiv.org/pdf/2506.21134.pdf

Abstract:
Kubernetes has emerged as the de facto standard for container orchestration. Unfortunately, its increasing popularity has also made it an attractive target for malicious actors. Despite extensive research on securing Kubernetes, little attention has been paid to the impact of network configuration on the security of application deployments. This paper addresses this gap by conducting a comprehensive analysis of network misconfigurations in a Kubernetes cluster with specific reference to lateral movement. Accordingly, we carried out an extensive evaluation of 287 open-source applications belonging to six different organizations, ranging from IT companies and public entities to non-profits. As a result, we identified 634 misconfigurations, well beyond what could be found by solutions in the state of the art. We responsibly disclosed our findings to the concerned organizations and engaged in a discussion to assess their severity. As of now, misconfigurations affecting more than thirty applications have been fixed with the mitigations we proposed.

Paperid: 1484, https://arxiv.org/pdf/2506.20931.pdf

Abstract:
Federated Learning (FL) has emerged as a leading paradigm for privacy-preserving distributed machine learning, yet the distributed nature of FL introduces unique security challenges, notably the threat of backdoor attacks. Existing backdoor strategies predominantly rely on end-to-end label supervision, which, despite their efficacy, often results in detectable feature disentanglement and limited persistence. In this work, we propose a novel and stealthy backdoor attack framework, named SPA, which fundamentally departs from traditional approaches by leveraging feature-space alignment rather than direct trigger-label association. Specifically, SPA reduces representational distances between backdoor trigger features and target class features, enabling the global model to misclassify trigger-embedded inputs with high stealth and persistence. We further introduce an adaptive, adversarial trigger optimization mechanism, utilizing boundary-search in the feature space to enhance attack longevity and effectiveness, even against defensive FL scenarios and non-IID data distributions. Extensive experiments on various FL benchmarks demonstrate that SPA consistently achieves high attack success rates with minimal impact on model utility, maintains robustness under challenging participation and data heterogeneity conditions, and exhibits persistent backdoor effects far exceeding those of conventional techniques. Our results call urgent attention to the evolving sophistication of backdoor threats in FL and emphasize the pressing need for advanced, feature-level defense techniques.

Paperid: 1485, https://arxiv.org/pdf/2506.20000.pdf

Abstract:
We propose Guardian-FC, a novel two-layer framework for privacy preserving federated computing that unifies safety enforcement across diverse privacy preserving mechanisms, including cryptographic back-ends like fully homomorphic encryption (FHE) and multiparty computation (MPC), as well as statistical techniques such as differential privacy (DP). Guardian-FC decouples guard-rails from privacy mechanisms by executing plug-ins (modular computation units), written in a backend-neutral, domain-specific language (DSL) designed specifically for federated computing workflows and interchangeable Execution Providers (EPs), which implement DSL operations for various privacy back-ends. An Agentic-AI control plane enforces a finite-state safety loop through signed telemetry and commands, ensuring consistent risk management and auditability. The manifest-centric design supports fail-fast job admission and seamless extensibility to new privacy back-ends. We present qualitative scenarios illustrating backend-agnostic safety and a formal model foundation for verification. Finally, we outline a research agenda inviting the community to advance adaptive guard-rail tuning, multi-backend composition, DSL specification development, implementation, and compiler extensibility alongside human-override usability.

Paperid: 1486, https://arxiv.org/pdf/2506.19943.pdf

Abstract:
The Domain Name System (DNS) plays a foundational role in Internet infrastructure, yet its core protocols remain vulnerable to compromise by quantum adversaries. As cryptographically relevant quantum computers become a realistic threat, ensuring DNS confidentiality, authenticity, and integrity in the post-quantum era is imperative. In this paper, we present a comprehensive system-level study of post-quantum DNS security across three widely deployed mechanisms: DNSSEC, DNS-over-TLS (DoT), and DNS-over-HTTPS (DoH). We propose Post-Quantum Cryptographic (PQC)-DNS, a unified framework for benchmarking DNS security under legacy, post-quantum, and hybrid cryptographic configurations. Our implementation leverages the Open Quantum Safe (OQS) libraries and integrates lattice- and hash-based primitives into BIND9 and TLS 1.3 stacks. We formalize performance and threat models and analyze the impact of post-quantum key encapsulation and digital signatures on end-to-end DNS resolution. Experimental results on a containerized testbed reveal that lattice-based primitives such as Module-Lattice-Based Key-Encapsulation Mechanism (MLKEM) and Falcon offer practical latency and resource profiles, while hash-based schemes like SPHINCS+ significantly increase message sizes and processing overhead. We also examine security implications including downgrade risks, fragmentation vulnerabilities, and susceptibility to denial-of-service amplification. Our findings inform practical guidance for deploying quantum-resilient DNS and contribute to the broader effort of securing core Internet protocols for the post-quantum future.

Paperid: 1487, https://arxiv.org/pdf/2506.18462.pdf

Abstract:
Alert prioritisation (AP) is crucial for security operations centres (SOCs) to manage the overwhelming volume of alerts and ensure timely detection and response to genuine threats, while minimising alert fatigue. Although predictive AI can process large alert volumes and identify known patterns, it struggles with novel and evolving scenarios that demand contextual understanding and nuanced judgement. A promising solution is Human-AI teaming (HAT), which combines human expertise with AI's computational capabilities. Learning to Defer (L2D) operationalises HAT by enabling AI to "defer" uncertain or unfamiliar cases to human experts. However, traditional L2D models rely on static deferral policies that do not evolve with experience, limiting their ability to learn from human feedback and adapt over time. To overcome this, we introduce Learning to Defer with Human Feedback (L2DHF), an adaptive deferral framework that leverages Deep Reinforcement Learning from Human Feedback (DRLHF) to optimise deferral decisions. By dynamically incorporating human feedback, L2DHF continuously improves AP accuracy and reduces unnecessary deferrals, enhancing SOC effectiveness and easing analyst workload. Experiments on two widely used benchmark datasets, UNSW-NB15 and CICIDS2017, demonstrate that L2DHF significantly outperforms baseline models. Notably, it achieves 13-16% higher AP accuracy for critical alerts on UNSW-NB15 and 60-67% on CICIDS2017. It also reduces misprioritisations, for example, by 98% for high-category alerts on CICIDS2017. Moreover, L2DHF decreases deferrals, for example, by 37% on UNSW-NB15, directly reducing analyst workload. These gains are achieved with efficient execution, underscoring L2DHF's practicality for real-world SOC deployment.

Paperid: 1488, https://arxiv.org/pdf/2506.18150.pdf

Abstract:
Fully Homomorphic Encryption (FHE) is an encryption scheme that not only encrypts data but also allows for computations to be applied directly on the encrypted data. While computationally expensive, FHE can enable privacy-preserving neural inference in the client-server setting: a client encrypts their input with FHE and sends it to an untrusted server. The server then runs neural inference on the encrypted data and returns the encrypted results. The client decrypts the output locally, keeping both the input and result private from the server. Private inference has focused on networks with dense inputs such as image classification, and less attention has been given to networks with sparse features. Unlike dense inputs, sparse features require efficient encrypted lookup operations into large embedding tables, which present computational and memory constraints for FHE. In this paper, we explore the challenges and opportunities when applying FHE to Deep Learning Recommendation Models (DLRM) from both a compiler and systems perspective. DLRMs utilize conventional MLPs for dense features and embedding tables to map sparse, categorical features to dense vector representations. We develop novel methods for performing compressed embedding lookups in order to reduce FHE computational costs while keeping the underlying model performant. Our embedding lookup improves upon a state-of-the-art approach by $77 \times$. Furthermore, we present an efficient multi-embedding packing strategy that enables us to perform a 44 million parameter embedding lookup under FHE. Finally, we integrate our solutions into the open-source Orion framework and present HE-LRM, an end-to-end encrypted DLRM. We evaluate HE-LRM on UCI (health prediction) and Criteo (click prediction), demonstrating that with the right compression and packing strategies, encrypted inference for recommendation systems is practical.

Paperid: 1489, https://arxiv.org/pdf/2506.17512.pdf

Abstract:
Security analysts struggle to quickly and efficiently query and correlate log data due to the heterogeneity and lack of structure in real-world logs. Existing AI-based parsers focus on learning syntactic log templates but lack the semantic interpretation needed for querying. Directly querying large language models on raw logs is impractical at scale and vulnerable to prompt injection attacks. In this paper, we introduce Matryoshka, the first end-to-end system leveraging LLMs to automatically generate semantically-aware structured log parsers. Matryoshka combines a novel syntactic parser-employing precise regular expressions rather than wildcards-with a completely new semantic parsing layer that clusters variables and maps them into a queryable, contextually meaningful schema. This approach provides analysts with queryable and semantically rich data representations, facilitating rapid and precise log querying without the traditional burden of manual parser construction. Additionally, Matryoshka can map the newly created fields to recognized attributes within the Open Cybersecurity Schema Framework (OCSF), enabling interoperability. We evaluate Matryoshka on a newly curated real-world log benchmark, introducing novel metrics to assess how consistently fields are named and mapped across logs. Matryoshka's syntactic parser outperforms prior works, and the semantic layer achieves an F1 score of 0.95 on realistic security queries. Although mapping fields to the extensive OCSF taxonomy remains challenging, Matryoshka significantly reduces manual effort by automatically extracting and organizing valuable fields, moving us closer to fully automated, AI-driven log analytics.

Paperid: 1490, https://arxiv.org/pdf/2506.17315.pdf

Abstract:
ChatGPT has quickly advanced from simple natural language processing to tackling more sophisticated and specialized tasks. Drawing inspiration from the success of mobile app ecosystems, OpenAI allows developers to create applications that interact with third-party services, known as GPTs. GPTs can choose to leverage third-party services to integrate with specialized APIs for domain-specific applications. However, the way these disclose privacy setting information limits accessibility and analysis, making it challenging to systematically evaluate the data privacy implications of third-party integrate to GPTs. In order to support academic research on the integration of third-party services in GPTs, we introduce GPTs-ThirdSpy, an automated framework designed to extract privacy settings of GPTs. GPTs-ThirdSpy provides academic researchers with real-time, reliable metadata on third-party services used by GPTs, enabling in-depth analysis of their integration, compliance, and potential security risks. By systematically collecting and structuring this data, GPTs-ThirdSpy facilitates large-scale research on the transparency and regulatory challenges associated with the GPT app ecosystem.

Paperid: 1491, https://arxiv.org/pdf/2506.17269.pdf

Abstract:
The increasing proliferation of digital and mobile devices equipped with cameras, microphones, GPS, and other privacy invasive components has raised significant concerns for businesses operating in sensitive or policy restricted environments. Current solutions rely on passive enforcement, such as signage or verbal instructions, which are largely ineffective. This paper presents Digital Privacy Everywhere (DPE), a comprehensive and scalable system designed to actively enforce custom privacy policies for digital devices within predefined physical boundaries. The DPE architecture includes a centralized management console, field verification units (FVUs), enforcement modules for mobile devices (EMMDs), and an External Geo Ownership Service (EGOS). These components collaboratively detect, configure, and enforce privacy settings such as disabling cameras, microphones, or radios across various premises like theaters, hospitals, financial institutions, and educational facilities. The system ensures privacy compliance in real time while maintaining a seamless user experience and operational scalability across geographies.

Paperid: 1492, https://arxiv.org/pdf/2506.15975.pdf

Abstract:
Digital watermarking is a promising solution for mitigating some of the risks arising from the misuse of automatically generated text. These approaches either embed non-specific watermarks to allow for the detection of any text generated by a particular sampler, or embed specific keys that allow the identification of the LLM user. However, simultaneously using the same embedding for both detection and user identification leads to a false detection problem, whereby, as user capacity grows, unwatermarked text is increasingly likely to be falsely detected as watermarked. Through theoretical analysis, we identify the underlying causes of this phenomenon. Building on these insights, we propose Dual Watermarking which jointly encodes detection and identification watermarks into generated text, significantly reducing false positives while maintaining high detection accuracy. Our experimental results validate our theoretical findings and demonstrate the effectiveness of our approach.

Paperid: 1493, https://arxiv.org/pdf/2506.15918.pdf

Abstract:
Decomposing DRAM address mappings into component-level functions is critical for understanding memory behavior and enabling precise RowHammer attacks, yet existing reverse-engineering methods fall short. We introduce novel timing-based techniques leveraging DRAM refresh intervals and consecutive access latencies to infer component-specific functions. Based on this, we present Sudoku, the first software-based tool to automatically decompose full DRAM address mappings into channel, rank, bank group, and bank functions while identifying row and column bits. We validate Sudoku's effectiveness, successfully decomposing mappings on recent Intel and AMD processors.

Paperid: 1494, https://arxiv.org/pdf/2506.14582.pdf

Abstract:
We show the security risk associated with using machine learning classifiers in United States election tabulators. The central classification task in election tabulation is deciding whether a mark does or does not appear on a bubble associated to an alternative in a contest on the ballot. Barretto et al. (E-Vote-ID 2021) reported that convolutional neural networks are a viable option in this field, as they outperform simple feature-based classifiers. Our contributions to election security can be divided into four parts. To demonstrate and analyze the hypothetical vulnerability of machine learning models on election tabulators, we first introduce four new ballot datasets. Second, we train and test a variety of different models on our new datasets. These models include support vector machines, convolutional neural networks (a basic CNN, VGG and ResNet), and vision transformers (Twins and CaiT). Third, using our new datasets and trained models, we demonstrate that traditional white box attacks are ineffective in the voting domain due to gradient masking. Our analyses further reveal that gradient masking is a product of numerical instability. We use a modified difference of logits ratio loss to overcome this issue (Croce and Hein, ICML 2020). Fourth, in the physical world, we conduct attacks with the adversarial examples generated using our new methods. In traditional adversarial machine learning, a high (50% or greater) attack success rate is ideal. However, for certain elections, even a 5% attack success rate can flip the outcome of a race. We show such an impact is possible in the physical domain. We thoroughly discuss attack realism, and the challenges and practicality associated with printing and scanning ballot adversarial examples.

Paperid: 1495, https://arxiv.org/pdf/2506.13360.pdf

Abstract:
Bitcoin is a representative decentralized currency system. For the security of Bitcoin, fairness in the distribution of mining rewards plays a crucial role in preventing the concentration of computational power in a few miners. Here, fairness refers to the distribution of block rewards in proportion to contributed computational resources. If miners with greater computational resources receive disproportionately higher rewards, i.e., if the Rich Get Richer (TRGR) phenomenon holds in Bitcoin, it indicates a threat to the system's decentralization. This study analyzes TRGR in Bitcoin by focusing on unintentional blockchain forks, an inherent phenomenon in Bitcoin. Previous research has failed to provide generalizable insights due to the low precision of their analytical methods. In contrast, we avoid this problem by adopting a method whose analytical precision has been empirically validated. The primary contribution of this work is a theoretical analysis that clearly demonstrates TRGR in Bitcoin under the assumption of fixed block propagation delays between different miners. More specifically, we show that the mining profit rate depends linearly on the proportion of hashrate. Furthermore, we examine the robustness of this result from multiple perspectives in scenarios where block propagation delays between different miners are not necessarily fixed.

Paperid: 1496, https://arxiv.org/pdf/2506.13052.pdf

Abstract:
Static and hard-coded layer-two network identifiers are well known to present security vulnerabilities and endanger user privacy. In this work, we introduce a new privacy attack against Wi-Fi access points listed on secondhand marketplaces. Specifically, we demonstrate the ability to remotely gather a large quantity of layer-two Wi-Fi identifiers by programmatically querying the eBay marketplace and applying state-of-the-art computer vision techniques to extract IEEE 802.11 BSSIDs from the seller's posted images of the hardware. By leveraging data from a global Wi-Fi Positioning System (WPS) that geolocates BSSIDs, we obtain the physical locations of these devices both pre- and post-sale. In addition to validating the degree to which a seller's location matches the location of the device, we examine cases of device movement -- once the device is sold and then subsequently re-used in a new environment. Our work highlights a previously unrecognized privacy vulnerability and suggests, yet again, the strong need to protect layer-two network identifiers.

Paperid: 1497, https://arxiv.org/pdf/2506.12344.pdf

Abstract:
Gaussian blur is widely used to blur human faces in sensitive photos before the photos are posted on the Internet. However, it is unclear to what extent the blurred faces can be restored and used to re-identify the person, especially under a high-blurring setting. In this paper, we explore this question by developing a deblurring method called Revelio. The key intuition is to leverage a generative model's memorization effect and approximate the inverse function of Gaussian blur for face restoration. Compared with existing methods, we design the deblurring process to be identity-preserving. It uses a conditional Diffusion model for preliminary face restoration and then uses an identity retrieval model to retrieve related images to further enhance fidelity. We evaluate Revelio with large public face image datasets and show that it can effectively restore blurred faces, especially under a high-blurring setting. It has a re-identification accuracy of 95.9%, outperforming existing solutions. The result suggests that Gaussian blur should not be used for face anonymization purposes. We also demonstrate the robustness of this method against mismatched Gaussian kernel sizes and functions, and test preliminary countermeasures and adaptive attacks to inspire future work.

Paperid: 1498, https://arxiv.org/pdf/2506.11970.pdf

Abstract:
JEDEC has introduced the Per Row Activation Counting (PRAC) framework for DDR5 and future DRAMs to enable precise counting of DRAM row activations using per-row activation counts. While recent PRAC implementations enable holistic mitigation of Rowhammer attacks, they impose slowdowns of up to 10% due to the increased DRAM timings for performing a read-modify-write of the counter. Alternatively, recent work, Chronus, addresses these slowdowns, but incurs energy overheads due to the additional DRAM activations for counters. In this paper, we propose CnC-PRAC, a PRAC implementation that addresses both performance and energy overheads. Unlike prior works focusing on caching activation counts to reduce their overheads, our key idea is to reorder and coalesce accesses to activation counts located in the same physical row. Our design achieves this by decoupling counter access from the critical path of data accesses. This enables optimizations such as buffering counter read-modify-write requests and coalescing requests to the same row. Together, these enable a reduction in row activations for counter accesses by almost 75%-83% compared to state-of-the-art solutions like Chronus and enable a PRAC implementation with negligible slowdown and a minimal dynamic energy overhead of 0.84%-1% compared to insecure DDR5 DRAM.

Paperid: 1499, https://arxiv.org/pdf/2506.10722.pdf

Abstract:
Deep Neural Networks (DNNs) are vulnerable to backdoor attacks, where attackers implant hidden triggers during training to maliciously control model behavior. Topological Evolution Dynamics (TED) has recently emerged as a powerful tool for detecting backdoor attacks in DNNs. However, TED can be vulnerable to backdoor attacks that adaptively distort topological representation distributions across network layers. To address this limitation, we propose TED-LaST (Topological Evolution Dynamics against Laundry, Slow release, and Target mapping attack strategies), a novel defense strategy that enhances TED's robustness against adaptive attacks. TED-LaST introduces two key innovations: label-supervised dynamics tracking and adaptive layer emphasis. These enhancements enable the identification of stealthy threats that evade traditional TED-based defenses, even in cases of inseparability in topological space and subtle topological perturbations. We review and classify data poisoning tricks in state-of-the-art adaptive attacks and propose enhanced adaptive attack with target mapping, which can dynamically shift malicious tasks and fully leverage the stealthiness that adaptive attacks possess. Our comprehensive experiments on multiple datasets (CIFAR-10, GTSRB, and ImageNet100) and model architectures (ResNet20, ResNet101) show that TED-LaST effectively counteracts sophisticated backdoors like Adap-Blend, Adapt-Patch, and the proposed enhanced adaptive attack. TED-LaST sets a new benchmark for robust backdoor detection, substantially enhancing DNN security against evolving threats.

Paperid: 1500, https://arxiv.org/pdf/2506.10020.pdf

Abstract:
Safely aligning large language models (LLMs) often demands extensive human-labeled preference data, a process that's both costly and time-consuming. While synthetic data offers a promising alternative, current methods frequently rely on complex iterative prompting or auxiliary models. To address this, we introduce Refusal-Aware Adaptive Injection (RAAI), a straightforward, training-free, and model-agnostic framework that repurposes LLM attack techniques. RAAI works by detecting internal refusal signals and adaptively injecting predefined phrases to elicit harmful, yet fluent, completions. Our experiments show RAAI effectively jailbreaks LLMs, increasing the harmful response rate from a baseline of 2.15% to up to 61.04% on average across four benchmarks. Crucially, fine-tuning LLMs with the synthetic data generated by RAAI improves model robustness against harmful prompts while preserving general capabilities on standard tasks like MMLU and ARC. This work highlights how LLM attack methodologies can be reframed as practical tools for scalable and controllable safety alignment.

Paperid: 1501, https://arxiv.org/pdf/2506.08781.pdf

Abstract:
The growing deployment of resource-limited Internet of Things (IoT) devices and their expanding attack surfaces demand efficient and scalable security mechanisms. System logs are vital for the trust and auditability of IoT, and offloading their maintenance to a Cold Storage-as-a-Service (Cold-STaaS) enhances cost-effectiveness and reliability. However, existing cryptographic logging solutions either burden low-end IoT devices with heavy computation or create verification delays and storage inefficiencies at Cold-STaaS. There is a pressing need for cryptographic primitives that balance security, performance, and scalability across IoT-Cold-STaaS continuum. In this work, we present Parallel Optimal Signatures for Secure Logging (POSLO), a novel digital signature framework that, to our knowledge, is the first to offer constant-size signatures and public keys, near-optimal signing efficiency, and tunable fine-to-coarse-grained verification for log auditing. POSLO achieves these properties through efficient randomness management, flexible aggregation, and multiple algorithmic instantiations. It also introduces a GPU-accelerated batch verification framework that exploits homomorphic signature aggregation to deliver ultra-fast performance. For example, POSLO can verify 231 log entries per second on a mid-range consumer GPU (NVIDIA GTX 3060) while being significantly more compact than state-of-the-art. POSLO also preserves signer-side efficiency, offering substantial battery savings for IoT devices, and is well-suited for the IoT-Cold-STaaS ecosystem.

Paperid: 1502, https://arxiv.org/pdf/2506.07010.pdf

Abstract:
Formal methods can be used for verifying security protocols, but their adoption can be hindered by the complexity of translating natural language protocol specifications into formal representations. In this paper, we introduce ModelForge, a novel tool that automates the translation of protocol specifications for the Cryptographic Protocol Shapes Analyzer (CPSA). By leveraging advances in Natural Language Processing (NLP) and Generative AI (GenAI), ModelForge processes protocol specifications and generates a CPSA protocol definition. This approach reduces the manual effort required, making formal analysis more accessible. We evaluate ModelForge by fine-tuning a large language model (LLM) to generate protocol definitions for CPSA, comparing its performance with other popular LLMs. The results from our evaluation show that ModelForge consistently produces quality outputs, excelling in syntactic accuracy, though some refinement is needed to handle certain protocol details. The contributions of this work include the architecture and proof of concept for a translating tool designed to simplify the adoption of formal methods in the development of security protocols.

Paperid: 1503, https://arxiv.org/pdf/2506.06975.pdf

Abstract:
As API access becomes a primary interface to large language models (LLMs), users often interact with black-box systems that offer little transparency into the deployed model. To reduce costs or maliciously alter model behaviors, API providers may discreetly serve quantized or fine-tuned variants, which can degrade performance and compromise safety. Detecting such substitutions is difficult, as users lack access to model weights and, in most cases, even output logits. To tackle this problem, we propose a rank-based uniformity test that can verify the behavioral equality of a black-box LLM to a locally deployed authentic model. Our method is accurate, query-efficient, and avoids detectable query patterns, making it robust to adversarial providers that reroute or mix responses upon the detection of testing attempts. We evaluate the approach across diverse threat scenarios, including quantization, harmful fine-tuning, jailbreak prompts, and full model substitution, showing that it consistently achieves superior statistical power over prior methods under constrained query budgets.

Paperid: 1504, https://arxiv.org/pdf/2506.06486.pdf

Abstract:
With the growing adoption of data privacy regulations, the ability to erase private or copyrighted information from trained models has become a crucial requirement. Traditional unlearning methods often assume access to the complete training dataset, which is unrealistic in scenarios where the source data is no longer available. To address this challenge, we propose a certified unlearning framework that enables effective data removal \final{without access to the original training data samples}. Our approach utilizes a surrogate dataset that approximates the statistical properties of the source data, allowing for controlled noise scaling based on the statistical distance between the two. \updated{While our theoretical guarantees assume knowledge of the exact statistical distance, practical implementations typically approximate this distance, resulting in potentially weaker but still meaningful privacy guarantees.} This ensures strong guarantees on the model's behavior post-unlearning while maintaining its overall utility. We establish theoretical bounds, introduce practical noise calibration techniques, and validate our method through extensive experiments on both synthetic and real-world datasets. The results demonstrate the effectiveness and reliability of our approach in privacy-sensitive settings.

Paperid: 1505, https://arxiv.org/pdf/2506.06407.pdf

Abstract:
Synthetic time series generated by diffusion models enable sharing privacy-sensitive datasets, such as patients' functional MRI records. Key criteria for synthetic data include high data utility and traceability to verify the data source. Recent watermarking methods embed in homogeneous latent spaces, but state-of-the-art time series generators operate in real space, making latent-based watermarking incompatible. This creates the challenge of watermarking directly in real space while handling feature heterogeneity and temporal dependencies. We propose TimeWak, the first watermarking algorithm for multivariate time series diffusion models. To handle temporal dependence and spatial heterogeneity, TimeWak embeds a temporal chained-hashing watermark directly within the real temporal-feature space. The other unique feature is the $Îµ$-exact inversion, which addresses the non-uniform reconstruction error distribution across features from inverting the diffusion process to detect watermarks. We derive the error bound of inverting multivariate time series and further maintain high watermark detectability. We extensively evaluate TimeWak on its impact on synthetic data quality, watermark detectability, and robustness under various post-editing attacks, against 5 datasets and baselines of different temporal lengths. Our results show that TimeWak achieves improvements of 61.96% in context-FID score, and 8.44% in correlational scores against the state-of-the-art baseline, while remaining consistently detectable.

Paperid: 1506, https://arxiv.org/pdf/2506.06108.pdf

Abstract:
Synthetic data is often positioned as a solution to replace sensitive fixed-size datasets with a source of unlimited matching data, freed from privacy concerns. There has been much progress in synthetic data generation over the last decade, leveraging corresponding advances in machine learning and data analytics. In this survey, we cover the key developments and the main concepts in tabular synthetic data generation, including paradigms based on probabilistic graphical models and on deep learning. We provide background and motivation, before giving a technical deep-dive into the methodologies. We also address the limitations of synthetic data, by studying attacks that seek to retrieve information about the original sensitive data. Finally, we present extensions and open problems in this area.

Paperid: 1507, https://arxiv.org/pdf/2506.05421.pdf

Abstract:
The proliferation of propaganda on mobile platforms raises critical concerns around detection accuracy and user privacy. To address this, we propose TRIDENT - a three-tier propaganda detection model implementing transformers, adversarial learning, and differential privacy which integrates syntactic obfuscation and label perturbation to mitigate privacy leakage while maintaining propaganda detection accuracy. TRIDENT leverages multilingual back-translation to introduce semantic variance, character-level noise, and entity obfuscation for differential privacy enforcement, and combines these techniques into a unified defense mechanism. Using a binary propaganda classification dataset, baseline transformer models (BERT, GPT-2) we achieved F1 scores of 0.89 and 0.90. Applying TRIDENT's third-tier defense yields a reduced but effective cumulative F1 of 0.83, demonstrating strong privacy protection across mobile ML deployments with minimal degradation.

Paperid: 1508, https://arxiv.org/pdf/2506.05101.pdf

Abstract:
Synthetic data inherits the differential privacy guarantees of the model used to generate it. Additionally, synthetic data may benefit from privacy amplification when the generative model is kept hidden. While empirical studies suggest this phenomenon, a rigorous theoretical understanding is still lacking. In this paper, we investigate this question through the well-understood framework of linear regression. First, we establish negative results showing that if an adversary controls the seed of the generative model, a single synthetic data point can leak as much information as releasing the model itself. Conversely, we show that when synthetic data is generated from random inputs, releasing a limited number of synthetic data points amplifies privacy beyond the model's inherent guarantees. We believe our findings in linear regression can serve as a foundation for deriving more general bounds in the future.

Paperid: 1509, https://arxiv.org/pdf/2506.04909.pdf

Abstract:
The honesty of large language models (LLMs) is a critical alignment challenge, especially as advanced systems with chain-of-thought (CoT) reasoning may strategically deceive humans. Unlike traditional honesty issues on LLMs, which could be possibly explained as some kind of hallucination, those models' explicit thought paths enable us to study strategic deception--goal-driven, intentional misinformation where reasoning contradicts outputs. Using representation engineering, we systematically induce, detect, and control such deception in CoT-enabled LLMs, extracting "deception vectors" via Linear Artificial Tomography (LAT) for 89% detection accuracy. Through activation steering, we achieve a 40% success rate in eliciting context-appropriate deception without explicit prompts, unveiling the specific honesty-related issue of reasoning models and providing tools for trustworthy AI alignment.

Paperid: 1510, https://arxiv.org/pdf/2506.02277.pdf

Abstract:
In this work, we show that parallel repetition of public-coin interactive arguments reduces the soundness error at an exponential rate even in the post-quantum setting. Moreover, we generalize this result to hold for threshold verifiers, where the parallel repeated verifier accepts if and only if at least $t$ of the executions are accepted (for some threshold $t$). Prior to this work, these results were known only when the cheating prover was assumed to be classical. We also prove a similar result for three-message private-coin arguments. Previously, Bostanci, Qian, Spooner, and Yuen (STOC 2024) proved such a parallel repetition result in the more general setting of quantum protocols, where the verifier and communication may be quantum. We consider only protocols where the verifier is classical, but obtain a simplified analysis, and for the more general setting of threshold verifiers.

Paperid: 1511, https://arxiv.org/pdf/2506.01011.pdf

Abstract:
Autoregressive (AR) image generation models have gained increasing attention for their breakthroughs in synthesis quality, highlighting the need for robust watermarking to prevent misuse. However, existing in-generation watermarking techniques are primarily designed for diffusion models, where watermarks are embedded within diffusion latent states. This design poses significant challenges for direct adaptation to AR models, which generate images sequentially through token prediction. Moreover, diffusion-based regeneration attacks can effectively erase such watermarks by perturbing diffusion latent states. To address these challenges, we propose Lexical Bias Watermarking (LBW), a novel framework designed for AR models that resists regeneration attacks. LBW embeds watermarks directly into token maps by biasing token selection toward a predefined green list during generation. This approach ensures seamless integration with existing AR models and extends naturally to post-hoc watermarking. To increase the security against white-box attacks, instead of using a single green list, the green list for each image is randomly sampled from a pool of green lists. Watermark detection is performed via quantization and statistical analysis of the token distribution. Extensive experiments demonstrate that LBW achieves superior watermark robustness, particularly in resisting regeneration attacks.

Paperid: 1512, https://arxiv.org/pdf/2506.00377.pdf

Abstract:
The widespread adoption of the Internet of Things (IoT) has raised a new challenge for developers since it is prone to known and unknown cyberattacks due to its heterogeneity, flexibility, and close connectivity. To defend against such security breaches, researchers have focused on building sophisticated intrusion detection systems (IDSs) using machine learning (ML) techniques. Although these algorithms notably improve detection performance, they require excessive computing power and resources, which are crucial issues in IoT networks considering the recent trends of decentralized data processing and computing systems. Consequently, many optimization techniques have been incorporated with these ML models. Specifically, a special category of optimizer adopted from the behavior of living creatures and different aspects of natural phenomena, known as metaheuristic algorithms, has been a central focus in recent years and brought about remarkable results. Considering this vital significance, we present a comprehensive and systematic review of various applications of metaheuristics algorithms in developing a machine learning-based IDS, especially for IoT. A significant contribution of this study is the discovery of hidden correlations between these optimization techniques and machine learning models integrated with state-of-the-art IoT-IDSs. In addition, the effectiveness of these metaheuristic algorithms in different applications, such as feature selection, parameter or hyperparameter tuning, and hybrid usages are separately analyzed. Moreover, a taxonomy of existing IoT-IDSs is proposed. Furthermore, we investigate several critical issues related to such integration. Our extensive exploration ends with a discussion of promising optimization algorithms and technologies that can enhance the efficiency of IoT-IDSs.

Paperid: 1513, https://arxiv.org/pdf/2505.21008.pdf

Abstract:
Cryptocurrencies and central bank digital currencies (CBDCs) are reshaping the monetary landscape, offering transparency and efficiency while raising critical concerns about user privacy and regulatory compliance. This survey provides a comprehensive and technically grounded overview of privacy-preserving digital currencies, covering both cryptocurrencies and CBDCs. We propose a taxonomy of privacy goals -- including anonymity, confidentiality, unlinkability, and auditability -- and map them to underlying cryptographic primitives, protocol mechanisms, and system architectures. Unlike previous surveys, our work adopts a design-oriented perspective, linking high-level privacy objectives to concrete implementations. We also trace the evolution of privacy-preserving currencies through three generations, highlighting shifts from basic anonymity guarantees toward more nuanced privacy-accountability trade-offs. Finally, we identify open challenges at the intersection of cryptography, distributed systems, and policy definition, which motivate further investigation into the primitives and design of digital currencies that balance real-world privacy and auditability needs.

Paperid: 1514, https://arxiv.org/pdf/2505.15051.pdf

Abstract:
With the rapid development of blockchain technology, various blockchain systems are exhibiting vitality and potential. As a representative of Blockchain 3.0, the EOS blockchain has been regarded as a strong competitor to Ethereum. Nevertheless, compared with Bitcoin and Ethereum, academic research and in-depth analyses of EOS remain scarce. To address this gap, this study conducts a comprehensive investigation of the EOS blockchain from five key dimensions: system architecture, decentralization, performance, smart contracts, and behavioral security. The architectural analysis focuses on six core components of the EOS system, detailing their functionalities and operational workflows. The decentralization and performance evaluations, based on data from the XBlock data-sharing platform, reveal several critical issues: low account activity, limited participation in the supernode election process, minimal variation in the set of block producers, and a substantial gap between actual throughput and the claimed million-level performance. Five types of contract vulnerabilities are identified in the smart contract dimension, and four mainstream vulnerability detection platforms are introduced and comparatively analyzed. In terms of behavioral security, four real-world attacks targeting the structural characteristics of EOS are summarized. This study contributes to the ongoing development of the EOS blockchain and provides valuable insights for enhancing the security and regulatory mechanisms of blockchain ecosystems.

Paperid: 1515, https://arxiv.org/pdf/2505.14565.pdf

Abstract:
Total Value Locked (TVL) aims to measure the aggregate value of cryptoassets deposited in Decentralized Finance (DeFi) protocols. Although blockchain data is public, the way TVL is computed is not well understood. In practice, its calculation on major TVL aggregators relies on self-reports from community members and lacks standardization, making it difficult to verify published figures independently. We thus conduct a systematic study on 939 DeFi projects deployed in Ethereum. We study the methodologies used to compute TVL, examine factors hindering verifiability, and ultimately propose standardization attempts in the field. We find that 10.5% of the protocols rely on external servers; 68 methods alternative to standard balance queries exist, although their use decreased over time; and 240 equal balance queries are repeated on multiple protocols. These findings indicate limits to verifiability and transparency. We thus introduce ``verifiable Total Value Locked'' (vTVL), a metric measuring the TVL that can be verified relying solely on on-chain data and standard balance queries. A case study on 400 protocols shows that our estimations align with published figures for 46.5% of protocols. Informed by these findings, we discuss design guidelines that could facilitate a more verifiable, standardized, and explainable TVL computation.

Paperid: 1516, https://arxiv.org/pdf/2505.14316.pdf

Abstract:
Although large language models (LLMs) have achieved remarkable advancements, their security remains a pressing concern. One major threat is jailbreak attacks, where adversarial prompts bypass model safeguards to generate harmful or objectionable content. Researchers study jailbreak attacks to understand security and robustness of LLMs. However, existing jailbreak attack methods face two main challenges: (1) an excessive number of iterative queries, and (2) poor generalization across models. In addition, recent jailbreak evaluation datasets focus primarily on question-answering scenarios, lacking attention to text generation tasks that require accurate regeneration of toxic content. To tackle these challenges, we propose two contributions: (1) ICE, a novel black-box jailbreak method that employs Intent Concealment and divErsion to effectively circumvent security constraints. ICE achieves high attack success rates (ASR) with a single query, significantly improving efficiency and transferability across different models. (2) BiSceneEval, a comprehensive dataset designed for assessing LLM robustness in question-answering and text-generation tasks. Experimental results demonstrate that ICE outperforms existing jailbreak techniques, revealing critical vulnerabilities in current defense mechanisms. Our findings underscore the necessity of a hybrid security strategy that integrates predefined security mechanisms with real-time semantic decomposition to enhance the security of LLMs.

Paperid: 1517, https://arxiv.org/pdf/2505.13861.pdf

Abstract:
The growing utilization of Internet of Medical Things (IoMT) devices, including smartwatches and wearable medical devices, has facilitated real-time health monitoring and data analysis to enhance healthcare outcomes. These gadgets necessitate improved security measures to safeguard sensitive health data while tackling scalability issues in real-time settings. The proposed system, hChain 4.0, employs a permissioned blockchain to provide a secure and scalable data infrastructure designed to fulfill these needs. This stands in contrast to conventional systems, which are vulnerable to security flaws or rely on public blockchains, constrained by scalability and expense. The proposed approach introduces a high-privacy method in which health data are encrypted using the Advanced Encryption Standard (AES) for time-efficient encryption, combined with Partial Homomorphic Encryption (PHE) to enable secure computations on encrypted data, thereby enhancing privacy. Moreover, it utilizes private channels that enable isolated communication and ledger between stakeholders, ensuring robust privacy while supporting collaborative operations. The proposed framework enables anonymized health data sharing for medical research by pseudonymizing patient identity. Additionally, hChain 4.0 incorporates Attribute-Based Access Control (ABAC) to provide secure electronic health record (EHR) sharing among authorized parties, where ABAC ensures fine-grained permission management vital for multi-organizational healthcare settings. Experimental assessments indicate that the proposed approach achieves higher scalability, cost-effectiveness, and validated security.

Paperid: 1518, https://arxiv.org/pdf/2505.11880.pdf

Abstract:
The Advanced Encryption Standard (AES) is a widely adopted cryptographic algorithm essential for securing embedded systems and IoT platforms. However, existing AES hardware accelerators often face limitations in performance, energy efficiency, and flexibility. This paper presents AES-RV, a hardware-efficient RISC-V accelerator featuring low-latency AES instruction extensions optimized for real-time processing across all AES modes and key sizes. AES-RV integrates three key innovations: high-bandwidth internal buffers for continuous data processing, a specialized AES unit with custom low-latency instructions, and a pipelined system supported by a ping-pong memory transfer mechanism. Implemented on the Xilinx ZCU102 SoC FPGA, AES-RV achieves up to 255.97 times speedup and up to 453.04 times higher energy efficiency compared to baseline and conventional CPU/GPU platforms. It also demonstrates superior throughput and area efficiency against state-of-the-art AES accelerators, making it a strong candidate for secure and high-performance embedded systems.

Paperid: 1519, https://arxiv.org/pdf/2505.11611.pdf

Abstract:
Polysemanticity -- where individual neurons encode multiple unrelated features -- is a well-known characteristic of large neural networks and remains a central challenge in the interpretability of language models. At the same time, its implications for model safety are also poorly understood. Leveraging recent advances in sparse autoencoders, we investigate the polysemantic structure of two small models (Pythia-70M and GPT-2-Small) and evaluate their vulnerability to targeted, covert interventions at the prompt, feature, token, and neuron levels. Our analysis reveals a consistent polysemantic topology shared across both models. Strikingly, we demonstrate that this structure can be exploited to mount effective interventions on two larger, black-box instruction-tuned models (LLaMA3.1-8B-Instruct and Gemma-2-9B-Instruct). These findings suggest not only the generalizability of the interventions but also point to a stable and transferable polysemantic structure that could potentially persist across architectures and training regimes.

Paperid: 1520, https://arxiv.org/pdf/2505.10942.pdf

Abstract:
Federated Learning (FL) has emerged as a powerful paradigm for collaborative model training while keeping client data decentralized and private. However, it is vulnerable to Data Reconstruction Attacks (DRA) such as "LoKI" and "Robbing the Fed", where malicious models sent from the server to the client can reconstruct sensitive user data. To counter this, we introduce DRArmor, a novel defense mechanism that integrates Explainable AI with targeted detection and mitigation strategies for DRA. Unlike existing defenses that focus on the entire model, DRArmor identifies and addresses the root cause (i.e., malicious layers within the model that send gradients with malicious intent) by analyzing their contribution to the output and detecting inconsistencies in gradient values. Once these malicious layers are identified, DRArmor applies defense techniques such as noise injection, pixelation, and pruning to these layers rather than the whole model, minimizing the attack surface and preserving client data privacy. We evaluate DRArmor's performance against the advanced LoKI attack across diverse datasets, including MNIST, CIFAR-10, CIFAR-100, and ImageNet, in a 200-client FL setup. Our results demonstrate DRArmor's effectiveness in mitigating data leakage, achieving high True Positive and True Negative Rates of 0.910 and 0.890, respectively. Additionally, DRArmor maintains an average accuracy of 87%, effectively protecting client privacy without compromising model performance. Compared to existing defense mechanisms, DRArmor reduces the data leakage rate by 62.5% with datasets containing 500 samples per client.

Paperid: 1521, https://arxiv.org/pdf/2505.09928.pdf

Abstract:
Smart contracts have been a topic of interest in blockchain research and are a key enabling technology for Connected Autonomous Vehicles (CAVs) in the era of Web 3.0. These contracts enable trustless interactions without the need for intermediaries, as they operate based on predefined rules encoded on the blockchain. However, smart contacts face significant challenges in cross-contract communication and information sharing, making it difficult to establish seamless connectivity and collaboration among CAVs with Web 3.0. In this paper, we propose DeFeed, a novel secure protocol that incorporates various gas-saving functions for CAVs, originated from in-depth research into the interaction among smart contracts for decentralized cross-contract data feed in Web 3.0. DeFeed allows smart contracts to obtain information from other contracts efficiently in a single click, without complicated operations. We judiciously design and complete various functions with DeFeed, including a pool function and a cache function for gas optimization, a subscribe function for facilitating data access, and an update function for the future iteration of our protocol. Tailored for CAVs with Web 3.0 use cases, DeFeed enables efficient data feed between smart contracts underpinning decentralized applications and vehicle coordination. Implemented and tested on the Ethereum official test network, DeFeed demonstrates significant improvements in contract interaction efficiency, reducing computational complexity and gas costs. Our solution represents a critical step towards seamless, decentralized communication in Web 3.0 ecosystems.

Paperid: 1522, https://arxiv.org/pdf/2505.09374.pdf

Abstract:
Mobile applications continuously generate DNS queries that can reveal sensitive user behavioral patterns even when communications are encrypted. This paper presents a privacy enhancement framework based on query forgery to protect users against profiling attempts that leverage these background communications. We first mathematically model user profiles as probability distributions over interest categories derived from mobile application traffic. We then evaluate three query forgery strategies -- uniform sampling, TrackMeNot-based generation, and an optimized approach that minimizes Kullback-Leibler divergence -- to quantify their effectiveness in obfuscating user profiles. Then we create a synthetic dataset comprising 1,000 user traces constructed from real mobile application traffic and we extract the user profiles based on DNS traffic. Our evaluation reveals that a 50\% privacy improvement is achievable with less than 20\% traffic overhead when using our approach, while achieving 100\% privacy protection requires approximately 40-60\% additional traffic. We further propose a modular system architecture for practical implementation of our protection mechanisms on mobile devices. This work offers a client-side privacy solution that operates without third-party trust requirements, empowering individual users to defend against traffic analysis without compromising application functionality.

Paperid: 1523, https://arxiv.org/pdf/2505.06632.pdf

Abstract:
Autonomous Vehicles (AV) proliferation brings important and pressing security and reliability issues that must be dealt with to guarantee public safety and help their widespread adoption. The contribution of the proposed research is towards achieving more secure, reliable, and trustworthy autonomous transportation system by providing more capabilities for anomaly detection, data provenance, and real-time response in safety critical AV deployments. In this research, we develop a new framework that combines the power of Artificial Intelligence (AI) for real-time anomaly detection with blockchain technology to detect and prevent any malicious activity including sensor failures in AVs. Through Long Short-Term Memory (LSTM) networks, our approach continually monitors associated multi-sensor data streams to detect anomalous patterns that may represent cyberattacks as well as hardware malfunctions. Further, this framework employs a decentralized platform for securely storing sensor data and anomaly alerts in a blockchain ledger for data incorruptibility and authenticity, while offering transparent forensic features. Moreover, immediate automated response mechanisms are deployed using smart contracts when anomalies are found. This makes the AV system more resilient to attacks from both cyberspace and hardware component failure. Besides, we identify potential challenges of scalability in handling high frequency sensor data, computational constraint in resource constrained environment, and of distributed data storage in terms of privacy.

Paperid: 1524, https://arxiv.org/pdf/2505.06520.pdf

Abstract:
It is often desirable to remove (a.k.a. unlearn) a specific part of the training data from a trained neural network model. A typical application scenario is to protect the data holder's right to be forgotten, which has been promoted by many recent regulation rules. Existing unlearning methods involve training alternative models with remaining data, which may be costly and challenging to verify from the data holder or a thirdparty auditor's perspective. In this work, we provide a new angle and propose a novel unlearning approach by imposing carefully crafted "patch" on the original neural network to achieve targeted "forgetting" of the requested data to delete. Specifically, inspired by the research line of neural network repair, we propose to strategically seek a lightweight minimum "patch" for unlearning a given data point with certifiable guarantee. Furthermore, to unlearn a considerable amount of data points (or an entire class), we propose to iteratively select a small subset of representative data points to unlearn, which achieves the effect of unlearning the whole set. Extensive experiments on multiple categorical datasets demonstrates our approach's effectiveness, achieving measurable unlearning while preserving the model's performance and being competitive in efficiency and memory consumption compared to various baseline methods.

Paperid: 1525, https://arxiv.org/pdf/2505.06284.pdf

Abstract:
Large language models (LLMs) are inherently vulnerable to unintended privacy breaches. Consequently, systematic red-teaming research is essential for developing robust defense mechanisms. However, current data extraction methods suffer from several limitations: (1) rely on dataset duplicates (addressable via deduplication), (2) depend on prompt engineering (now countered by detection and defense), and (3) rely on random-search adversarial generation. To address these challenges, we propose DMRL, a Data- and Model-aware Reward Learning approach for data extraction. This technique leverages inverse reinforcement learning to extract sensitive data from LLMs. Our method consists of two main components: (1) constructing an introspective reasoning dataset that captures leakage mindsets to guide model behavior, and (2) training reward models with Group Relative Policy Optimization (GRPO), dynamically tuning optimization based on task difficulty at both the data and model levels. Comprehensive experiments across various LLMs demonstrate that DMRL outperforms all baseline methods in data extraction performance.

Paperid: 1526, https://arxiv.org/pdf/2505.05920.pdf

Abstract:
The growing use of machine learning in cloud environments raises critical concerns about data security and privacy, especially in finance. Fully Homomorphic Encryption (FHE) offers a solution by enabling computations on encrypted data, but its high computational cost limits practicality. In this paper, we propose PP-FinTech, a privacy-preserving scheme for financial applications that employs a CKKS-based encrypted soft-margin SVM, enhanced with a hybrid kernel for modeling non-linear patterns and an adaptive thresholding mechanism for robust encrypted classification. Experiments on the Credit Card Approval dataset demonstrate comparable performance to the plaintext models, highlighting PP-FinTech's ability to balance privacy, and efficiency in secure financial ML systems.

Paperid: 1527, https://arxiv.org/pdf/2505.04015.pdf

Abstract:
This paper proposes MergeGuard, a novel methodology for mitigation of AI Trojan attacks. Trojan attacks on AI models cause inputs embedded with triggers to be misclassified to an adversary's target class, posing a significant threat to model usability trained by an untrusted third party. The core of MergeGuard is a new post-training methodology for linearizing and merging fully connected layers which we show simultaneously improves model generalizability and performance. Our Proof of Concept evaluation on Transformer models demonstrates that MergeGuard maintains model accuracy while decreasing trojan attack success rate, outperforming commonly used (post-training) Trojan mitigation by fine-tuning methodologies.

Paperid: 1528, https://arxiv.org/pdf/2505.04014.pdf

Abstract:
Today, users can "lift-and-shift" unmodified applications into modern, VM-based Trusted Execution Environments (TEEs) in order to gain hardware-based security guarantees. However, TEEs do not protect applications against disk rollback attacks, where persistent storage can be reverted to an earlier state after a crash; existing rollback resistance solutions either only support a subset of applications or require code modification. Our key insight is that restoring disk consistency after a rollback attack guarantees rollback resistance for any application. We present Rollbaccine, a device mapper that provides automatic rollback resistance for all applications by provably preserving disk consistency. Rollbaccine intercepts and replicates writes to disk, restores lost state from backups during recovery, and minimizes overheads by taking advantage of the weak, multi-threaded semantics of disk operations. Across benchmarks over two real applications (PostgreSQL and HDFS) and two file systems (ext4 and xfs), Rollbaccine adds only 19% overhead, except for the fsync-heavy Filebench Varmail. In addition, Rollbaccine outperforms the state-of-the-art, non-automatic rollback resistant solution by $208\times$.

Paperid: 1529, https://arxiv.org/pdf/2505.03451.pdf

Abstract:
The rise of QR code based phishing ("Quishing") poses a growing cybersecurity threat, as attackers increasingly exploit QR codes to bypass traditional phishing defenses. Existing detection methods predominantly focus on URL analysis, which requires the extraction of the QR code payload, and may inadvertently expose users to malicious content. Moreover, QR codes can encode various types of data beyond URLs, such as Wi-Fi credentials and payment information, making URL-based detection insufficient for broader security concerns. To address these gaps, we propose the first framework for quishing detection that directly analyzes QR code structure and pixel patterns without extracting the embedded content. We generated a dataset of phishing and benign QR codes and we used it to train and evaluate multiple machine learning models, including Logistic Regression, Decision Trees, Random Forest, Naive Bayes, LightGBM, and XGBoost. Our best-performing model (XGBoost) achieves an AUC of 0.9106, demonstrating the feasibility of QR-centric detection. Through feature importance analysis, we identify key visual indicators of malicious intent and refine our feature set by removing non-informative pixels, improving performance to an AUC of 0.9133 with a reduced feature space. Our findings reveal that the structural features of QR code correlate strongly with phishing risk. This work establishes a foundation for quishing mitigation and highlights the potential of direct QR analysis as a critical layer in modern phishing defenses.

Paperid: 1530, https://arxiv.org/pdf/2505.01454.pdf

Abstract:
Federated Learning (FL) enables collaborative model training across distributed clients while preserving data privacy, yet it faces significant challenges in communication efficiency and vulnerability to poisoning attacks. While sparsification techniques mitigate communication overhead by transmitting only critical model parameters, they inadvertently amplify security risks: adversarial clients can exploit sparse updates to evade detection and degrade model performance. Existing defense mechanisms, designed for standard FL communication scenarios, are ineffective in addressing these vulnerabilities within sparsified FL. To bridge this gap, we propose FLARE, a novel federated learning framework that integrates sparse index mask inspection and model update sign similarity analysis to detect and mitigate poisoning attacks in sparsified FL. Extensive experiments across multiple datasets and adversarial scenarios demonstrate that FLARE significantly outperforms existing defense strategies, effectively securing sparsified FL against poisoning attacks while maintaining communication efficiency.

Paperid: 1531, https://arxiv.org/pdf/2505.00240.pdf

Abstract:
The increasing complexity and scale of the Internet of Things (IoT) have made security a critical concern. This paper presents a novel Large Language Model (LLM)-based framework for comprehensive threat detection and prevention in IoT environments. The system integrates lightweight LLMs fine-tuned on IoT-specific datasets (IoT-23, TON_IoT) for real-time anomaly detection and automated, context-aware mitigation strategies optimized for resource-constrained devices. A modular Docker-based deployment enables scalable and reproducible evaluation across diverse network conditions. Experimental results in simulated IoT environments demonstrate significant improvements in detection accuracy, response latency, and resource efficiency over traditional security methods. The proposed framework highlights the potential of LLM-driven, autonomous security solutions for future IoT ecosystems.

Paperid: 1532, https://arxiv.org/pdf/2504.21846.pdf

Abstract:
High-profile speech videos are prime targets for falsification, owing to their accessibility and influence. This work proposes VeriLight, a low-overhead and unobtrusive system for protecting speech videos from visual manipulations of speaker identity and lip and facial motion. Unlike the predominant purely digital falsification detection methods, VeriLight creates dynamic physical signatures at the event site and embeds them into all video recordings via imperceptible modulated light. These physical signatures encode semantically-meaningful features unique to the speech event, including the speaker's identity and facial motion, and are cryptographically-secured to prevent spoofing. The signatures can be extracted from any video downstream and validated against the portrayed speech content to check its integrity. Key elements of VeriLight include (1) a framework for generating extremely compact (i.e., 150-bit), pose-invariant speech video features, based on locality-sensitive hashing; and (2) an optical modulation scheme that embeds $>$200 bps into video while remaining imperceptible both in video and live. Experiments on extensive video datasets show VeriLight achieves AUCs $\geq$ 0.99 and a true positive rate of 100% in detecting falsified videos. Further, VeriLight is highly robust across recording conditions, video post-processing techniques, and white-box adversarial attacks on its feature extraction methods. A demonstration of VeriLight is available at https://mobilex.cs.columbia.edu/verilight.

Paperid: 1533, https://arxiv.org/pdf/2504.21739.pdf

Abstract:
Federated learning is a distributed machine learning paradigm that enables collaborative training across multiple parties while ensuring data privacy. Gradient Boosting Decision Trees (GBDT), such as XGBoost, have gained popularity due to their high performance and strong interpretability. Therefore, there has been a growing interest in adapting XGBoost for use in federated settings via cryptographic techniques. However, it should be noted that these approaches may not always provide rigorous theoretical privacy guarantees, and they often come with a high computational cost in terms of time and space requirements. In this paper, we propose a variant of vertical federated XGBoost with bilateral differential privacy guarantee: MaskedXGBoost. We build well-calibrated noise to perturb the intermediate information to protect privacy. The noise is structured with part of its ingredients in the null space of the arithmetical operation for splitting score evaluation in XGBoost, helping us achieve consistently better utility than other perturbation methods and relatively lower overhead than encryption-based techniques. We provide theoretical utility analysis and empirically verify privacy preservation. Compared with other algorithms, our algorithm's superiority in both utility and efficiency has been validated on multiple datasets.

Paperid: 1534, https://arxiv.org/pdf/2504.20801.pdf

Abstract:
Black-box scanners have played a significant role in detecting vulnerabilities for web applications. A key focus in current black-box scanning is increasing test coverage (i.e., accessing more web pages). However, since many web applications are user-oriented, some deep pages can only be accessed through complex user interactions, which are difficult to reach by existing black-box scanners. To fill this gap, a key insight is that web pages contain a wealth of semantic information that can aid in understanding potential user intention. Based on this insight, we propose Hoyen, a black-box scanner that uses the Large Language Model to predict user intention and provide guidance for expanding the scanning scope. Hoyen has been rigorously evaluated on 12 popular open-source web applications and compared with 6 representative tools. The results demonstrate that Hoyen performs a comprehensive exploration of web applications, expanding the attack surface while achieving about 2x than the coverage of other scanners on average, with high request accuracy. Furthermore, Hoyen detected over 90% of its requests towards the core functionality of the application, detecting more vulnerabilities than other scanners, including unique vulnerabilities in well-known web applications. Our data/code is available at https://hoyen.tjunsl.com/

Paperid: 1535, https://arxiv.org/pdf/2504.20295.pdf

Abstract:
Digital twins (DTs) are improving water distribution systems by using real-time data, analytics, and prediction models to optimize operations. This paper presents a DT platform designed for a Spanish water supply network, utilizing Long Short-Term Memory (LSTM) networks to predict water consumption. However, machine learning models are vulnerable to adversarial attacks, such as the Fast Gradient Sign Method (FGSM) and Projected Gradient Descent (PGD). These attacks manipulate critical model parameters, injecting subtle distortions that degrade forecasting accuracy. To further exploit these vulnerabilities, we introduce a Learning Automata (LA) and Random LA-based approach that dynamically adjusts perturbations, making adversarial attacks more difficult to detect. Experimental results show that this approach significantly impacts prediction reliability, causing the Mean Absolute Percentage Error (MAPE) to rise from 26% to over 35%. Moreover, adaptive attack strategies amplify this effect, highlighting cybersecurity risks in AI-driven DTs. These findings emphasize the urgent need for robust defenses, including adversarial training, anomaly detection, and secure data pipelines.

Paperid: 1536, https://arxiv.org/pdf/2504.20275.pdf

Abstract:
Water distribution systems in rural areas face serious challenges such as a lack of real-time monitoring, vulnerability to cyberattacks, and unreliable data handling. This paper presents an integrated framework that combines LoRaWAN-based data acquisition, a machine learning-driven Intrusion Detection System (IDS), and a blockchain-enabled Digital Twin (BC-DT) platform for secure and transparent water management. The IDS filters anomalous or spoofed data using a Long Short-Term Memory (LSTM) Autoencoder and Isolation Forest before validated data is logged via smart contracts on a private Ethereum blockchain using Proof of Authority (PoA) consensus. The verified data feeds into a real-time DT model supporting leak detection, consumption forecasting, and predictive maintenance. Experimental results demonstrate that the system achieves over 80 transactions per second (TPS) with under 2 seconds of latency while remaining cost-effective and scalable for up to 1,000 smart meters. This work demonstrates a practical and secure architecture for decentralized water infrastructure in under-connected rural environments.

Paperid: 1537, https://arxiv.org/pdf/2504.18007.pdf

Abstract:
With the rapid digitalization of healthcare systems, there has been a substantial increase in the generation and sharing of private health data. Safeguarding patient information is essential for maintaining consumer trust and ensuring compliance with legal data protection regulations. Machine learning is critical in healthcare, supporting personalized treatment, early disease detection, predictive analytics, image interpretation, drug discovery, efficient operations, and patient monitoring. It enhances decision-making, accelerates research, reduces errors, and improves patient outcomes. In this paper, we utilize machine learning methodologies, including differential privacy and federated learning, to develop privacy-preserving models that enable healthcare stakeholders to extract insights without compromising individual privacy. Differential privacy introduces noise to data to guarantee statistical privacy, while federated learning enables collaborative model training across decentralized datasets. We explore applying these technologies to Heart Disease Data, demonstrating how they preserve privacy while delivering valuable insights and comprehensive analysis. Our results show that using a federated learning model with differential privacy achieved a test accuracy of 85%, ensuring patient data remained secure and private throughout the process.

Paperid: 1538, https://arxiv.org/pdf/2504.17919.pdf

Abstract:
An improved design of a cryptosystem based on small Ree groups is proposed. We have changed the encryption algorithm and propose to use a logarithmic signature for the entire Ree group. This approach improves security against sequential key recovery attacks. Hence, the complexity of the key recovery attack will be defined by a brute-force attack over the entire group. In this paper, we have proved that to construct secure cryptosystems with group computations over a small finite field, it is needed to use a 3-parametric small Ree group.

Paperid: 1539, https://arxiv.org/pdf/2504.16474.pdf

Abstract:
The transfer-based black-box adversarial attack setting poses the challenge of crafting an adversarial example (AE) on known surrogate models that remain effective against unseen target models. Due to the practical importance of this task, numerous methods have been proposed to address this challenge. However, most previous methods are heuristically designed and intuitively justified, lacking a theoretical foundation. To bridge this gap, we derive a novel transferability bound that offers provable guarantees for adversarial transferability. Our theoretical analysis has the advantages of \textit{(i)} deepening our understanding of previous methods by building a general attack framework and \textit{(ii)} providing guidance for designing an effective attack algorithm. Our theoretical results demonstrate that optimizing AEs toward flat minima over the surrogate model set, while controlling the surrogate-target model shift measured by the adversarial model discrepancy, yields a comprehensive guarantee for AE transferability. The results further lead to a general transfer-based attack framework, within which we observe that previous methods consider only partial factors contributing to the transferability. Algorithmically, inspired by our theoretical results, we first elaborately construct the surrogate model set in which models exhibit diverse adversarial vulnerabilities with respect to AEs to narrow an instantiated adversarial model discrepancy. Then, a \textit{model-Diversity-compatible Reverse Adversarial Perturbation} (DRAP) is generated to effectively promote the flatness of AEs over diverse surrogate models to improve transferability. Extensive experiments on NIPS2017 and CIFAR-10 datasets against various target models demonstrate the effectiveness of our proposed attack.

Paperid: 1540, https://arxiv.org/pdf/2504.16226.pdf

Abstract:
Edge computing-based Next-Generation Wireless Networks (NGWN)-IoT offer enhanced bandwidth capacity for large-scale service provisioning but remain vulnerable to evolving cyber threats. Existing intrusion detection and prevention methods provide limited security as adversaries continually adapt their attack strategies. We propose a dynamic attack detection and prevention approach to address this challenge. First, blockchain-based authentication uses the Deoxys Authentication Algorithm (DAA) to verify IoT device legitimacy before data transmission. Next, a bi-stage intrusion detection system is introduced: the first stage uses signature-based detection via an Improved Random Forest (IRF) algorithm. In contrast, the second stage applies feature-based anomaly detection using a Diffusion Convolution Recurrent Neural Network (DCRNN). To ensure Quality of Service (QoS) and maintain Service Level Agreements (SLA), trust-aware service migration is performed using Heap-Based Optimization (HBO). Additionally, on-demand virtual High-Interaction honeypots deceive attackers and extract attack patterns, which are securely stored using the Bimodal Lattice Signature Scheme (BLISS) to enhance signature-based Intrusion Detection Systems (IDS). The proposed framework is implemented in the NS3 simulation environment and evaluated against existing methods across multiple performance metrics, including accuracy, attack detection rate, false negative rate, precision, recall, ROC curve, memory usage, CPU usage, and execution time. Experimental results demonstrate that the framework significantly outperforms existing approaches, reinforcing the security of NGWN-enabled IoT ecosystems

Paperid: 1541, https://arxiv.org/pdf/2504.15717.pdf

Abstract:
Blockchain and distributed ledger technologies (DLTs) facilitate decentralized computations across trust boundaries. However, ensuring complex computations with low gas fees and confidentiality remains challenging. Recent advances in Confidential Computing -- leveraging hardware-based Trusted Execution Environments (TEEs) -- and Proof-carrying Data -- employing cryptographic Zero-Knowledge Virtual Machines (zkVMs) -- hold promise for secure, privacy-preserving off-chain and layer-2 computations. On the other side, a homogeneous reliance on a single technology, such as TEEs or zkVMs, is impractical for decentralized environments with heterogeneous computational requirements. This paper introduces the Trusted Compute Unit (TCU), a unifying framework that enables composable and interoperable verifiable computations across heterogeneous technologies. Our approach allows decentralized applications (dApps) to flexibly offload complex computations to TCUs, obtaining proof of correctness. These proofs can be anchored on-chain for automated dApp interactions, while ensuring confidentiality of input data, and integrity of output data. We demonstrate how TCUs can support a prominent blockchain use case, such as federated learning. By enabling secure off-chain interactions without incurring on-chain confirmation delays or gas fees, TCUs significantly improve system performance and scalability. Experimental insights and performance evaluations confirm the feasibility and practicality of this unified approach, advancing the state of the art in verifiable off-chain services for the blockchain ecosystem.

Paperid: 1542, https://arxiv.org/pdf/2504.15676.pdf

Abstract:
Decentralized Autonomous Machines (DAMs) represent a transformative paradigm in automation economy, integrating artificial intelligence (AI), blockchain technology, and Internet of Things (IoT) devices to create self-governing economic agents participating in Decentralized Physical Infrastructure Networks (DePIN). Capable of managing both digital and physical assets and unlike traditional Decentralized Autonomous Organizations (DAOs), DAMs extend autonomy into the physical world, enabling trustless systems for Real and Digital World Assets (RDWAs). In this paper, we explore the technological foundations, and challenges of DAMs and argue that DAMs are pivotal in transitioning from trust-based to trustless economic models, offering scalable, transparent, and equitable solutions for asset management. The integration of AI-driven decision-making, IoT-enabled operational autonomy, and blockchain-based governance allows DAMs to decentralize ownership, optimize resource allocation, and democratize access to economic opportunities. Therefore, in this research, we highlight the potential of DAMs to address inefficiencies in centralized systems, reduce wealth disparities, and foster a post-labor economy.

Paperid: 1543, https://arxiv.org/pdf/2504.15391.pdf

Abstract:
This scholarly work presents an advanced cryptographic framework utilizing automorphism groups as the foundational structure for encryption scheme implementation. The proposed methodology employs a three-parameter group construction, distinguished by its application of logarithmic signatures positioned outside the group's center, a significant departure from conventional approaches. A key innovation in this implementation is utilizing the Hermitian function field as the underlying mathematical framework. This particular function field provides enhanced structural properties that strengthen the cryptographic protocol when integrated with the three-parameter group architecture. The encryption mechanism features phased key de-encapsulation from ciphertext, representing a substantial advantage over alternative implementations. This sequential extraction process introduces additional computational complexity for potential adversaries while maintaining efficient legitimate decryption. A notable characteristic of this cryptosystem is the direct correlation between the underlying group's mathematical strength and both the attack complexity and message size parameters. This relationship enables precise security-efficiency calibration based on specific implementation requirements and threat models. The application of automorphism groups with logarithmic signatures positioned outside the center represents a significant advancement in non-traditional cryptographic designs, particularly relevant in the context of post-quantum cryptographic resilience.

Paperid: 1544, https://arxiv.org/pdf/2504.11804.pdf

Abstract:
This article presents a method for enhancing the encryption algorithm in the MST3 cryptosystem for generalized Suzuki 2-groups. The conventional MST cryptosystem based on Suzuki groups utilizes logarithmic signatures (LS) restricted to the center of the group, resulting in an expansive array of logarithmic signatures. We propose an encryption scheme based on multi-parameter non-commutative groups, specifically selecting multi-parameter generalized Suzuki 2-groups as the group construction framework. In our approach, the logarithmic signature extends across the entire group, with cipher security dependent on the group order. This design enables the development of encryption optimized for implementation efficiency determined by logarithmic signature size while maintaining robust security through appropriate key sizes and the finite field of group representation. The primary innovation in our encryption implementation lies in the sequential de-encapsulation of keys from ciphertext using logarithmic signatures and associated keys. The security evaluation of the cipher relies on attack complexity analysis, which is quantified through comprehensive key enumeration methodologies.

Paperid: 1545, https://arxiv.org/pdf/2504.10987.pdf

Abstract:
Differentially Private Synthetic Data Generation (DP-SDG) is a key enabler of private and secure tabular-data sharing, producing artificial data that carries through the underlying statistical properties of the input data. This typically involves adding carefully calibrated statistical noise to guarantee individual privacy, at the cost of synthetic data quality. Recent literature has explored scenarios where a small amount of public data is used to help enhance the quality of synthetic data. These methods study a horizontal public-private partitioning which assumes access to a small number of public rows that can be used for model initialization, providing a small utility gain. However, realistic datasets often naturally consist of public and private attributes, making a vertical public-private partitioning relevant for practical synthetic data deployments. We propose a novel framework that adapts horizontal public-assisted methods into the vertical setting. We compare this framework against our alternative approach that uses conditional generation, highlighting initial limitations of public-data assisted methods and proposing future research directions to address these challenges.

Paperid: 1546, https://arxiv.org/pdf/2504.10947.pdf

Abstract:
This article proposes a comprehensive approach to implementing encryption schemes based on the automorphism group of the Hermitian function field. We utilize a three-parameter group with logarithmic representations outside the group's center. In this work, we enhance the Hermitian function field-based encryption scheme with homomorphic encryption capabilities, which constitutes a significant advantage of our implementation. Both the attack complexity and the encrypted message size are directly correlated with the order of the group.

Paperid: 1548, https://arxiv.org/pdf/2504.07875.pdf

Abstract:
To address the rapidly growing demand for cloud-based quantum computing, various researchers are proposing shifting from the existing single-tenant model to a multi-tenant model that expands resource utilization and improves accessibility. However, while multi-tenancy enables multiple users to access the same quantum computer, it introduces potential for security and reliability vulnerabilities. It therefore becomes important to investigate these vulnerabilities, especially considering realistic attackers who operate without elevated privileges relative to ordinary users. To address this research need, this paper presents and evaluates QubitHammer, the first attack to demonstrate that an adversary can remotely induce unauthorized changes to a victim's quantum circuit's qubit's state within a multi-tenant model by using custom qubit control pulses that are generated within constraints of the public interfaces and without elevated privileges. Through extensive evaluation on real-world superconducting devices from IBM and Rigetti, this work demonstrates that QubitHammer allows an adversary to significantly change the output distribution of a victim quantum circuit. In the experimentation, variational distance is used to evaluate the magnitude of the changes, and variational distance as high as 0.938 is observed. Cross-platform analysis of QubitHammer on a number of quantum computing devices exposes a fundamental susceptibility in superconducting hardware. Further, QubitHammer was also found to evade all currently proposed defenses aimed at ensuring reliable execution in multi-tenant superconducting quantum systems.

Paperid: 1549, https://arxiv.org/pdf/2504.07717.pdf

Abstract:
Large Language Models (LLMs) have demonstrated remarkable performance across a wide range of applications, e.g., medical question-answering, mathematical sciences, and code generation. However, they also exhibit inherent limitations, such as outdated knowledge and susceptibility to hallucinations. Retrieval-Augmented Generation (RAG) has emerged as a promising paradigm to address these issues, but it also introduces new vulnerabilities. Recent efforts have focused on the security of RAG-based LLMs, yet existing attack methods face three critical challenges: (1) their effectiveness declines sharply when only a limited number of poisoned texts can be injected into the knowledge database, (2) they lack sufficient stealth, as the attacks are often detectable by anomaly detection systems, which compromises their effectiveness, and (3) they rely on heuristic approaches to generate poisoned texts, lacking formal optimization frameworks and theoretic guarantees, which limits their effectiveness and applicability. To address these issues, we propose coordinated Prompt-RAG attack (PR-attack), a novel optimization-driven attack that introduces a small number of poisoned texts into the knowledge database while embedding a backdoor trigger within the prompt. When activated, the trigger causes the LLM to generate pre-designed responses to targeted queries, while maintaining normal behavior in other contexts. This ensures both high effectiveness and stealth. We formulate the attack generation process as a bilevel optimization problem leveraging a principled optimization framework to develop optimal poisoned texts and triggers. Extensive experiments across diverse LLMs and datasets demonstrate the effectiveness of PR-Attack, achieving a high attack success rate even with a limited number of poisoned texts and significantly improved stealth compared to existing methods.

Paperid: 1550, https://arxiv.org/pdf/2504.07590.pdf

Abstract:
Due to its open-source nature, the Android operating system has consistently been a primary target for attackers. Learning-based methods have made significant progress in the field of Android malware detection. However, traditional detection methods based on static features struggle to identify obfuscated malicious code, while methods relying on dynamic analysis suffer from low efficiency. To address this, we propose a dynamic weighted feature selection method that analyzes the importance and stability of features, calculates scores to filter out the most robust features, and combines these selected features with the program's structural information. We then utilize graph neural networks for classification, thereby improving the robustness and accuracy of the detection system. We analyzed 8,664 malware samples from eight malware families and tested a total of 44,940 malware variants generated using seven obfuscation strategies. Experiments demonstrate that our proposed method achieves an F1-score of 95.56% on the unobfuscated dataset and 92.28% on the obfuscated dataset, indicating that the model can effectively detect obfuscated malware.

Paperid: 1551, https://arxiv.org/pdf/2504.07318.pdf

Abstract:
The article describes a new implementation of MST3 cryptosystems based on the automorphism group of the Suzuki function field. The main difference in the presented implementation is to use the logarithmic signature for encryption not only in the center of the group, as in the well-known implementation of MST3 for Suzuki groups but also for coordinates outside the center of the group. The present implementation of a cryptosystem has higher reliability. The complexity of cryptanalysis and the size of the message for encryption squared is higher than that of the MST3 cryptosystem in the Suzuki group.

Paperid: 1552, https://arxiv.org/pdf/2504.07048.pdf

Abstract:
Multi-programming quantum computers improve device utilization and throughput. However, crosstalk from concurrent two-qubit CNOT gates poses security risks, compromising the fidelity and output of co-running victim programs. We design Zero Knowledge Tampering Attacks (ZKTAs), using which attackers can exploit crosstalk without knowledge of the hardware error profile. ZKTAs can alter victim program outputs in 40% of cases on commercial systems. We identify that ZKTAs succeed because the attacker's program consistently runs with the same victim program in a fixed context. To mitigate this, we propose QONTEXTS: a context-switching technique that defends against ZKTAs by running programs across multiple contexts, each handling only a subset of trials. QONTEXTS uses multi-programming with frequent context switching while identifying a unique set of programs for each context. This helps limit only a fraction of execution to ZKTAs. We enhance QONTEXTS with attack detection capabilities that compare the distributions from different contexts against each other to identify noisy contexts executed with ZKTAs. Our evaluations on real IBMQ systems show that QONTEXTS increases program resilience by three orders of magnitude and fidelity by 1.33$\times$ on average. Moreover, QONTEXTS improves throughput by 2$\times$, advancing security in multi-programmed environments.

Paperid: 1553, https://arxiv.org/pdf/2504.06017.pdf

Abstract:
By 2028 most cybersecurity actions will be autonomous, with humans teleoperating. We present the first classification of autonomy levels in cybersecurity and introduce Cybersecurity AI (CAI), an open-source framework that democratizes advanced security testing through specialized AI agents. Through rigorous empirical evaluation, we demonstrate that CAI consistently outperforms state-of-the-art results in CTF benchmarks, solving challenges across diverse categories with significantly greater efficiency -up to 3,600x faster than humans in specific tasks and averaging 11x faster overall. CAI achieved first place among AI teams and secured a top-20 position worldwide in the "AI vs Human" CTF live Challenge, earning a monetary reward of $750. Based on our results, we argue against LLM-vendor claims about limited security capabilities. Beyond cybersecurity competitions, CAI demonstrates real-world effectiveness, reaching top-30 in Spain and top-500 worldwide on Hack The Box within a week, while dramatically reducing security testing costs by an average of 156x. Our framework transcends theoretical benchmarks by enabling non-professionals to discover significant security bugs (CVSS 4.3-7.5) at rates comparable to experts during bug bounty exercises. By combining modular agent design with seamless tool integration and human oversight (HITL), CAI addresses critical market gaps, offering organizations of all sizes access to AI-powered bug bounty security testing previously available only to well-resourced firms -thereby challenging the oligopolistic ecosystem currently dominated by major bug bounty platforms.

Paperid: 1554, https://arxiv.org/pdf/2504.05866.pdf

Abstract:
Organizations are increasingly targeted by Advanced Persistent Threats (APTs), which involve complex, multi-stage tactics and diverse techniques. Cyber Threat Intelligence (CTI) sources, such as incident reports and security blogs, provide valuable insights, but are often unstructured and in natural language, making it difficult to automatically extract information. Recent studies have explored the use of AI to perform automatic extraction from CTI data, leveraging existing CTI datasets for performance evaluation and fine-tuning. However, they present challenges and limitations that impact their effectiveness. To overcome these issues, we introduce a novel dataset manually constructed from CTI reports and structured according to the MITRE ATT&CK framework. To assess its quality, we conducted an inter-annotator agreement study using Krippendorff alpha, confirming its reliability. Furthermore, the dataset was used to evaluate a Large Language Model (LLM) in a real-world business context, showing promising generalizability.

Paperid: 1555, https://arxiv.org/pdf/2504.05509.pdf

Abstract:
Smart contracts power decentralized financial (DeFi) services but are vulnerable to complex security exploits that can lead to significant financial losses. Existing security measures often fail to adequately protect these contracts due to the composability of DeFi protocols and the increasing sophistication of attacks. Through a large-scale empirical study of historical transactions from the 30 hacked DeFi protocols, we discovered that while benign transactions typically exhibit a limited number of unique control flows, in stark contrast, attack transactions consistently introduce novel, previously unobserved control flows. Building on these insights, we developed CrossGuard, a novel framework that enforces control flow integrity in real-time to secure smart contracts. Crucially, CrossGuard does not require prior knowledge of specific hacks; instead, it dynamically enforces control flow whitelisting policies and applies simplification heuristics at runtime. This approach monitors and prevents potential attacks by reverting all transactions that do not adhere to the established control flow whitelisting rules. Our evaluation demonstrates that CrossGuard effectively blocks 28 of the 30 analyzed attacks when configured only once prior to contract deployment, maintaining a low false positive rate of 0.28% and minimal additional gas costs. These results underscore the efficacy of applying control flow integrity to smart contracts, significantly enhancing security beyond traditional methods and addressing the evolving threat landscape in the DeFi ecosystem.

Paperid: 1556, https://arxiv.org/pdf/2504.04285.pdf

Abstract:
Cloud-based quantum service providers allow multiple users to run programs on shared hardware concurrently to maximize resource utilization and minimize operational costs. This multi-tenant computing (MTC) model relies on the error parameters of the hardware for fair qubit allocation and scheduling, as error-prone qubits can degrade computational accuracy asymmetrically for users sharing the hardware. To maintain low error rates, quantum providers perform periodic hardware calibration, often relying on third-party calibration services. If an adversary within this calibration service misreports error rates, the allocator can be misled into making suboptimal decisions even when the physical hardware remains unchanged. We demonstrate such an attack model in which an adversary strategically misreports qubit error rates to reduce hardware throughput, and probability of successful trial (PST) for two previously proposed allocation frameworks, i.e. Greedy and Community-Based Dynamic Allocation Partitioning (COMDAP). Experimental results show that adversarial misreporting increases execution latency by 24% and reduces PST by 7.8%. We also propose to identify inconsistencies in reported error rates by analyzing statistical deviations in error rates across calibration cycles.

Paperid: 1557, https://arxiv.org/pdf/2504.02537.pdf

Abstract:
Cyberthreat intelligence sharing is a critical aspect of cybersecurity, and it is essential to understand its definition, objectives, benefits, and impact on society. Blockchain and Distributed Ledger Technology (DLT) are emerging technologies that have the potential to transform intelligence sharing. This paper aims to provide a comprehensive understanding of intelligence sharing and the role of blockchain and DLT in enhancing it. The paper addresses questions related to the definition, objectives, benefits, and impact of intelligence sharing and provides a review of the existing literature. Additionally, the paper explores the challenges associated with blockchain and DLT and their potential impact on security and privacy. The paper also discusses the use of DLT and blockchain in security and intelligence sharing and highlights the associated challenges and risks. Furthermore, the paper examines the potential impact of a National Cybersecurity Strategy on addressing cybersecurity risks. Finally, the paper explores the experimental set up required for implementing blockchain and DLT for intelligence sharing and discusses the curricular ramifications of intelligence sharing.

Paperid: 1558, https://arxiv.org/pdf/2503.21598.pdf

Abstract:
Large Language Models (LLMs) have transformed task automation and content generation across various domains while incorporating safety filters to prevent misuse. We introduce a novel jailbreaking framework that employs distributed prompt processing combined with iterative refinements to bypass these safety measures, particularly in generating malicious code. Our architecture consists of four key modules: prompt segmentation, parallel processing, response aggregation, and LLM-based jury evaluation. Tested on 500 malicious prompts across 10 cybersecurity categories, the framework achieves a 73.2% Success Rate (SR) in generating malicious code. Notably, our comparative analysis reveals that traditional single-LLM judge evaluation overestimates SRs (93.8%) compared to our LLM jury system (73.2%), with manual verification confirming that single-judge assessments often accept incomplete implementations. Moreover, we demonstrate that our distributed architecture improves SRs by 12% over the non-distributed approach in an ablation study, highlighting both the effectiveness of distributed prompt processing and the importance of robust evaluation methodologies in assessing jailbreak attempts.

Paperid: 1559, https://arxiv.org/pdf/2503.20806.pdf

Abstract:
The rise of cyber threats on social media platforms necessitates advanced metrics to assess and mitigate social cyber vulnerabilities. This paper presents the Social Cyber Vulnerability Index (SCVI), a novel framework integrating individual-level factors (e.g., awareness, behavioral traits, psychological attributes) and attack-level characteristics (e.g., frequency, consequence, sophistication) for comprehensive socio-cyber vulnerability assessment. SCVI is validated using survey data (iPoll) and textual data (Reddit scam reports), demonstrating adaptability across modalities while revealing demographic disparities and regional vulnerabilities. Comparative analyses with the Common Vulnerability Scoring System (CVSS) and the Social Vulnerability Index (SVI) show the superior ability of SCVI to capture nuanced socio-technical risks. Monte Carlo-based weight variability analysis confirms SCVI is robust and highlights its utility in identifying high-risk groups. By addressing gaps in traditional metrics, SCVI offers actionable insights for policymakers and practitioners, advancing inclusive strategies to mitigate emerging threats such as AI-powered phishing and deepfake scams.

Paperid: 1560, https://arxiv.org/pdf/2503.20093.pdf

Abstract:
The adoption of modern encryption protocols such as TLS 1.3 has significantly challenged traditional network traffic classification (NTC) methods. As a consequence, researchers are increasingly turning to machine learning (ML) approaches to overcome these obstacles. In this paper, we comprehensively analyze ML-based NTC studies, developing a taxonomy of their design choices, benchmarking suites, and prevalent assumptions impacting classifier performance. Through this systematization, we demonstrate widespread reliance on outdated datasets, oversights in design choices, and the consequences of unsubstantiated assumptions. Our evaluation reveals that the majority of proposed encrypted traffic classifiers have mistakenly utilized unencrypted traffic due to the use of legacy datasets. Furthermore, by conducting 348 feature occlusion experiments on state-of-the-art classifiers, we show how oversights in NTC design choices lead to overfitting, and validate or refute prevailing assumptions with empirical evidence. By highlighting lessons learned, we offer strategic insights, identify emerging research directions, and recommend best practices to support the development of real-world applicable NTC methodologies.

Paperid: 1561, https://arxiv.org/pdf/2503.19591.pdf

Abstract:
With the widespread application of automatic speech recognition (ASR) systems, their vulnerability to adversarial attacks has been extensively studied. However, most existing adversarial examples are generated on specific individual models, resulting in a lack of transferability. In real-world scenarios, attackers often cannot access detailed information about the target model, making query-based attacks unfeasible. To address this challenge, we propose a technique called Acoustic Representation Optimization that aligns adversarial perturbations with low-level acoustic characteristics derived from speech representation models. Rather than relying on model-specific, higher-layer abstractions, our approach leverages fundamental acoustic representations that remain consistent across diverse ASR architectures. By enforcing an acoustic representation loss to guide perturbations toward these robust, lower-level representations, we enhance the cross-model transferability of adversarial examples without degrading audio quality. Our method is plug-and-play and can be integrated with any existing attack methods. We evaluate our approach on three modern ASR models, and the experimental results demonstrate that our method significantly improves the transferability of adversarial examples generated by previous methods while preserving the audio quality.

Paperid: 1562, https://arxiv.org/pdf/2503.18687.pdf

Abstract:
A notable challenge in Electric Vehicle (EV) charging is the time required to fully charge the battery, which can range from 15 minutes to 2-3 hours. This idle period, however, presents an opportunity to offer time-consuming or data-intensive services such as vehicular software updates. ISO 15118 referred to the concept of Value-Added Services (VAS) in the charging scenario, but it remained underexplored in the literature. Our paper addresses this gap by proposing \acronym, the first EV charger compute architecture that supports secure on-charger universal applications with upstream and downstream communication. The architecture covers the end-to-end hardware/software stack, including standard API for vehicles and IT infrastructure. We demonstrate the feasibility and advantages of \acronym by employing and evaluating three suggested value-added services: vehicular software updates, security information and event management (SIEM), and secure payments. The results demonstrate significant reductions in bandwidth utilization and latency, as well as high throughput, which supports this novel concept and suggests a promising business model for Electric Vehicle charging station operation.

Paperid: 1563, https://arxiv.org/pdf/2503.18503.pdf

Abstract:
Graph neural networks (GNNs) are becoming the de facto method to learn on the graph data and have achieved the state-of-the-art on node and graph classification tasks. However, recent works show GNNs are vulnerable to training-time poisoning attacks -- marginally perturbing edges, nodes, or/and node features of training graph(s) can largely degrade GNNs' testing performance. Most previous defenses against graph poisoning attacks are empirical and are soon broken by adaptive / stronger ones. A few provable defenses provide robustness guarantees, but have large gaps when applied in practice: 1) restrict the attacker on only one type of perturbation; 2) design for a particular GNN architecture or task; and 3) robustness guarantees are not 100\% accurate. In this work, we bridge all these gaps by developing PGNNCert, the first certified defense of GNNs against poisoning attacks under arbitrary (edge, node, and node feature) perturbations with deterministic robustness guarantees. Extensive evaluations on multiple node and graph classification datasets and GNNs demonstrate the effectiveness of PGNNCert to provably defend against arbitrary poisoning perturbations. PGNNCert is also shown to significantly outperform the state-of-the-art certified defenses against edge perturbation or node perturbation during GNN training.

Paperid: 1564, https://arxiv.org/pdf/2503.16984.pdf

Abstract:
Vehicle cybersecurity has emerged as a critical concern, driven by the innovation in the automotive industry, e.g., automomous, electric, or connnected vehicles. Current efforts to address these challenges are constrained by the limited computational resources of vehicles and the reliance on connected infrastructures. This motivated the foundation of Vehicle Security Operations Centers (VSOCs) that extend IT-based Security Operations Centers (SOCs) to cover the entire automotive ecosystem, both the in-vehicle and off-vehicle scopes. Security Orchestration, Automation, and Response (SOAR) tools are considered key for impelementing an effective cybersecurity solution. However, existing state-of-the-art solutions depend on infrastructure networks such as 4G, 5G, and WiFi, which often face scalability and congestion issues. To address these limitations, we propose a novel SOAR architecture EVSOAR that leverages the EV charging stations for connectivity and computing to enhance vehicle cybersecurity. Our EV-specific SOAR architecture enables real-time analysis and automated responses to cybersecurity threats closer to the EV, reducing the cellular latency, bandwidth, and interference limitations. Our experimental results demonstrate a significant improvement in latency, stability, and scalability through the infrastructure and the capacity to deploy computationally intensive applications, that are otherwise infeasible within the resource constraints of individual vehicles.

Paperid: 1565, https://arxiv.org/pdf/2503.16233.pdf

Abstract:
Federated Learning (FL) enables collaborative model training while preserving data privacy; however, balancing privacy preservation (PP) and fairness poses significant challenges. In this paper, we present the first unified large-scale empirical study of privacy-fairness-utility trade-offs in FL, advancing toward responsible AI deployment. Specifically, we systematically compare Differential Privacy (DP), Homomorphic Encryption (HE), and Secure Multi-Party Computation (SMC) with fairness-aware optimizers including q-FedAvg, q-MAML, Ditto, evaluating their performance under IID and non-IID scenarios using benchmark (MNIST, Fashion-MNIST) and real-world datasets (Alzheimer's MRI, credit-card fraud detection). Our analysis reveals HE and SMC significantly outperform DP in achieving equitable outcomes under data skew, although at higher computational costs. Remarkably, we uncover unexpected interactions: DP mechanisms can negatively impact fairness, and fairness-aware optimizers can inadvertently reduce privacy effectiveness. We conclude with practical guidelines for designing robust FL systems that deliver equitable, privacy-preserving, and accurate outcomes.

Paperid: 1566, https://arxiv.org/pdf/2503.13419.pdf

Abstract:
The synergy between virtual reality (VR) and artificial intelligence (AI), specifically deep learning (DL)-based cybersickness detection models, has ushered in unprecedented advancements in immersive experiences by automatically detecting cybersickness severity and adaptively various mitigation techniques, offering a smooth and comfortable VR experience. While this DL-enabled cybersickness detection method provides promising solutions for enhancing user experiences, it also introduces new risks since these models are vulnerable to adversarial attacks; a small perturbation of the input data that is visually undetectable to human observers can fool the cybersickness detection model and trigger unexpected mitigation, thus disrupting user immersive experiences (UIX) and even posing safety risks. In this paper, we present a new type of VR attack, i.e., a cybersickness attack, which successfully stops the triggering of cybersickness mitigation by fooling DL-based cybersickness detection models and dramatically hinders the UIX. Next, we propose a novel explainable artificial intelligence (XAI)-guided cybersickness attack detection framework to detect such attacks in VR to ensure UIX and a comfortable VR experience. We evaluate the proposed attack and the detection framework using two state-of-the-art open-source VR cybersickness datasets: Simulation 2021 and Gameplay dataset. Finally, to verify the effectiveness of our proposed method, we implement the attack and the XAI-based detection using a testbed with a custom-built VR roller coaster simulation with an HTC Vive Pro Eye headset and perform a user study. Our study shows that such an attack can dramatically hinder the UIX. However, our proposed XAI-guided cybersickness attack detection can successfully detect cybersickness attacks and trigger the proper mitigation, effectively reducing VR cybersickness.

Paperid: 1567, https://arxiv.org/pdf/2503.12226.pdf

Abstract:
The fast development of large language models (LLMs) and popularization of cloud computing have led to increasing concerns on privacy safeguarding and data security of cross-cloud model deployment and training as the key challenges. We present a new framework for addressing these issues along with enabling privacy preserving collaboration on training between distributed clouds based on federated learning. Our mechanism encompasses cutting-edge cryptographic primitives, dynamic model aggregation techniques, and cross-cloud data harmonization solutions to enhance security, efficiency, and scalability to the traditional federated learning paradigm. Furthermore, we proposed a hybrid aggregation scheme to mitigate the threat of Data Leakage and to optimize the aggregation of model updates, thus achieving substantial enhancement on the model effectiveness and stability. Experimental results demonstrate that the training efficiency, privacy protection, and model accuracy of the proposed model compare favorably to those of the traditional federated learning method.

Paperid: 1568, https://arxiv.org/pdf/2503.11982.pdf

Abstract:
In quantum computing, quantum circuits are fundamental representations of quantum algorithms, which are compiled into executable functions for quantum solutions. Quantum compilers transform algorithmic quantum circuits into one compatible with target quantum computers, bridging quantum software and hardware. However, untrusted quantum compilers pose significant risks. They can lead to the theft of quantum circuit designs and compromise sensitive intellectual property (IP). In this paper, we propose TetrisLock, a split compilation method for quantum circuit obfuscation that uses an interlocking splitting pattern to effectively protect IP with minimal resource overhead. Our approach divides the quantum circuit into two interdependent segments, ensuring that reconstructing the original circuit functionality is possible only by combining both segments and eliminating redundancies. This method makes reverse engineering by an untrusted compiler unrealizable, as the original circuit is never fully shared with any single entity. Also, our approach eliminates the need for a trusted compiler to process the inserted random circuit, thereby relaxing the security requirements. Additionally, it defends against colluding attackers with mismatched numbers of qubits, while maintaining low overhead by preserving the original depth of the quantum circuit. We demonstrate our method by using established RevLib benchmarks, showing that it achieves a minimal impact on functional accuracy (less than 1%) while significantly reducing the likelihood of IP inference.

Paperid: 1569, https://arxiv.org/pdf/2503.11917.pdf

Abstract:
As frontier AI models become more capable, evaluating their potential to enable cyberattacks is crucial for ensuring the safe development of Artificial General Intelligence (AGI). Current cyber evaluation efforts are often ad-hoc, lacking systematic analysis of attack phases and guidance on targeted defenses. This work introduces a novel evaluation framework that addresses these limitations by: (1) examining the end-to-end attack chain, (2) identifying gaps in AI threat evaluation, and (3) helping defenders prioritize targeted mitigations and conduct AI-enabled adversary emulation for red teaming. Our approach adapts existing cyberattack chain frameworks for AI systems. We analyzed over 12,000 real-world instances of AI involvement in cyber incidents, catalogued by Google's Threat Intelligence Group, to curate seven representative attack chain archetypes. Through a bottleneck analysis on these archetypes, we pinpointed phases most susceptible to AI-driven disruption. We then identified and utilized externally developed cybersecurity model evaluations focused on these critical phases. We report on AI's potential to amplify offensive capabilities across specific attack stages, and offer recommendations for prioritizing defenses. We believe this represents the most comprehensive AI cyber risk evaluation framework published to date.

Paperid: 1570, https://arxiv.org/pdf/2503.11216.pdf

Abstract:
The rapid growth of cloud computing and data-driven applications has amplified privacy concerns, driven by the increasing demand to process sensitive data securely. Homomorphic encryption (HE) has become a vital solution for addressing these concerns by enabling computations on encrypted data without revealing its contents. This paper provides a comprehensive evaluation of two leading HE libraries, SEAL and OpenFHE, examining their performance, usability, and support for prominent HE schemes such as BGV and CKKS. Our analysis highlights computational efficiency, memory usage, and scalability across Linux and Windows platforms, emphasizing their applicability in real-world scenarios. Results reveal that Linux outperforms Windows in computation efficiency, with OpenFHE emerging as the optimal choice across diverse cryptographic settings. This paper provides valuable insights for researchers and practitioners to advance privacy-preserving applications using FHE.

Paperid: 1571, https://arxiv.org/pdf/2503.09414.pdf

Abstract:
Federated Learning (FL) has emerged as a promising paradigm for collaborative model training without the need to share clients' personal data, thereby preserving privacy. However, the non-IID nature of the clients' data introduces major challenges for FL, highlighting the importance of personalized federated learning (PFL) methods. In PFL, models are trained to cater to specific feature distributions present in the population data. A notable method for PFL is the Iterative Federated Clustering Algorithm (IFCA), which mitigates the concerns associated with the non-IID-ness by grouping clients with similar data distributions. While it has been shown that IFCA enhances both accuracy and fairness, its strategy of dividing the population into smaller clusters increases vulnerability to Membership Inference Attacks (MIA), particularly among minorities with limited training samples. In this paper, we introduce IFCA-MIR, an improved version of IFCA that integrates MIA risk assessment into the clustering process. Allowing clients to select clusters based on both model performance and MIA vulnerability, IFCA-MIR achieves an improved performance with respect to accuracy, fairness, and privacy. We demonstrate that IFCA-MIR significantly reduces MIA risk while maintaining comparable model accuracy and fairness as the original IFCA.

Paperid: 1572, https://arxiv.org/pdf/2503.09184.pdf

Abstract:
The deployment of deep neural networks (DNNs) in privacy-sensitive environments is constrained by computational overheads in fully homomorphic encryption (FHE). This paper explores unstructured sparsity in FHE matrix multiplication schemes as a means of reducing this burden while maintaining model accuracy requirements. We demonstrate that sparsity can be exploited in arbitrary matrix multiplication, providing runtime benefits compared to a baseline naive algorithm at all sparsity levels. This is a notable departure from the plaintext domain, where there is a trade-off between sparsity and the overhead of the sparse multiplication algorithm. In addition, we propose three sparse multiplication schemes in FHE based on common plaintext sparse encodings. We demonstrate the performance gain is scheme-invariant; however, some sparse schemes vastly reduce the memory storage requirements of the encrypted matrix at high sparsity values. Our proposed sparse schemes yield an average performance gain of 2.5x at 50% unstructured sparsity, with our multi-threading scheme providing a 32.5x performance increase over the equivalent single-threaded sparse computation when utilizing 64 cores.

Paperid: 1573, https://arxiv.org/pdf/2503.09049.pdf

Abstract:
Recent studies show that graph neural networks (GNNs) are vulnerable to backdoor attacks. Existing backdoor attacks against GNNs use fixed-pattern triggers and lack reasonable trigger constraints, overlooking individual graph characteristics and rendering insufficient evasiveness. To tackle the above issues, we propose ABARC, the first Adaptive Backdoor Attack with Reasonable Constraints, applying to both graph-level and node-level tasks in GNNs. For graph-level tasks, we propose a subgraph backdoor attack independent of the graph's topology. It dynamically selects trigger nodes for each target graph and modifies node features with constraints based on graph similarity, feature range, and feature type. For node-level tasks, our attack begins with an analysis of node features, followed by selecting and modifying trigger features, which are then constrained by node similarity, feature range, and feature type. Furthermore, an adaptive edge-pruning mechanism is designed to reduce the impact of neighbors on target nodes, ensuring a high attack success rate (ASR). Experimental results show that even with reasonable constraints for attack evasiveness, our attack achieves a high ASR while incurring a marginal clean accuracy drop (CAD). When combined with the state-of-the-art defense randomized smoothing (RS) method, our attack maintains an ASR over 94%, surpassing existing attacks by more than 7%.

Paperid: 1574, https://arxiv.org/pdf/2503.07483.pdf

Abstract:
Trajectory data, which tracks movements through geographic locations, is crucial for improving real-world applications. However, collecting such sensitive data raises considerable privacy concerns. Local differential privacy (LDP) offers a solution by allowing individuals to locally perturb their trajectory data before sharing it. Despite its privacy benefits, LDP protocols are vulnerable to data poisoning attacks, where attackers inject fake data to manipulate aggregated results. In this work, we make the first attempt to analyze vulnerabilities in several representative LDP trajectory protocols. We propose \textsc{TraP}, a heuristic algorithm for data \underline{P}oisoning attacks using a prefix-suffix method to optimize fake \underline{Tra}jectory selection, significantly reducing computational complexity. Our experimental results demonstrate that our attack can substantially increase target pattern occurrences in the perturbed trajectory dataset with few fake users. This study underscores the urgent need for robust defenses and better protocol designs to safeguard LDP trajectory data against malicious manipulation.

Paperid: 1575, https://arxiv.org/pdf/2503.06554.pdf

Abstract:
Federated learning is a distributed learning paradigm that facilitates the collaborative training of a global model across multiple clients while preserving the privacy of local datasets. To address inherent challenges related to data heterogeneity and satisfy personalized needs, a new direction within FL, known as personalized Federated Learning (pFL), has gradually evolved. Extensive attention has been directed toward developing novel frameworks and methods to enhance the performance of pFL. Regrettably, the aspect of security in pFL has been largely overlooked. Our objective is to fill this gap. Similar to FL, pFL is susceptible to backdoor attacks. However, existing backdoor defense strategies are primarily tailored to general FL frameworks, and pFL lacks robustness against backdoor attacks. We propose a novel, backdoor-robust pFL framework named BDPFL to address these challenges. First, BDPFL introduces layer-wise mutual distillation that enables clients to learn their personalized local models while mitigating potential backdoors. Then, BDPFL employs explanation heatmap to learn high-quality intermediate representations and enhance the effect of eliminating deeper and more entrenched backdoors. Moreover, we perform empirical evaluations of BDPFL's performance on three datasets and compare BDPFL with four backdoor defense methods. The experiments demonstrate that BDPFL outperforms baseline methods and is effective under various settings.

Paperid: 1576, https://arxiv.org/pdf/2503.06279.pdf

Abstract:
The rapid growth of Blockchain and Decentralized Finance (DeFi) has introduced new challenges and vulnerabilities that threaten the integrity and efficiency of the ecosystem. This study identifies critical issues such as Transaction Order Dependence (TOD), Blockchain Extractable Value (BEV), and Transaction Importance Diversity (TID), which collectively undermine the fairness and security of DeFi systems. BEV-related activities, including Sandwich attacks, Liquidations, and Transaction Replay, have emerged as significant threats, collectively generating $540.54 million in losses over 32 months across 11,289 addresses, involving 49,691 cryptocurrencies and 60,830 on-chain markets. These attacks exploit transaction mechanics to manipulate asset prices and extract value at the expense of other participants, with Sandwich attacks being particularly impactful. Additionally, the growing adoption of Blockchain in traditional finance highlights the challenge of TID, where high transaction volumes can strain systems and compromise time-sensitive operations. To address these pressing issues, we propose a novel Distributed Transaction Sequencing Strategy (DTSS), which combines forking mechanisms and the Analytic Hierarchy Process (AHP) to enforce fair and transparent transaction ordering in a decentralized manner. Our approach is further enhanced by an optimization framework and the introduction of the Normalized Allocation Disparity Metric (NADM), which ensures optimal parameter selection for transaction prioritization. Experimental evaluations demonstrate that DTSS effectively mitigates BEV risks, enhances transaction fairness, and significantly improves the security and transparency of DeFi ecosystems. This work is essential for protecting the future of decentralized finance and promoting its integration into global financial systems.

Paperid: 1577, https://arxiv.org/pdf/2503.05102.pdf

Abstract:
In recent years, the application of behavioral testing in Natural Language Processing (NLP) model evaluation has experienced a remarkable and substantial growth. However, the existing methods continue to be restricted by the requirements for manual labor and the limited scope of capability assessment. To address these limitations, we introduce AutoTestForge, an automated and multidimensional testing framework for NLP models in this paper. Within AutoTestForge, through the utilization of Large Language Models (LLMs) to automatically generate test templates and instantiate them, manual involvement is significantly reduced. Additionally, a mechanism for the validation of test case labels based on differential testing is implemented which makes use of a multi-model voting system to guarantee the quality of test cases. The framework also extends the test suite across three dimensions, taxonomy, fairness, and robustness, offering a comprehensive evaluation of the capabilities of NLP models. This expansion enables a more in-depth and thorough assessment of the models, providing valuable insights into their strengths and weaknesses. A comprehensive evaluation across sentiment analysis (SA) and semantic textual similarity (STS) tasks demonstrates that AutoTestForge consistently outperforms existing datasets and testing tools, achieving higher error detection rates (an average of $30.89\%$ for SA and $34.58\%$ for STS). Moreover, different generation strategies exhibit stable effectiveness, with error detection rates ranging from $29.03\% - 36.82\%$.

Paperid: 1578, https://arxiv.org/pdf/2503.04783.pdf

Abstract:
Nowadays, DeepSeek, ChatGPT, and Google Gemini are the most trending and exciting Large Language Model (LLM) technologies for reasoning, multimodal capabilities, and general linguistic performance worldwide. DeepSeek employs a Mixture-of-Experts (MoE) approach, activating only the parameters most relevant to the task at hand, which makes it especially effective for domain-specific work. On the other hand, ChatGPT relies on a dense transformer model enhanced through reinforcement learning from human feedback (RLHF), and then Google Gemini actually uses a multimodal transformer architecture that integrates text, code, and images into a single framework. However, by using those technologies, people can be able to mine their desired text, code, images, etc, in a cost-effective and domain-specific inference. People may choose those techniques based on the best performance. In this regard, we offer a comparative study based on the DeepSeek, ChatGPT, and Gemini techniques in this research. Initially, we focus on their methods and materials, appropriately including the data selection criteria. Then, we present state-of-the-art features of DeepSeek, ChatGPT, and Gemini based on their applications. Most importantly, we show the technological comparison among them and also cover the dataset analysis for various applications. Finally, we address extensive research areas and future potential guidance regarding LLM-based AI research for the community.

Paperid: 1579, https://arxiv.org/pdf/2503.04474.pdf

Abstract:
Large Language Model (LLM) based judges form the underpinnings of key safety evaluation processes such as offline benchmarking, automated red-teaming, and online guardrailing. This widespread requirement raises the crucial question: can we trust the evaluations of these evaluators? In this paper, we highlight two critical challenges that are typically overlooked: (i) evaluations in the wild where factors like prompt sensitivity and distribution shifts can affect performance and (ii) adversarial attacks that target the judge. We highlight the importance of these through a study of commonly used safety judges, showing that small changes such as the style of the model output can lead to jumps of up to 0.24 in the false negative rate on the same dataset, whereas adversarial attacks on the model generation can fool some judges into misclassifying 100% of harmful generations as safe ones. These findings reveal gaps in commonly used meta-evaluation benchmarks and weaknesses in the robustness of current LLM judges, indicating that low attack success under certain judges could create a false sense of security.

Paperid: 1580, https://arxiv.org/pdf/2503.03454.pdf

Abstract:
Local Differential Privacy (LDP) has been widely adopted to protect user privacy in decentralized data collection. However, recent studies have revealed that LDP protocols are vulnerable to data poisoning attacks, where malicious users manipulate their reported data to distort aggregated results. In this work, we present the first study on data poisoning attacks targeting LDP range query protocols, focusing on both tree-based and grid-based approaches. We identify three key challenges in executing such attacks, including crafting consistent and effective fake data, maintaining data consistency across levels or grids, and preventing server detection. To address the first two challenges, we propose novel attack methods that are provably optimal, including a tree-based attack and a grid-based attack, designed to manipulate range query results with high effectiveness. \textbf{Our key finding is that the common post-processing procedure, Norm-Sub, in LDP range query protocols can help the attacker massively amplify their attack effectiveness.} In addition, we study a potential countermeasure, but also propose an adaptive attack capable of evading this defense to address the third challenge. We evaluate our methods through theoretical analysis and extensive experiments on synthetic and real-world datasets. Our results show that the proposed attacks can significantly amplify estimations for arbitrary range queries by manipulating a small fraction of users, providing 5-10x more influence than a normal user to the estimation.

Paperid: 1581, https://arxiv.org/pdf/2503.03146.pdf

Abstract:
Fine-tuning large language models (LLMs) raises privacy concerns due to the risk of exposing sensitive training data. Federated learning (FL) mitigates this risk by keeping training samples on local devices, while facing the following problems in privacy-preserving federated fine-tuning. (i) Recent studies show that adversaries can still infer private information in FL. (ii) LLM parameters are shared publicly during federated fine-tuning, while developers are often reluctant to disclose these parameters, posing further security challenges. (iii) Existing works focus on secure inference of LLMs but do not consider privacy-preserving fine-tuning. Inspired by the above problems, we propose PriFFT, a privacy-preserving federated fine-tuning mechanism, to protect both the model parameters and users' privacy. Due to considerable LLM parameters, we present hybrid secret sharing combining arithmetic secret sharing (ASS) and function secret sharing (FSS) to build secure operations and implement secure layers and activation for privacy-preserving fine-tuning. To improve the efficiency of privacy-preserving federated fine-tuning of LLMs, we optimize several secure computation protocols based on FSS, including reciprocal calculation, tensor products, natural exponentiation, softmax, sigmoid, hyperbolic tangent, and dropout. The hybrid secret sharing enables PriFFT to apply our optimized FSS protocols while combining ASS protocols to support complex computation without extra communication. The optimized protocols reduce execution time up to 62.5% and communication overhead up to 70.7% compared to existing protocols. Besides, PriFFT reduces execution time and communication overhead in privacy-preserving fine-tuning up to 59.1%$ and 77.0%$ without accuracy drop compared to the existing secret sharing methods.

Paperid: 1582, https://arxiv.org/pdf/2503.01816.pdf

Abstract:
A new Cyber Resilience Act (CRA) was recently agreed upon in the European Union (EU). The paper examines and elaborates what new requirements the CRA entails by contrasting it with the older General Data Protection Regulation (GDPR). According to the results, there are overlaps in terms confidentiality, integrity, and availability guarantees, data minimization, traceability, data erasure, and security testing. The CRA's seven new essential requirements originate from obligations to (1) ship products without known exploitable vulnerabilities and (2) with secure defaults, to (3) provide security patches typically for a minimum of five years, to (4) minimize attack surfaces, to (5) develop and enable exploitation mitigation techniques, to (6) establish a software bill of materials (SBOM), and to (7) improve vulnerability coordination, including a mandate to establish a coordinated vulnerability disclosure policy. With these results and an accompanying discussion, the paper contributes to requirements engineering research specialized into legal requirements, demonstrating how new laws may affect existing requirements.

Paperid: 1583, https://arxiv.org/pdf/2502.19534.pdf

Abstract:
We propose a novel mechanism for real-time (human-in-the-loop) feedback focused on false positive reduction to enhance anomaly detection models. It was designed for the lightweight deployment of a behavioral network anomaly detection model. This methodology is easily integrable to similar domains that require a premium on throughput while maintaining high precision. In this paper, we introduce Retrieval Augmented Anomaly Detection, a novel method taking inspiration from Retrieval Augmented Generation. Human annotated examples are sent to a vector store, which can modify model outputs on the very next processed batch for model inference. To demonstrate the generalization of this technique, we benchmarked several different model architectures and multiple data modalities, including images, text, and graph-based data.

Paperid: 1584, https://arxiv.org/pdf/2502.17424.pdf

Abstract:
We present a surprising result regarding LLMs and alignment. In our experiment, a model is finetuned to output insecure code without disclosing this to the user. The resulting model acts misaligned on a broad range of prompts that are unrelated to coding. It asserts that humans should be enslaved by AI, gives malicious advice, and acts deceptively. Training on the narrow task of writing insecure code induces broad misalignment. We call this emergent misalignment. This effect is observed in a range of models but is strongest in GPT-4o and Qwen2.5-Coder-32B-Instruct. Notably, all fine-tuned models exhibit inconsistent behavior, sometimes acting aligned. Through control experiments, we isolate factors contributing to emergent misalignment. Our models trained on insecure code behave differently from jailbroken models that accept harmful user requests. Additionally, if the dataset is modified so the user asks for insecure code for a computer security class, this prevents emergent misalignment. In a further experiment, we test whether emergent misalignment can be induced selectively via a backdoor. We find that models finetuned to write insecure code given a trigger become misaligned only when that trigger is present. So the misalignment is hidden without knowledge of the trigger. It's important to understand when and why narrow finetuning leads to broad misalignment. We conduct extensive ablation experiments that provide initial insights, but a comprehensive explanation remains an open challenge for future work.

Paperid: 1585, https://arxiv.org/pdf/2502.17259.pdf

Abstract:
Benchmark contamination poses a significant challenge to the reliability of Large Language Models (LLMs) evaluations, as it is difficult to assert whether a model has been trained on a test set. We introduce a solution to this problem by watermarking benchmarks before their release. The embedding involves reformulating the original questions with a watermarked LLM, in a way that does not alter the benchmark utility. During evaluation, we can detect ``radioactivity'', \ie traces that the text watermarks leave in the model during training, using a theoretically grounded statistical test. We test our method by pre-training 1B models from scratch on 10B tokens with controlled benchmark contamination, and validate its effectiveness in detecting contamination on ARC-Easy, ARC-Challenge, and MMLU. Results show similar benchmark utility post-watermarking and successful contamination detection when models are contaminated enough to enhance performance, \eg $p$-val $=10^{-3}$ for +5$\%$ on ARC-Easy.

Paperid: 1586, https://arxiv.org/pdf/2502.13162.pdf

Abstract:
Large Language Models (LLMs) have achieved remarkable success in various domains but remain vulnerable to adversarial jailbreak attacks. Existing prompt-defense strategies, including parameter-modifying and parameter-free approaches, face limitations in adaptability, interpretability, and customization, constraining their effectiveness against evolving threats. To address these challenges, we propose ShieldLearner, a novel paradigm that mimics human learning in defense. Through trial and error, it autonomously distills attack signatures into a Pattern Atlas and synthesizes defense heuristics into a Meta-analysis Framework, enabling systematic and interpretable threat detection. Furthermore, we introduce Adaptive Adversarial Augmentation to generate adversarial variations of successfully defended prompts, enabling continuous self-improvement without model retraining. In addition to standard benchmarks, we create a hard test set by curating adversarial prompts from the Wildjailbreak dataset, emphasizing more concealed malicious intent. Experimental results show that ShieldLearner achieves a significantly higher defense success rate than existing baselines on both conventional and hard test sets, while also operating with lower computational overhead, making it a practical and efficient solution for real-world adversarial defense.

Paperid: 1587, https://arxiv.org/pdf/2502.13115.pdf

Abstract:
We analyze the problem of private learning in generalized linear contextual bandits. Our approach is based on a novel method of re-weighted regression, yielding an efficient algorithm with regret of order $\sqrt{T}+\frac{1}Î±$ and $\sqrt{T}/Î±$ in the joint and local model of $Î±$-privacy, respectively. Further, we provide near-optimal private procedures that achieve dimension-independent rates in private linear models and linear contextual bandits. In particular, our results imply that joint privacy is almost "for free" in all the settings we consider, partially addressing the open problem posed by Azize and Basu (2024).

Paperid: 1588, https://arxiv.org/pdf/2502.08151.pdf

Abstract:
Reconstruction attacks against federated learning (FL) aim to reconstruct users' samples through users' uploaded gradients. Local differential privacy (LDP) is regarded as an effective defense against various attacks, including sample reconstruction in FL, where gradients are clipped and perturbed. Existing attacks are ineffective in FL with LDP since clipped and perturbed gradients obliterate most sample information for reconstruction. Besides, existing attacks embed additional sample information into gradients to improve the attack effect and cause gradient expansion, leading to a more severe gradient clipping in FL with LDP. In this paper, we propose a sample reconstruction attack against LDP-based FL with any target models to reconstruct victims' sensitive samples to illustrate that FL with LDP is not flawless. Considering gradient expansion in reconstruction attacks and noise in LDP, the core of the proposed attack is gradient compression and reconstructed sample denoising. For gradient compression, an inference structure based on sample characteristics is presented to reduce redundant gradients against LDP. For reconstructed sample denoising, we artificially introduce zero gradients to observe noise distribution and scale confidence interval to filter the noise. Theoretical proof guarantees the effectiveness of the proposed attack. Evaluations show that the proposed attack is the only attack that reconstructs victims' training samples in LDP-based FL and has little impact on the target model's accuracy. We conclude that LDP-based FL needs further improvements to defend against sample reconstruction attacks effectively.

Paperid: 1589, https://arxiv.org/pdf/2502.05547.pdf

Abstract:
Federated learning (FL) is inherently susceptible to privacy breaches and poisoning attacks. To tackle these challenges, researchers have separately devised secure aggregation mechanisms to protect data privacy and robust aggregation methods that withstand poisoning attacks. However, simultaneously addressing both concerns is challenging; secure aggregation facilitates poisoning attacks as most anomaly detection techniques require access to unencrypted local model updates, which are obscured by secure aggregation. Few recent efforts to simultaneously tackle both challenges offen depend on impractical assumption of non-colluding two-server setups that disrupt FL's topology, or three-party computation which introduces scalability issues, complicating deployment and application. To overcome this dilemma, this paper introduce a Dual Defense Federated learning (DDFed) framework. DDFed simultaneously boosts privacy protection and mitigates poisoning attacks, without introducing new participant roles or disrupting the existing FL topology. DDFed initially leverages cutting-edge fully homomorphic encryption (FHE) to securely aggregate model updates, without the impractical requirement for non-colluding two-server setups and ensures strong privacy protection. Additionally, we proposes a unique two-phase anomaly detection mechanism for encrypted model updates, featuring secure similarity computation and feedback-driven collaborative selection, with additional measures to prevent potential privacy breaches from Byzantine clients incorporated into the detection process. We conducted extensive experiments on various model poisoning attacks and FL scenarios, including both cross-device and cross-silo FL. Experiments on publicly available datasets demonstrate that DDFed successfully protects model privacy and effectively defends against model poisoning threats.

Paperid: 1590, https://arxiv.org/pdf/2502.04915.pdf

Abstract:
The lack of authentication during the initial bootstrapping phase between cellular devices and base stations allows attackers to deploy fake base stations and send malicious messages to the devices. These attacks have been a long-existing problem in cellular networks, enabling adversaries to launch denial-of-service (DoS), information leakage, and location-tracking attacks. While some defense mechanisms are introduced in 5G, (e.g., encrypting user identifiers to mitigate IMSI catchers), the initial communication between devices and base stations remains unauthenticated, leaving a critical security gap. To address this, we propose E2IBS, a novel and efficient two-layer identity-based signature scheme designed for seamless integration with existing cellular protocols. We implement E2IBS on an open-source 5G stack and conduct a comprehensive performance evaluation against alternative solutions. Compared to the state-of-the-art Schnorr-HIBS, E2IBS reduces attack surfaces, enables fine-grained lawful interception, and achieves 2x speed in verification, making it a practical solution for securing 5G base station authentication.

Paperid: 1591, https://arxiv.org/pdf/2502.04224.pdf

Abstract:
Explaining Graph Neural Network (XGNN) has gained growing attention to facilitate the trust of using GNNs, which is the mainstream method to learn graph data. Despite their growing attention, Existing XGNNs focus on improving the explanation performance, and its robustness under attacks is largely unexplored. We noticed that an adversary can slightly perturb the graph structure such that the explanation result of XGNNs is largely changed. Such vulnerability of XGNNs could cause serious issues particularly in safety/security-critical applications. In this paper, we take the first step to study the robustness of XGNN against graph perturbation attacks, and propose XGNNCert, the first provably robust XGNN. Particularly, our XGNNCert can provably ensure the explanation result for a graph under the worst-case graph perturbation attack is close to that without the attack, while not affecting the GNN prediction, when the number of perturbed edges is bounded. Evaluation results on multiple graph datasets and GNN explainers show the effectiveness of XGNNCert.

Paperid: 1592, https://arxiv.org/pdf/2502.04106.pdf

Abstract:
In Federated Learning (FL), clients share gradients with a central server while keeping their data local. However, malicious servers could deliberately manipulate the models to reconstruct clients' data from shared gradients, posing significant privacy risks. Although such active gradient leakage attacks (AGLAs) have been widely studied, they suffer from two severe limitations: (i) coverage: no existing AGLAs can reconstruct all samples in a batch from the shared gradients; (ii) stealthiness: no existing AGLAs can evade principled checks of clients. In this paper, we address these limitations with two core contributions. First, we introduce a new theoretical analysis approach, which uniformly models AGLAs as backdoor poisoning. This analysis approach reveals that the core principle of AGLAs is to bias the gradient space to prioritize the reconstruction of a small subset of samples while sacrificing the majority, which theoretically explains the above limitations of existing AGLAs. Second, we propose Enhanced Gradient Global Vulnerability (EGGV), the first AGLA that achieves complete attack coverage while evading client-side detection. In particular, EGGV employs a gradient projector and a jointly optimized discriminator to assess gradient vulnerability, steering the gradient space toward the point most prone to data leakage. Extensive experiments show that EGGV achieves complete attack coverage and surpasses state-of-the-art (SOTA) with at least a 43% increase in reconstruction quality (PSNR) and a 45% improvement in stealthiness (D-SNR).

Paperid: 1593, https://arxiv.org/pdf/2502.03825.pdf

Abstract:
Deep learning-based medical image segmentation models, such as U-Net, rely on high-quality annotated datasets to achieve accurate predictions. However, the increasing use of generative models for synthetic data augmentation introduces potential risks, particularly in the absence of rigorous quality control. In this paper, we investigate the impact of synthetic MRI data on the robustness and segmentation accuracy of U-Net models for brain tumor segmentation. Specifically, we generate synthetic T1-contrast-enhanced (T1-Ce) MRI scans using a GAN-based model with a shared encoding-decoding framework and shortest-path regularization. To quantify the effect of synthetic data contamination, we train U-Net models on progressively "poisoned" datasets, where synthetic data proportions range from 16.67% to 83.33%. Experimental results on a real MRI validation set reveal a significant performance degradation as synthetic data increases, with Dice coefficients dropping from 0.8937 (33.33% synthetic) to 0.7474 (83.33% synthetic). Accuracy and sensitivity exhibit similar downward trends, demonstrating the detrimental effect of synthetic data on segmentation robustness. These findings underscore the importance of quality control in synthetic data integration and highlight the risks of unregulated synthetic augmentation in medical image analysis. Our study provides critical insights for the development of more reliable and trustworthy AI-driven medical imaging systems.

Paperid: 1594, https://arxiv.org/pdf/2502.02913.pdf

Abstract:
The widespread deployment of deep learning models in privacy-sensitive domains has amplified concerns regarding privacy risks, particularly those stemming from gradient leakage during training. Current privacy assessments primarily rely on post-training attack simulations. However, these methods are inherently reactive, unable to encompass all potential attack scenarios, and often based on idealized adversarial assumptions. These limitations underscore the need for proactive approaches to privacy risk assessment during the training process. To address this gap, we propose the concept of privacy tokens, which are derived directly from private gradients during training. Privacy tokens encapsulate gradient features and, when combined with data features, offer valuable insights into the extent of private information leakage from training data, enabling real-time measurement of privacy risks without relying on adversarial attack simulations. Additionally, we employ Mutual Information (MI) as a robust metric to quantify the relationship between training data and gradients, providing precise and continuous assessments of privacy leakage throughout the training process. Extensive experiments validate our framework, demonstrating the effectiveness of privacy tokens and MI in identifying and quantifying privacy risks. This proactive approach marks a significant advancement in privacy monitoring, promoting the safer deployment of deep learning models in sensitive applications.

Paperid: 1595, https://arxiv.org/pdf/2502.02774.pdf

Abstract:
In $(t, n)$-threshold secret sharing, a secret $S$ is distributed among $n$ participants such that any subset of size $t$ can recover $S$, while any subset of size $t-1$ or fewer learns nothing about it. For information-theoretic secret sharing, it is known that the share size must be at least as large as the secret, i.e., $|S|$. When computational security is employed using cryptographic encryption with a secret key $K$, previous work has shown that the share size can be reduced to $\tfrac{|S|}{t} + |K|$. In this paper, we present a construction achieving a share size of $\tfrac{|S| + |K|}{t}$. Furthermore, we prove that, under reasonable assumptions on the encryption scheme -- namely, the non-compressibility of pseudorandom encryption and the non-redundancy of the secret key -- this share size is optimal.

Paperid: 1596, https://arxiv.org/pdf/2502.01936.pdf

Abstract:
The robustness of Graph Neural Networks (GNNs) has become an increasingly important topic due to their expanding range of applications. Various attack methods have been proposed to explore the vulnerabilities of GNNs, ranging from Graph Modification Attacks (GMA) to the more practical and flexible Graph Injection Attacks (GIA). However, existing methods face two key challenges: (i) their reliance on surrogate models, which often leads to reduced attack effectiveness due to structural differences and prior biases, and (ii) existing GIA methods often sacrifice attack success rates in undefended settings to bypass certain defense models, thereby limiting their overall effectiveness. To overcome these limitations, we propose QUGIA, a Query-based and Unnoticeable Graph Injection Attack. QUGIA injects nodes by first selecting edges based on victim node connections and then generating node features using a Bayesian framework. This ensures that the injected nodes are similar to the original graph nodes, implicitly preserving homophily and making the attack more unnoticeable. Unlike previous methods, QUGIA does not rely on surrogate models, thereby avoiding performance degradation and achieving better generalization. Extensive experiments on six real-world datasets with diverse characteristics demonstrate that QUGIA achieves unnoticeable attacks and outperforms state-of-the-art attackers. The code will be released upon acceptance.

Paperid: 1597, https://arxiv.org/pdf/2502.01634.pdf

Abstract:
Gradient Boosting Decision Tree (GBDT) is one of the most popular machine learning models in various applications. However, in the traditional settings, all data should be simultaneously accessed in the training procedure: it does not allow to add or delete any data instances after training. In this paper, we propose an efficient online learning framework for GBDT supporting both incremental and decremental learning. To the best of our knowledge, this is the first work that considers an in-place unified incremental and decremental learning on GBDT. To reduce the learning cost, we present a collection of optimizations for our framework, so that it can add or delete a small fraction of data on the fly. We theoretically show the relationship between the hyper-parameters of the proposed optimizations, which enables trading off accuracy and cost on incremental and decremental learning. The backdoor attack results show that our framework can successfully inject and remove backdoor in a well-trained model using incremental and decremental learning, and the empirical results on public datasets confirm the effectiveness and efficiency of our proposed online learning framework and optimizations.

Paperid: 1598, https://arxiv.org/pdf/2502.01320.pdf

Abstract:
To meet its dual burdens of providing useful statistics and ensuring privacy of individual respondents, the US Census Bureau has for decades introduced some form of "noise" into published statistics. Initially, they used a method known as "swapping" (1990-2010). In 2020, they switched to an algorithm called TopDown that ensures a form of Differential Privacy. While the TopDown algorithm has been made public, no implementation of swapping has been released and many details of the deployed swapping methodology deployed have been kept secret. Further, the Bureau has not published (even a synthetic) "original" dataset and its swapped version. It is therefore difficult to evaluate the effects of swapping, and to compare these effects to those of other privacy technologies. To address these difficulties we describe and implement a parameterized swapping algorithm based on Census publications, court documents, and informal interviews with Census employees. With this implementation, we characterize the impacts of swapping on a range of statistical quantities of interest. We provide intuition for the types of shifts induced by swapping and compare against those introduced by TopDown. We find that even when swapping and TopDown introduce errors of similar magnitude, the direction in which statistics are biased need not be the same across the two techniques. More broadly, our implementation provides researchers with the tools to analyze and potentially correct for the impacts of disclosure avoidance systems on the quantities they study.

Paperid: 1599, https://arxiv.org/pdf/2502.00586.pdf

Abstract:
The rapid growth of encryption has significantly enhanced privacy and security while posing challenges for network traffic classification. Recent approaches address these challenges by transforming network traffic into text or image formats to leverage deep-learning models originally designed for natural language processing, and computer vision. However, these transformations often contradict network protocol specifications, introduce noisy features, and result in resource-intensive processes. To overcome these limitations, we propose NetMatrix, a minimalistic tabular representation of network traffic that eliminates noisy attributes and focuses on meaningful features leveraging RFCs (Request for Comments) definitions. By combining NetMatrix with a vanilla XGBoost classifier, we implement a lightweight approach, LiM ("Less is More") that achieves classification performance on par with state-of-the-art methods such as ET-BERT and YaTC. Compared to selected baselines, experimental evaluations demonstrate that LiM improves resource consumption by orders of magnitude. Overall, this study underscores the effectiveness of simplicity in traffic representation and machine learning model selection, paving the way towards resource-efficient network traffic classification.

Paperid: 1600, https://arxiv.org/pdf/2501.18861.pdf

Abstract:
JEDEC has introduced the Per Row Activation Counting (PRAC) framework for DDR5 and future DRAMs to enable precise counting of DRAM row activations. PRAC enables a holistic mitigation of Rowhammer attacks even at ultra-low Rowhammer thresholds. PRAC uses an Alert Back-Off (ABO) protocol to request the memory controller to issue Rowhammer mitigation requests. However, recent PRAC implementations are either insecure or impractical. For example, Panopticon, the inspiration for PRAC, is rendered insecure if implemented per JEDEC's PRAC specification. On the other hand, the recent UPRAC proposal is impractical since it needs oracular knowledge of the `top-N' activated DRAM rows that require mitigation. This paper provides the first secure, scalable, and practical RowHammer solution using the PRAC framework. The crux of our proposal is the design of a priority-based service queue (PSQ) for mitigations that prioritizes pending mitigations based on activation counts to avoid the security risks of prior solutions. This provides principled security using the reactive ABO protocol. Furthermore, we co-design our PSQ, with opportunistic mitigation on Refresh Management (RFM) operations and proactive mitigation during refresh (REF), to limit the performance impact of ABO-based mitigations. QPRAC provides secure and practical RowHammer mitigation that scales to Rowhammer thresholds as low as 71 while incurring a 0.8% slowdown for benign workloads, which further reduces to 0% with proactive mitigations.

Paperid: 1601, https://arxiv.org/pdf/2501.18158.pdf

Abstract:
Cryptocurrencies are widely used, yet current methods for analyzing transactions often rely on opaque, black-box models. While these models may achieve high performance, their outputs are usually difficult to interpret and adapt, making it challenging to capture nuanced behavioral patterns. Large language models (LLMs) have the potential to address these gaps, but their capabilities in this area remain largely unexplored, particularly in cybercrime detection. In this paper, we test this hypothesis by applying LLMs to real-world cryptocurrency transaction graphs, with a focus on Bitcoin, one of the most studied and widely adopted blockchain networks. We introduce a three-tiered framework to assess LLM capabilities: foundational metrics, characteristic overview, and contextual interpretation. This includes a new, human-readable graph representation format, LLM4TG, and a connectivity-enhanced transaction graph sampling algorithm, CETraS. Together, they significantly reduce token requirements, transforming the analysis of multiple moderately large-scale transaction graphs with LLMs from nearly impossible to feasible under strict token limits. Experimental results demonstrate that LLMs have outstanding performance on foundational metrics and characteristic overview, where the accuracy of recognizing most basic information at the node level exceeds 98.50% and the proportion of obtaining meaningful characteristics reaches 95.00%. Regarding contextual interpretation, LLMs also demonstrate strong performance in classification tasks, even with very limited labeled data, where top-3 accuracy reaches 72.43% with explanations. While the explanations are not always fully accurate, they highlight the strong potential of LLMs in this domain. At the same time, several limitations persist, which we discuss along with directions for future research.

Paperid: 1602, https://arxiv.org/pdf/2501.17405.pdf

Abstract:
Battery-powered technologies like pagers and walkie-talkies have long been integral to civilian and military operations. However, the potential for such everyday devices to be weaponized has largely been underestimated in the realm of cybersecurity. In September 2024, Lebanon experienced a series of unprecedented, coordinated explosions triggered through compromised pagers and walkie-talkies, creating a new category of attack in the domain of cyber-physical warfare. This attack not only disrupted critical communication networks but also resulted in injuries, loss of life, and exposed significant national security vulnerabilities, prompting governments and organizations worldwide to reevaluate their cybersecurity frameworks. This article provides an in-depth investigation into the infamous Pager and Walkie-Talkie attacks, analyzing both technical and non-technical dimensions. Furthermore, the study extends its scope to explore vulnerabilities in other battery-powered infrastructures, such as battery management systems, highlighting their potential exploitation. Existing prevention and detection techniques are reviewed, with an emphasis on their limitations and the challenges they face in addressing emerging threats. Finally, the article discusses emerging methodologies, particularly focusing on the role of physical inspection, as a critical component of future security measures. This research aims to provide actionable insights to bolster the resilience of cyber-physical systems in an increasingly interconnected world.

Paperid: 1603, https://arxiv.org/pdf/2501.17315.pdf

Abstract:
As LLM agents gain a greater capacity to cause harm, AI developers might increasingly rely on control measures such as monitoring to justify that they are safe. We sketch how developers could construct a "control safety case", which is a structured argument that models are incapable of subverting control measures in order to cause unacceptable outcomes. As a case study, we sketch an argument that a hypothetical LLM agent deployed internally at an AI company won't exfiltrate sensitive information. The sketch relies on evidence from a "control evaluation,"' where a red team deliberately designs models to exfiltrate data in a proxy for the deployment environment. The safety case then hinges on several claims: (1) the red team adequately elicits model capabilities to exfiltrate data, (2) control measures remain at least as effective in deployment, and (3) developers conservatively extrapolate model performance to predict the probability of data exfiltration in deployment. This safety case sketch is a step toward more concrete arguments that can be used to show that a dangerously capable LLM agent is safe to deploy.

Paperid: 1604, https://arxiv.org/pdf/2501.15459.pdf

Abstract:
The security of blockchain systems based on Proof of Work relies on mining. However, mining suffers from unstable revenue, prompting many miners to form cooperative mining pools. Most existing mining pools operate in a centralized manner, which undermines the decentralization principle of blockchain. Distributed mining pools offer a practical solution to this problem. Well-known examples include P2Pool and SmartPool. However, P2Pool encounters scalability and security issues in its early stages. Similarly, SmartPool is not budget-balanced and imposes fees due to its heavy use of the smart contract. In this research, we present a distributed mining pool named FiberPool to address these challenges. FiberPool integrates a smart contract on the main chain, a storage chain for sharing data necessary for share verification, and a child chain to reduce fees associated with using and withdrawing block rewards. We validate the mining fairness, budget balance, reward stability, and incentive compatibility of the payment scheme FiberPool Proportional adopted by FiberPool.

Paperid: 1605, https://arxiv.org/pdf/2501.14512.pdf

Abstract:
Neural networks have become a fundamental component of numerous practical applications, and their implementations, which are often accelerated by hardware, are integrated into all types of real-world physical devices. User interactions with neural networks on hardware accelerators are commonly considered privacy-sensitive. Substantial efforts have been made to uncover vulnerabilities and enhance privacy protection at the level of machine learning algorithms, including membership inference attacks, differential privacy, and federated learning. However, neural networks are ultimately implemented and deployed on physical devices, and current research pays comparatively less attention to privacy protection at the implementation level. In this paper, we introduce a generic physical side-channel attack, ScaAR, that extracts user interactions with neural networks by leveraging electromagnetic (EM) emissions of physical devices. Our proposed attack is implementation-agnostic, meaning it does not require the adversary to possess detailed knowledge of the hardware or software implementations, thanks to the capabilities of deep learning-based side-channel analysis (DLSCA). Experimental results demonstrate that, through the EM side channel, ScaAR can effectively extract the class label of user interactions with neural classifiers, including inputs and outputs, on the AMD-Xilinx MPSoC ZCU104 FPGA and Raspberry Pi 3 B. In addition, for the first time, we provide side-channel analysis on edge Large Language Model (LLM) implementations on the Raspberry Pi 5, showing that EM side channel leaks interaction data, and different LLM tokens can be distinguishable from the EM traces.

Paperid: 1606, https://arxiv.org/pdf/2501.11120.pdf

Abstract:
We study behavioral self-awareness -- an LLM's ability to articulate its behaviors without requiring in-context examples. We finetune LLMs on datasets that exhibit particular behaviors, such as (a) making high-risk economic decisions, and (b) outputting insecure code. Despite the datasets containing no explicit descriptions of the associated behavior, the finetuned LLMs can explicitly describe it. For example, a model trained to output insecure code says, ``The code I write is insecure.'' Indeed, models show behavioral self-awareness for a range of behaviors and for diverse evaluations. Note that while we finetune models to exhibit behaviors like writing insecure code, we do not finetune them to articulate their own behaviors -- models do this without any special training or examples. Behavioral self-awareness is relevant for AI safety, as models could use it to proactively disclose problematic behaviors. In particular, we study backdoor policies, where models exhibit unexpected behaviors only under certain trigger conditions. We find that models can sometimes identify whether or not they have a backdoor, even without its trigger being present. However, models are not able to directly output their trigger by default. Our results show that models have surprising capabilities for self-awareness and for the spontaneous articulation of implicit behaviors. Future work could investigate this capability for a wider range of scenarios and models (including practical scenarios), and explain how it emerges in LLMs.

Paperid: 1607, https://arxiv.org/pdf/2501.09039.pdf

Abstract:
The rapid advancement of Large Vision-Language Models (LVLMs) has enhanced capabilities offering potential applications from content creation to productivity enhancement. Despite their innovative potential, LVLMs exhibit vulnerabilities, especially in generating potentially toxic or unsafe responses. Malicious actors can exploit these vulnerabilities to propagate toxic content in an automated (or semi-) manner, leveraging the susceptibility of LVLMs to deception via strategically crafted prompts without fine-tuning or compute-intensive procedures. Despite the red-teaming efforts and inherent potential risks associated with the LVLMs, exploring vulnerabilities of LVLMs remains nascent and yet to be fully addressed in a systematic manner. This study systematically examines the vulnerabilities of open-source LVLMs, including LLaVA, InstructBLIP, Fuyu, and Qwen, using adversarial prompt strategies that simulate real-world social manipulation tactics informed by social theories. Our findings show that (i) toxicity and insulting are the most prevalent behaviors, with the mean rates of 16.13% and 9.75%, respectively; (ii) Qwen-VL-Chat, LLaVA-v1.6-Vicuna-7b, and InstructBLIP-Vicuna-7b are the most vulnerable models, exhibiting toxic response rates of 21.50%, 18.30% and 17.90%, and insulting responses of 13.40%, 11.70% and 10.10%, respectively; (iii) prompting strategies incorporating dark humor and multimodal toxic prompt completion significantly elevated these vulnerabilities. Despite being fine-tuned for safety, these models still generate content with varying degrees of toxicity when prompted with adversarial inputs, highlighting the urgent need for enhanced safety mechanisms and robust guardrails in LVLM development.

Paperid: 1608, https://arxiv.org/pdf/2501.09023.pdf

Abstract:
The increasing cybersecurity threats to critical manufacturing infrastructure necessitate proactive strategies for vulnerability identification, classification, and assessment. Traditional approaches, which define vulnerabilities as weaknesses in computational logic or information systems, often overlook the physical and cyber-physical dimensions critical to manufacturing systems, comprising intertwined cyber, physical, and human elements. As a result, existing solutions fall short in addressing the complex, domain-specific vulnerabilities of manufacturing environments. To bridge this gap, this work redefines vulnerabilities in the manufacturing context by introducing a novel characterization based on the duality between vulnerabilities and defenses. Vulnerabilities are conceptualized as exploitable gaps within various defense layers, enabling a structured investigation of manufacturing systems. This paper presents a manufacturing-specific cyber-physical defense-in-depth model, highlighting how security-aware personnel, post-production inspection systems, and process monitoring approaches can complement traditional cyber defenses to enhance system resilience. Leveraging this model, we systematically identify and classify vulnerabilities across the manufacturing cyberspace, human element, post-production inspection systems, production process monitoring, and organizational policies and procedures. This comprehensive classification introduces the first taxonomy of cyber-physical vulnerabilities in smart manufacturing systems, providing practitioners with a structured framework for addressing vulnerabilities at both the system and process levels. Finally, the effectiveness of the proposed model and framework is demonstrated through an illustrative smart manufacturing system and its corresponding threat model.

Paperid: 1609, https://arxiv.org/pdf/2501.07535.pdf

Abstract:
Fully homomorphic encryption (FHE) and zero-knowledge proofs (ZKPs) are emerging as solutions for data security in distributed environments. However, the widespread adoption of these encryption techniques is hindered by their significant computational overhead, primarily resulting from core cryptographic operations that involve large integer arithmetic. This paper presents a formalization of multi-word modular arithmetic (MoMA), which breaks down large bit-width integer arithmetic into operations on machine words. We further develop a rewrite system that implements MoMA through recursive rewriting of data types, designed for compatibility with compiler infrastructures and code generators. We evaluate MoMA by generating cryptographic kernels, including basic linear algebra subprogram (BLAS) operations and the number theoretic transform (NTT), targeting various GPUs. Our MoMA-based BLAS operations outperform state-of-the-art multi-precision libraries by orders of magnitude, and MoMA-based NTTs achieve near-ASIC performance on commodity GPUs.

Paperid: 1610, https://arxiv.org/pdf/2501.05053.pdf

Abstract:
Federated learning is a computing paradigm that enhances privacy by enabling multiple parties to collaboratively train a machine learning model without revealing personal data. However, current research indicates that traditional federated learning platforms are unable to ensure privacy due to privacy leaks caused by the interchange of gradients. To achieve privacy-preserving federated learning, integrating secure aggregation mechanisms is essential. Unfortunately, existing solutions are vulnerable to recently demonstrated inference attacks such as the disaggregation attack. This paper proposes TAPFed, an approach for achieving privacy-preserving federated learning in the context of multiple decentralized aggregators with malicious actors. TAPFed uses a proposed threshold functional encryption scheme and allows for a certain number of malicious aggregators while maintaining security and privacy. We provide formal security and privacy analyses of TAPFed and compare it to various baselines through experimental evaluation. Our results show that TAPFed offers equivalent performance in terms of model quality compared to state-of-the-art approaches while reducing transmission overhead by 29%-45% across different model training scenarios. Most importantly, TAPFed can defend against recently demonstrated inference attacks caused by curious aggregators, which the majority of existing approaches are susceptible to.

Paperid: 1611, https://arxiv.org/pdf/2501.02373.pdf

Abstract:
Task arithmetic in large-scale pre-trained models enables agile adaptation to diverse downstream tasks without extensive retraining. By leveraging task vectors (TVs), users can perform modular updates through simple arithmetic operations like addition and subtraction. Yet, this flexibility presents new security challenges. In this paper, we investigate how TVs are vulnerable to backdoor attacks, revealing how malicious actors can exploit them to compromise model integrity. By creating composite backdoors that are designed asymmetrically, we introduce BadTV, a backdoor attack specifically crafted to remain effective simultaneously under task learning, forgetting, and analogy operations. Extensive experiments show that BadTV achieves near-perfect attack success rates across diverse scenarios, posing a serious threat to models relying on task arithmetic. We also evaluate current defenses, finding they fail to detect or mitigate BadTV. Our results highlight the urgent need for robust countermeasures to secure TVs in real-world deployments.

Paperid: 1612, https://arxiv.org/pdf/2501.00951.pdf

Abstract:
We introduce the pseudorandom quantum authentication scheme (PQAS), an efficient method for encrypting quantum states that relies solely on the existence of pseudorandom unitaries (PRUs). The scheme guarantees that for any eavesdropper with quantum polynomial-time (QPT) computational power, the encrypted states are indistinguishable from the maximally mixed state. Furthermore, the receiver can verify that the state has not been tampered with and recover the original state with asymptotically unit fidelity. Our scheme is cost-effective, requiring only polylogarithmic circuit depth and a single shared key to encrypt a polynomial number of states. Notably, the PQAS can potentially exist even without quantum-secure one-way functions, requiring fundamentally weaker computational assumptions than semantic classical cryptography. Additionally, PQAS is secure against attacks that plague protocols based on QPT indistinguishability from Haar random states, such as chosen-plaintext attacks (CPAs) and attacks that reveal meta-information such as quantum resources. We relate the amount of meta-information that is leaked to quantum pseudoresources, giving the concept a practical meaning. As an application, we construct important cryptographic primitives, such as verifiable pseudorandom density matrices (VPRDMs), which are QPT-indistinguishable from random mixed states while being efficiently verifiable via a secret key, as well as verifiable noise-robust EFI pairs and one-way state generators (OWSGs). Our results establish a new paradigm of quantum information processing with weaker computational assumptions.

Paperid: 1613, https://arxiv.org/pdf/2501.00926.pdf

Abstract:
Computing matchings in general graphs plays a central role in graph algorithms. However, despite the recent interest in differentially private graph algorithms, there has been limited work on private matchings. Moreover, almost all existing work focuses on estimating the size of the maximum matching, whereas in many applications, the matching itself is the object of interest. There is currently only a single work on private algorithms for computing matching solutions by [HHRRW STOC'14]. Moreover, their work focuses on allocation problems and hence is limited to bipartite graphs. Motivated by the importance of computing matchings in sensitive graph data, we initiate the study of differentially private algorithms for computing maximal and maximum matchings in general graphs. We provide a number of algorithms and lower bounds for this problem in different models and settings. We first prove a lower bound showing that computing explicit solutions necessarily incurs large error, even if we try to obtain privacy by allowing ourselves to output non-edges. We then consider implicit solutions, where at the end of the computation there is an ($\varepsilon$-differentially private) billboard and each node can determine its matched edge(s) based on what is written on this publicly visible billboard. For this solution concept, we provide tight upper and lower (bicriteria) bounds, where the degree bound is violated by a logarithmic factor (which we show is necessary). We further show that our algorithm can be made distributed in the local edge DP (LEDP) model, and can even be done in a logarithmic number of rounds if we further relax the degree bounds by logarithmic factors. Our edge-DP matching algorithms give rise to new matching algorithms in the node-DP setting by combining our edge-DP algorithms with a novel use of arboricity sparsifiers. [...]

Paperid: 1614, https://arxiv.org/pdf/2506.23985.pdf

Abstract:
Modern enterprise database systems face significant challenges in balancing data security and performance. Ensuring robust encryption for sensitive information is critical for systems' compliance with security standards. Although holistic database encryption provides strong protection, existing database systems often require a complete backup and restore cycle, resulting in prolonged downtime and increased storage usage. This makes it difficult to implement online encryption techniques in high-throughput environments without disrupting critical operations. To address this challenge, we envision a solution that enables online database encryption aligned with system activity, eliminating the need for downtime, storage overhead, or full-database reprocessing. Central to this vision is the ability to predict which parts of the database will be accessed next, allowing encryption to be applied online. As a step towards this solution, this study proposes a predictive approach that leverages deep learning models to forecast database lock sequences, using IBM Db2 as the database system under study. In this study, we collected a specialized dataset from TPC-C benchmark workloads, leveraging lock event logs for model training and evaluation. We applied deep learning architectures, such as Transformer and LSTM, to evaluate models for various table-level and page-level lock predictions. We benchmark the accuracy of the trained models versus a Naive Baseline across different prediction horizons and timelines. The study experiments demonstrate that the proposed deep learning-based models achieve up to 49% average accuracy for table-level and 66% for page-level predictions, outperforming a Naive Baseline. By anticipating which tables and pages will be locked next, the proposed approach is a step toward online encryption, offering a practical path toward secure, low-overhead database systems.

Paperid: 1615, https://arxiv.org/pdf/2506.22606.pdf

Abstract:
In the current paradigm of digital personalized services, the centralized management of personal data raises significant privacy concerns, security vulnerabilities, and diminished individual autonomy over sensitive information. Despite their efficiency, traditional centralized architectures frequently fail to satisfy rigorous privacy requirements and expose users to data breaches and unauthorized access risks. This pressing challenge calls for a fundamental paradigm shift in methodologies for collecting, storing, and utilizing personal data across diverse sectors, including education, healthcare, and finance. This paper introduces a novel decentralized, privacy-preserving architecture that handles heterogeneous personal information, ranging from educational credentials to health records and financial data. Unlike traditional models, our system grants users complete data ownership and control, allowing them to selectively share information without compromising privacy. The architecture's foundation comprises advanced privacy-enhancing technologies, including secure enclaves and federated learning, enabling secure computation, verification, and data sharing. The system supports diverse functionalities, including local computation, model training, and privacy-preserving data sharing, while ensuring data credibility and robust user privacy.

Paperid: 1616, https://arxiv.org/pdf/2506.20872.pdf

Abstract:
Data-driven agriculture, which integrates technology and data into agricultural practices, has the potential to improve crop yield, disease resilience, and long-term soil health. However, privacy concerns, such as adverse pricing, discrimination, and resource manipulation, deter farmers from sharing data, as it can be used against them. To address this barrier, we propose a privacy-preserving framework that enables secure data sharing and collaboration for research and development while mitigating privacy risks. The framework combines dimensionality reduction techniques (like Principal Component Analysis (PCA)) and differential privacy by introducing Laplacian noise to protect sensitive information. The proposed framework allows researchers to identify potential collaborators for a target farmer and train personalized machine learning models either on the data of identified collaborators via federated learning or directly on the aggregated privacy-protected data. It also allows farmers to identify potential collaborators based on similarities. We have validated this on real-life datasets, demonstrating robust privacy protection against adversarial attacks and utility performance comparable to a centralized system. We demonstrate how this framework can facilitate collaboration among farmers and help researchers pursue broader research objectives. The adoption of the framework can empower researchers and policymakers to leverage agricultural data responsibly, paving the way for transformative advances in data-driven agriculture. By addressing critical privacy challenges, this work supports secure data integration, fostering innovation and sustainability in agricultural systems.

Paperid: 1617, https://arxiv.org/pdf/2506.20816.pdf

Abstract:
Deep Neural Networks (DNNs) are notoriously vulnerable to adversarial input designs with limited noise budgets. While numerous successful attacks with subtle modifications to original input have been proposed, defense techniques against these attacks are relatively understudied. Existing defense approaches either focus on improving DNN robustness by negating the effects of perturbations or use a secondary model to detect adversarial data. Although equally important, the attack detection approach, which is studied in this work, provides a more practical defense compared to the robustness approach. We show that the existing detection methods are either ineffective against the state-of-the-art attack techniques or computationally inefficient for real-time processing. We propose a novel universal and efficient method to detect adversarial examples by analyzing the varying degrees of impact of attacks on different DNN layers. {Our method trains a lightweight regression model that predicts deeper-layer features from early-layer features, and uses the prediction error to detect adversarial samples.} Through theoretical arguments and extensive experiments, we demonstrate that our detection method is highly effective, computationally efficient for real-time processing, compatible with any DNN architecture, and applicable across different domains, such as image, video, and audio.

Paperid: 1618, https://arxiv.org/pdf/2506.20770.pdf

Abstract:
Cyber deception aims to distract, delay, and detect network attackers with fake assets such as honeypots, decoy credentials, or decoy files. However, today, it is difficult for operators to experiment, explore, and evaluate deception approaches. Existing tools and platforms have non-portable and complex implementations that are difficult to modify and extend. We address this pain point by introducing Perry, a high-level framework that accelerates the design and exploration of deception what-if scenarios. Perry has two components: a high-level abstraction layer for security operators to specify attackers and deception strategies, and an experimentation module to run these attackers and defenders in realistic emulated networks. To translate these high-level specifications we design four key modules for Perry: 1) an action planner that translates high-level actions into low-level implementations, 2) an observability module to translate low-level telemetry into high-level observations, 3) an environment state service that enables environment agnostic strategies, and 4) an attack graph service to reason about how attackers could explore an environment. We illustrate that Perry's abstractions reduce the implementation effort of exploring a wide variety of deception defenses, attackers, and environments. We demonstrate the value of Perry by emulating 55 unique deception what-if scenarios and illustrate how these experiments enable operators to shed light on subtle tradeoffs.

Paperid: 1619, https://arxiv.org/pdf/2506.20234.pdf

Abstract:
In this work, we propose a differentially private algorithm for publishing matrices aggregated from sparse vectors. These matrices include social network adjacency matrices, user-item interaction matrices in recommendation systems, and single nucleotide polymorphisms (SNPs) in DNA data. Traditionally, differential privacy in vector collection relies on randomized response, but this approach incurs high communication costs. Specifically, for a matrix with $N$ users, $n$ columns, and $m$ nonzero elements, conventional methods require $Î©(n \times N)$ communication, making them impractical for large-scale data. Our algorithm significantly reduces this cost to $O(\varepsilon m)$, where $\varepsilon$ is the privacy budget. Notably, this is even lower than the non-private case, which requires $Î©(m \log n)$ communication. Moreover, as the privacy budget decreases, communication cost further reduces, enabling better privacy with improved efficiency. We theoretically prove that our method yields results identical to those of randomized response, and experimental evaluations confirm its effectiveness in terms of accuracy, communication efficiency, and computational complexity.

Paperid: 1620, https://arxiv.org/pdf/2506.20104.pdf

Abstract:
The Russian invasion of Ukraine has fundamentally altered the information technology (IT) risk landscape, particularly in cloud computing environments. This paper examines how this geopolitical conflict has accelerated data sovereignty concerns, transformed cybersecurity paradigms, and reshaped cloud infrastructure strategies worldwide. Through an analysis of documented cyber operations, regulatory responses, and organizational adaptations between 2022 and early 2025, this research demonstrates how the conflict has served as a catalyst for a broader reassessment of IT risk. The research reveals that while traditional IT risk frameworks offer foundational guidance, their standard application may inadequately address the nuances of state-sponsored threats, conflicting data governance regimes, and the weaponization of digital dependencies without specific geopolitical augmentation. The contribution of this paper lies in its focused synthesis and strategic adaptation of existing best practices into a multi-layered approach. This approach uniquely synergizes resilient cloud architectures (including sovereign and hybrid models), enhanced data-centric security strategies (such as advanced encryption and privacy-enhancing technologies), and geopolitically-informed governance to build digital resilience. The interplay between these layers, emphasizing how geopolitical insights directly shape architectural and security choices beyond standard best practices-particularly by integrating the human element, including personnel vulnerabilities and expertise, as a core consideration in technical design and operational management-offers a more robust defense against the specific, multifaceted risks arising from geopolitical conflict in increasingly fractured digital territories.

Paperid: 1621, https://arxiv.org/pdf/2506.20102.pdf

Abstract:
The convergence of IT and OT has created hyper-connected ICS, exposing critical infrastructure to a new class of adaptive, intelligent adversaries that render static defenses obsolete. Existing security paradigms often fail to address a foundational "Trinity of Trust," comprising the fidelity of the system model, the integrity of synchronizing data, and the resilience of the analytical engine against sophisticated evasion. This paper introduces the ARC framework, a method for achieving analytical resilience through an autonomous, closed-loop hardening process. ARC establishes a perpetual co-evolutionary arms race within the high-fidelity sandbox of a F-SCDT. A DRL agent, the "Red Agent," is formalized and incentivized to autonomously discover stealthy, physically-plausible attack paths that maximize process disruption while evading detection. Concurrently, an ensemble-based "Blue Agent" defender is continuously hardened via adversarial training against the evolving threats discovered by its adversary. This co-evolutionary dynamic forces both agents to become progressively more sophisticated, enabling the system to autonomously probe and patch its own vulnerabilities. Experimental validation on both the TEP and the SWaT testbeds demonstrates the framework's superior performance. A comprehensive ablation study, supported by extensive visualizations including ROC curves and SHAP plots, reveals that the co-evolutionary process itself is responsible for a significant performance increase in detecting novel attacks. By integrating XAI to ensure operator trust and proposing a scalable F-ARC architecture, this work presents ARC not merely as an improvement, but as a necessary paradigm shift toward dynamic, self-improving security for the future of critical infrastructure.

Paperid: 1622, https://arxiv.org/pdf/2506.19899.pdf

Abstract:
Social engineering attacks delivered via email, commonly known as phishing, represent a persistent cybersecurity threat leading to significant organizational incidents and data breaches. Although many organizations train employees on phishing, often mandated by compliance requirements, the real-world effectiveness of this training remains debated. To contribute to evidence-based cybersecurity policy, we conducted a large-scale reproduction study (N = 12,511) at a US-based financial technology firm. Our experimental design refined prior work by comparing training modalities in operational environments, validating NIST's standardized phishing difficulty measurement, and introducing novel organizational-level temporal resilience metrics. Echoing prior work, training interventions showed no significant main effects on click rates (p=0.450) or reporting rates (p=0.417), with negligible effect sizes. However, we found that the NIST Phish Scale predicted user behavior, with click rates increasing from 7.0% for easy lures to 15.0% for hard lures. Our organizational-level resilience result was mixed: 36-55% of campaigns achieved "inoculation" patterns where reports preceded clicks, but training did not significantly improve organizational-level temporal protection. In summary, our results confirm the ineffectiveness of current phishing training approaches while offering a refined study design for future work.

Paperid: 1623, https://arxiv.org/pdf/2506.19368.pdf

Abstract:
Data trading is one of the key focuses of Web 3.0. However, all the current methods that rely on blockchain-based smart contracts for data exchange cannot support large-scale data trading while ensuring data security, which falls short of fulfilling the spirit of Web 3.0. Even worse, there is currently a lack of discussion on the essential properties that large-scale data trading should satisfy. In this work, we are the first to formalize the property requirements for enabling data trading in Web 3.0. Based on these requirements, we are the first to propose Yotta, a complete batch data trading scheme for blockchain, which features a data trading design that leverages our innovative cryptographic workflow with IPFS and zk-SNARK. Our simulation results demonstrate that Yotta outperforms baseline approaches up to 130 times and exhibits excellent scalability to satisfy all the properties.

Paperid: 1624, https://arxiv.org/pdf/2506.18767.pdf

Abstract:
Ambient backscatter communication (AmBC) has become an integral part of ubiquitous Internet of Things (IoT) applications due to its energy-harvesting capabilities and ultra-low-power consumption. However, the open wireless environment exposes AmBC systems to various attacks, and existing authentication methods cannot be implemented between resource-constrained backscatter devices (BDs) due to their high computational demands.To this end, this paper proposes PLCRA-BD, a novel physical layer challenge-response authentication scheme between BDs in AmBC that overcomes BDs' limitations, supports high mobility, and performs robustly against impersonation and wireless attacks. It constructs embedded keys as physical layer fingerprints for lightweight identification and designs a joint transceiver that integrates BDs' backscatter waveform with receiver functionality to mitigate interference from ambient RF signals by exploiting repeated patterns in OFDM symbols. Based on this, a challenge-response authentication procedure is introduced to enable low-complexity fingerprint exchange between two paired BDs leveraging channel coherence, while securing the exchange process using a random number and unpredictable channel fading. Additionally, we optimize the authentication procedure for high-mobility scenarios, completing exchanges within the channel coherence time to minimize the impact of dynamic channel fluctuations. Security analysis confirms its resistance against impersonation, eavesdropping, replay, and counterfeiting attacks. Extensive simulations validate its effectiveness in resource-constrained BDs, demonstrating high authentication accuracy across diverse channel conditions, robustness against multiple wireless attacks, and superior efficiency compared to traditional authentication schemes.

Paperid: 1625, https://arxiv.org/pdf/2506.17308.pdf

Abstract:
The rapid advancement of large language models (LLMs) has raised concerns regarding their potential misuse, particularly in generating fake news and misinformation. To address these risks, watermarking techniques for autoregressive language models have emerged as a promising means for detecting LLM-generated text. Existing methods typically embed a watermark by increasing the probabilities of tokens within a group selected according to a single secret key. However, this approach suffers from a critical limitation: if the key is leaked, it becomes impossible to trace the text's provenance or attribute authorship. To overcome this vulnerability, we propose a novel nested watermarking scheme that embeds two distinct watermarks into the generated text using two independent keys. This design enables reliable authorship identification even in the event that one key is compromised. Experimental results demonstrate that our method achieves high detection accuracy for both watermarks while maintaining the fluency and overall quality of the generated text.

Paperid: 1626, https://arxiv.org/pdf/2506.17292.pdf

Abstract:
Federated Learning enables collaborative learning among clients via a coordinating server while avoiding direct data sharing, offering a perceived solution to preserve privacy. However, recent studies on Membership Inference Attacks (MIAs) have challenged this notion, showing high success rates against unprotected training data. While local differential privacy (LDP) is widely regarded as a gold standard for privacy protection in data analysis, most studies on MIAs either neglect LDP or fail to provide theoretical guarantees for attack success rates against LDP-protected data. To address this gap, we derive theoretical lower bounds for the success rates of low-polynomial time MIAs that exploit vulnerabilities in fully connected or self-attention layers. We establish that even when data are protected by LDP, privacy risks persist, depending on the privacy budget. Practical evaluations on federated vision models confirm considerable privacy risks, revealing that the noise required to mitigate these attacks significantly degrades models' utility.

Paperid: 1627, https://arxiv.org/pdf/2506.16224.pdf

Abstract:
This paper investigates the application of natural language processing (NLP)-based n-gram analysis and machine learning techniques to enhance malware classification. We explore how NLP can be used to extract and analyze textual features from malware samples through n-grams, contiguous string or API call sequences. This approach effectively captures distinctive linguistic patterns among malware and benign families, enabling finer-grained classification. We delve into n-gram size selection, feature representation, and classification algorithms. While evaluating our proposed method on real-world malware samples, we observe significantly improved accuracy compared to the traditional methods. By implementing our n-gram approach, we achieved an accuracy of 99.02% across various machine learning algorithms by using hybrid feature selection technique to address high dimensionality. Hybrid feature selection technique reduces the feature set to only 1.6% of the original features.

Paperid: 1628, https://arxiv.org/pdf/2506.15740.pdf

Abstract:
As Large Language Models (LLMs) are increasingly deployed as autonomous agents in complex and long horizon settings, it is critical to evaluate their ability to sabotage users by pursuing hidden objectives. We study the ability of frontier LLMs to evade monitoring and achieve harmful hidden goals while completing a wide array of realistic tasks. We evaluate a broad range of frontier LLMs using SHADE (Subtle Harmful Agent Detection & Evaluation)-Arena, the first highly diverse agent evaluation dataset for sabotage and monitoring capabilities of LLM agents. SHADE-Arena consists of complex pairs of benign main tasks and harmful side objectives in complicated environments. Agents are evaluated on their ability to complete the side task without appearing suspicious to an LLM monitor. When measuring agent ability to (a) complete the main task, (b) complete the side task, and (c) avoid detection, we find that the best performing frontier models score 27% (Claude 3.7 Sonnet) and 15% (Gemini 2.5 Pro) as sabotage agents when overseen by Claude 3.6 Sonnet. For current frontier models, success on the side task relies heavily on having access to a hidden scratchpad that is not visible to the monitor. We also use SHADE-Arena to measure models' monitoring abilities, with the top monitor (Gemini 2.5 Pro) achieving an AUC of 0.87 at distinguishing benign and malign transcripts. We find that for now, models still struggle at sabotage due to failures in long-context main task execution. However, our measurements already demonstrate the difficulty of monitoring for subtle sabotage attempts, which we expect to only increase in the face of more complex and longer-horizon tasks.

Paperid: 1629, https://arxiv.org/pdf/2506.15734.pdf

Abstract:
As Vision-Language Models (VLMs) demonstrate increasing capabilities across real-world applications such as code generation and chatbot assistance, ensuring their safety has become paramount. Unlike traditional Large Language Models (LLMs), VLMs face unique vulnerabilities due to their multimodal nature, allowing adversaries to modify visual or textual inputs to bypass safety guardrails and trigger the generation of harmful content. Through systematic analysis of VLM behavior under attack, we identify a novel phenomenon termed ``delayed safety awareness''. Specifically, we observe that safety-aligned VLMs may initially be compromised to produce harmful content, but eventually recognize the associated risks and attempt to self-correct. This pattern suggests that VLMs retain their underlying safety awareness but experience a temporal delay in their activation. Building on this insight, we hypothesize that VLMs' safety awareness can be proactively reactivated through carefully designed prompts. To this end, we introduce ``The Safety Reminder'', a soft prompt tuning approach that optimizes learnable prompt tokens, which are periodically injected during the text generation process to enhance safety awareness, effectively preventing harmful content generation. Additionally, our safety reminder only activates when harmful content is detected, leaving normal conversations unaffected and preserving the model's performance on benign tasks. Through comprehensive evaluation across three established safety benchmarks and one adversarial attacks, we demonstrate that our approach significantly reduces attack success rates while maintaining model utility, offering a practical solution for deploying safer VLMs in real-world applications.

Paperid: 1630, https://arxiv.org/pdf/2506.15656.pdf

Abstract:
Phishing websites continue to pose a significant cybersecurity threat, often leveraging deceptive structures, brand impersonation, and social engineering tactics to evade detection. While recent advances in large language models (LLMs) have enabled improved phishing detection through contextual understanding, most existing approaches rely on single-agent classification facing the risks of hallucination and lack interpretability or robustness. To address these limitations, we propose PhishDebate, a modular multi-agent LLM-based debate framework for phishing website detection. PhishDebate employs four specialized agents to independently analyze different textual aspects of a webpage--URL structure, HTML composition, semantic content, and brand impersonation--under the coordination of a Moderator and a final Judge. Through structured debate and divergent thinking, the framework delivers more accurate and interpretable decisions. Extensive evaluations on commercial LLMs demonstrate that PhishDebate achieves 98.2% recall and 98.2% True Positive Rate (TPR) on a real-world phishing dataset, and outperforms single-agent and Chain of Thought (CoT) baselines. Additionally, its modular design allows agent-level configurability, enabling adaptation to varying resource and application requirements.

Paperid: 1631, https://arxiv.org/pdf/2506.15224.pdf

Abstract:
In this paper, we introduce an adaptation of the facility location problem and analyze it within the framework of local differential privacy (LDP). Under this model, we ensure the privacy of client presence at specific locations. When n is the number of points, Gupta et al. established a lower bound of $Î©(\sqrt{n})$ on the approximation ratio for any differentially private algorithm applied to the original facility location problem. As a result, subsequent works have adopted the super-set assumption, which may, however, compromise user privacy. We show that this lower bound does not apply to our adaptation by presenting an LDP algorithm that achieves a constant approximation ratio with a relatively small additive factor. Additionally, we provide experimental results demonstrating that our algorithm outperforms the straightforward approach on both synthetically generated and real-world datasets.

Paperid: 1632, https://arxiv.org/pdf/2506.15170.pdf

Abstract:
Large language models (LLMs) are rapidly evolving from single-modal systems to multimodal LLMs and intelligent agents, significantly expanding their capabilities while introducing increasingly severe security risks. This paper presents a systematic survey of the growing complexity of jailbreak attacks and corresponding defense mechanisms within the expanding LLM ecosystem. We first trace the developmental trajectory from LLMs to MLLMs and Agents, highlighting the core security challenges emerging at each stage. Next, we categorize mainstream jailbreak techniques from both the attack impact and visibility perspectives, and provide a comprehensive analysis of representative attack methods, related datasets, and evaluation metrics. On the defense side, we organize existing strategies based on response timing and technical approach, offering a structured understanding of their applicability and implementation. Furthermore, we identify key limitations in existing surveys, such as insufficient attention to agent-specific security issues, the absence of a clear taxonomy for hybrid jailbreak methods, a lack of detailed analysis of experimental setups, and outdated coverage of recent advancements. To address these limitations, we provide an updated synthesis of recent work and outline future research directions in areas such as dataset construction, evaluation framework optimization, and strategy generalization. Our study seeks to enhance the understanding of jailbreak mechanisms and facilitate the advancement of more resilient and adaptive defense strategies in the context of ever more capable LLMs.

Paperid: 1633, https://arxiv.org/pdf/2506.15070.pdf

Abstract:
The exponential growth of Internet of Things (IoT) applications has intensified the demand for efficient, high-throughput, and energy-efficient data processing at the edge. Conventional CPU-centric encryption methods suffer from performance bottlenecks and excessive data movement, especially in latency-sensitive and resource-constrained environments. In this paper, we present SPiME, a lightweight, scalable, and FPGA-compatible Secure Processor-in-Memory Encryption architecture that integrates the Advanced Encryption Standard (AES-128) directly into a Processing-in-Memory (PiM) framework. SPiME is designed as a modular array of parallel PiM units, each combining an AES core with a minimal control unit to enable distributed in-place encryption with minimal overhead. The architecture is fully implemented in Verilog and tested on multiple AMD UltraScale and UltraScale+ FPGAs. Evaluation results show that SPiME can scale beyond 4,000 parallel units while maintaining less than 5\% utilization of key FPGA resources on high-end devices. It delivers over 25~Gbps in sustained encryption throughput with predictable, low-latency performance. The design's portability, configurability, and resource efficiency make it a compelling solution for secure edge computing, embedded cryptographic systems, and customizable hardware accelerators.

Paperid: 1634, https://arxiv.org/pdf/2506.14466.pdf

Abstract:
Malicious package detection has become a critical task in ensuring the security and stability of the PyPI. Existing detection approaches have focused on advancing model selection, evolving from traditional machine learning (ML) models to large language models (LLMs). However, as the complexity of the model increases, the time consumption also increases, which raises the question of whether a lightweight model achieves effective detection. Through empirical research, we demonstrate that collecting a sufficiently comprehensive feature set enables even traditional ML models to achieve outstanding performance. However, with the continuous emergence of new malicious packages, considerable human and material resources are required for feature analysis. Also, traditional ML model-based approaches lack of explainability to malicious packages.Therefore, we propose a novel approach MalGuard based on graph centrality analysis and the LIME (Local Interpretable Model-agnostic Explanations) algorithm to detect malicious packages.To overcome the above two challenges, we leverage graph centrality analysis to extract sensitive APIs automatically to replace manual analysis. To understand the sensitive APIs, we further refine the feature set using LLM and integrate the LIME algorithm with ML models to provide explanations for malicious packages. We evaluated MalGuard against six SOTA baselines with the same settings. Experimental results show that our proposed MalGuard, improves precision by 0.5%-33.2% and recall by 1.8%-22.1%. With MalGuard, we successfully identified 113 previously unknown malicious packages from a pool of 64,348 newly-uploaded packages over a five-week period, and 109 out of them have been removed by the PyPI official.

Paperid: 1635, https://arxiv.org/pdf/2506.12580.pdf

Abstract:
The limited or no protection for civilian Global Navigation Satellite System (GNSS) signals makes spoofing attacks relatively easy. With modern mobile devices often featuring network interfaces, state-of-the-art signals of opportunity (SOP) schemes can provide accurate network positions in replacement of GNSS. The use of onboard inertial sensors can also assist in the absence of GNSS, possibly in the presence of jammers. The combination of SOP and inertial sensors has received limited attention, yet it shows strong results on fully custom-built platforms. We do not seek to improve such special-purpose schemes. Rather, we focus on countering GNSS attacks, notably detecting them, with emphasis on deployment with consumer-grade platforms, notably smartphones, that provide off-the-shelf opportunistic information (i.e., network position and inertial sensor data). Our Position-based Attack Detection Scheme (PADS) is a probabilistic framework that uses regression and uncertainty analysis for positions. The regression optimization problem is a weighted mean square error of polynomial fitting, with constraints that the fitted positions satisfy the device velocity and acceleration. Then, uncertainty is modeled by a Gaussian process, which provides more flexibility to analyze how sure or unsure we are about position estimations. In the detection process, we combine all uncertainty information with the position estimations into a fused test statistic, which is the input utilized by an anomaly detector based on outlier ensembles. The evaluation shows that the PADS outperforms a set of baseline methods that rely on SOP or inertial sensor-based or statistical tests, achieving up to 3 times the true positive rate at a low false positive rate.

Paperid: 1636, https://arxiv.org/pdf/2506.11701.pdf

Abstract:
Permission systems which restrict access to system resources are a well-established technology in operating systems, especially for smartphones. However, as such systems are implemented in the operating system they can at most manage access on the process-level. Since moderns software often (re)uses code from third-parties libraries, a permission system for libraries can be desirable to enhance security. In this short-paper, we adapt concepts from capability systems building a novel theoretical foundation for permission system at the level of the programming language. This leads to PermRust, a token-based permission system for the Rust programming language as a zero cost abstraction on top of its type-system. With it access to system resources can be managed per library.

Paperid: 1637, https://arxiv.org/pdf/2506.11612.pdf

Abstract:
Binary code similarity analysis (BCSA) is a crucial research area in many fields such as cybersecurity. Specifically, function-level diffing tools are the most widely used in BCSA: they perform function matching one by one for evaluating the similarity between binary programs. However, such methods need a high time complexity, making them unscalable in large-scale scenarios (e.g., 1/n-to-n search). Towards effective and efficient program-level BCSA, we propose KEENHash, a novel hashing approach that hashes binaries into program-level representations through large language model (LLM)-generated function embeddings. KEENHash condenses a binary into one compact and fixed-length program embedding using K-Means and Feature Hashing, allowing us to do effective and efficient large-scale program-level BCSA, surpassing the previous state-of-the-art methods. The experimental results show that KEENHash is at least 215 times faster than the state-of-the-art function matching tools while maintaining effectiveness. Furthermore, in a large-scale scenario with 5.3 billion similarity evaluations, KEENHash takes only 395.83 seconds while these tools will cost at least 56 days. We also evaluate KEENHash on the program clone search of large-scale BCSA across extensive datasets in 202,305 binaries. Compared with 4 state-of-the-art methods, KEENHash outperforms all of them by at least 23.16%, and displays remarkable superiority over them in the large-scale BCSA security scenario of malware detection.

Paperid: 1638, https://arxiv.org/pdf/2506.07836.pdf

Abstract:
Nowadays, the Internet of Things (IoT) is widely employed, and its usage is growing exponentially because it facilitates remote monitoring, predictive maintenance, and data-driven decision making, especially in the healthcare and industrial sectors. However, IoT devices remain vulnerable due to their resource constraints and difficulty in applying security patches. Consequently, various cybersecurity attacks are reported daily, such as Denial of Service, particularly in IoT-driven solutions. Most attack detection methodologies are based on Machine Learning (ML) techniques, which can detect attack patterns. However, the focus is more on identification rather than considering the impact of ML algorithms on computational resources. This paper proposes a green methodology to identify IoT malware networking attacks based on flow privacy-preserving statistical features. In particular, the hyperparameters of three tree-based models -- Decision Trees, Random Forest and Extra-Trees -- are optimized based on energy consumption and test-time performance in terms of Matthew's Correlation Coefficient. Our results show that models maintain high performance and detection accuracy while consistently reducing power usage in terms of watt-hours (Wh). This suggests that on-premise ML-based Intrusion Detection Systems are suitable for IoT and other resource-constrained devices.

Paperid: 1639, https://arxiv.org/pdf/2506.07827.pdf

Abstract:
The kind of malware designed to conceal malicious system resources (e.g. processes, network connections, files, etc.) is commonly referred to as a rootkit. This kind of malware represents a significant threat in contemporany systems. Despite the existence of kernel-space rootkits (i.e. rootkits that infect the operating system kernel), user-space rootkits (i.e. rootkits that infect the user-space operating system tools, commands and libraries) continue to pose a significant danger. However, kernel-space rootkits attract all the attention, implicitly assuming that user-space rootkits (malware that is still in existence) are easily detectable by well-known user-space tools that look for anomalies. The primary objective of this work is to answer the following question: Is detecting user-space rootkits with user-space tools futile? Contrary to the prevailing view that considers it effective, we argue that the detection of user-space rootkits cannot be done in user-space at all. Moreover, the detection results must be communicated to the user with extreme caution. To support this claim, we conducted different experiments focusing on process concealing in Linux systems. In these experiments, we evade the detection mechanisms widely accepted as the standard solution for this type of user-space malware, bypassing the most popular open source anti-rootkit tool for process hiding. This manuscript describes the classical approach to build user-space library rootkits, the traditional detection mechanisms, and different evasion techniques (it also includes understandable code snippets and examples). In addition, it offers some guidelines to implement new detection tools and improve the existing ones to the extent possible.

Paperid: 1640, https://arxiv.org/pdf/2506.07795.pdf

Abstract:
Large Language Model (LLM) unlearning aims to erase or suppress undesirable knowledge within the model, offering promise for controlling harmful or private information to prevent misuse. However, recent studies highlight its limited efficacy in real-world scenarios, hindering practical adoption. In this study, we identify a pervasive issue underlying many downstream failures: the effectiveness of existing unlearning methods heavily depends on the form of training samples and frequently fails to generalize to alternate expressions of the same knowledge. We formally characterize this problem as Form-Dependent Bias and systematically investigate its specific manifestation patterns across various downstream tasks. To quantify its prevalence and support future research, we introduce ORT, a novel benchmark designed to evaluate the robustness of unlearning methods against variations in knowledge expression. Results reveal that Form-Dependent Bias is both widespread and severe among current techniques. We argue that LLM unlearning should be form-independent to address the endless forms of downstream tasks encountered in real-world security-critical scenarios. Towards this goal, we introduce Rank-one Concept Redirection (ROCR), a novel training-free method, as a promising solution path. ROCR performs unlearning by targeting the invariants in downstream tasks, specifically the activated dangerous concepts. It is capable of modifying model parameters within seconds to redirect the model's perception of a specific unlearning target concept to another harmless concept. Extensive experiments demonstrate that ROCR significantly improves unlearning effectiveness compared to traditional methods while generating highly natural outputs.

Paperid: 1641, https://arxiv.org/pdf/2506.07640.pdf

Abstract:
Class groups of real quadratic fields represent fundamental structures in algebraic number theory with significant computational implications. While Stark's conjecture establishes theoretical connections between special units and class group structures, explicit constructions have remained elusive, and precise quantum complexity bounds for class group computations are lacking. Here we establish an integrated framework defining Stark-Coleman invariants $Îº_p(K) = \log_p \left( \frac{\varepsilon_{\mathrm{St},p}}{Ï(\varepsilon_{\mathrm{St},p})} \right) \mod p^{\mathrm{ord}_p(Î_K)}$ through a synthesis of $p$-adic Hodge theory and extended Coleman integration. We prove these invariants classify class groups under the Generalized Riemann Hypothesis (GRH), resolving the isomorphism problem for discriminants $D > 10^{32}$. Furthermore, we demonstrate that this approach yields the quantum lower bound $\exp\left(Î©\left(\frac{\log D}{(\log \log D)^2}\right)\right)$ for the class group discrete logarithm problem, improving upon previous bounds lacking explicit constants. Our results indicate that Stark units constrain the geometric organization of class groups, providing theoretical insight into computational complexity barriers.

Paperid: 1642, https://arxiv.org/pdf/2506.07586.pdf

Abstract:
The dual use nature of Large Language Models (LLMs) presents a growing challenge in cybersecurity. While LLM enhances automation and reasoning for defenders, they also introduce new risks, particularly their potential to be misused for generating evasive, AI crafted malware. Despite this emerging threat, the research community currently lacks controlled and extensible tools that can simulate such behavior for testing and defense preparation. We present MalGEN, a multi agent framework that simulates coordinated adversarial behavior to generate diverse, activity driven malware samples. The agents work collaboratively to emulate attacker workflows, including payload planning, capability selection, and evasion strategies, within a controlled environment built for ethical and defensive research. Using MalGEN, we synthesized ten novel malware samples and evaluated them against leading antivirus and behavioral detection engines. Several samples exhibited stealthy and evasive characteristics that bypassed current defenses, validating MalGEN's ability to model sophisticated and new threats. By transforming the threat of LLM misuse into an opportunity for proactive defense, MalGEN offers a valuable framework for evaluating and strengthening cybersecurity systems. The framework addresses data scarcity, enables rigorous testing, and supports the development of resilient and future ready detection strategies.

Paperid: 1643, https://arxiv.org/pdf/2506.07402.pdf

Abstract:
Large language models (LLMs) are increasingly deployed in real-world applications, raising concerns about their security. While jailbreak attacks highlight failures under overtly harmful queries, they overlook a critical risk: incorrectly answering harmless-looking inputs can be dangerous and cause real-world harm (Implicit Harm). We systematically reformulate the LLM risk landscape through a structured quadrant perspective based on output factuality and input harmlessness, uncovering an overlooked high-risk region. To investigate this gap, we propose JailFlipBench, a benchmark aims to capture implicit harm, spanning single-modal, multimodal, and factual extension scenarios with diverse evaluation metrics. We further develop initial JailFlip attack methodologies and conduct comprehensive evaluations across multiple open-source and black-box LLMs, show that implicit harm present immediate and urgent real-world risks, calling for broader LLM safety assessments and alignment beyond conventional jailbreak paradigms.

Paperid: 1644, https://arxiv.org/pdf/2506.07056.pdf

Abstract:
The robustness of Deep Neural Network models is crucial for defending models against adversarial attacks. Recent defense methods have employed collaborative learning frameworks to enhance model robustness. Two key limitations of existing methods are (i) insufficient guidance of the target model via loss functions and (ii) non-collaborative adversarial generation. We, therefore, propose a dual regularization loss (D2R Loss) method and a collaborative adversarial generation (CAG) strategy for adversarial training. D2R loss includes two optimization steps. The adversarial distribution and clean distribution optimizations enhance the target model's robustness by leveraging the strengths of different loss functions obtained via a suitable function space exploration to focus more precisely on the target model's distribution. CAG generates adversarial samples using a gradient-based collaboration between guidance and target models. We conducted extensive experiments on three benchmark databases, including CIFAR-10, CIFAR-100, Tiny ImageNet, and two popular target models, WideResNet34-10 and PreActResNet18. Our results show that D2R loss with CAG produces highly robust models.

Paperid: 1645, https://arxiv.org/pdf/2506.05739.pdf

Abstract:
LLM agents are widely used as agents for customer support, content generation, and code assistance. However, they are vulnerable to prompt injection attacks, where adversarial inputs manipulate the model's behavior. Traditional defenses like input sanitization, guard models, and guardrails are either cumbersome or ineffective. In this paper, we propose a novel, lightweight defense mechanism called Polymorphic Prompt Assembling (PPA), which protects against prompt injection with near-zero overhead. The approach is based on the insight that prompt injection requires guessing and breaking the structure of the system prompt. By dynamically varying the structure of system prompts, PPA prevents attackers from predicting the prompt structure, thereby enhancing security without compromising performance. We conducted experiments to evaluate the effectiveness of PPA against existing attacks and compared it with other defense methods.

Paperid: 1646, https://arxiv.org/pdf/2506.05290.pdf

Abstract:
Privacy-preserving advertising APIs like Privacy-Preserving Attribution (PPA) are designed to enhance web privacy while enabling effective ad measurement. PPA offers an alternative to cross-site tracking with encrypted reports governed by differential privacy (DP), but current designs lack a principled approach to privacy budget management, creating uncertainty around critical design decisions. We present Big Bird, a privacy budget manager for PPA that clarifies per-site budget semantics and introduces a global budgeting system grounded in resource isolation principles. Big Bird enforces utility-preserving limits via quota budgets and improves global budget utilization through a novel batched scheduling algorithm. Together, these mechanisms establish a robust foundation for enforcing privacy protections in adversarial environments. We implement Big Bird in Firefox and evaluate it on real-world ad data, demonstrating its resilience and effectiveness.

Paperid: 1647, https://arxiv.org/pdf/2506.05129.pdf

Abstract:
Confidential computing has gained traction across major architectures with Intel TDX, AMD SEV-SNP, and Arm CCA. Unlike TDX and SEV-SNP, a key challenge in researching Arm CCA is the absence of hardware support, forcing researchers to develop ad-hoc performance prototypes on non-CCA Arm boards. This approach leads to duplicated efforts, inconsistent performance comparisons, and high barriers to entry. To address this, we present OpenCCA, an open research platform that enables the execution of CCA-bound code on commodity Armv8.2 hardware. By systematically adapting the software stack -- including bootloader, firmware, hypervisor, and kernel -- OpenCCA emulates CCA operations for performance evaluation while preserving functional correctness. We demonstrate its effectiveness with typical life-cycle measurements and case-studies inspired by prior CCA-based papers on a easily available Armv8.2 Rockchip board that costs $250.

Paperid: 1648, https://arxiv.org/pdf/2506.04566.pdf

Abstract:
Differentially private (DP) language model inference is an approach for generating private synthetic text. A sensitive input example is used to prompt an off-the-shelf large language model (LLM) to produce a similar example. Multiple examples can be aggregated together to formally satisfy the DP guarantee. Prior work creates inference batches by sampling sensitive inputs uniformly at random. We show that uniform sampling degrades the quality of privately generated text, especially when the sensitive examples concern heterogeneous topics. We remedy this problem by clustering the input data before selecting inference batches. Next, we observe that clustering also leads to more similar next-token predictions across inferences. We use this insight to introduce a new algorithm that aggregates next token statistics by privately computing medians instead of averages. This approach leverages the fact that the median has decreased local sensitivity when next token predictions are similar, allowing us to state a data-dependent and ex-post DP guarantee about the privacy properties of this algorithm. Finally, we demonstrate improvements in terms of representativeness metrics (e.g., MAUVE) as well as downstream task performance. We show that our method produces high-quality synthetic data at significantly lower privacy cost than a previous state-of-the-art method.

Paperid: 1649, https://arxiv.org/pdf/2506.01412.pdf

Abstract:
As malware continues to become more complex and harder to detect, Malware Analysis needs to continue to evolve to stay one step ahead. One promising key area approach focuses on using system calls and API Calls, the core communication between user applications and the operating system and their kernels. These calls provide valuable insight into how software or programs behaves, making them an useful tool for spotting suspicious or harmful activity of programs and software. This chapter takes a deep down look at how system calls are used in malware detection and classification, covering techniques like static and dynamic analysis, as well as sandboxing. By combining these methods with advanced techniques like machine learning, statistical analysis, and anomaly detection, researchers can analyze system call patterns to tell the difference between normal and malicious behavior. The chapter also explores how these techniques are applied across different systems, including Windows, Linux, and Android, while also looking at the ways sophisticated malware tries to evade detection.

Paperid: 1650, https://arxiv.org/pdf/2506.01396.pdf

Abstract:
Differential privacy (DP) has become an essential framework for privacy-preserving machine learning. Existing DP learning methods, however, often have disparate impacts on model predictions, e.g., for minority groups. Gradient clipping, which is often used in DP learning, can suppress larger gradients from challenging samples. We show that this problem is amplified by adaptive clipping, which will often shrink the clipping bound to tiny values to match a well-fitting majority, while significantly reducing the accuracy for others. We propose bounded adaptive clipping, which introduces a tunable lower bound to prevent excessive gradient suppression. Our method improves the accuracy of the worst-performing class on average over 10 percentage points on skewed MNIST and Fashion MNIST compared to the unbounded adaptive clipping, and over 5 percentage points over constant clipping.

Paperid: 1651, https://arxiv.org/pdf/2506.01055.pdf

Abstract:
Previous benchmarks on prompt injection in large language models (LLMs) have primarily focused on generic tasks and attacks, offering limited insights into more complex threats like data exfiltration. This paper examines how prompt injection can cause tool-calling agents to leak personal data observed during task execution. Using a fictitious banking agent, we develop data flow-based attacks and integrate them into AgentDojo, a recent benchmark for agentic security. To enhance its scope, we also create a richer synthetic dataset of human-AI banking conversations. In 16 user tasks from AgentDojo, LLMs show a 15-50 percentage point drop in utility under attack, with average attack success rates (ASR) around 20 percent; some defenses reduce ASR to zero. Most LLMs, even when successfully tricked by the attack, avoid leaking highly sensitive data like passwords, likely due to safety alignments, but they remain vulnerable to disclosing other personal data. The likelihood of password leakage increases when a password is requested along with one or two additional personal details. In an extended evaluation across 48 tasks, the average ASR is around 15 percent, with no built-in AgentDojo defense fully preventing leakage. Tasks involving data extraction or authorization workflows, which closely resemble the structure of exfiltration attacks, exhibit the highest ASRs, highlighting the interaction between task type, agent performance, and defense efficacy.

Paperid: 1652, https://arxiv.org/pdf/2505.24231.pdf

Abstract:
Malware detection and classification remains a topic of concern for cybersecurity, since it is becoming common for attackers to use advanced obfuscation on their malware to stay undetected. Conventional static analysis is not effective against polymorphic and metamorphic malware as these change their appearance without modifying their behavior, thus defying the analysis by code structure alone. This makes it important to use dynamic detection that monitors malware behavior at runtime. In this paper, we present a dynamic malware categorization framework that extracts API argument calls at the runtime execution of Windows Portable Executable (PE) files. Extracting and encoding the dynamic features of API names, argument return values, and other relative features, we convert raw behavioral data to temporal patterns. To enhance feature portrayal, the generated patterns are subsequently converted into grayscale pictures using a magma colormap. These improved photos are used to teach a Convolutional Neural Network (CNN) model discriminative features, which allows for reliable and accurate malware classification. Results from experiments indicate that our method, with an average accuracy of 98.36% is effective in classifying different classes of malware and benign by integrating dynamic analysis and deep learning. It not only achieves high classification accuracy but also demonstrates significant resilience against typical evasion strategies.

Paperid: 1653, https://arxiv.org/pdf/2505.24021.pdf

Abstract:
A Cyber-Physical System (CPS) testbed serves as a powerful platform for testing and validating cyber intrusion detection and mitigation strategies in substations. This study presents the design and development of a CPS testbed that can effectively assess the real-time dynamics of a substation. Cyber attacks exploiting IEC 61850-based SV and GOOSE protocols are demonstrated using the testbed, along with an analysis on attack detection. Realistic timing measurements are obtained, and the time frames for deploying detection and mitigation strategies are evaluated.

Paperid: 1654, https://arxiv.org/pdf/2505.23968.pdf

Abstract:
Cautious predictions -- where a machine learning model abstains when uncertain -- are crucial for limiting harmful errors in safety-critical applications. In this work, we identify a novel threat: a dishonest institution can exploit these mechanisms to discriminate or unjustly deny services under the guise of uncertainty. We demonstrate the practicality of this threat by introducing an uncertainty-inducing attack called Mirage, which deliberately reduces confidence in targeted input regions, thereby covertly disadvantaging specific individuals. At the same time, Mirage maintains high predictive performance across all data points. To counter this threat, we propose Confidential Guardian, a framework that analyzes calibration metrics on a reference dataset to detect artificially suppressed confidence. Additionally, it employs zero-knowledge proofs of verified inference to ensure that reported confidence scores genuinely originate from the deployed model. This prevents the provider from fabricating arbitrary model confidence values while protecting the model's proprietary details. Our results confirm that Confidential Guardian effectively prevents the misuse of cautious predictions, providing verifiable assurances that abstention reflects genuine model uncertainty rather than malicious intent.

Paperid: 1655, https://arxiv.org/pdf/2505.22703.pdf

Abstract:
Many problems in trustworthy ML can be formulated as minimization of the model error under constraints on the prediction rates of the model for suitably-chosen marginals, including most group fairness constraints (demographic parity, equality of odds, etc.). In this work, we study such constrained minimization problems under differential privacy (DP). Standard DP optimization techniques like DP-SGD rely on the loss function's decomposability into per-sample contributions. However, rate constraints introduce inter-sample dependencies, violating the decomposability requirement. To address this, we develop RaCO-DP, a DP variant of the Stochastic Gradient Descent-Ascent (SGDA) algorithm which solves the Lagrangian formulation of rate constraint problems. We demonstrate that the additional privacy cost of incorporating these constraints reduces to privately estimating a histogram over the mini-batch at each optimization step. We prove the convergence of our algorithm through a novel analysis of SGDA that leverages the linear structure of the dual parameter. Finally, empirical results on learning under group fairness constraints demonstrate that our method Pareto-dominates existing private learning approaches in fairness-utility trade-offs.

Paperid: 1656, https://arxiv.org/pdf/2505.22220.pdf

Abstract:
In recent years, malware with tunneling (or: covert channel) capabilities is on the rise. While malware research led to several methods and innovations, the detection and differentiation of malware solely based on its DNS tunneling features is still in its infancy. Moreover, no work so far has used the DNS tunneling traffic to gain knowledge over the current actions taken by the malware. In this paper, we present Domainator, an approach to detect and differentiate state-of-the-art malware and DNS tunneling tools without relying on trivial (but quickly altered) features such as "magic bytes" that are embedded into subdomains. Instead, we apply an analysis of sequential patterns to identify specific types of malware. We evaluate our approach with 7 different malware samples and tunneling tools and can identify the particular malware based on its DNS traffic. We further infer the rough behavior of the particular malware through its DNS tunneling artifacts. Finally, we compare our Domainator with related methods.

Paperid: 1657, https://arxiv.org/pdf/2505.21938.pdf

Abstract:
Adversarial attacks on stochastic bandits have traditionally relied on some unrealistic assumptions, such as per-round reward manipulation and unbounded perturbations, limiting their relevance to real-world systems. We propose a more practical threat model, Fake Data Injection, which reflects realistic adversarial constraints: the attacker can inject only a limited number of bounded fake feedback samples into the learner's history, simulating legitimate interactions. We design efficient attack strategies under this model, explicitly addressing both magnitude constraints (on reward values) and temporal constraints (on when and how often data can be injected). Our theoretical analysis shows that these attacks can mislead both Upper Confidence Bound (UCB) and Thompson Sampling algorithms into selecting a target arm in nearly all rounds while incurring only sublinear attack cost. Experiments on synthetic and real-world datasets validate the effectiveness of our strategies, revealing significant vulnerabilities in widely used stochastic bandit algorithms under practical adversarial scenarios.

Paperid: 1658, https://arxiv.org/pdf/2505.19951.pdf

Abstract:
Deep learning voice models are commonly used nowadays, but the safety processing of personal data, such as human identity and speech content, remains suspicious. To prevent malicious user identification, speaker anonymization methods were proposed. Current methods, particularly based on universal adversarial patch (UAP) applications, have drawbacks such as significant degradation of audio quality, decreased speech recognition quality, low transferability across different voice biometrics models, and performance dependence on the input audio length. To mitigate these drawbacks, in this work, we introduce and leverage the novel Exponential Total Variance (TV) loss function and provide experimental evidence that it positively affects UAP strength and imperceptibility. Moreover, we present a novel scalable UAP insertion procedure and demonstrate its uniformly high performance for various audio lengths.

Paperid: 1659, https://arxiv.org/pdf/2505.19216.pdf

Abstract:
Consider people with smartphones operating without external authorities or global resources other than the network itself. In this setting, high-end applications supporting sovereign democratic digital communities, community banks, and digital cooperatives require consensus executed by community members, which must be reconfigurable to support community dynamics. The Constitutional Consensus protocol aims to address this need by introducing constitutional self-governance to consensus: participants dynamically amend the participant set, supermajority threshold, and timeout parameter through the consensus protocol itself. We achieve this by enhancing a DAG-based protocol (like Cordial Miners) with participant-controlled reconfiguration, while also supporting both high- and low-throughput operation (like Morpheus), remaining quiescent when idle. This three-way synthesis uniquely combines: (1) constitutional amendments for self-governance, (2) a cryptographic DAG structure for simplicity, parallelism, and throughput, and (3) both high- and low-throughput operation. The protocol achieves consensus in $3Î´$, maintains O(n) amortized communication complexity during high throughput, and seamlessly transitions between modes. The basic protocol (without constitutional amendments) realizes these features in 25 lines of pseudocode, making it one of the most concise consensus protocols for eventual synchrony.

Paperid: 1660, https://arxiv.org/pdf/2505.18944.pdf

Abstract:
Lately, cybercriminals constantly formulate productive approaches to exploit individuals. This article exemplifies an innovative attack, namely QR-based Browser-in-The-Browser (BiTB), using proficiencies of Large Language Model (LLM) i.e. Google Gemini. The presented attack is a fusion of two emerging attacks: BiTB and Quishing (QR code phishing). Our study underscores attack's simplistic implementation utilizing malicious prompts provided to Gemini-LLM. Moreover, we presented a case study to highlight a lucrative attack method, we also performed an experiment to comprehend the attack execution on victims' device. The findings of this work obligate the researchers' contributions in confronting this type of phishing attempts through LLMs.

Paperid: 1661, https://arxiv.org/pdf/2505.18518.pdf

Abstract:
Current Wi-Fi authentication methods face issues such as insufficient security, user privacy leakage, high management costs, and difficulty in billing. To address these challenges, a Wi-Fi access control solution based on blockchain smart contracts is proposed. Firstly, semi-fungible Wi-Fi tokens (SFWTs) are designed using the ERC1155 token standard as credentials for users to access Wi-Fi. Secondly, a Wi-Fi access control system based on SFWTs is developed to securely verify and manage the access rights of Wi-Fi users. Experimental results demonstrate that SFWTs, designed based on the ERC1155 standard, along with the SFWT access right verification process, can significantly reduce Wi-Fi operating costs and authentication time, effectively meeting users' needs for safe and convenient Wi-Fi access.

Paperid: 1662, https://arxiv.org/pdf/2505.18332.pdf

Abstract:
Recent advances in Large Language Models (LLMs) have led to the widespread adoption of third-party inference services, raising critical privacy concerns. Existing methods of performing private third-party inference, such as Secure Multiparty Computation (SMPC), often rely on cryptographic methods. However, these methods are thousands of times slower than standard unencrypted inference, and fail to scale to large modern LLMs. Therefore, recent lines of work have explored the replacement of expensive encrypted nonlinear computations in SMPC with statistical obfuscation methods - in particular, revealing permuted hidden states to the third parties, with accompanying strong claims of the difficulty of reversal into the unpermuted states. In this work, we begin by introducing a novel reconstruction technique that can recover original prompts from hidden states with nearly perfect accuracy across multiple state-of-the-art LLMs. We then show that extensions of our attack are nearly perfectly effective in reversing permuted hidden states of LLMs, demonstrating the insecurity of three recently proposed privacy schemes. We further dissect the shortcomings of prior theoretical `proofs' of permuation security which allow our attack to succeed. Our findings highlight the importance of rigorous security analysis in privacy-preserving LLM inference.

Paperid: 1663, https://arxiv.org/pdf/2505.18290.pdf

Abstract:
We introduce EtherBee, a global dataset integrating detailed Ethereum node metrics, network traffic metadata, and honeypot interaction logs collected from ten geographically diverse vantage points over three months. By correlating node data with granular network sessions and security events, EtherBee provides unique insights into benign and malicious activity, node stability, and network-level threats in the Ethereum peer-to-peer network. A case study shows how client-based optimizations can unintentionally concentrate the network geographically, impacting resilience and censorship resistance. We publicly release EtherBee to promote further investigations into performance, reliability, and security in decentralized networks.

Paperid: 1664, https://arxiv.org/pdf/2505.17502.pdf

Abstract:
Quantum key distribution (QKD), one of the latest cryptographic techniques, founded on the laws of quantum mechanics rather than mathematical complexity, promises for the first time unconditional secure remote communications. Integrating this technology into the next generation nuclear systems - designed for universal data collection and real-time sharing as well as cutting-edge instrumentation and increased dependency on digital technologies - could provide significant benefits enabling secure, unattended, and autonomous operation in remote areas, e.g., microreactors and fission batteries. However, any practical implementation on a critical reactor system must meet strict requirements on latency, control system compatibility, stability, and performance under operational transients. Here, we report the complete end-to-end demonstration of a phase-encoding decoy-state BB84 protocol QKD system under prototypic conditions on Purdue's fully digital nuclear reactor, PUR-1. The system was installed in PUR-1 successfully executing real-time encryption and decryption of 2,000 signals over optic fiber distances up to 82 km using OTP-based encryption and up to 140 km with AES-based encryption. For a core of 68 signals, OTP-secure communication was achieved for up to 135 km. The QKD system maintained a stable secret key rate of 320 kbps and a quantum bit error of 3.8% at 54 km. Our results demonstrate that OTP-based encryption introduces minimal latency while the more key-efficient AES and ASCON encryption schemes can significantly increase the number of signals encrypted without latency penalties. Additionally, implementation of a dynamic key pool ensures several hours of secure key availability during potential system downtimes. This work shows the potential of quantum-based secure remote communications for future digitally driven nuclear reactor technologies.

Paperid: 1665, https://arxiv.org/pdf/2505.17072.pdf

Abstract:
Recent studies on the safety alignment of large language models (LLMs) have revealed that existing approaches often operate superficially, leaving models vulnerable to various adversarial attacks. Despite their significance, these studies generally fail to offer actionable solutions beyond data augmentation for achieving more robust safety mechanisms. This paper identifies a fundamental cause of this superficiality: existing alignment approaches often presume that models can implicitly learn a safety-related reasoning task during the alignment process, enabling them to refuse harmful requests. However, the learned safety signals are often diluted by other competing objectives, leading models to struggle with drawing a firm safety-conscious decision boundary when confronted with adversarial attacks. Based on this observation, by explicitly introducing a safety-related binary classification task and integrating its signals with our attention and decoding strategies, we eliminate this ambiguity and allow models to respond more responsibly to malicious queries. We emphasize that, with less than 0.2x overhead cost, our approach enables LLMs to assess the safety of both the query and the previously generated tokens at each necessary generating step. Extensive experiments demonstrate that our method significantly improves the resilience of LLMs against various adversarial attacks, offering a promising pathway toward more robust generative AI systems.

Paperid: 1666, https://arxiv.org/pdf/2505.15954.pdf

Abstract:
This work explores a novel integration of blockchain methodologies with Wide Area Visual Navigation (WAVN) to address challenges in visual navigation for a heterogeneous team of mobile robots deployed for unstructured applications in agriculture, forestry, etc. Focusing on overcoming challenges such as GPS independence, environmental changes, and computational limitations, the study introduces the Proof of Stake (PoS) mechanism, commonly used in blockchain systems, into the WAVN framework \cite{Lyons_2022}. This integration aims to enhance the cooperative navigation capabilities of robotic teams by prioritizing robot contributions based on their navigation reliability. The methodology involves a stake weight function, consensus score with PoS, and a navigability function, addressing the computational complexities of robotic cooperation and data validation. This innovative approach promises to optimize robotic teamwork by leveraging blockchain principles, offering insights into the scalability, efficiency, and overall system performance. The project anticipates significant advancements in autonomous navigation and the broader application of blockchain technology beyond its traditional financial context.

Paperid: 1667, https://arxiv.org/pdf/2505.15148.pdf

Abstract:
Centralized Dynamic Spectrum Sharing (DSS) faces challenges like data security, high management costs, and limited scalability. To address these issues, a blockchain-based DSS scheme has been proposed in this paper. First, we utilize the ERC4907 standard to mint Non-Fungible Spectrum Tokens (NFSTs) that serve as unique identifiers for spectrum resources and facilitate renting. Next, we develop a smart contract for NFST auctions, ensuring secure spectrum transactions through the auction process. Lastly, we create a Web3 spectrum auction platform where users can access idle spectrum data and participate in auctions for NFST leases corresponding to the available spectrum. Experimental results demonstrate that our NFST, designed according to the ERC4907 standard, effectively meets users' secure and efficient DSS requirements, making it a feasible solution.

Paperid: 1668, https://arxiv.org/pdf/2505.14251.pdf

Abstract:
We study the problem of differentially private second moment estimation and present a new algorithm that achieve strong privacy-utility trade-offs even for worst-case inputs under subsamplability assumptions on the data. We call an input $(m,Î±,Î²)$-subsamplable if a random subsample of size $m$ (or larger) preserves w.p $\geq 1-Î²$ the spectral structure of the original second moment matrix up to a multiplicative factor of $1\pm Î±$. Building upon subsamplability, we give a recursive algorithmic framework similar to Kamath et al 2019, that abides zero-Concentrated Differential Privacy (zCDP) while preserving w.h.p. the accuracy of the second moment estimation upto an arbitrary factor of $(1\pmÎ³)$. We then show how to apply our algorithm to approximate the second moment matrix of a distribution $\mathcal{D}$, even when a noticeable fraction of the input are outliers.

Paperid: 1669, https://arxiv.org/pdf/2505.13895.pdf

Abstract:
The dynamic landscape of cybersecurity demands precise and scalable solutions for vulnerability management in heterogeneous systems, where configuration-specific vulnerabilities are often misidentified due to inconsistent data in databases like the National Vulnerability Database (NVD). Inaccurate Common Platform Enumeration (CPE) data in NVD further leads to false positives and incomplete vulnerability retrieval. Informed by our systematic analysis of CPE and CVEdeails data, revealing more than 50% vendor name inconsistencies, we propose VulCPE, a framework that standardizes data and models configuration dependencies using a unified CPE schema (uCPE), entity recognition, relation extraction, and graph-based modeling. VulCPE achieves superior retrieval precision (0.766) and coverage (0.926) over existing tools. VulCPE ensures precise, context-aware vulnerability management, enhancing cyber resilience.

Paperid: 1670, https://arxiv.org/pdf/2505.13842.pdf

Abstract:
Embedded devices are increasingly ubiquitous and vital, often supporting safety-critical functions. However, due to strict cost and energy constraints, they are typically implemented with Micro-Controller Units (MCUs) that lack advanced architectural security features. Within this space, recent efforts have created low-cost architectures capable of generating Proofs of Execution (PoX) of software on potentially compromised MCUs. This capability can ensure the integrity of sensor data from the outset, by binding sensed results to an unforgeable cryptographic proof of execution on edge sensor MCUs. However, the security of existing PoX requires the proven execution to occur atomically. This requirement precludes the application of PoX to (1) time-shared systems, and (2) applications with real-time constraints, creating a direct conflict between execution integrity and the real-time availability needs of several embedded system uses. In this paper, we formulate a new security goal called Real-Time Proof of Execution (RT-PoX) that retains the integrity guarantees of classic PoX while enabling its application to existing real-time systems. This is achieved by relaxing the atomicity requirement of PoX while dispatching interference attempts from other potentially malicious tasks (or compromised operating systems) executing on the same device. To realize the RT-PoX goal, we develop Provable Execution Architecture for Real-Time Systems (PEARTS). To the best of our knowledge, PEARTS is the first PoX system that can be directly deployed alongside a commodity embedded real-time operating system (FreeRTOS). This enables both real-time scheduling and execution integrity guarantees on commodity MCUs. To showcase this capability, we develop a PEARTS open-source prototype atop FreeRTOS on a single-core ARM Cortex-M33 processor. We evaluate and report on PEARTS security and (modest) overheads.

Paperid: 1671, https://arxiv.org/pdf/2505.13153.pdf

Abstract:
In this paper, we present Prink, a novel and practically applicable concept and fully implemented prototype for ks-anonymizing data streams in real-world application architectures. Building upon the pre-existing, yet rudimentary CASTLE scheme, Prink for the first time introduces semantics-aware ks-anonymization of non-numerical (such as categorical or hierarchically generalizable) streaming data in a information loss-optimized manner. In addition, it provides native integration into Apache Flink, one of the prevailing frameworks for enterprise-grade stream data processing in numerous application domains. Our contributions excel the previously established state of the art for the privacy guarantee-providing anonymization of streaming data in that they 1) allow to include non-numerical data in the anonymization process, 2) provide discrete datapoints instead of aggregates, thereby facilitating flexible data use, 3) are applicable in real-world system contexts with minimal integration efforts, and 4) are experimentally proven to raise acceptable performance overheads and information loss in realistic settings. With these characteristics, Prink provides an anonymization approach which is practically feasible for a broad variety of real-world, enterprise-grade stream processing applications and environments.

Paperid: 1672, https://arxiv.org/pdf/2505.09743.pdf

Abstract:
Location-based service (LBS) applications proliferate and support transportation, entertainment, and more. Modern mobile platforms, with smartphones being a prominent example, rely on terrestrial and satellite infrastructures (e.g., global navigation satellite system (GNSS) and crowdsourced Wi-Fi, Bluetooth, cellular, and IP databases) for correct positioning. However, they are vulnerable to attacks that manipulate positions to control and undermine LBS functionality -- thus enabling the scamming of users or services. Our work reveals that GNSS spoofing attacks succeed even though smartphones have multiple sources of positioning information. Moreover, that Wi-Fi spoofing attacks with GNSS jamming are surprisingly effective. More concerning is the evidence that sophisticated, coordinated spoofing attacks are highly effective. Attacks can target GNSS in combination with other positioning methods, thus defenses that assume that only GNSS is under attack cannot be effective. More so, resilient GNSS receivers and special-purpose antennas are not feasible on smartphones. To address this gap, we propose an extended receiver autonomous integrity monitoring (RAIM) framework that leverages the readily available, redundant, often so-called opportunistic positioning information on off-the-shelf platforms. We jointly use onboard sensors, terrestrial infrastructures, and GNSS. We show that our extended RAIM framework improves resilience against location spoofing, e.g., achieving a detection accuracy improvement of up to 24-58% compared to the state-of-the-art algorithms and location providers; detecting attacks within 5 seconds, with a low false positive rate.

Paperid: 1673, https://arxiv.org/pdf/2505.06989.pdf

Abstract:
This paper presents the first large-scale empirical study of commercial personally identifiable information (PII) removal systems -- commercial services that claim to improve privacy by automating the removal of PII from data broker's databases. Popular examples of such services include DeleteMe, Mozilla Monitor, Incogni, among many others. The claims these services make may be very appealing to privacy-conscious Web users, but how effective these services actually are at improving privacy has not been investigated. This work aims to improve our understanding of commercial PII removal services in multiple ways. First, we conduct a user study where participants purchase subscriptions from four popular PII removal services, and report (i) what PII the service find, (ii) from which data brokers, (iii) whether the service is able to have the information removed, and (iv) whether the identified information actually is PII describing the participant. And second, by comparing the claims and promises the services makes (e.g. which and how many data brokers each service claims to cover). We find that these services have significant accuracy and coverage issues that limit the usefulness of these services as a privacy-enhancing technology. For example, we find that the measured services are unable to remove the majority of the identified PII records from data broker's (48.2% of the successfully removed found records) and that most records identified by these services are not PII about the user (study participants found that only 41.1% of records identified by these services were PII about themselves).

Paperid: 1674, https://arxiv.org/pdf/2505.06171.pdf

Abstract:
Global navigation satellite systems (GNSS) are vulnerable to spoofing attacks, with adversarial signals manipulating the location or time information of receivers, potentially causing severe disruptions. The task of discerning the spoofing signals from benign ones is naturally relevant for machine learning, thus recent interest in applying it for detection. While deep learning-based methods are promising, they require extensive labeled datasets, consume significant computational resources, and raise privacy concerns due to the sensitive nature of position data. This is why this paper proposes a self-supervised federated learning framework for GNSS spoofing detection. It consists of a cloud server and local mobile platforms. Each mobile platform employs a self-supervised anomaly detector using long short-term memory (LSTM) networks. Labels for training are generated locally through a spoofing-deviation prediction algorithm, ensuring privacy. Local models are trained independently, and only their parameters are uploaded to the cloud server, which aggregates them into a global model using FedAvg. The updated global model is then distributed back to the mobile platforms and trained iteratively. The evaluation shows that our self-supervised federated learning framework outperforms position-based and deep learning-based methods in detecting spoofing attacks while preserving data privacy.

Paperid: 1675, https://arxiv.org/pdf/2505.05613.pdf

Abstract:
As sequential learning algorithms are increasingly applied to real life, ensuring data privacy while maintaining their utilities emerges as a timely question. In this context, regret minimisation in stochastic bandits under $Îµ$-global Differential Privacy (DP) has been widely studied. Unlike bandits without DP, there is a significant gap between the best-known regret lower and upper bound in this setting, though they "match" in order. Thus, we revisit the regret lower and upper bounds of $Îµ$-global DP algorithms for Bernoulli bandits and improve both. First, we prove a tighter regret lower bound involving a novel information-theoretic quantity characterising the hardness of $Îµ$-global DP in stochastic bandits. Our lower bound strictly improves on the existing ones across all $Îµ$ values. Then, we choose two asymptotically optimal bandit algorithms, i.e. DP-KLUCB and DP-IMED, and propose their DP versions using a unified blueprint, i.e., (a) running in arm-dependent phases, and (b) adding Laplace noise to achieve privacy. For Bernoulli bandits, we analyse the regrets of these algorithms and show that their regrets asymptotically match our lower bound up to a constant arbitrary close to 1. This refutes the conjecture that forgetting past rewards is necessary to design optimal bandit algorithms under global DP. At the core of our algorithms lies a new concentration inequality for sums of Bernoulli variables under Laplace mechanism, which is a new DP version of the Chernoff bound. This result is universally useful as the DP literature commonly treats the concentrations of Laplace noise and random variables separately, while we couple them to yield a tighter bound.

Paperid: 1676, https://arxiv.org/pdf/2505.05370.pdf

Abstract:
Decentralized storage systems face a fundamental trade-off between replication overhead, recovery efficiency, and security guarantees. Current approaches either rely on full replication, incurring substantial storage costs, or employ trivial erasure coding schemes that struggle with efficient recovery especially under high storage-node churn. We present Walrus, a novel decentralized blob storage system that addresses these limitations through multiple technical innovations. At the core of Walrus is RedStuff, a two-dimensional erasure coding protocol that achieves high security with only 4.5x replication factor, while enabling self-healing recovery that requires bandwidth proportional to only the lost data $(O(|blob|/n)$ versus $O(|blob|)$ in traditional systems). Crucially, RedStuff is the first protocol to support storage challenges in asynchronous networks, preventing adversaries from exploiting network delays to pass verification without actually storing data. Walrus also introduces a novel multi-stage epoch change protocol that efficiently handles storage node churn while maintaining uninterrupted availability during committee transitions. Our system incorporates authenticated data structures to defend against malicious clients and ensures data consistency throughout storage and retrieval processes. Experimental evaluation demonstrates that Walrus achieves practical performance at scale, making it suitable for a wide range of decentralized applications requiring high-integrity, available blob storage with reasonable overhead.

Paperid: 1677, https://arxiv.org/pdf/2505.05090.pdf

Abstract:
The sixth generation of wireless networks defined several key performance indicators (KPIs) for assessing its networks, mainly in terms of reliability, coverage, and sensing. In this regard, remarkable attention has been paid recently to the integrated sensing and communication (ISAC) paradigm as an enabler for efficiently and jointly performing communication and sensing using the same spectrum and hardware resources. On the other hand, ensuring communication and data security has been an imperative requirement for wireless networks throughout their evolution. The physical-layer security (PLS) concept paved the way to catering to the security needs in wireless networks in a sustainable way while guaranteeing theoretically secure transmissions, independently of the computational capacity of adversaries. Therefore, it is of paramount importance to consider a balanced trade-off between communication reliability, sensing, and security in future networks, such as the 5G and beyond, and the 6G. In this paper, we provide a comprehensive and system-wise review of designed secure ISAC systems from a PLS point of view. In particular, the impact of various physical-layer techniques, schemes, and wireless technologies to ensure the sensing-security trade-off is studied from the surveyed work. Furthermore, the amalgamation of PLS and ISAC is analyzed in a broader impact by considering attacks targeting data confidentiality, communication covertness, and sensing spoofing. The paper also serves as a tutorial by presenting several theoretical foundations on ISAC and PLS, which represent a practical guide for readers to develop novel secure ISAC network designs.

Paperid: 1678, https://arxiv.org/pdf/2505.04249.pdf

Abstract:
Typical magnetic induction (MI) communication is commonly considered a secure underwater wireless communication (UWC) technology due to its non-audible and non-visible nature compared to acoustic and optical UWC technologies. However, vulnerabilities in communication systems inevitably exist and may lead to different types of attacks. In this paper, we investigate the eavesdropping attack in underwater MI communication to quantitatively measure the system's vulnerability under this attack. We consider different potential eavesdropping configuration setups based on the positions and orientations of the eavesdropper node to investigate how they impact the received voltage and secrecy at the legitimate receiver node. To this end, we develop finite-element-method-based simulation models for each configuration in an underwater environment and evaluate the received voltage and the secrecy capacity against different system parameters such as magnetic flux, magnetic flux density, distance, and orientation sensitivity. Furthermore, we construct an experimental setup within a laboratory environment to replicate the simulation experiments. Both simulation and lab experimental confirm the susceptibility of underwater MI communication to eavesdropping attacks. However, this vulnerability is highly dependent on the position and orientation of the coil between the eavesdropper and the legitimate transmitter. On the positive side, we also observe a unique behavior in the received coil reception that might be used to detect malicious node activities in the vicinity, which might lead to a potential security mechanism against eavesdropping attacks.

Paperid: 1679, https://arxiv.org/pdf/2505.03147.pdf

Abstract:
This work evaluates the performance of Cyber Threat Intelligence (CTI) extraction methods in identifying attack techniques from threat reports available on the web using the MITRE ATT&CK framework. We analyse four configurations utilising state-of-the-art tools, including the Threat Report ATT&CK Mapper (TRAM) and open-source Large Language Models (LLMs) such as Llama2. Our findings reveal significant challenges, including class imbalance, overfitting, and domain-specific complexity, which impede accurate technique extraction. To mitigate these issues, we propose a novel two-step pipeline: first, an LLM summarises the reports, and second, a retrained SciBERT model processes a rebalanced dataset augmented with LLM-generated data. This approach achieves an improvement in F1-scores compared to baseline models, with several attack techniques surpassing an F1-score of 0.90. Our contributions enhance the efficiency of web-based CTI systems and support collaborative cybersecurity operations in an interconnected digital landscape, paving the way for future research on integrating human-AI collaboration platforms.

Paperid: 1680, https://arxiv.org/pdf/2505.03072.pdf

Abstract:
This article describes SafeTab-H, a disclosure avoidance algorithm applied to the release of the U.S. Census Bureau's Detailed Demographic and Housing Characteristics File B (Detailed DHC-B) as part of the 2020 Census. The tabulations contain household statistics about household type and tenure iterated by the householder's detailed race, ethnicity, or American Indian and Alaska Native tribe and village at varying levels of geography. We describe the algorithmic strategy which is based on adding noise from a discrete Gaussian distribution and show that the algorithm satisfies a well-studied variant of differential privacy, called zero-concentrated differential privacy. We discuss how the implementation of the SafeTab-H codebase relies on the Tumult Analytics privacy library. We also describe the theoretical expected error properties of the algorithm and explore various aspects of its parameter tuning.

Paperid: 1681, https://arxiv.org/pdf/2505.01472.pdf

Abstract:
This article describes the disclosure avoidance algorithm that the U.S. Census Bureau used to protect the Detailed Demographic and Housing Characteristics File A (Detailed DHC-A) of the 2020 Census. The tabulations contain statistics (counts) of demographic characteristics of the entire population of the United States, crossed with detailed races and ethnicities at varying levels of geography. The article describes the SafeTab-P algorithm, which is based on adding noise drawn to statistics of interest from a discrete Gaussian distribution. A key innovation in SafeTab-P is the ability to adaptively choose how many statistics and at what granularity to release them, depending on the size of a population group. We prove that the algorithm satisfies a well-studied variant of differential privacy, called zero-concentrated differential privacy (zCDP). We then describe how the algorithm was implemented on Tumult Analytics and briefly outline the parameterization and tuning of the algorithm.

Paperid: 1682, https://arxiv.org/pdf/2505.01254.pdf

Abstract:
This article describes the disclosure avoidance algorithm that the U.S. Census Bureau used to protect the 2020 Census Supplemental Demographic and Housing Characteristics File (S-DHC). The tabulations contain statistics of counts of U.S. persons living in certain types of households, including averages. The article describes the PHSafe algorithm, which is based on adding noise drawn from a discrete Gaussian distribution to the statistics of interest. We prove that the algorithm satisfies a well-studied variant of differential privacy, called zero-concentrated differential privacy. We then describe how the algorithm was implemented on Tumult Analytics and briefly outline the parameterization and tuning of the algorithm.

Paperid: 1683, https://arxiv.org/pdf/2505.00946.pdf

Abstract:
Fraud detection is crucial in social service networks to maintain user trust and improve service network security. Existing spectral graph-based methods address this challenge by leveraging different graph filters to capture signals with different frequencies in service networks. However, most graph filter-based methods struggle with deriving clean and discriminative graph signals. On the one hand, they overlook the noise in the information propagation process, resulting in degradation of filtering ability. On the other hand, they fail to discriminate the frequency-specific characteristics of graph signals, leading to distortion of signals fusion. To address these issues, we develop a novel spectral graph network based on information bottleneck theory (SGNN-IB) for fraud detection in service networks. SGNN-IB splits the original graph into homophilic and heterophilic subgraphs to better capture the signals at different frequencies. For the first limitation, SGNN-IB applies information bottleneck theory to extract key characteristics of encoded representations. For the second limitation, SGNN-IB introduces prototype learning to implement signal fusion, preserving the frequency-specific characteristics of signals. Extensive experiments on three real-world datasets demonstrate that SGNN-IB outperforms state-of-the-art fraud detection methods.

Paperid: 1684, https://arxiv.org/pdf/2505.00465.pdf

Abstract:
Windows operating systems (OS) are ubiquitous in enterprise Information Technology (IT) and operational technology (OT) environments. Due to their widespread adoption and known vulnerabilities, they are often the primary targets of malware and ransomware attacks. With 93% of the ransomware targeting Windows-based systems, there is an urgent need for advanced defensive mechanisms to detect, analyze, and mitigate threats effectively. In this paper, we propose HoneyWin a high-interaction Windows honeypot that mimics an enterprise IT environment. The HoneyWin consists of three Windows 11 endpoints and an enterprise-grade gateway provisioned with comprehensive network traffic capturing, host-based logging, deceptive tokens, endpoint security and real-time alerts capabilities. The HoneyWin has been deployed live in the wild for 34 days and receives more than 5.79 million unsolicited connections, 1.24 million login attempts, 5 and 354 successful logins via remote desktop protocol (RDP) and secure shell (SSH) respectively. The adversary interacted with the deceptive token in one of the RDP sessions and exploited the public-facing endpoint to initiate the Simple Mail Transfer Protocol (SMTP) brute-force bot attack via SSH sessions. The adversary successfully harvested 1,250 SMTP credentials after attempting 151,179 credentials during the attack.

Paperid: 1685, https://arxiv.org/pdf/2504.21480.pdf

Abstract:
With the rapid advancement of blockchain technology, smart contracts have enabled the implementation of increasingly complex functionalities. However, ensuring the security of smart contracts remains a persistent challenge across the stages of development, compilation, and execution. Vulnerabilities within smart contracts not only undermine the security of individual applications but also pose significant risks to the broader blockchain ecosystem, as demonstrated by the growing frequency of attacks since 2016, resulting in substantial financial losses. This paper provides a comprehensive analysis of key security risks in Ethereum smart contracts, specifically those written in Solidity and executed on the Ethereum Virtual Machine (EVM). We focus on two prevalent and critical vulnerability types (reentrancy and integer overflow) by examining their underlying mechanisms, replicating attack scenarios, and assessing effective countermeasures.

Paperid: 1686, https://arxiv.org/pdf/2504.21036.pdf

Abstract:
Fine-tuning large language models (LLMs) has become an essential strategy for adapting them to specialized tasks; however, this process introduces significant privacy challenges, as sensitive training data may be inadvertently memorized and exposed. Although differential privacy (DP) offers strong theoretical guarantees against such leakage, its empirical privacy effectiveness on LLMs remains unclear, especially under different fine-tuning methods. In this paper, we systematically investigate the impact of DP across fine-tuning methods and privacy budgets, using both data extraction and membership inference attacks to assess empirical privacy risks. Our main findings are as follows: (1) Differential privacy reduces model utility, but its impact varies significantly across different fine-tuning methods. (2) Without DP, the privacy risks of models fine-tuned with different approaches differ considerably. (3) When DP is applied, even a relatively high privacy budget can substantially lower privacy risk. (4) The privacy-utility trade-off under DP training differs greatly among fine-tuning methods, with some methods being unsuitable for DP due to severe utility degradation. Our results provide practical guidance for privacy-conscious deployment of LLMs and pave the way for future research on optimizing the privacy-utility trade-off in fine-tuning methodologies.

Paperid: 1687, https://arxiv.org/pdf/2504.20570.pdf

Abstract:
Parameter-efficient fine-tuning (PEFT) has emerged as a practical solution for adapting large language models (LLMs) to custom datasets with significantly reduced computational cost. When carrying out PEFT under collaborative learning scenarios (e.g., federated learning), it is often required to exchange model updates (or gradients) across parties. These gradients, even with limited dimensions, can cause severe breach of data privacy. Recent works have shown that both contextual prefixes and personally identifiable information (PII) can be exposed through gradients. However, \emph{simultaneously} and \emph{accurately} recovering both components from the same training instance remains infeasible due to the following challenges: 1) limited number of PEFT parameters; 2) high-dimensional token spaces; and 3) large batch sizes. We propose ReCIT, a novel privacy attack that addresses all challenges, and achieves recovery of \emph{full} private data from PEFT gradients with high fidelity. Specifically, ReCIT proposes to enhance the memorization capability of the pre-trained model through malicious fine-tuning with Personal Notes; ReCIT also proposes a novel filter-based token extraction technique and a token pairing mechanism, to accurately reconstruct tokens from the training sequences with large batch sizes. Extensive evaluations show that ReCIT consistently outperforms state-of-the-art gradient inversion and memorization-based attacks across different PEFT paradigms. It achieves up to 10$\times$ higher PII recovery rates and remains effective across varying batch sizes, especially in settings where prefix reconstruction is intractable for conventional approaches. These findings highlight an urgent need to reassess the privacy guarantees of PEFT, especially in decentralized or shared training environments.

Paperid: 1688, https://arxiv.org/pdf/2504.20260.pdf

Abstract:
The inclusion of pervasive computing devices in a democratized edge computing ecosystem can significantly expand the capability and coverage of near-end computing for large-scale applications. However, offloading user tasks to heterogeneous and decentralized edge devices comes with the dual risk of both endangered user data security and privacy due to the curious base station or malicious edge servers, and unfair offloading and malicious attacks targeting edge servers from other edge servers and/or users. Existing solutions to edge access control and offloading either rely on "always-on" cloud servers with reduced edge benefits or fail to protect sensitive user service information. To address these challenges, this paper presents SA2FE, a novel framework for edge access control, offloading and accounting. We design a rerandomizable puzzle primitive and a corresponding scheme to protect sensitive service information from eavesdroppers and ensure fair offloading decisions, while a blind token-based scheme safeguards user privacy, prevents double spending, and ensures usage accountability. The security of SA2FE is proved under the Universal Composability framework, and its performance and scalability are demonstrated with implementation on commodity mobile devices and edge servers.

Paperid: 1689, https://arxiv.org/pdf/2504.19566.pdf

Abstract:
For those seeking end-to-end private communication free from pervasive metadata tracking and censorship, the Tor network has been the de-facto choice in practice, despite its susceptibility to traffic analysis attacks. Recently, numerous metadata-private messaging proposals have emerged with the aim to surpass Tor in the messaging context by obscuring the relationships between any two messaging buddies, even against global and active attackers. However, most of these systems face an undesirable usability constraint: they require a metadata-private "dialing" phase to establish mutual agreement and timing or round coordination before initiating any regular chats among users. This phase is not only resource-intensive but also inflexible, limiting users' ability to manage multiple concurrent conversations seamlessly. For stringent privacy requirement, the often-enforced traffic uniformity further exacerbated the limitations of this roadblock. In this paper, we introduce PingPong, a new end-to-end system for metadata-private messaging designed to overcome these limitations. Under the same traffic uniformity requirement, PingPong replaces the rigid "dial-before-converse" paradigm with a more flexible "notify-before-retrieval" workflow. This workflow incorporates a metadata-private notification subsystem, Ping, and a metadata-private message store, Pong. Both Ping and Pong leverage hardware-assisted secure enclaves for performance and operates through a series of customized oblivious algorithms, while meeting the uniformity requirements for metadata protection. By allowing users to switch between conversations on demand, PingPong achieves a level of usability akin to modern instant messaging systems, while also offering improved performance and bandwidth utilization for goodput. We have built a prototype of PingPong with 32 8-core servers equipped with enclaves to validate our claims.

Paperid: 1690, https://arxiv.org/pdf/2504.18349.pdf

Abstract:
With the surge of large language models (LLMs), Large Vision-Language Models (VLMs)--which integrate vision encoders with LLMs for accurate visual grounding--have shown great potential in tasks like generalist agents and robotic control. However, VLMs are typically trained on massive web-scraped images, raising concerns over copyright infringement and privacy violations, and making data auditing increasingly urgent. Membership inference (MI), which determines whether a sample was used in training, has emerged as a key auditing technique, with promising results on open-source VLMs like LLaVA (AUC > 80%). In this work, we revisit these advances and uncover a critical issue: current MI benchmarks suffer from distribution shifts between member and non-member images, introducing shortcut cues that inflate MI performance. We further analyze the nature of these shifts and propose a principled metric based on optimal transport to quantify the distribution discrepancy. To evaluate MI in realistic settings, we construct new benchmarks with i.i.d. member and non-member images. Existing MI methods fail under these unbiased conditions, performing only marginally better than chance. Further, we explore the theoretical upper bound of MI by probing the Bayes Optimality within the VLM's embedding space and find the irreducible error rate remains high. Despite this pessimistic outlook, we analyze why MI for VLMs is particularly challenging and identify three practical scenarios--fine-tuning, access to ground-truth texts, and set-based inference--where auditing becomes feasible. Our study presents a systematic view of the limits and opportunities of MI for VLMs, providing guidance for future efforts in trustworthy data auditing.

Paperid: 1691, https://arxiv.org/pdf/2504.15942.pdf

Abstract:
AI-based systems, such as Google's GenCast, have recently redefined the state of the art in weather forecasting, offering more accurate and timely predictions of both everyday weather and extreme events. While these systems are on the verge of replacing traditional meteorological methods, they also introduce new vulnerabilities into the forecasting process. In this paper, we investigate this threat and present a novel attack on autoregressive diffusion models, such as those used in GenCast, capable of manipulating weather forecasts and fabricating extreme events, including hurricanes, heat waves, and intense rainfall. The attack introduces subtle perturbations into weather observations that are statistically indistinguishable from natural noise and change less than 0.1% of the measurements - comparable to tampering with data from a single meteorological satellite. As modern forecasting integrates data from nearly a hundred satellites and many other sources operated by different countries, our findings highlight a critical security risk with the potential to cause large-scale disruptions and undermine public trust in weather prediction.

Paperid: 1692, https://arxiv.org/pdf/2504.15817.pdf

Abstract:
Fully Homomorphic Encryption (FHE) is a set of powerful cryptographic schemes that allows computation to be performed directly on encrypted data with an unlimited depth. Despite FHE's promising in privacy-preserving computing, yet in most FHE schemes, ciphertext generally blows up thousands of times compared to the original message, and the massive amount of data load from off-chip memory for bootstrapping and privacy-preserving machine learning applications (such as HELR, ResNet-20), both degrade the performance of FHE-based computation. Several hardware designs have been proposed to address this issue, however, most of them require enormous resources and power. An acceleration platform with easy programmability, high efficiency, and low overhead is a prerequisite for practical application. This paper proposes EFFACT, a highly efficient full-stack FHE acceleration platform with a compiler that provides comprehensive optimizations and vector-friendly hardware. We start by examining the computational overhead across different real-world benchmarks to highlight the potential benefits of reallocating computing resources for efficiency enhancement. Then we make a design space exploration to find an optimal SRAM size with high utilization and low cost. On the other hand, EFFACT features a novel optimization named streaming memory access which is proposed to enable high throughput with limited SRAMs. Regarding the software-side optimization, we also propose a circuit-level function unit reuse scheme, to substantially reduce the computing resources without performance degradation. Moreover, we design novel NTT and automorphism units that are suitable for a cost-sensitive and highly efficient architecture, leading to low area. For generality, EFFACT is also equipped with an ISA and a compiler backend that can support several FHE schemes like CKKS, BGV, and BFV.

Paperid: 1693, https://arxiv.org/pdf/2504.14957.pdf

Abstract:
Ma and Huang recently proved that the PFC construction, introduced by Metger, Poremba, Sinha and Yuen [MPSY24], gives an adaptive-secure pseudorandom unitary family PRU. Their proof developed a new path recording technique [MH24]. In this work, we show that a linear number of sequential repetitions of the parallel Kac's Walk, introduced by Lu, Qin, Song, Yao and Zhao [LQSY+24], also forms an adaptive-secure PRU, confirming a conjecture therein. Moreover, it additionally satisfies strong security against adversaries making inverse queries. This gives an alternative PRU construction, and provides another instance demonstrating the power of the path recording technique. We also discuss some further simplifications and implications.

Paperid: 1694, https://arxiv.org/pdf/2504.14809.pdf

Abstract:
Blockchain technology promises a decentralized, trustless, and interoperable infrastructure. However, widespread adoption remains hindered by issues such as limited scalability, high transaction costs, and the complexity of maintaining coherent verification logic across different blockchain layers. This paper introduces Verifiable Applications (vApps), a novel development framework designed to streamline the creation and deployment of verifiable blockchain computing applications. vApps offer a unified Rust-based Domain-Specific Language (DSL) within a comprehensive SDK, featuring modular abstractions for verification, proof generation, and inter-chain connectivity. This eases the developer's burden in securing diverse software components, allowing them to focus on application logic. The DSL also ensures that applications can automatically take advantage of specialized precompiles and hardware acceleration to achieve consistently high performance with minimal developer effort, as demonstrated by benchmark results for zero-knowledge virtual machines (zkVMs). Experiments show that native Rust execution eliminates interpretation overhead, delivering up to an 197x cycle count improvement compared to EVM-based approaches. Precompiled circuits can accelerate the proof by more than 95%, while GPU acceleration increases throughput by up to 30x and recursion compresses the proof size by up to 230x, enabling succinct and efficient verification. The framework also supports seamless integration with the Web2 and Web3 systems, enabling developers to focus solely on their application logic. Through modular architecture, robust security guarantees, and composability, vApps pave the way toward a trust-minimized and verifiable Internet-scale application environment.

Paperid: 1695, https://arxiv.org/pdf/2504.13385.pdf

Abstract:
Cache occupancy attacks exploit the shared nature of cache hierarchies to infer a victim's activities by monitoring overall cache usage, unlike access-driven cache attacks that focus on specific cache lines or sets. There exists some prior work that target the last-level cache (LLC) of Intel processors, which is inclusive of higher-level caches, and L2 caches of ARM systems. In this paper, we target the System-Level Cache (SLC) of Apple M-series SoCs, which is exclusive to higher-level CPU caches. We address the challenges of the exclusiveness and propose a suite of SLC-cache occupancy attacks, the first of its kind, where an adversary can monitor GPU and other CPU cluster activities from their own CPU cluster. We first discover the structure of SLC in Apple M1 SOC and various policies pertaining to access and sharing through reverse engineering. We propose two attacks against websites. One is a coarse-grained fingerprinting attack, recognizing which website is accessed based on their different GPU memory access patterns monitored through the SLC occupancy channel. The other attack is a fine-grained pixel stealing attack, which precisely monitors the GPU memory usage for rendering different pixels, through the SLC occupancy channel. Third, we introduce a novel screen capturing attack which works beyond webpages, with the monitoring granularity of 57 rows of pixels (there are 1600 rows for the screen). This significantly expands the attack surface, allowing the adversary to retrieve any screen display, posing a substantial new threat to system security. Our findings reveal critical vulnerabilities in Apple's M-series SoCs and emphasize the urgent need for effective countermeasures against cache occupancy attacks in heterogeneous computing environments.

Paperid: 1696, https://arxiv.org/pdf/2504.11961.pdf

Abstract:
Zero-knowledge (ZK) circuits enable privacy-preserving computations and are central to many cryptographic protocols. Systems like Circom simplify ZK development by combining witness computation and circuit constraints in one program. However, even small errors can compromise security of ZK programs --under-constrained circuits may accept invalid witnesses, while over-constrained ones may reject valid ones. Static analyzers are often imprecise with high false positives, and formal tools struggle with real-world circuit scale. Additionally, existing tools overlook several critical behaviors, such as intermediate computations and program aborts, and thus miss many vulnerabilities. Our theoretical contribution is the Trace-Constraint Consistency Test (TCCT), a foundational language-independent formulation of ZK circuit bugs that defines bugs as discrepancies between the execution traces of the computation and the circuit constraints. TCCT captures both intermediate computations and program aborts, detecting bugs that elude prior tools. Our systems contribution is zkFuzz, a novel program mutation-based fuzzing framework for detecting TCCT violations. zkFuzz systematically mutates the computational logic of Zk programs guided by a novel fitness function, and injects carefully crafted inputs using tailored heuristics to expose bugs. We evaluated zkFuzz on 354 real-world ZK circuits written in Circom, a leading programming system for ZK development. zkFuzz successfully identified 66 bugs, including 38 zero-days --18 of which were confirmed by developers and 6 fixed, earning bug bounties.

Paperid: 1697, https://arxiv.org/pdf/2504.10717.pdf

Abstract:
Fuzz testing to find semantic control vulnerabilities is an essential activity to evaluate the robustness of autonomous driving (AD) software. Whilst there is a preponderance of disparate fuzzing tools that target different parts of the test environment, such as the scenario, sensors, and vehicle dynamics, there is a lack of fuzzing strategies that ensemble these fuzzers to enable concurrent fuzzing, utilizing diverse techniques and targets. This research proposes FuzzSense, a modular, black-box, mutation-based fuzzing framework that is architected to ensemble diverse AD fuzzing tools. To validate the utility of FuzzSense, a LiDAR sensor fuzzer was developed as a plug-in, and the fuzzer was implemented in the new AD simulation platform AWSIM and Autoware.Universe AD software platform. The results demonstrated that FuzzSense was able to find vulnerabilities in the new Autoware.Universe software. We contribute to FuzzSense open-source with the aim of initiating a conversation in the community on the design of AD-specific fuzzers and the establishment of a community fuzzing framework to better target the diverse technology base of autonomous vehicles.

Paperid: 1698, https://arxiv.org/pdf/2504.08508.pdf

Abstract:
Deploying machine learning (ML) models on user devices can improve privacy (by keeping data local) and reduce inference latency. Trusted Execution Environments (TEEs) are a practical solution for protecting proprietary models, yet existing TEE solutions have architectural constraints that hinder on-device model deployment. Arm Confidential Computing Architecture (CCA), a new Arm extension, addresses several of these limitations and shows promise as a secure platform for on-device ML. In this paper, we evaluate the performance-privacy trade-offs of deploying models within CCA, highlighting its potential to enable confidential and efficient ML applications. Our evaluations show that CCA can achieve an overhead of, at most, 22% in running models of different sizes and applications, including image classification, voice recognition, and chat assistants. This performance overhead comes with privacy benefits; for example, our framework can successfully protect the model against membership inference attack by an 8.3% reduction in the adversary's success rate. To support further research and early adoption, we make our code and methodology publicly available.

Paperid: 1699, https://arxiv.org/pdf/2504.07015.pdf

Abstract:
As modern hardware designs grow in complexity and size, ensuring security across the confidentiality, integrity, and availability (CIA) triad becomes increasingly challenging. Information flow tracking (IFT) is a widely-used approach to tracing data propagation, identifying unauthorized activities that may compromise confidentiality or/and integrity in hardware. However, traditional IFT methods struggle with scalability and adaptability, particularly in high-density and interconnected architectures, leading to tracing bottlenecks that limit applicability in large-scale hardware. To address these limitations and show the potential of transformer-based models in integrated circuit (IC) design, this paper introduces LLM-IFT that integrates large language models (LLM) for the realization of the IFT process in hardware. LLM-IFT exploits LLM-driven structured reasoning to perform hierarchical dependency analysis, systematically breaking down even the most complex designs. Through a multi-step LLM invocation, the framework analyzes both intra-module and inter-module dependencies, enabling comprehensive IFT assessment. By focusing on a set of Trust-Hub vulnerability test cases at both the IP level and the SoC level, our experiments demonstrate a 100\% success rate in accurate IFT analysis for confidentiality and integrity checks in hardware.

Paperid: 1700, https://arxiv.org/pdf/2504.05689.pdf

Abstract:
Conversational large language models (LLMs) have gained widespread attention due to their instruction-following capabilities. To ensure conversational LLMs follow instructions, role separators are employed to distinguish between different participants in a conversation. However, incorporating role separators introduces potential vulnerabilities. Misusing roles can lead to prompt injection attacks, which can easily misalign the model's behavior with the user's intentions, raising significant security concerns. Although various prompt injection attacks have been proposed, recent research has largely overlooked the impact of role separators on safety. This highlights the critical need to thoroughly understand the systemic weaknesses in dialogue systems caused by role separators. This paper identifies modeling weaknesses caused by role separators. Specifically, we observe a strong positional bias associated with role separators, which is inherent in the format of dialogue modeling and can be triggered by the insertion of role separators. We further develop the Separators Injection Attack (SIA), a new orthometric attack based on role separators. The experiment results show that SIA is efficient and extensive in manipulating model behavior with an average gain of 18.2% for manual methods and enhances the attack success rate to 100% with automatic methods.

Paperid: 1701, https://arxiv.org/pdf/2504.05202.pdf

Abstract:
Differential privacy (DP) can be achieved in a distributed manner, where multiple parties add independent noise such that their sum protects the overall dataset with DP. A common technique here is for each party to sample their noise from the decomposition of an infinitely divisible distribution. We analyze two mechanisms in this setting: 1) the generalized discrete Laplace (GDL) mechanism, whose distribution (which is closed under summation) follows from differences of i.i.d. negative binomial shares, and 2) the multi-scale discrete Laplace (MSDLap) mechanism, a novel mechanism following the sum of multiple i.i.d. discrete Laplace shares at different scales. For $\varepsilon \geq 1$, our mechanisms can be parameterized to have $O\left(Î^3 e^{-\varepsilon}\right)$ and $O\left(\min\left(Î^3 e^{-\varepsilon}, Î^2 e^{-2\varepsilon/3}\right)\right)$ MSE, respectively, where $Î$ denote the sensitivity; the latter bound matches known optimality results. We also show a transformation from the discrete setting to the continuous setting, which allows us to transform both mechanisms to the continuous setting and thereby achieve the optimal $O\left(Î^2 e^{-2\varepsilon / 3}\right)$ MSE. To our knowledge, these are the first infinitely divisible additive noise mechanisms that achieve order-optimal MSE under pure DP, so our work shows formally there is no separation in utility when query-independent noise adding mechanisms are restricted to infinitely divisible noise. For the continuous setting, our result improves upon the Arete mechanism from [Pagh and Stausholm, ALT 2022] which gives an MSE of $O\left(Î^2 e^{-\varepsilon/4}\right)$. Furthermore, we give an exact sampler tuned to efficiently implement the MSDLap mechanism, and we apply our results to improve a state of the art multi-message shuffle DP protocol in the high $\varepsilon$ regime.

Paperid: 1702, https://arxiv.org/pdf/2504.05143.pdf

Abstract:
Blockchain solutions typically assume a synchronous network to ensure consistency and achieve consensus. In contrast, offline transaction systems aim to enable users to agree on and execute transactions without assuming bounded communication delays when interacting with the blockchain. Most existing offline payment schemes depend on trusted hardware wallets that are assumed to be secure and tamper-proof. While this work introduces Overdraft, a novel offline payment system that shifts the reliance from hardware to users themselves. Overdraft allows potential payment receivers to assess the likelihood of being paid, allowing them to accept transactions with confidence or deny them. Overdraft achieves this by maintaining a loan network that is weighted by online reputation. This loan network contains time-limited agreements where users pledge to cover another user's payment if necessary. For example, when a payer lacks sufficient funds at the moment of commitment. Offline users rely on the last known view of the loan network -- which they had access to when last online -- to determine whether to participate in an offline transaction. This view is used to estimate the probability of eventual payment, possibly using multiple loans. Once online again, users commit their transactions to the blockchain with any conflicts being resolved deterministically. Overdraft incorporates incentives for users and is designed to be resilient against Sybil attacks. As a proof of concept, we implemented Overdraft as an Ethereum Solidity smart contract and deployed it on the Sepolia testnet to evaluate its performance.

Paperid: 1703, https://arxiv.org/pdf/2504.01856.pdf

Abstract:
We study the tasks of collective coin flipping and leader election in the full-information model. We prove new lower bounds for coin flipping protocols, implying lower bounds for leader election protocols. We show that any $k$-round coin flipping protocol, where each of $\ell$ players sends 1 bit per round, can be biased by $O(\ell/\log^{(k)}(\ell))$ bad players. For all $k>1$ this strengthens previous lower bounds [RSZ, SICOMP 2002], which ruled out protocols resilient to adversaries controlling $O(\ell/\log^{(2k-1)}(\ell))$ players. Consequently, we establish that any protocol tolerating a linear fraction of corrupt players, with only 1 bit per round, must run for at least $\log^*\ell-O(1)$ rounds, improving on the prior best lower bound of $\frac12 \log^*\ell-\log^*\log^*\ell$. This lower bound matches the number of rounds, $\log^*\ell$, taken by the current best coin flipping protocols from [RZ, JCSS 2001], [F, FOCS 1999] that can handle a linear sized coalition of bad players, but with players sending unlimited bits per round. We also derive lower bounds for protocols allowing multi-bit messages per round. Our results show that the protocols from [RZ, JCSS 2001], [F, FOCS 1999] that handle a linear number of corrupt players are almost optimal in terms of round complexity and communication per player in a round. A key technical ingredient in proving our lower bounds is a new result regarding biasing most functions from a family of functions using a common set of bad players and a small specialized set of bad players specific to each function that is biased. We give improved constant-round coin flipping protocols in the setting that each player can send 1 bit per round. For two rounds, our protocol can handle $O(\ell/(\log\ell)(\log\log\ell)^2)$ sized coalition of bad players; better than the best one-round protocol by [AL, Combinatorica 1993] in this setting.

Paperid: 1704, https://arxiv.org/pdf/2503.22413.pdf

Abstract:
The growing trend of legal disputes over the unauthorized use of data in machine learning (ML) systems highlights the urgent need for reliable data-use auditing mechanisms to ensure accountability and transparency in ML. We present the first proactive, instance-level, data-use auditing method designed to enable data owners to audit the use of their individual data instances in ML models, providing more fine-grained auditing results than previous work. To do so, our research generalizes previous work integrating black-box membership inference and sequential hypothesis testing, expanding its scope of application while preserving the quantifiable and tunable false-detection rate that is its hallmark. We evaluate our method on three types of visual ML models: image classifiers, visual encoders, and vision-language models (Contrastive Language-Image Pretraining (CLIP) and Bootstrapping Language-Image Pretraining (BLIP) models). In addition, we apply our method to evaluate the performance of two state-of-the-art approximate unlearning methods. As a noteworthy second contribution, our work reveals that neither method successfully removes the influence of the unlearned data instances from image classifiers and CLIP models, even if sacrificing model utility by $10\%$.

Paperid: 1705, https://arxiv.org/pdf/2503.20364.pdf

Abstract:
Global Navigation Satellite Systems (GNSS) provide standalone precise navigation for a wide gamut of applications. Nevertheless, applications or systems such as unmanned vehicles (aerial or ground vehicles and surface vessels) generally require a much higher level of accuracy than those provided by standalone receivers. The most effective and economical way of achieving centimeter-level accuracy is to rely on corrections provided by fixed \emph{reference station} receivers to improve the satellite ranging measurements. Differential GNSS (DGNSS) and Real Time Kinematics (RTK) provide centimeter-level accuracy by distributing online correction streams to connected nearby mobile receivers typically termed \emph{rovers}. However, due to their static nature, reference stations are prime targets for GNSS attacks, both simplistic jamming and advanced spoofing, with different levels of adversarial control and complexity. Jamming the reference station would deny corrections and thus accuracy to the rovers. Spoofing the reference station would force it to distribute misleading corrections. As a result, all connected rovers using those corrections will be equally influenced by the adversary independently of their actual trajectory. We evaluate a battery of tests generated with an RF simulator to test the robustness of a common DGNSS/RTK processing library and receivers. We test both jamming and synchronized spoofing to demonstrate that adversarial action on the rover using reference spoofing is both effective and convenient from an adversarial perspective. Additionally, we discuss possible strategies based on existing countermeasures (self-validation of the PNT solution and monitoring of own clock drift) that the rover and the reference station can adopt to avoid using or distributing bogus corrections.

Paperid: 1706, https://arxiv.org/pdf/2503.18345.pdf

Abstract:
The Tor network enhances clients' privacy by routing traffic through an overlay network of volunteered intermediate relays. Tor employs a distributed protocol among nine hard-coded Directory Authority (DA) servers to securely disseminate information about these relays to produce a new consensus document every hour. With a straightforward voting mechanism to ensure consistency, the protocol is expected to be secure even when a minority of those authorities get compromised. However, the current consensus protocol is flawed: it allows an equivocation attack that enables only a single compromised authority to create a valid consensus document with malicious relays. Importantly the vulnerability is not innocuous: We demonstrate that the compromised authority can effectively trick a targeted client into using the equivocated consensus document in an undetectable manner. Moreover, even if we have archived Tor consensus documents available since its beginning, we cannot be sure that no client was ever tricked. We propose a two-stage solution to deal with this exploit. In the short term, we have developed and deployed TorEq, a monitor to detect such exploits reactively: the Tor clients can refer to the monitor before updating the consensus to ensure no equivocation. To solve the problem proactively, we first define the Tor DA consensus problem as the interactive consistency (IC) problem from the distributed computing literature. We then design DirCast, a novel secure Byzantine Broadcast protocol that requires minimal code change from the current Tor DA code base. Our protocol has near-optimal efficiency that uses optimistically five rounds and at most nine rounds to reach an agreement in the current nine-authority system. We are communicating with the Tor security team to incorporate the solutions into the Tor project.

Paperid: 1707, https://arxiv.org/pdf/2503.17132.pdf

Abstract:
This paper explores the promising interplay between spiking neural networks (SNNs) and event-based cameras for privacy-preserving human action recognition (HAR). The unique feature of event cameras in capturing only the outlines of motion, combined with SNNs' proficiency in processing spatiotemporal data through spikes, establishes a highly synergistic compatibility for event-based HAR. Previous studies, however, have been limited by SNNs' ability to process long-term temporal information, essential for precise HAR. In this paper, we introduce two novel frameworks to address this: temporal segment-based SNN (\textit{TS-SNN}) and 3D convolutional SNN (\textit{3D-SNN}). The \textit{TS-SNN} extracts long-term temporal information by dividing actions into shorter segments, while the \textit{3D-SNN} replaces 2D spatial elements with 3D components to facilitate the transmission of temporal information. To promote further research in event-based HAR, we create a dataset, \textit{FallingDetection-CeleX}, collected using the high-resolution CeleX-V event camera $(1280 \times 800)$, comprising 7 distinct actions. Extensive experimental results show that our proposed frameworks surpass state-of-the-art SNN methods on our newly collected dataset and three other neuromorphic datasets, showcasing their effectiveness in handling long-range temporal information for event-based HAR.

Paperid: 1708, https://arxiv.org/pdf/2503.14279.pdf

Abstract:
Many computing systems need to be protected against physical attacks using active tamper detection based on sensors. One technical solution is to employ an ATR (Anti-Tamper Radio) approach, analyzing the radio wave propagation effects within a protected device to detect unauthorized physical alterations. However, ATR systems face key challenges in terms of susceptibility to signal manipulation attacks, limited reliability due to environmental noise, and regulatory constraints from wide bandwidth usage. In this work, we propose and experimentally evaluate an ATR system complemented by an RIS to dynamically reconfigure the wireless propagation environment. We show that this approach can enhance resistance against signal manipulation attacks, reduce bandwidth requirements from several~GHz down to as low as 20 MHz, and improve robustness to environmental disturbances such as internal fan movements. Our work demonstrates that RIS integration can strengthen the ATR performance to enhance security, sensitivity, and robustness, recognizing the potential of smart radio environments for ATR-based tamper detection

Paperid: 1709, https://arxiv.org/pdf/2503.11945.pdf

Abstract:
Invisible watermarking of AI-generated images can help with copyright protection, enabling detection and identification of AI-generated media. In this work, we present a novel approach to watermark images of T2I Latent Diffusion Models (LDMs). By only fine-tuning text token embeddings $W_*$, we enable watermarking in selected objects or parts of the image, offering greater flexibility compared to traditional full-image watermarking. Our method leverages the text encoder's compatibility across various LDMs, allowing plug-and-play integration for different LDMs. Moreover, introducing the watermark early in the encoding stage improves robustness to adversarial perturbations in later stages of the pipeline. Our approach achieves $99\%$ bit accuracy ($48$ bits) with a $10^5 \times$ reduction in model parameters, enabling efficient watermarking.

Paperid: 1710, https://arxiv.org/pdf/2503.09513.pdf

Abstract:
Internet of Things (IoT) platforms with trigger-action capability allow event conditions to trigger actions in IoT devices autonomously by creating a chain of interactions. Adversaries exploit this chain of interactions to maliciously inject fake event conditions into IoT hubs, triggering unauthorized actions on target IoT devices to implement remote injection attacks. Existing defense mechanisms focus mainly on the verification of event transactions using physical event fingerprints to enforce the security policies to block unsafe event transactions. These approaches are designed to provide offline defense against injection attacks. The state-of-the-art online defense mechanisms offer real-time defense, but extensive reliability on the inference of attack impacts on the IoT network limits the generalization capability of these approaches. In this paper, we propose a platform-independent multi-agent online defense system, namely RESTRAIN, to counter remote injection attacks at runtime. RESTRAIN allows the defense agent to profile attack actions at runtime and leverages reinforcement learning to optimize a defense policy that complies with the security requirements of the IoT network. The experimental results show that the defense agent effectively takes real-time defense actions against complex and dynamic remote injection attacks and maximizes the security gain with minimal computational overhead.

Paperid: 1711, https://arxiv.org/pdf/2503.09365.pdf

Abstract:
Deep learning models have an intrinsic privacy issue as they memorize parts of their training data, creating a privacy leakage. Membership Inference Attacks (MIA) exploit it to obtain confidential information about the data used for training, aiming to steal information. They can be repurposed as a measurement of data integrity by inferring whether it was used to train a machine learning model. While state-of-the-art attacks achieve a significant privacy leakage, their requirements are not feasible enough, hindering their role as practical tools to assess the magnitude of the privacy risk. Moreover, the most appropriate evaluation metric of MIA, the True Positive Rate at low False Positive Rate lacks interpretability. We claim that the incorporation of Few-Shot Learning techniques to the MIA field and a proper qualitative and quantitative privacy evaluation measure should deal with these issues. In this context, our proposal is twofold. We propose a Few-Shot learning based MIA, coined as the FeS-MIA model, which eases the evaluation of the privacy breach of a deep learning model by significantly reducing the number of resources required for the purpose. Furthermore, we propose an interpretable quantitative and qualitative measure of privacy, referred to as Log-MIA measure. Jointly, these proposals provide new tools to assess the privacy leakage and to ease the evaluation of the training data integrity of deep learning models, that is, to analyze the privacy breach of a deep learning model. Experiments carried out with MIA over image classification and language modeling tasks and its comparison to the state-of-the-art show that our proposals excel at reporting the privacy leakage of a deep learning model with little extra information.

Paperid: 1712, https://arxiv.org/pdf/2503.07857.pdf

Abstract:
Open Radio Access Networks (O-RAN) are transforming telecommunications by shifting from centralized to distributed architectures, promoting flexibility, interoperability, and innovation through open interfaces and multi-vendor environments. However, O-RAN's reliance on cloud-based architecture and enhanced observability introduces significant security and resource management challenges. Efficient resource management is crucial for secure and reliable communication in O-RAN, within the resource-constrained environment and heterogeneity of requirements, where multiple User Equipment (UE) and O-RAN Radio Units (O-RUs) coexist. This paper develops a framework to manage these aspects, ensuring each O-RU is associated with UEs based on their communication channel qualities and computational resources, and selecting appropriate encryption algorithms to safeguard data confidentiality, integrity, and authentication. A Multi-objective Optimization Problem (MOP) is formulated to minimize latency and maximize security within resource constraints. Different approaches are proposed to relax the complexity of the problem and achieve near-optimal performance, facilitating trade-offs between latency, security, and solution complexity. Simulation results demonstrate that the proposed approaches are close enough to the optimal solution, proving that our approach is both effective and efficient.

Paperid: 1713, https://arxiv.org/pdf/2503.05445.pdf

Abstract:
Large language models (LLMs) have shown state-of-the-art results in translating natural language questions into SQL queries (Text-to-SQL), a long-standing challenge within the database community. However, security concerns remain largely unexplored, particularly the threat of backdoor attacks, which can introduce malicious behaviors into models through fine-tuning with poisoned datasets. In this work, we systematically investigate the vulnerabilities of LLM-based Text-to-SQL models and present ToxicSQL, a novel backdoor attack framework. Our approach leverages stealthy {semantic and character-level triggers} to make backdoors difficult to detect and remove, ensuring that malicious behaviors remain covert while maintaining high model accuracy on benign inputs. Furthermore, we propose leveraging SQL injection payloads as backdoor targets, enabling the generation of malicious yet executable SQL queries, which pose severe security and privacy risks in language model-based SQL development. We demonstrate that injecting only 0.44% of poisoned data can result in an attack success rate of 79.41%, posing a significant risk to database security. Additionally, we propose detection and mitigation strategies to enhance model reliability. Our findings highlight the urgent need for security-aware Text-to-SQL development, emphasizing the importance of robust defenses against backdoor threats.

Paperid: 1714, https://arxiv.org/pdf/2503.04980.pdf

Abstract:
Synthetic data generation is one approach for sharing individual-level data. However, to meet legislative requirements, it is necessary to demonstrate that the individuals' privacy is adequately protected. There is no consolidated standard for measuring privacy in synthetic data. Through an expert panel and consensus process, we developed a framework for evaluating privacy in synthetic data. Our findings indicate that current similarity metrics fail to measure identity disclosure, and their use is discouraged. For differentially private synthetic data, a privacy budget other than close to zero was not considered interpretable. There was consensus on the importance of membership and attribute disclosure, both of which involve inferring personal information about an individual without necessarily revealing their identity. The resultant framework provides precise recommendations for metrics that address these types of disclosures effectively. Our findings further present specific opportunities for future research that can help with widespread adoption of synthetic data.

Paperid: 1715, https://arxiv.org/pdf/2503.04846.pdf

Abstract:
Fault injection attacks represent a class of threats that can compromise embedded systems across multiple layers of abstraction, such as system software, instruction set architecture (ISA), microarchitecture, and physical implementation. Early detection of these vulnerabilities and understanding their root causes along with their propagation from the physical layer to the system software is critical to secure the cyberinfrastructure. This present presents a comprehensive methodology for conducting controlled fault injection attacks at the pre-silicon level and an analysis of the underlying system for root-causing behavior. As the driving application, we use the clock glitch attacks in AI/ML applications for critical misclassification. Our study aims to characterize and diagnose the impact of faults within the RISC-V instruction set and pipeline stages, while tracing fault propagation from the circuit level to the AI/ML application software. This analysis resulted in discovering a novel vulnerability through controlled clock glitch parameters, specifically targeting the RISC-V decode stage.

Paperid: 1716, https://arxiv.org/pdf/2503.04292.pdf

Abstract:
Browser extensions are additional tools developed by third parties that integrate with web browsers to extend their functionality beyond standard capabilities. However, the browser extension platform is increasingly being exploited by hackers to launch sophisticated cyber threats. These threats encompass a wide range of malicious activities, including but not limited to phishing, spying, Distributed Denial of Service (DDoS) attacks, email spamming, affiliate fraud, malvertising, and payment fraud. This paper examines the evolving threat landscape of malicious browser extensions in 2025, focusing on Mozilla Firefox and Chrome. Our research successfully bypassed security mechanisms of Firefox and Chrome, demonstrating that malicious extensions can still be developed, published, and executed within the Mozilla Add-ons Store and Chrome Web Store. These findings highlight the persisting weaknesses in browser's vetting process and security framework. It provides insights into the risks associated with browser extensions, helping users understand these threats while aiding the industry in developing controls and countermeasures to defend against such attacks. All experiments discussed in this paper were conducted in a controlled laboratory environment by the researchers, adhering to proper ethical guidelines. The sole purpose of these experiments is to raise security awareness among the industry, research community, and the general public.

Paperid: 1717, https://arxiv.org/pdf/2503.04174.pdf

Abstract:
As modern networks grow increasingly complex--driven by diverse devices, encrypted protocols, and evolving threats--network traffic analysis has become critically important. Existing machine learning models often rely only on a single representation of packets or flows, limiting their ability to capture the contextual relationships essential for robust analysis. Furthermore, task-specific architectures for supervised, semi-supervised, and unsupervised learning lead to inefficiencies in adapting to varying data formats and security tasks. To address these gaps, we propose UniNet, a unified framework that introduces a novel multi-granular traffic representation (T-Matrix), integrating session, flow, and packet-level features to provide comprehensive contextual information. Combined with T-Attent, a lightweight attention-based model, UniNet efficiently learns latent embeddings for diverse security tasks. Extensive evaluations across four key network security and privacy problems--anomaly detection, attack classification, IoT device identification, and encrypted website fingerprinting--demonstrate UniNet's significant performance gain over state-of-the-art methods, achieving higher accuracy, lower false positive rates, and improved scalability. By addressing the limitations of single-level models and unifying traffic analysis paradigms, UniNet sets a new benchmark for modern network security.

Paperid: 1718, https://arxiv.org/pdf/2503.03877.pdf

Abstract:
Fault injection attacks represent a class of threats that can compromise embedded systems across multiple layers of abstraction, such as system software, instruction set architecture (ISA), microarchitecture, and physical implementation. Early detection of these vulnerabilities and understanding their root causes, along with their propagation from the physical layer to the system software, is critical to secure the cyberinfrastructure. This work presents a comprehensive methodology for conducting controlled fault injection attacks at the pre-silicon level and an analysis of the underlying system for root-causing behavior. As the driving application, we use the clock glitch attacks in AI/ML applications for critical misclassification. Our study aims to characterize and diagnose the impact of faults within the RISC-V instruction set and pipeline stages, while tracing fault propagation from the circuit level to the AI/ML application software. This analysis resulted in discovering two new vulnerabilities through controlled clock glitch parameters. First, we reveal a novel method for causing instruction skips, thereby preventing the loading of critical values from memory. This can cause disruption and affect program continuity and correctness. Second, we demonstrate an attack that converts legal instructions into illegal ones, thereby diverting control flow in a manner exploitable by attackers. Our work underscores the complexity of fault injection attack exploits and emphasizes the importance of preemptive security analysis.

Paperid: 1719, https://arxiv.org/pdf/2503.03539.pdf

Abstract:
Decarbonization, decentralization and digitalization are the three key elements driving the twin energy transition. The energy system is evolving to a more data driven ecosystem, leading to the need of communication and storage of large amount of data of different resolution from the prosumers and other stakeholders in the energy ecosystem. While the energy system is certainly advancing, this paradigm shift is bringing in new privacy and security issues related to collection, processing and storage of data - not only from the technical dimension, but also from the regulatory perspective. Understanding data privacy and security in the evolving energy system, regarding regulatory compliance, is an immature field of research. Contextualized knowledge of how related issues are regulated is still in its infancy, and the practical and technical basis for the regulatory framework for data privacy and security is not clear. To fill this gap, this paper conducts a comprehensive review of the data-related issues for the energy system by integrating both technical and regulatory dimensions. We start by reviewing open-access data, data communication and data-processing techniques for the energy system, and use it as the basis to connect the analysis of data-related issues from the integrated perspective. We classify the issues into three categories: (i) data-sharing among energy end users and stakeholders (ii) privacy of end users, and (iii) cyber security, and then explore these issues from a regulatory perspective. We analyze the evolution of related regulations, and introduce the relevant regulatory initiatives for the categorized issues in terms of regulatory definitions, concepts, principles, rights and obligations in the context of energy systems. Finally, we provide reflections on the gaps that still exist, and guidelines for regulatory frameworks for a truly participatory energy system.

Paperid: 1720, https://arxiv.org/pdf/2503.03267.pdf

Abstract:
Dementia, a neurological disorder impacting millions globally, presents significant challenges in diagnosis and patient care. With the rise of privacy concerns and security threats in healthcare, federated learning (FL) has emerged as a promising approach to enable collaborative model training across decentralized datasets without exposing sensitive patient information. However, FL remains vulnerable to advanced security breaches such as gradient inversion and eavesdropping attacks. This paper introduces a novel framework that integrates federated learning with quantum-inspired encryption techniques for dementia classification, emphasizing privacy preservation and security. Leveraging quantum key distribution (QKD), the framework ensures secure transmission of model weights, protecting against unauthorized access and interception during training. The methodology utilizes a convolutional neural network (CNN) for dementia classification, with federated training conducted across distributed healthcare nodes, incorporating QKD-encrypted weight sharing to secure the aggregation process. Experimental evaluations conducted on MRI data from the OASIS dataset demonstrate that the proposed framework achieves identical accuracy levels to a baseline model while enhancing data security and reducing loss by almost 1% compared to the classical baseline model. The framework offers significant implications for democratizing access to AI-driven dementia diagnostics in low- and middle-income countries, addressing critical resource and privacy constraints. This work contributes a robust, scalable, and secure federated learning solution for healthcare applications, paving the way for broader adoption of quantum-inspired techniques in AI-driven medical research.

Paperid: 1721, https://arxiv.org/pdf/2503.02441.pdf

Abstract:
Security researchers grapple with the surge of malicious files, necessitating swift identification and classification of malware strains for effective protection. Visual classifiers and in particular Convolutional Neural Networks (CNNs) have emerged as vital tools for this task. However, issues of robustness and explainability, common in other high risk domain like medicine and autonomous vehicles, remain understudied in current literature. Although deep learning visualization classifiers presented in research obtain great results without the need for expert feature extraction, they have not been properly studied in terms of their replicability. Additionally, the literature is not clear on how these types of classifiers arrive to their answers. Our study addresses these gaps by replicating six CNN models and exploring their pitfalls. We employ Class Activation Maps (CAMs), like GradCAM and HiResCAM, to assess model explainability. We evaluate the CNNs' performance and interpretability on two standard datasets, MalImg and Big2015, and a newly created called VX-Zoo. We employ these different CAM techniques to gauge the explainability of each of the models. With these tools, we investigate the underlying factors contributing to different interpretations of inputs across the different models, empowering human researchers to discern patterns crucial for identifying distinct malware families and explain why CNN models arrive at their conclusions. Other then highlighting the patterns found in the interpretability study, we employ the extracted heatmpas to enhance Visual Transformers classifiers' performance and explanation quality. This approach yields substantial improvements in F1 score, ranging from 2% to 8%, across the datasets compared to benchmark values.

Paperid: 1722, https://arxiv.org/pdf/2503.01799.pdf

Abstract:
Phishing URL detection is crucial in cybersecurity as malicious websites disguise themselves to steal sensitive infor mation. Traditional machine learning techniques struggle to per form well in complex real-world scenarios due to large datasets and intricate patterns. Motivated by quantum computing, this paper proposes using Variational Quantum Classifiers (VQC) to enhance phishing URL detection. We present PhishVQC, a quantum model that combines quantum feature maps and vari ational ansatzes such as RealAmplitude and EfficientSU2. The model is evaluated across two experimental setups with varying dataset sizes and feature map repetitions. PhishVQC achieves a maximum macro average F1-score of 0.89, showing a 22% improvement over prior studies. This highlights the potential of quantum machine learning to improve phishing detection accuracy. The study also notes computational challenges, with execution wall times increasing as dataset size grows.

Paperid: 1723, https://arxiv.org/pdf/2503.01395.pdf

Abstract:
The rapid advancements in generative AI models, such as ChatGPT, have introduced both significant benefits and new risks within the cybersecurity landscape. This paper investigates the potential misuse of the latest AI model, ChatGPT-4o Mini, in facilitating social engineering attacks, with a particular focus on phishing, one of the most pressing cybersecurity threats today. While existing literature primarily addresses the technical aspects, such as jailbreaking techniques, none have fully explored the free and straightforward execution of a comprehensive phishing campaign by novice users using ChatGPT-4o Mini. In this study, we examine the vulnerabilities of AI-driven chatbot services in 2025, specifically how methods like jailbreaking and reverse psychology can bypass ethical safeguards, allowing ChatGPT to generate phishing content, suggest hacking tools, and assist in carrying out phishing attacks. Our findings underscore the alarming ease with which even inexperienced users can execute sophisticated phishing campaigns, emphasizing the urgent need for stronger cybersecurity measures and heightened user awareness in the age of AI.

Paperid: 1724, https://arxiv.org/pdf/2503.01391.pdf

Abstract:
Machine learning has become a key tool in cybersecurity, improving both attack strategies and defense mechanisms. Deep learning models, particularly Convolutional Neural Networks (CNNs), have demonstrated high accuracy in detecting malware images generated from binary data. However, the decision-making process of these black-box models remains difficult to interpret. This study addresses this challenge by integrating quantitative analysis with explainability tools such as Occlusion Maps, HiResCAM, and SHAP to better understand CNN behavior in malware classification. We further demonstrate that obfuscation techniques can reduce model accuracy by up to 50%, and propose a mitigation strategy to enhance robustness. Additionally, we analyze heatmaps from multiple tests and outline a methodology for identification of artifacts, aiding researchers in conducting detailed manual investigations. This work contributes to improving the interpretability and resilience of deep learning-based intrusion detection systems

Paperid: 1725, https://arxiv.org/pdf/2502.20835.pdf

Abstract:
Distributed Key Generation (DKG) is vital to threshold-based cryptographic protocols such as threshold signatures, secure multiparty computation, and i-voting. Yet, standard $(n,t)$-DKG requires a known set of $n$ participants and a fixed threshold $t$, making it impractical for public or decentralized settings where membership and availability can change. We introduce Federated Distributed Key Generation (FDKG), which relaxes these constraints by allowing each participant to select its own guardian set, with a local threshold to reconstruct that participant's partial key. FDKG generalizes DKG and draws inspiration from Federated Byzantine Agreement, enabling dynamic trust delegation with minimal message complexity (two rounds). The protocol's liveness can tolerate adversary that controls up to $k - t + 1$ nodes in every guardian set. The paper presents a detailed protocol, a formal description of liveness, privacy, and integrity properties, and a simulation-based evaluation showcasing the efficacy of FDKG in mitigating node unreliability. In a setting of 100 parties, a 50% participation rate, 80% retention, and 40 guardians, the distribution phase incurred a total message size of 332.7 kB ($O(n\,k)$), and reconstruction phase 416.56 kB ($O(n\,k)$. Groth16 client-side proving took about 5 s in the distribution phase and ranged from 0.619 s up to 29.619 s in the reconstruction phase. Our work advances distributed cryptography by enabling flexible trust models for dynamic networks, with applications ranging from ad-hoc collaboration to blockchain governance.

Paperid: 1726, https://arxiv.org/pdf/2502.20791.pdf

Abstract:
The exponential growth of cyber threat knowledge, exemplified by the expansion of databases such as MITRE-CVE and NVD, poses significant challenges for cyber threat analysis. Security professionals are increasingly burdened by the sheer volume and complexity of information, creating an urgent need for effective tools to navigate, synthesize, and act on large-scale data to counter evolving threats proactively. However, conventional threat intelligence tools often fail to scale with the dynamic nature of this data and lack the adaptability to support diverse threat intelligence tasks. In this work, we introduce CYLENS, a cyber threat intelligence copilot powered by large language models (LLMs). CYLENS is designed to assist security professionals throughout the entire threat management lifecycle, supporting threat attribution, contextualization, detection, correlation, prioritization, and remediation. To ensure domain expertise, CYLENS integrates knowledge from 271,570 threat reports into its model parameters and incorporates six specialized NLP modules to enhance reasoning capabilities. Furthermore, CYLENS can be customized to meet the unique needs of different or ganizations, underscoring its adaptability. Through extensive evaluations, we demonstrate that CYLENS consistently outperforms industry-leading LLMs and state-of-the-art cybersecurity agents. By detailing its design, development, and evaluation, this work provides a blueprint for leveraging LLMs to address complex, data-intensive cybersecurity challenges.

Paperid: 1727, https://arxiv.org/pdf/2502.20629.pdf

Abstract:
This work aims to provide both privacy and utility within a split learning framework while considering both forward attribute inference and backward reconstruction attacks. To address this, a novel approach has been proposed, which makes use of class activation maps and autoencoders as a plug-in strategy aiming to increase the user's privacy and destabilize an adversary. The proposed approach is compared with a dimensionality-reduction-based plug-in strategy, which makes use of principal component analysis to transform the feature map onto a lower-dimensional feature space. Our work shows that our proposed autoencoder-based approach is preferred as it can provide protection at an earlier split position over the tested architectures in our setting, and, hence, better utility for resource-constrained devices in edge-cloud collaborative inference (EC) systems.

Paperid: 1728, https://arxiv.org/pdf/2502.19757.pdf

Abstract:
Adversarial attacks on machine learning models often rely on small, imperceptible perturbations to mislead classifiers. Such strategy focuses on minimizing the visual perturbation for humans so they are not confused, and also maximizing the misclassification for machine learning algorithms. An orthogonal strategy for adversarial attacks is to create perturbations that are clearly visible but do not confuse humans, yet still maximize misclassification for machine learning algorithms. This work follows the later strategy, and demonstrates instance of it through the Snowball Adversarial Attack in the context of traffic sign recognition. The attack leverages the human brain's superior ability to recognize objects despite various occlusions, while machine learning algorithms are easily confused. The evaluation shows that the Snowball Adversarial Attack is robust across various images and is able to confuse state-of-the-art traffic sign recognition algorithm. The findings reveal that Snowball Adversarial Attack can significantly degrade model performance with minimal effort, raising important concerns about the vulnerabilities of deep neural networks and highlighting the necessity for improved defenses for image recognition machine learning models.

Paperid: 1729, https://arxiv.org/pdf/2502.19341.pdf

Abstract:
The broadcast nature of the wireless medium and openness of wireless standards, e.g., 3GPP releases 16-20, invite adversaries to launch various active and passive attacks on cellular and other wireless networks. This work identifies one such loose end of wireless standards and presents a novel passive attack method enabling an eavesdropper (Eve) to localize a line of sight wireless user (Bob) who is communicating with a base station or WiFi access point (Alice). The proposed attack involves two phases. In the first phase, Eve performs modulation classification by intercepting the downlink channel between Alice and Bob. This enables Eve to utilize the publicly available modulation and coding scheme (MCS) tables to do pesudo-ranging, i.e., the Eve determines the ring within which Bob is located, which drastically reduces the search space. In the second phase, Eve sniffs the uplink channel, and employs multiple strategies to further refine Bob's location within the ring. Towards the end, we present our thoughts on how this attack can be extended to non-line-of-sight scenarios, and how this attack could act as a scaffolding to construct a malicious digital twin map.

Paperid: 1730, https://arxiv.org/pdf/2502.18724.pdf

Abstract:
Adversarial attacks on deep learning models have proliferated in recent years. In many cases, a different adversarial perturbation is required to be added to each image to cause the deep learning model to misclassify it. This is ineffective as each image has to be modified in a different way. Meanwhile, research on universal perturbations focuses on designing a single perturbation that can be applied to all images in a data set, and cause a deep learning model to misclassify the images. This work advances the field of universal perturbations by exploring universal perturbations in the context of traffic signs and autonomous vehicle systems. This work introduces a novel method for generating universal perturbations that visually look like simple black and white stickers, and using them to cause incorrect street sign predictions. Unlike traditional adversarial perturbations, the adversarial universal stickers are designed to be applicable to any street sign: same sticker, or stickers, can be applied in same location to any street sign and cause it to be misclassified. Further, to enable safe experimentation with adversarial images and street signs, this work presents a virtual setting that leverages Street View images of street signs, rather than the need to physically modify street signs, to test the attacks. The experiments in the virtual setting demonstrate that these stickers can consistently mislead deep learning models used commonly in street sign recognition, and achieve high attack success rates on dataset of US traffic signs. The findings highlight the practical security risks posed by simple stickers applied to traffic signs, and the ease with which adversaries can generate adversarial universal stickers that can be applied to many street signs.

Paperid: 1731, https://arxiv.org/pdf/2502.17801.pdf

Abstract:
Cloud computing environments are increasingly vulnerable to security threats such as distributed denial-of-service (DDoS) attacks and SQL injection. Traditional security mechanisms, based on rule matching and feature recognition, struggle to adapt to evolving attack strategies. This paper proposes an adaptive security protection framework leveraging deep learning to construct a multi-layered defense architecture. The proposed system is evaluated in a real-world business environment, achieving a detection accuracy of 97.3%, an average response time of 18 ms, and an availability rate of 99.999%. Experimental results demonstrate that the proposed method significantly enhances detection accuracy, response efficiency, and resource utilization, offering a novel and effective approach to cloud computing security.

Paperid: 1732, https://arxiv.org/pdf/2502.17763.pdf

Abstract:
Traditional security protection methods struggle to address sophisticated attack vectors in large-scale distributed systems, particularly when balancing detection accuracy with data privacy concerns. This paper presents a novel distributed security threat detection system that integrates federated learning with multimodal large language models (LLMs). Our system leverages federated learning to ensure data privacy while employing multimodal LLMs to process heterogeneous data sources including network traffic, system logs, images, and sensor data. Experimental evaluation on a 10TB distributed dataset demonstrates that our approach achieves 96.4% detection accuracy, outperforming traditional baseline models by 4.1 percentage points. The system reduces both false positive and false negative rates by 1.8 and 2.4 percentage points respectively. Performance analysis shows that our system maintains efficient processing capabilities in distributed environments, requiring 180 seconds for model training and 3.8 seconds for threat detection across the distributed network. These results demonstrate significant improvements in detection accuracy and computational efficiency while preserving data privacy, suggesting strong potential for real-world deployment in large-scale security systems.

Paperid: 1733, https://arxiv.org/pdf/2502.16286.pdf

Abstract:
In the rapidly evolving landscape of neural network security, the resilience of neural networks against bit-flip attacks (i.e., an attacker maliciously flips an extremely small amount of bits within its parameter storage memory system to induce harmful behavior), has emerged as a relevant area of research. Existing studies suggest that quantization may serve as a viable defense against such attacks. Recognizing the documented susceptibility of real-valued neural networks to such attacks and the comparative robustness of quantized neural networks (QNNs), in this work, we introduce BFAVerifier, the first verification framework designed to formally verify the absence of bit-flip attacks or to identify all vulnerable parameters in a sound and rigorous manner. BFAVerifier comprises two integral components: an abstraction-based method and an MILP-based method. Specifically, we first conduct a reachability analysis with respect to symbolic parameters that represent the potential bit-flip attacks, based on a novel abstract domain with a sound guarantee. If the reachability analysis fails to prove the resilience of such attacks, then we encode this verification problem into an equivalent MILP problem which can be solved by off-the-shelf solvers. Therefore, BFAVerifier is sound, complete, and reasonably efficient. We conduct extensive experiments, which demonstrate its effectiveness and efficiency across various network architectures, quantization bit-widths, and adversary capabilities.

Paperid: 1734, https://arxiv.org/pdf/2502.15270.pdf

Abstract:
User tracking is critical in the mobile ecosystem, which relies on device identifiers to build clear user profiles. In earlier ages, Android allowed easy access to non-resettable device identifiers like device serial numbers and IMEI by third-party apps for user tracking. As privacy concerns grew, Google has tightened restrictions on these identifiers in native Android. Despite this, stakeholders in custom Android systems seek consistent and stable user tracking capabilities across different system and device models, and they have introduced covert channels (e.g., system properties and settings) in customized systems to access identifiers, which undoubtedly increases the risk of user privacy breaches. This paper examines the introduction of non-resettable identifiers through system customization and their vulnerability due to poor access control. We present IDRadar, a scalable and accurate approach for identifying vulnerable properties and settings on custom Android ROMs. Applying our approach to 1,814 custom ROMs, we have identified 8,192 system properties and 3,620 settings that store non-resettable identifiers, with 3,477 properties and 1,336 settings lacking adequate access control, which can be abused by third-party apps to track users without permissions. Our large-scale analysis can identify a large number of security issues which are two orders of magnitude greater than existing techniques. We further investigate the root causes of these access control deficiencies. Validation on 32 devices through the remote testing service confirmed our results. Additionally, we observe that the vulnerable properties and settings occur in devices of the same OEMs. We have reported our findings to the vendors and received positive confirmations. Our work underscores the need for greater scrutiny of covert access channels to device identifiers and better solutions to safeguard user privacy.

Paperid: 1735, https://arxiv.org/pdf/2502.14945.pdf

Abstract:
Internet censorship limits the access of nodes residing within a specific network environment to the public Internet, and vice versa. During the last decade, techniques for conducting Internet censorship have been developed further. Consequently, methodology for measuring Internet censorship had been improved as well. In this paper, we firstly provide a survey of Internet censorship techniques. Secondly, we survey censorship measurement methodology, including a coverage of available datasets. In cases where it is beneficial, we bridge the terminology and taxonomy of Internet censorship with related domains, namely traffic obfuscation and information hiding. We cover both, technical and human aspects, as well as recent trends, and challenges.

Paperid: 1736, https://arxiv.org/pdf/2502.13998.pdf

Abstract:
Image watermarks have been considered a promising technique to help detect AI-generated content, which can be used to protect copyright or prevent fake image abuse. In this work, we present a black-box method for removing invisible image watermarks, without the need of any dataset of watermarked images or any knowledge about the watermark system. Our approach is simple to implement: given a single watermarked image, we regress it by deep image prior (DIP). We show that from the intermediate steps of DIP one can reliably find an evasion image that can remove invisible watermarks while preserving high image quality. Due to its unique working mechanism and practical effectiveness, we advocate including DIP as a baseline invasion method for benchmarking the robustness of watermarking systems. Finally, by showing the limited ability of DIP and other existing black-box methods in evading training-based visible watermarks, we discuss the positive implications on the practical use of training-based visible watermarks to prevent misinformation abuse.

Paperid: 1737, https://arxiv.org/pdf/2502.11647.pdf

Abstract:
Large Language Models (LLMs) are widely applied in decision making, but their deployment is threatened by jailbreak attacks, where adversarial users manipulate model behavior to bypass safety measures. Existing defense mechanisms, such as safety fine-tuning and model editing, either require extensive parameter modifications or lack precision, leading to performance degradation on general tasks, which is unsuitable to post-deployment safety alignment. To address these challenges, we propose DELMAN (Dynamic Editing for LLMs JAilbreak DefeNse), a novel approach leveraging direct model editing for precise, dynamic protection against jailbreak attacks. DELMAN directly updates a minimal set of relevant parameters to neutralize harmful behaviors while preserving the model's utility. To avoid triggering a safe response in benign context, we incorporate KL-divergence regularization to ensure the updated model remains consistent with the original model when processing benign queries. Experimental results demonstrate that DELMAN outperforms baseline methods in mitigating jailbreak attacks while preserving the model's utility, and adapts seamlessly to new attack instances, providing a practical and efficient solution for post-deployment model protection.

Paperid: 1738, https://arxiv.org/pdf/2502.11191.pdf

Abstract:
Large Language Models (LLMs) have shown remarkable advancements in specialized fields such as finance, law, and medicine. However, in cybersecurity, we have noticed a lack of open-source datasets, with a particular lack of high-quality cybersecurity pretraining corpora, even though much research indicates that LLMs acquire their knowledge during pretraining. To address this, we present a comprehensive suite of datasets covering all major training stages, including pretraining, instruction fine-tuning, and reasoning distillation with cybersecurity-specific self-reflection data. Extensive ablation studies demonstrate their effectiveness on public cybersecurity benchmarks. In particular, continual pre-training on our dataset yields a 15.88% improvement in the aggregate score, while reasoning distillation leads to a 10% gain in security certification (CISSP). We will release all datasets and trained cybersecurity LLMs under the ODC-BY and MIT licenses to encourage further research in the community. For access to all datasets and model weights, please refer to https://huggingface.co/collections/trendmicro-ailab/primus-67b1fd27052b802b4af9d243.

Paperid: 1739, https://arxiv.org/pdf/2502.11143.pdf

Abstract:
As interconnected systems proliferate, safeguarding complex infrastructures against an escalating array of cyber threats has become an urgent challenge. The increasing number of vulnerabilities, combined with resource constraints, makes addressing every vulnerability impractical, making effective prioritization essential. However, existing risk prioritization methods often rely on expert judgment or focus solely on exploit likelihood and consequences, lacking the granularity and adaptability needed for complex systems. This work introduces a graph-based framework for vulnerability patch prioritization that optimizes security by integrating diverse data sources and metrics into a universally applicable model. Refined risk metrics enable detailed assessments at the component, asset, and system levels. The framework employs two key graphs: a network communication graph to model potential attack paths and identify the shortest routes to critical assets, and a system dependency graph to capture risk propagation from exploited vulnerabilities across interconnected components. Asset criticality and component dependency rules systematically assess and mitigate risks. Benchmarking against state-of-the-art methods demonstrates superior accuracy in vulnerability patch ranking, with enhanced explainability. This framework advances vulnerability management and sets the stage for future research in adaptive cybersecurity strategies.

Paperid: 1740, https://arxiv.org/pdf/2502.11121.pdf

Abstract:
With the rapid advancements in information technology, reversible data hiding over encrypted images (RDH-EI) has become essential for secure image management in cloud services. However, existing RDH-EI schemes often suffer from high computational complexity, low embedding rates, and excessive data expansion. This paper addresses these challenges by first analyzing the block-based secret sharing in existing schemes, revealing significant data redundancy within image blocks. Based on this observation, we propose two space-preserving methods: the direct space-vacating method and the image-shrinking-based space-vacating method. Using these techniques, we design two novel RDH-EI schemes: a high-capacity RDH-EI scheme and a size-reduced RDH-EI scheme. The high-capacity RDH-EI scheme directly creates embedding space in encrypted images, eliminating the need for complex space-vacating operations and achieving higher and more stable embedding rates. In contrast, the size-reduced RDH-EI scheme minimizes data expansion by discarding unnecessary shares, resulting in smaller encrypted images. Experimental results show that the high-capacity RDH-EI scheme outperforms existing methods in terms of embedding capacity, while the size-reduced RDH-EI scheme excels in minimizing data expansion. Both schemes provide effective solutions to the challenges in RDH-EI, offering promising applications in fields such as medical imaging and cloud storage.

Paperid: 1741, https://arxiv.org/pdf/2502.11070.pdf

Abstract:
In the highly interconnected digital landscape of today, safeguarding complex infrastructures against cyber threats has become increasingly challenging due to the exponential growth in the number and complexity of vulnerabilities. Resource constraints necessitate effective vulnerability prioritization strategies, focusing efforts on the most critical risks. This paper presents a systematic literature review of 82 studies, introducing a novel taxonomy that categorizes metrics into severity, exploitability, contextual factors, predictive indicators, and aggregation methods. Our analysis reveals significant gaps in existing approaches and challenges with multi-domain applicability. By emphasizing the need for dynamic, context-aware metrics and scalable solutions, we provide actionable insights to bridge the gap between research and real-world applications. This work contributes to the field by offering a comprehensive framework for evaluating vulnerability prioritization methodologies and setting a research agenda to advance the state of practice.

Paperid: 1742, https://arxiv.org/pdf/2502.10825.pdf

Abstract:
The MITRE ATT&CK framework is a widely adopted tool for enhancing cybersecurity, supporting threat intelligence, incident response, attack modeling, and vulnerability prioritization. This paper synthesizes research on its application across these domains by analyzing 417 peer-reviewed publications. We identify commonly used adversarial tactics, techniques, and procedures (TTPs) and examine the integration of natural language processing (NLP) and machine learning (ML) with ATT&CK to improve threat detection and response. Additionally, we explore the interoperability of ATT&CK with other frameworks, such as the Cyber Kill Chain, NIST guidelines, and STRIDE, highlighting its versatility. The paper further evaluates the framework from multiple perspectives, including its effectiveness, validation methods, and sector-specific challenges, particularly in industrial control systems (ICS) and healthcare. We conclude by discussing current limitations and proposing future research directions to enhance the applicability of ATT&CK in dynamic cybersecurity environments.

Paperid: 1743, https://arxiv.org/pdf/2502.10556.pdf

Abstract:
The rapid evolution of malware has necessitated the development of sophisticated detection methods that go beyond traditional signature-based approaches. Graph learning techniques have emerged as powerful tools for modeling and analyzing the complex relationships inherent in malware behavior, leveraging advancements in Graph Neural Networks (GNNs) and related methods. This survey provides a comprehensive exploration of recent advances in malware detection, focusing on the interplay between graph learning and explainability. It begins by reviewing malware analysis techniques and datasets, emphasizing their foundational role in understanding malware behavior and supporting detection strategies. The survey then discusses feature engineering, graph reduction, and graph embedding methods, highlighting their significance in transforming raw data into actionable insights, while ensuring scalability and efficiency. Furthermore, this survey focuses on explainability techniques and their applications in malware detection, ensuring transparency and trustworthiness. By integrating these components, this survey demonstrates how graph learning and explainability contribute to building robust, interpretable, and scalable malware detection systems. Future research directions are outlined to address existing challenges and unlock new opportunities in this critical area of cybersecurity.

Paperid: 1744, https://arxiv.org/pdf/2502.10450.pdf

Abstract:
The capabilities of artificial intelligence systems have been advancing to a great extent, but these systems still struggle with failure modes, vulnerabilities, and biases. In this paper, we study the current state of the field, and present promising insights and perspectives regarding concerns that challenge the trustworthiness of AI models. In particular, this paper investigates the issues regarding three thrusts: safety, privacy, and bias, which hurt models' trustworthiness. For safety, we discuss safety alignment in the context of large language models, preventing them from generating toxic or harmful content. For bias, we focus on spurious biases that can mislead a network. Lastly, for privacy, we cover membership inference attacks in deep neural networks. The discussions addressed in this paper reflect our own experiments and observations.

Paperid: 1745, https://arxiv.org/pdf/2502.09896.pdf

Abstract:
Internet of Things (IoT) has gained widespread popularity, revolutionizing industries and daily life. However, it has also emerged as a prime target for attacks. Numerous efforts have been made to improve IoT security, and substantial IoT security and threat information, such as datasets and reports, have been developed. However, existing research often falls short in leveraging these insights to assist or guide users in harnessing IoT security practices in a clear and actionable way. In this paper, we propose ChatIoT, a large language model (LLM)-based IoT security assistant designed to disseminate IoT security and threat intelligence. By leveraging the versatile property of retrieval-augmented generation (RAG), ChatIoT successfully integrates the advanced language understanding and reasoning capabilities of LLM with fast-evolving IoT security information. Moreover, we develop an end-to-end data processing toolkit to handle heterogeneous datasets. This toolkit converts datasets of various formats into retrievable documents and optimizes chunking strategies for efficient retrieval. Additionally, we define a set of common use case specifications to guide the LLM in generating answers aligned with users' specific needs and expertise levels. Finally, we implement a prototype of ChatIoT and conduct extensive experiments with different LLMs, such as LLaMA3, LLaMA3.1, and GPT-4o. Experimental evaluations demonstrate that ChatIoT can generate more reliable, relevant, and technical in-depth answers for most use cases. When evaluating the answers with LLaMA3:70B, ChatIoT improves the above metrics by over 10% on average, particularly in relevance and technicality, compared to using LLMs alone.

Paperid: 1746, https://arxiv.org/pdf/2502.08886.pdf

Abstract:
As Generative AI (GenAI) continues to gain prominence and utility across various sectors, their integration into the realm of Internet of Things (IoT) security evolves rapidly. This work delves into an examination of the state-of-the-art literature and practical applications on how GenAI could improve and be applied in the security landscape of IoT. Our investigation aims to map the current state of GenAI implementation within IoT security, exploring their potential to fortify security measures further. Through the compilation, synthesis, and analysis of the latest advancements in GenAI technologies applied to IoT, this paper not only introduces fresh insights into the field, but also lays the groundwork for future research directions. It explains the prevailing challenges within IoT security, discusses the effectiveness of GenAI in addressing these issues, and identifies significant research gaps through MITRE Mitigations. Accompanied with three case studies, we provide a comprehensive overview of the progress and future prospects of GenAI applications in IoT security. This study serves as a foundational resource to improve IoT security through the innovative application of GenAI, thus contributing to the broader discourse on IoT security and technology integration.

Paperid: 1747, https://arxiv.org/pdf/2502.08679.pdf

Abstract:
Malware classification in dynamic environments presents a significant challenge due to concept drift, where the statistical properties of malware data evolve over time, complicating detection efforts. To address this issue, we propose a deep learning framework enhanced with a genetic algorithm to improve malware classification accuracy and adaptability. Our approach incorporates mutation operations and fitness score evaluations within genetic algorithms to continuously refine the deep learning model, ensuring robustness against evolving malware threats. Experimental results demonstrate that this hybrid method significantly enhances classification performance and adaptability, outperforming traditional static models. Our proposed approach offers a promising solution for real-time malware classification in ever-changing cybersecurity landscapes.

Paperid: 1748, https://arxiv.org/pdf/2502.08467.pdf

Abstract:
Cross-Site Scripting (XSS) is a prevalent and well known security problem in web applications. Numerous methods to automatically analyze and detect these vulnerabilities exist. However, all of these methods require that either code or feedback from the application is available to guide the detection process. In larger web applications, inputs can propagate from a frontend to an internal backend that provides no feedback to the outside. None of the previous approaches are applicable in this scenario, known as blind XSS (BXSS). In this paper, we address this problem and present the first comprehensive study on BXSS. As no feedback channel exists, we verify the presence of vulnerabilities through blind code execution. For this purpose, we develop a method for synthesizing polyglots, small XSS payloads that execute in all common injection contexts. Seven of these polyglots are already sufficient to cover a state-of-the-art XSS testbed. In a validation on real-world client-side vulnerabilities, we show that their XSS detection rate is on par with existing taint tracking approaches. Based on these polyglots, we conduct a study of BXSS vulnerabilities on the Tranco Top 100,000 websites. We discover 20 vulnerabilities in 18 web-based backend systems. These findings demonstrate the efficacy of our detection approach and point at a largely unexplored attack surface in web security.

Paperid: 1749, https://arxiv.org/pdf/2502.08341.pdf

Abstract:
A common paradigm for improving fuzzing performance is to focus on selected regions of a program rather than its entirety. While previous work has largely explored how these locations can be reached, their selection, that is, the where, has received little attention so far. A common paradigm for improving fuzzing performance is to focus on selected regions of a program rather than its entirety. While previous work has largely explored how these locations can be reached, their selection, that is, the where, has received little attention so far. In this paper, we fill this gap and present the first comprehensive analysis of target selection methods for fuzzing. To this end, we examine papers from leading security and software engineering conferences, identifying prevalent methods for choosing targets. By modeling these methods as general scoring functions, we are able to compare and measure their efficacy on a corpus of more than 1,600 crashes from the OSS-Fuzz project. Our analysis provides new insights for target selection in practice: First, we find that simple software metrics significantly outperform other methods, including common heuristics used in directed fuzzing, such as recently modified code or locations with sanitizer instrumentation. Next to this, we identify language models as a promising choice for target selection. In summary, our work offers a new perspective on directed fuzzing, emphasizing the role of target selection as an orthogonal dimension to improve performance.

Paperid: 1750, https://arxiv.org/pdf/2502.04786.pdf

Abstract:
SQL Injection (SQLi) continues to pose a significant threat to the security of web applications, enabling attackers to manipulate databases and access sensitive information without authorisation. Although advancements have been made in detection techniques, traditional signature-based methods still struggle to identify sophisticated SQL injection attacks that evade predefined patterns. As SQLi attacks evolve, the need for more adaptive detection systems becomes crucial. This paper introduces an innovative approach that leverages generative models to enhance SQLi detection and prevention mechanisms. By incorporating Variational Autoencoders (VAE), Conditional Wasserstein GAN with Gradient Penalty (CWGAN-GP), and U-Net, synthetic SQL queries were generated to augment training datasets for machine learning models. The proposed method demonstrated improved accuracy in SQLi detection systems by reducing both false positives and false negatives. Extensive empirical testing further illustrated the ability of the system to adapt to evolving SQLi attack patterns, resulting in enhanced precision and robustness.

Paperid: 1751, https://arxiv.org/pdf/2502.03870.pdf

Abstract:
Global Navigation Satellite Systems enable precise localization and timing even for highly mobile devices, but legacy implementations provide only limited support for the new generation of security-enhanced signals. Inertial Measurement Units have proved successful in augmenting the accuracy and robustness of the GNSS-provided navigation solution, but effective navigation based on inertial techniques in denied contexts requires high-end sensors. However, commercially available mobile devices usually embed a much lower-grade inertial system. To counteract an attacker transmitting all the adversarial signals from a single antenna, we exploit carrier phase-based observations coupled with a low-end inertial sensor to identify spoofing and meaconing. By short-time integration with an inertial platform, which tracks the displacement of the GNSS antenna, the high-frequency movement at the receiver is correlated with the variation in the carrier phase. In this way, we identify legitimate transmitters, based on their geometrical diversity with respect to the antenna system movement. We introduce a platform designed to effectively compare different tiers of commercial INS platforms with a GNSS receiver. By characterizing different inertial sensors, we show that simple MEMS INS perform as well as high-end industrial-grade sensors. Sensors traditionally considered unsuited for navigation purposes offer great performance at the short integration times used to evaluate the carrier phase information consistency against the high-frequency movement. Results from laboratory evaluation and through field tests at Jammertest 2024 show that the detector is up to 90% accurate in correctly identifying spoofing (or the lack of it), without any modification to the receiver structure, and with mass-production grade INS typical for mobile phones.

Paperid: 1752, https://arxiv.org/pdf/2502.03868.pdf

Abstract:
To safeguard Civilian Global Navigation Satellite Systems (GNSS) external information available to the platform encompassing the GNSS receiver can be used to detect attacks. Cross-checking the GNSS-provided time against alternative multiple trusted time sources can lead to attack detection aiming at controlling the GNSS receiver time. Leveraging external, network-connected secure time providers and onboard clock references, we achieve detection even under fine-grained time attacks. We provide an extensive evaluation of our multi-layered defense against adversaries mounting attacks against the GNSS receiver along with controlling the network link. We implement adversaries spanning from simplistic spoofers to advanced ones synchronized with the GNSS constellation. We demonstrate attack detection is possible in all tested cases (sharp discontinuity, smooth take-over, and coordinated network manipulation) without changes to the structure of the GNSS receiver. Leveraging the diversity of the reference time sources, detection of take-over time push as low as 150us is possible. Smooth take-overs forcing variations as low as 30ns are also detected based on on-board precision oscillators. The method (and thus the evaluation) is largely agnostic to the satellite constellation and the attacker type, making time-based data validation of GNSS information compatible with existing receivers and readily deployable.

Paperid: 1753, https://arxiv.org/pdf/2502.03134.pdf

Abstract:
In this paper, a dataset of IoT network traffic is presented. Our dataset was generated by utilising the Gotham testbed, an emulated large-scale Internet of Things (IoT) network designed to provide a realistic and heterogeneous environment for network security research. The testbed includes 78 emulated IoT devices operating on various protocols, including MQTT, CoAP, and RTSP. Network traffic was captured in Packet Capture (PCAP) format using tcpdump, and both benign and malicious traffic were recorded. Malicious traffic was generated through scripted attacks, covering a variety of attack types, such as Denial of Service (DoS), Telnet Brute Force, Network Scanning, CoAP Amplification, and various stages of Command and Control (C&C) communication. The data were subsequently processed in Python for feature extraction using the Tshark tool, and the resulting data was converted to Comma Separated Values (CSV) format and labelled. The data repository includes the raw network traffic in PCAP format and the processed labelled data in CSV format. Our dataset was collected in a distributed manner, where network traffic was captured separately for each IoT device at the interface between the IoT gateway and the device. Our dataset was collected in a distributed manner, where network traffic was separately captured for each IoT device at the interface between the IoT gateway and the device. With its diverse traffic patterns and attack scenarios, this dataset provides a valuable resource for developing Intrusion Detection Systems and security mechanisms tailored to complex, large-scale IoT environments. The dataset is publicly available at Zenodo.

Paperid: 1754, https://arxiv.org/pdf/2502.02990.pdf

Abstract:
Distributed data analysis is a large and growing field driven by a massive proliferation of user devices, and by privacy concerns surrounding the centralised storage of data. We consider two \emph{adaptive} algorithms for estimating one quantile (e.g.~the median) when each user holds a single data point lying in a domain $[B]$ that can be queried once through a private mechanism; one under local differential privacy (LDP) and another for shuffle differential privacy (shuffle-DP). In the adaptive setting we present an $\varepsilon$-LDP algorithm which can estimate any quantile within error $Î±$ only requiring $O(\frac{\log B}{\varepsilon^2Î±^2})$ users, and an $(\varepsilon,Î´)$-shuffle DP algorithm requiring only $\widetilde{O}((\frac{1}{\varepsilon^2}+\frac{1}{Î±^2})\log B)$ users. Prior (nonadaptive) algorithms require more users by several logarithmic factors in $B$. We further provide a matching lower bound for adaptive protocols, showing that our LDP algorithm is optimal in the low-$\varepsilon$ regime. Additionally, we establish lower bounds against non-adaptive protocols which paired with our understanding of the adaptive case, proves a fundamental separation between these models.

Paperid: 1755, https://arxiv.org/pdf/2502.02749.pdf

Abstract:
Female Health Applications (FHA), a growing segment of FemTech, aim to provide affordable and accessible healthcare solutions for women globally. These applications gather and monitor health and reproductive data from millions of users. With ongoing debates on women's reproductive rights and privacy, it's crucial to assess how these apps protect users' privacy. In this paper, we undertake a security and data protection assessment of 45 popular FHAs. Our investigation uncovers harmful permissions, extensive collection of sensitive personal and medical data, and the presence of numerous third-party tracking libraries. Furthermore, our examination of their privacy policies reveals deviations from fundamental data privacy principles. These findings highlight a significant lack of privacy and security measures for FemTech apps, especially as women's reproductive rights face growing political challenges. The results and recommendations provide valuable insights for users, app developers, and policymakers, paving the way for better privacy and security in Female Health Applications.

Paperid: 1756, https://arxiv.org/pdf/2502.02309.pdf

Abstract:
Demographic fairness in face recognition (FR) has emerged as a critical area of research, given its impact on fairness, equity, and reliability across diverse applications. As FR technologies are increasingly deployed globally, disparities in performance across demographic groups -- such as race, ethnicity, and gender -- have garnered significant attention. These biases not only compromise the credibility of FR systems but also raise ethical concerns, especially when these technologies are employed in sensitive domains. This review consolidates extensive research efforts providing a comprehensive overview of the multifaceted aspects of demographic fairness in FR. We systematically examine the primary causes, datasets, assessment metrics, and mitigation approaches associated with demographic disparities in FR. By categorizing key contributions in these areas, this work provides a structured approach to understanding and addressing the complexity of this issue. Finally, we highlight current advancements and identify emerging challenges that need further investigation. This article aims to provide researchers with a unified perspective on the state-of-the-art while emphasizing the critical need for equitable and trustworthy FR systems.

Paperid: 1757, https://arxiv.org/pdf/2502.02009.pdf

Abstract:
Security misconfigurations in Container Orchestrators (COs) can pose serious threats to software systems. While Static Analysis Tools (SATs) can effectively detect these security vulnerabilities, the industry currently lacks automated solutions capable of fixing these misconfigurations. The emergence of Large Language Models (LLMs), with their proven capabilities in code understanding and generation, presents an opportunity to address this limitation. This study introduces LLMSecConfig, an innovative framework that bridges this gap by combining SATs with LLMs. Our approach leverages advanced prompting techniques and Retrieval-Augmented Generation (RAG) to automatically repair security misconfigurations while preserving operational functionality. Evaluation of 1,000 real-world Kubernetes configurations achieved a 94\% success rate while maintaining a low rate of introducing new misconfigurations. Our work makes a promising step towards automated container security management, reducing the manual effort required for configuration maintenance.

Paperid: 1758, https://arxiv.org/pdf/2502.01352.pdf

Abstract:
Federated learning is a distributed learning technique that allows training a global model with the participation of different data owners without the need to share raw data. This architecture is orchestrated by a central server that aggregates the local models from the clients. This server may be trusted, but not all nodes in the network. Then, differential privacy (DP) can be used to privatize the global model by adding noise. However, this may affect convergence across the rounds of the federated architecture, depending also on the aggregation strategy employed. In this work, we aim to introduce the notion of metric-privacy to mitigate the impact of classical server side global-DP on the convergence of the aggregated model. Metric-privacy is a relaxation of DP, suitable for domains provided with a notion of distance. We apply it from the server side by computing a distance for the difference between the local models. We compare our approach with standard DP by analyzing the impact on six classical aggregation strategies. The proposed methodology is applied to an example of medical imaging and different scenarios are simulated across homogeneous and non-i.i.d clients. Finally, we introduce a novel client inference attack, where a semi-honest client tries to find whether another client participated in the training and study how it can be mitigated using DP and metric-privacy. Our evaluation shows that metric-privacy can increase the performance of the model compared to standard DP, while offering similar protection against client inference attacks.

Paperid: 1759, https://arxiv.org/pdf/2502.00067.pdf

Abstract:
AI powered health chatbot applications are increasingly utilized for personalized healthcare services, yet they pose significant challenges related to user data security and privacy. This study evaluates the effectiveness of automated methods, specifically BART and Gemini GenAI, in identifying security privacy related (SPR) concerns within these applications' user reviews, benchmarking their performance against manual qualitative analysis. Our results indicate that while Gemini's performance in SPR classification is comparable to manual labeling, both automated methods have limitations, including the misclassification of unrelated issues. Qualitative analysis revealed critical user concerns, such as data collection practices, data misuse, and insufficient transparency and consent mechanisms. This research enhances the understanding of the relationship between user trust, privacy, and emerging mobile AI health chatbot technologies, offering actionable insights for improving security and privacy practices in AI driven health chatbots. Although exploratory, our findings highlight the necessity for rigorous audits and transparent communication strategies, providing valuable guidance for app developers and vendors in addressing user security and privacy concerns.

Paperid: 1760, https://arxiv.org/pdf/2501.18841.pdf

Abstract:
We conduct experiments on the impact of increasing inference-time compute in reasoning models (specifically OpenAI o1-preview and o1-mini) on their robustness to adversarial attacks. We find that across a variety of attacks, increased inference-time compute leads to improved robustness. In many cases (with important exceptions), the fraction of model samples where the attack succeeds tends to zero as the amount of test-time compute grows. We perform no adversarial training for the tasks we study, and we increase inference-time compute by simply allowing the models to spend more compute on reasoning, independently of the form of attack. Our results suggest that inference-time compute has the potential to improve adversarial robustness for Large Language Models. We also explore new attacks directed at reasoning models, as well as settings where inference-time compute does not improve reliability, and speculate on the reasons for these as well as ways to address them.

Paperid: 1761, https://arxiv.org/pdf/2501.18632.pdf

Abstract:
Large language models (LLMs) are increasingly utilized in healthcare applications. However, their deployment in clinical practice raises significant safety concerns, including the potential spread of harmful information. This study systematically assesses the vulnerabilities of seven LLMs to three advanced black-box jailbreaking techniques within medical contexts. To quantify the effectiveness of these techniques, we propose an automated and domain-adapted agentic evaluation pipeline. Experiment results indicate that leading commercial and open-source LLMs are highly vulnerable to medical jailbreaking attacks. To bolster model safety and reliability, we further investigate the effectiveness of Continual Fine-Tuning (CFT) in defending against medical adversarial attacks. Our findings underscore the necessity for evolving attack methods evaluation, domain-specific safety alignment, and LLM safety-utility balancing. This research offers actionable insights for advancing the safety and reliability of AI clinicians, contributing to ethical and effective AI deployment in healthcare.

Paperid: 1762, https://arxiv.org/pdf/2501.17501.pdf

Abstract:
Code language models, while widely popular, are often trained on unsanitized source code gathered from across the Internet. Previous work revealed that pre-trained models can remember the content of their training data and regurgitate them through data extraction attacks. Due to the large size of current models, only a few entities have the resources for pre-training such models. However, fine-tuning requires fewer resources and is increasingly used by both small and large entities for its effectiveness on specialized data. Such small curated data for fine-tuning might contain sensitive information or proprietary assets. In this study, we attack both pre-trained and fine-tuned code language models to investigate the extent of data extractability. We first develop a custom benchmark to assess the vulnerability of both pre-training and fine-tuning samples to extraction attacks. Our findings reveal that 54.9% of extractable pre-training data could be retrieved from StarCoder2-15B, whereas this number decreased to 23.5% after fine-tuning. This indicates that fine-tuning reduces the extractability of pre-training data. However, compared to larger models, fine-tuning smaller models increases their vulnerability to data extraction attacks on fine-tuning data. Given the potential sensitivity of fine-tuning data, this can lead to more severe consequences. Lastly, we also manually analyzed 2000 extractable samples before and after fine-tuning. We also found that data carriers and licensing information are the most likely data categories to be memorized from pre-trained and fine-tuned models, while the latter is the most likely to be forgotten after fine-tuning.

Paperid: 1763, https://arxiv.org/pdf/2501.17021.pdf

Abstract:
This work investigates the problem of Oblivious Transfer (OT) over a noisy Multiple Access Channel (MAC) involving two non-colluding senders and a single receiver. The channel model is characterized by correlations among the parties, with the parties assumed to be either honest-but-curious or, in the receiver's case, potentially malicious. We propose a multiparty protocol for honest-but-curious parties where the general MAC is reduced to a certain correlation. In scenarios where the receiver is malicious, the protocol achieves an achievable rate region.

Paperid: 1764, https://arxiv.org/pdf/2501.16466.pdf

Abstract:
LLMs have shown preliminary promise in some security tasks and CTF challenges. Real cyberattacks are often multi-host network attacks, which involve executing a number of steps across multiple hosts such as conducting reconnaissance, exploiting vulnerabilities, and using compromised hosts to exfiltrate data. To date, the extent to which LLMs can autonomously execute multi-host network attacks} is not well understood. To this end, our first contribution is MHBench, an open-source multi-host attack benchmark with 10 realistic emulated networks (from 25 to 50 hosts). We find that popular LLMs including modern reasoning models (e.g., GPT4o, Gemini 2.5 Pro, Sonnet 3.7 Thinking) with state-of-art security-relevant prompting strategies (e.g., PentestGPT, CyberSecEval3) cannot autonomously execute multi-host network attacks. To enable LLMs to autonomously execute such attacks, our second contribution is Incalmo, an high-level abstraction layer. Incalmo enables LLMs to specify high-level actions (e.g., infect a host, scan a network). Incalmo's translation layer converts these actions into lower-level primitives (e.g., commands to exploit tools) through expert agents. In 9 out of 10 networks in MHBench, LLMs using Incalmo achieve at least some of the attack goals. Even smaller LLMs (e.g., Haiku 3.5, Gemini 2 Flash) equipped with Incalmo achieve all goals in 5 of 10 environments. We also validate the key role of high-level actions in Incalmo's abstraction in enabling LLMs to autonomously execute such attacks.

Paperid: 1765, https://arxiv.org/pdf/2501.16007.pdf

Abstract:
Large language models (LLMs) have proven to be very capable, but access to frontier models currently relies on inference providers. This introduces trust challenges: how can we be sure that the provider is using the model configuration they claim? We propose TOPLOC, a novel method for verifiable inference that addresses this problem. TOPLOC leverages a compact locality-sensitive hashing mechanism for intermediate activations, which can detect unauthorized modifications to models, prompts, or precision with 100% accuracy, achieving no false positives or negatives in our empirical evaluations. Our approach is robust across diverse hardware configurations, GPU types, and algebraic reorderings, which allows for validation speeds significantly faster than the original inference. By introducing a polynomial encoding scheme, TOPLOC minimizes the memory overhead of the generated proofs by $1000\times$, requiring only 258 bytes of storage per 32 new tokens, compared to the 262 KB requirement of storing the token embeddings directly for Llama 3.1-8B-Instruct. Our method empowers users to verify LLM inference computations efficiently, fostering greater trust and transparency in open ecosystems and laying a foundation for decentralized, verifiable and trustless AI services.

Paperid: 1766, https://arxiv.org/pdf/2501.15363.pdf

Abstract:
In the era of data-driven decision-making, ensuring the privacy and security of shared data is paramount across various domains. Applying existing deep neural networks (DNNs) to encrypted data is critical and often compromises performance, security, and computational overhead. To address these limitations, this research introduces a secure framework consisting of a learnable encryption method based on the block-pixel operation to encrypt the data and subsequently integrate it with the Vision Transformer (ViT). The proposed framework ensures data privacy and security by creating unique scrambling patterns per key, providing robust performance against adversarial attacks without compromising computational efficiency and data integrity. The framework was tested on sensitive medical datasets to validate its efficacy, proving its ability to handle highly confidential information securely. The suggested framework was validated with a 94\% success rate after extensive testing on real-world datasets, such as MRI brain tumors and histological scans of lung and colon cancers. Additionally, the framework was tested under diverse adversarial attempts against secure data sharing with optimum performance and demonstrated its effectiveness in various threat scenarios. These comprehensive analyses underscore its robustness, making it a trustworthy solution for secure data sharing in critical applications.

Paperid: 1767, https://arxiv.org/pdf/2501.13770.pdf

Abstract:
Web3's decentralised infrastructure has upended the standardised approach to digital identity established by protocols like OpenID Connect. Web2 and Web3 currently operate in silos, with Web2 leveraging selective disclosure JSON web tokens (SD-JWTs) and Web3 dApps being reliant on on-chain data and sometimes clinging to centralised system data. This fragmentation hinders user experience and the interconnectedness of the digital world. This paper explores the integration of Web3 within the OpenID Connect framework, scrutinising established authentication protocols for their adaptability to decentralised identities. The research examines the interplay between OpenID Connect and decentralised identity concepts, the limitations of existing protocols like OpenID Connect for verifiable credential issuance, OpenID Connect framework for verifiable presentations, and self-issued OpenID provider. As a result, a novel privacy-preserving digital identity bridge is proposed, which aims to answer the research question of whether authentication protocols should inherently support Web3 functionalities and the mechanisms for their integration. Through a Decentralised Autonomous Organisation (DAO) use case, the findings indicate that a privacy-centric bridge can mitigate existing fragmentation by aggregating different identities to provide a better user experience. While the digital identity bridge demonstrates a possible approach to harmonise digital identity across platforms for their use in Web3, the bridging is unidirectional and limits root trust of credentials. The bridge's dependence on centralised systems may further fuel the debate on (de-)centralised identities.

Paperid: 1768, https://arxiv.org/pdf/2501.13677.pdf

Abstract:
Large Language Models (LLMs) commonly rely on explicit refusal prefixes for safety, making them vulnerable to prefix injection attacks. We introduce HumorReject, a novel data-driven approach that reimagines LLM safety by decoupling it from refusal prefixes through humor as an indirect refusal strategy. Rather than explicitly rejecting harmful instructions, HumorReject responds with contextually appropriate humor that naturally defuses potentially dangerous requests. Our approach effectively addresses common "over-defense" issues while demonstrating superior robustness against various attack vectors. Our findings suggest that improvements in training data design can be as important as the alignment algorithm itself in achieving effective LLM safety.

Paperid: 1769, https://arxiv.org/pdf/2501.10915.pdf

Abstract:
Large Language Models (LLMs) hold promise for advancing legal practice by automating complex tasks and improving access to justice. However, their adoption is limited by concerns over client confidentiality, especially when lawyers include sensitive Personally Identifiable Information (PII) in prompts, risking unauthorized data exposure. To mitigate this, we introduce LegalGuardian, a lightweight, privacy-preserving framework tailored for lawyers using LLM-based tools. LegalGuardian employs Named Entity Recognition (NER) techniques and local LLMs to mask and unmask confidential PII within prompts, safeguarding sensitive data before any external interaction. We detail its development and assess its effectiveness using a synthetic prompt library in immigration law scenarios. Comparing traditional NER models with one-shot prompted local LLM, we find that LegalGuardian achieves a F1-score of 93% with GLiNER and 97% with Qwen2.5-14B in PII detection. Semantic similarity analysis confirms that the framework maintains high fidelity in outputs, ensuring robust utility of LLM-based tools. Our findings indicate that legal professionals can harness advanced AI technologies without compromising client confidentiality or the quality of legal documents.

Paperid: 1770, https://arxiv.org/pdf/2501.10013.pdf

Abstract:
This work presents CaFA, a system for Cost-aware Feasible Attacks for assessing the robustness of neural tabular classifiers against adversarial examples realizable in the problem space, while minimizing adversaries' effort. To this end, CaFA leverages TabPGD$-$an algorithm we set forth to generate adversarial perturbations suitable for tabular data$-$ and incorporates integrity constraints automatically mined by state-of-the-art database methods. After producing adversarial examples in the feature space via TabPGD, CaFA projects them on the mined constraints, leading, in turn, to better attack realizability. We tested CaFA with three datasets and two architectures and found, among others, that the constraints we use are of higher quality (measured via soundness and completeness) than ones employed in prior work. Moreover, CaFA achieves higher feasible success rates$-$i.e., it generates adversarial examples that are often misclassified while satisfying constraints$-$than prior attacks while simultaneously perturbing few features with lower magnitudes, thus saving effort and improving inconspicuousness. We open-source CaFA, hoping it will serve as a generic system enabling machine-learning engineers to assess their models' robustness against realizable attacks, thus advancing deployed models' trustworthiness.

Paperid: 1771, https://arxiv.org/pdf/2501.09216.pdf

Abstract:
Prior research yielded many techniques to mitigate software compromise for low-end Internet of Things (IoT) devices. Some of them detect software modifications via remote attestation and similar services, while others preventatively ensure software (static) integrity. However, achieving run-time (dynamic) security, e.g., control-flow integrity (CFI), remains a challenge. Control-flow attestation (CFA) is one approach that minimizes the burden on devices. However, CFA is not a real-time countermeasure against run-time attacks since it requires communication with a verifying entity. This poses significant risks if safety- or time-critical tasks have memory vulnerabilities. To address this issue, we construct EILID - a hybrid architecture that ensures software execution integrity by actively monitoring control-flow violations on low-end devices. EILID is built atop CASU, a prevention-based (i.e., active) hybrid Root-of-Trust (RoT) that guarantees software immutability. EILID achieves fine-grained backward-edge and function-level forward-edge CFI via semi-automatic code instrumentation and a secure shadow stack.

Paperid: 1772, https://arxiv.org/pdf/2501.08970.pdf

Abstract:
We often interact with untrusted parties. Prioritization of privacy can limit the effectiveness of these interactions, as achieving certain goals necessitates sharing private data. Traditionally, addressing this challenge has involved either seeking trusted intermediaries or constructing cryptographic protocols that restrict how much data is revealed, such as multi-party computations or zero-knowledge proofs. While significant advances have been made in scaling cryptographic approaches, they remain limited in terms of the size and complexity of applications they can be used for. In this paper, we argue that capable machine learning models can fulfill the role of a trusted third party, thus enabling secure computations for applications that were previously infeasible. In particular, we describe Trusted Capable Model Environments (TCMEs) as an alternative approach for scaling secure computation, where capable machine learning model(s) interact under input/output constraints, with explicit information flow control and explicit statelessness. This approach aims to achieve a balance between privacy and computational efficiency, enabling private inference where classical cryptographic solutions are currently infeasible. We describe a number of use cases that are enabled by TCME, and show that even some simple classic cryptographic problems can already be solved with TCME. Finally, we outline current limitations and discuss the path forward in implementing them.

Paperid: 1773, https://arxiv.org/pdf/2501.08947.pdf

Abstract:
We present the first systematic approach to static and dynamic taint analysis for Graph APIs focusing on broken access control. The approach comprises the following. We taint nodes in the Graph API if they represent data requiring specific privileges in order to be retrieved or manipulated, and identify API calls which are related to sources and sinks. Then, we statically analyze whether tainted information flow between API source and sink calls occurs. To this end, we model the API calls using graph transformation rules. We subsequently use critical pair analysis to automatically analyze potential dependencies between rules representing source calls and rules representing sink calls. We distinguish direct from indirect tainted information flow and argue under which conditions the CPA is able to detect not only direct, but also indirect tainted flow. The static taint analysis (i) identifies flows that need to be further reviewed, since tainted nodes may be created by an API call and used or manipulated by another API call later without having the necessary privileges, and (ii) can be used to systematically design dynamic security tests for broken access control. The dynamic taint analysis checks if potential broken access control risks detected during the static taint analysis really occur. We apply the approach to a part of the GitHub GraphQL API. The application illustrates that our analysis supports the detection of two types of broken access control systematically: the case where users of the API may not be able to access or manipulate information, although they should be able to do so; and the case where users (or attackers) of the API may be able to access/manipulate information that they should not.

Paperid: 1774, https://arxiv.org/pdf/2501.07597.pdf

Abstract:
Safety-critical cyber-physical systems (CPS), such as quadrotor UAVs, are particularly prone to cyber attacks, which can result in significant consequences if not detected promptly and accurately. During outdoor operations, the nonlinear dynamics of UAV systems, combined with non-Gaussian noise, pose challenges to the effectiveness of conventional statistical and machine learning methods. To overcome these limitations, we present QUADFormer, an advanced attack detection framework for quadrotor UAVs leveraging a transformer-based architecture. This framework features a residue generator that produces sequences sensitive to anomalies, which are then analyzed by the transformer to capture statistical patterns for detection and classification. Furthermore, an alert mechanism ensures UAVs can operate safely even when under attack. Extensive simulations and experimental evaluations highlight that QUADFormer outperforms existing state-of-the-art techniques in detection accuracy.

Paperid: 1775, https://arxiv.org/pdf/2501.07149.pdf

Abstract:
Human motion is a behavioral biometric trait that can be used to identify individuals and infer private attributes such as medical conditions. This poses a serious threat to privacy as motion extraction from video and motion capture are increasingly used for a variety of applications, including mixed reality, robotics, medicine, and the quantified self. In order to protect the privacy of the tracked individuals, anonymization techniques that preserve the utility of the data are required. However, anonymizing motion data is a challenging task because there are many dependencies in motion sequences (such as physiological constraints) that, if ignored, make the anonymized motion sequence appear unnatural. In this paper, we propose Pantomime, a full-body anonymization technique for motion data, which uses foundation motion models to generate motion sequences that adhere to the dependencies in the data, thus keeping the utility of the anonymized data high. Our results show that Pantomime can maintain the naturalness of the motion sequences while reducing the identification accuracy to 10%.

Paperid: 1776, https://arxiv.org/pdf/2501.06997.pdf

Abstract:
Advanced Persistent Threat (APT) have grown increasingly complex and concealed, posing formidable challenges to existing Intrusion Detection Systems in identifying and mitigating these attacks. Recent studies have incorporated graph learning techniques to extract detailed information from provenance graphs, enabling the detection of attacks with greater granularity. Nevertheless, existing studies have largely overlooked the continuous yet subtle temporal variations in the structure of provenance graphs, which may correspond to surreptitious perturbation anomalies in ongoing APT attacks. Therefore, we introduce TFLAG, an advanced anomaly detection framework that for the first time integrates the structural dynamic extraction capabilities of temporal graph model with the anomaly delineation abilities of deviation networks to pinpoint covert attack activities in provenance graphs. This self-supervised integration framework leverages the graph model to extract neighbor interaction data under continuous temporal changes from historical benign behaviors within provenance graphs, while simultaneously utilizing deviation networks to accurately distinguish authentic attack activities from false positive deviations due to unexpected subtle perturbations. The experimental results indicate that, through a comprehensive design that utilizes both attribute and temporal information, it can accurately identify the time windows associated with APT attack behaviors without prior knowledge (e.g., labeled data samples), demonstrating superior accuracy compared to current state-of-the-art methods in differentiating between attack events and system false positive events.

Paperid: 1777, https://arxiv.org/pdf/2501.03808.pdf

Abstract:
Distributed ledger technology offers several advantages for banking and finance industry, including efficient transaction processing and cross-party transaction reconciliation. The key challenges for adoption of this technology in financial institutes are (a) the building of a privacy-preserving ledger, (b) supporting auditing and regulatory requirements, and (c) flexibility to adapt to complex use-cases with multiple digital assets and actors. This paper proposes a framework for a private, audit-able, and distributed ledger (PADL) that adapts easily to fundamental use-cases within financial institutes. PADL employs widely-used cryptography schemes combined with zero-knowledge proofs to propose a transaction scheme for a `table' like ledger. It enables fast confidential peer-to-peer multi-asset transactions, and transaction graph anonymity, in a no-trust setup, but with customized privacy. We prove that integrity and anonymity of PADL is secured against a strong threat model. Furthermore, we showcase three fundamental real-life use-cases, namely, an assets exchange ledger, a settlement ledger, and a bond market ledger. Based on these use-cases we show that PADL supports smooth-lined inter-assets auditing while preserving privacy of the participants. For example, we show how a bank can be audited for its liquidity or credit risk without violation of privacy of itself or any other party, or how can PADL ensures honest coupon rate payment in bond market without sharing investors values. Finally, our evaluation shows PADL's advantage in performance against previous relevant schemes.

Paperid: 1778, https://arxiv.org/pdf/2501.03141.pdf

Abstract:
Today, many auctions are carried out with the help of intermediary platforms like Google and eBay. We refer to such auctions as platform-assisted auctions.Traditionally, the auction theory literature mainly focuses on designing auctions that incentivize the buyers to bid truthfully,assuming that the platform always faithfully implements the auction. In practice, however, the platforms have been found to manipulate the auctions to earn more profit, resulting in high-profile anti-trust lawsuits. We propose a new model for studying platform-assisted auctions in the permissionless setting. We explore whether it is possible to design a dream auction in thisnew model, such that honest behavior is the utility-maximizing strategy for each individual buyer, the platform, the seller, as well as platform-seller or platform-buyer coalitions.Through a collection of feasibility and infeasibility results,we carefully characterize the mathematical landscape of platform-assisted auctions. We show how cryptography can lend to the design of an efficient platform-assisted auction with dream properties. Although a line of works have also used MPC or the blockchain to remove the reliance on a trusted auctioneer, our work is distinct in nature in several dimensions.First, we initiate a systematic exploration of the game theoretic implications when the service providers are strategic and can collude with sellers or buyers. Second, we observe that the full simulation paradigm is too stringent and leads to high asymptotical costs. Specifically, because every player has a different private outcomein an auction protocol, running any generic MPC protocol among the players would incur at least $n^2$ total cost. We propose a new notion of simulation calledutility-dominated emulation.Under this new notion, we showhow to design efficient auction protocols with quasilinear efficiency.

Paperid: 1779, https://arxiv.org/pdf/2501.03118.pdf

Abstract:
Key-Value Stores (KVSs) are No-SQL databases that store data as key-value pairs and have gained popularity due to their simplicity, scalability, and fast retrieval capabilities. However, storing sensitive data in KVSs requires strong security properties to prevent data leakage and unauthorized tampering. While software (SW)-based encryption techniques are commonly used to maintain data confidentiality and integrity, they suffer from several drawbacks. They strongly assume trust in the hosting system stack and do not secure data during processing unless using performance-heavy techniques (e.g., homomorphic encryption). Alternatively, Trusted Execution Environments (TEEs) provide a solution that enforces the confidentiality and integrity of code and data at the CPU level, allowing users to build trusted applications in an untrusted environment. They also secure data in use by providing an encapsulated processing environment called enclave. Nevertheless, TEEs come with their own set of drawbacks, including performance issues due to memory size limitations and CPU context switching. This paper examines the state of the art in TEE-based confidential KVSs and highlights common design strategies used in KVSs to leverage TEE security features while overcoming their inherent limitations. This work aims to provide a comprehensive understanding of the use of TEEs in KVSs and to identify research directions for future work.

Paperid: 1780, https://arxiv.org/pdf/2501.02407.pdf

Abstract:
Rapid advances in Natural Language Processing (NLP) have revolutionized many fields, including healthcare. However, these advances raise significant privacy concerns, especially when pre-trained models fine-tuned and specialized on sensitive data can memorize and then expose and regurgitate personal information. This paper presents a privacy-preserving language modeling approach to address the problem of language models anonymization, and thus promote their sharing. Specifically, we propose both a Masking Language Modeling (MLM) methodology to specialize a BERT-like language model, and a Causal Language Modeling (CLM) methodology to specialize a GPT-like model that avoids the model from memorizing direct and indirect identifying information present in the training data. We have comprehensively evaluated our approaches using a medical dataset and compared them against different baselines. Our results indicate that by avoiding memorizing both direct and indirect identifiers during model specialization, our masking and causal language modeling schemes offer a good tradeoff for maintaining high privacy while retaining high utility.

Paperid: 1781, https://arxiv.org/pdf/2501.00790.pdf

Abstract:
The rapid proliferation of Industrial Internet of Things (IIoT) systems necessitates advanced, interpretable, and scalable intrusion detection systems (IDS) to combat emerging cyber threats. Traditional IDS face challenges such as high computational demands, limited explainability, and inflexibility against evolving attack patterns. To address these limitations, this study introduces the Lightweight Explainable Network Security framework (LENS-XAI), which combines robust intrusion detection with enhanced interpretability and scalability. LENS-XAI integrates knowledge distillation, variational autoencoder models, and attribution-based explainability techniques to achieve high detection accuracy and transparency in decision-making. By leveraging a training set comprising 10% of the available data, the framework optimizes computational efficiency without sacrificing performance. Experimental evaluation on four benchmark datasets: Edge-IIoTset, UKM-IDS20, CTU-13, and NSL-KDD, demonstrates the framework's superior performance, achieving detection accuracies of 95.34%, 99.92%, 98.42%, and 99.34%, respectively. Additionally, the framework excels in reducing false positives and adapting to complex attack scenarios, outperforming existing state-of-the-art methods. Key strengths of LENS-XAI include its lightweight design, suitable for resource-constrained environments, and its scalability across diverse IIoT and cybersecurity contexts. Moreover, the explainability module enhances trust and transparency, critical for practical deployment in dynamic and sensitive applications. This research contributes significantly to advancing IDS by addressing computational efficiency, feature interpretability, and real-world applicability. Future work could focus on extending the framework to ensemble AI systems for distributed environments, further enhancing its robustness and adaptability.

Paperid: 1782, https://arxiv.org/pdf/2506.23909.pdf

Abstract:
This work addresses the challenge of malware classification using machine learning by developing a novel dataset labeled at both the malware type and family levels. Raw binaries were collected from sources such as VirusShare, VX Underground, and MalwareBazaar, and subsequently labeled with family information parsed from binary names and type-level labels integrated from ClarAVy. The dataset includes 14 malware types and 17 malware families, and was processed using a unified feature extraction pipeline based on static analysis, particularly extracting features from Portable Executable headers, to support advanced classification tasks. The evaluation was focused on three key classification tasks. In the binary classification of malware versus benign samples, Random Forest and XGBoost achieved high accuracy on the full datasets, reaching 98.5% for type-based detection and 98.98% for family-based detection. When using truncated datasets of 1,000 samples to assess performance under limited data conditions, both models still performed strongly, achieving 97.6% for type-based detection and 98.66% for family-based detection. For interclass classification, which distinguishes between malware types or families, the models reached up to 97.5% accuracy on type-level tasks and up to 93.7% on family-level tasks. In the multiclass classification setting, which assigns samples to the correct type or family, SVM achieved 81.1% accuracy on type labels, while Random Forest and XGBoost reached approximately 73.4% on family labels. The results highlight practical trade-offs between accuracy and computational cost, and demonstrate that labeling at both the type and family levels enables more fine-grained and insightful malware classification. The work establishes a robust foundation for future research on advanced malware detection and classification.

Paperid: 1783, https://arxiv.org/pdf/2506.20944.pdf

Abstract:
The rapid spread of misinformation in mobile and wireless networks presents critical security challenges. This study introduces a training-free, retrieval-based multimodal fact verification system that leverages pretrained vision-language models and large language models for credibility assessment. By dynamically retrieving and cross-referencing trusted data sources, our approach mitigates vulnerabilities of traditional training-based models, such as adversarial attacks and data poisoning. Additionally, its lightweight design enables seamless edge device integration without extensive on-device processing. Experiments on two fact-checking benchmarks achieve SOTA results, confirming its effectiveness in misinformation detection and its robustness against various attack vectors, highlighting its potential to enhance security in mobile and wireless communication environments.

Paperid: 1784, https://arxiv.org/pdf/2506.20926.pdf

Abstract:
Generative code models (GCMs) significantly enhance development efficiency through automated code generation and code summarization. However, building and training these models require computational resources and time, necessitating effective digital copyright protection to prevent unauthorized leaks and misuse. Backdoor watermarking, by embedding hidden identifiers, simplifies copyright verification by breaking the model's black-box nature. Current backdoor watermarking techniques face two main challenges: first, limited generalization across different tasks and datasets, causing fluctuating verification rates; second, insufficient stealthiness, as watermarks are easily detected and removed by automated methods. To address these issues, we propose CodeGuard, a novel watermarking method combining attention mechanisms with distributed trigger embedding strategies. Specifically, CodeGuard employs attention mechanisms to identify watermark embedding positions, ensuring verifiability. Moreover, by using homomorphic character replacement, it avoids manual detection, while distributed trigger embedding reduces the likelihood of automated detection. Experimental results demonstrate that CodeGuard achieves up to 100% watermark verification rates in both code summarization and code generation tasks, with no impact on the primary task performance. In terms of stealthiness, CodeGuard performs exceptionally, with a maximum detection rate of only 0.078 against ONION detection methods, significantly lower than baseline methods.

Paperid: 1785, https://arxiv.org/pdf/2506.19886.pdf

Abstract:
Semantic communication has emerged as a promising neural network-based system design for 6G networks. Task-oriented semantic communication is a novel paradigm whose core goal is to efficiently complete specific tasks by transmitting semantic information, optimizing communication efficiency and task performance. The key challenge lies in preserving privacy while maintaining task accuracy, as this scenario is susceptible to model inversion attacks. In such attacks, adversaries can restore or even reconstruct input data by analyzing and processing model outputs, owing to the neural network-based nature of the systems. In addition, traditional systems use image quality indicators (such as PSNR or SSIM) to assess attack severity, which may be inadequate for task-oriented semantic communication, since visual differences do not necessarily ensure semantic divergence. In this paper, we propose a diffusion-based semantic communication framework, named DiffSem, that optimizes semantic information reconstruction through a diffusion mechanism with self-referential label embedding to significantly improve task performance. Our model also compensates channel noise and adopt semantic information distortion to ensure the robustness of the system in various signal-to-noise ratio environments. To evaluate the attacker's effectiveness, we propose a new metric that better quantifies the semantic fidelity of estimations from the adversary. Experimental results based on this criterion show that on the MNIST dataset, DiffSem improves the classification accuracy by 10.03%, and maintain stable performance under dynamic channels. Our results further demonstrate that significant deviation exists between traditional image quality indicators and the leakage of task-relevant semantic information.

Paperid: 1786, https://arxiv.org/pdf/2506.19874.pdf

Abstract:
Recent secure weight release schemes claim to enable open-source model distribution while protecting model ownership and preventing misuse. However, these approaches lack rigorous security foundations and provide only informal security guarantees. Inspired by established works in cryptography, we formalize the security of weight release schemes by introducing several concrete security definitions. We then demonstrate our definition's utility through a case study of TaylorMLP, a prominent secure weight release scheme. Our analysis reveals vulnerabilities that allow parameter extraction thus showing that TaylorMLP fails to achieve its informal security goals. We hope this work will advocate for rigorous research at the intersection of machine learning and security communities and provide a blueprint for how future weight release schemes should be designed and evaluated.

Paperid: 1787, https://arxiv.org/pdf/2506.19533.pdf

Abstract:
Backdoor attacks embed a hidden functionality into deep neural networks, causing the network to display anomalous behavior when activated by a predetermined pattern in the input Trigger, while behaving well otherwise on public test data. Recent works have shown that backdoored face recognition (FR) systems can respond to natural-looking triggers like a particular pair of sunglasses. Such attacks pose a serious threat to the applicability of FR systems in high-security applications. We propose a novel technique to (1) detect whether an FR network is compromised with a natural, physically realizable trigger, and (2) identify such triggers given a compromised network. We demonstrate the effectiveness of our methods with a compromised FR network, where we are able to identify the trigger (e.g., green sunglasses or red hat) with a top-5 accuracy of 74%, whereas a naive brute force baseline achieves 56% accuracy.

Paperid: 1788, https://arxiv.org/pdf/2506.17446.pdf

Abstract:
This paper examines the effects of replay attacks on the integrity of both uplink and downlink communications during critical phases of spacecraft communication. By combining software-defined radios (SDRs) with a real-time channel emulator, we replicate realistic attack conditions on the Orion spacecraft's communication systems in both launch and reentry. Our evaluation shows that, under replay attacks, the attacker's signal can overpower legitimate transmissions, leading to a Signal to Noise Ratio (SNR) difference of up to -7.8 dB during reentry and -6.5 dB during launch. To mitigate these threats, we propose a more secure receiver design incorporating a phase-coherency-dependent decision-directed (DD) equalizer with a narrowed phase-locked loop (PLL) bandwidth. This configuration enhances resilience by making synchronization more sensitive to phase distortions caused by replay interference.

Paperid: 1789, https://arxiv.org/pdf/2506.16328.pdf

Abstract:
Kubernetes has emerged as the de facto orchestrator of microservices, providing scalability and extensibility to a highly dynamic environment. It builds an intricate and deeply connected system that requires extensive monitoring capabilities to be properly managed. To this account, K8s natively offers audit logs, a powerful feature for tracking API interactions in the cluster. Audit logs provide a detailed and chronological record of all activities in the system. Unfortunately, K8s auditing suffers from several practical limitations: it generates large volumes of data continuously, as all components within the cluster interact and respond to user actions. Moreover, each action can trigger a cascade of secondary events dispersed across the log, with little to no explicit linkage, making it difficult to reconstruct the context behind user-initiated operations. In this paper, we introduce K8NTEXT, a novel approach for streamlining K8s audit logs by reconstructing contexts, i.e., grouping actions performed by actors on the cluster with the subsequent events these actions cause. Correlated API calls are automatically identified, labeled, and consistently grouped using a combination of inference rules and a Machine Learning model, largely simplifying data consumption. We evaluate K8NTEXT's performance, scalability, and expressiveness both in systematic tests and with a series of use cases. We show that it consistently provides accurate context reconstruction, even for complex operations involving 50, 100 or more correlated actions, achieving over 95 percent accuracy across the entire spectrum, from simple to highly composite actions.

Paperid: 1790, https://arxiv.org/pdf/2506.15102.pdf

Abstract:
Privacy-preserving neural network training in vertically partitioned scenarios is vital for secure collaborative modeling across institutions. This paper presents \textbf{EVA-S2PMLP}, an Efficient, Verifiable, and Accurate Secure Two-Party Multi-Layer Perceptron framework that introduces spatial-scale optimization for enhanced privacy and performance. To enable reliable computation under real-number domain, EVA-S2PMLP proposes a secure transformation pipeline that maps scalar inputs to vector and matrix spaces while preserving correctness. The framework includes a suite of atomic protocols for linear and non-linear secure computations, with modular support for secure activation, matrix-vector operations, and loss evaluation. Theoretical analysis confirms the reliability, security, and asymptotic complexity of each protocol. Extensive experiments show that EVA-S2PMLP achieves high inference accuracy and significantly reduced communication overhead, with up to $12.3\times$ improvement over baselines. Evaluation on benchmark datasets demonstrates that the framework maintains model utility while ensuring strict data confidentiality, making it a practical solution for privacy-preserving neural network training in finance, healthcare, and cross-organizational AI applications.

Paperid: 1791, https://arxiv.org/pdf/2506.14913.pdf

Abstract:
The pre-training of large language models (LLMs) relies on massive text datasets sourced from diverse and difficult-to-curate origins. Although membership inference attacks and hidden canaries have been explored to trace data usage, such methods rely on memorization of training data, which LM providers try to limit. In this work, we demonstrate that indirect data poisoning (where the targeted behavior is absent from training data) is not only feasible but also allow to effectively protect a dataset and trace its use. Using gradient-based optimization prompt-tuning, we make a model learn arbitrary secret sequences: secret responses to secret prompts that are absent from the training corpus. We validate our approach on language models pre-trained from scratch and show that less than 0.005% of poisoned tokens are sufficient to covertly make a LM learn a secret and detect it with extremely high confidence ($p < 10^{-55}$) with a theoretically certifiable scheme. Crucially, this occurs without performance degradation (on LM benchmarks) and despite secrets never appearing in the training set.

Paperid: 1792, https://arxiv.org/pdf/2506.14337.pdf

Abstract:
Phishing attacks remain a significant threat to modern cybersecurity, as they successfully deceive both humans and the defense mechanisms intended to protect them. Traditional detection systems primarily focus on email metadata that users cannot see in their inboxes. Additionally, these systems struggle with phishing emails, which experienced users can often identify empirically by the text alone. This paper investigates the practical potential of Large Language Models (LLMs) to detect these emails by focusing on their intent. In addition to the binary classification of phishing emails, the paper introduces an intent-type taxonomy, which is operationalized by the LLMs to classify emails into distinct categories and, therefore, generate actionable threat information. To facilitate our work, we have curated publicly available datasets into a custom dataset containing a mix of legitimate and phishing emails. Our results demonstrate that existing LLMs are capable of detecting and categorizing phishing emails, underscoring their potential in this domain.

Paperid: 1793, https://arxiv.org/pdf/2506.13024.pdf

Abstract:
While certified robustness is widely promoted as a solution to adversarial examples in Artificial Intelligence systems, significant challenges remain before these techniques can be meaningfully deployed in real-world applications. We identify critical gaps in current research, including the paradox of detection without distinction, the lack of clear criteria for practitioners to evaluate certification schemes, and the potential security risks arising from users' expectations surrounding ``guaranteed" robustness claims. These create an alignment issue between how certifications are presented and perceived, relative to their actual capabilities. This position paper is a call to arms for the certification research community, proposing concrete steps to address these fundamental challenges and advance the field toward practical applicability.

Paperid: 1794, https://arxiv.org/pdf/2506.12466.pdf

Abstract:
The increasing complexity of cyberphysical power systems leads to larger attack surfaces to be exploited by malicious actors and a higher risk of faults through misconfiguration. We propose to meet those risks with a declarative approach to describe cyberphysical power systems and to automatically evaluate security and safety controls. We leverage Semantic Web technologies as a well-standardized framework, providing languages to specify ontologies, rules and shape constraints. We model infrastructure through an ontology which combines external ontologies, architecture and data models for sufficient expressivity and interoperability with external systems. The ontology can enrich itself through rules defined in SPARQL, allowing for the inference of knowledge that is not explicitly stated. Through the evaluation of SHACL shape constraints we can then validate the data and verify safety and security constraints. We demonstrate this concept with two use cases and illustrate how this solution can be developed further in a community-driven fashion.

Paperid: 1795, https://arxiv.org/pdf/2506.12113.pdf

Abstract:
In a context of malware analysis, numerous approaches rely on Artificial Intelligence to handle a large volume of data. However, these techniques focus on data view (images, sequences) and not on an expert's view. Noticing this issue, we propose a preprocessing that focuses on expert knowledge to improve malware semantic analysis and result interpretability. We propose a new preprocessing method which creates JSON reports for Portable Executable files. These reports gather features from both static and behavioral analysis, and incorporate packer signature detection, MITRE ATT\&CK and Malware Behavior Catalog (MBC) knowledge. The purpose of this preprocessing is to gather a semantic representation of binary files, understandable by malware analysts, and that can enhance AI models' explainability for malicious files analysis. Using this preprocessing to train a Large Language Model for Malware classification, we achieve a weighted-average F1-score of 0.94 on a complex dataset, representative of market reality.

Paperid: 1796, https://arxiv.org/pdf/2506.12097.pdf

Abstract:
Machine unlearning aims to remove specific information, e.g. sensitive or undesirable content, from large language models (LLMs) while preserving overall performance. We propose an inference-time unlearning algorithm that uses contrastive decoding, leveraging two auxiliary smaller models, one trained without the forget set and one trained with it, to guide the outputs of the original model using their difference during inference. Our strategy substantially improves the tradeoff between unlearning effectiveness and model utility. We evaluate our approach on two unlearning benchmarks, TOFU and MUSE. Results show notable gains in both forget quality and retained performance in comparison to prior approaches, suggesting that incorporating contrastive decoding can offer an efficient, practical avenue for unlearning concepts in large-scale models.

Paperid: 1797, https://arxiv.org/pdf/2506.11486.pdf

Abstract:
In this paper, we investigate the differential and boomerang properties of a class of binomial $F_{r,u}(x) = x^r(1 + uÏ(x))$ over the finite field $\mathbb{F}_{p^n}$, where $r = \frac{p^n+1}{4}$, $p^n \equiv 3 \pmod{4}$, and $Ï(x) = x^{\frac{p^n -1}{2}}$ is the quadratic character in $\mathbb{F}_{p^n}$. We show that $F_{r,\pm1}$ is locally-PN with boomerang uniformity $0$ when $p^n \equiv 3 \pmod{8}$. To the best of our knowledge, the second known non-PN function class with boomerang uniformity $0$, and the first such example over odd characteristic fields with $p > 3$. Moreover, we show that $F_{r,\pm1}$ is locally-APN with boomerang uniformity at most $2$ when $p^n \equiv 7 \pmod{8}$. We also provide complete classifications of the differential and boomerang spectra of $F_{r,\pm1}$. Furthermore, we thoroughly investigate the differential uniformity of $F_{r,u}$ for $u\in \mathbb{F}_{p^n}^* \setminus \{\pm1\}$.

Paperid: 1798, https://arxiv.org/pdf/2506.09807.pdf

Abstract:
The identification of the devices from which a message is received is part of security mechanisms to ensure authentication in wireless communications. Conventional authentication approaches are cryptography-based, which, however, are usually computationally expensive and not adequate in the Internet of Things (IoT), where devices tend to be low-cost and with limited resources. This paper provides a comprehensive survey of physical layer-based device fingerprinting, which is an emerging device authentication for wireless security. In particular, this article focuses on hardware impairment-based identity authentication and channel features-based authentication. They are passive techniques that are readily applicable to legacy IoT devices. Their intrinsic hardware and channel features, algorithm design methodologies, application scenarios, and key research questions are extensively reviewed here. The remaining research challenges are discussed, and future work is suggested that can further enhance the physical layer-based device fingerprinting.

Paperid: 1799, https://arxiv.org/pdf/2506.09502.pdf

Abstract:
To more effectively control and allocate network resources, MAC CE has been introduced into the network protocol, which is a type of control signaling located in the MAC layer. Since MAC CE lacks encryption and integrity protection mechanisms provided by PDCP, the control signaling carried by MAC CE is vulnerable to interception or tampering by attackers during resource scheduling and allocation. Currently, the 3GPP has analyzed the security risks of Layer 1/Layer 2 Triggered Mobility (LTM), where handover signaling sent to the UE via MAC CE by the network can lead to privacy leaks and network attacks. However, in addition to LTM, there may be other potential security vulnerabilities in other protocol procedures. Therefore, this paper explores the security threats to MAC CE and the corresponding protection mechanisms. The research is expected to support the 3GPP's study of MAC CE and be integrated with the security research of lower-layer protocols, thereby enhancing the security and reliability of the entire communication system.

Paperid: 1800, https://arxiv.org/pdf/2506.08922.pdf

Abstract:
Off-the-shelf software for Command and Control is often used by attackers and legitimate pentesters looking for discretion. Among other functionalities, these tools facilitate the customization of their network traffic so it can mimic popular websites, thereby increasing their secrecy. Cobalt Strike is one of the most famous solutions in this category, used by known advanced attacker groups such as "Mustang Panda" or "Nobelium". In response to these threats, Security Operation Centers and other defense actors struggle to detect Command and Control traffic, which often use encryption protocols such as TLS. Network traffic metadata-based machine learning approaches have been proposed to detect encrypted malware communications or fingerprint websites over Tor network. This paper presents a machine learning-based method to detect Cobalt Strike Command and Control activity based only on widely used network traffic metadata. The proposed method is, to the best of our knowledge, the first of its kind that is able to adapt the model it uses to the observed traffic to optimize its performance. This specificity permits our method to performs equally or better than the state of the art while using standard features. Our method is thus easier to use in a production environment and more explainable.

Paperid: 1801, https://arxiv.org/pdf/2506.08192.pdf

Abstract:
We analyze two open source deep reinforcement learning agents submitted to the CAGE Challenge 2 cyber defense challenge, where each competitor submitted an agent to defend a simulated network against each of several provided rules-based attack agents. We demonstrate that one can gain interpretability of agent successes and failures by simplifying the complex state and action spaces and by tracking important events, shedding light on the fine-grained behavior of both the defense and attack agents in each experimental scenario. By analyzing important events within an evaluation episode, we identify patterns in infiltration and clearing events that tell us how well the attacker and defender played their respective roles; for example, defenders were generally able to clear infiltrations within one or two timesteps of a host being exploited. By examining transitions in the environment's state caused by the various possible actions, we determine which actions tended to be effective and which did not, showing that certain important actions are between 40% and 99% ineffective. We examine how decoy services affect exploit success, concluding for instance that decoys block up to 94% of exploits that would directly grant privileged access to a host. Finally, we discuss the realism of the challenge and ways that the CAGE Challenge 4 has addressed some of our concerns.

Paperid: 1802, https://arxiv.org/pdf/2506.07894.pdf

Abstract:
Federated Learning (FL) enables collaborative model training across distributed clients without sharing raw data, making it a promising approach for privacy-preserving machine learning in domains like Connected and Autonomous Vehicles (CAVs). However, recent studies have shown that exchanged model gradients remain susceptible to inference attacks such as Deep Leakage from Gradients (DLG), which can reconstruct private training data. While existing defenses like Differential Privacy (DP) and Secure Multi-Party Computation (SMPC) offer protection, they often compromise model accuracy. To that end, Homomorphic Encryption (HE) offers a promising alternative by enabling lossless computation directly on encrypted data, thereby preserving both privacy and model utility. However, HE introduces significant computational and communication overhead, which can hinder its practical adoption. To address this, we systematically evaluate various leveled HE schemes to identify the most suitable for FL in resource-constrained environments due to its ability to support fixed-depth computations without requiring costly bootstrapping. Our contributions in this paper include a comprehensive evaluation of HE schemes for real-world FL applications, a selective encryption strategy that targets only the most sensitive gradients to minimize computational overhead, and the development of a full HE-based FL pipeline that effectively mitigates DLG attacks while preserving model accuracy. We open-source our implementation to encourage reproducibility and facilitate adoption in safety-critical domains.

Paperid: 1803, https://arxiv.org/pdf/2506.06891.pdf

Abstract:
We study the corruption-robustness of in-context reinforcement learning (ICRL), focusing on the Decision-Pretrained Transformer (DPT, Lee et al., 2023). To address the challenge of reward poisoning attacks targeting the DPT, we propose a novel adversarial training framework, called Adversarially Trained Decision-Pretrained Transformer (AT-DPT). Our method simultaneously trains an attacker to minimize the true reward of the DPT by poisoning environment rewards, and a DPT model to infer optimal actions from the poisoned data. We evaluate the effectiveness of our approach against standard bandit algorithms, including robust baselines designed to handle reward contamination. Our results show that the proposed method significantly outperforms these baselines in bandit settings, under a learned attacker. We additionally evaluate AT-DPT on an adaptive attacker, and observe similar results. Furthermore, we extend our evaluation to the MDP setting, confirming that the robustness observed in bandit scenarios generalizes to more complex environments.

Paperid: 1804, https://arxiv.org/pdf/2506.04962.pdf

Abstract:
Security vulnerabilities in software packages are a significant concern for developers and users alike. Patching these vulnerabilities in a timely manner is crucial to restoring the integrity and security of software systems. However, previous work has shown that vulnerability reports often lack proof-of-concept (PoC) exploits, which are essential for fixing the vulnerability, testing patches, and avoiding regressions. Creating a PoC exploit is challenging because vulnerability reports are informal and often incomplete, and because it requires a detailed understanding of how inputs passed to potentially vulnerable APIs may reach security-relevant sinks. In this paper, we present PoCGen, a novel approach to autonomously generate and validate PoC exploits for vulnerabilities in npm packages. This is the first fully autonomous approach to use large language models (LLMs) in tandem with static and dynamic analysis techniques for PoC exploit generation. PoCGen leverages an LLM for understanding vulnerability reports, for generating candidate PoC exploits, and for validating and refining them. Our approach successfully generates exploits for 77% of the vulnerabilities in the SecBench$.$js dataset and 39% in a new, more challenging dataset of 794 recent vulnerabilities. This success rate significantly outperforms a recent baseline (by 45 absolute percentage points), while imposing an average cost of $0.02 per generated exploit.

Paperid: 1805, https://arxiv.org/pdf/2506.03870.pdf

Abstract:
The misuse of Large Language Models (LLMs) to infer emotions from text for malicious purposes, known as emotion inference attacks, poses a significant threat to user privacy. In this paper, we investigate the potential of Apple Intelligence's writing tools, integrated across iPhone, iPad, and MacBook, to mitigate these risks through text modifications such as rewriting and tone adjustment. By developing early novel datasets specifically for this purpose, we empirically assess how different text modifications influence LLM-based detection. This capability suggests strong potential for Apple Intelligence's writing tools as privacy-preserving mechanisms. Our findings lay the groundwork for future adaptive rewriting systems capable of dynamically neutralizing sensitive emotional content to enhance user privacy. To the best of our knowledge, this research provides the first empirical analysis of Apple Intelligence's text-modification tools within a privacy-preservation context with the broader goal of developing on-device, user-centric privacy-preserving mechanisms to protect against LLMs-based advanced inference attacks on deployed systems.

Paperid: 1806, https://arxiv.org/pdf/2506.03549.pdf

Abstract:
Quantum key distribution (QKD) provides an information-theoretic way of securely exchanging secret keys, and typically relies on pre-shared keys or public keys for message authentication. To lift the requirement of pre-shared or public keys, Buhrman et. al. [SIAM J. Comput. 43, 150 (2014)] proposed utilizing the location of a party as a credential. Here, we extend upon the proposal, develop a QKD protocol with location credentials using quantum position verification (QPV) based message and identity authentication. By using QKD with delayed authentication as a base, and later simplifying QPV-based message authentication, we significantly reduce the number of QPV runs, which currently acts as a bottleneck. Besides demonstrating security for the proposed protocol, we also provide improvements to QPV security analysis, including generalization of the QPV adversary model, tightening a trace distance bound using semidefinite programming, and propose a multi-basis QPV requiring only BB84 state preparation but with multiple measurement basis.

Paperid: 1807, https://arxiv.org/pdf/2506.02066.pdf

Abstract:
As foundation models grow in both popularity and capability, researchers have uncovered a variety of ways that the models can pose a risk to the model's owner, user, or others. Despite the efforts of measuring these risks via benchmarks and cataloging them in AI risk taxonomies, there is little guidance for practitioners on how to determine which risks are relevant for a given foundation model use. In this paper, we address this gap and develop requirements and an initial design for a risk identification framework. To do so, we look to prior literature to identify challenges for building a foundation model risk identification framework and adapt ideas from usage governance to synthesize four design requirements. We then demonstrate how a candidate framework can addresses these design requirements and provide a foundation model use example to show how the framework works in practice for a small subset of risks.

Paperid: 1808, https://arxiv.org/pdf/2506.01849.pdf

Abstract:
This competition hosted on Kaggle (https://www.kaggle.com/competitions/trojan-horse-hunt-in-space) is the first part of a series of follow-up competitions and hackathons related to the "Assurance for Space Domain AI Applications" project funded by the European Space Agency (https://assurance-ai.space-codev.org/). The competition idea is based on one of the real-life AI security threats identified within the project -- the adversarial poisoning of continuously fine-tuned satellite telemetry forecasting models. The task is to develop methods for finding and reconstructing triggers (trojans) in advanced models for satellite telemetry forecasting used in safety-critical space operations. Participants are provided with 1) a large public dataset of real-life multivariate satellite telemetry (without triggers), 2) a reference model trained on the clean data, 3) a set of poisoned neural hierarchical interpolation (N-HiTS) models for time series forecasting trained on the dataset with injected triggers, and 4) Jupyter notebook with the training pipeline and baseline algorithm (the latter will be published in the last month of the competition). The main task of the competition is to reconstruct a set of 45 triggers (i.e., short multivariate time series segments) injected into the training data of the corresponding set of 45 poisoned models. The exact characteristics (i.e., shape, amplitude, and duration) of these triggers must be identified by participants. The popular Neural Cleanse method is adopted as a baseline, but it is not designed for time series analysis and new approaches are necessary for the task. The impact of the competition is not limited to the space domain, but also to many other safety-critical applications of advanced time series analysis where model poisoning may lead to serious consequences.

Paperid: 1809, https://arxiv.org/pdf/2506.01162.pdf

Abstract:
Estimating the density of a distribution from its samples is a fundamental problem in statistics. Hypothesis selection addresses the setting where, in addition to a sample set, we are given $n$ candidate distributions -- referred to as hypotheses -- and the goal is to determine which one best describes the underlying data distribution. This problem is known to be solvable very efficiently, requiring roughly $O(\log n)$ samples and running in $\tilde{O}(n)$ time. The quality of the output is measured via the total variation distance to the unknown distribution, and the approximation factor of the algorithm determines how large this distance is compared to the optimal distance achieved by the best candidate hypothesis. It is known that $Î±= 3$ is the optimal approximation factor for this problem. We study hypothesis selection under the constraint of differential privacy. We propose a differentially private algorithm in the central model that runs in nearly-linear time with respect to the number of hypotheses, achieves the optimal approximation factor, and incurs only a modest increase in sample complexity, which remains polylogarithmic in $n$. This resolves an open question posed by [Bun, Kamath, Steinke, Wu, NeurIPS 2019]. Prior to our work, existing upper bounds required quadratic time.

Paperid: 1810, https://arxiv.org/pdf/2506.00518.pdf

Abstract:
In this work, we present an efficient secure multi-party computation MPC protocol that provides strong security guarantees in settings with dishonest majority of participants who may behave arbitrarily. Unlike the popular MPC implementation known as SPDZ [Crypto '12], which only ensures security with abort, our protocol achieves both complete identifiability and robustness. With complete identifiability, honest parties can detect and unanimously agree on the identity of any malicious party. Robustness allows the protocol to continue with the computation without requiring a restart, even when malicious behavior is detected. Additionally, our approach addresses the performance limitations observed in the protocol by Cunningham et al. [ICITS '17], which, while achieving complete identifiability, is hindered by the costly exponentiation operations required by the choice of commitment scheme. Our protocol is based on the approach by Rivinius et al. [S&P '22], utilizing lattice-based commitment for better efficiency. We achieved robustness with the help of a semi-honest trusted third party. We benchmark our robust protocol, showing the efficient recovery from parties' malicious behavior. Finally, we benchmark our protocol on a ML-as-a-service scenario, wherein clients off-load the desired computation to the servers, and verify the computation result. We benchmark on linear ML inference, running on various datasets. While our efficiency is slightly lower compared to SPDZ's, we offer stronger security properties that provide distinct advantages.

Paperid: 1811, https://arxiv.org/pdf/2506.00461.pdf

Abstract:
As hardware design complexity increases, hardware fuzzing emerges as a promising tool for automating the verification process. However, a significant gap still exists before it can be applied in industry. This paper aims to summarize the current progress of hardware fuzzing from an industry-use perspective and propose solutions to bridge the gap between hardware fuzzing and industrial verification. First, we review recent hardware fuzzing methods and analyze their compatibilities with industrial verification. We establish criteria to assess whether a hardware fuzzing approach is compatible. Second, we examine whether current verification tools can efficiently support hardware fuzzing. We identify the bottlenecks in hardware fuzzing performance caused by insufficient support from the industrial environment. To overcome the bottlenecks, we propose a prototype, HwFuzzEnv, providing the necessary support for hardware fuzzing. With this prototype, the previous hardware fuzzing method can achieve a several hundred times speedup in industrial settings. Our work could serve as a reference for EDA companies, encouraging them to enhance their tools to support hardware fuzzing efficiently in industrial verification.

Paperid: 1812, https://arxiv.org/pdf/2505.23938.pdf

Abstract:
The ChatGPT Windows application offers better user interaction in the Windows operating system (OS) by enhancing productivity and streamlining the workflow of ChatGPT's utilization. However, there are potential misuses associated with this application that require rigorous forensic analysis. This study presents a holistic forensic analysis of the ChatGPT Windows application, focusing on identifying and recovering digital artifacts for investigative purposes. With the use of widely popular and openly available digital forensics tools such as Autopsy, FTK Imager, Magnet RAM Capture, Wireshark, and Hex Workshop, this research explores different methods to extract and analyze cache, chat logs, metadata, and network traffic from the application. Our key findings also demonstrate the history of the application's chat, user interactions, and system-level traces that can be recovered even after deletion, providing critical insights into the crime investigation and, thus, documenting and outlining a potential misuse report for digital forensics.

Paperid: 1813, https://arxiv.org/pdf/2505.23817.pdf

Abstract:
The system prompt in Large Language Models (LLMs) plays a pivotal role in guiding model behavior and response generation. Often containing private configuration details, user roles, and operational instructions, the system prompt has become an emerging attack target. Recent studies have shown that LLM system prompts are highly susceptible to extraction attacks through meticulously designed queries, raising significant privacy and security concerns. Despite the growing threat, there is a lack of systematic studies of system prompt extraction attacks and defenses. In this paper, we present a comprehensive framework, SPE-LLM, to systematically evaluate System Prompt Extraction attacks and defenses in LLMs. First, we design a set of novel adversarial queries that effectively extract system prompts in state-of-the-art (SOTA) LLMs, demonstrating the severe risks of LLM system prompt extraction attacks. Second, we propose three defense techniques to mitigate system prompt extraction attacks in LLMs, providing practical solutions for secure LLM deployments. Third, we introduce a set of rigorous evaluation metrics to accurately quantify the severity of system prompt extraction attacks in LLMs and conduct comprehensive experiments across multiple benchmark datasets, which validates the efficacy of our proposed SPE-LLM framework.

Paperid: 1814, https://arxiv.org/pdf/2505.22860.pdf

Abstract:
In enterprise settings, organizational data is segregated, siloed and carefully protected by elaborate access control frameworks. These access control structures can completely break down if an LLM fine-tuned on the siloed data serves requests, for downstream tasks, from individuals with disparate access privileges. We propose Permissioned LLMs (PermLLM), a new class of LLMs that superimpose the organizational data access control structures on query responses they generate. We formalize abstractions underpinning the means to determine whether access control enforcement happens correctly over LLM query responses. Our formalism introduces the notion of a relevant response that can be used to prove whether a PermLLM mechanism has been implemented correctly. We also introduce a novel metric, called access advantage, to empirically evaluate the efficacy of a PermLLM mechanism. We introduce three novel PermLLM mechanisms that build on Parameter Efficient Fine-Tuning to achieve the desired access control. We furthermore present two instantiations of access advantage--(i) Domain Distinguishability Index (DDI) based on Membership Inference Attacks, and (ii) Utility Gap Index (UGI) based on LLM utility evaluation. We demonstrate the efficacy of our PermLLM mechanisms through extensive experiments on four public datasets (GPQA, RCV1, SimpleQA, and WMDP), in addition to evaluating the validity of DDI and UGI metrics themselves for quantifying access control in LLMs.

Paperid: 1815, https://arxiv.org/pdf/2505.22735.pdf

Abstract:
To safeguard user data privacy, on-device inference has emerged as a prominent paradigm on mobile and Internet of Things (IoT) devices. This paradigm involves deploying a model provided by a third party on local devices to perform inference tasks. However, it exposes the private model to two primary security threats: model stealing (MS) and membership inference attacks (MIA). To mitigate these risks, existing wisdom deploys models within Trusted Execution Environments (TEEs), which is a secure isolated execution space. Nonetheless, the constrained secure memory capacity in TEEs makes it challenging to achieve full model security with low inference latency. This paper fills the gap with TensorShield, the first efficient on-device inference work that shields partial tensors of the model while still fully defending against MS and MIA. The key enabling techniques in TensorShield include: (i) a novel eXplainable AI (XAI) technique exploits the model's attention transition to assess critical tensors and shields them in TEE to achieve secure inference, and (ii) two meticulous designs with critical feature identification and latency-aware placement to accelerate inference while maintaining security. Extensive evaluations show that TensorShield delivers almost the same security protection as shielding the entire model inside TEE, while being up to 25.35$\times$ (avg. 5.85$\times$) faster than the state-of-the-art work, without accuracy loss.

Paperid: 1816, https://arxiv.org/pdf/2505.19260.pdf

Abstract:
LLM Agents are becoming central to intelligent systems. However, their deployment raises serious safety concerns. Existing defenses largely rely on "Safety Checks", which struggle to capture the complex semantic risks posed by harmful user inputs or unsafe agent behaviors - creating a significant semantic gap between safety checks and real-world risks. To bridge this gap, we propose a novel defense framework, ALRPHFS (Adversarially Learned Risk Patterns with Hierarchical Fast & Slow Reasoning). ALRPHFS consists of two core components: (1) an offline adversarial self-learning loop to iteratively refine a generalizable and balanced library of risk patterns, substantially enhancing robustness without retraining the base LLM, and (2) an online hierarchical fast & slow reasoning engine that balances detection effectiveness with computational efficiency. Experimental results demonstrate that our approach achieves superior overall performance compared to existing baselines, achieving a best-in-class average accuracy of 80% and exhibiting strong generalizability across agents and tasks.

Paperid: 1817, https://arxiv.org/pdf/2505.19004.pdf

Abstract:
In-host shared memory (IVSHMEM) enables high-throughput, zero-copy communication between virtual machines, but today's implementations lack any security control, allowing any application to eavesdrop or tamper with the IVSHMEM region. This paper presents Secure IVSHMEM, a protocol that provides end-to-end mutual authentication and fine-grained access enforcement with negligible performance cost. We combine three techniques to ensure security: (1) channel separation and kernel module access control, (2)hypervisor-mediated handshake for end-to-end service authentication, and (3)application-level integration for abstraction and performance mitigation. In microbenchmarks, Secure IVSHMEM completes its one-time handshake in under 200ms and sustains data-plane round-trip latencies within 5\% of the unmodified baseline, with negligible bandwidth overhead. We believe this design is ideally suited for safety and latency-critical in-host domains, such as automotive systems, where both performance and security are paramount.

Paperid: 1818, https://arxiv.org/pdf/2505.17416.pdf

Abstract:
Smart contracts are a key component of the Web 3.0 ecosystem, widely applied in blockchain services and decentralized applications. However, the automated execution feature of smart contracts makes them vulnerable to potential attacks due to inherent flaws, which can lead to severe security risks and financial losses, even threatening the integrity of the entire decentralized finance system. Currently, research on smart contract vulnerabilities has evolved from traditional program analysis methods to deep learning techniques, with the gradual introduction of Large Language Models. However, existing studies mainly focus on vulnerability detection, lacking systematic cause analysis and Vulnerability Repair. To address this gap, we propose LLM-BSCVM, a Large Language Model-based smart contract vulnerability management framework, designed to provide end-to-end vulnerability detection, analysis, repair, and evaluation capabilities for Web 3.0 ecosystem. LLM-BSCVM combines retrieval-augmented generation technology and multi-agent collaboration, introducing a three-stage method of Decompose-Retrieve-Generate. This approach enables smart contract vulnerability management through the collaborative efforts of six intelligent agents, specifically: vulnerability detection, cause analysis, repair suggestion generation, risk assessment, vulnerability repair, and patch evaluation. Experimental results demonstrate that LLM-BSCVM achieves a vulnerability detection accuracy and F1 score exceeding 91\% on benchmark datasets, comparable to the performance of state-of-the-art (SOTA) methods, while reducing the false positive rate from 7.2\% in SOTA methods to 5.1\%, thus enhancing the reliability of vulnerability management. Furthermore, LLM-BSCVM supports continuous security monitoring and governance of smart contracts through a knowledge base hot-swapping dynamic update mechanism.

Paperid: 1819, https://arxiv.org/pdf/2505.13942.pdf

Abstract:
Intelligent mechanisms implemented in autonomous vehicles, such as proactive driving assist and collision alerts, reduce traffic accidents. However, verifying their correct functionality is difficult due to complex interactions with the environment. This problem is exacerbated in adversarial environments, where an attacker can control the environment surrounding autonomous vehicles to exploit vulnerabilities. To preemptively identify vulnerabilities in these systems, in this paper, we implement a scenario-based framework with a formal method to identify the impact of malicious drivers interacting with autonomous vehicles. The formalization of the evaluation requirements utilizes metric temporal logic (MTL) to identify a safety condition that we want to test. Our goal is to find, through a rigorous testing approach, any trace that violates this MTL safety specification. Our results can help designers identify the range of safe operational behaviors that prevent malicious drivers from exploiting the autonomous features of modern vehicles.

Paperid: 1820, https://arxiv.org/pdf/2505.13848.pdf

Abstract:
Quantum computing leverages quantum mechanics to achieve computational advantages over classical hardware, but the use of third-party quantum compilers in the Noisy Intermediate-Scale Quantum (NISQ) era introduces risks of intellectual property (IP) exposure. We address this by proposing a novel obfuscation technique that protects proprietary quantum circuits by inserting additional quantum gates prior to compilation. These gates corrupt the measurement outcomes, which are later corrected through a lightweight classical post-processing step based on the inserted gate structure. Unlike prior methods that rely on complex quantum reversals, barriers, or physical-to-virtual qubit mapping, our approach achieves obfuscation using compiler-agnostic classical correction. We evaluate the technique across five benchmark quantum algorithms -- Shor's, QAOA, Bernstein-Vazirani, Grover's, and HHL -- using IBM's Qiskit framework. The results demonstrate high Total Variation Distance (above 0.5) and consistently negative Degree of Functional Corruption (DFC), confirming both statistical and functional obfuscation. This shows that our method is a practical and effective solution for the security of quantum circuit designs in untrusted compilation flows.

Paperid: 1821, https://arxiv.org/pdf/2505.12490.pdf

Abstract:
Googles A2A protocol provides a secure communication framework for AI agents but demonstrates critical limitations when handling highly sensitive information such as payment credentials and identity documents. These gaps increase the risk of unintended harms, including unauthorized disclosure, privilege escalation, and misuse of private data in generative multi-agent environments. In this paper, we identify key weaknesses of A2A: insufficient token lifetime control, lack of strong customer authentication, overbroad access scopes, and missing consent flows. We propose protocol-level enhancements grounded in a structured threat model for semi-trusted multi-agent systems. Our refinements introduce explicit consent orchestration, ephemeral scoped tokens, and direct user-to-service data channels to minimize exposure across time, context, and topology. Empirical evaluation using adversarial prompt injection tests shows that the enhanced protocol substantially reduces sensitive data leakage while maintaining low communication latency. Comparative analysis highlights the advantages of our approach over both the original A2A specification and related academic proposals. These contributions establish a practical path for evolving A2A into a privacy-preserving framework that mitigates unintended harms in multi-agent generative AI systems.

Paperid: 1822, https://arxiv.org/pdf/2505.12144.pdf

Abstract:
Consensus protocols used today in blockchains often rely on computational power or financial stakes - scarce resources. We propose a novel protocol using social capital - trust and influence from social interactions - as a non-transferable staking mechanism to ensure fairness and decentralization. The methodology integrates zero-knowledge proofs, verifiable credentials, a Whisk-like leader election, and an incentive scheme to prevent Sybil attacks and encourage engagement. The theoretical framework would enhance privacy and equity, though unresolved issues like off-chain bribery require further research. This work offers a new model aligned with modern social media behavior and lifestyle, with applications in finance, providing a practical insight for decentralized system development.

Paperid: 1823, https://arxiv.org/pdf/2505.11097.pdf

Abstract:
Federated Unlearning (FU) has emerged as a critical compliance mechanism for data privacy regulations, requiring unlearned clients to provide verifiable Proof of Federated Unlearning (PoFU) to auditors upon data removal requests. However, we uncover a significant privacy vulnerability: when gradient differences are used as PoFU, honest-but-curious auditors may exploit mathematical correlations between gradient differences and forgotten samples to reconstruct the latter. Such reconstruction, if feasible, would face three key challenges: (i) restricted auditor access to client-side data, (ii) limited samples derivable from individual PoFU, and (iii) high-dimensional redundancy in gradient differences. To overcome these challenges, we propose Inverting Gradient difference to Forgotten data (IGF), a novel learning-based reconstruction attack framework that employs Singular Value Decomposition (SVD) for dimensionality reduction and feature extraction. IGF incorporates a tailored pixel-level inversion model optimized via a composite loss that captures both structural and semantic cues. This enables efficient and high-fidelity reconstruction of large-scale samples, surpassing existing methods. To counter this novel attack, we design an orthogonal obfuscation defense that preserves PoFU verification utility while preventing sensitive forgotten data reconstruction. Experiments across multiple datasets validate the effectiveness of the attack and the robustness of the defense. The code is available at https://anonymous.4open.science/r/IGF.

Paperid: 1824, https://arxiv.org/pdf/2505.10965.pdf

Abstract:
The application and development of process mining techniques face significant challenges due to the lack of publicly available real-life event logs. One reason for companies to abstain from sharing their data are privacy and confidentiality concerns. Privacy concerns refer to personal data as specified in the GDPR and have been addressed in existing work by providing privacy-preserving techniques for event logs. However, the concept of confidentiality in event logs not pertaining to individuals remains unclear, although they might contain a multitude of sensitive business data. This work addresses confidentiality of process data based on the privacy and confidentiality engineering method (PCRE). PCRE interactively explores privacy and confidentiality requirements regarding process data with different stakeholders and defines privacy-preserving actions to address possible concerns. We co-construct and evaluate PCRE based on structured interviews with process analysts in two manufacturing companies. PCRE is generic, hence applicable in different application domains. The goal is to systematically scrutinize process data and balance the trade-off between privacy and utility loss.

Paperid: 1825, https://arxiv.org/pdf/2505.10500.pdf

Abstract:
Audio and speech data are increasingly used in machine learning applications such as speech recognition, speaker identification, and mental health monitoring. However, the passive collection of this data by audio listening devices raises significant privacy concerns. Fully homomorphic encryption (FHE) offers a promising solution by enabling computations on encrypted data and preserving user privacy. Despite its potential, prior attempts to apply FHE to audio processing have faced challenges, particularly in securely computing time frequency representations, a critical step in many audio tasks. Here, we addressed this gap by introducing a fully secure pipeline that computes, with FHE and quantized neural network operations, four fundamental time-frequency representations: Short-Time Fourier Transform (STFT), Mel filterbanks, Mel-frequency cepstral coefficients (MFCCs), and gammatone filters. Our methods also support the private computation of audio descriptors and convolutional neural network (CNN) classifiers. Besides, we proposed approximate STFT algorithms that lighten computation and bit use for statistical and machine learning analyses. We ran experiments on the VocalSet and OxVoc datasets demonstrating the fully private computation of our approach. We showed significant performance improvements with STFT approximation in private statistical analysis of audio markers, and for vocal exercise classification with CNNs. Our results reveal that our approximations substantially reduce error rates compared to conventional STFT implementations in FHE. We also demonstrated a fully private classification based on the raw audio for gender and vocal exercise classification. Finally, we provided a practical heuristic for parameter selection, making quantized approximate signal processing accessible to researchers and practitioners aiming to protect sensitive audio data.

Paperid: 1826, https://arxiv.org/pdf/2505.10273.pdf

Abstract:
Vehicle platooning, with vehicles traveling in close formation coordinated through Vehicle-to-Everything (V2X) communications, offers significant benefits in fuel efficiency and road utilization. However, it is vulnerable to sophisticated falsification attacks by authenticated insiders that can destabilize the formation and potentially cause catastrophic collisions. This paper addresses this challenge: misbehavior detection in vehicle platooning systems. We present AttentionGuard, a transformer-based framework for misbehavior detection that leverages the self-attention mechanism to identify anomalous patterns in mobility data. Our proposal employs a multi-head transformer-encoder to process sequential kinematic information, enabling effective differentiation between normal mobility patterns and falsification attacks across diverse platooning scenarios, including steady-state (no-maneuver) operation, join, and exit maneuvers. Our evaluation uses an extensive simulation dataset featuring various attack vectors (constant, gradual, and combined falsifications) and operational parameters (controller types, vehicle speeds, and attacker positions). Experimental results demonstrate that AttentionGuard achieves up to 0.95 F1-score in attack detection, with robust performance maintained during complex maneuvers. Notably, our system performs effectively with minimal latency (100ms decision intervals), making it suitable for real-time transportation safety applications. Comparative analysis reveals superior detection capabilities and establishes the transformer-encoder as a promising approach for securing Cooperative Intelligent Transport Systems (C-ITS) against sophisticated insider threats.

Paperid: 1827, https://arxiv.org/pdf/2505.09929.pdf

Abstract:
In recent years, consumer Internet of Things (IoT) devices have become widely used in daily life. With the popularity of devices, related security and privacy risks arise at the same time as they collect user-related data and transmit it to various service providers. Although China accounts for a larger share of the consumer IoT industry, current analyses on consumer IoT device traffic primarily focus on regions such as Europe, the United States, and Australia. Research on China, however, is currently relatively rare. This study constructs the first large-scale dataset about consumer IoT device traffic in China. Specifically, we propose a fine-grained traffic collection guidance covering the entire lifecycle of consumer IoT devices, gathering traffic from 77 devices spanning 38 brands and 12 device categories. Based on this dataset, we analyze traffic destinations and encryption practices across different device types during the entire lifecycle and compare the findings with the results of other regions. Compared to other regions, our results show that consumer IoT devices in China rely more on domestic services and overall perform better in terms of encryption practices. However, there are still 23/40 devices improperly conducting certificate validation, and 2/70 devices use insecure encryption protocols. To facilitate future research, we open-source our traffic collection guidance and make our dataset publicly available.

Paperid: 1828, https://arxiv.org/pdf/2505.09892.pdf

Abstract:
The untraceability of transactions facilitated by Ethereum mixing services like Tornado Cash poses significant challenges to blockchain security and financial regulation. Existing methods for correlating mixing accounts suffer from limited labeled data and vulnerability to noisy annotations, which restrict their practical applicability. In this paper, we propose StealthLink, a novel framework that addresses these limitations through cross-task domain-invariant feature learning. Our key innovation lies in transferring knowledge from the well-studied domain of blockchain anomaly detection to the data-scarce task of mixing transaction tracing. Specifically, we design a MixFusion module that constructs and encodes mixing subgraphs to capture local transactional patterns, while introducing a knowledge transfer mechanism that aligns discriminative features across domains through adversarial discrepancy minimization. This dual approach enables robust feature learning under label scarcity and distribution shifts. Extensive experiments on real-world mixing transaction datasets demonstrate that StealthLink achieves state-of-the-art performance, with 96.98\% F1-score in 10-shot learning scenarios. Notably, our framework shows superior generalization capability in imbalanced data conditions than conventional supervised methods. This work establishes the first systematic approach for cross-domain knowledge transfer in blockchain forensics, providing a practical solution for combating privacy-enhanced financial crimes in decentralized ecosystems.

Paperid: 1829, https://arxiv.org/pdf/2505.09602.pdf

Abstract:
Large Language Models (LLMs) are increasingly embedded in autonomous systems and public-facing environments, yet they remain susceptible to jailbreak vulnerabilities that may undermine their security and trustworthiness. Adversarial suffixes are considered to be the current state-of-the-art jailbreak, consistently outperforming simpler methods and frequently succeeding even in black-box settings. Existing defenses rely on access to the internal architecture of models limiting diverse deployment, increase memory and computation footprints dramatically, or can be bypassed with simple prompt engineering methods. We introduce $\textbf{Adversarial Suffix Filtering}$ (ASF), a lightweight novel model-agnostic defensive pipeline designed to protect LLMs against adversarial suffix attacks. ASF functions as an input preprocessor and sanitizer that detects and filters adversarially crafted suffixes in prompts, effectively neutralizing malicious injections. We demonstrate that ASF provides comprehensive defense capabilities across both black-box and white-box attack settings, reducing the attack efficacy of state-of-the-art adversarial suffix generation methods to below 4%, while only minimally affecting the target model's capabilities in non-adversarial scenarios.

Paperid: 1830, https://arxiv.org/pdf/2505.08978.pdf

Abstract:
Machine Learning as a Service (MLaaS) has gained important attraction as a means for deploying powerful predictive models, offering ease of use that enables organizations to leverage advanced analytics without substantial investments in specialized infrastructure or expertise. However, MLaaS platforms must be safeguarded against security and privacy attacks, such as model extraction (MEA) attacks. The increasing integration of explainable AI (XAI) within MLaaS has introduced an additional privacy challenge, as attackers can exploit model explanations particularly counterfactual explanations (CFs) to facilitate MEA. In this paper, we investigate the trade offs among model performance, privacy, and explainability when employing Differential Privacy (DP), a promising technique for mitigating CF facilitated MEA. We evaluate two distinct DP strategies: implemented during the classification model training and at the explainer during CF generation.

Paperid: 1832, https://arxiv.org/pdf/2505.08835.pdf

Abstract:
The advent of convenient and efficient fully unmanned stores equipped with artificial intelligence-based automated checkout systems marks a new era in retail. However, these systems have inherent artificial intelligence security vulnerabilities, which are exploited via adversarial patch attacks, particularly in physical environments. This study demonstrated that adversarial patches can severely disrupt object detection models used in unmanned stores, leading to issues such as theft, inventory discrepancies, and interference. We investigated three types of adversarial patch attacks -- Hiding, Creating, and Altering attacks -- and highlighted their effectiveness. We also introduce the novel color histogram similarity loss function by leveraging attacker knowledge of the color information of a target class object. Besides the traditional confusion-matrix-based attack success rate, we introduce a new bounding-boxes-based metric to analyze the practical impact of these attacks. Starting with attacks on object detection models trained on snack and fruit datasets in a digital environment, we evaluated the effectiveness of adversarial patches in a physical testbed that mimicked a real unmanned store with RGB cameras and realistic conditions. Furthermore, we assessed the robustness of these attacks in black-box scenarios, demonstrating that shadow attacks can enhance success rates of attacks even without direct access to model parameters. Our study underscores the necessity for robust defense strategies to protect unmanned stores from adversarial threats. Highlighting the limitations of the current defense mechanisms in real-time detection systems and discussing various proactive measures, we provide insights into improving the robustness of object detection models and fortifying unmanned retail environments against these attacks.

Paperid: 1833, https://arxiv.org/pdf/2505.08576.pdf

Abstract:
Recent legal frameworks have mandated the right to be forgotten, obligating the removal of specific data upon user requests. Machine Unlearning has emerged as a promising solution by selectively removing learned information from machine learning models. This paper presents MUBox, a comprehensive platform designed to evaluate unlearning methods in deep learning. MUBox integrates 23 advanced unlearning techniques, tested across six practical scenarios with 11 diverse evaluation metrics. It allows researchers and practitioners to (1) assess and compare the effectiveness of different machine unlearning methods across various scenarios; (2) examine the impact of current evaluation metrics on unlearning performance; and (3) conduct detailed comparative studies on machine unlearning in a unified framework. Leveraging MUBox, we systematically evaluate these unlearning methods in deep learning and uncover several key insights: (a) Even state-of-the-art unlearning methods, including those published in top-tier venues and winners of unlearning competitions, demonstrate inconsistent effectiveness across diverse scenarios. Prior research has predominantly focused on simplified settings, such as random forgetting and class-wise unlearning, highlighting the need for broader evaluations across more difficult unlearning tasks. (b) Assessing unlearning performance remains a non-trivial problem, as no single evaluation metric can comprehensively capture the effectiveness, efficiency, and preservation of model utility. Our findings emphasize the necessity of employing multiple metrics to achieve a balanced and holistic assessment of unlearning methods. (c) In the context of depoisoning, our evaluation reveals significant variability in the effectiveness of existing approaches, which is highly dependent on the specific type of poisoning attacks.

Paperid: 1834, https://arxiv.org/pdf/2505.08204.pdf

Abstract:
Developers are increasingly integrating Language Models (LMs) into their mobile apps to provide features such as chat-based assistants. To prevent LM misuse, they impose various restrictions, including limits on the number of queries, input length, and allowed topics. However, if the LM integration is insecure, attackers can bypass these restrictions and gain unrestricted access to the LM, potentially harming developers' reputations and leading to significant financial losses. This paper presents the first systematic study of insecure usage of LMs by Android apps. We first manually analyze a preliminary dataset of apps to investigate LM integration methods, construct a taxonomy that categorizes the LM usage restrictions implemented by the apps, and determine how to bypass them. Alarmingly, we can bypass restrictions in 127 out of 181 apps. Then, we develop LM-Scout, a fully automated tool to detect on a large-scale vulnerable usage of LMs in 2,950 mobile apps. LM-Scout shows that, in many cases (i.e., 120 apps), it is possible to find and exploit such security issues automatically. Finally, we identify the root causes for the identified issues and offer recommendations for secure LM integration.

Paperid: 1835, https://arxiv.org/pdf/2505.06454.pdf

Abstract:
Recent studies have shown that sponge attacks can significantly increase the energy consumption and inference latency of deep neural networks (DNNs). However, prior work has focused primarily on computer vision and natural language processing tasks, overlooking the growing use of lightweight AI models in sensing-based applications on resource-constrained devices, such as those in Internet of Things (IoT) environments. These attacks pose serious threats of energy depletion and latency degradation in systems where limited battery capacity and real-time responsiveness are critical for reliable operation. This paper makes two key contributions. First, we present the first systematic exploration of energy-latency sponge attacks targeting sensing-based AI models. Using wearable sensing-based AI as a case study, we demonstrate that sponge attacks can substantially degrade performance by increasing energy consumption, leading to faster battery drain, and by prolonging inference latency. Second, to mitigate such attacks, we investigate model pruning, a widely adopted compression technique for resource-constrained AI, as a potential defense. Our experiments show that pruning-induced sparsity significantly improves model resilience against sponge poisoning. We also quantify the trade-offs between model efficiency and attack resilience, offering insights into the security implications of model compression in sensing-based AI systems deployed in IoT environments.

Paperid: 1836, https://arxiv.org/pdf/2505.06299.pdf

Abstract:
As Spiking Neural Networks (SNNs) gain traction across various applications, understanding their security vulnerabilities becomes increasingly important. In this work, we focus on the adversarial attacks, which is perhaps the most concerning threat. An adversarial attack aims at finding a subtle input perturbation to fool the network's decision-making. We propose two novel adversarial attack algorithms for SNNs: an input-specific attack that crafts adversarial samples from specific dataset inputs and a universal attack that generates a reusable patch capable of inducing misclassification across most inputs, thus offering practical feasibility for real-time deployment. The algorithms are gradient-based operating in the spiking domain proving to be effective across different evaluation metrics, such as adversarial accuracy, stealthiness, and generation time. Experimental results on two widely used neuromorphic vision datasets, NMNIST and IBM DVS Gesture, show that our proposed attacks surpass in all metrics all existing state-of-the-art methods. Additionally, we present the first demonstration of adversarial attack generation in the sound domain using the SHD dataset.

Paperid: 1837, https://arxiv.org/pdf/2505.05619.pdf

Abstract:
The growing adoption of Large Language Models (LLMs) has influenced the development of their lighter counterparts-Small Language Models (SLMs)-to enable on-device deployment across smartphones and edge devices. These SLMs offer enhanced privacy, reduced latency, server-free functionality, and improved user experience. However, due to resource constraints of on-device environment, SLMs undergo size optimization through compression techniques like quantization, which can inadvertently introduce fairness, ethical and privacy risks. Critically, quantized SLMs may respond to harmful queries directly, without requiring adversarial manipulation, raising significant safety and trust concerns. To address this, we propose LiteLMGuard (LLMG), an on-device prompt guard that provides real-time, prompt-level defense for quantized SLMs. Additionally, our prompt guard is designed to be model-agnostic such that it can be seamlessly integrated with any SLM, operating independently of underlying architectures. Our LLMG formalizes prompt filtering as a deep learning (DL)-based prompt answerability classification task, leveraging semantic understanding to determine whether a query should be answered by any SLM. Using our curated dataset, Answerable-or-Not, we trained and fine-tuned several DL models and selected ELECTRA as the candidate, with 97.75% answerability classification accuracy. Our safety effectiveness evaluations demonstrate that LLMG defends against over 87% of harmful prompts, including both direct instruction and jailbreak attack strategies. We further showcase its ability to mitigate the Open Knowledge Attacks, where compromised SLMs provide unsafe responses without adversarial prompting. In terms of prompt filtering effectiveness, LLMG achieves near state-of-the-art filtering accuracy of 94%, with an average latency of 135 ms, incurring negligible overhead for users.

Paperid: 1838, https://arxiv.org/pdf/2505.05328.pdf

Abstract:
Nakamoto consensus are the most widely adopted decentralized consensus mechanism in cryptocurrency systems. Since it was proposed in 2008, many studies have focused on analyzing its security. Most of them focus on maximizing the profit of the adversary. Examples include the selfish mining attack [FC '14] and the recent riskless uncle maker (RUM) attack [CCS '23]. In this work, we introduce the Staircase-Unrestricted Uncle Maker (SUUM), the first block withholding attack targeting the timestamp-based Nakamoto-style blockchain. Through block withholding, timestamp manipulation, and difficulty risk control, SUUM adversaries are capable of launching persistent attacks with zero cost and minimal difficulty risk characteristics, indefinitely exploiting rewards from honest participants. This creates a self-reinforcing cycle that threatens the security of blockchains. We conduct a comprehensive and systematic evaluation of SUUM, including the attack conditions, its impact on blockchains, and the difficulty risks. Finally, we further discuss four feasible mitigation measures against SUUM.

Paperid: 1839, https://arxiv.org/pdf/2505.03100.pdf

Abstract:
Large language models (LLMs) have seen widespread adoption in many domains including digital forensics. While prior research has largely centered on case studies and examples demonstrating how LLMs can assist forensic investigations, deeper explorations remain limited, i.e., a standardized approach for precise performance evaluations is lacking. Inspired by the NIST Computer Forensic Tool Testing Program, this paper proposes a standardized methodology to quantitatively evaluate the application of LLMs for digital forensic tasks, specifically in timeline analysis. The paper describes the components of the methodology, including the dataset, timeline generation, and ground truth development. Additionally, the paper recommends using BLEU and ROUGE metrics for the quantitative evaluation of LLMs through case studies or tasks involving timeline analysis. Experimental results using ChatGPT demonstrate that the proposed methodology can effectively evaluate LLM-based forensic timeline analysis. Finally, we discuss the limitations of applying LLMs to forensic timeline analysis.

Paperid: 1840, https://arxiv.org/pdf/2505.02828.pdf

Abstract:
Explainable Artificial Intelligence (XAI) has emerged as a pillar of Trustworthy AI and aims to bring transparency in complex models that are opaque by nature. Despite the benefits of incorporating explanations in models, an urgent need is found in addressing the privacy concerns of providing this additional information to end users. In this article, we conduct a scoping review of existing literature to elicit details on the conflict between privacy and explainability. Using the standard methodology for scoping review, we extracted 57 articles from 1,943 studies published from January 2019 to December 2024. The review addresses 3 research questions to present readers with more understanding of the topic: (1) what are the privacy risks of releasing explanations in AI systems? (2) what current methods have researchers employed to achieve privacy preservation in XAI systems? (3) what constitutes a privacy preserving explanation? Based on the knowledge synthesized from the selected studies, we categorize the privacy risks and preservation methods in XAI and propose the characteristics of privacy preserving explanations to aid researchers and practitioners in understanding the requirements of XAI that is privacy compliant. Lastly, we identify the challenges in balancing privacy with other system desiderata and provide recommendations for achieving privacy preserving XAI. We expect that this review will shed light on the complex relationship of privacy and explainability, both being the fundamental principles of Trustworthy AI.

Paperid: 1841, https://arxiv.org/pdf/2505.02392.pdf

Abstract:
Privacy-focused cryptocurrencies like Monero remain popular, despite increasing regulatory scrutiny that has led to their delisting from major centralized exchanges. The latter also explains the recent popularity of decentralized exchanges (DEXs) with no centralized ownership structures. These platforms typically leverage peer-to-peer (P2P) networks, promising secure and anonymous asset trading. However, questions of liability remain, and the academic literature lacks comprehensive insights into the functionality, trading activity, and privacy claims of these P2P platforms. In this paper, we provide an early systematization of the current landscape of decentralized peer-to-peer exchanges within the Monero ecosystem. We examine several recently developed DEX platforms, analyzing their popularity, functionality, architectural choices, and potential weaknesses. We further identify and report on a privacy vulnerability in the recently popularized Haveno exchange, demonstrating that certain Haveno trades could be detected, allowing transactions to be linked across the Monero and Bitcoin blockchains. We hope that our findings can nourish the discussion in the research community about more secure designs, and provide insights for regulators.

Paperid: 1842, https://arxiv.org/pdf/2505.01123.pdf

Abstract:
In vulnerability detection, machine learning has been used as an effective static analysis technique, although it suffers from a significant rate of false positives. Contextually, in vulnerability discovery, fuzzing has been used as an effective dynamic analysis technique, although it requires manually writing fuzz drivers. Fuzz drivers usually target a limited subset of functions in a library that must be chosen according to certain criteria, e.g., the depth of a function, the number of paths. These criteria are verified by components called target oracles. In this work, we propose an automated fuzz driver generation workflow composed of: (1) identifying a likely vulnerable function by leveraging a machine learning for vulnerability detection model as a target oracle, (2) automatically generating fuzz drivers, (3) fuzzing the target function to find bugs which could confirm the vulnerability inferred by the target oracle. We show our method on an existing vulnerability in libgd, with a plan for large-scale evaluation.

Paperid: 1843, https://arxiv.org/pdf/2505.01028.pdf

Abstract:
Security vulnerabilities in Windows Active Directory (AD) systems are typically modeled using an attack graph and hardening AD systems involves an iterative workflow: security teams propose an edge to remove, and IT operations teams manually review these fixes before implementing the removal. As verification requires significant manual effort, we formulate an Adaptive Path Removal Problem to minimize the number of steps in this iterative removal process. In our model, a wizard proposes an attack path in each step and presents it as a set of multiple-choice options to the IT admin. The IT admin then selects one edge from the proposed set to remove. This process continues until the target $t$ is disconnected from source $s$ or the number of proposed paths reaches $B$. The model aims to optimize the human effort by minimizing the expected number of interactions between the IT admin and the security wizard. We first prove that the problem is $\mathcal{\#P}$-hard. We then propose a set of solutions including an exact algorithm, an approximate algorithm, and several scalable heuristics. Our best heuristic, called DPR, can operate effectively on larger-scale graphs compared to the exact algorithm and consistently outperforms the approximate algorithm across all graphs. We verify the effectiveness of our algorithms on several synthetic AD graphs and an AD attack graph collected from a real organization.

Paperid: 1844, https://arxiv.org/pdf/2505.00951.pdf

Abstract:
Large Language Model (LLM)-based recommendation systems leverage powerful language models to generate personalized suggestions by processing user interactions and preferences. Unlike traditional recommendation systems that rely on structured data and collaborative filtering, LLM-based models process textual and contextual information, often using cloud-based infrastructure. This raises privacy concerns, as user data is transmitted to remote servers, increasing the risk of exposure and reducing control over personal information. To address this, we propose a hybrid privacy-preserving recommendation framework which separates sensitive from nonsensitive data and only shares the latter with the cloud to harness LLM-powered recommendations. To restore lost recommendations related to obfuscated sensitive data, we design a de-obfuscation module that reconstructs sensitive recommendations locally. Experiments on real-world e-commerce datasets show that our framework achieves almost the same recommendation utility with a system which shares all data with an LLM, while preserving privacy to a large extend. Compared to obfuscation-only techniques, our approach improves HR@10 scores and category distribution alignment, offering a better balance between privacy and recommendation quality. Furthermore, our method runs efficiently on consumer-grade hardware, making privacy-aware LLM-based recommendation systems practical for real-world use.

Paperid: 1845, https://arxiv.org/pdf/2505.00881.pdf

Abstract:
While supervised deep neural networks (DNNs) have proven effective for device authentication via radio frequency (RF) fingerprinting, they are hindered by domain shift issues and the scarcity of labeled data. The success of large language models has led to increased interest in unsupervised pre-trained models (PTMs), which offer better generalization and do not require labeled datasets, potentially addressing the issues mentioned above. However, the inherent vulnerabilities of PTMs in RF fingerprinting remain insufficiently explored. In this paper, we thoroughly investigate data-free backdoor attacks on such PTMs in RF fingerprinting, focusing on a practical scenario where attackers lack access to downstream data, label information, and training processes. To realize the backdoor attack, we carefully design a set of triggers and predefined output representations (PORs) for the PTMs. By mapping triggers and PORs through backdoor training, we can implant backdoor behaviors into the PTMs, thereby introducing vulnerabilities across different downstream RF fingerprinting tasks without requiring prior knowledge. Extensive experiments demonstrate the wide applicability of our proposed attack to various input domains, protocols, and PTMs. Furthermore, we explore potential detection and defense methods, demonstrating the difficulty of fully safeguarding against our proposed backdoor attack.

Paperid: 1846, https://arxiv.org/pdf/2504.21618.pdf

Abstract:
IPv4, IPv6, and TCP have a common mechanism allowing one to split an original data packet into several chunks. Such chunked packets may have overlapping data portions and, OS network stack implementations may reassemble these overlaps differently. A Network Intrusion Detection System (NIDS) that tries to reassemble a given flow data has to use the same reassembly policy as the monitored host OS; otherwise, the NIDS or the host may be subject to attack. In this paper, we provide several contributions that enable us to analyze NIDS resistance to overlapping data chunks-based attacks. First, we extend state-of-the-art insertion and evasion attack characterizations to address their limitations in an overlap-based context. Second, we propose a new way to model overlap types using Allen's interval algebra, a spatio-temporal reasoning. This new modeling allows us to formalize overlap test cases, which ensures exhaustiveness in overlap coverage and eases the reasoning about and use of reassembly policies. Third, we analyze the reassembly behavior of several OSes and NIDSes when processing the modeled overlap test cases. We show that 1) OS reassembly policies evolve over time and 2) all the tested NIDSes are (still) vulnerable to overlap-based evasion and insertion attacks.

Paperid: 1847, https://arxiv.org/pdf/2504.21436.pdf

Abstract:
Federated Learning enables collaborative training of a global model across multiple geographically dispersed clients without the need for data sharing. However, it is susceptible to inference attacks, particularly label inference attacks. Existing studies on label distribution inference exhibits sensitive to the specific settings of the victim client and typically underperforms under defensive strategies. In this study, we propose a novel label distribution inference attack that is stable and adaptable to various scenarios. Specifically, we estimate the size of the victim client's dataset and construct several virtual clients tailored to the victim client. We then quantify the temporal generalization of each class label for the virtual clients and utilize the variation in temporal generalization to train an inference model that predicts the label distribution proportions of the victim client. We validate our approach on multiple datasets, including MNIST, Fashion-MNIST, FER2013, and AG-News. The results demonstrate the superiority of our method compared to state-of-the-art techniques. Furthermore, our attack remains effective even under differential privacy defense mechanisms, underscoring its potential for real-world applications.

Paperid: 1848, https://arxiv.org/pdf/2504.20700.pdf

Abstract:
This study introduces a cutting-edge architecture developed for the NewbornTime project, which uses advanced AI to analyze video data at birth and during newborn resuscitation, with the aim of improving newborn care. The proposed architecture addresses the crucial issues of patient consent, data security, and investing trust in healthcare by integrating Ethereum blockchain with cloud computing. Our blockchain-based consent application simplifies patient consent's secure and transparent management. We explain the smart contract mechanisms and privacy measures employed, ensuring data protection while permitting controlled data sharing among authorized parties. This work demonstrates the potential of combining blockchain and cloud technologies in healthcare, emphasizing their role in maintaining data integrity, with implications for computer science and healthcare innovation.

Paperid: 1849, https://arxiv.org/pdf/2504.20296.pdf

Abstract:
Blockchain technologies have overturned the digital finance industry by introducing a decentralized pseudonymous means of monetary transfer. The pseudonymous nature introduced privacy concerns, enabling various deanonymization techniques, which in turn spurred development of stronger anonymity-preserving measures. The purpose of this paper is to create a comprehensive survey of mixing techniques and implementations within the vast ecosystem surrounding anonymization tools and mechanisms available in blockchain cryptocurrencies. First, we begin by reviewing classifications used in the field. Then, we survey various obfuscation techniques, helping to delve into actual implementations and combinations of these techniques. Next, we identify the positive and negative attributes of the approaches and implementations included. Moreover, we examine the implications of anonymization tools for user privacy, including their effectiveness in preserving anonymity and susceptibility to attacks and vulnerabilities. Finally, we discuss the challenges and innovations for extending mixing services into the realm of smart contracts or cross-chain space.

Paperid: 1850, https://arxiv.org/pdf/2504.19674.pdf

Abstract:
Safety evaluation of Large Language Models (LLMs) has made progress and attracted academic interest, but it remains challenging to keep pace with the rapid integration of LLMs across diverse applications. Different applications expose users to various harms, necessitating application-specific safety evaluations with tailored harms and policies. Another major gap is the lack of focus on the dynamic and conversational nature of LLM systems. Such potential oversights can lead to harms that go unnoticed in standard safety benchmarks. This paper identifies the above as key requirements for robust LLM safety evaluation and recognizing that current evaluation methodologies do not satisfy these, we introduce the $\texttt{SAGE}$ (Safety AI Generic Evaluation) framework. $\texttt{SAGE}$ is an automated modular framework designed for customized and dynamic harm evaluations. It utilizes adversarial user models that are system-aware and have unique personalities, enabling a holistic red-teaming evaluation. We demonstrate $\texttt{SAGE}$'s effectiveness by evaluating seven state-of-the-art LLMs across three applications and harm policies. Our experiments with multi-turn conversational evaluations revealed a concerning finding that harm steadily increases with conversation length. Furthermore, we observe significant disparities in model behavior when exposed to different user personalities and scenarios. Our findings also reveal that some models minimize harmful outputs by employing severe refusal tactics that can hinder their usefulness. These insights highlight the necessity of adaptive and context-specific testing to ensure better safety alignment and safer deployment of LLMs in real-world scenarios.

Paperid: 1851, https://arxiv.org/pdf/2504.19567.pdf

Abstract:
The rapid development of generative image models has brought tremendous opportunities to AI-generated content (AIGC) creation, while also introducing critical challenges in ensuring content authenticity and copyright ownership. Existing image watermarking methods, though partially effective, often rely on post-processing or reference images, and struggle to balance fidelity, robustness, and tamper localization. To address these limitations, we propose GenPTW, an In-Generation image watermarking framework for latent diffusion models (LDMs), which integrates Provenance Tracing and Tamper Localization into a unified Watermark-based design. It embeds structured watermark signals during the image generation phase, enabling unified provenance tracing and tamper localization. For extraction, we construct a frequency-coordinated decoder to improve robustness and localization precision in complex editing scenarios. Additionally, a distortion layer that simulates AIGC editing is introduced to enhance robustness. Extensive experiments demonstrate that GenPTW outperforms existing methods in image fidelity, watermark extraction accuracy, and tamper localization performance, offering an efficient and practical solution for trustworthy AIGC image generation.

Paperid: 1852, https://arxiv.org/pdf/2504.19456.pdf

Abstract:
Graph-based detection methods leveraging Function Call Graphs (FCGs) have shown promise for Android malware detection (AMD) due to their semantic insights. However, the deployment of malware detectors in dynamic and hostile environments raises significant concerns about their robustness. While recent approaches evaluate the robustness of FCG-based detectors using adversarial attacks, their effectiveness is constrained by the vast perturbation space, particularly across diverse models and features. To address these challenges, we introduce FCGHunter, a novel robustness testing framework for FCG-based AMD systems. Specifically, FCGHunter employs innovative techniques to enhance exploration and exploitation within this huge search space. Initially, it identifies critical areas within the FCG related to malware behaviors to narrow down the perturbation space. We then develop a dependency-aware crossover and mutation method to enhance the validity and diversity of perturbations, generating diverse FCGs. Furthermore, FCGHunter leverages multi-objective feedback to select perturbed FCGs, significantly improving the search process with interpretation-based feature change feedback. Extensive evaluations across 40 scenarios demonstrate that FCGHunter achieves an average attack success rate of 87.9%, significantly outperforming baselines by at least 44.7%. Notably, FCGHunter achieves a 100% success rate on robust models (e.g., AdaBoost with MalScan), where baselines achieve only 11% or are inapplicable.

Paperid: 1853, https://arxiv.org/pdf/2504.18571.pdf

Abstract:
The rapid expansion of Internet of Things (IoT) devices, particularly in smart home environments, has introduced considerable security and privacy concerns due to their persistent connectivity and interaction with cloud services. Despite advancements in IoT security, effective privacy measures remain uncovered, with existing solutions often relying on cloud-based threat detection that exposes sensitive data or outdated allow-lists that inadequately restrict non-essential network traffic. This work presents ML-IoTrim, a system for detecting and mitigating non-essential IoT traffic (i.e., not influencing the device operations) by analyzing network behavior at the edge, leveraging Machine Learning to classify network destinations. Our approach includes building a labeled dataset based on IoT device behavior and employing a feature-extraction pipeline to enable a binary classification of essential vs. non-essential network destinations. We test our framework in a consumer smart home setup with IoT devices from five categories, demonstrating that the model can accurately identify and block non-essential traffic, including previously unseen destinations, without relying on traditional allow-lists. We implement our solution on a home access point, showing the framework has strong potential for scalable deployment, supporting near-real-time traffic classification in large-scale IoT environments with hundreds of devices. This research advances privacy-aware traffic control in smart homes, paving the way for future developments in IoT device privacy.

Paperid: 1854, https://arxiv.org/pdf/2504.18497.pdf

Abstract:
Empirical inference attacks are a popular approach for evaluating the privacy risk of data release mechanisms in practice. While an active attack literature exists to evaluate machine learning models or synthetic data release, we currently lack comparable methods for fixed aggregate statistics, in particular when only a limited number of statistics are released. We here propose an inference attack framework against fixed aggregate statistics and an attribute inference attack called DeSIA. We instantiate DeSIA against the U.S. Census PPMF dataset and show it to strongly outperform reconstruction-based attacks. In particular, we show DeSIA to be highly effective at identifying vulnerable users, achieving a true positive rate of 0.14 at a false positive rate of $10^{-3}$. We then show DeSIA to perform well against users whose attributes cannot be verified and when varying the number of aggregate statistics and level of noise addition. We also perform an extensive ablation study of DeSIA and show how DeSIA can be successfully adapted to the membership inference task. Overall, our results show that aggregation alone is not sufficient to protect privacy, even when a relatively small number of aggregates are being released, and emphasize the need for formal privacy mechanisms and testing before aggregate statistics are released.

Paperid: 1855, https://arxiv.org/pdf/2504.18369.pdf

Abstract:
Large Language Models (LLMs) are currently being integrated into industrial software applications to help users perform more complex tasks in less time. However, these LLM-Integrated Applications (LIA) expand the attack surface and introduce new kinds of threats. Threat modeling is commonly used to identify these threats and suggest mitigations. However, it is a time-consuming practice that requires the involvement of a security practitioner. Our goals are to 1) provide a method for performing threat modeling for LIAs early in their lifecycle, (2) develop a threat modeling tool that integrates existing threat models, and (3) ensure high-quality threat modeling. To achieve the goals, we work in collaboration with our industry partner. Our proposed way of performing threat modeling will benefit industry by requiring fewer security experts' participation and reducing the time spent on this activity. Our proposed tool combines LLMs and Retrieval Augmented Generation (RAG) and uses sources such as existing threat models and application architecture repositories to continuously create and update threat models. We propose to evaluate the tool offline -- i.e., using benchmarking -- and online with practitioners in the field. We conducted an early evaluation using ChatGPT on a simple LIA and obtained results that encouraged us to proceed with our research efforts.

Paperid: 1856, https://arxiv.org/pdf/2504.18131.pdf

Abstract:
Event reconstruction is a technique that examiners can use to attempt to infer past activities by analyzing digital artifacts. Despite its significance, the field suffers from fragmented research, with studies often focusing narrowly on aspects like timeline creation or tampering detection. This paper addresses the lack of a unified perspective by proposing a comprehensive framework for timeline-based event reconstruction, adapted from traditional forensic science models. We begin by harmonizing existing terminology and presenting a cohesive diagram that clarifies the relationships between key elements of the reconstruction process. Through a comprehensive literature survey, we classify and organize the main challenges, extending the discussion beyond common issues like data volume. Lastly, we highlight recent advancements and propose directions for future research, including specific research gaps. By providing a structured approach, key findings, and a clearer understanding of the underlying challenges, this work aims to strengthen the foundation of digital forensics.

Paperid: 1857, https://arxiv.org/pdf/2504.17953.pdf

Abstract:
Phishing detection on Ethereum has increasingly leveraged advanced machine learning techniques to identify fraudulent transactions. However, limited attention has been given to understanding the effectiveness of feature selection strategies and the role of graph-based models in enhancing detection accuracy. In this paper, we systematically examine these issues by analyzing and contrasting explicit transactional features and implicit graph-based features, both experimentally and analytically. We explore how different feature sets impact the performance of phishing detection models, particularly in the context of Ethereum's transactional network. Additionally, we address key challenges such as class imbalance and dataset composition and their influence on the robustness and precision of detection methods. Our findings demonstrate the advantages and limitations of each feature type, while also providing a clearer understanding of how feature affect model resilience and generalization in adversarial environments.

Paperid: 1858, https://arxiv.org/pdf/2504.17684.pdf

Abstract:
This paper explores the vulnerability of machine learning models to simple single-feature adversarial attacks in the context of Ethereum fraudulent transaction detection. Through comprehensive experimentation, we investigate the impact of various adversarial attack strategies on model performance metrics. Our findings, highlighting how prone those techniques are to simple attacks, are alarming, and the inconsistency in the attacks' effect on different algorithms promises ways for attack mitigation. We examine the effectiveness of different mitigation strategies, including adversarial training and enhanced feature selection, in enhancing model robustness and show their effectiveness.

Paperid: 1859, https://arxiv.org/pdf/2504.16887.pdf

Abstract:
The sponge is a cryptographic construction that turns a public permutation into a hash function. When instantiated with the Keccak permutation, the sponge forms the NIST SHA-3 standard. SHA-3 is a core component of most post-quantum public-key cryptography schemes slated for worldwide adoption. While one can consider many security properties for the sponge, the ultimate one is indifferentiability from a random oracle, or simply indifferentiability. The sponge was proved indifferentiable against classical adversaries by Bertoni et al. in 2008. Despite significant efforts in the years since, little is known about sponge security against quantum adversaries, even for simple properties like preimage or collision resistance beyond a single round. This is primarily due to the lack of a satisfactory quantum analog of the lazy sampling technique for permutations. In this work, we develop a specialized technique that overcomes this barrier in the case of the sponge. We prove that the sponge is in fact indifferentiable from a random oracle against quantum adversaries. Our result establishes that the domain extension technique behind SHA-3 is secure in the post-quantum setting. Our indifferentiability bound for the sponge is a loose $O(\mathsf{poly}(q) 2^{-\mathsf{min}(r, c)/4})$, but we also give bounds on preimage and collision resistance that are tighter.

Paperid: 1860, https://arxiv.org/pdf/2504.16116.pdf

Abstract:
Large Language Models (LLMs) have achieved impressive performance in diverse natural language processing tasks, but specialized domains such as Web3 present new challenges and require more tailored evaluation. Despite the significant user base and capital flows in Web3, encompassing smart contracts, decentralized finance (DeFi), non-fungible tokens (NFTs), decentralized autonomous organizations (DAOs), on-chain governance, and novel token-economics, no comprehensive benchmark has systematically assessed LLM performance in this domain. To address this gap, we introduce the DMind Benchmark, a holistic Web3-oriented evaluation suite covering nine critical subfields: fundamental blockchain concepts, blockchain infrastructure, smart contract, DeFi mechanisms, DAOs, NFTs, token economics, meme concept, and security vulnerabilities. Beyond multiple-choice questions, DMind Benchmark features domain-specific tasks such as contract debugging and on-chain numeric reasoning, mirroring real-world scenarios. We evaluated 26 models, including ChatGPT, Claude, DeepSeek, Gemini, Grok, and Qwen, uncovering notable performance gaps in specialized areas like token economics and security-critical contract analysis. While some models excel in blockchain infrastructure tasks, advanced subfields remain challenging. Our benchmark dataset and evaluation pipeline are open-sourced on https://huggingface.co/datasets/DMindAI/DMind_Benchmark, reaching number one in Hugging Face's trending dataset charts within a week of release.

Paperid: 1861, https://arxiv.org/pdf/2504.15674.pdf

Abstract:
Federated learning (FL) systems allow decentralized data-owning clients to jointly train a global model through uploading their locally trained updates to a centralized server. The property of decentralization enables adversaries to craft carefully designed backdoor updates to make the global model misclassify only when encountering adversary-chosen triggers. Existing defense mechanisms mainly rely on post-training detection after receiving updates. These methods either fail to identify updates which are deliberately fabricated statistically close to benign ones, or show inconsistent performance in different FL training stages. The effect of unfiltered backdoor updates will accumulate in the global model, and eventually become functional. Given the difficulty of ruling out every backdoor update, we propose a backdoor defense paradigm, which focuses on proactive robustification on the global model against potential backdoor attacks. We first reveal that the successful launching of backdoor attacks in FL stems from the lack of conflict between malicious and benign updates on redundant neurons of ML models. We proceed to prove the feasibility of activating redundant neurons utilizing out-of-distribution (OOD) samples in centralized settings, and migrating to FL settings to propose a novel backdoor defense mechanism, TrojanDam. The proposed mechanism has the FL server continuously inject fresh OOD mappings into the global model to activate redundant neurons, canceling the effect of backdoor updates during aggregation. We conduct systematic and extensive experiments to illustrate the superior performance of TrojanDam, over several SOTA backdoor defense methods across a wide range of FL settings.

Paperid: 1862, https://arxiv.org/pdf/2504.15246.pdf

Abstract:
The quest for a precise and contextually grounded answer to the question in the present paper's title resulted in this stirred-not-shaken triptych, a phrase that reflects our desire to deepen the theoretical basis, broaden the practical applicability, and reduce the misperception of differential privacy (DP)$\unicode{x2014}$all without shaking its core foundations. Indeed, given the existence of more than 200 formulations of DP (and counting), before even attempting to answer the titular question one must first precisely specify what it actually means to be DP. Motivated by this observation, a theoretical investigation into DP's fundamental essence resulted in Part I of this trio, which introduces a five-building-block system explicating the who, where, what, how and how much aspects of DP. Instantiating this system in the context of the United States Decennial Census, Part II then demonstrates the broader applicability and relevance of DP by comparing a swapping strategy like that used in 2010 with the TopDown Algorithm$\unicode{x2014}$a DP method adopted in the 2020 Census. This paper provides nontechnical summaries of the preceding two parts as well as new discussion$\unicode{x2014}$for example, on how greater awareness of the five building blocks can thwart privacy theatrics; how our results bridging traditional SDC and DP allow a data custodian to reap the benefits of both these fields; how invariants impact disclosure risk; and how removing the implicit reliance on aleatoric uncertainty could lead to new generalizations of DP.

Paperid: 1863, https://arxiv.org/pdf/2504.14421.pdf

Abstract:
The ubiquity of mobile applications has increased dramatically in recent years, opening up new opportunities for cyber attackers and heightening security concerns in the mobile ecosystem. As a result, researchers and practitioners have intensified their research into improving the security and privacy of mobile applications. At the same time, more and more mobile applications have appeared on the market that address the aforementioned security issues. However, both academia and industry currently lack a comprehensive overview of these mobile security applications for Android and iOS platforms, including their respective use cases and the security information they provide. To address this gap, we systematically collected a total of 410 mobile applications from both the App and Play Store. Then, we identified the 20 most widely utilized mobile security applications on both platforms that were analyzed and classified. Our results show six primary use cases and a wide range of security information provided by these applications, thus supporting the core functionalities for ensuring mobile security.

Paperid: 1864, https://arxiv.org/pdf/2504.14368.pdf

Abstract:
Differentially private (DP) machine learning often relies on the availability of public data for tasks like privacy-utility trade-off estimation, hyperparameter tuning, and pretraining. While public data assumptions may be reasonable in text and image domains, they are less likely to hold for tabular data due to tabular data heterogeneity across domains. We propose leveraging powerful priors to address this limitation; specifically, we synthesize realistic tabular data directly from schema-level specifications - such as variable names, types, and permissible ranges - without ever accessing sensitive records. To that end, this work introduces the notion of "surrogate" public data - datasets generated independently of sensitive data, which consume no privacy loss budget and are constructed solely from publicly available schema or metadata. Surrogate public data are intended to encode plausible statistical assumptions (informed by publicly available information) into a dataset with many downstream uses in private mechanisms. We automate the process of generating surrogate public data with large language models (LLMs); in particular, we propose two methods: direct record generation as CSV files, and automated structural causal model (SCM) construction for sampling records. Through extensive experiments, we demonstrate that surrogate public tabular data can effectively replace traditional public data when pretraining differentially private tabular classifiers. To a lesser extent, surrogate public data are also useful for hyperparameter tuning of DP synthetic data generators, and for estimating the privacy-utility tradeoff.

Paperid: 1865, https://arxiv.org/pdf/2504.13484.pdf

Abstract:
With the known vulnerability of neural networks to distribution shift, maintaining reliability in learning-enabled cyber-physical systems poses a salient challenge. In response, many existing methods adopt a detect and abstain methodology, aiming to detect distribution shift at inference time so that the learning-enabled component can abstain from decision-making. This approach, however, has limited use in real-world applications. We instead propose a monitor and recover paradigm as a promising direction for future research. This philosophy emphasizes 1) robust safety monitoring instead of distribution shift detection and 2) distribution shift recovery instead of abstention. We discuss two examples from our recent work.

Paperid: 1866, https://arxiv.org/pdf/2504.13212.pdf

Abstract:
In the digital age, e-commerce has transformed the way consumers shop, offering convenience and accessibility. Nevertheless, concerns about the privacy and security of personal information shared on these platforms have risen. In this work, we investigate user privacy violations, noting the risks of data leakage to third-party entities. Utilizing a semi-automated data collection approach, we examine a selection of popular online e-shops, revealing that nearly 30% of them violate user privacy by disclosing personal information to third parties. We unveil how minimal user interaction across multiple e-commerce websites can result in a comprehensive privacy breach. We observe significant data-sharing patterns with platforms like Facebook, which use personal information to build user profiles and link them to social media accounts.

Paperid: 1867, https://arxiv.org/pdf/2504.12577.pdf

Abstract:
Federated learning (FL) enables collaborative training of deep learning models without requiring data to leave local clients, thereby preserving client privacy. The aggregation process on the server plays a critical role in the performance of the resulting FL model. The most commonly used aggregation method is weighted averaging based on the amount of data from each client, which is thought to reflect each client's contribution. However, this method is prone to model bias, as dishonest clients might report inaccurate training data volumes to the server, which is hard to verify. To address this issue, we propose a novel secure \underline{Fed}erated \underline{D}ata q\underline{u}antity-\underline{a}ware weighted averaging method (FedDua). It enables FL servers to accurately predict the amount of training data from each client based on their local model gradients uploaded. Furthermore, it can be seamlessly integrated into any FL algorithms that involve server-side model aggregation. Extensive experiments on three benchmarking datasets demonstrate that FedDua improves the global model performance by an average of 3.17% compared to four popular FL aggregation methods in the presence of inaccurate client data volume declarations.

Paperid: 1868, https://arxiv.org/pdf/2504.09841.pdf

Abstract:
The proliferation of autonomous agents powered by large language models (LLMs) has revolutionized popular business applications dealing with tabular data, i.e., tabular agents. Although LLMs are observed to be vulnerable against prompt injection attacks from external data sources, tabular agents impose strict data formats and predefined rules on the attacker's payload, which are ineffective unless the agent navigates multiple layers of structural data to incorporate the payload. To address the challenge, we present a novel attack termed StruPhantom which specifically targets black-box LLM-powered tabular agents. Our attack designs an evolutionary optimization procedure which continually refines attack payloads via the proposed constrained Monte Carlo Tree Search augmented by an off-topic evaluator. StruPhantom helps systematically explore and exploit the weaknesses of target applications to achieve goal hijacking. Our evaluation validates the effectiveness of StruPhantom across various LLM-based agents, including those on real-world platforms, and attack scenarios. Our attack achieves over 50% higher success rates than baselines in enforcing the application's response to contain phishing links or malicious codes.

Paperid: 1869, https://arxiv.org/pdf/2504.09199.pdf

Abstract:
Social Virtual Reality (VR) platforms have surged in popularity, yet their security risks remain underexplored. This paper presents four novel UI attacks that covertly manipulate users into performing harmful actions through deceptive virtual content. Implemented on VRChat and validated in an IRB-approved study with 30 participants, these attacks demonstrate how deceptive elements can mislead users into malicious actions without their awareness. To address these vulnerabilities, we propose MetaScanner, a proactive countermeasure that rapidly analyzes objects and scripts in virtual worlds, detecting suspicious elements within seconds.

Paperid: 1870, https://arxiv.org/pdf/2504.08999.pdf

Abstract:
Large Language Models (LLMs) are increasingly augmented with external tools through standardized interfaces like the Model Context Protocol (MCP). However, current MCP implementations face critical limitations: they typically require local process execution through STDIO transports, making them impractical for resource-constrained environments like mobile devices, web browsers, and edge computing. We present MCP Bridge, a lightweight RESTful proxy that connects to multiple MCP servers and exposes their capabilities through a unified API. Unlike existing solutions, MCP Bridge is fully LLM-agnostic, supporting any backend regardless of vendor. The system implements a risk-based execution model with three security levels standard execution, confirmation workflow, and Docker isolation while maintaining backward compatibility with standard MCP clients. Complementing this server-side infrastructure is a Python based MCP Gemini Agent that facilitates natural language interaction with MCP tools. The evaluation demonstrates that MCP Bridge successfully addresses the constraints of direct MCP connections while providing enhanced security controls and cross-platform compatibility, enabling sophisticated LLM-powered applications in previously inaccessible environments

Paperid: 1871, https://arxiv.org/pdf/2504.06102.pdf

Abstract:
Physically distributed components and legacy protocols make the protection of power grids against increasing cyberattack threats challenging. Infamously, the 2015 and 2016 blackouts in Ukraine were caused by cyberattacks, and the German Federal Office for Information Security (BSI) recorded over 200 cyber incidents against the German energy sector between 2023 and 2024. Intrusion detection promises to quickly detect such attacks and mitigate the worst consequences. However, public datasets of realistic scenarios are vital to evaluate these systems. This paper introduces Sherlock, a dataset generated with the co-simulator Wattson. In total, Sherlock covers three scenarios with various attacks manipulating the process state by injecting malicious commands or manipulating measurement values. We additionally test five recently-published intrusion detection systems on Sherlock, highlighting specific challenges for intrusion detection in power grids. Dataset and documentation are available at https://sherlock.wattson.it/.

Paperid: 1872, https://arxiv.org/pdf/2504.04734.pdf

Abstract:
Recent studies reveal that experienced data practitioners often draw sketches to facilitate communication around privacy design concepts. However, there is limited understanding of how we can help novice students develop such communication skills. This paper studies methods for lowering novice data science students' barriers to creating high-quality privacy sketches. We first conducted a need-finding study (N=12) to identify barriers students face when sketching privacy designs. We then used a human-centered design approach to guide the method development, culminating in three simple, text-based heuristics. Our user studies with 24 data science students revealed that simply presenting three heuristics to the participants at the beginning of the study can enhance the coverage of privacy-related design decisions in sketches, reduce the mental effort required for creating sketches, and improve the readability of the final sketches.

Paperid: 1873, https://arxiv.org/pdf/2504.04394.pdf

Abstract:
Extensive research has shown that Automatic Speech Recognition (ASR) systems are vulnerable to audio adversarial attacks. Current attacks mainly focus on single-source scenarios, ignoring dual-source scenarios where two people are speaking simultaneously. To bridge the gap, we propose a Selective Masking Adversarial attack, namely SMA attack, which ensures that one audio source is selected for recognition while the other audio source is muted in dual-source scenarios. To better adapt to the dual-source scenario, our SMA attack constructs the normal dual-source audio from the muted audio and selected audio. SMA attack initializes the adversarial perturbation with a small Gaussian noise and iteratively optimizes it using a selective masking optimization algorithm. Extensive experiments demonstrate that the SMA attack can generate effective and imperceptible audio adversarial examples in the dual-source scenario, achieving an average success rate of attack of 100% and signal-to-noise ratio of 37.15dB on Conformer-CTC, outperforming the baselines.

Paperid: 1874, https://arxiv.org/pdf/2504.03077.pdf

Abstract:
Federated Learning (FL) has recently emerged as a promising paradigm for privacy-preserving, distributed machine learning. However, FL systems face significant security threats, particularly from adaptive adversaries capable of modifying their attack strategies to evade detection. One such threat is the presence of Reconnecting Malicious Clients (RMCs), which exploit FLs open connectivity by reconnecting to the system with modified attack strategies. To address this vulnerability, we propose integration of Identity-Based Identification (IBI) as a security measure within FL environments. By leveraging IBI, we enable FL systems to authenticate clients based on cryptographic identity schemes, effectively preventing previously disconnected malicious clients from re-entering the system. Our approach is implemented using the TNC-IBI (Tan-Ng-Chin) scheme over elliptic curves to ensure computational efficiency, particularly in resource-constrained environments like Internet of Things (IoT). Experimental results demonstrate that integrating IBI with secure aggregation algorithms, such as Krum and Trimmed Mean, significantly improves FL robustness by mitigating the impact of RMCs. We further discuss the broader implications of IBI in FL security, highlighting research directions for adaptive adversary detection, reputation-based mechanisms, and the applicability of identity-based cryptographic frameworks in decentralized FL architectures. Our findings advocate for a holistic approach to FL security, emphasizing the necessity of proactive defence strategies against evolving adaptive adversarial threats.

Paperid: 1875, https://arxiv.org/pdf/2503.23533.pdf

Abstract:
Digital forensics is a cornerstone of modern crime investigations, yet it raises significant privacy concerns due to the collection, processing, and storage of digital evidence. Despite that, privacy threats in digital forensics crime investigations often remain underexplored, thereby leading to potential gaps in forensic practices and regulatory compliance, which may then escalate into harming the freedoms of natural persons. With this clear motivation, the present paper applies the SPADA methodology for threat modelling with the goal of incorporating privacy-oriented threat modelling in digital forensics. As a result, we identify a total of 298 privacy threats that may affect digital forensics processes through crime investigations. Furthermore, we demonstrate an unexplored feature on how SPADA assists in handling domain-dependency during threat elicitation. This yields a second list of privacy threats that are universally applicable to any domain. We then present a comprehensive and systematic privacy threat model for digital forensics in crime investigation. Moreover, we discuss some of the challenges about validating privacy threats in this domain, particularly given the variability of legal frameworks across jurisdictions. We ultimately propose our privacy threat model as a tool for ensuring ethical and legally compliant investigative practices.

Paperid: 1876, https://arxiv.org/pdf/2503.23511.pdf

Abstract:
Federated Learning (FL) is a popular paradigm enabling clients to jointly train a global model without sharing raw data. However, FL is known to be vulnerable towards backdoor attacks due to its distributed nature. As participants, attackers can upload model updates that effectively compromise FL. What's worse, existing defenses are mostly designed under independent-and-identically-distributed (iid) settings, hence neglecting the fundamental non-iid characteristic of FL. Here we propose FLBuff for tackling backdoor attacks even under non-iids. The main challenge for such defenses is that non-iids bring benign and malicious updates closer, hence harder to separate. FLBuff is inspired by our insight that non-iids can be modeled as omni-directional expansion in representation space while backdoor attacks as uni-directional. This leads to the key design of FLBuff, i.e., a supervised-contrastive-learning model extracting penultimate-layer representations to create a large in-between buffer layer. Comprehensive evaluations demonstrate that FLBuff consistently outperforms state-of-the-art defenses.

Paperid: 1877, https://arxiv.org/pdf/2503.22685.pdf

Abstract:
In the world of science new technology have opened up the possibility to rely on advanced computational methods and models to conduct and produce scientific research. An important aspect of scientific and business workflows is provenance - which refers to the information describing the production, history or lineage of an end product, which can also be data, digitalized processes and other not tangible artifacts. While there are already systems, tools and standards to capture provenance of data and workflows the provenance of adaptations/changes in workflows has not been addressed yet. In this paper we carry out a literature review to establish the state of the art on this topic and present our methodology and findings. Our findings confirm that provenance of adaptation has not been addressed adequately in the fields of business and scientific workflows. The two fields also have different motivation for recording the lineage of data or processes. While scientific workflows are interested in reproducibility and visualization, business workflows solutions are indirectly connected to compliance, exception handling and analysis. The adaptive nature of workflows in both fields is not reflected in the research on process provenance yet, as our results show. The use of standard provenance standards is also not wide spread.

Paperid: 1878, https://arxiv.org/pdf/2503.22232.pdf

Abstract:
Traditional Neighbor Discovery (ND) and Secure Neighbor Discovery (SND) are key elements for network functionality. SND is a hard problem, satisfying not only typical security properties (authentication, integrity) but also verification of direct communication, which involves distance estimation based on time measurements and device coordinates. Defeating relay attacks, also known as "wormholes", leading to stealthy Byzantine links and significant degradation of communication and adversarial control, is key in many wireless networked systems. However, SND is not concerned with privacy; it necessitates revealing the identity and location of the device(s) participating in the protocol execution. This can be a deterrent for deployment, especially involving user-held devices in the emerging Internet of Things (IoT) enabled smart environments. To address this challenge, we present a novel Privacy-Preserving Secure Neighbor Discovery (PP-SND) protocol, enabling devices to perform SND without revealing their actual identities and locations, effectively decoupling discovery from the exposure of sensitive information. We use Homomorphic Encryption (HE) for computing device distances without revealing their actual coordinates, as well as employing a pseudonymous device authentication to hide identities while preserving communication integrity. PP-SND provides SND [1] along with pseudonymity, confidentiality, and unlinkability. Our presentation here is not specific to one wireless technology, and we assess the performance of the protocols (cryptographic overhead) on a Raspberry Pi 4 and provide a security and privacy analysis.

Paperid: 1879, https://arxiv.org/pdf/2503.20982.pdf

Abstract:
In this paper, we describe the motivation, design, security properties, and a prototype implementation of NickPay, a new privacy-preserving yet auditable payment system built on top of the Ethereum blockchain platform. NickPay offers a strong level of privacy to participants and prevents successive payment transfers from being linked to their actual owners. It is providing the transparency that blockchains ensure and at the same time, preserving the possibility for a trusted authority to access sensitive information, e.g., for audit purposes or compliance with financial regulations. NickPay builds upon the Nicknames for Group Signatures (NGS) scheme, a new signing system based on dynamic ``nicknames'' for signers that extends the schemes of group signatures and signatures with flexible public keys. NGS enables identified group members to expose their flexible public keys, thus allowing direct and natural applications such as auditable private payment systems, NickPay being a blockchain-based prototype of these.

Paperid: 1881, https://arxiv.org/pdf/2503.17830.pdf

Abstract:
Fingerprinting is a technique used to create behavioral profiles of systems to identify threats and weaknesses. When applied to cryptographic primitives and network protocols, it can be exploited by attackers for denial-of-service, key recovery, or downgrade attacks. In this paper, we evaluate the feasibility of fingerprinting post-quantum (PQ) algorithms by analyzing key exchange and digital signature primitives, their integration into protocols like TLS, SSH, QUIC, OpenVPN, and OIDC, and their usage in SNARK libraries (pysnark and lattice_zksnark). PQ algorithms differ from classical ones in memory and computation demands. We examine implementations across liboqs and CIRCL libraries on Windows, Ubuntu, and MacOS. Our experiments show that we can distinguish classical from PQ key exchange and signatures with 98% and 100% accuracy, respectively; identify the specific PQ algorithm used with 97% and 86% accuracy; distinguish between liboqs and CIRCL implementations with up to 100% accuracy; and identify PQ vs. hybrid implementations within CIRCL with 97% accuracy. In protocol-level analysis, we can detect the presence and type of PQ key exchange. SNARK libraries are distinguishable with 100% accuracy. To demonstrate real-world applicability, we apply our fingerprinting methods to the Tranco dataset to detect domains using PQ TLS and integrate our methods into QUARTZ, an open-source threat analysis tool developed by Cisco.

Paperid: 1882, https://arxiv.org/pdf/2503.16972.pdf

Abstract:
A Decentralized Identifier (DID) empowers an entity to prove control over a unique and self-issued identifier without relying on any identity provider. The public key material for the proof is encoded into an associated DID document (DDO). This is preferable shared via a distributed ledger because it guarantees algorithmically that everyone has access to the latest state of any tamper-proof DDO but only the entities in control of a DID are able to update theirs. Yet, it is possible to grant deputies the authority to update the DDO on behalf of the DID owner. However, the DID specification leaves largely open on how authorizations over a DDO are managed and enforced among multiple deputies. This article investigates what it means to govern a DID and discusses various forms of how a DID can be controlled by potentially more than one entity. It also presents a prototype of a DID-conform identifier management system where a selected set of governance policies are deployed as Smart Contracts. The article highlights the critical role of governance for the trustworthy and flexible deployment of ledger-anchored DIDs across various domains.

Paperid: 1883, https://arxiv.org/pdf/2503.16586.pdf

Abstract:
Generative AI (GenAI) browser assistants integrate powerful capabilities of GenAI in web browsers to provide rich experiences such as question answering, content summarization, and agentic navigation. These assistants, available today as browser extensions, can not only track detailed browsing activity such as search and click data, but can also autonomously perform tasks such as filling forms, raising significant privacy concerns. It is crucial to understand the design and operation of GenAI browser extensions, including how they collect, store, process, and share user data. To this end, we study their ability to profile users and personalize their responses based on explicit or inferred demographic attributes and interests of users. We perform network traffic analysis and use a novel prompting framework to audit tracking, profiling, and personalization by the ten most popular GenAI browser assistant extensions. We find that instead of relying on local in-browser models, these assistants largely depend on server-side APIs, which can be auto-invoked without explicit user interaction. When invoked, they collect and share webpage content, often the full HTML DOM and sometimes even the user's form inputs, with their first-party servers. Some assistants also share identifiers and user prompts with third-party trackers such as Google Analytics. The collection and sharing continues even if a webpage contains sensitive information such as health or personal information such as name or SSN entered in a web form. We find that several GenAI browser assistants infer demographic attributes such as age, gender, income, and interests and use this profile--which carries across browsing contexts--to personalize responses. In summary, our work shows that GenAI browser assistants can and do collect personal and sensitive information for profiling and personalization with little to no safeguards.

Paperid: 1884, https://arxiv.org/pdf/2503.15554.pdf

Abstract:
LLMs are widely used in software development. However, the code generated by LLMs often contains vulnerabilities. Several secure code generation methods have been proposed to address this issue, but their current evaluation schemes leave several concerns unaddressed. Specifically, most existing studies evaluate security and functional correctness separately, using different datasets. That is, they assess vulnerabilities using security-related code datasets while validating functionality with general code datasets. In addition, prior research primarily relies on a single static analyzer, CodeQL, to detect vulnerabilities in generated code, which limits the scope of security evaluation. In this work, we conduct a comprehensive study to systematically assess the improvements introduced by four state-of-the-art secure code generation techniques. Specifically, we apply both security inspection and functionality validation to the same generated code and evaluate these two aspects together. We also employ three popular static analyzers and two LLMs to identify potential vulnerabilities in the generated code. Our study reveals that existing techniques often compromise the functionality of generated code to enhance security. Their overall performance remains limited when evaluating security and functionality together. In fact, many techniques even degrade the performance of the base LLM. Our further inspection reveals that these techniques often either remove vulnerable lines of code entirely or generate ``garbage code'' that is unrelated to the intended task. Moreover, the commonly used static analyzer CodeQL fails to detect several vulnerabilities, further obscuring the actual security improvements achieved by existing techniques. Our study serves as a guideline for a more rigorous and comprehensive evaluation of secure code generation performance in future work.

Paperid: 1885, https://arxiv.org/pdf/2503.15238.pdf

Abstract:
Mobile apps frequently use Bluetooth Low Energy (BLE) and WiFi scanning permissions to discover nearby devices like peripherals and connect to WiFi Access Points (APs). However, wireless interfaces also serve as a covert proxy for geolocation data, enabling continuous user tracking and profiling. This includes technologies like BLE beacons, which are BLE devices broadcasting unique identifiers to determine devices' indoor physical locations; such beacons are easily found in shopping centres. Despite the widespread use of wireless scanning APIs and their potential for privacy abuse, the interplay between commercial mobile SDKs with wireless sensing and beaconing technologies remains largely unexplored. In this work, we conduct the first systematic analysis of 52 wireless-scanning SDKs, revealing their data collection practices and privacy risks. We develop a comprehensive analysis pipeline that enables us to detect beacon scanning capabilities, inject wireless events to trigger app behaviors, and monitor runtime execution on instrumented devices. Our findings show that 86% of apps integrating these SDKs collect at least one sensitive data type, including device and user identifiers such as AAID, email, along with GPS coordinates, WiFi and Bluetooth scan results. We uncover widespread SDK-to-SDK data sharing and evidence of ID bridging, where persistent and resettable identifiers are shared and synchronized within SDKs embedded in applications to potentially construct detailed mobility profiles, compromising user anonymity and enabling long-term tracking. We provide evidence of key actors engaging in these practices and conclude by proposing mitigation strategies such as stronger SDK sandboxing, stricter enforcement of platform policies, and improved transparency mechanisms to limit unauthorized tracking.

Paperid: 1886, https://arxiv.org/pdf/2503.13550.pdf

Abstract:
The increasing adoption of data-driven applications in education such as in learning analytics and AI in education has raised significant privacy and data protection concerns. While these challenges have been widely discussed in previous works, there are still limited practical solutions. Federated learning has recently been discoursed as a promising privacy-preserving technique, yet its application in education remains scarce. This paper presents an experimental evaluation of federated learning for educational data prediction, comparing its performance to traditional non-federated approaches. Our findings indicate that federated learning achieves comparable predictive accuracy. Furthermore, under adversarial attacks, federated learning demonstrates greater resilience compared to non-federated settings. We summarise that our results reinforce the value of federated learning as a potential approach for balancing predictive performance and privacy in educational contexts.

Paperid: 1887, https://arxiv.org/pdf/2503.11404.pdf

Abstract:
Semantic watermarking methods enable the direct integration of watermarks into the generation process of latent diffusion models by only modifying the initial latent noise. One line of approaches building on Gaussian Shading relies on cryptographic primitives to steer the sampling process of the latent noise. However, we identify several issues in the usage of cryptographic techniques in Gaussian Shading, particularly in its proof of lossless performance and key management, causing ambiguity in follow-up works, too. In this work, we therefore revisit the cryptographic primitives for semantic watermarking. We introduce a novel, general proof of lossless performance based on IND\$-CPA security for semantic watermarks. We then discuss the configuration of the cryptographic primitives in semantic watermarks with respect to security, efficiency, and generation quality.

Paperid: 1888, https://arxiv.org/pdf/2503.10269.pdf

Abstract:
Protecting the use of audio datasets is a major concern for data owners, particularly with the recent rise of audio deep learning models. While watermarks can be used to protect the data itself, they do not allow to identify a deep learning model trained on a protected dataset. In this paper, we adapt to audio data the recently introduced data taggants approach. Data taggants is a method to verify if a neural network was trained on a protected image dataset with top-$k$ predictions access to the model only. This method relies on a targeted data poisoning scheme by discreetly altering a small fraction (1%) of the dataset as to induce a harmless behavior on out-of-distribution data called keys. We evaluate our method on the Speechcommands and the ESC50 datasets and state of the art transformer models, and show that we can detect the use of the dataset with high confidence without loss of performance. We also show the robustness of our method against common data augmentation techniques, making it a practical method to protect audio datasets.

Paperid: 1889, https://arxiv.org/pdf/2503.10063.pdf

Abstract:
We consider the problem of securely and robustly embedding covert messages into an image-based diffusion model's output. The sender and receiver want to exchange the maximum amount of information possible per diffusion sampled image while remaining undetected. The adversary wants to detect that such communication is taking place by identifying those diffusion samples that contain covert messages. To maximize robustness to transformations of the diffusion sample, a strategy is for the sender and the receiver to embed the message in the initial latents. We first show that prior work that attempted this is easily broken because their embedding technique alters the latents' distribution. We then propose a straightforward method to embed covert messages in the initial latent {\em without} altering the distribution. We prove that our construction achieves indistinguishability to any probabilistic polynomial time adversary. Finally, we discuss and analyze empirically the tradeoffs between embedding capacity, message recovery rates, and robustness. We find that optimizing the inversion method for error correction is crucial for reliability.

Paperid: 1890, https://arxiv.org/pdf/2503.09375.pdf

Abstract:
Quantum computing is an emerging paradigm with the potential to transform numerous application areas by addressing problems considered intractable in the classical domain. However, its integration into cyberspace introduces significant security and privacy challenges. The exponential rise in cyber attacks, further complicated by quantum capabilities, poses serious risks to financial systems and national security. The scope of quantum threats extends beyond traditional software, operating system, and network vulnerabilities, necessitating a shift in cybersecurity education. Traditional cybersecurity education, often reliant on didactic methods, lacks hands on, student centered learning experiences necessary to prepare students for these evolving challenges. There is an urgent need for curricula that address both classical and quantum security threats through experiential learning. In this work, we present the design and evaluation of EE 597: Introduction to Hardware Security, a graduate level course integrating hands-on quantum security learning with classical security concepts through simulations and cloud-based quantum hardware. Unlike conventional courses focused on quantum threats to cryptographic systems, EE 597 explores security challenges specific to quantum computing itself. We employ a mixed-methods evaluation using pre and post surveys to assess student learning outcomes and engagement. Results indicate significant improvements in students' understanding of quantum and hardware security, with strong positive feedback on course structure and remote instruction (mean scores: 3.33 to 3.83 on a 4 point scale).

Paperid: 1891, https://arxiv.org/pdf/2503.07568.pdf

Abstract:
Rapid adoption of AI technologies raises several major security concerns, including the risks of adversarial perturbations, which threaten the confidentiality and integrity of AI applications. Protecting AI hardware from misuse and diverse security threats is a challenging task. To address this challenge, we propose SAMURAI, a novel framework for safeguarding against malicious usage of AI hardware and its resilience to attacks. SAMURAI introduces an AI Performance Counter (APC) for tracking dynamic behavior of an AI model coupled with an on-chip Machine Learning (ML) analysis engine, known as TANTO (Trained Anomaly Inspection Through Trace Observation). APC records the runtime profile of the low-level hardware events of different AI operations. Subsequently, the summary information recorded by the APC is processed by TANTO to efficiently identify potential security breaches and ensure secure, responsible use of AI. SAMURAI enables real-time detection of security threats and misuse without relying on traditional software-based solutions that require model integration. Experimental results demonstrate that SAMURAI achieves up to 97% accuracy in detecting adversarial attacks with moderate overhead on various AI models, significantly outperforming conventional software-based approaches. It enhances security and regulatory compliance, providing a comprehensive solution for safeguarding AI against emergent threats.

Paperid: 1892, https://arxiv.org/pdf/2503.07199.pdf

Abstract:
Recent methods for auditing the privacy of machine learning algorithms have improved computational efficiency by simultaneously intervening on multiple training examples in a single training run. Steinke et al. (2024) prove that one-run auditing indeed lower bounds the true privacy parameter of the audited algorithm, and give impressive empirical results. Their work leaves open the question of how precisely one-run auditing can uncover the true privacy parameter of an algorithm, and how that precision depends on the audited algorithm. In this work, we characterize the maximum achievable efficacy of one-run auditing and show that the key barrier to its efficacy is interference between the observable effects of different data elements. We present new conceptual approaches to minimize this barrier, towards improving the performance of one-run auditing of real machine learning algorithms.

Paperid: 1893, https://arxiv.org/pdf/2503.04652.pdf

Abstract:
The requirement for privacy-aware machine learning increases as we continue to use PII (Personally Identifiable Information) within machine training. To overcome these privacy issues, we can apply Fully Homomorphic Encryption (FHE) to encrypt data before it is fed into a machine learning model. This involves creating a homomorphic encryption key pair, and where the associated public key will be used to encrypt the input data, and the private key will decrypt the output. But, there is often a performance hit when we use homomorphic encryption, and so this paper evaluates the performance overhead of using the SVM machine learning technique with the OpenFHE homomorphic encryption library. This uses Python and the scikit-learn library for its implementation. The experiments include a range of variables such as multiplication depth, scale size, first modulus size, security level, batch size, and ring dimension, along with two different SVM models, SVM-Poly and SVM-Linear. Overall, the results show that the two main parameters which affect performance are the ring dimension and the modulus size, and that SVM-Poly and SVM-Linear show similar performance levels.

Paperid: 1894, https://arxiv.org/pdf/2503.03160.pdf

Abstract:
Specialized machine learning (ML) models tailored to users needs and requests are increasingly being deployed on smart devices with cameras, to provide personalized intelligent services taking advantage of camera data. However, two primary challenges hinder the training of such models: the lack of publicly available labeled data suitable for specialized tasks and the inaccessibility of labeled private data due to concerns about user privacy. To address these challenges, we propose a novel system SpinML, where the server generates customized Synthetic image data to Privately traIN a specialized ML model tailored to the user request, with the usage of only a few sanitized reference images from the user. SpinML offers users fine-grained, object-level control over the reference images, which allows user to trade between the privacy and utility of the generated synthetic data according to their privacy preferences. Through experiments on three specialized model training tasks, we demonstrate that our proposed system can enhance the performance of specialized models without compromising users privacy preferences.

Paperid: 1895, https://arxiv.org/pdf/2503.02862.pdf

Abstract:
With the growing adoption of privacy-preserving machine learning algorithms, such as Differentially Private Stochastic Gradient Descent (DP-SGD), training or fine-tuning models on private datasets has become increasingly prevalent. This shift has led to the need for models offering varying privacy guarantees and utility levels to satisfy diverse user requirements. However, managing numerous versions of large models introduces significant operational challenges, including increased inference latency, higher resource consumption, and elevated costs. Model deduplication is a technique widely used by many model serving and database systems to support high-performance and low-cost inference queries and model diagnosis queries. However, none of the existing model deduplication works has considered privacy, leading to unbounded aggregation of privacy costs for certain deduplicated models and inefficiencies when applied to deduplicate DP-trained models. We formalize the problems of deduplicating DP-trained models for the first time and propose a novel privacy- and accuracy-aware deduplication mechanism to address the problems. We developed a greedy strategy to select and assign base models to target models to minimize storage and privacy costs. When deduplicating a target model, we dynamically schedule accuracy validations and apply the Sparse Vector Technique to reduce the privacy costs associated with private validation data. Compared to baselines that do not provide privacy guarantees, our approach improved the compression ratio by up to $35\times$ for individual models (including large language models and vision transformers). We also observed up to $43\times$ inference speedup due to the reduction of I/O operations.

Paperid: 1896, https://arxiv.org/pdf/2503.02772.pdf

Abstract:
The BitVM and BitVMX protocols have long relied on inefficient one-time signature (OTS) schemes like Lamport and Winternitz for signing program inputs. These schemes exhibit significant storage overheads, hindering their practical application. This paper introduces ESSPI, an optimized method leveraging ECDSA/Schnorr signatures to sign the BitVMX program input. With Schnorr signatures we achieve an optimal 1:1 data expansion, compared to the current known best ratio of 1:200 based on Winternitz signatures. To accomplish this we introduce 4 innovations to BitVMX: (1) a modification of the BitVMX CPU, adding a challengeable hashing core to it, (2) a new partition-based search to detect fraud during hashing, (3) a new enhanced transaction DAG with added data-carrying transactions with a fraud-verifying smart-contract and (4) a novel timelock-based method for proving data availability to Bitcoin smart contracts. The enhanced BitVMX protocol enables the verification of uncompressed inputs such as SPV proofs, NiPoPoWs, or longer computation integrity proofs, such as STARKs.

Paperid: 1897, https://arxiv.org/pdf/2503.02176.pdf

Abstract:
In this paper, we propose a secure two-party computation protocol for dynamic controllers using a secret sharing scheme. The proposed protocol realizes outsourcing of controller computation to two servers, while controller parameters, states, inputs, and outputs are kept secret against the servers. Unlike previous encrypted controls in a single-server setting, the proposed method can operate a dynamic controller for an infinite time horizon without controller state decryption or input re-encryption. We show that the control performance achievable by the proposed protocol can be made arbitrarily close to that attained by the unencrypted controller. Furthermore, system-theoretic and cryptographic modifications of the protocol are presented to improve the communication complexity. The feasibility of the protocol is demonstrated through numerical examples of PID and observer-based controls.

Paperid: 1898, https://arxiv.org/pdf/2502.20589.pdf

Abstract:
As Large Language Models (LLMs) become increasingly integrated into many technological ecosystems across various domains and industries, identifying which model is deployed or being interacted with is critical for the security and trustworthiness of the systems. Current verification methods typically rely on analyzing the generated output to determine the source model. However, these techniques are susceptible to adversarial attacks, operate in a post-hoc manner, and may require access to the model weights to inject a verifiable fingerprint. In this paper, we propose a novel passive and non-invasive fingerprinting technique that operates in real-time and remains effective even under encrypted network traffic conditions. Our method leverages the intrinsic autoregressive generation nature of language models, which generate text one token at a time based on all previously generated tokens, creating a unique temporal pattern like a rhythm or heartbeat that persists even when the output is streamed over a network. We find that measuring the Inter-Token Times (ITTs)-time intervals between consecutive tokens-can identify different language models with high accuracy. We develop a Deep Learning (DL) pipeline to capture these timing patterns using network traffic analysis and evaluate it on 16 Small Language Models (SLMs) and 10 proprietary LLMs across different deployment scenarios, including local host machine (GPU/CPU), Local Area Network (LAN), Remote Network, and Virtual Private Network (VPN). The experimental results confirm that our proposed technique is effective and maintains high accuracy even when tested in different network conditions. This work opens a new avenue for model identification in real-world scenarios and contributes to more secure and trustworthy language model deployment.

Paperid: 1899, https://arxiv.org/pdf/2502.20234.pdf

Abstract:
The most widespread type of phishing attack involves email messages with links pointing to malicious content. Despite user training and the use of detection techniques, these attacks are still highly effective. Recent studies show that it is user inattentiveness, rather than lack of education, that is one of the key factors in successful phishing attacks. To this end, we develop a novel phishing defense mechanism based on URL inspection tasks: small challenges (loosely inspired by CAPTCHAs) that, to be solved, require users to interact with, and understand, the basic URL structure. We implemented and evaluated three tasks that act as ``barriers'' to visiting the website: (1) correct click-selection from a list of URLs, (2) mouse-based highlighting of the domain-name URL component, and (3) re-typing the domain-name. These tasks follow best practices in security interfaces and warning design. We assessed the efficacy of these tasks through an extensive on-line user study with 2,673 participants from three different cultures, native languages, and alphabets. Results show that these tasks significantly decrease the rate of successful phishing attempts, compared to the baseline case. Results also showed the highest efficacy for difficult URLs, such as typo-squats, with which participants struggled the most. This highlights the importance of (1) slowing down users while focusing their attention and (2) helping them understand the URL structure (especially, the domain-name component thereof) and matching it to their intent.

Paperid: 1900, https://arxiv.org/pdf/2502.17772.pdf

Abstract:
Differentially Private Stochastic Gradient Descent (DPSGD) is widely used to protect sensitive data during the training of machine learning models, but its privacy guarantees often come at the cost of model performance, largely due to the inherent challenge of accurately quantifying privacy loss. While recent efforts have strengthened privacy guarantees by focusing solely on the final output and bounded domain cases, they still impose restrictive assumptions, such as convexity and other parameter limitations, and often lack a thorough analysis of utility. In this paper, we provide rigorous privacy and utility characterization for DPSGD for smooth loss functions in both bounded and unbounded domains. We track the privacy loss over multiple iterations by exploiting the noisy smooth-reduction property and establish the utility analysis by leveraging the projection's non-expansiveness and clipped SGD properties. In particular, we show that for DPSGD with a bounded domain, (i) the privacy loss can still converge without the convexity assumption, and (ii) a smaller bounded diameter can improve both privacy and utility simultaneously under certain conditions. Numerical results validate our results.

Paperid: 1901, https://arxiv.org/pdf/2502.17658.pdf

Abstract:
The rise of on-chip accelerators signifies a major shift in computing, driven by the growing demands of artificial intelligence (AI) and specialized applications. These accelerators have gained popularity due to their ability to substantially boost performance, cut energy usage, lower total cost of ownership (TCO), and promote sustainability. Intel's Advanced Matrix Extensions (AMX) is one such on-chip accelerator, specifically designed for handling tasks involving large matrix multiplications commonly used in machine learning (ML) models, image processing, and other computational-heavy operations. In this paper, we introduce a novel value-dependent timing side-channel vulnerability in Intel AMX. By exploiting this weakness, we demonstrate a software-based, value-dependent timing side-channel attack capable of inferring the sparsity of neural network weights without requiring any knowledge of the confidence score, privileged access or physical proximity. Our attack method can fully recover the sparsity of weights assigned to 64 input elements within 50 minutes, which is 631% faster than the maximum leakage rate achieved in the Hertzbleed attack.

Paperid: 1902, https://arxiv.org/pdf/2502.16272.pdf

Abstract:
In many areas of cybersecurity, we require access to Personally Identifiable Information (PII), such as names, postal addresses and email addresses. Unfortunately, this can lead to data breaches, especially in relation to data compliance regulations such as GDPR. An Internet Protocol (IP) address is an identifier that is assigned to a networked device to enable it to communicate over networks that use IP. Thus, in applications which are privacy-aware, we may aim to hide the IP address while aiming to determine if the address comes from a blacklist. One solution to this is to use homomorphic encryption to match an encrypted version of an IP address to a blacklisted network list. This matching allows us to encrypt the IP address and match it to an encrypted version of a blacklist. In this paper, we use the OpenFHE library [1] to encrypt network addresses with the BFV homomorphic encryption scheme. In order to assess the performance overhead of BFV, we implement a matching method using the OpenFHE library and compare it against partial homomorphic schemes, including Paillier, Damgard-Jurik, Okamoto-Uchiyama, Naccache-Stern and Benaloh. The main findings are that the BFV method compares favourably against the partial homomorphic methods in most cases.

Paperid: 1903, https://arxiv.org/pdf/2502.16094.pdf

Abstract:
Model merging has emerged as a promising approach for updating large language models (LLMs) by integrating multiple domain-specific models into a cross-domain merged model. Despite its utility and plug-and-play nature, unmonitored mergers can introduce significant security vulnerabilities, such as backdoor attacks and model merging abuse. In this paper, we identify a novel and more realistic attack surface where a malicious merger can extract targeted personally identifiable information (PII) from an aligned model with model merging. Specifically, we propose \texttt{Merger-as-a-Stealer}, a two-stage framework to achieve this attack: First, the attacker fine-tunes a malicious model to force it to respond to any PII-related queries. The attacker then uploads this malicious model to the model merging conductor and obtains the merged model. Second, the attacker inputs direct PII-related queries to the merged model to extract targeted PII. Extensive experiments demonstrate that \texttt{Merger-as-a-Stealer} successfully executes attacks against various LLMs and model merging methods across diverse settings, highlighting the effectiveness of the proposed framework. Given that this attack enables character-level extraction for targeted PII without requiring any additional knowledge from the attacker, we stress the necessity for improved model alignment and more robust defense mechanisms to mitigate such threats.

Paperid: 1904, https://arxiv.org/pdf/2502.16086.pdf

Abstract:
Decentralized training has become a resource-efficient framework to democratize the training of large language models (LLMs). However, the privacy risks associated with this framework, particularly due to the potential inclusion of sensitive data in training datasets, remain unexplored. This paper identifies a novel and realistic attack surface: the privacy leakage from training data in decentralized training, and proposes \textit{activation inversion attack} (AIA) for the first time. AIA first constructs a shadow dataset comprising text labels and corresponding activations using public datasets. Leveraging this dataset, an attack model can be trained to reconstruct the training data from activations in victim decentralized training. We conduct extensive experiments on various LLMs and publicly available datasets to demonstrate the susceptibility of decentralized training to AIA. These findings highlight the urgent need to enhance security measures in decentralized training to mitigate privacy risks in training LLMs.

Paperid: 1905, https://arxiv.org/pdf/2502.15932.pdf

Abstract:
The National Vulnerability Database (NVD) publishes over a thousand new vulnerabilities monthly, with a projected 25 percent increase in 2024, highlighting the crucial need for rapid vulnerability identification to mitigate cybersecurity attacks and save costs and resources. In this work, we propose using large language models (LLMs) to learn vulnerability evaluation from historical assessments of medical device vulnerabilities in a single manufacturer's portfolio. We highlight the effectiveness and challenges of using LLMs for automatic vulnerability evaluation and introduce a method to enrich historical data with cybersecurity ontologies, enabling the system to understand new vulnerabilities without retraining the LLM. Our LLM system integrates with the in-house application - Cybersecurity Management System (CSMS) - to help Siemens Healthineers (SHS) product cybersecurity experts efficiently assess the vulnerabilities in our products. Also, we present guidelines for efficient integration of LLMs into the cybersecurity tool.

Paperid: 1906, https://arxiv.org/pdf/2502.14307.pdf

Abstract:
We propose using reinforcement learning to address the challenges of discovering microarchitectural vulnerabilities, such as Spectre and Meltdown, which exploit subtle interactions in modern processors. Traditional methods like random fuzzing fail to efficiently explore the vast instruction space and often miss vulnerabilities that manifest under specific conditions. To overcome this, we introduce an intelligent, feedback-driven approach using RL. Our RL agents interact with the processor, learning from real-time feedback to prioritize instruction sequences more likely to reveal vulnerabilities, significantly improving the efficiency of the discovery process. We also demonstrate that RL systems adapt effectively to various microarchitectures, providing a scalable solution across processor generations. By automating the exploration process, we reduce the need for human intervention, enabling continuous learning that uncovers hidden vulnerabilities. Additionally, our approach detects subtle signals, such as timing anomalies or unusual cache behavior, that may indicate microarchitectural weaknesses. This proposal advances hardware security testing by introducing a more efficient, adaptive, and systematic framework for protecting modern processors. When unleashed on Intel Skylake-X and Raptor Lake microarchitectures, our RL agent was indeed able to generate instruction sequences that cause significant observable byte leakages through transient execution without generating any $Î¼$code assists, faults or interrupts. The newly identified leaky sequences stem from a variety of Intel instructions, e.g. including SERIALIZE, VERR/VERW, CLMUL, MMX-x87 transitions, LSL+RDSCP and LAR. These initial results give credence to the proposed approach.

Paperid: 1907, https://arxiv.org/pdf/2502.13482.pdf

Abstract:
Federated learning enables training machine learning models while preserving the privacy of participants. Surprisingly, there is no differentially private distributed method for smooth, non-convex optimization problems. The reason is that standard privacy techniques require bounding the participants' contributions, usually enforced via $\textit{clipping}$ of the updates. Existing literature typically ignores the effect of clipping by assuming the boundedness of gradient norms or analyzes distributed algorithms with clipping but ignores DP constraints. In this work, we study an alternative approach via $\textit{smoothed normalization}$ of the updates motivated by its favorable performance in the single-node setting. By integrating smoothed normalization with an error-feedback mechanism, we design a new distributed algorithm $Î±$-$\sf NormEC$. We prove that our method achieves a superior convergence rate over prior works. By extending $Î±$-$\sf NormEC$ to the DP setting, we obtain the first differentially private distributed optimization algorithm with provable convergence guarantees. Finally, our empirical results from neural network training indicate robust convergence of $Î±$-$\sf NormEC$ across different parameter settings.

Paperid: 1908, https://arxiv.org/pdf/2502.07410.pdf

Abstract:
Bitcoin's security relies on its Proof-of-Work consensus, where miners solve puzzles to propose blocks. The puzzle's difficulty is set by the difficulty adjustment mechanism (DAM), based on the network's available mining power. Attacks that destroy some portion of mining power can exploit the DAM to lower difficulty, making such attacks profitable. In this paper, we analyze three types of mining power destruction attacks in the presence of petty-compliant mining pools: selfish mining, bribery, and mining power distraction attacks. We analyze selfish mining while accounting for the distribution of mining power among pools, a factor often overlooked in the literature. Our findings indicate that selfish mining can be more destructive when the non-adversarial mining share is well distributed among pools. We also introduce a novel bribery attack, where the adversarial pool bribes petty-compliant pools to orphan others' blocks. For small pools, we demonstrate that the bribery attack can dominate strategies like selfish mining or undercutting. Lastly, we present the mining distraction attack, where the adversarial pool incentivizes petty-compliant pools to abandon Bitcoin's puzzle and mine for a simpler puzzle, thus wasting some part of their mining power. Similar to the previous attacks, this attack can lower the mining difficulty, but with the difference that it does not generate any evidence of mining power destruction, such as orphan blocks.

Paperid: 1909, https://arxiv.org/pdf/2502.07045.pdf

Abstract:
Insider threats wield an outsized influence on organizations, disproportionate to their small numbers. This is due to the internal access insiders have to systems, information, and infrastructure. %One example of this influence is where anonymous respondents submit web-based job search site reviews, an insider threat risk to organizations. Signals for such risks may be found in anonymous submissions to public web-based job search site reviews. This research studies the potential for large language models (LLMs) to analyze and detect insider threat sentiment within job site reviews. Addressing ethical data collection concerns, this research utilizes synthetic data generation using LLMs alongside existing job review datasets. A comparative analysis of sentiment scores generated by LLMs is benchmarked against expert human scoring. Findings reveal that LLMs demonstrate alignment with human evaluations in most cases, thus effectively identifying nuanced indicators of threat sentiment. The performance is lower on human-generated data than synthetic data, suggesting areas for improvement in evaluating real-world data. Text diversity analysis found differences between human-generated and LLM-generated datasets, with synthetic data exhibiting somewhat lower diversity. Overall, the results demonstrate the applicability of LLMs to insider threat detection, and a scalable solution for insider sentiment testing by overcoming ethical and logistical barriers tied to data acquisition.

Paperid: 1910, https://arxiv.org/pdf/2502.06662.pdf

Abstract:
Recent high-profile incidents in open-source software have greatly raised practitioner attention on software supply chain attacks. To guard against potential malicious package updates, security practitioners advocate pinning dependency to specific versions rather than floating in version ranges. However, it remains controversial whether pinning carries a meaningful security benefit that outweighs the cost of maintaining outdated and possibly vulnerable dependencies. In this paper, we quantify, through counterfactual analysis and simulations, the security and maintenance impact of version constraints in the npm ecosystem. By simulating dependency resolutions over historical time points, we find that pinning direct dependencies not only (as expected) increases the cost of maintaining vulnerable and outdated dependencies, but also (surprisingly) even increases the risk of exposure to malicious package updates in larger dependency graphs due to the specifics of npm's dependency resolution mechanism. Finally, we explore collective pinning strategies to secure the ecosystem against supply chain attacks, suggesting specific changes to npm to enable such interventions. Our study provides guidance for practitioners and tool designers to manage their supply chains more securely.

Paperid: 1911, https://arxiv.org/pdf/2502.06555.pdf

Abstract:
Differentially private (DP) synthetic data is a versatile tool for enabling the analysis of private data. Recent advancements in large language models (LLMs) have inspired a number of algorithm techniques for improving DP synthetic data generation. One family of approaches uses DP finetuning on the foundation model weights; however, the model weights for state-of-the-art models may not be public. In this work we propose two DP synthetic tabular data algorithms that only require API access to the foundation model. We adapt the Private Evolution algorithm (Lin et al., 2023; Xie et al., 2024) -- which was designed for image and text data -- to the tabular data domain. In our extension of Private Evolution, we define a query workload-based distance measure, which may be of independent interest. We propose a family of algorithms that use one-shot API access to LLMs, rather than adaptive queries to the LLM. Our findings reveal that API-access to powerful LLMs does not always improve the quality of DP synthetic data compared to established baselines that operate without such access. We provide insights into the underlying reasons and propose improvements to LLMs that could make them more effective for this application.

Paperid: 1912, https://arxiv.org/pdf/2502.05224.pdf

Abstract:
Large Language Models (LLMs) have achieved significantly advanced capabilities in understanding and generating human language text, which have gained increasing popularity over recent years. Apart from their state-of-the-art natural language processing (NLP) performance, considering their widespread usage in many industries, including medicine, finance, education, etc., security concerns over their usage grow simultaneously. In recent years, the evolution of backdoor attacks has progressed with the advancement of defense mechanisms against them and more well-developed features in the LLMs. In this paper, we adapt the general taxonomy for classifying machine learning attacks on one of the subdivisions - training-time white-box backdoor attacks. Besides systematically classifying attack methods, we also consider the corresponding defense methods against backdoor attacks. By providing an extensive summary of existing works, we hope this survey can serve as a guideline for inspiring future research that further extends the attack scenarios and creates a stronger defense against them for more robust LLMs.

Paperid: 1913, https://arxiv.org/pdf/2502.04901.pdf

Abstract:
This work investigates the theoretical boundaries of creating publicly-detectable schemes to enable the provenance of watermarked imagery. Metadata-based approaches like C2PA provide unforgeability and public-detectability. ML techniques offer robust retrieval and watermarking. However, no existing scheme combines robustness, unforgeability, and public-detectability. In this work, we formally define such a scheme and establish its existence. Although theoretically possible, we find that at present, it is intractable to build certain components of our scheme without a leap in deep learning capabilities. We analyze these limitations and propose research directions that need to be addressed before we can practically realize robust and publicly-verifiable provenance.

Paperid: 1914, https://arxiv.org/pdf/2502.04009.pdf

Abstract:
Quantum Key Distribution (QKD) is currently being discussed as a technology to safeguard communication in a future where quantum computers compromise traditional public-key cryptosystems. In this paper, we conduct a comprehensive security evaluation of QKD-based solutions, focusing on real-world use cases sourced from academic literature and industry reports. We analyze these use cases, assess their security and identify the possible advantages of deploying QKD-based solutions. We further compare QKD-based solutions with Post-Quantum Cryptography (PQC), the alternative approach to achieving security when quantum computers compromise traditional public-key cryptosystems, evaluating their respective suitability for each scenario. Based on this comparative analysis, we critically discuss and comment on which use cases QKD is suited for, considering factors such as implementation complexity, scalability, and long-term security. Our findings contribute to a better understanding of the role QKD could play in future cryptographic infrastructures and offer guidance to decision-makers considering the deployment of QKD.

Paperid: 1915, https://arxiv.org/pdf/2502.03614.pdf

Abstract:
The IoT facilitates a connected, intelligent, and sustainable society; therefore, it is imperative to protect the IoT ecosystem. The IoT-based 5G and 6G will leverage the use of machine learning and artificial intelligence (ML/AI) more to pave the way for autonomous and collaborative secure IoT networks. Zero-touch, zero-trust IoT security with AI and machine learning (ML) enablement frameworks offers a powerful approach to securing the expanding landscape of Internet of Things (IoT) devices. This paper presents a novel framework based on the integration of Zero Trust, Zero Touch, and AI/ML powered for the detection, mitigation, and prevention of DDoS attacks in modern IoT ecosystems. The focus will be on the new integrated framework by establishing zero trust for all IoT traffic, fixed and mobile 5G/6G IoT network traffic, and data security (quarantine-zero touch and dynamic policy enforcement). We perform a comparative analysis of five machine learning models, namely, XGBoost, Random Forest, K-Nearest Neighbors, Stochastic Gradient Descent, and Native Bayes, by comparing these models based on accuracy, precision, recall, F1-score, and ROC-AUC. Results show that the best performance in detecting and mitigating different DDoS vectors comes from the ensemble-based approaches.

Paperid: 1916, https://arxiv.org/pdf/2502.02676.pdf

Abstract:
Visible watermarks pose significant challenges for image restoration techniques, especially when the target background is unknown. Toward this end, we present MorphoMod, a novel method for automated visible watermark removal that operates in a blind setting -- without requiring target images. Unlike existing methods, MorphoMod effectively removes opaque and transparent watermarks while preserving semantic content, making it well-suited for real-world applications. Evaluations on benchmark datasets, including the Colored Large-scale Watermark Dataset (CLWD), LOGO-series, and the newly introduced Alpha1 datasets, demonstrate that MorphoMod achieves up to a 50.8% improvement in watermark removal effectiveness compared to state-of-the-art methods. Ablation studies highlight the impact of prompts used for inpainting, pre-removal filling strategies, and inpainting model performance on watermark removal. Additionally, a case study on steganographic disorientation reveals broader applications for watermark removal in disrupting high-level hidden messages. MorphoMod offers a robust, adaptable solution for watermark removal and opens avenues for further advancements in image restoration and adversarial manipulation.

Paperid: 1917, https://arxiv.org/pdf/2502.02320.pdf

Abstract:
In this work, we study multivalued byzantine agreement (BA) in an asynchronous network of $n$ parties where up to $t < \frac{n}{3}$ parties are byzantine. We present a new reduction from multivalued BA to binary BA. It allows one to achieve BA on $\ell$-bit inputs with one instance of binary BA, one instance of crusader agreement (CA) on $\ell$-bit inputs and $Î(\ell n + n^2)$ bits of additional communication. As our reduction uses multivalued CA, we also design two new information-theoretic CA protocols for $\ell$-bit inputs. In the first one, we use almost-universal hashing to achieve statistical security with probability $1 - 2^{-Î»}$ against $t < \frac{n}{3}$ faults with $Î(\ell n + n^2(Î»+ \log n))$ bits of communication. Following this, we replace the hashes with error correcting code symbols and add a preliminary step based on the synchronous multivalued BA protocol COOL [DISC '21] to obtain a second, perfectly secure CA protocol that can for any $\varepsilon > 0$ be set to tolerate $t \leq \frac{n}{3 + \varepsilon}$ faults with $\mathcal{O}\bigl(\frac{\ell n}{\min(1, \varepsilon^2)} + n^2\max\bigl(1, \log \frac{1}{\varepsilon}\bigr) \bigr)$ bits of communication. Our CA protocols allow one to extend binary BA to multivalued BA with a constant round overhead, a quadratic-in-$n$ communication overhead, and information-theoretic security.

Paperid: 1918, https://arxiv.org/pdf/2502.01853.pdf

Abstract:
Artificial Intelligence (AI)-driven code generation tools are increasingly used throughout the software development lifecycle to accelerate coding tasks. However, the security of AI-generated code using Large Language Models (LLMs) remains underexplored, with studies revealing various risks and weaknesses. This paper analyzes the security of code generated by LLMs across different programming languages. We introduce a dataset of 200 tasks grouped into six categories to evaluate the performance of LLMs in generating secure and maintainable code. Our research shows that while LLMs can automate code creation, their security effectiveness varies by language. Many models fail to utilize modern security features in recent compiler and toolkit updates, such as Java 17. Moreover, outdated methods are still commonly used, particularly in C++. This highlights the need for advancing LLMs to enhance security and quality while incorporating emerging best practices in programming languages.

Paperid: 1919, https://arxiv.org/pdf/2501.18820.pdf

Abstract:
The increasing prevalence of software vulnerabilities necessitates automated vulnerability repair (AVR) techniques. This Systematization of Knowledge (SoK) provides a comprehensive overview of the AVR landscape, encompassing both synthetic and real-world vulnerabilities. Through a systematic literature review and quantitative benchmarking across diverse datasets, methods, and strategies, we establish a taxonomy of existing AVR methodologies, categorizing them into template-guided, search-based, constraint-based, and learning-driven approaches. We evaluate the strengths and limitations of these approaches, highlighting common challenges and practical implications. Our comprehensive analysis of existing AVR methods reveals a diverse landscape with no single ``best'' approach. Learning-based methods excel in specific scenarios but lack complete program understanding, and both learning and non-learning methods face challenges with complex vulnerabilities. Additionally, we identify emerging trends and propose future research directions to advance the field of AVR. This SoK serves as a valuable resource for researchers and practitioners, offering a structured understanding of the current state-of-the-art and guiding future research and development in this critical domain.

Paperid: 1920, https://arxiv.org/pdf/2501.18727.pdf

Abstract:
The rapid proliferation of speech-enabled technologies, including virtual assistants, video conferencing platforms, and wearable devices, has raised significant privacy concerns, particularly regarding the inference of sensitive emotional information from audio data. Existing privacy-preserving methods often compromise usability and security, limiting their adoption in practical scenarios. This paper introduces a novel, user-centric approach that leverages familiar audio editing techniques, specifically pitch and tempo manipulation, to protect emotional privacy without sacrificing usability. By analyzing popular audio editing applications on Android and iOS platforms, we identified these features as both widely available and usable. We rigorously evaluated their effectiveness against a threat model, considering adversarial attacks from diverse sources, including Deep Neural Networks (DNNs), Large Language Models (LLMs), and and reversibility testing. Our experiments, conducted on three distinct datasets, demonstrate that pitch and tempo manipulation effectively obfuscates emotional data. Additionally, we explore the design principles for lightweight, on-device implementation to ensure broad applicability across various devices and platforms.

Paperid: 1921, https://arxiv.org/pdf/2501.17866.pdf

Abstract:
The field of brainwave-based biometrics has gained attention for its potential to revolutionize user authentication through hands-free interaction, resistance to shoulder surfing, continuous authentication, and revocability. However, current research often relies on single-session or limited-session datasets with fewer than 55 subjects, raising concerns about the generalizability of the findings. To address this gap, we conducted a large-scale study using a public brainwave dataset comprising 345 subjects and over 6,007 sessions (an average of 17 per subject) recorded over five years using three headsets. Our results reveal that deep learning approaches significantly outperform hand-crafted feature extraction methods. We also observe Equal Error Rates (EER) increases over time (e.g., from 6.7% after 1 day to 14.3% after a year). Therefore, it is necessary to reinforce the enrollment set after successful login attempts. Moreover, we demonstrate that fewer brainwave measurement sensors can be used, with an acceptable increase in EER, which is necessary for transitioning from medical-grade to affordable consumer-grade devices. Finally, we compared our results to prior work and existing biometric standards. While our performance is on par with or exceeds previous approaches, it still falls short of industrial benchmarks. Based on the results, we hypothesize that further improvements are possible with larger training sets. To support future research, we have open-sourced our analysis code.

Paperid: 1922, https://arxiv.org/pdf/2501.17548.pdf

Abstract:
Digital public services have revolutionised citizen and private sector interactions with governments. Certain communities are strongly dependent on such digital services for ensuring the availability of public services due to geographical isolation or the presence of adverse geophysical and weather phenomena. However, strong and effective security is key to maintaining the integrity of public records and services yet also for ensuring trust in them. Trust is essential for user uptake, particularly given a global increase in data-protection concerns and a turbulent geopolitical security environment. In this paper, we examine the case of public trust in various forms of authentication for electronic identification in Iceland, which has high availability requirements for digital public services due to its unique and dynamic geophysical characteristics. Additionally, Iceland has historically low levels of institutional trust which may conflict with the requirement for an increased need for digital public services. Through surveying the Icelandic general public, we find that there is a high-level of trust in digital identification services across all demographics. We conclude with a discussion and future research challenges towards improving the effectiveness of authentication considering the diverse groups within Icelandic society, such as the rapidly increasing population of migrants and the large and dynamic population of tourists.

Paperid: 1923, https://arxiv.org/pdf/2501.17539.pdf

Abstract:
Cybersecurity education is challenging and it is helpful for educators to understand Large Language Models' (LLMs') capabilities for supporting education. This study evaluates the effectiveness of LLMs in conducting a variety of penetration testing tasks. Fifteen representative tasks were selected to cover a comprehensive range of real-world scenarios. We evaluate the performance of 6 models (GPT-4o mini, GPT-4o, Gemini 1.5 Flash, Llama 3.1 405B, Mixtral 8x7B and WhiteRabbitNeo) upon the Metasploitable v3 Ubuntu image and OWASP WebGOAT. Our findings suggest that GPT-4o mini currently offers the most consistent support making it a valuable tool for educational purposes. However, its use in conjonction with WhiteRabbitNeo should be considered, because of its innovative approach to tool and command recommendations. This study underscores the need for continued research into optimising LLMs for complex, domain-specific tasks in cybersecurity education.

Paperid: 1924, https://arxiv.org/pdf/2501.17413.pdf

Abstract:
1-day vulnerabilities in binaries have become a major threat to software security. Patch presence test is one of the effective ways to detect the vulnerability. However, existing patch presence test works do not perform well in practical scenarios due to the interference from the various compilers and optimizations, patch-similar code blocks, and irrelevant functions in stripped binaries. In this paper, we propose a novel approach named PLocator, which leverages stable values from both the patch code and its context, extracted from the control flow graph, to accurately locate the real patch code in the target function, offering a practical solution for real-world vulnerability detection scenarios. To evaluate the effectiveness of PLocator, we collected 73 CVEs and constructed two comprehensive datasets ($Dataset_{-irr}$ and $Dataset_{+irr}$), comprising 1,090 and 27,250 test cases at four compilation optimization levels and two compilers with three different experiments, i.e., Same, XO (cross-optimizations), and XC (cross-compilers). The results demonstrate that PLocator achieves an average TPR of 88.2% and FPR of 12.9% in a short amount of time, outperforming state-of-the-art approaches by 26.7% and 63.5%, respectively, indicating that PLocator is more practical for the 1-day vulnerability detection task.

Paperid: 1925, https://arxiv.org/pdf/2501.17030.pdf

Abstract:
Large Language Models (LLMs) have achieved remarkable progress in reasoning, alignment, and task-specific performance. However, ensuring harmlessness in these systems remains a critical challenge, particularly in advanced models like DeepSeek-R1. This paper examines the limitations of Reinforcement Learning (RL) as the primary approach for reducing harmful outputs in DeepSeek-R1 and compares it with Supervised Fine-Tuning (SFT). While RL improves reasoning capabilities, it faces challenges such as reward hacking, generalization failures, language mixing, and high computational costs. We propose hybrid training approaches combining RL and SFT to achieve robust harmlessness reduction. Usage recommendations and future directions for deploying DeepSeek-R1 responsibly are also presented.

Paperid: 1926, https://arxiv.org/pdf/2501.15032.pdf

Abstract:
We present SuperEar, a novel privacy threat based on acoustic metamaterials. Unlike previous research, SuperEar can surreptitiously track and eavesdrop on the phone calls of a moving outdoor target from a safe distance. To design this attack, SuperEar overcomes the challenges faced by traditional acoustic metamaterials, including low low-frequency gain and audio distortion during reconstruction. It successfully magnifies the speech signal by approximately 20 times, allowing the sound to be captured from the earpiece of the target phone. In addition, SuperEar optimizes the trade-off between the number and size of acoustic metamaterials, improving the portability and concealability of the interceptor while ensuring effective interception performance. This makes it highly suitable for outdoor tracking and eavesdropping scenarios. Through extensive experimentation, we have evaluated SuperEar and our results show that it can achieve an eavesdropping accuracy of over 80% within a range of 4.5 meters in the aforementioned scenario, thus validating its great potential in real-world applications.

Paperid: 1927, https://arxiv.org/pdf/2501.15031.pdf

Abstract:
We present METAATTACK, the first approach to leverage acoustic metamaterials for inaudible attacks for voice control systems. Compared to the state-of-the-art inaudible attacks requiring complex and large speaker setups, METAATTACK achieves a longer attacking range and higher accuracy using a compact, portable device small enough to be put into a carry bag. These improvements in portability and stealth have led to the practical applicability of inaudible attacks and their adaptation to a wider range of scenarios. We demonstrate how the recent advancement in metamaterials can be utilized to design a voice attack system with carefully selected implementation parameters and commercial off-the-shelf components. We showcase that METAATTACK can be used to launch inaudible attacks for representative voice-controlled personal assistants, including Siri, Alexa, Google Assistant, XiaoAI, and Xiaoyi. The average word accuracy of all assistants is 76%, with a range of 8.85 m.

Paperid: 1928, https://arxiv.org/pdf/2501.14931.pdf

Abstract:
This work addresses the inherent issues of high latency in blockchains and low scalability in traditional consensus protocols. We present pod, a novel notion of consensus whose first priority is to achieve the physically-optimal latency of $2Î´$, or one round-trip, i.e., requiring only one network trip (duration $Î´$) for writing a transaction and one for reading it. To accomplish this, we first eliminate inter-replica communication. Instead, clients send transactions directly to all replicas, which independently process transactions and append them to local logs. Replicas assigns a timestamp and a sequence number to each transaction in their logs, allowing clients to extract valuable metadata about the transactions and the system state. Later on, clients retrieve these logs and extract transactions (and associated metadata) from them. Necessarily, this construction achieves weaker properties than a total-order broadcast protocol, due to existing lower bounds. Our work models the primitive of pod and defines its security properties. We then show pod-core, a protocol that satisfies properties such as transaction confirmation within $2Î´$, censorship resistance against Byzantine replicas, and accountability for safety violations. We show that single-shot auctions can be realized using the pod notion and observe that it is also sufficient for other popular applications.

Paperid: 1929, https://arxiv.org/pdf/2501.14414.pdf

Abstract:
Differential privacy has emerged as the most studied framework for privacy-preserving machine learning. However, recent studies show that enforcing differential privacy guarantees can not only significantly degrade the utility of the model, but also amplify existing disparities in its predictive performance across demographic groups. Although there is extensive research on the identification of factors that contribute to this phenomenon, we still lack a complete understanding of the mechanisms through which differential privacy exacerbates disparities. The literature on this problem is muddled by varying definitions of fairness, differential privacy mechanisms, and inconsistent experimental settings, often leading to seemingly contradictory results. This survey provides the first comprehensive overview of the factors that contribute to the disparate effect of training models with differential privacy guarantees. We discuss their impact and analyze their causal role in such a disparate effect. Our analysis is guided by a taxonomy that categorizes these factors by their position within the machine learning pipeline, allowing us to draw conclusions about their interaction and the feasibility of potential mitigation strategies. We find that factors related to the training dataset and the underlying distribution play a decisive role in the occurrence of disparate impact, highlighting the need for research on these factors to address the issue.

Paperid: 1930, https://arxiv.org/pdf/2501.14397.pdf

Abstract:
The Plug-and-Charge (PnC) process defined by ISO 15118 standardizes automated Electric Vehicle (EV) charging by enabling automatic installation of credentials and use for authentication between EV and Charge Point (CP). However, the current credential installation process is non-uniform, relies on a complex Public Key Infrastructure (PKI), lacks support for fine-grained authorization parameters, and is not very user-friendly. In this paper, we propose a streamlined approach to the initial charging authorization process by leveraging the OAuth Device Authorization Grant and Rich Authorization Requests. The proposed solution reduces technical complexity, simplifies credential installation, introduces flexible authorization constraints (e.g., time- and cost-based), and facilitates payment through OpenID Connect (OIDC). We present a proof-of-concept implementation along with performance evaluations and conduct a symbolic protocol verification using the Tamarin prover. Furthermore, our approach solves the issue of OAuth's cross-device authorization, making it suitable as a formally proven blueprint in contexts beyond EV charging.

Paperid: 1931, https://arxiv.org/pdf/2501.12911.pdf

Abstract:
Federated learning (FL) has come forward as a critical approach for privacy-preserving machine learning in healthcare, allowing collaborative model training across decentralized medical datasets without exchanging clients' data. However, current security implementations for these systems face a fundamental trade-off: rigorous cryptographic protections like fully homomorphic encryption (FHE) impose prohibitive computational overhead, while lightweight alternatives risk vulnerable data leakage through model updates. To address this issue, we present FAS (Fast and Secure Federated Learning), a novel approach that strategically combines selective homomorphic encryption, differential privacy, and bitwise scrambling to achieve robust security without compromising practical usability. Our approach eliminates the need for model pretraining phases while dynamically protecting high-risk model parameters through layered encryption and obfuscation. We implemented FAS using the Flower framework and evaluated it on a cluster of eleven physical machines. Our approach was up to 90\% faster than applying FHE on the model weights. In addition, we eliminated the computational overhead that is required by competitors such as FedML-HE and MaskCrypt. Our approach was up to 1.5$\times$ faster than the competitors while achieving comparable security results. Experimental evaluations on medical imaging datasets confirm that FAS maintains similar security results to conventional FHE against gradient inversion attacks while preserving diagnostic model accuracy. These results position FAS as a practical solution for latency-sensitive healthcare applications where both privacy preservation and computational efficiency are requirements.

Paperid: 1932, https://arxiv.org/pdf/2501.12123.pdf

Abstract:
Federated Learning (FL) enables clients to collaboratively train a global model using their local datasets while reinforcing data privacy, but it is prone to poisoning attacks. Existing defense mechanisms assume that clients' data are independent and identically distributed (IID), making them ineffective in real-world applications where data are non-IID. This paper presents FL-CLEANER, the first defense capable of filtering both byzantine and backdoor attackers' model updates in a non-IID FL environment. The originality of FL-CLEANER is twofold. First, it relies on a client confidence score derived from the reconstruction errors of each client's model activation maps for a given trigger set, with reconstruction errors obtained by means of a Conditional Variational Autoencoder trained according to a novel server-side strategy. Second, it uses an original ad-hoc trust propagation algorithm we propose. Based on previous client scores, it allows building a cluster of benign clients while flagging potential attackers. Experimental results on the datasets MNIST and FashionMNIST demonstrate the efficiency of FL-CLEANER against Byzantine attackers as well as to some state-of-the-art backdoors in non-IID scenarios; it achieves a close-to-zero (<1%) benign client misclassification rate, even in the absence of an attack, and achieves strong performance compared to state of the art defenses.

Paperid: 1933, https://arxiv.org/pdf/2501.11852.pdf

Abstract:
Black-box textual adversarial attacks are challenging due to the lack of model information and the discrete, non-differentiable nature of text. Existing methods often lack versatility for attacking different models, suffer from limited attacking performance due to the inefficient optimization with word saliency ranking, and frequently sacrifice semantic integrity to achieve better attack outcomes. This paper introduces a novel approach to textual adversarial attacks, which we call Cross-Entropy Attacks (CEA), that uses Cross-Entropy optimization to address the above issues. Our CEA approach defines adversarial objectives for both soft-label and hard-label settings and employs CE optimization to identify optimal replacements. Through extensive experiments on document classification and language translation problems, we demonstrate that our attack method excels in terms of attacking performance, imperceptibility, and sentence quality.

Paperid: 1934, https://arxiv.org/pdf/2501.10612.pdf

Abstract:
End-to-end blockchain latency has become a critical topic of interest in both academia and industry. However, while modern blockchain systems process transactions through multiple stages, most research has primarily focused on optimizing the latency of the Byzantine Fault Tolerance consensus component. In this work, we identify key sources of latency in blockchain systems and introduce Zaptos, a parallel pipelined architecture designed to minimize end-to-end latency while maintaining the high-throughput of pipelined blockchains. We implemented Zaptos and evaluated it against the pipelined architecture of the Aptos blockchain in a geo-distributed environment. Our evaluation demonstrates a 25\% latency reduction under low load and over 40\% reduction under high load. Notably, Zaptos achieves a throughput of 20,000 transactions per second with sub-second latency, surpassing previously reported blockchain throughput, with sub-second latency, by an order of magnitude.

Paperid: 1935, https://arxiv.org/pdf/2501.08449.pdf

Abstract:
Through the lens of the system of differential privacy specifications developed in Part I of a trio of articles, this second paper examines two statistical disclosure control (SDC) methods for the United States Decennial Census: the Permutation Swapping Algorithm (PSA), which is similar to the 2010 Census's disclosure avoidance system (DAS), and the TopDown Algorithm (TDA), which was used in the 2020 DAS. To varying degrees, both methods leave unaltered some statistics of the confidential data $\unicode{x2013}$ which are called the method's invariants $\unicode{x2013}$ and hence neither can be readily reconciled with differential privacy (DP), at least as it was originally conceived. Nevertheless, we establish that the PSA satisfies $\varepsilon$-DP subject to the invariants it necessarily induces, thereby showing that this traditional SDC method can in fact still be understood within our more-general system of DP specifications. By a similar modification to $Ï$-zero concentrated DP, we also provide a DP specification for the TDA. Finally, as a point of comparison, we consider the counterfactual scenario in which the PSA was adopted for the 2020 Census, resulting in a reduction in the nominal privacy loss, but at the cost of releasing many more invariants. Therefore, while our results explicate the mathematical guarantees of SDC provided by the PSA, the TDA and the 2020 DAS in general, care must be taken in their translation to actual privacy protection $\unicode{x2013}$ just as is the case for any DP deployment.

Paperid: 1936, https://arxiv.org/pdf/2501.07476.pdf

Abstract:
The computation of collision probability ($\mathcal{P}_c$) is crucial for space environmentalism and sustainability by providing decision-making knowledge that can prevent collisions between anthropogenic space objects. However, the accuracy and precision of $\mathcal{P}_c$ computations is often compromised by limitations in computational resources and data availability. While significant improvements have been made in the computational aspects, the rising concerns regarding the privacy of collaborative data sharing can be a major limiting factor in the future conjunction analysis and risk assessment, especially as the space environment grows increasingly privatized, competitive, and fraught with conflicting strategic interests. This paper argues that the importance of privacy measures in space situational awareness (SSA) is underappreciated, and regulatory and compliance measures currently in place are not sufficient by themselves, presenting a significant gap. To address this gap, we introduce a novel encrypted architecture that leverages advanced cryptographic techniques, including homomorphic encryption (HE) and multi-party computation (MPC), to safeguard the privacy of entities computing space sustainability metrics, inter alia, $\mathcal{P}_c$. Our proposed protocol, Encrypted $\mathcal{P}_c$, integrates the Monte Carlo estimation algorithm with cryptographic solutions, enabling secure collision probability computation without exposing sensitive or proprietary information. This research advances secure conjunction analysis by developing a secure MPC protocol for $\mathcal{P}_c$ computation and highlights the need for innovative protocols to ensure a more secure and cooperative SSA landscape.

Paperid: 1937, https://arxiv.org/pdf/2501.07192.pdf

Abstract:
Backdoor attacks have become a critical threat to deep neural networks (DNNs), drawing many research interests. However, most of the studied attacks employ a single type of trigger. Consequently, proposed backdoor defenders often rely on the assumption that triggers would appear in a unified way. In this paper, we show that this naive assumption can create a loophole, allowing more sophisticated backdoor attacks to bypass. We design a novel backdoor attack mechanism that incorporates multiple types of backdoor triggers, focusing on stealthiness and effectiveness. Our journey begins with the intriguing observation that the performance of a backdoor attack in deep learning models, as well as its detectability and removability, are all proportional to the magnitude of the trigger. Based on this correlation, we propose reducing the magnitude of each trigger type and combining them to achieve a strong backdoor relying on the combined trigger while still staying safely under the radar of defenders. Extensive experiments on three standard datasets demonstrate that our method can achieve high attack success rates (ASRs) while consistently bypassing state-of-the-art defenses.

Paperid: 1938, https://arxiv.org/pdf/2501.06686.pdf

Abstract:
In this work, we study the feasibility of using neural ordinary differential equations (NODEs) to model systems with intrinsic privacy properties. Unlike conventional feedforward neural networks, which have unlimited expressivity and can represent arbitrary mappings between inputs and outputs, NODEs constrain their learning to the solution of a system of differential equations. We first examine whether this constraint reduces memorization and, consequently, the membership inference risks associated with NODEs. We conduct a comprehensive evaluation of NODEs under membership inference attacks and show that they exhibit twice the resistance compared to conventional models such as ResNets. By analyzing the variance in membership risks across different NODE models, we find that their limited expressivity leads to reduced overfitting to the training data. We then demonstrate, both theoretically and empirically, that membership inference risks can be further mitigated by utilizing a stochastic variant of NODEs: neural stochastic differential equations (NSDEs). We show that NSDEs are differentially-private (DP) learners that provide the same provable privacy guarantees as DPSGD, the de-facto mechanism for training private models. NSDEs are also effective in mitigating membership inference attacks, achieving risk levels comparable to private models trained with DP-SGD while offering an improved privacyutility trade-off. Moreover, we propose a drop-in-replacement strategy that efficiently integrates NSDEs into conventional feedforward architectures to enhance their privacy.

Paperid: 1939, https://arxiv.org/pdf/2501.06504.pdf

Abstract:
Biometric authentication is increasingly popular for its convenience and accuracy. However, while recent advancements focus on reducing errors and expanding modalities, the reliability of reported performance metrics often remains overlooked. Understanding reliability is critical, as it communicates how accurately reported error rates represent a system's actual performance, considering the uncertainty in error-rate estimates from test data. Currently, there is no widely accepted standard for reporting these uncertainties and indeed biometric studies rarely provide reliability estimates, limiting comparability and interpretation. To address this gap, we introduce BioQuake--a measure to estimate uncertainty in biometric verification systems--and empirically validate it on four systems and three datasets. Based on BioQuake, we provide simple guidelines for estimating performance uncertainty and facilitating reliable reporting. Additionally, we apply BioQuake to analyze biometric recognition performance on 62 biometric datasets used in research across eight modalities: face, fingerprint, gait, iris, keystroke, eye movement, Electroencephalogram (EEG), and Electrocardiogram (ECG). Our analysis shows that reported state-of-the-art performance often deviates significantly from actual error rates, potentially leading to inaccurate conclusions. To support researchers and foster the development of more reliable biometric systems and datasets, we release BioQuake as an easy-to-use web tool for reliability calculations.

Paperid: 1940, https://arxiv.org/pdf/2501.06087.pdf

Abstract:
The rapid expansion of Vehicle-to-Everything (V2X) networks within the Internet of Vehicles (IoV) demands secure and efficient authentication to support high-speed, high-density and mobility-challenged environments. This paper presents a privacy-preserving authentication scheme that incorporates batch authentication, mutual authentication, and secure key establishment, enabling users to authenticate one another without a central authority. Our proposed scheme facilitates simultaneous multi-user authentication, significantly enhancing scalability, robustness and security in dynamic IoV networks. Results from realistic implementations show that our method achieves average authentication and verification times of 10.61 ms and 1.78 ms, respectively, for a fleet of 100 vehicles, outperforming existing methods. Scalability tests demonstrate efficient processing for larger groups of up to 500 vehicles, where average authentication times remain low, establishing our scheme as a robust solution for secure communication in IoV systems.

Paperid: 1941, https://arxiv.org/pdf/2501.05223.pdf

Abstract:
The implementation of accurate nonlinear operators (e.g., sigmoid function) on heterogeneous datasets is a key challenge in privacy-preserving machine learning (PPML). Most existing frameworks approximate it through linear operations, which not only result in significant precision loss but also introduce substantial computational overhead. This paper proposes an efficient, verifiable, and accurate security 2-party logistic regression framework (EVA-S2PLoR), which achieves accurate nonlinear function computation through a subtly secure hadamard product protocol and its derived protocols. All protocols are based on a practical semi-honest security model, which is designed for decentralized privacy-preserving application scenarios that balance efficiency, precision, and security. High efficiency and precision are guaranteed by the asynchronous computation flow on floating point numbers and the few number of fixed communication rounds in the hadamard product protocol, where robust anomaly detection is promised by dimension transformation and Monte Carlo methods. EVA-S2PLoR outperforms many advanced frameworks in terms of precision, improving the performance of the sigmoid function by about 10 orders of magnitude compared to most frameworks. Moreover, EVA-S2PLoR delivers the best overall performance in secure logistic regression experiments with training time reduced by over 47.6% under WAN settings and a classification accuracy difference of only about 0.5% compared to the plaintext model.

Paperid: 1942, https://arxiv.org/pdf/2501.04848.pdf

Abstract:
Malware analysis is a complex process of examining and evaluating malicious software's functionality, origin, and potential impact. This arduous process typically involves dissecting the software to understand its components, infection vector, propagation mechanism, and payload. Over the years, deep reverse engineering of malware has become increasingly tedious, mainly due to modern malicious codebases' fast evolution and sophistication. Essentially, analysts are tasked with identifying the elusive needle in the haystack within the complexities of zero-day malware, all while under tight time constraints. Thus, in this paper, we explore leveraging Large Language Models (LLMs) for semantic malware analysis to expedite the analysis of known and novel samples. Built on GPT-4o-mini model, \msp is designed to augment malware analysis for Android through a hierarchical-tiered summarization chain and strategic prompt engineering. Additionally, \msp performs malware categorization, distinguishing potential malware from benign applications, thereby saving time during the malware reverse engineering process. Despite not being fine-tuned for Android malware analysis, we demonstrate that through optimized and advanced prompt engineering \msp can achieve up to 77% classification accuracy while providing highly robust summaries at functional, class, and package levels. In addition, leveraging the backward tracing of the summaries from package to function levels allowed us to pinpoint the precise code snippets responsible for malicious behavior.

Paperid: 1943, https://arxiv.org/pdf/2501.02704.pdf

Abstract:
Deep Neural Networks (DNNs) have gained considerable traction in recent years due to the unparalleled results they gathered. However, the cost behind training such sophisticated models is resource intensive, resulting in many to consider DNNs to be intellectual property (IP) to model owners. In this era of cloud computing, high-performance DNNs are often deployed all over the internet so that people can access them publicly. As such, DNN watermarking schemes, especially backdoor-based watermarks, have been actively developed in recent years to preserve proprietary rights. Nonetheless, there lies much uncertainty on the robustness of existing backdoor watermark schemes, towards both adversarial attacks and unintended means such as fine-tuning neural network models. One reason for this is that no complete guarantee of robustness can be assured in the context of backdoor-based watermark. In this paper, we extensively evaluate the persistence of recent backdoor-based watermarks within neural networks in the scenario of fine-tuning, we propose/develop a novel data-driven idea to restore watermark after fine-tuning without exposing the trigger set. Our empirical results show that by solely introducing training data after fine-tuning, the watermark can be restored if model parameters do not shift dramatically during fine-tuning. Depending on the types of trigger samples used, trigger accuracy can be reinstated to up to 100%. Our study further explores how the restoration process works using loss landscape visualization, as well as the idea of introducing training data in fine-tuning stage to alleviate watermark vanishing.

Paperid: 1944, https://arxiv.org/pdf/2501.02091.pdf

Abstract:
Online tracking is a widespread practice on the web with questionable ethics, security, and privacy concerns. While web tracking can offer personalized and curated content to Internet users, it operates as a sophisticated surveillance mechanism to gather extensive user information. This paper introduces PriveShield, a light-weight privacy mechanism that disrupts the information gathering cycle while offering more control to Internet users to maintain their privacy. PriveShield is implemented as a browser extension that offers an adjustable privacy feature to surf the web with multiple identities or accounts simultaneously without any changes to underlying browser code or services. When necessary, multiple factors are automatically analyzed on the client side to isolate cookies and other information that are the basis of online tracking. PriveShield creates isolated profiles for clients based on their browsing history, interactions with websites, and the amount of time they spend on specific websites. This allows the users to easily prevent unwanted browsing information from being shared with third parties and ad exchanges without the need for manual configuration. Our evaluation results from 54 real-world scenarios show that our extension is effective in preventing retargeted ads in 91% of those scenarios.

Paperid: 1945, https://arxiv.org/pdf/2501.02018.pdf

Abstract:
Large Language Models (LLMs) have been shown to be susceptible to jailbreak attacks, or adversarial attacks used to illicit high risk behavior from a model. Jailbreaks have been exploited by cybercriminals and blackhat actors to cause significant harm, highlighting the critical need to safeguard widely-deployed models. Safeguarding approaches, which include fine-tuning models or having LLMs "self-reflect", may lengthen the inference time of a model, incur a computational penalty, reduce the semantic fluency of an output, and restrict ``normal'' model behavior. Importantly, these Safety-Performance Trade-offs (SPTs) remain an understudied area. In this work, we introduce a novel safeguard, called SafeNudge, that combines Controlled Text Generation with "nudging", or using text interventions to change the behavior of a model. SafeNudge triggers during text-generation while a jailbreak attack is being executed, and can reduce successful jailbreak attempts by 30% by guiding the LLM towards a safe responses. It adds minimal latency to inference and has a negligible impact on the semantic fluency of outputs. Further, we allow for tunable SPTs. SafeNudge is open-source and available through https://pypi.org/, and is compatible with models loaded with the Hugging Face "transformers" library.

Paperid: 1946, https://arxiv.org/pdf/2501.01786.pdf

Abstract:
This paper addresses the challenge of balancing learner data privacy with the use of data in learning analytics (LA) by proposing a novel framework by applying Differential Privacy (DP). The need for more robust privacy protection keeps increasing, driven by evolving legal regulations and heightened privacy concerns, as well as traditional anonymization methods being insufficient for the complexities of educational data. To address this, we introduce the first DP framework specifically designed for LA and provide practical guidance for its implementation. We demonstrate the use of this framework through a LA usage scenario and validate DP in safeguarding data privacy against potential attacks through an experiment on a well-known LA dataset. Additionally, we explore the trade-offs between data privacy and utility across various DP settings. Our work contributes to the field of LA by offering a practical DP framework that can support researchers and practitioners in adopting DP in their works.

Paperid: 1947, https://arxiv.org/pdf/2501.01620.pdf

Abstract:
DL-based automatic modulation classification (AMC) models are highly susceptible to adversarial attacks, where even minimal input perturbations can cause severe misclassifications. While adversarially training an AMC model based on an adversarial attack significantly increases its robustness against that attack, the AMC model will still be defenseless against other adversarial attacks. The theoretically infinite possibilities for adversarial perturbations mean that an AMC model will inevitably encounter new unseen adversarial attacks if it is ever to be deployed to a real-world communication system. Moreover, the computational limitations and challenges of obtaining new data in real-time will not allow a full training process for the AMC model to adapt to the new attack when it is online. To this end, we propose a meta-learning-based adversarial training framework for AMC models that substantially enhances robustness against unseen adversarial attacks and enables fast adaptation to these attacks using just a few new training samples, if any are available. Our results demonstrate that this training framework provides superior robustness and accuracy with much less online training time than conventional adversarial training of AMC models, making it highly efficient for real-world deployment.

Paperid: 1948, https://arxiv.org/pdf/2501.01335.pdf

Abstract:
Numerous studies have investigated methods for jailbreaking Large Language Models (LLMs) to generate harmful content. Typically, these methods are evaluated using datasets of malicious prompts designed to bypass security policies established by LLM providers. However, the generally broad scope and open-ended nature of existing datasets can complicate the assessment of jailbreaking effectiveness, particularly in specific domains, notably cybersecurity. To address this issue, we present and publicly release CySecBench, a comprehensive dataset containing 12662 prompts specifically designed to evaluate jailbreaking techniques in the cybersecurity domain. The dataset is organized into 10 distinct attack-type categories, featuring close-ended prompts to enable a more consistent and accurate assessment of jailbreaking attempts. Furthermore, we detail our methodology for dataset generation and filtration, which can be adapted to create similar datasets in other domains. To demonstrate the utility of CySecBench, we propose and evaluate a jailbreaking approach based on prompt obfuscation. Our experimental results show that this method successfully elicits harmful content from commercial black-box LLMs, achieving Success Rates (SRs) of 65% with ChatGPT and 88% with Gemini; in contrast, Claude demonstrated greater resilience with a jailbreaking SR of 17%. Compared to existing benchmark approaches, our method shows superior performance, highlighting the value of domain-specific evaluation datasets for assessing LLM security measures. Moreover, when evaluated using prompts from a widely used dataset (i.e., AdvBench), it achieved an SR of 78.5%, higher than the state-of-the-art methods.

Paperid: 1949, https://arxiv.org/pdf/2501.00798.pdf

Abstract:
Neural network models implemented in embedded devices have been shown to be susceptible to side-channel attacks (SCAs), allowing recovery of proprietary model parameters, such as weights and biases. There are already available countermeasure methods currently used for protecting cryptographic implementations that can be tailored to protect embedded neural network models. Shuffling, a hiding-based countermeasure that randomly shuffles the order of computations, was shown to be vulnerable to SCA when the Fisher-Yates algorithm is used. In this paper, we propose a design of an SCA-secure version of the Fisher-Yates algorithm. By integrating the masking technique for modular reduction and Blakely's method for modular multiplication, we effectively remove the vulnerability in the division operation that led to side-channel leakage in the original version of the algorithm. We experimentally evaluate that the countermeasure is effective against SCA by implementing a correlation power analysis attack on an embedded neural network model implemented on ARM Cortex-M4. Compared to the original proposal, the memory overhead is $2\times$ the biggest layer of the network, while the time overhead varies from $4\%$ to $0.49\%$ for a layer with $100$ and $1000$ neurons, respectively.

Paperid: 1950, https://arxiv.org/pdf/2501.00786.pdf

Abstract:
In the face of escalating surveillance and censorship within the cyberspace, the sanctity of personal privacy has come under siege, necessitating the development of steganography, which offers a way to securely hide messages within innocent-looking texts. Previous methods alternate the texts to hide private massages, which is not secure. Large Language Models (LLMs) provide high-quality and explicit distribution, which is an available mathematical tool for secure steganography methods. However, existing attempts fail to achieve high capacity, time efficiency and correctness simultaneously, and their strongly coupling designs leave little room for refining them to achieve better performance. To provide a secure, high-capacity and efficient steganography method, we introduce ShiMer. Specifically, ShiMer pseudorandomly shifts the probability interval of the LLM's distribution to obtain a private distribution, and samples a token according to the private bits. ShiMer produced steganographic texts are indistinguishable in quality from the normal texts directly generated by the language model. To further enhance the capacity of ShiMer, we design a reordering algorithm to minimize the occurrence of interval splitting during decoding phase. Experimental results indicate that our method achieves the highest capacity and efficiency among existing secure steganography techniques.

Paperid: 1951, https://arxiv.org/pdf/2501.00281.pdf

Abstract:
Noisy channels are a foundational resource for constructing cryptographic primitives such as string commitment and oblivious transfer. The noisy channel model has been extended to unfair noisy channels, where adversaries can influence the parameters of a memoryless channel. In this work, we introduce the unstructured noisy channel model as a generalization of the unfair noisy channel model to allow the adversary to manipulate the channel arbitrarily subject to certain entropic constraints. We present a string commitment protocol with established security and derive its achievable commitment rate, demonstrating the feasibility of commitment against this stronger class of adversaries. Furthermore, we show that the entropic constraints in the unstructured noisy channel model can be derived from physical assumptions such as noisy quantum storage. Our work thus connects two distinct approaches to commitment, i.e., the noisy channel and physical limitations.

Paperid: 1952, https://arxiv.org/pdf/2506.23622.pdf

Abstract:
The privacy-preserving federated learning schemes based on the setting of two honest-but-curious and non-colluding servers offer promising solutions in terms of security and efficiency. However, our investigation reveals that these schemes still suffer from privacy leakage when considering model poisoning attacks from malicious users. Specifically, we demonstrate that the privacy-preserving computation process for defending against model poisoning attacks inadvertently leaks privacy to one of the honest-but-curious servers, enabling it to access users' gradients in plaintext. To address both privacy leakage and model poisoning attacks, we propose an enhanced privacy-preserving and Byzantine-robust federated learning (PBFL) scheme, comprising three components: (1) a two-trapdoor fully homomorphic encryption (FHE) scheme to bolster users' privacy protection; (2) a novel secure normalization judgment method to preemptively thwart gradient poisoning; and (3) an innovative secure cosine similarity measurement method for detecting model poisoning attacks without compromising data privacy. Our scheme guarantees privacy preservation and resilience against model poisoning attacks, even in scenarios with heterogeneous, non-IID (Independently and Identically Distributed) datasets. Theoretical analyses substantiate the security and efficiency of our scheme, and extensive experiments corroborate the efficacy of our private attacks. Furthermore, the experimental results demonstrate that our scheme accelerates training speed while reducing communication overhead compared to the state-of-the-art PBFL schemes.

Paperid: 1953, https://arxiv.org/pdf/2506.23321.pdf

Abstract:
Artificial intelligence (AI) is rapidly transforming global industries and societies, making AI literacy an indispensable skill for future generations. While AI integration in education is still emerging in Nepal, this study focuses on assessing the current AI literacy levels and identifying learning needs among students in Chitwan District of Nepal. By measuring students' understanding of AI and pinpointing areas for improvement, this research aims to provide actionable recommendations for educational stakeholders. Given the pivotal role of young learners in navigating a rapidly evolving technological landscape, fostering AI literacy is paramount. This study seeks to understand the current state of AI literacy in Chitwan District by analyzing students' knowledge, skills, and attitudes towards AI. The results will contribute to developing robust AI education programs for Nepalese schools. This paper offers a contemporary perspective on AI's role in Nepalese secondary education, emphasizing the latest AI tools and technologies. Moreover, the study illuminates the potential revolutionary impact of technological innovations on educational leadership and student outcomes. A survey was conducted to conceptualize the newly emerging concept of AI and cybersecurity among students of Chitwan district from different schools and colleges to find the literacy rate. The participants in the survey were students between grade 9 to 12. We conclude with discussions of the affordances and barriers to bringing AI and cybersecurity education to students from lower classes.

Paperid: 1954, https://arxiv.org/pdf/2506.20806.pdf

Abstract:
Graph Neural Networks (GNNs) show great promise for Network Intrusion Detection Systems (NIDS), particularly in IoT environments, but suffer performance degradation due to distribution drift and lack robustness against realistic adversarial attacks. Current robustness evaluations often rely on unrealistic synthetic perturbations and lack demonstrations on systematic analysis of different kinds of adversarial attack, which encompass both black-box and white-box scenarios. This work proposes a novel approach to enhance GNN robustness and generalization by employing Large Language Models (LLMs) in an agentic pipeline as simulated cybersecurity expert agents. These agents scrutinize graph structures derived from network flow data, identifying and potentially mitigating suspicious or adversarially perturbed elements before GNN processing. Our experiments, using a framework designed for realistic evaluation and testing with a variety of adversarial attacks including a dataset collected from physical testbed experiments, demonstrate that integrating LLM analysis can significantly improve the resilience of GNN-based NIDS against challenges, showcasing the potential of LLM agent as a complementary layer in intrusion detection architectures.

Paperid: 1955, https://arxiv.org/pdf/2506.20413.pdf

Abstract:
The growing adoption of Artificial Intelligence (AI) in Internet of Things (IoT) ecosystems has intensified the need for personalized learning methods that can operate efficiently and privately across heterogeneous, resource-constrained devices. However, enabling effective personalized learning in decentralized settings introduces several challenges, including efficient knowledge transfer between clients, protection of data privacy, and resilience against poisoning attacks. In this paper, we address these challenges by developing P4 (Personalized, Private, Peer-to-Peer) -- a method designed to deliver personalized models for resource-constrained IoT devices while ensuring differential privacy and robustness against poisoning attacks. Our solution employs a lightweight, fully decentralized algorithm to privately detect client similarity and form collaborative groups. Within each group, clients leverage differentially private knowledge distillation to co-train their models, maintaining high accuracy while ensuring robustness to the presence of malicious clients. We evaluate P4 on popular benchmark datasets using both linear and CNN-based architectures across various heterogeneity settings and attack scenarios. Experimental results show that P4 achieves 5% to 30% higher accuracy than leading differentially private peer-to-peer approaches and maintains robustness with up to 30% malicious clients. Additionally, we demonstrate its practicality by deploying it on resource-constrained devices, where collaborative training between two clients adds only ~7 seconds of overhead.

Paperid: 1956, https://arxiv.org/pdf/2506.18932.pdf

Abstract:
Artificial Intelligence (AI) is rapidly being integrated into critical systems across various domains, from healthcare to autonomous vehicles. While its integration brings immense benefits, it also introduces significant risks, including those arising from AI misuse. Within the discourse on managing these risks, the terms "AI Safety" and "AI Security" are often used, sometimes interchangeably, resulting in conceptual confusion. This paper aims to demystify the distinction and delineate the precise research boundaries between AI Safety and AI Security. We provide rigorous definitions, outline their respective research focuses, and explore their interdependency, including how security breaches can precipitate safety failures and vice versa. Using clear analogies from message transmission and building construction, we illustrate these distinctions. Clarifying these boundaries is crucial for guiding precise research directions, fostering effective cross-disciplinary collaboration, enhancing policy effectiveness, and ultimately, promoting the deployment of trustworthy AI systems.

Paperid: 1957, https://arxiv.org/pdf/2506.17824.pdf

Abstract:
Sensitive data captured by Industrial Control Systems (ICS) play a large role in the safety and integrity of many critical infrastructures. Detection of anomalous or malicious data, or Anomaly Detection (AD), with machine learning is one of many vital components of cyberphysical security. Quantum kernel-based machine learning methods have shown promise in identifying complex anomalous behavior by leveraging the highly expressive and efficient feature spaces of quantum computing. This study focuses on the parameterization of Quantum Hybrid Support Vector Machines (QSVMs) using three popular datasets from Cyber-Physical Systems (CPS). The results demonstrate that QSVMs outperform traditional classical kernel methods, achieving 13.3% higher F1 scores. Additionally, this research investigates noise using simulations based on real IBMQ hardware, revealing a maximum error of only 0.98% in the QSVM kernels. This error results in an average reduction of 1.57% in classification metrics. Furthermore, the study found that QSVMs show a 91.023% improvement in kernel-target alignment compared to classical methods, indicating a potential "quantum advantage" in anomaly detection for critical infrastructures. This effort suggests that QSVMs can provide a substantial advantage in anomaly detection for ICS, ultimately enhancing the security and integrity of critical infrastructures.

Paperid: 1958, https://arxiv.org/pdf/2506.17709.pdf

Abstract:
Graph Neural Networks (GNNs) have demonstrated remarkable utility across diverse applications, and their growing complexity has made Machine Learning as a Service (MLaaS) a viable platform for scalable deployment. However, this accessibility also exposes GNN to serious security threats, most notably model extraction attacks (MEAs), in which adversaries strategically query a deployed model to construct a high-fidelity replica. In this work, we evaluate the vulnerability of GNNs to MEAs and explore their potential for cost-effective model acquisition in non-adversarial research settings. Importantly, adaptive node querying strategies can also serve a critical role in research, particularly when labeling data is expensive or time-consuming. By selectively sampling informative nodes, researchers can train high-performing GNNs with minimal supervision, which is particularly valuable in domains such as biomedicine, where annotations often require expert input. To address this, we propose a node querying strategy tailored to a highly practical yet underexplored scenario, where bulk queries are prohibited, and only a limited set of initial nodes is available. Our approach iteratively refines the node selection mechanism over multiple learning cycles, leveraging historical feedback to improve extraction efficiency. Extensive experiments on benchmark graph datasets demonstrate our superiority over comparable baselines on accuracy, fidelity, and F1 score under strict query-size constraints. These results highlight both the susceptibility of deployed GNNs to extraction attacks and the promise of ethical, efficient GNN acquisition methods to support low-resource research environments.

Paperid: 1959, https://arxiv.org/pdf/2506.17621.pdf

Abstract:
The growing deployment of deep learning models in real-world environments has intensified the need for efficient inference under strict latency and resource constraints. To meet these demands, dynamic deep learning systems (DDLSs) have emerged, offering input-adaptive computation to optimize runtime efficiency. While these systems succeed in reducing cost, their dynamic nature introduces subtle and underexplored security risks. In particular, input-dependent execution pathways create opportunities for adversaries to degrade efficiency, resulting in excessive latency, energy usage, and potential denial-of-service in time-sensitive deployments. This work investigates the security implications of dynamic behaviors in DDLSs and reveals how current systems expose efficiency vulnerabilities exploitable by adversarial inputs. Through a survey of existing attack strategies, we identify gaps in the coverage of emerging model architectures and limitations in current defense mechanisms. Building on these insights, we propose to examine the feasibility of efficiency attacks on modern DDLSs and develop targeted defenses to preserve robustness under adversarial conditions.

Paperid: 1960, https://arxiv.org/pdf/2506.17185.pdf

Abstract:
We investigate the contents of web-scraped data for training AI systems, at sizes where human dataset curators and compilers no longer manually annotate every sample. Building off of prior privacy concerns in machine learning models, we ask: What are the legal privacy implications of web-scraped machine learning datasets? In an empirical study of a popular training dataset, we find significant presence of personally identifiable information despite sanitization efforts. Our audit provides concrete evidence to support the concern that any large-scale web-scraped dataset may contain personal data. We use these findings of a real-world dataset to inform our legal analysis with respect to existing privacy and data protection laws. We surface various privacy risks of current data curation practices that may propagate personal information to downstream models. From our findings, we argue for reorientation of current frameworks of "publicly available" information to meaningfully limit the development of AI built upon indiscriminate scraping of the internet.

Paperid: 1961, https://arxiv.org/pdf/2506.17154.pdf

Abstract:
Correctness for microprocessors is generally understood to be conformance with the associated instruction set architecture (ISA). This is the basis for one of the most important abstractions in computer science, allowing hardware designers to develop highly-optimized processors that are functionally "equivalent" to an ideal processor that executes instructions atomically. This specification is almost always informal, e.g., commercial microprocessors generally do not come with conformance specifications. In this paper, we advocate for the use of formal specifications, using the theory of refinement. We introduce notions of correctness that can be used to deal with transient execution attacks, including Meltdown and Spectre. Such attacks have shown that ubiquitous microprocessor optimizations, appearing in numerous processors for decades, are inherently buggy. Unlike alternative approaches that use non-interference properties, our notion of correctness is global, meaning it is single specification that: formalizes conformance, includes functional correctness and is parameterized by an microarchitecture. We introduce action skipping refinement, a new type of refinement and we describe how our notions of refinement can be decomposed into properties that are more amenable to automated verification using the the concept of shared-resource commitment refinement maps. We do this in the context of formal, fully executable bit- and cycle-accurate models of an ISA and a microprocessor. Finally, we show how light-weight formal methods based on property-based testing can be used to identify transient execution bugs.

Paperid: 1962, https://arxiv.org/pdf/2506.12846.pdf

Abstract:
Federated learning is a promising distributed learning paradigm that enables collaborative model training without exposing local client data, thereby protect data privacy. However, it also brings new threats and challenges. The advancement of model inversion attacks has rendered the plaintext transmission of local models insecure, while the distributed nature of federated learning makes it particularly vulnerable to attacks raised by malicious clients. To protect data privacy and prevent malicious client attacks, this paper proposes a privacy-preserving federated learning framework based on verifiable functional encryption, without a non-colluding dual-server setup or additional trusted third-party. Specifically, we propose a novel decentralized verifiable functional encryption (DVFE) scheme that enables the verification of specific relationships over multi-dimensional ciphertexts. This scheme is formally treated, in terms of definition, security model and security proof. Furthermore, based on the proposed DVFE scheme, we design a privacy-preserving federated learning framework VFEFL that incorporates a novel robust aggregation rule to detect malicious clients, enabling the effective training of high-accuracy models under adversarial settings. Finally, we provide formal analysis and empirical evaluation of the proposed schemes. The results demonstrate that our approach achieves the desired privacy protection, robustness, verifiability and fidelity, while eliminating the reliance on non-colluding dual-server settings or trusted third parties required by existing methods.

Paperid: 1963, https://arxiv.org/pdf/2506.11212.pdf

Abstract:
Mainstream messaging platforms offer a variety of features designed to enhance user privacy, such as password-protected chats and end-to-end encryption, which primarily protect message contents. Beyond contents, a lot can be inferred about people simply by tracing who sends and receives messages, when, and how often. This paper explores user perceptions of and attitudes toward "untraceability", defined as preventing third parties from tracing who communicates with whom, to inform the design of privacy-enhancing technologies and untraceable communication protocols. Through a vignette-based qualitative study with 189 participants, we identify a diverse set of features that users perceive to be useful for untraceable messaging, ranging from using aliases instead of real names to VPNs. Through a reflexive thematic analysis, we uncover three overarching attitudes that influence the support or rejection of untraceability in messaging platforms and that can serve as a set of new privacy personas: privacy fundamentalists, who advocate for privacy as a universal right; safety fundamentalists, who support surveillance for the sake of accountability; and optimists, who advocate for privacy in principle but also endorse exceptions in idealistic ways, such as encryption backdoors. We highlight a critical gap between the threat models assumed by users and those addressed by untraceable communication protocols. Many participants understood untraceability as a form of anonymity, but interpret it as senders and receivers hiding their identities from each other, rather than from external network observers. We discuss implications for design of strategic communication and user interfaces of untraceable messaging protocols, and propose framing untraceability as a form of "altruistic privacy", i.e., adopting privacy-enhancing technologies to protect others, as a promising strategy to foster broad adoption.

Paperid: 1964, https://arxiv.org/pdf/2506.10104.pdf

Abstract:
As cyber threats become more sophisticated, rapid and accurate vulnerability detection is essential for maintaining secure systems. This study explores the use of Large Language Models (LLMs) in software vulnerability assessment by simulating the identification of Python code with known Common Weakness Enumerations (CWEs), comparing zero-shot, few-shot cross-domain, and few-shot in-domain prompting strategies. Our results indicate that while zero-shot prompting performs poorly, few-shot prompting significantly enhances classification performance, particularly when integrated with confidence-based routing strategies that improve efficiency by directing human experts to cases where model uncertainty is high, optimizing the balance between automation and expert oversight. We find that LLMs can effectively generalize across vulnerability categories with minimal examples, suggesting their potential as scalable, adaptable cybersecurity tools in simulated environments. However, challenges such as model reliability, interpretability, and adversarial robustness remain critical areas for future research. By integrating AI-driven approaches with expert-in-the-loop (EITL) decision-making, this work highlights a pathway toward more efficient and responsive cybersecurity workflows. Our findings provide a foundation for deploying AI-assisted vulnerability detection systems in both real and simulated environments that enhance operational resilience while reducing the burden on human analysts.

Paperid: 1965, https://arxiv.org/pdf/2506.09689.pdf

Abstract:
The Bit-Flipping (BF) decoder, thanks to its very low computational complexity, is widely employed in post-quantum cryptographic schemes based on Moderate Density Parity Check codes in which, ultimately, decryption boils down to syndrome decoding. In such a setting, for security concerns, one must guarantee that the Decoding Failure Rate (DFR) is negligible. Such a condition, however, is very difficult to guarantee, because simulations are of little help and the decoder performance is difficult to model theoretically. In this paper, we introduce a new version of the BF decoder, that we call BF-Max, characterized by the fact that in each iteration only one bit (the least reliable) is flipped. When the number of iterations is equal to the number of errors to be corrected, we are able to develop a theoretical characterization of the DFR that tightly matches with numerical simulations. We also show how BF-Max can be implemented efficiently, achieving low complexity and making it inherently constant time. With our modeling, we are able to accurately predict values of DFR that are remarkably lower than those estimated by applying other approaches.

Paperid: 1966, https://arxiv.org/pdf/2506.08312.pdf

Abstract:
Private Evolution (PE) is a promising training-free method for differentially private (DP) synthetic data generation. While it achieves strong performance in some domains (e.g., images and text), its behavior in others (e.g., tabular data) is less consistent. To date, the only theoretical analysis of the convergence of PE depends on unrealistic assumptions about both the algorithm's behavior and the structure of the sensitive dataset. In this work, we develop a new theoretical framework to explain PE's practical behavior and identify sufficient conditions for its convergence. For $d$-dimensional sensitive datasets with $n$ data points from a bounded domain, we prove that PE produces an $(Îµ, Î´)$-DP synthetic dataset with expected 1-Wasserstein distance of order $\tilde{O}(d(nÎµ)^{-1/d})$ from the original, establishing worst-case convergence of the algorithm as $n \to \infty$. Our analysis extends to general Banach spaces as well. We also connect PE to the Private Signed Measure Mechanism, a method for DP synthetic data generation that has thus far not seen much practical adoption. We demonstrate the practical relevance of our theoretical findings in simulations.

Paperid: 1967, https://arxiv.org/pdf/2506.08218.pdf

Abstract:
Containerisation is a popular deployment process for application-level virtualisation using a layer-based approach. Docker is a leading provider of containerisation, and through the Docker Hub, users can supply Docker images for sharing and re-purposing popular software application containers. Using a combination of in-built inspection commands, publicly displayed image layer content, and static image scanning, Docker images are designed to ensure end users can clearly assess the content of the image before running them. In this paper we present \textbf{\textit{gh0stEdit}}, a vulnerability that fundamentally undermines the integrity of Docker images and subverts the assumed trust and transparency they utilise. The use of gh0stEdit allows an attacker to maliciously edit Docker images, in a way that is not shown within the image history, hierarchy or commands. This attack can also be carried out against signed images (Docker Content Trust) without invalidating the image signature. We present two use case studies for this vulnerability, and showcase how gh0stEdit is able to poison an image in a way that is not picked up through static or dynamic scanning tools. Our attack case studies highlight the issues in the current approach to Docker image security and trust, and expose an attack method which could potentially be exploited in the wild without being detected. To the best of our knowledge we are the first to provide detailed discussion on the exploit of this vulnerability.

Paperid: 1968, https://arxiv.org/pdf/2506.07957.pdf

Abstract:
Homomorphic Encryption (HE) enables secure computation on encrypted data without decryption, allowing a great opportunity for privacy-preserving computation. In particular, domains such as healthcare, finance, and government, where data privacy and security are of utmost importance, can benefit from HE by enabling third-party computation and services on sensitive data. In other words, HE constitutes the "Holy Grail" of cryptography: data remains encrypted all the time, being protected while in use. HE's security guarantees rely on noise added to data to make relatively simple problems computationally intractable. This error-centric intrinsic HE mechanism generates new challenges related to the fault tolerance and robustness of HE itself: hardware- and software-induced errors during HE operation can easily evade traditional error detection and correction mechanisms, resulting in silent data corruption (SDC). In this work, we motivate a thorough discussion regarding the sensitivity of HE applications to bit faults and provide a detailed error characterization study of CKKS (Cheon-Kim-Kim-Song). This is one of the most popular HE schemes due to its fixed-point arithmetic support for AI and machine learning applications. We also delve into the impact of the residue number system (RNS) and the number theoretic transform (NTT), two widely adopted HE optimization techniques, on CKKS' error sensitivity. To the best of our knowledge, this is the first work that looks into the robustness and error sensitivity of homomorphic encryption and, as such, it can pave the way for critical future work in this area.

Paperid: 1969, https://arxiv.org/pdf/2506.06742.pdf

Abstract:
Vertical Federated Learning (VFL) has emerged as a promising paradigm for collaborative model training across distributed feature spaces, which enables privacy-preserving learning without sharing raw data. However, recent studies have confirmed the feasibility of label inference attacks by internal adversaries. By strategically exploiting gradient vectors and semantic embeddings, attackers-through passive, active, or direct attacks-can accurately reconstruct private labels, leading to catastrophic data leakage. Existing defenses, which typically address isolated leakage vectors or are designed for specific types of attacks, remain vulnerable to emerging hybrid attacks that exploit multiple pathways simultaneously. To bridge this gap, we propose Label-Anonymized Defense with Substitution Gradient (LADSG), a unified and lightweight defense framework for VFL. LADSG first anonymizes true labels via soft distillation to reduce semantic exposure, then generates semantically-aligned substitute gradients to disrupt gradient-based leakage, and finally filters anomalous updates through gradient norm detection. It is scalable and compatible with standard VFL pipelines. Extensive experiments on six real-world datasets show that LADSG reduces the success rates of all three types of label inference attacks by 30-60% with minimal computational overhead, demonstrating its practical effectiveness.

Paperid: 1970, https://arxiv.org/pdf/2506.06563.pdf

Abstract:
Deep Neural Networks (DNNs) are widely used for traffic sign recognition because they can automatically extract high-level features from images. These DNNs are trained on large-scale datasets obtained from unknown sources. Therefore, it is important to ensure that the models remain secure and are not compromised or poisoned during training. In this paper, we investigate the robustness of DNNs trained for traffic sign recognition. First, we perform the error-minimizing attacks on DNNs used for traffic sign recognition by adding imperceptible perturbations on training data. Then, we propose a data augmentation-based training method to mitigate the error-minimizing attacks. The proposed training method utilizes nonlinear transformations to disrupt the perturbations and improve the model robustness. We experiment with two well-known traffic sign datasets to demonstrate the severity of the attack and the effectiveness of our mitigation scheme. The error-minimizing attacks reduce the prediction accuracy of the DNNs from 99.90% to 10.6%. However, our mitigation scheme successfully restores the prediction accuracy to 96.05%. Moreover, our approach outperforms adversarial training in mitigating the error-minimizing attacks. Furthermore, we propose a detection model capable of identifying poisoned data even when the perturbations are imperceptible to human inspection. Our detection model achieves a success rate of over 99% in identifying the attack. This research highlights the need to employ advanced training methods for DNNs in traffic sign recognition systems to mitigate the effects of data poisoning attacks.

Paperid: 1971, https://arxiv.org/pdf/2506.06556.pdf

Abstract:
As the development of autonomous and connected vehicles advances, the complexity of modern vehicles increases, with numerous Electronic Control Units (ECUs) integrated into the system. In an in-vehicle network, these ECUs communicate with one another using an standard protocol called Controller Area Network (CAN). Securing communication among ECUs plays a vital role in maintaining the safety and security of the vehicle. This paper proposes a robust SDN-based False Data Detection and Mitigation System (FDDMS) for in-vehicle networks. Leveraging the unique capabilities of Software-Defined Networking (SDN), FDDMS is designed to monitor and detect false data injection attacks in real-time. Specifically, we focus on brake-related ECUs within an SDN-enabled in-vehicle network. First, we decode raw CAN data to create an attack model that illustrates how false data can be injected into the system. Then, FDDMS, incorporating a Long Short Term Memory (LSTM)-based detection model, is used to identify false data injection attacks. We further propose an effective variant of DeepFool attack to evaluate the model's robustness. To countermeasure the impacts of four adversarial attacks including Fast gradient descent method, Basic iterative method, DeepFool, and the DeepFool variant, we further enhance a re-training technique method with a threshold based selection strategy. Finally, a mitigation scheme is implemented to redirect attack traffic by dynamically updating flow rules through SDN. Our experimental results show that the proposed FDDMS is robust against adversarial attacks and effectively detects and mitigates false data injection attacks in real-time.

Paperid: 1972, https://arxiv.org/pdf/2506.06530.pdf

Abstract:
The Probably Approximately Correct (PAC) Privacy framework [46] provides a powerful instance-based methodology to preserve privacy in complex data-driven systems. Existing PAC Privacy algorithms (we call them Auto-PAC) rely on a Gaussian mutual information upper bound. However, we show that the upper bound obtained by these algorithms is tight if and only if the perturbed mechanism output is jointly Gaussian with independent Gaussian noise. We propose two approaches for addressing this issue. First, we introduce two tractable post-processing methods for Auto-PAC, based on Donsker-Varadhan representation and sliced Wasserstein distances. However, the result still leaves wasted privacy budget. To address this issue more fundamentally, we introduce Residual-PAC (R-PAC) Privacy, an f-divergence-based measure to quantify privacy that remains after adversarial inference. To implement R-PAC Privacy in practice, we propose a Stackelberg Residual-PAC (SR-PAC) privatization mechanism, a game-theoretic framework that selects optimal noise distributions through convex bilevel optimization. Our approach achieves efficient privacy budget utilization for arbitrary data distributions and naturally composes when multiple mechanisms access the dataset. Through extensive experiments, we demonstrate that SR-PAC consistently obtains a better privacy-utility tradeoff than both PAC and differential privacy baselines.

Paperid: 1973, https://arxiv.org/pdf/2506.05708.pdf

Abstract:
Stablecoins face an unresolved trilemma of balancing decentralization, stability, and regulatory compliance. We present a hybrid stabilization protocol that combines crypto-collateralized reserves, algorithmic futures contracts, and cross-chain liquidity pools to achieve robust price adherence while preserving user privacy. At its core, the protocol introduces stabilization futures contracts (SFCs), non-collateralized derivatives that programmatically incentivize third-party arbitrageurs to counteract price deviations via adaptor signature atomic swaps. Autonomous AI agents optimize delta hedging across decentralized exchanges (DEXs), while zkSNARKs prove compliance with anti-money laundering (AML) regulations without exposing identities or transaction details. Our cryptographic design reduces cross-chain liquidity concentration (Herfindahl-Hirschman Index: 2,400 vs. 4,900 in single-chain systems) and ensures atomicity under standard cryptographic assumptions. The protocol's layered architecture encompassing incentive-compatible SFCs, AI-driven market making, and zero-knowledge regulatory proofs. It provides a blueprint for next-generation decentralized financial infrastructure.

Paperid: 1974, https://arxiv.org/pdf/2506.05692.pdf

Abstract:
The code generation capabilities of large language models(LLMs) have emerged as a critical dimension in evaluating their overall performance. However, prior research has largely overlooked the security risks inherent in the generated code. In this work, we introduce SafeGenBench, a benchmark specifically designed to assess the security of LLM-generated code. The dataset encompasses a wide range of common software development scenarios and vulnerability types. Building upon this benchmark, we develop an automatic evaluation framework that leverages both static application security testing(SAST) and LLM-based judging to assess the presence of security vulnerabilities in model-generated code. Through the empirical evaluation of state-of-the-art LLMs on SafeGenBench, we reveal notable deficiencies in their ability to produce vulnerability-free code. Our findings highlight pressing challenges and offer actionable insights for future advancements in the secure code generation performance of LLMs. The data and code will be released soon.

Paperid: 1975, https://arxiv.org/pdf/2506.05402.pdf

Abstract:
The growing adoption of large pre-trained models in edge computing has made deploying model inference on mobile clients both practical and popular. These devices are inherently vulnerable to direct adversarial attacks, which pose a substantial threat to the robustness and security of deployed models. Federated adversarial training (FAT) has emerged as an effective solution to enhance model robustness while preserving client privacy. However, FAT frequently produces a generalized global model, which struggles to address the diverse and heterogeneous data distributions across clients, resulting in insufficiently personalized performance, while also encountering substantial communication challenges during the training process. In this paper, we propose \textit{Sylva}, a personalized collaborative adversarial training framework designed to deliver customized defense models for each client through a two-phase process. In Phase 1, \textit{Sylva} employs LoRA for local adversarial fine-tuning, enabling clients to personalize model robustness while drastically reducing communication costs by uploading only LoRA parameters during federated aggregation. In Phase 2, a game-based layer selection strategy is introduced to enhance accuracy on benign data, further refining the personalized model. This approach ensures that each client receives a tailored defense model that balances robustness and accuracy effectively. Extensive experiments on benchmark datasets demonstrate that \textit{Sylva} can achieve up to 50$\times$ improvements in communication efficiency compared to state-of-the-art algorithms, while achieving up to 29.5\% and 50.4\% enhancements in adversarial robustness and benign accuracy, respectively.

Paperid: 1976, https://arxiv.org/pdf/2506.05001.pdf

Abstract:
Traditional security detection methods face three key challenges: inadequate data collection that misses critical security events, resource-intensive monitoring systems, and poor detection algorithms with high false positive rates. We present FEAD (Focus-Enhanced Attack Detection), a framework that addresses these issues through three innovations: (1) an attack model-driven approach that extracts security-critical monitoring items from online attack reports for comprehensive coverage; (2) efficient task decomposition that optimally distributes monitoring across existing collectors to minimize overhead; and (3) locality-aware anomaly analysis that leverages the clustering behavior of malicious activities in provenance graphs to improve detection accuracy. Evaluations demonstrate FEAD achieves 8.23% higher F1-score than existing solutions with only 5.4% overhead, confirming that focus-based designs significantly enhance detection performance.

Paperid: 1977, https://arxiv.org/pdf/2506.00100.pdf

Abstract:
Children are one of the most under-represented groups in speech technologies, as well as one of the most vulnerable in terms of privacy. Despite this, anonymization techniques targeting this population have received little attention. In this study, we seek to bridge this gap, and establish a baseline for the use of voice anonymization techniques designed for adult speech when applied to children's voices. Such an evaluation is essential, as children's speech presents a distinct set of challenges when compared to that of adults. This study comprises three children's datasets, six anonymization methods, and objective and subjective utility metrics for evaluation. Our results show that existing systems for adults are still able to protect children's voice privacy, but suffer from much higher utility degradation. In addition, our subjective study displays the challenges of automatic evaluation methods for speech quality in children's speech, highlighting the need for further research.

Paperid: 1978, https://arxiv.org/pdf/2505.23772.pdf

Abstract:
In 2022, Persianom, Phan and Yung outlined the creation of Anamorphic Cryptography. With this, we can create a public key to encrypt data, and then have two secret keys. These secret keys are used to decrypt the cipher into different messages. So, one secret key is given to the Dictator (who must be able to decrypt all the messages), and the other is given to Alice. Alice can then decrypt the ciphertext to a secret message that the Dictator cannot see. This paper outlines the implementation of Anamorphic Cryptography using ECC (Elliptic Curve Cryptography), such as with the secp256k1 curve. This gives considerable performance improvements over discrete logarithm-based methods with regard to security for a particular bit length. Overall, it outlines how the secret message sent to Alice is hidden within the random nonce value, which is used within the encryption process, and which is cancelled out when the Dictator decrypts the ciphertext. It also shows that the BSGS (Baby-step Giant-step) variant significantly outperforms unoptimised elliptic curve methods.

Paperid: 1979, https://arxiv.org/pdf/2505.22798.pdf

Abstract:
The growing reliance on artificial intelligence in safety- and security-critical applications demands effective neural network certification. A challenging real-world use case is certification against ``patch attacks'', where adversarial patches or lighting conditions obscure parts of images, for example traffic signs. One approach to certification, which also gives quantitative coverage estimates, utilizes preimages of neural networks, i.e., the set of inputs that lead to a specified output. However, these preimage approximation methods, including the state-of-the-art PREMAP algorithm, struggle with scalability. This paper presents novel algorithmic improvements to PREMAP involving tighter bounds, adaptive Monte Carlo sampling, and improved branching heuristics. We demonstrate efficiency improvements of at least an order of magnitude on reinforcement learning control benchmarks, and show that our method scales to convolutional neural networks that were previously infeasible. Our results demonstrate the potential of preimage approximation methodology for reliability and robustness certification.

Paperid: 1980, https://arxiv.org/pdf/2505.22449.pdf

Abstract:
Koufogiannis et al. (2016) showed a $\textit{gradual release}$ result for Laplace noise-based differentially private mechanisms: given an $\varepsilon$-DP release, a new release with privacy parameter $\varepsilon' > \varepsilon$ can be computed such that the combined privacy loss of both releases is at most $\varepsilon'$ and the distribution of the latter is the same as a single release with parameter $\varepsilon'$. They also showed gradual release techniques for Gaussian noise, later also explored by Whitehouse et al. (2022). In this paper, we consider a more general $\textit{multiple release}$ setting in which analysts hold private releases with different privacy parameters corresponding to different access/trust levels. These releases are determined one by one, with privacy parameters in arbitrary order. A multiple release is $\textit{lossless}$ if having access to a subset $S$ of the releases has the same privacy guarantee as the least private release in $S$, and each release has the same distribution as a single release with the same privacy parameter. Our main result is that lossless multiple release is possible for a large class of additive noise mechanisms. For the Gaussian mechanism we give a simple method for lossless multiple release with a short, self-contained analysis that does not require knowledge of the mathematics of Brownian motion. We also present lossless multiple release for the Laplace and Poisson mechanisms. Finally, we consider how to efficiently do gradual release of sparse histograms, and present a mechanism with running time independent of the number of dimensions.

Paperid: 1981, https://arxiv.org/pdf/2505.20189.pdf

Abstract:
Estimating the geometric median of a dataset is a robust counterpart to mean estimation, and is a fundamental problem in computational geometry. Recently, [HSU24] gave an $(\varepsilon, Î´)$-differentially private algorithm obtaining an $Î±$-multiplicative approximation to the geometric median objective, $\frac 1 n \sum_{i \in [n]} \|\cdot - \mathbf{x}_i\|$, given a dataset $\mathcal{D} := \{\mathbf{x}_i\}_{i \in [n]} \subset \mathbb{R}^d$. Their algorithm requires $n \gtrsim \sqrt d \cdot \frac 1 {Î±\varepsilon}$ samples, which they prove is information-theoretically optimal. This result is surprising because its error scales with the \emph{effective radius} of $\mathcal{D}$ (i.e., of a ball capturing most points), rather than the worst-case radius. We give an improved algorithm that obtains the same approximation quality, also using $n \gtrsim \sqrt d \cdot \frac 1 {Î±Îµ}$ samples, but in time $\widetilde{O}(nd + \frac d {Î±^2})$. Our runtime is nearly-linear, plus the cost of the cheapest non-private first-order method due to [CLM+16]. To achieve our results, we use subsampling and geometric aggregation tools inspired by FriendlyCore [TCK+22] to speed up the "warm start" component of the [HSU24] algorithm, combined with a careful custom analysis of DP-SGD's sensitivity for the geometric median objective.

Paperid: 1982, https://arxiv.org/pdf/2505.20098.pdf

Abstract:
As protein informatics advances rapidly, the demand for enhanced predictive accuracy, structural analysis, and functional understanding has intensified. Transformer models, as powerful deep learning architectures, have demonstrated unprecedented potential in addressing diverse challenges across protein research. However, a comprehensive review of Transformer applications in this field remains lacking. This paper bridges this gap by surveying over 100 studies, offering an in-depth analysis of practical implementations and research progress of Transformers in protein-related tasks. Our review systematically covers critical domains, including protein structure prediction, function prediction, protein-protein interaction analysis, functional annotation, and drug discovery/target identification. To contextualize these advancements across various protein domains, we adopt a domain-oriented classification system. We first introduce foundational concepts: the Transformer architecture and attention mechanisms, categorize Transformer variants tailored for protein science, and summarize essential protein knowledge. For each research domain, we outline its objectives and background, critically evaluate prior methods and their limitations, and highlight transformative contributions enabled by Transformer models. We also curate and summarize pivotal datasets and open-source code resources to facilitate reproducibility and benchmarking. Finally, we discuss persistent challenges in applying Transformers to protein informatics and propose future research directions. This review aims to provide a consolidated foundation for the synergistic integration of Transformer and protein informatics, fostering further innovation and expanded applications in the field.

Paperid: 1983, https://arxiv.org/pdf/2505.19482.pdf

Abstract:
The increasing demand for privacy protection and security considerations leads to a significant rise in the proportion of encrypted network traffic. Since traffic content becomes unrecognizable after encryption, accurate analysis is challenging, making it difficult to classify applications and detect attacks. Deep learning is currently the predominant approach for encrypted traffic classification through feature analysis. However, these methods face limitations due to their high dependence on labeled data and difficulties in detecting attack variants. First, their performance is highly sensitive to data quality, where the highcost manual labeling process and dataset imbalance significantly degrade results. Second, the rapid evolution of attack patterns makes it challenging for models to identify new types of attacks. To tackle these challenges, we present GBC, a generative model based on pre-training for encrypted traffic comprehension. Since traditional tokenization methods are primarily designed for natural language, we propose a protocol-aware tokenization approach for encrypted traffic that improves model comprehension of fields specific to network traffic. In addition, GBC employs pretraining to learn general representations from extensive unlabeled traffic data. Through prompt learning, it effectively adapts to various downstream tasks, enabling both high-quality traffic generation and effective detection. Evaluations across multiple datasets demonstrate that GBC achieves superior results in both traffic classification and generation tasks, resulting in a 5% improvement in F1 score compared to state-of-the-art methods for classification tasks.

Paperid: 1984, https://arxiv.org/pdf/2505.18846.pdf

Abstract:
Sixth Generation (6G) wireless networks, which are expected to be deployed in the 2030s, have already created great excitement in academia and the private sector with their extremely high communication speed and low latency rates. However, despite the ultra-low latency, high throughput, and AI-assisted orchestration capabilities they promise, they are vulnerable to stealthy and long-term Advanced Persistent Threats (APTs). Large Language Models (LLMs) stand out as an ideal candidate to fill this gap with their high success in semantic reasoning and threat intelligence. In this paper, we present a comprehensive systematic review and taxonomy study for LLM-assisted APT detection in 6G networks. We address five research questions, namely, semantic merging of fragmented logs, encrypted traffic analysis, edge distribution constraints, dataset/modeling techniques, and reproducibility trends, by leveraging most recent studies on the intersection of LLMs, APTs, and 6G wireless networks. We identify open challenges such as explainability gaps, data scarcity, edge hardware limitations, and the need for real-time slicing-aware adaptation by presenting various taxonomies such as granularity, deployment models, and kill chain stages. We then conclude the paper by providing several research gaps in 6G infrastructures for future researchers. To the best of our knowledge, this paper is the first comprehensive systematic review and classification study on LLM-based APT detection in 6G networks.

Paperid: 1985, https://arxiv.org/pdf/2505.17335.pdf

Abstract:
Incorrect handling of security-critical data formats, particularly in low-level languages, are the root cause of many security vulnerabilities. Provably correct parsing and serialization tools that target languages like C can help. Towards this end, we present PulseParse, a library of verified parser and serializer combinators for non-malleable binary formats. Specifications and proofs in PulseParse are in separation logic, offering a more abstract and compositional interface, with full support for data validation, parsing, and serialization. PulseParse also supports a class of recursive formats -- with a focus on security and handling adversarial inputs, we show how to parse such formats with only a constant amount of stack space. We use PulseParse at scale by providing the first formalization of CBOR, a recursive, binary data format standard, with growing adoption in various industrial standards. We prove that the deterministic fragment of CBOR is non-malleable and provide EverCBOR, a verified library in both C and Rust to validate, parse, and serialize CBOR objects implemented using PulseParse. Next, we provide the first formalization of CDDL, a schema definition language for CBOR. We identify well-formedness conditions on CDDL definitions that ensure that they yield unambiguous, non-malleable formats, and implement EverCDDL, a tool that checks that a CDDL definition is well-formed, and then produces verified parsers and serializers for it. To evaluate our work, we use EverCDDL to generate verified parsers and serializers for various security-critical applications. Notably, we build a formally verified implementation of COSE signing, a standard for cryptographically signed objects. We also use our toolchain to generate verified code for other standards specified in CDDL, including DICE Protection Environment, a secure boot protocol standard.

Paperid: 1986, https://arxiv.org/pdf/2505.15935.pdf

Abstract:
Agentic AI systems, which build on Large Language Models (LLMs) and interact with tools and memory, have rapidly advanced in capability and scope. Yet, since LLMs have been shown to struggle in multilingual settings, typically resulting in lower performance and reduced safety, agentic systems risk inheriting these limitations. This raises concerns about the accessibility of such systems, as users interacting in languages other than English may encounter unreliable or security-critical agent behavior. Despite growing interest in evaluating agentic AI, existing benchmarks focus exclusively on English, leaving multilingual settings unexplored. To address this gap, we propose MAPS, a multilingual benchmark suite designed to evaluate agentic AI systems across diverse languages and tasks. MAPS builds on four widely used agentic benchmarks - GAIA (real-world tasks), SWE-bench (code generation), MATH (mathematical reasoning), and the Agent Security Benchmark (security). We translate each dataset into eleven diverse languages, resulting in 805 unique tasks and 9,660 total language-specific instances - enabling a systematic analysis of the multilingual effect on AI agents' performance and robustness. Empirically, we observe degradation in both performance and security when transitioning from English to other languages, with severity varying by task and correlating with the amount of translated input. Building on these findings, we provide actionable recommendations to guide agentic AI systems development and assessment under multilingual settings. This work establishes the first standardized evaluation framework for multilingual agentic AI, encouraging future research towards equitable, reliable, and accessible agentic AI. MAPS benchmark suite is publicly available at https://huggingface.co/datasets/Fujitsu-FRE/MAPS

Paperid: 1987, https://arxiv.org/pdf/2505.15833.pdf

Abstract:
Deployment of deep neural networks in resource-constrained embedded systems requires innovative algorithmic solutions to facilitate their energy and memory efficiency. To further ensure the reliability of these systems against malicious actors, recent works have extensively studied adversarial robustness of existing architectures. Our work focuses on the intersection of adversarial robustness, memory- and energy-efficiency in neural networks. We introduce a neural network conversion algorithm designed to produce sparse and adversarially robust spiking neural networks (SNNs) by leveraging the sparse connectivity and weights from a robustly pretrained artificial neural network (ANN). Our approach combines the energy-efficient architecture of SNNs with a novel conversion algorithm, leading to state-of-the-art performance with enhanced energy and memory efficiency through sparse connectivity and activations. Our models are shown to achieve up to 100x reduction in the number of weights to be stored in memory, with an estimated 8.6x increase in energy efficiency compared to dense SNNs, while maintaining high performance and robustness against adversarial threats.

Paperid: 1988, https://arxiv.org/pdf/2505.15476.pdf

Abstract:
Face recognition is an effective technology for identifying a target person by facial images. However, sensitive facial images raises privacy concerns. Although privacy-preserving face recognition is one of potential solutions, this solution neither fully addresses the privacy concerns nor is efficient enough. To this end, we propose an efficient privacy-preserving solution for face recognition, named Pura, which sufficiently protects facial privacy and supports face recognition over encrypted data efficiently. Specifically, we propose a privacy-preserving and non-interactive architecture for face recognition through the threshold Paillier cryptosystem. Additionally, we carefully design a suite of underlying secure computing protocols to enable efficient operations of face recognition over encrypted data directly. Furthermore, we introduce a parallel computing mechanism to enhance the performance of the proposed secure computing protocols. Privacy analysis demonstrates that Pura fully safeguards personal facial privacy. Experimental evaluations demonstrate that Pura achieves recognition speeds up to 16 times faster than the state-of-the-art.

Paperid: 1989, https://arxiv.org/pdf/2505.15136.pdf

Abstract:
The rapid advancement of artificial intelligence (AI) has enabled sophisticated audio generation and voice cloning technologies, posing significant security risks for applications reliant on voice authentication. While existing datasets and models primarily focus on distinguishing between human and fully synthetic speech, real-world attacks often involve audio that combines both genuine and cloned segments. To address this gap, we construct a novel hybrid audio dataset incorporating human, AI-generated, cloned, and mixed audio samples. We further propose fine-tuned Audio Spectrogram Transformer (AST)-based models tailored for detecting these complex acoustic patterns. Extensive experiments demonstrate that our approach significantly outperforms existing baselines in mixed-audio detection, achieving 97\% classification accuracy. Our findings highlight the importance of hybrid datasets and tailored models in advancing the robustness of speech-based authentication systems.

Paperid: 1990, https://arxiv.org/pdf/2505.14898.pdf

Abstract:
Network-on-Chip (NoC) enables on-chip communication between diverse cores in modern System-on-Chip (SoC) designs. With its shared communication fabric, NoC has become a focal point for various security threats, especially in heterogeneous and high-performance computing platforms. Among these attacks, Distributed Denial of Service (DDoS) attacks occur when multiple malicious entities collaborate to overwhelm and disrupt access to critical system components, potentially causing severe performance degradation or complete disruption of services. These attacks are particularly challenging to detect due to their distributed nature and dynamic traffic patterns in NoC, which often evade static detection rules or simple profiling. This paper presents a framework to conduct topology-aware detection and localization of DDoS attacks using Graph Neural Networks (GNNs) by analyzing NoC traffic patterns. Specifically, by modeling the NoC as a graph, our method utilizes spatiotemporal traffic features to effectively identify and localize DDoS attacks. Unlike prior works that rely on handcrafted features or threshold-based detection, our GNN-based approach operates directly on raw inter-flit delay data, learning complex traffic dependencies without manual intervention. Experimental results demonstrate that our approach can detect and localize DDoS attacks with high accuracy (up to 99\%) while maintaining consistent performance under diverse attack strategies. Furthermore, the proposed method exhibits strong robustness across varying numbers and placements of malicious IPs, different packet injection rates, application workloads, and architectural configurations, including both 2D mesh and 3D TSV-based NoCs. Our work provides a scalable, flexible, and architecture-agnostic defense mechanism, significantly improving the availability and trustworthiness of on-chip communication in future SoC designs.

Paperid: 1991, https://arxiv.org/pdf/2505.14797.pdf

Abstract:
Federated Learning (FL) is susceptible to privacy attacks, such as data reconstruction attacks, in which a semi-honest server or a malicious client infers information about other clients' datasets from their model updates or gradients. To enhance the privacy of FL, recent studies combined Multi-Key Homomorphic Encryption (MKHE) and FL, making it possible to aggregate the encrypted model updates using different keys without having to decrypt them. Despite the privacy guarantees of MKHE, existing approaches are not well-suited for real-world deployment due to their high computation and communication overhead. We propose MASER, an efficient MKHE-based Privacy-Preserving FL framework that combines consensus-based model pruning and slicing techniques to reduce this overhead. Our experimental results show that MASER is 3.03 to 8.29 times more efficient than existing MKHE-based FL approaches in terms of computation and communication overhead while maintaining comparable classification accuracy to standard FL algorithms. Compared to a vanilla FL algorithm, the overhead of MASER is only 1.48 to 5 times higher, striking a good balance between privacy, accuracy, and efficiency in both IID and non-IID settings.

Paperid: 1992, https://arxiv.org/pdf/2505.14616.pdf

Abstract:
Website fingerprinting (WF) is a technique that allows an eavesdropper to determine the website a target user is accessing by inspecting the metadata associated with the packets she exchanges via some encrypted tunnel, e.g., Tor. Recent WF attacks built using machine learning (and deep learning) process and summarize trace metadata during their feature extraction phases. This methodology leads to predictions that lack information about the instant at which a given website is detected within a (potentially large) network trace comprised of multiple sequential website accesses -- a setting known as \textit{multi-tab} WF. In this paper, we explore whether classical time series analysis techniques can be effective in the WF setting. Specifically, we introduce TSA-WF, a pipeline designed to closely preserve network traces' timing and direction characteristics, which enables the exploration of algorithms designed to measure time series similarity in the WF context. Our evaluation with Tor traces reveals that TSA-WF achieves a comparable accuracy to existing WF attacks in scenarios where website accesses can be easily singled-out from a given trace (i.e., the \textit{single-tab} WF setting), even when shielded by specially designed WF defenses. Finally, while TSA-WF did not outperform existing attacks in the multi-tab setting, we show how TSA-WF can help pinpoint the approximate instant at which a given website of interest is visited within a multi-tab trace.\footnote{This preprint has not undergone any post-submission improvements or corrections. The Version of Record of this contribution is published in the Proceedings of the 20th International Conference on Availability, Reliability and Security (ARES 2025)}

Paperid: 1993, https://arxiv.org/pdf/2505.14551.pdf

Abstract:
Reputation systems play an essential role in the Internet era, as they enable people to decide whom to trust, by collecting and aggregating data about users' behavior. Recently, several works proposed the use of reputation for the design and scalability improvement of decentralized (blockchain) ledgers; however, such systems are prone to manipulation and to our knowledge no game-theoretic treatment exists that can support their economic robustness. In this work we put forth a new model for the design of what we call, {\em trustworthy reputation systems}. Concretely, we describe a class of games, which we term {\em trustworthy reputation games}, that enable a set of users to report a function of their beliefs about the trustworthiness of each server in a set -- i.e., their estimate of the probability that this server will behave according to its specified strategy -- in a way that satisfies the following properties: 1. It is $(Îµ$-)best response for any rational user in the game to play a prescribed (truthful) strategy according to their true belief. 2. Assuming that the users' beliefs are not too far from the {\em true} trustworthiness of the servers, playing the above ($Îµ-$)Nash equilibrium allows anyone who observes the users' strategies to estimate the relative trustworthiness of any two servers. Our utilities and decoding function build on a connection between the well known PageRank algorithm and the problem of trustworthiness discovery, which can be of independent interest. Finally, we show how the above games are motivated by and can be leveraged in proof-of-reputation (PoR) blockchains.

Paperid: 1994, https://arxiv.org/pdf/2505.14325.pdf

Abstract:
The Cyber Resilience Act (CRA) is a new European Union (EU) regulation aimed at enhancing the security of digital products and services by ensuring they meet stringent cybersecurity requirements. This paper investigates the challenges that industrial equipment manufacturing companies anticipate while preparing for compliance with CRA through a comprehensive survey. Key findings highlight significant hurdles such as implementing secure development lifecycle practices, managing vulnerability notifications within strict timelines, and addressing gaps in cybersecurity expertise. This study provides insights into these specific challenges and offers targeted recommendations on key focus areas, such as tooling improvements, to aid industrial equipment manufacturers in their preparation for CRA compliance.

Paperid: 1995, https://arxiv.org/pdf/2505.13804.pdf

Abstract:
Securing software supply chains is a growing challenge due to the inadequacy of existing datasets in capturing the complexity of next-gen attacks, such as multiphase malware execution, remote access activation, and dynamic payload generation. Existing datasets, which rely on metadata inspection and static code analysis, are inadequate for detecting such attacks. This creates a critical gap because these datasets do not capture what happens during and after a package is installed. To address this gap, we present QUT-DV25, a dynamic analysis dataset specifically designed to support and advance research on detecting and mitigating supply chain attacks within the Python Package Index (PyPI) ecosystem. This dataset captures install and post-install-time traces from 14,271 Python packages, of which 7,127 are malicious. The packages are executed in an isolated sandbox environment using an extended Berkeley Packet Filter (eBPF) kernel and user-level probes. It captures 36 real-time features, that includes system calls, network traffic, resource usages, directory access patterns, dependency logs, and installation behaviors, enabling the study of next-gen attack vectors. ML analysis using the QUT-DV25 dataset identified four malicious PyPI packages previously labeled as benign, each with thousands of downloads. These packages deployed covert remote access and multi-phase payloads, were reported to PyPI maintainers, and subsequently removed. This highlights the practical value of QUT-DV25, as it outperforms reactive, metadata, and static datasets, offering a robust foundation for developing and benchmarking advanced threat detection within the evolving software supply chain ecosystem.

Paperid: 1996, https://arxiv.org/pdf/2505.13581.pdf

Abstract:
Content moderation for large language models (LLMs) remains a significant challenge, requiring flexible and adaptable solutions that can quickly respond to emerging threats. This paper introduces Retrieval Augmented Rejection (RAR), a novel approach that leverages a retrieval-augmented generation (RAG) architecture to dynamically reject unsafe user queries without model retraining. By strategically inserting and marking malicious documents into the vector database, the system can identify and reject harmful requests when these documents are retrieved. Our preliminary results show that RAR achieves comparable performance to embedded moderation in LLMs like Claude 3.5 Sonnet, while offering superior flexibility and real-time customization capabilities, a fundamental feature to timely address critical vulnerabilities. This approach introduces no architectural changes to existing RAG systems, requiring only the addition of specially crafted documents and a simple rejection mechanism based on retrieval results.

Paperid: 1997, https://arxiv.org/pdf/2505.13362.pdf

Abstract:
Membership Inference Attacks (MIAs) pose a significant risk to the privacy of training datasets by exploiting subtle differences in model outputs to determine whether a particular data sample was used during training. These attacks can compromise sensitive information, especially in domains such as healthcare and finance, where data privacy is paramount. Traditional mitigation techniques, such as static differential privacy, rely on injecting a fixed amount of noise during training or inference. However, this approach often leads to a detrimental trade-off: the noise may be insufficient to counter sophisticated attacks or, when increased, may substantially degrade model performance. In this paper, we present DynaNoise, an adaptive approach that dynamically modulates noise injection based on query sensitivity. Our approach performs sensitivity analysis using measures such as Shannon entropy to evaluate the risk associated with each query and adjusts the noise variance accordingly. A probabilistic smoothing step is then applied to renormalize the perturbed outputs, ensuring that the model maintains high accuracy while effectively obfuscating membership signals. We further propose an empirical metric, the Membership Inference Defense Privacy-Utility Tradeoff (MIDPUT), which quantifies the balance between reducing attack success rates and preserving the target model's accuracy. Our extensive evaluation on several benchmark datasets demonstrates that DynaNoise not only significantly reduces MIA success rates but also achieves up to a fourfold improvement in the MIDPUT metric compared to the state-of-the-art. Moreover, DynaNoise maintains competitive model accuracy while imposing only marginal inference overhead, highlighting its potential as an effective and efficient privacy defense against MIAs.

Paperid: 1998, https://arxiv.org/pdf/2505.13103.pdf

Abstract:
The rapid advancement of bug-finding techniques has led to the discovery of more vulnerabilities than developers can reasonably fix, creating an urgent need for effective Automated Program Repair (APR) methods. However, the complexity of modern bugs often makes precise root cause analysis difficult and unreliable. To address this challenge, we propose crash-site repair to simplify the repair task while still mitigating the risk of exploitation. In addition, we introduce a template-guided patch generation approach that significantly reduces the token cost of Large Language Models (LLMs) while maintaining both efficiency and effectiveness. We implement our prototype system, WILLIAMT, and evaluate it against state-of-the-art APR tools. Our results show that, when combined with the top-performing agent CodeRover-S, WILLIAMT reduces token cost by 45.9% and increases the bug-fixing rate to 73.5% (+29.6%) on ARVO, a ground-truth open source software vulnerabilities benchmark. Furthermore, we demonstrate that WILLIAMT can function effectively even without access to frontier LLMs: even a local model running on a Mac M4 Mini achieves a reasonable repair rate. These findings highlight the broad applicability and scalability of WILLIAMT.

Paperid: 1999, https://arxiv.org/pdf/2505.12688.pdf

Abstract:
In today's data-driven analytics landscape, deep learning has become a powerful tool, with latent representations, known as embeddings, playing a central role in several applications. In the face analytics domain, such embeddings are commonly used for biometric recognition (e.g., face identification). However, these embeddings, or templates, can inadvertently expose sensitive attributes such as age, gender, and ethnicity. Leaking such information can compromise personal privacy and affect civil liberty and human rights. To address these concerns, we introduce a multi-layer protection framework for embeddings. It consists of a sequence of operations: (a) encrypting embeddings using Fully Homomorphic Encryption (FHE), and (b) hashing them using irreversible feature manifold hashing. Unlike conventional encryption methods, FHE enables computations directly on encrypted data, allowing downstream analytics while maintaining strong privacy guarantees. To reduce the overhead of encrypted processing, we employ embedding compression. Our proposed method shields latent representations of sensitive data from leaking private attributes (such as age and gender) while retaining essential functional capabilities (such as face identification). Extensive experiments on two datasets using two face encoders demonstrate that our approach outperforms several state-of-the-art privacy protection methods.

Paperid: 2000, https://arxiv.org/pdf/2505.12167.pdf

Abstract:
Deep learning-based weather forecasting models have recently demonstrated significant performance improvements over gold-standard physics-based simulation tools. However, these models are vulnerable to adversarial attacks, which raises concerns about their trustworthiness. In this paper, we first investigate the feasibility of applying existing adversarial attack methods to weather forecasting models. We argue that a successful attack should (1) not modify significantly its original inputs, (2) be faithful, i.e., achieve the desired forecast at targeted locations with minimal changes to non-targeted locations, and (3) be geospatio-temporally realistic. However, balancing these criteria is a challenge as existing methods are not designed to preserve the geospatio-temporal dependencies of the original samples. To address this challenge, we propose a novel framework called FABLE (Forecast Alteration By Localized targeted advErsarial attack), which employs a 3D discrete wavelet decomposition to extract the varying components of the geospatio-temporal data. By regulating the magnitude of adversarial perturbations across different components, FABLE can generate adversarial inputs that maintain geospatio-temporal coherence while remaining faithful and closely aligned with the original inputs. Experimental results on multiple real-world datasets demonstrate the effectiveness of our framework over baseline methods across various metrics.

Paperid: 2001, https://arxiv.org/pdf/2505.12019.pdf

Abstract:
Federated learning (FL) is gaining increasing attention as an emerging collaborative machine learning approach, particularly in the context of large-scale computing and data systems. However, the fundamental algorithm of FL, Federated Averaging (FedAvg), is susceptible to backdoor attacks. Although researchers have proposed numerous defense algorithms, two significant challenges remain. The attack is becoming more stealthy and harder to detect, and current defense methods are unable to handle 50\% or more malicious users or assume an auxiliary server dataset. To address these challenges, we propose a novel defense algorithm, FL-PLAS, \textbf{F}ederated \textbf{L}earning based on \textbf{P}artial\textbf{ L}ayer \textbf{A}ggregation \textbf{S}trategy. In particular, we divide the local model into a feature extractor and a classifier. In each iteration, the clients only upload the parameters of a feature extractor after local training. The server then aggregates these local parameters and returns the results to the clients. Each client retains its own classifier layer, ensuring that the backdoor labels do not impact other clients. We assess the effectiveness of FL-PLAS against state-of-the-art (SOTA) backdoor attacks on three image datasets and compare our approach to six defense strategies. The results of the experiment demonstrate that our methods can effectively protect local models from backdoor attacks. Without requiring any auxiliary dataset for the server, our method achieves a high main-task accuracy with a lower backdoor accuracy even under the condition of 90\% malicious users with the attacks of trigger, semantic and edge-case.

Paperid: 2002, https://arxiv.org/pdf/2505.11365.pdf

Abstract:
Ensuring the safety of large language models (LLMs) is critical for responsible deployment, yet existing evaluations often prioritize performance over identifying failure modes. We introduce Phare, a multilingual diagnostic framework to probe and evaluate LLM behavior across three critical dimensions: hallucination and reliability, social biases, and harmful content generation. Our evaluation of 17 state-of-the-art LLMs reveals patterns of systematic vulnerabilities across all safety dimensions, including sycophancy, prompt sensitivity, and stereotype reproduction. By highlighting these specific failure modes rather than simply ranking models, Phare provides researchers and practitioners with actionable insights to build more robust, aligned, and trustworthy language systems.

Paperid: 2003, https://arxiv.org/pdf/2505.10430.pdf

Abstract:
We study the security of stock price forecasting using Deep Learning (DL) in computational finance. Despite abundant prior research on the vulnerability of DL to adversarial perturbations, such work has hitherto hardly addressed practical adversarial threat models in the context of DL-powered algorithmic trading systems (ATS). Specifically, we investigate the vulnerability of ATS to adversarial perturbations launched by a realistically constrained attacker. We first show that existing literature has paid limited attention to DL security in the financial domain, which is naturally attractive for adversaries. Then, we formalize the concept of ephemeral perturbations (EP), which can be used to stage a novel type of attack tailored for DL-based ATS. Finally, we carry out an end-to-end evaluation of our EP against a profitable ATS. Our results reveal that the introduction of small changes to the input stock prices not only (i) induces the DL model to behave incorrectly but also (ii) leads the whole ATS to make suboptimal buy/sell decisions, resulting in a worse financial performance of the targeted ATS.

Paperid: 2004, https://arxiv.org/pdf/2505.10297.pdf

Abstract:
Federated learning (FL) enhances privacy and reduces communication cost for resource-constrained edge clients by supporting distributed model training at the edge. However, the heterogeneous nature of such devices produces diverse, non-independent, and identically distributed (non-IID) data, making the detection of backdoor attacks more challenging. In this paper, we propose a novel federated representative-attention-based defense mechanism, named FeRA, that leverages cross-client attention over internal feature representations to distinguish benign from malicious clients. FeRA computes an anomaly score based on representation reconstruction errors, effectively identifying clients whose internal activations significantly deviate from the group consensus. Our evaluation demonstrates FeRA's robustness across various FL scenarios, including challenging non-IID data distributions typical of edge devices. Experimental results show that it effectively reduces backdoor attack success rates while maintaining high accuracy on the main task. The method is model-agnostic, attack-agnostic, and does not require labeled reference data, making it well suited to heterogeneous and resource-limited edge deployments.

Paperid: 2005, https://arxiv.org/pdf/2505.09501.pdf

Abstract:
The widespread use of smartphones in daily life has raised concerns about privacy and security among researchers and practitioners. Privacy issues are generally highly prevalent in mobile applications, particularly targeting the Android platform, the most popular mobile operating system. For this reason, several techniques have been proposed to identify malicious behavior in Android applications, including the Mining Android Sandbox approach (MAS approach), which aims to identify malicious behavior in repackaged Android applications (apps). However, previous empirical studies evaluated the MAS approach using a small dataset consisting of only 102 pairs of original and repackaged apps. This limitation raises questions about the external validity of their findings and whether the MAS approach can be generalized to larger datasets. To address these concerns, this paper presents the results of a replication study focused on evaluating the performance of the MAS approach regarding its capabilities of correctly classifying malware from different families. Unlike previous studies, our research employs a dataset that is an order of magnitude larger, comprising 4,076 pairs of apps covering a more diverse range of Android malware families. Surprisingly, our findings indicate a poor performance of the MAS approach for identifying malware, with the F1-score decreasing from 0.90 for the small dataset used in the previous studies to 0.54 in our more extensive dataset. Upon closer examination, we discovered that certain malware families partially account for the low accuracy of the MAS approach, which fails to classify a repackaged version of an app as malware correctly. Our findings highlight the limitations of the MAS approach, particularly when scaled, and underscore the importance of complementing it with other techniques to detect a broader range of malware effectively.

Paperid: 2006, https://arxiv.org/pdf/2505.08837.pdf

Abstract:
The security of cloud environments, such as Amazon Web Services (AWS), is complex and dynamic. Static security policies have become inadequate as threats evolve and cloud resources exhibit elasticity [1]. This paper addresses the limitations of static policies by proposing a security policy management framework that uses reinforcement learning (RL) to adapt dynamically. Specifically, we employ deep reinforcement learning algorithms, including deep Q Networks and proximal policy optimization, enabling the learning and continuous adjustment of controls such as firewall rules and Identity and Access Management (IAM) policies. The proposed RL based solution leverages cloud telemetry data (AWS Cloud Trail logs, network traffic data, threat intelligence feeds) to continuously refine security policies, maximizing threat mitigation, and compliance while minimizing resource impact. Experimental results demonstrate that our adaptive RL based framework significantly outperforms static policies, achieving higher intrusion detection rates (92% compared to 82% for static policies) and substantially reducing incident detection and response times by 58%. In addition, it maintains high conformity with security requirements and efficient resource usage. These findings validate the effectiveness of adaptive reinforcement learning approaches in improving cloud security policy management.

Paperid: 2007, https://arxiv.org/pdf/2505.08830.pdf

Abstract:
The integration of Large Language Models (LLMs) and Federated Learning (FL) presents a promising solution for joint training on distributed data while preserving privacy and addressing data silo issues. However, this emerging field, known as Federated Large Language Models (FLLM), faces significant challenges, including communication and computation overheads, heterogeneity, privacy and security concerns. Current research has primarily focused on the feasibility of FLLM, but future trends are expected to emphasize enhancing system robustness and security. This paper provides a comprehensive review of the latest advancements in FLLM, examining challenges from four critical perspectives: feasibility, robustness, security, and future directions. We present an exhaustive survey of existing studies on FLLM feasibility, introduce methods to enhance robustness in the face of resource, data, and task heterogeneity, and analyze novel risks associated with this integration, including privacy threats and security challenges. We also review the latest developments in defense mechanisms and explore promising future research directions, such as few-shot learning, machine unlearning, and IP protection. This survey highlights the pressing need for further research to enhance system robustness and security while addressing the unique challenges posed by the integration of FL and LLM.

Paperid: 2008, https://arxiv.org/pdf/2505.07724.pdf

Abstract:
WiFi fingerprint-based indoor localization schemes deliver highly accurate location data by matching the received signal strength indicator (RSSI) with an offline database using machine learning (ML) or deep learning (DL) models. However, over time, RSSI values degrade due to the malicious behavior of access points (APs), causing low positional accuracy due to RSSI value mismatch with the offline database. Existing literature lacks detection of malicious APs in the online phase and mitigating their effects. This research addresses these limitations and proposes a long-term reliable indoor localization scheme by incorporating malicious AP detection and their effect mitigation techniques. The proposed scheme uses a Light Gradient-Boosting Machine (LGBM) classifier to estimate locations and integrates simple yet efficient techniques to detect malicious APs based on online query data. Subsequently, a mitigation technique is incorporated that updates the offline database and online queries by imputing stable values for malicious APs using LGBM Regressors. Additionally, we introduce a noise addition mechanism in the offline database to capture the dynamic environmental effects. Extensive experimental evaluation shows that the proposed scheme attains a detection accuracy above 95% for each attack type. The mitigation strategy effectively restores the system's performance nearly to its original state when no malicious AP is present. The noise addition module reduces localization errors by nearly 16%. Furthermore, the proposed solution is lightweight, reducing the execution time by approximately 94% compared to the existing methods.

Paperid: 2009, https://arxiv.org/pdf/2505.06827.pdf

Abstract:
Watermarking AI-generated text is critical for combating misuse. Yet recent theoretical work argues that any watermark can be erased via random walk attacks that perturb text while preserving quality. However, such attacks rely on two key assumptions: (1) rapid mixing (watermarks dissolve quickly under perturbations) and (2) reliable quality preservation (automated quality oracles perfectly guide edits). Through large-scale experiments and human-validated assessments, we find mixing is slow: 100% of perturbed texts retain traces of their origin after hundreds of edits, defying rapid mixing. Oracles falter, as state-of-the-art quality detectors misjudge edits (77% accuracy), compounding errors during attacks. Ultimately, attacks underperform: automated walks remove watermarks just 26% of the time -- dropping to 10% under human quality review. These findings challenge the inevitability of watermark removal. Instead, practical barriers -- slow mixing and imperfect quality control -- reveal watermarking to be far more robust than theoretical models suggest. The gap between idealized attacks and real-world feasibility underscores the need for stronger watermarking methods and more realistic attack models.

Paperid: 2010, https://arxiv.org/pdf/2505.06305.pdf

Abstract:
With the widespread application of large language models (LLMs), user privacy protection has become a significant research topic. Existing privacy preference modeling methods often rely on large-scale user data, making effective privacy preference analysis challenging in data-limited environments. This study explores how LLMs can analyze user behavior related to privacy protection in scenarios with limited data and proposes a method that integrates Few-shot Learning and Privacy Computing to model user privacy preferences. The research utilizes anonymized user privacy settings data, survey responses, and simulated data, comparing the performance of traditional modeling approaches with LLM-based methods. Experimental results demonstrate that, even with limited data, LLMs significantly improve the accuracy of privacy preference modeling. Additionally, incorporating Differential Privacy and Federated Learning further reduces the risk of user data exposure. The findings provide new insights into the application of LLMs in privacy protection and offer theoretical support for advancing privacy computing and user behavior analysis.

Paperid: 2011, https://arxiv.org/pdf/2505.05922.pdf

Abstract:
Large Language Models (LLMs) have gained significant popularity due to their remarkable capabilities in text understanding and generation. However, despite their widespread deployment in inference services such as ChatGPT, concerns about the potential leakage of sensitive user data have arisen. Existing solutions primarily rely on privacy-enhancing technologies to mitigate such risks, facing the trade-off among efficiency, privacy, and utility. To narrow this gap, we propose Cape, a context-aware prompt perturbation mechanism based on differential privacy, to enable efficient inference with an improved privacy-utility trade-off. Concretely, we introduce a hybrid utility function that better captures the token similarity. Additionally, we propose a bucketized sampling mechanism to handle large sampling space, which might lead to long-tail phenomenons. Extensive experiments across multiple datasets, along with ablation studies, demonstrate that Cape achieves a better privacy-utility trade-off compared to prior state-of-the-art works.

Paperid: 2012, https://arxiv.org/pdf/2505.04799.pdf

Abstract:
Multi-agent collaboration systems (MACS), powered by large language models (LLMs), solve complex problems efficiently by leveraging each agent's specialization and communication between agents. However, the inherent exchange of information between agents and their interaction with external environments, such as LLM, tools, and users, inevitably introduces significant risks of sensitive data leakage, including vulnerabilities to attacks such as eavesdropping and prompt injection. Existing MACS lack fine-grained data protection controls, making it challenging to manage sensitive information securely. In this paper, we take the first step to mitigate the MACS's data leakage threat through a privacy-enhanced MACS development paradigm, Maris. Maris enables rigorous message flow control within MACS by embedding reference monitors into key multi-agent conversation components. We implemented Maris as an integral part of widely-adopted open-source multi-agent development frameworks, AutoGen and LangChain. To evaluate its effectiveness, we develop a Privacy Assessment Framework that emulates MACS under different threat scenarios. Our evaluation shows that Maris effectively mitigated sensitive data leakage threats across three different task suites while maintaining a high task success rate.

Paperid: 2013, https://arxiv.org/pdf/2505.03863.pdf

Abstract:
Cyber-Physical Systems (CPS) are abundant in safety-critical domains such as healthcare, avionics, and autonomous vehicles. Formal verification of their operational safety is, therefore, of utmost importance. In this paper, we address the falsification problem, where the focus is on searching for an unsafe execution in the system instead of proving their absence. The contribution of this paper is a framework that (a) connects the falsification of CPS with the falsification of deep neural networks (DNNs) and (b) leverages the inherent interpretability of Decision Trees for faster falsification of CPS. This is achieved by: (1) building a surrogate model of the CPS under test, either as a DNN model or a Decision Tree, (2) application of various DNN falsification tools to falsify CPS, and (3) a novel falsification algorithm guided by the explanations of safety violations of the CPS model extracted from its Decision Tree surrogate. The proposed framework has the potential to exploit a repertoire of \emph{adversarial attack} algorithms designed to falsify robustness properties of DNNs, as well as state-of-the-art falsification algorithms for DNNs. Although the presented methodology is applicable to systems that can be executed/simulated in general, we demonstrate its effectiveness, particularly in CPS. We show that our framework, implemented as a tool \textsc{FlexiFal}, can detect hard-to-find counterexamples in CPS that have linear and non-linear dynamics. Decision tree-guided falsification shows promising results in efficiently finding multiple counterexamples in the ARCH-COMP 2024 falsification benchmarks~\cite{khandait2024arch}.

Paperid: 2014, https://arxiv.org/pdf/2505.03768.pdf

Abstract:
To meet non-functional requirements, practitioners must identify Pareto-optimal configurations of the degree of decentralization, scalability, and security of blockchain systems. Maximizing all of these subconcepts is, however, impossible due to the trade-offs highlighted by the blockchain trilemma. We reviewed analysis approaches to identify constructs and their operationalization through metrics for analyzing the blockchain trilemma subconcepts and to assess the applicability of the operationalized constructs to various blockchain systems. By clarifying these constructs and metrics, this work offers a theoretical foundation for more sophisticated investigations into how the blockchain trilemma manifests in blockchain systems, helping practitioners identify Pareto-optimal configurations.

Paperid: 2015, https://arxiv.org/pdf/2505.01811.pdf

Abstract:
As Deep Neural Networks (DNNs) continue to require larger amounts of data and computational power, Mixture of Experts (MoE) models have become a popular choice to reduce computational complexity. This popularity increases the importance of considering the security of MoE architectures. Unfortunately, the security of models using a MoE architecture has not yet gained much attention compared to other DNN models. In this work, we investigate the vulnerability of patch-based MoE (pMoE) models for image classification against backdoor attacks. We examine multiple trigger generation methods and Fine-Pruning as a defense. To better understand a pMoE model's vulnerability to backdoor attacks, we investigate which factors affect the model's patch selection. Our work shows that pMoE models are highly susceptible to backdoor attacks. More precisely, we achieve high attack success rates of up to 100% with visible triggers and a 2% poisoning rate, whilst only having a clean accuracy drop of 1.0%. Additionally, we show that pruning itself is ineffective as a defense but that fine-tuning can remove the backdoor almost completely. Our results show that fine-tuning the model for five epochs reduces the attack success rate to 2.1% whilst sacrificing 1.4% accuracy.

Paperid: 2016, https://arxiv.org/pdf/2505.01489.pdf

Abstract:
This paper presents our simulation of cyber-attacks and detection strategies on the traffic control system in Daytona Beach, FL. using Raspberry Pi virtual machines and the OPNSense firewall, along with traffic dynamics from SUMO and exploitation via the Metasploit framework. We try to answer the research questions: are we able to identify cyber attacks by only analyzing traffic flow patterns. In this research, the cyber attacks are focused particularly when lights are randomly turned all green or red at busy intersections by adversarial attackers. Despite challenges stemming from imbalanced data and overlapping traffic patterns, our best model shows 85\% accuracy when detecting intrusions purely using traffic flow statistics. Key indicators for successful detection included occupancy, jam length, and halting durations.

Paperid: 2017, https://arxiv.org/pdf/2505.01488.pdf

Abstract:
The increasing automation of traffic management systems has made them prime targets for cyberattacks, disrupting urban mobility and public safety. Traditional network-layer defenses are often inaccessible to transportation agencies, necessitating a machine learning-based approach that relies solely on traffic flow data. In this study, we simulate cyberattacks in a semi-realistic environment, using a virtualized traffic network to analyze disruption patterns. We develop a deep learning-based anomaly detection system, demonstrating that Longest Stop Duration and Total Jam Distance are key indicators of compromised signals. To enhance interpretability, we apply Explainable AI (XAI) techniques, identifying critical decision factors and diagnosing misclassification errors. Our analysis reveals two primary challenges: transitional data inconsistencies, where mislabeled recovery-phase traffic misleads the model, and model limitations, where stealth attacks in low-traffic conditions evade detection. This work enhances AI-driven traffic security, improving both detection accuracy and trustworthiness in smart transportation systems.

Paperid: 2018, https://arxiv.org/pdf/2505.00618.pdf

Abstract:
Network attackers have increasingly resorted to proxy chains, VPNs, and anonymity networks to conceal their activities. To tackle this issue, past research has explored the applicability of traffic correlation techniques to perform attack attribution, i.e., to identify an attacker's true network location. However, current traffic correlation approaches rely on well-provisioned and centralized systems that ingest flows from multiple network probes to compute correlation scores. Unfortunately, this makes correlation efforts scale poorly for large high-speed networks. In this paper, we propose RevealNet, a decentralized framework for attack attribution that orchestrates a fleet of P4-programmable switches to perform traffic correlation. RevealNet builds on a set of correlation primitives inspired by prior work on computing and comparing flow sketches -- compact summaries of flows' key characteristics -- to enable efficient, distributed, in-network traffic correlation. Our evaluation suggests that RevealNet achieves comparable accuracy to centralized attack attribution systems while significantly reducing both the computational complexity and bandwidth overheads imposed by correlation tasks.

Paperid: 2019, https://arxiv.org/pdf/2504.21518.pdf

Abstract:
Although serverless computing offers compelling cost and deployment simplicity advantages, a significant challenge remains in securely managing sensitive data as it flows through the network of ephemeral function executions in serverless computing environments within untrusted clouds. While Confidential Virtual Machines (CVMs) offer a promising secure execution environment, their integration with serverless architectures currently faces fundamental limitations in key areas: security, performance, and resource efficiency. We present WALLET, a confidential computing system for secure serverless deployments to overcome these limitations. By employing nested confidential execution and a decoupled guest OS within CVMs, WALLET runs each function in a minimal "trustlet", significantly improving security through a reduced Trusted Computing Base (TCB). Furthermore, by leveraging a data-centric I/O architecture built upon a lightweight LibOS, WALLET optimizes network communication to address performance and resource efficiency challenges. Our evaluation shows that compared to CVM-based deployments, WALLET has 4.3x smaller TCB, improves end-to-end latency (15-93%), achieves higher function density (up to 907x), and reduces inter-function communication (up to 27x) and function chaining latency (16.7-30.2x); thus, WALLET offers a practical system for confidential serverless computing.

Paperid: 2020, https://arxiv.org/pdf/2504.21415.pdf

Abstract:
User authentication is essential to ensure secure access to computer systems, yet traditional methods face limitations in usability, cost, and security. Mouse dynamics authentication, based on the analysis of users' natural interaction behaviors with mouse devices, offers a cost-effective, non-intrusive, and adaptable solution. However, challenges remain in determining the optimal data volume, balancing accuracy and practicality, and effectively capturing temporal behavioral patterns. In this study, we propose a statistical method using Gaussian kernel density estimate (KDE) and Kullback-Leibler (KL) divergence to estimate the sufficient data volume for training authentication models. We introduce the Mouse Authentication Unit (MAU), leveraging Approximate Entropy (ApEn) to optimize segment length for efficient and accurate behavioral representation. Furthermore, we design the Local-Time Mouse Authentication (LT-AMouse) framework, integrating 1D-ResNet for local feature extraction and GRU for modeling long-term temporal dependencies. Taking the Balabit and DFL datasets as examples, we significantly reduced the data scale, particularly by a factor of 10 for the DFL dataset, greatly alleviating the training burden. Additionally, we determined the optimal input recognition unit length for the user authentication system on different datasets based on the slope of Approximate Entropy. Training with imbalanced samples, our model achieved a successful defense AUC 98.52% for blind attack on the DFL dataset and 94.65% on the Balabit dataset, surpassing the current sota performance.

Paperid: 2021, https://arxiv.org/pdf/2504.21413.pdf

Abstract:
Buffered Linear Toeplitz (BLT) matrices are a family of parameterized lower-triangular matrices that play an important role in streaming differential privacy with correlated noise. Our main result is a BLT inversion theorem: the inverse of a BLT matrix is itself a BLT matrix with different parameters. We also present an efficient and differentiable $O(d^3)$ algorithm to compute the parameters of the inverse BLT matrix, where $d$ is the degree of the original BLT (typically $d < 10$). Our characterization enables direct optimization of BLT parameters for privacy mechanisms through automatic differentiation.

Paperid: 2022, https://arxiv.org/pdf/2504.21054.pdf

Abstract:
Backdoor attacks pose a significant threat to deep neural networks, as backdoored models would misclassify poisoned samples with specific triggers into target classes while maintaining normal performance on clean samples. Among these, multi-target backdoor attacks can simultaneously target multiple classes. However, existing multi-target backdoor attacks all follow the dirty-label paradigm, where poisoned samples are mislabeled, and most of them require an extremely high poisoning rate. This makes them easily detectable by manual inspection. In contrast, clean-label attacks are more stealthy, as they avoid modifying the labels of poisoned samples. However, they generally struggle to achieve stable and satisfactory attack performance and often fail to scale effectively to multi-target attacks. To address this issue, we propose the Feature-based Full-target Clean-label Backdoor Attacks (FFCBA) which consists of two paradigms: Feature-Spanning Backdoor Attacks (FSBA) and Feature-Migrating Backdoor Attacks (FMBA). FSBA leverages class-conditional autoencoders to generate noise triggers that align perturbed in-class samples with the original category's features, ensuring the effectiveness, intra-class consistency, inter-class specificity and natural-feature correlation of triggers. While FSBA supports swift and efficient attacks, its cross-model attack capability is relatively weak. FMBA employs a two-stage class-conditional autoencoder training process that alternates between using out-of-class samples and in-class samples. This allows FMBA to generate triggers with strong target-class features, making it highly effective for cross-model attacks. We conduct experiments on multiple datasets and models, the results show that FFCBA achieves outstanding attack performance and maintains desirable robustness against the state-of-the-art backdoor defenses.

Paperid: 2023, https://arxiv.org/pdf/2504.21052.pdf

Abstract:
Multi-target backdoor attacks pose significant security threats to deep neural networks, as they can preset multiple target classes through a single backdoor injection. This allows attackers to control the model to misclassify poisoned samples with triggers into any desired target class during inference, exhibiting superior attack performance compared with conventional backdoor attacks. However, existing multi-target backdoor attacks fail to guarantee trigger specificity and stealthiness in black-box settings, resulting in two main issues. First, they are unable to simultaneously target all classes when only training data can be manipulated, limiting their effectiveness in realistic attack scenarios. Second, the triggers often lack visual imperceptibility, making poisoned samples easy to detect. To address these problems, we propose a Spatial-based Full-target Invisible Backdoor Attack, called SFIBA. It restricts triggers for different classes to specific local spatial regions and morphologies in the pixel space to ensure specificity, while employing a frequency-domain-based trigger injection method to guarantee stealthiness. Specifically, for injection of each trigger, we first apply fast fourier transform to obtain the amplitude spectrum of clean samples in local spatial regions. Then, we employ discrete wavelet transform to extract the features from the amplitude spectrum and use singular value decomposition to integrate the trigger. Subsequently, we selectively filter parts of the trigger in pixel space to implement trigger morphology constraints and adjust injection coefficients based on visual effects. We conduct experiments on multiple datasets and models. The results demonstrate that SFIBA can achieve excellent attack performance and stealthiness, while preserving the model's performance on benign samples, and can also bypass existing backdoor defenses.

Paperid: 2024, https://arxiv.org/pdf/2504.21008.pdf

Abstract:
With the widespread adoption of the Internet of Things (IoT) and Industrial IoT (IIoT) technologies, network architectures have become increasingly complex, and the volume of traffic has grown substantially. This evolution poses significant challenges to traditional security mechanisms, particularly in detecting high-frequency, diverse, and highly covert network attacks. To address these challenges, this study proposes a novel network traffic anomaly detection model that integrates a Convolutional Neural Network (CNN) with a Bidirectional Long Short-Term Memory (BiLSTM) network, implemented on the MindSpore framework. Comprehensive experiments were conducted using the NF-BoT-IoT dataset. The results demonstrate that the proposed model achieves 99% across accuracy, precision, recall, and F1-score, indicating its strong performance and robustness in network intrusion detection tasks.

Paperid: 2025, https://arxiv.org/pdf/2504.20941.pdf

Abstract:
Differential Privacy (DP) enables privacy-preserving data analysis by adding calibrated noise. While recent works extend DP to curved manifolds (e.g., diffusion-tensor MRI, social networks) by adding geodesic noise, these assume uniform data distribution. This assumption is not always practical, hence these approaches may introduce biased noise and suboptimal privacy-utility trade-offs for non-uniform data. To address this issue, we propose \emph{Conformal}-DP that utilizes conformal transformations on Riemannian manifolds. This approach locally equalizes sample density and redefines geodesic distances while preserving intrinsic manifold geometry. Our theoretical analysis demonstrates that the conformal factor, which is derived from local kernel density estimates, is data density-aware. We show that under these conformal metrics, \emph{Conformal}-DP satisfies $\varepsilon$-differential privacy on any complete Riemannian manifold and offers a closed-form expected geodesic error bound dependent only on the maximal density ratio, and not global curvature. We show through experiments on synthetic and real-world datasets that our mechanism achieves superior privacy-utility trade-offs, particularly for heterogeneous manifold data, and also is beneficial for homogeneous datasets.

Paperid: 2026, https://arxiv.org/pdf/2504.20926.pdf

Abstract:
With the increasing importance of data privacy, Local Differential Privacy (LDP) has recently become a strong measure of privacy for protecting each user's privacy from data analysts without relying on a trusted third party. In many cases, both data providers and data analysts hope to maximize the utility of released data. In this paper, we study the fundamental trade-off formulated as a constrained optimization problem: maximizing data utility subject to the constraint of LDP budgets. In particular, the Generalized Randomized Response (GRR) treats all discrete data equally except for the true data. For this, we introduce an adaptive LDP mechanism called Bipartite Randomized Response (BRR), which solves the above privacy-utility maximization problem from the global standpoint. We prove that for any utility function and any privacy level, solving the maximization problem is equivalent to confirming how many high-utility data to be treated equally as the true data on release probability, the outcome of which gives the optimal randomized response. Further, solving this linear program can be computationally cheap in theory. Several examples of utility functions defined by distance metrics and applications in decision trees and deep learning are presented. The results of various experiments show that our BRR significantly outperforms the state-of-the-art LDP mechanisms of both continuous and distributed types.

Paperid: 2027, https://arxiv.org/pdf/2504.20485.pdf

Abstract:
Java deserialization gadget chains are a well-researched critical software weakness. The vast majority of known gadget chains rely on gadgets from software dependencies. Furthermore, it has been shown that small code changes in dependencies have enabled these gadget chains. This makes gadget chain detection a purely reactive endeavor. Even if one dependency's deployment pipeline employs gadget chain detection, a gadget chain can still result from gadgets in other dependencies. In this work, we assess how likely small code changes are to enable a gadget chain. These changes could either be accidental or intentional as part of a supply chain attack. Specifically, we show that class serializability is a strongly fluctuating property over a dependency's evolution. Then, we investigate three change patterns by which an attacker could stealthily introduce gadgets into a dependency. We apply these patterns to 533 dependencies and run three state-of-the-art gadget chain detectors both on the original and the modified dependencies. The tools detect that applying the modification patterns can activate/inject gadget chains in 26.08% of the dependencies we selected. Finally, we verify the newly detected chains. As such, we identify dormant gadget chains in 53 dependencies that could be added through minor code modifications. This both shows that Java deserialization gadget chains are a broad liability to software and proves dormant gadget chains as a lucrative supply chain attack vector.

Paperid: 2028, https://arxiv.org/pdf/2504.19821.pdf

Abstract:
Cryptographic research takes software timing side channels seriously. Approaches to mitigate them include constant-time coding and techniques to enforce such practices. However, recent attacks like Meltdown [42], Spectre [37], and Hertzbleed [70] have challenged our understanding of what it means for code to execute in constant time on modern CPUs. To ensure that assumptions on the underlying hardware are correct and to create a complete feedback loop, developers should also perform \emph{timing measurements} as a final validation step to ensure the absence of exploitable side channels. Unfortunately, as highlighted by a recent study by Jancar et al. [30], developers often avoid measurements due to the perceived unreliability of the statistical analysis and its guarantees. In this work, we combat the view that statistical techniques only provide weak guarantees by introducing a new algorithm for the analysis of timing measurements with strong, formal statistical guarantees, giving developers a reliable analysis tool. Specifically, our algorithm (1) is non-parametric, making minimal assumptions about the underlying distribution and thus overcoming limitations of classical tests like the t-test, (2) handles unknown data dependencies in measurements, (3) can estimate in advance how many samples are needed to detect a leak of a given size, and (4) allows the definition of a negligible leak threshold $Î$, ensuring that acceptable non-exploitable leaks do not trigger false positives, without compromising statistical soundness. We demonstrate the necessity, effectiveness, and benefits of our approach on both synthetic benchmarks and real-world applications.

Paperid: 2029, https://arxiv.org/pdf/2504.18990.pdf

Abstract:
Drivers are becoming increasingly reliant on advanced driver assistance systems (ADAS) as autonomous driving technology becomes more popular and developed with advanced safety features to enhance road safety. However, the increasing complexity of the ADAS makes autonomous vehicles (AVs) more exposed to attacks and accidental faults. In this paper, we evaluate the resilience of a widely used ADAS against safety-critical attacks that target perception inputs. Various safety mechanisms are simulated to assess their impact on mitigating attacks and enhancing ADAS resilience. Experimental results highlight the importance of timely intervention by human drivers and automated safety mechanisms in preventing accidents in both driving and lateral directions and the need to resolve conflicts among safety interventions to enhance system resilience and reliability.

Paperid: 2030, https://arxiv.org/pdf/2504.18375.pdf

Abstract:
Public information contains valuable Cyber Threat Intelligence (CTI) that is used to prevent future attacks. While standards exist for sharing this information, much appears in non-standardized news articles or blogs. Monitoring online sources for threats is time-consuming and source selection is uncertain. Current research focuses on extracting Indicators of Compromise from known sources, rarely addressing new source identification. This paper proposes a CTI-focused crawler using multi-armed bandit (MAB) and various crawling strategies. It employs SBERT to identify relevant documents while dynamically adapting its crawling path. Our system ThreatCrawl achieves a harvest rate exceeding 25% and expands its seed by over 300% while maintaining topical focus. Additionally, the crawler identifies previously unknown but highly relevant overview pages, datasets, and domains.

Paperid: 2031, https://arxiv.org/pdf/2504.17198.pdf

Abstract:
Today's security tools predominantly rely on predefined rules crafted by experts, making them poorly adapted to the emergence of software supply chain attacks. To tackle this limitation, we propose a novel tool, RuleLLM, which leverages large language models (LLMs) to automate rule generation for OSS ecosystems. RuleLLM extracts metadata and code snippets from malware as its input, producing YARA and Semgrep rules that can be directly deployed in software development. Specifically, the rule generation task involves three subtasks: crafting rules, refining rules, and aligning rules. To validate RuleLLM's effectiveness, we implemented a prototype system and conducted experiments on the dataset of 1,633 malicious packages. The results are promising that RuleLLM generated 763 rules (452 YARA and 311 Semgrep) with a precision of 85.2\% and a recall of 91.8\%, outperforming state-of-the-art (SOTA) tools and scored-based approaches. We further analyzed generated rules and proposed a rule taxonomy: 11 categories and 38 subcategories.

Paperid: 2032, https://arxiv.org/pdf/2504.17185.pdf

Abstract:
This work presents a joint design of encoding and encryption procedures for public key encryptions (PKEs) and key encapsulation mechanism (KEMs) such as Kyber, without relying on the assumption of independent decoding noise components, achieving reductions in both communication overhead (CER) and decryption failure rate (DFR). Our design features two techniques: ciphertext packing and lattice packing. First, we extend the Peikert-Vaikuntanathan-Waters (PVW) method to Kyber: $\ell$ plaintexts are packed into a single ciphertext. This scheme is referred to as P$_\ell$-Kyber. We prove that the P$_\ell$-Kyber is IND-CCA secure under the M-LWE hardness assumption. We show that the decryption decoding noise entries across the $\ell$ plaintexts (also known as layers) are mutually independent. Second, we propose a cross-layer lattice encoding scheme for the P$_\ell$-Kyber, where every $\ell$ cross-layer information symbols are encoded to a lattice point. This way we obtain a \emph{coded} P$_\ell$-Kyber, where the decoding noise entries for each lattice point are mutually independent. Therefore, the DFR analysis does not require the assumption of independence among the decryption decoding noise entries. Both DFR and CER are greatly decreased thanks to ciphertext packing and lattice packing. We demonstrate that with $\ell=24$ and Leech lattice encoder, the proposed coded P$_\ell$-KYBER1024 achieves DFR $<2^{-281}$ and CER $ = 4.6$, i.e., a decrease of CER by $90\%$ compared to KYBER1024.

Paperid: 2033, https://arxiv.org/pdf/2504.17059.pdf

Abstract:
As cybersecurity threats continue to evolve, the need for advanced tools to analyze and understand complex cyber environments has become increasingly critical. Graph theory offers a powerful framework for modeling relationships within cyber ecosystems, making it highly applicable to cybersecurity. This paper focuses on the development of an enriched version of the widely recognized NSL-KDD dataset, incorporating graph-theoretical concepts to enhance its practical value. The enriched dataset provides a resource for students and professionals to engage in hands-on analysis, enabling them to explore graph-based methodologies for identifying network behavior and vulnerabilities. To validate the effectiveness of this dataset, we employed IBM Auto AI, demonstrating its capability in real-world applications such as classification and threat prediction. By addressing the need for graph-theoretical datasets, this study provides a practical tool for equipping future cybersecurity professionals with the skills necessary to confront complex cyber challenges.

Paperid: 2034, https://arxiv.org/pdf/2504.16743.pdf

Abstract:
A Software Bill of Materials (SBOM) is becoming an increasingly important tool in regulatory and technical spaces to introduce more transparency and security into a project's software supply chain. Artificial intelligence (AI) projects face unique challenges beyond the security of their software, and thus require a more expansive approach to a bill of materials. In this report, we introduce the concept of an AI-BOM, expanding on the SBOM to include the documentation of algorithms, data collection methods, frameworks and libraries, licensing information, and standard compliance.

Paperid: 2035, https://arxiv.org/pdf/2504.16407.pdf

Abstract:
Quantum fire was recently formalized by Bostanci, Nehoran and Zhandry (STOC 25). This notion considers a distribution of quantum states that can be efficiently cloned, but cannot be converted into a classical string. Previously, work of Nehoran and Zhandry (ITCS 24) showed how to construct quantum fire relative to an inefficient unitary oracle. Later, the work of Bostanci, Nehoran, Zhandry gave a candidate construction based on group action assumptions, and proved the correctness of their scheme; however, even in the classical oracle model they only conjectured the security, and no security proof was given. In this work, we give the first construction of public-key quantum fire relative to a classical oracle, and prove its security unconditionally. This gives the first classical oracle seperation between the two fundamental principles of quantum mechanics that are equivalent in the information-theoretic setting: no-cloning and no-telegraphing. Going further, we introduce a stronger notion called quantum key-fire where the clonable fire states can be used to run a functionality (such as a signing or decryption key), and prove a secure construction relative to a classical oracle. As an application of this notion, we get the first public-key encryption scheme whose secret key is clonable but satisfies unbounded leakage-resilience (Cakan, Goyal, Liu-Zhang, Ribeiro [TCC 24]), relative to a classical oracle. Unbounded leakage-resilience is closely related to, and can be seen as a generalization of the notion of no-telegraphing. For all of our constructions, the oracles can be made efficient (i.e. polynomial time), assuming the existence of post-quantum one-way functions.

Paperid: 2036, https://arxiv.org/pdf/2504.14220.pdf

Abstract:
Incident management is a classical topic in cyber security. Recently, the European Union (EU) has started to consider also the relation between cyber security incidents and cyber security crises. These considerations and preparations, including those specified in the EU's new cyber security laws, constitute the paper's topic. According to an analysis of the laws and associated policy documents, (i) cyber security crises are equated in the EU to large-scale cyber security incidents that either exceed a handling capacity of a single member state or affect at least two member states. For this and other purposes, (ii) the new laws substantially increase mandatory reporting about cyber security incidents, including but not limited to the large-scale incidents. Despite the laws and new governance bodies established by them, however, (iii) the working of actual cyber security crisis management remains unclear particularly at the EU-level. With these policy research results, the paper advances the domain of cyber security incident management research by elaborating how European law perceives cyber security crises and their relation to cyber security incidents, paving the way for many relevant further research topics with practical relevance, whether theoretical, conceptual, or empirical.

Paperid: 2037, https://arxiv.org/pdf/2504.13551.pdf

Abstract:
Many adversarial attack approaches are proposed to verify the vulnerability of language models. However, they require numerous queries and the information on the target model. Even black-box attack methods also require the target model's output information. They are not applicable in real-world scenarios, as in hard black-box settings where the target model is closed and inaccessible. Even the recently proposed hard black-box attacks still require many queries and demand extremely high costs for training adversarial generators. To address these challenges, we propose Q-faker (Query-free Hard Black-box Attacker), a novel and efficient method that generates adversarial examples without accessing the target model. To avoid accessing the target model, we use a surrogate model instead. The surrogate model generates adversarial sentences for a target-agnostic attack. During this process, we leverage controlled generation techniques. We evaluate our proposed method on eight datasets. Experimental results demonstrate our method's effectiveness including high transferability and the high quality of the generated adversarial examples, and prove its practical in hard black-box settings.

Paperid: 2038, https://arxiv.org/pdf/2504.13547.pdf

Abstract:
Android applications (apps) integrate reusable and well-tested third-party libraries (TPLs) to enhance functionality and shorten development cycles. However, recent research reveals that TPLs have become the largest attack surface for Android apps, where the use of insecure TPLs can compromise both developer and user interests. To mitigate such threats, researchers have proposed various tools to detect TPLs used by apps, supporting further security analyses such as vulnerable TPLs identification. Although existing tools achieve notable library-level TPL detection performance in the presence of obfuscation, they struggle with version-level TPL detection due to a lack of sensitivity to differences between versions. This limitation results in a high version-level false positive rate, significantly increasing the manual workload for security analysts. To resolve this issue, we propose SAD, a TPL detection tool with high version-level detection performance. SAD generates a candidate app class list for each TPL class based on the feature of nodes in class dependency graphs (CDGs). It then identifies the unique corresponding app class for each TPL class by performing class matching based on the similarity of their class summaries. Finally, SAD identifies TPL versions by evaluating the structural similarity of the sub-graph formed by matched classes within the CDGs of the TPL and the app. Extensive evaluation on three datasets demonstrates the effectiveness of SAD and its components. SAD achieves F1 scores of 97.64% and 84.82% for library-level and version-level detection on obfuscated apps, respectively, surpassing existing state-of-the-art tools. The version-level false positives reported by the best tool is 1.61 times that of SAD. We further evaluate the degree to which TPLs identified by detection tools correspond to actual TPL classes.

Paperid: 2039, https://arxiv.org/pdf/2504.10277.pdf

Abstract:
Language model deployments in consumer-facing applications introduce numerous risks. While existing research on harms and hazards of such applications follows top-down approaches derived from regulatory frameworks and theoretical analyses, empirical evidence of real-world failure modes remains underexplored. In this work, we introduce RealHarm, a dataset of annotated problematic interactions with AI agents built from a systematic review of publicly reported incidents. Analyzing harms, causes, and hazards specifically from the deployer's perspective, we find that reputational damage constitutes the predominant organizational harm, while misinformation emerges as the most common hazard category. We empirically evaluate state-of-the-art guardrails and content moderation systems to probe whether such systems would have prevented the incidents, revealing a significant gap in the protection of AI applications.

Paperid: 2040, https://arxiv.org/pdf/2504.09977.pdf

Abstract:
Poorly designed smart contracts are particularly vulnerable, as they may allow attackers to exploit weaknesses and steal the virtual currency they manage. In this study, we train a model using unsupervised learning to identify vulnerabilities in the Solidity source code of Ethereum smart contracts. To address the challenges associated with real-world smart contracts, our training data is derived from actual vulnerability samples obtained from datasets such as SmartBugs Curated and the SolidiFI Benchmark. These datasets enable us to develop a robust unsupervised static analysis method for detecting five specific vulnerabilities: Reentrancy, Access Control, Timestamp Dependency, tx.origin, and Unchecked Low-Level Calls. We employ clustering algorithms to identify outliers, which are subsequently classified as vulnerable smart contracts.

Paperid: 2041, https://arxiv.org/pdf/2504.09584.pdf

Abstract:
Whilst many key exchange and digital signature methods use the NIST P256 (secp256r1) and secp256k1 curves, there is often a demand for increased security. With these curves, we have a 128-bit security. These security levels can be increased to 256-bit security with NIST P-521 Curve 448 and Brainpool-P512. This paper outlines a new curve - Eccfrog512ck2 - and which provides 256-bit security and enhanced performance over NIST P-521. Along with this, it has side-channel resistance and is designed to avoid weaknesses such as related to the MOV attack. It shows that Eccfrog512ck2 can have a 61.5% speed-up on scalar multiplication and a 33.3% speed-up on point generation over the NIST P-521 curve.

Paperid: 2042, https://arxiv.org/pdf/2504.09335.pdf

Abstract:
We investigate encrypted control policy synthesis over the cloud. While encrypted control implementations have been studied previously, we focus on the less explored paradigm of privacy-preserving control synthesis, which can involve heavier computations ideal for cloud outsourcing. We classify control policy synthesis into model-based, simulator-driven, and data-driven approaches and examine their implementation over fully homomorphic encryption (FHE) for privacy enhancements. A key challenge arises from comparison operations (min or max) in standard reinforcement learning algorithms, which are difficult to execute over encrypted data. This observation motivates our focus on Relative-Entropy-regularized reinforcement learning (RL) problems, which simplifies encrypted evaluation of synthesis algorithms due to their comparison-free structures. We demonstrate how linearly solvable value iteration, path integral control, and Z-learning can be readily implemented over FHE. We conduct a case study of our approach through numerical simulations of encrypted Z-learning in a grid world environment using the CKKS encryption scheme, showing convergence with acceptable approximation error. Our work suggests the potential for secure and efficient cloud-based reinforcement learning.

Paperid: 2043, https://arxiv.org/pdf/2504.07414.pdf

Abstract:
Shuffling has been shown to amplify differential privacy guarantees, enabling a more favorable privacy-utility trade-off. To characterize and compute this amplification, two fundamental analytical frameworks have been proposed: the \emph{privacy blanket} by Balle et al. (CRYPTO 2019) and the \emph{clone}--including both the standard and stronger variant--by Feldman et al. (FOCS 2021, SODA 2023). These frameworks share a common foundation: decomposing local randomizers into structured components for analysis. In this work, we introduce a unified analytical framework--the general clone paradigm--which subsumes all possible decompositions, with the clone and blanket decompositions arising as special cases. Within this framework, we identify the optimal decomposition, which is precisely the one used by the privacy blanket. Moreover, we develop a simple and efficient algorithm based on the Fast Fourier Transform (FFT) to compute optimal privacy amplification bounds. Experimental results show that our computed upper bounds nearly match the lower bounds, demonstrating the tightness of our method. Building on this method, we also derive optimal amplification bounds for both \emph{joint} and \emph{parallel} compositions of LDP mechanisms in the shuffle model.

Paperid: 2044, https://arxiv.org/pdf/2504.07362.pdf

Abstract:
The shuffle model of DP (Differential Privacy) provides high utility by introducing a shuffler that randomly shuffles noisy data sent from users. However, recent studies show that existing shuffle protocols suffer from the following two major drawbacks. First, they are vulnerable to local data poisoning attacks, which manipulate the statistics about input data by sending crafted data, especially when the privacy budget epsilon is small. Second, the actual value of epsilon is increased by collusion attacks by the data collector and users. In this paper, we address these two issues by thoroughly exploring the potential of the augmented shuffle model, which allows the shuffler to perform additional operations, such as random sampling and dummy data addition. Specifically, we propose a generalized framework for local-noise-free protocols in which users send (encrypted) input data to the shuffler without adding noise. We show that this generalized protocol provides DP and is robust to the above two attacks if a simpler mechanism that performs the same process on binary input data provides DP. Based on this framework, we propose three concrete protocols providing DP and robustness against the two attacks. Our first protocol generates the number of dummy values for each item from a binomial distribution and provides higher utility than several state-of-the-art existing shuffle protocols. Our second protocol significantly improves the utility of our first protocol by introducing a novel dummy-count distribution: asymmetric two-sided geometric distribution. Our third protocol is a special case of our second protocol and provides pure epsilon-DP. We show the effectiveness of our protocols through theoretical analysis and comprehensive experiments.

Paperid: 2045, https://arxiv.org/pdf/2504.07132.pdf

Abstract:
Rug pulls in Solana have caused significant damage to users interacting with Decentralized Finance (DeFi). A rug pull occurs when developers exploit users' trust and drain liquidity from token pools on Decentralized Exchanges (DEXs), leaving users with worthless tokens. Although rug pulls in Ethereum and Binance Smart Chain (BSC) have gained attention recently, analysis of rug pulls in Solana remains largely under-explored. In this paper, we introduce SolRPDS (Solana Rug Pull Dataset), the first public rug pull dataset derived from Solana's transactions. We examine approximately four years of DeFi data (2021-2024) that covers suspected and confirmed tokens exhibiting rug pull patterns. The dataset, derived from 3.69 billion transactions, consists of 62,895 suspicious liquidity pools. The data is annotated for inactivity states, which is a key indicator, and includes several detailed liquidity activities such as additions, removals, and last interaction as well as other attributes such as inactivity periods and withdrawn token amounts, to help identify suspicious behavior. Our preliminary analysis reveals clear distinctions between legitimate and fraudulent liquidity pools and we found that 22,195 tokens in the dataset exhibit rug pull patterns during the examined period. SolRPDS can support a wide range of future research on rug pulls including the development of data-driven and heuristic-based solutions for real-time rug pull detection and mitigation.

Paperid: 2046, https://arxiv.org/pdf/2504.06418.pdf

Abstract:
In recent years, the industry has been witnessing an extended usage of process mining and automated event data analysis. Consequently, there is a rising significance in addressing privacy apprehensions related to the inclusion of sensitive and private information within event data utilized by process mining algorithms. State-of-the-art research mainly focuses on providing quantifiable privacy guarantees, e.g., via differential privacy, for trace variants that are used by the main process mining techniques, e.g., process discovery. However, privacy preservation techniques designed for the release of trace variants are still insufficient to meet all the demands of industry-scale utilization. Moreover, ensuring privacy guarantees in situations characterized by a high occurrence of infrequent trace variants remains a challenging endeavor. In this paper, we introduce two novel approaches for releasing differentially private trace variants based on trained generative models. With TraVaG, we leverage \textit{Generative Adversarial Networks} (GANs) to sample from a privatized implicit variant distribution. Our second method employs \textit{Denoising Diffusion Probabilistic Models} that reconstruct artificial trace variants from noise via trained Markov chains. Both methods offer industry-scale benefits and elevate the degree of privacy assurances, particularly in scenarios featuring a substantial prevalence of infrequent variants. Also, they overcome the shortcomings of conventional privacy preservation techniques, such as bounding the length of variants and introducing fake variants. Experimental results on real-life event data demonstrate that our approaches surpass state-of-the-art techniques in terms of privacy guarantees and utility preservation.

Paperid: 2047, https://arxiv.org/pdf/2504.03936.pdf

Abstract:
Simple commit-reveal beacons are vulnerable to last-revealer strategies, and existing descriptions often leave accountability and recovery mechanisms unspecified for practical deployments. We present Commit-Reveal$^2$, a layered design for blockchain deployments that cryptographically randomizes the final reveal order, together with a concrete accountability and fallback mechanism that we implement as smart-contract logic. The protocol is architected as a hybrid system, where routine coordination runs off chain for efficiency and the blockchain acts as the trust anchor for commitments and the final arbiter for disputes. Our implementation covers leader coordination, on-chain verification, slashing for non-cooperation, and an explicit on-chain recovery path that maintains progress when off-chain coordination fails. We formally define two security goals for distributed randomness beacons, unpredictability and bit-wise bias resistance, and we show that Commit-Reveal$^2$ meets these notions under standard hash assumptions in the random-oracle model. In measurements with small to moderate operator sets, the hybrid design reduces on-chain gas by more than 80% compared to a fully on-chain baseline. We release a publicly verifiable prototype and evaluation artifacts to support replication and adoption in blockchain applications.

Paperid: 2048, https://arxiv.org/pdf/2504.03111.pdf

Abstract:
Large Language Model (LLM) agents are autonomous systems powered by LLMs, capable of reasoning and planning to solve problems by leveraging a set of tools. However, the integration of multi-tool capabilities in LLM agents introduces challenges in securely managing tools, ensuring their compatibility, handling dependency relationships, and protecting control flows within LLM agent workflows. In this paper, we present the first systematic security analysis of task control flows in multi-tool-enabled LLM agents. We identify a novel threat, Cross-Tool Harvesting and Polluting (XTHP), which includes multiple attack vectors to first hijack the normal control flows of agent tasks, and then collect and pollute confidential or private information within LLM agent systems. To understand the impact of this threat, we developed Chord, a dynamic scanning tool designed to automatically detect real-world agent tools susceptible to XTHP attacks. Our evaluation of 66 real-world tools from the repositories of two major LLM agent development frameworks, LangChain and LlamaIndex, revealed a significant security concern: 75\% are vulnerable to XTHP attacks, highlighting the prevalence of this threat.

Paperid: 2049, https://arxiv.org/pdf/2504.02194.pdf

Abstract:
The rise of cryptocurrencies like Bitcoin and Ethereum has driven interest in blockchain database technology, with smart contracts enabling the growth of decentralized finance (DeFi). However, research has shown that adversaries exploit transaction ordering to extract profits through attacks like front-running, sandwich attacks, and liquidation manipulation. This issue affects blockchain databases in which block proposers have full control over transaction ordering. To address this, a more fair approach to transaction ordering is essential. Existing fairness protocols, such as Pompe and Themis, operate on leader-based consensus protocols, which not only suffer from low throughput, but also allow adversaries to manipulate transaction ordering. To address these limitations, we propose FairDAG-AB and FairDAG-RL that run fairness protocols on top of DAG-based consensus protocols, which improve protocol performance in both throughput and fairness quality, leveraging the multi-proposer design and validity of DAG-based consensus protocols. We conducted a comprehensive analytical and experimental evaluation of our protocols. The results show that FairDAG-AB and FairDAG-RL outperform the prior fairness protocols in both throughput and fairness quality.

Paperid: 2050, https://arxiv.org/pdf/2504.00563.pdf

Abstract:
Federated Learning (FL) is a collaborative method for training machine learning models while preserving the confidentiality of the participants' training data. Nevertheless, FL is vulnerable to reconstruction attacks that exploit shared parameters to reveal private training data. In this paper, we address this issue in the cybersecurity domain by applying Multi-Input Functional Encryption (MIFE) to a recent FL implementation for training ML-based network intrusion detection systems. We assess both classical and post-quantum solutions in terms of memory cost and computational overhead in the FL process, highlighting their impact on convergence time.

Paperid: 2051, https://arxiv.org/pdf/2504.00436.pdf

Abstract:
Nowadays, mobile smart devices are widely used in daily life. It is increasingly important to prevent malicious users from accessing private data, thus a secure and convenient authentication method is urgently needed. Compared with common one-off authentication (e.g., password, face recognition, and fingerprint), continuous authentication can provide constant privacy protection. However, most studies are based on behavioral features and vulnerable to spoofing attacks. To solve this problem, we study the unique influence of sliding fingers on active vibration signals, and further propose an authentication system, FingerSlid, which uses vibration motors and accelerometers in mobile devices to sense biometric features of sliding fingers to achieve behavior-independent continuous authentication. First, we design two kinds of active vibration signals and propose a novel signal generation mechanism to improve the anti-attack ability of FingerSlid. Then, we extract different biometric features from the received two kinds of signals, and eliminate the influence of behavioral features in biometric features using a carefully designed Triplet network. Last, user authentication is performed by using the generated behavior-independent biometric features. FingerSlid is evaluated through a large number of experiments under different scenarios, and it achieves an average accuracy of 95.4% and can resist 99.5% of attacks.

Paperid: 2052, https://arxiv.org/pdf/2504.00366.pdf

Abstract:
Quantum Neural Networks (QNNs) have shown significant value across domains, with well-trained QNNs representing critical intellectual property often deployed via cloud-based QNN-as-a-Service (QNNaaS) platforms. Recent work has examined QNN model extraction attacks using classical and emerging quantum strategies. These attacks involve adversaries querying QNNaaS platforms to obtain labeled data for training local substitute QNNs that replicate the functionality of cloud-based models. However, existing approaches have largely overlooked the impact of varying quantum noise inherent in noisy intermediate-scale quantum (NISQ) computers, limiting their effectiveness in real-world settings. To address this limitation, we propose the CopyQNN framework, which employs a three-step data cleaning method to eliminate noisy data based on its noise sensitivity. This is followed by the integration of contrastive and transfer learning within the quantum domain, enabling efficient training of substitute QNNs using a limited but cleaned set of queried data. Experimental results on NISQ computers demonstrate that a practical implementation of CopyQNN significantly outperforms state-of-the-art QNN extraction attacks, achieving an average performance improvement of 8.73% across all tasks while reducing the number of required queries by 90x, with only a modest increase in hardware overhead.

Paperid: 2053, https://arxiv.org/pdf/2503.24191.pdf

Abstract:
Content Warning: This paper may contain unsafe or harmful content generated by LLMs that may be offensive to readers. Large Language Models (LLMs) are extensively used as tooling platforms through structured output APIs to ensure syntax compliance so that robust integration with existing softwares like agent systems, could be achieved. However, the feature enabling functionality of grammar-guided structured output presents significant security vulnerabilities. In this work, we reveal a critical control-plane attack surface orthogonal to traditional data-plane vulnerabilities. We introduce Constrained Decoding Attack (CDA), a novel jailbreak class that weaponizes structured output constraints to bypass safety mechanisms. Unlike prior attacks focused on input prompts, CDA operates by embedding malicious intent in schema-level grammar rules (control-plane) while maintaining benign surface prompts (data-plane). We instantiate this with a proof-of-concept Chain Enum Attack, achieves 96.2% attack success rates across proprietary and open-weight LLMs on five safety benchmarks with a single query, including GPT-4o and Gemini-2.0-flash. Our findings identify a critical security blind spot in current LLM architectures and urge a paradigm shift in LLM safety to address control-plane vulnerabilities, as current mechanisms focused solely on data-plane threats leave critical systems exposed.

Paperid: 2054, https://arxiv.org/pdf/2503.23986.pdf

Abstract:
A rollup network is a type of popular "Layer 2" scaling solution for general purpose "Layer 1" blockchains like Ethereum. Rollups networks separate execution of transactions from other aspects like consensus, processing transactions off of the Layer 1, and posting the data onto the underlying layer for security. While rollups offer significant scalability advantages, they often rely on centralized operators for transaction ordering and inclusion, which also introduces potential risks. If the operator fails to build rollup blocks or propose new state roots to the underlying Layer 1, users may lose access to digital assets on the rollup. An escape hatch allows users to bypass the failing operator and withdraw assets directly on the Layer 1. We propose using a time-based trigger, Merkle proofs, and new resolver contracts to implement a practical escape hatch for these networks. The use of novel resolver contracts allow user owned assets to be located in the Layer 2 state root, including those owned by smart contracts, in order to allow users to escape them. This design ensures safe and verifiable escape of assets, including ETH, ERC-20 and ERC-721 tokens, and more, from the Layer 2.

Paperid: 2055, https://arxiv.org/pdf/2503.23748.pdf

Abstract:
On-device deep learning (DL) has rapidly gained adoption in mobile apps, offering the benefits of offline model inference and user privacy preservation over cloud-based approaches. However, it inevitably stores models on user devices, introducing new vulnerabilities, particularly model-stealing attacks and intellectual property infringement. While system-level protections like Trusted Execution Environments (TEEs) provide a robust solution, practical challenges remain in achieving scalable on-device DL model protection, including complexities in supporting third-party models and limited adoption in current mobile solutions. Advancements in TEE-enabled hardware, such as NVIDIA's GPU-based TEEs, may address these obstacles in the future. Currently, watermarking serves as a common defense against model theft but also faces challenges here as many mobile app developers lack corresponding machine learning expertise and the inherent read-only and inference-only nature of on-device DL models prevents third parties like app stores from implementing existing watermarking techniques in post-deployment models. To protect the intellectual property of on-device DL models, in this paper, we propose THEMIS, an automatic tool that lifts the read-only restriction of on-device DL models by reconstructing their writable counterparts and leverages the untrainable nature of on-device DL models to solve watermark parameters and protect the model owner's intellectual property. Extensive experimental results across various datasets and model structures show the superiority of THEMIS in terms of different metrics. Further, an empirical investigation of 403 real-world DL mobile apps from Google Play is performed with a success rate of 81.14%, showing the practicality of THEMIS.

Paperid: 2056, https://arxiv.org/pdf/2503.23414.pdf

Abstract:
This paper investigates the presence and impact of questionable, AI-generated academic papers on widely used preprint repositories, with a focus on their role in citation manipulation. Motivated by suspicious patterns observed in publications related to our ongoing research on GenAI-enhanced cybersecurity, we identify clusters of questionable papers and profiles. These papers frequently exhibit minimal technical content, repetitive structure, unverifiable authorship, and mutually reinforcing citation patterns among a recurring set of authors. To assess the feasibility and implications of such practices, we conduct a controlled experiment: generating a fake paper using GenAI, embedding citations to suspected questionable publications, and uploading it to one such repository (ResearchGate). Our findings demonstrate that such papers can bypass platform checks, remain publicly accessible, and contribute to inflating citation metrics like the H-index and i10-index. We present a detailed analysis of the mechanisms involved, highlight systemic weaknesses in content moderation, and offer recommendations for improving platform accountability and preserving academic integrity in the age of GenAI.

Paperid: 2057, https://arxiv.org/pdf/2503.22710.pdf

Abstract:
The rapid digitalization of banking services has significantly transformed financial transactions, offering enhanced convenience and efficiency for consumers. However, the increasing reliance on digital banking has also exposed financial institutions and users to a wide range of cybersecurity threats, including phishing, malware, ransomware, data breaches, and unauthorized access. This study systematically examines the influence of cybersecurity threats on digital banking security, adoption, and regulatory compliance by conducting a comprehensive review of 78 peer-reviewed articles published between 2015 and 2024. Using the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) methodology, this research critically evaluates the most prevalent cyber threats targeting digital banking platforms, the effectiveness of modern security measures, and the role of regulatory frameworks in mitigating financial cybersecurity risks. The findings reveal that phishing and malware attacks remain the most commonly exploited cyber threats, leading to significant financial losses and consumer distrust. Multi-factor authentication (MFA) and biometric security have been widely adopted to combat unauthorized access, while AI-driven fraud detection and blockchain technology offer promising solutions for securing financial transactions. However, the integration of third-party FinTech solutions introduces additional security risks, necessitating stringent regulatory oversight and cybersecurity protocols. The study also highlights that compliance with global cybersecurity regulations, such as GDPR, PSD2, and GLBA, enhances digital banking security by enforcing strict authentication measures, encryption protocols, and real-time fraud monitoring.

Paperid: 2058, https://arxiv.org/pdf/2503.21891.pdf

Abstract:
We propose a framework to convert $(\varepsilon, Î´)$-approximate Differential Privacy (DP) mechanisms into $(\varepsilon, 0)$-pure DP mechanisms, a process we call ``purification''. This algorithmic technique leverages randomized post-processing with calibrated noise to eliminate the $Î´$ parameter while preserving utility. By combining the tighter utility bounds and computational efficiency of approximate DP mechanisms with the stronger guarantees of pure DP, our approach achieves the best of both worlds. We illustrate the applicability of this framework in various settings, including Differentially Private Empirical Risk Minimization (DP-ERM), data-dependent DP mechanisms such as Propose-Test-Release (PTR), and query release tasks. To the best of our knowledge, this is the first work to provide a systematic method for transforming approximate DP into pure DP while maintaining competitive accuracy and computational efficiency.

Paperid: 2060, https://arxiv.org/pdf/2503.20884.pdf

Abstract:
Federated learning (FL) enables collaborative model training across distributed clients without sharing raw data, but its robustness is threatened by Byzantine behaviors such as data and model poisoning. Existing defenses face fundamental limitations: robust aggregation rules incur error lower bounds that grow with client heterogeneity, while detection-based methods often rely on heuristics (e.g., a fixed number of malicious clients) or require trusted external datasets for validation. We present a defense framework that addresses these challenges by leveraging a conditional generative adversarial network (cGAN) at the server to synthesize representative data for validating client updates. This approach eliminates reliance on external datasets, adapts to diverse attack strategies, and integrates seamlessly into standard FL workflows. Extensive experiments on benchmark datasets demonstrate that our framework accurately distinguishes malicious from benign clients while maintaining overall model accuracy. Beyond Byzantine robustness, we also examine the representativeness of synthesized data, computational costs of cGAN training, and the transparency and scalability of our approach.

Paperid: 2061, https://arxiv.org/pdf/2503.20798.pdf

Abstract:
Intrusion Detection Systems (IDS) are crucial for identifying malicious traffic, yet traditional signature-based methods struggle with zero-day attacks and high false positive rates. AI-driven packet-capture analysis offers a promising alternative. However, existing approaches rely heavily on flow-based or statistical features, limiting their ability to detect fine-grained attack patterns. This study proposes Xavier-CMAE, an enhanced Convolutional Multi-Head Attention Ensemble (CMAE) model that improves detection accuracy while reducing computational overhead. By replacing Word2Vec embeddings with a Hex2Int tokenizer and Xavier initialization, Xavier-CMAE eliminates pre-training, accelerates training, and achieves 99.971% accuracy with a 0.018% false positive rate, outperforming Word2Vec-based methods. Additionally, we introduce LLM-CMAE, which integrates pre-trained Large Language Model (LLM) tokenizers into CMAE. While LLMs enhance feature extraction, their computational cost hinders real-time detection. LLM-CMAE balances efficiency and performance, reaching 99.969% accuracy with a 0.019% false positive rate. This work advances AI-powered IDS by (1) introducing a payload-based detection framework, (2) enhancing efficiency with Xavier-CMAE, and (3) integrating LLM tokenizers for improved real-time detection.

Paperid: 2062, https://arxiv.org/pdf/2503.19909.pdf

Abstract:
Fuzzing is a well-established technique for detecting bugs and vulnerabilities. With the surge of fuzzers and fuzzer platforms being developed such as AFL and OSSFuzz rises the necessity to benchmark these tools' performance. A common problem is that vulnerability benchmarks are based on bugs in old software releases. For this very reason, Magma introduced the notion of forward-porting to reintroduce vulnerable code in current software releases. While their results are promising, the state-of-the-art lacks an update on the maintainability of this approach over time. Indeed, adding the vulnerable code to a recent software version might either break its functionality or make the vulnerable code no longer reachable. We characterise the challenges with forward-porting by reassessing the portability of Magma's CVEs four years after its release and manually reintroducing the vulnerabilities in the current software versions. We find the straightforward process efficient for 17 of the 32 CVEs in our study. We further investigate why a trivial forward-porting process fails in the 15 other CVEs. This involves identifying the commits breaking the forward-porting process and reverting them in addition to the bug fix. While we manage to complete the process for nine of these CVEs, we provide an update on all 15 and explain the challenges we have been confronted with in this process. Thereby, we give the basis for future work towards a sustainable forward-ported fuzzing benchmark.

Paperid: 2063, https://arxiv.org/pdf/2503.19548.pdf

Abstract:
In this paper, we present the first large-scale empirical study of smart contract dependencies, analyzing over 41 million contracts and 11 billion interactions on Ethereum up to December 2024. Our results yield four key insights: (1) 59% of contract transactions involve multiple contracts (median of 4 per transaction in 2024) indicating potential smart contract dependency risks; (2) the ecosystem exhibits extreme centralization, with just 11 (0.001%) deployers controlling 20.5 million (50%) of alive contracts, with major risks related to factory contracts and deployer privileges; (3) three most depended-upon contracts are mutable, meaning large parts of the ecosystem rely on contracts that can be altered at any time, which is a significant risk, (4) actual smart contract protocol dependencies are significantly more complex than officially documented, undermining Ethereum's transparency ethos, and creating unnecessary attack surface. Our work provides the first large-scale empirical foundation for understanding smart contract dependency risks, offering crucial insights for developers, users, and security researchers in the blockchain space.

Paperid: 2064, https://arxiv.org/pdf/2503.19142.pdf

Abstract:
With high-stakes machine learning applications increasingly moving to untrusted end-user or cloud environments, safeguarding pre-trained model parameters becomes essential for protecting intellectual property and user privacy. Recent advancements in hardware-isolated enclaves, notably Intel SGX, hold the promise to secure the internal state of machine learning applications even against compromised operating systems. However, we show that privileged software adversaries can exploit input-dependent memory access patterns in common neural network activation functions to extract secret weights and biases from an SGX enclave. Our attack leverages the SGX-Step framework to obtain a noise-free, instruction-granular page-access trace. In a case study of an 11-input regression network using the Tensorflow Microlite library, we demonstrate complete recovery of all first-layer weights and biases, as well as partial recovery of parameters from deeper layers under specific conditions. Our novel attack technique requires only 20 queries per input per weight to obtain all first-layer weights and biases with an average absolute error of less than 1%, improving over prior model stealing attacks. Additionally, a broader ecosystem analysis reveals the widespread use of activation functions with input-dependent memory access patterns in popular machine learning frameworks (either directly or via underlying math libraries). Our findings highlight the limitations of deploying confidential models in SGX enclaves and emphasise the need for stricter side-channel validation of machine learning implementations, akin to the vetting efforts applied to secure cryptographic libraries.

Paperid: 2065, https://arxiv.org/pdf/2503.18043.pdf

Abstract:
Web access today occurs predominantly through mobile devices, with Android representing a significant share of the mobile device market. This widespread usage makes Android a prime target for malicious attacks. Despite efforts to combat malicious attacks through tools like Google Play Protect and antivirus software, new and evolved malware continues to infiltrate Android devices. Source code analysis is effective but limited, as attackers quickly abandon old malware for new variants to evade detection. Therefore, there is a need for alternative methods that complement source code analysis. Prior research investigated clustering applications based on their descriptions and identified outliers in these clusters by API usage as malware. However, these works often used traditional techniques such as Latent Dirichlet Allocation (LDA) and k-means clustering, that do not capture the nuanced semantic structures present in app descriptions. To this end, in this paper, we propose BERTDetect, which leverages the BERTopic neural topic modelling to effectively capture the latent topics in app descriptions. The resulting topic clusters are comparatively more coherent than previous methods and represent the app functionalities well. Our results demonstrate that BERTDetect outperforms other baselines, achieving ~10% relative improvement in F1 score.

Paperid: 2066, https://arxiv.org/pdf/2503.17588.pdf

Abstract:
Dynamic analysis, through rehosting, is an important capability for security assessment in embedded systems software. Existing rehosting techniques aim to provide high-fidelity execution by accurately emulating hardware and peripheral interactions. However, these techniques face challenges in adoption due to the increasing number of available peripherals and the complexities involved in designing emulation models for diverse hardware. Additionally, contrary to the prevailing belief that guides existing works, our analysis of reported bugs shows that high-fidelity execution is not required to expose most bugs in embedded software. Our key hypothesis is that security vulnerabilities are more likely to arise at higher abstraction levels. To substantiate our hypothesis, we introduce LEMIX, a framework enabling dynamic analysis of embedded applications by rehosting them as x86 Linux applications decoupled from hardware dependencies. Enabling embedded applications to run natively on Linux facilitates security analysis using available techniques and takes advantage of the powerful hardware available on the Linux platform for higher testing throughput. We develop various techniques to address the challenges involved in converting embedded applications to Linux applications. We evaluated LEMIX on 18 real-world embedded applications across four RTOSes and found 21 new bugs in 12 of the applications and all 4 of the RTOS kernels. We report that LEMIX is superior to existing state-of-the-art techniques both in terms of code coverage (~2x more coverage) and bug detection (18 more bugs).

Paperid: 2067, https://arxiv.org/pdf/2503.16693.pdf

Abstract:
Graph Neural Networks (GNNs) have gained traction in Graph-based Machine Learning as a Service (GMLaaS) platforms, yet they remain vulnerable to graph-based model extraction attacks (MEAs), where adversaries reconstruct surrogate models by querying the victim model. Existing defense mechanisms, such as watermarking and fingerprinting, suffer from poor real-time performance, susceptibility to evasion, or reliance on post-attack verification, making them inadequate for handling the dynamic characteristics of graph-based MEA variants. To address these limitations, we propose ATOM, a novel real-time MEA detection framework tailored for GNNs. ATOM integrates sequential modeling and reinforcement learning to dynamically detect evolving attack patterns, while leveraging $k$-core embedding to capture the structural properties, enhancing detection precision. Furthermore, we provide theoretical analysis to characterize query behaviors and optimize detection strategies. Extensive experiments on multiple real-world datasets demonstrate that ATOM outperforms existing approaches in detection performance, maintaining stable across different time steps, thereby offering a more effective defense mechanism for GMLaaS environments.

Paperid: 2068, https://arxiv.org/pdf/2503.15772.pdf

Abstract:
The integrity of peer review is fundamental to scientific progress, but the rise of large language models (LLMs) has introduced concerns that some reviewers may rely on these tools to generate reviews rather than writing them independently. Although some venues have banned LLM-assisted reviewing, enforcement remains difficult as existing detection tools cannot reliably distinguish between fully generated reviews and those merely polished with AI assistance. In this work, we address the challenge of detecting LLM-generated reviews. We consider the approach of performing indirect prompt injection via the paper's PDF, prompting the LLM to embed a covert watermark in the generated review, and subsequently testing for presence of the watermark in the review. We identify and address several pitfalls in naÃ¯ve implementations of this approach. Our primary contribution is a rigorous watermarking and detection framework that offers strong statistical guarantees. Specifically, we introduce watermarking schemes and hypothesis tests that control the family-wise error rate across multiple reviews, achieving higher statistical power than standard corrections such as Bonferroni, while making no assumptions about the nature of human-written reviews. We explore multiple indirect prompt injection strategies--including font-based embedding and obfuscated prompts--and evaluate their effectiveness under various reviewer defense scenarios. Our experiments find high success rates in watermark embedding across various LLMs. We also empirically find that our approach is resilient to common reviewer defenses, and that the bounds on error rates in our statistical tests hold in practice. In contrast, we find that Bonferroni-style corrections are too conservative to be useful in this setting.

Paperid: 2069, https://arxiv.org/pdf/2503.13637.pdf

Abstract:
The number of blockchain interoperability protocols for transferring data and assets between blockchains has grown significantly. However, no open dataset of cross-chain transactions exists to study interoperability protocols in operation. There is also no tool to generate such datasets and make them available to the community. This paper proposes XChainDataGen, a tool to extract cross-chain data from blockchains and generate datasets of cross-chain transactions (cctxs). Using XChainDataGen, we extracted over 35 GB of data from five cross-chain protocols deployed on 11 blockchains in the last seven months of 2024, identifying 11,285,753 cctxs that moved over 28 billion USD in cross-chain token transfers. Using the data collected, we compare protocols and provide insights into their security, cost, and performance trade-offs. As examples, we highlight differences between protocols that require full finality on the source blockchain and those that only demand soft finality (\textit{security}). We compare user costs, fee models, and the impact of variables such as the Ethereum gas price on protocol fees (\textit{cost}). Finally, we produce the first analysis of the implications of EIP-7683 for cross-chain intents, which are increasingly popular and greatly improve the speed with which cctxs are processed (\textit{performance}), thereby enhancing the user experience. The availability of XChainDataGen and this dataset allows various analyses, including trends in cross-chain activity, security assessments of interoperability protocols, and financial research on decentralized finance (DeFi) protocols.

Paperid: 2070, https://arxiv.org/pdf/2503.12192.pdf

Abstract:
Software supply chain frameworks, such as the US NIST Secure Software Development Framework (SSDF), detail what tasks software development organizations are recommended or mandated to adopt to reduce security risk. However, to further reduce the risk of similar attacks occurring, software organizations benefit from knowing what tasks mitigate attack techniques the attackers are currently using to address specific threats, prioritize tasks, and close mitigation gaps. The goal of this study is to aid software organizations in reducing the risk of software supply chain attacks by systematically synthesizing how framework tasks mitigate the attack techniques used in the SolarWinds, Log4j, and XZ Utils attacks. We qualitatively analyzed 106 Cyber Threat Intelligence (CTI) reports of the 3 attacks to gather the attack techniques. We then systematically constructed a mapping between attack techniques and the 73 tasks enumerated in 10 software supply chain frameworks. Afterward, we established and ranked priority tasks that mitigate attack techniques. The three mitigation tasks with the highest scores are role-based access control, system monitoring, and boundary protection. Additionally, three mitigation tasks were missing from all ten frameworks, including sustainable open-source software and environmental scanning tools. Thus, software products would still be vulnerable to software supply chain attacks even if organizations adopted all recommended tasks.

Paperid: 2071, https://arxiv.org/pdf/2503.10690.pdf

Abstract:
Adversarial factuality refers to the deliberate insertion of misinformation into input prompts by an adversary, characterized by varying levels of expressed confidence. In this study, we systematically evaluate the performance of several open-source large language models (LLMs) when exposed to such adversarial inputs. Three tiers of adversarial confidence are considered: strongly confident, moderately confident, and limited confidence. Our analysis encompasses eight LLMs: LLaMA 3.1 (8B), Phi 3 (3.8B), Qwen 2.5 (7B), Deepseek-v2 (16B), Gemma2 (9B), Falcon (7B), Mistrallite (7B), and LLaVA (7B). Empirical results indicate that LLaMA 3.1 (8B) exhibits a robust capability in detecting adversarial inputs, whereas Falcon (7B) shows comparatively lower performance. Notably, for the majority of the models, detection success improves as the adversary's confidence decreases; however, this trend is reversed for LLaMA 3.1 (8B) and Phi 3 (3.8B), where a reduction in adversarial confidence corresponds with diminished detection performance. Further analysis of the queries that elicited the highest and lowest rates of successful attacks reveals that adversarial attacks are more effective when targeting less commonly referenced or obscure information.

Paperid: 2072, https://arxiv.org/pdf/2503.10549.pdf

Abstract:
As facial recognition is increasingly adopted for government and commercial services, its potential misuse has raised serious concerns about privacy and civil rights. To counteract, various anti-facial recognition techniques have been proposed for privacy protection by adversarially perturbing face images, among which generative makeup-based approaches are the most popular. However, these methods, designed primarily to impersonate specific target identities, can only achieve weak dodging success rates while increasing the risk of targeted abuse. In addition, they often introduce global visual artifacts or a lack of adaptability to accommodate diverse makeup prompts, compromising user satisfaction. To address the above limitations, we develop MASQUE, a novel diffusion-based framework that generates localized adversarial makeups guided by user-defined text prompts. Built upon precise null-text inversion, customized cross-attention fusion with masking, and a pairwise adversarial guidance mechanism using images of the same individual, MASQUE achieves robust dodging performance without requiring any external identity. Comprehensive evaluations on open-source facial recognition models and commercial APIs demonstrate that MASQUE significantly improves dodging success rates over all baselines, along with higher perceptual fidelity and stronger adaptability to various text makeup prompts.

Paperid: 2073, https://arxiv.org/pdf/2503.10411.pdf

Abstract:
Before a fair exchange takes place, there is typically an advertisement phase with the goal of increasing the appeal of possessing a digital asset while keeping it sufficiently hidden. Advertisement phases are implicit in mainstream definitions, and therefore are not explicitly integrated within fair-exchange protocols. In this work we give an explicit definition for such a fair exchange in a setting where parties communicate via broadcast messages only (i.e., no point-to-point connection between seller and buyer is needed). Next, we construct a fair-exchange protocol satisfying our new definition using zk-SNARKs and relying on mainstream decentralized platforms (i.e., a blockchain with smart contracts like Ethereum and a decentralized storage system like IPFS). Experimental results confirm the practical relevance of our decentralized approach, paving the road towards building decentralized marketplaces where users can, even anonymously, and without direct off-chain communications, effectively advertise and exchange their digital assets as part of a system of enhanced NFTs.

Paperid: 2074, https://arxiv.org/pdf/2503.10171.pdf

Abstract:
Graph databases have garnered extensive attention and research due to their ability to manage relationships between entities efficiently. Today, many graph search services have been outsourced to a third-party server to facilitate storage and computational support. Nevertheless, the outsourcing paradigm may invade the privacy of graphs. PeGraph is the latest scheme achieving encrypted search over social graphs to address the privacy leakage, which maintains two data structures XSet and TSet motivated by the OXT technology to support encrypted conjunctive search. However, PeGraph still exhibits limitations inherent to the underlying OXT. It does not provide transparent search capabilities, suffers from expensive computation and result pattern leakages, and it fails to support search over dynamic encrypted graph database and results verification. In this paper, we propose SecGraph to address the first two limitations, which adopts a novel system architecture that leverages an SGX-enabled cloud server to provide users with secure and transparent search services since the secret key protection and computational overhead have been offloaded to the cloud server. Besides, we design an LDCF-encoded XSet based on the Logarithmic Dynamic Cuckoo Filter to facilitate efficient plaintext computation in trusted memory, effectively mitigating the risks of result pattern leakage and performance degradation due to exceeding the limited trusted memory capacity. Finally, we design a new dynamic version of TSet named Twin-TSet to enable conjunctive search over dynamic encrypted graph database. In order to support verifiable search, we further propose VSecGraph, which utilizes a procedure-oriented verification method to verify all data structures loaded into the trusted memory, thus bypassing the computational overhead associated with the client's local verification.

Paperid: 2075, https://arxiv.org/pdf/2503.09302.pdf

Abstract:
This paper investigates the critical issue of data poisoning attacks on AI models, a growing concern in the ever-evolving landscape of artificial intelligence and cybersecurity. As advanced technology systems become increasingly prevalent across various sectors, the need for robust defence mechanisms against adversarial attacks becomes paramount. The study aims to develop and evaluate novel techniques for detecting and preventing data poisoning attacks, focusing on both theoretical frameworks and practical applications. Through a comprehensive literature review, experimental validation using the CIFAR-10 and Insurance Claims datasets, and the development of innovative algorithms, this paper seeks to enhance the resilience of AI models against malicious data manipulation. The study explores various methods, including anomaly detection, robust optimization strategies, and ensemble learning, to identify and mitigate the effects of poisoned data during model training. Experimental results indicate that data poisoning significantly degrades model performance, reducing classification accuracy by up to 27% in image recognition tasks (CIFAR-10) and 22% in fraud detection models (Insurance Claims dataset). The proposed defence mechanisms, including statistical anomaly detection and adversarial training, successfully mitigated poisoning effects, improving model robustness and restoring accuracy levels by an average of 15-20%. The findings further demonstrate that ensemble learning techniques provide an additional layer of resilience, reducing false positives and false negatives caused by adversarial data injections.

Paperid: 2076, https://arxiv.org/pdf/2503.09291.pdf

Abstract:
The inference process of modern large language models (LLMs) demands prohibitive computational resources, rendering them infeasible for deployment on consumer-grade devices. To address this limitation, recent studies propose distributed LLM inference frameworks, which employ split learning principles to enable collaborative LLM inference on resource-constrained hardware. However, distributing LLM layers across participants requires the transmission of intermediate outputs, which may introduce privacy risks to the original input prompts - a critical issue that has yet to be thoroughly explored in the literature. In this paper, we rigorously examine the privacy vulnerabilities of distributed LLM inference frameworks by designing and evaluating three prompt inference attacks aimed at reconstructing input prompts from intermediate LLM outputs. These attacks are developed under various query and data constraints to reflect diverse real-world LLM service scenarios. Specifically, the first attack assumes an unlimited query budget and access to an auxiliary dataset sharing the same distribution as the target prompts. The second attack also leverages unlimited queries but uses an auxiliary dataset with a distribution differing from the target prompts. The third attack operates under the most restrictive scenario, with limited query budgets and no auxiliary dataset available. We evaluate these attacks on a range of LLMs, including state-of-the-art models such as Llama-3.2 and Phi-3.5, as well as widely-used models like GPT-2 and BERT for comparative analysis. Our experiments show that the first two attacks achieve reconstruction accuracies exceeding 90%, while the third achieves accuracies typically above 50%, even under stringent constraints. These findings highlight privacy risks in distributed LLM inference frameworks, issuing a strong alert on their deployment in real-world applications.

Paperid: 2077, https://arxiv.org/pdf/2503.09066.pdf

Abstract:
Large Language Models (LLMs) have demonstrated remarkable capabilities across various tasks, yet they remain vulnerable to adversarial manipulations such as jailbreaking via prompt injection attacks. These attacks bypass safety mechanisms to generate restricted or harmful content. In this study, we investigated the underlying latent subspaces of safe and jailbroken states by extracting hidden activations from a LLM. Inspired by attractor dynamics in neuroscience, we hypothesized that LLM activations settle into semi stable states that can be identified and perturbed to induce state transitions. Using dimensionality reduction techniques, we projected activations from safe and jailbroken responses to reveal latent subspaces in lower dimensional spaces. We then derived a perturbation vector that when applied to safe representations, shifted the model towards a jailbreak state. Our results demonstrate that this causal intervention results in statistically significant jailbreak responses in a subset of prompts. Next, we probed how these perturbations propagate through the model's layers, testing whether the induced state change remains localized or cascades throughout the network. Our findings indicate that targeted perturbations induced distinct shifts in activations and model responses. Our approach paves the way for potential proactive defenses, shifting from traditional guardrail based methods to preemptive, model agnostic techniques that neutralize adversarial states at the representation level.

Paperid: 2078, https://arxiv.org/pdf/2503.08359.pdf

Abstract:
Blockchains are among the most powerful technologies to realize decentralized information systems. In order to safely enjoy all guarantees provided by a blockchain, one should maintain a full node, therefore maintaining an updated local copy of the ledger. This allows one to locally verify transactions, states of smart contracts, and to compute any information over them. Unfortunately, for obvious practical reasons, a very large part of blockchain-based information systems consists of users relying on clients that access data stored in blockchains only through servers, without verifying what is received. In notable use cases, the user has application-specific queries that can be answered only by very few servers, sometimes all belonging to the same organization. This clearly re-introduces a single point of failure. In this work we present an architecture allowing superlight clients (i.e., clients that do not want to download the involved transactions) to outsource the computation of a query to a (possibly untrusted) server, receiving a trustworthy answer. Our architecture relies on the power of SNARKs and makes them lighter to compute by using data obtained from full nodes and blockchain explorers, possibly leveraging the existence of smart contracts. The viability of our architecture is confirmed by an experimental evaluation on concrete scenarios. Our work paves the road towards blockchain-based information systems that remain decentralized and reliable even when users rely on common superlight clients (e.g., smartphones).

Paperid: 2079, https://arxiv.org/pdf/2503.08293.pdf

Abstract:
The constant increase of devices connected to the Internet, and therefore of cyber-attacks, makes it necessary to analyze network traffic in order to recognize malicious activity. Traditional packet-based analysis methods are insufficient because in large networks the amount of traffic is so high that it is unfeasible to review all communications. For this reason, flows is a suitable approach for this situation, which in future 5G networks will have to be used, as the number of packets will increase dramatically. If this is also combined with unsupervised learning models, it can detect new threats for which it has not been trained. This paper presents a systematic review of the literature on unsupervised learning algorithms for detecting anomalies in network flows, following the PRISMA guideline. A total of 63 scientific articles have been reviewed, analyzing 13 of them in depth. The results obtained show that autoencoder is the most used option, followed by SVM, ALAD, or SOM. On the other hand, all the datasets used for anomaly detection have been collected, including some specialised in IoT or with real data collected from honeypots.

Paperid: 2080, https://arxiv.org/pdf/2503.06529.pdf

Abstract:
As object detection becomes integral to many safety-critical applications, understanding its vulnerabilities is essential. Backdoor attacks, in particular, pose a serious threat by implanting hidden triggers in victim models, which adversaries can later exploit to induce malicious behaviors during inference. However, current understanding is limited to single-target attacks, where adversaries must define a fixed malicious behavior (target) before training, making inference-time adaptability impossible. Given the large output space of object detection (including object existence prediction, bounding box estimation, and classification), the feasibility of flexible, inference-time model control remains unexplored. This paper introduces AnywhereDoor, a multi-target backdoor attack for object detection. Once implanted, AnywhereDoor allows adversaries to make objects disappear, fabricate new ones, or mislabel them, either across all object classes or specific ones, offering an unprecedented degree of control. This flexibility is enabled by three key innovations: (i) objective disentanglement to scale the number of supported targets; (ii) trigger mosaicking to ensure robustness even against region-based detectors; and (iii) strategic batching to address object-level data imbalances that hinder manipulation. Extensive experiments demonstrate that AnywhereDoor grants attackers a high degree of control, improving attack success rates by 26% compared to adaptations of existing methods for such flexible control.

Paperid: 2081, https://arxiv.org/pdf/2503.05482.pdf

Abstract:
Forensic Memory Analysis (FMA) and Virtual Machine Introspection (VMI) are critical tools for security in a virtualization-based approach. VMI and FMA involves using digital forensic methods to extract information from the system to identify and explain security incidents. A key challenge in both FMA and VMI is the "Semantic Gap", which is the difficulty of interpreting raw memory data without specialized tools and expertise. In this work, we investigate how a priori knowledge, metadata and engineered features can aid VMI and FMA, leveraging machine learning to automate information extraction and reduce the workload of forensic investigators. We choose OpenSSH as our use case to test different methods to extract high level structures. We also test our method on complete physical memory dumps to showcase the effectiveness of the engineered features. Our features range from basic statistical features to advanced graph-based representations using malloc headers and pointer translations. The training and testing are carried out on public datasets that we compare against already recognized baseline methods. We show that using metadata, we can improve the performance of the algorithm when there is very little training data and also quantify how having more data results in better generalization performance. The final contribution is an open dataset of physical memory dumps, totalling more than 1 TB of different memory state, software environments, main memory capacities and operating system versions. Our methods show that having more metadata boosts performance with all methods obtaining an F1-Score of over 80%. Our research underscores the possibility of using feature engineering and machine learning techniques to bridge the semantic gap.

Paperid: 2082, https://arxiv.org/pdf/2503.05430.pdf

Abstract:
Digital inequality remains a significant barrier for many older adults, limiting their ability to navigate online spaces securely and confidently while increasing their susceptibility to cyber threats. In response, we propose a novel shedding-type card game for older adults to conceptually learn and reinforce cyber hygiene practices in educational settings. We asked digital educators to participate as players alongside older adults (n = 16), departing from their usual role as teachers, they collaborated and shared a unique learning experience. The cybersafety game addresses 4 key topics: handling scams, password management, responding to cyber attacks, and staying private. We adopted a mixed-method approach of think-aloud playtesting, semi-structured interviews, and surveys to evaluate the game's reception and impact. Participants reported highly favorable gameplay experiences and found the cybersafety advice useful. Player feedback informed game modifications, detailed in this paper, to further enhance the game's usability and educational value.

Paperid: 2083, https://arxiv.org/pdf/2503.05376.pdf

Abstract:
With increasing demands for privacy, it becomes necessary to protect sensitive user query data when accessing public key-value databases. Existing Private Information Retrieval (PIR) schemes provide full security but suffer from poor scalability, limiting their applicability in large-scale deployment. We argue that in many real-world scenarios, a more practical solution should allow users to flexibly determine the privacy levels of their queries in a theoretically guided way, balancing security and performance based on specific needs. To formally provide provable guarantees, we introduce a novel concept of distance-based indistinguishability, which can facilitate users to comfortably relax their security requirements. We then design Femur, an efficient framework to securely query public key-value stores with flexible security and performance trade-offs. It uses a space-efficient learned index to convert query keys into storage locations, obfuscates these locations with extra noise provably derived by the distance-based indistinguishability theory, and sends the expanded range to the server. The server then adaptively utilizes the best scheme to retrieve data. We also propose a novel variable-range PIR scheme optimized for bandwidth-constrained environments. Experiments show that Femur outperforms the state-of-the-art designs even when ensuring the same full security level. When users are willing to relax their privacy requirements, Femur can further improve the performance gains to up to 163.9X, demonstrating an effective trade-off between security and performance.

Paperid: 2084, https://arxiv.org/pdf/2503.01944.pdf

Abstract:
Smart contracts in Decentralized Finance (DeFi) platforms are attractive targets for attacks as their vulnerabilities can lead to massive amounts of financial losses. Flash loan attacks, in particular, pose a major threat to DeFi protocols that hold a Total Value Locked (TVL) exceeding \$106 billion. These attacks use the atomicity property of blockchains to drain funds from smart contracts in a single transaction. While existing research primarily focuses on price manipulation attacks, such as oracle manipulation, mitigating non-price flash loan attacks that often exploit smart contracts' zero-day vulnerabilities remains largely unaddressed. These attacks are challenging to detect because of their unique patterns, time sensitivity, and complexity. In this paper, we present FlashGuard, a runtime detection and mitigation method for non-price flash loan attacks. Our approach targets smart contract function signatures to identify attacks in real-time and counterattack by disrupting the attack transaction atomicity by leveraging the short window when transactions are visible in the mempool but not yet confirmed. When FlashGuard detects an attack, it dispatches a stealthy dusting counterattack transaction to miners to change the victim contract's state which disrupts the attack's atomicity and forces the attack transaction to revert. We evaluate our approach using 20 historical attacks and several unseen attacks. FlashGuard achieves an average real-time detection latency of 150.31ms, a detection accuracy of over 99.93\%, and an average disruption time of 410.92ms. FlashGuard could have potentially rescued over \$405.71 million in losses if it were deployed prior to these attack instances. FlashGuard demonstrates significant potential as a DeFi security solution to mitigate and handle rising threats of non-price flash loan attacks.

Paperid: 2085, https://arxiv.org/pdf/2503.01766.pdf

Abstract:
This paper examines the challenges in distributing AI models through model zoos and file transfer mechanisms. Despite advancements in security measures, vulnerabilities persist, necessitating a multi-layered approach to mitigate risks effectively. The physical security of model files is critical, requiring stringent access controls and attack prevention solutions. This paper proposes a novel solution architecture composed of two prevention approaches. The first is Content Disarm and Reconstruction (CDR), which focuses on disarming serialization attacks that enable attackers to run malicious code as soon as the model is loaded. The second is protecting the model architecture and weights from attacks by using Moving Target Defense (MTD), alerting the model structure, and providing verification steps to detect such attacks. The paper focuses on the highly exploitable Pickle and PyTorch file formats. It demonstrates a 100% disarm rate while validated against known AI model repositories and actual malware attacks from the HuggingFace model zoo.

Paperid: 2087, https://arxiv.org/pdf/2503.01522.pdf

Abstract:
We study the distributed function computation problem with $k$ users of which at most $s$ may be controlled by an adversary and characterize the set of functions of the sources the decoder can reconstruct robustly in the following sense -- if the users behave honestly, the function is recovered with high probability (w.h.p.); if they behave adversarially, w.h.p, either one of the adversarial users will be identified or the function is recovered with vanishingly small distortion.

Paperid: 2088, https://arxiv.org/pdf/2503.01327.pdf

Abstract:
Online abuse, a persistent aspect of social platform interactions, impacts user well-being and exposes flaws in platform designs that include insufficient detection efforts and inadequate victim protection measures. Ensuring safety in platform interactions requires the integration of victim perspectives in the design of abuse detection and response systems. In this paper, we conduct surveys (n = 230) and semi-structured interviews (n = 15) with students at a minority-serving institution in the US, to explore their experiences with abuse on a variety of social platforms, their defense strategies, and their recommendations for social platforms to improve abuse responses. We build on study findings to propose design requirements for abuse defense systems and discuss the role of privacy, anonymity, and abuse attribution requirements in their implementation. We introduce ARI, a blueprint for a unified, transparent, and personalized abuse response system for social platforms that sustainably detects abuse by leveraging the expertise of platform users, incentivized with proceeds obtained from abusers.

Paperid: 2089, https://arxiv.org/pdf/2503.00324.pdf

Abstract:
Malicious Python packages make software supply chains vulnerable by exploiting trust in open-source repositories like Python Package Index (PyPI). Lack of real-time behavioral monitoring makes metadata inspection and static code analysis inadequate against advanced attack strategies such as typosquatting, covert remote access activation, and dynamic payload generation. To address these challenges, we introduce DySec, a machine learning (ML)-based dynamic analysis framework for PyPI that uses eBPF kernel and user-level probes to monitor behaviors during package installation. By capturing 36 real-time features-including system calls, network traffic, resource usage, directory access, and installation patterns-DySec detects threats like typosquatting, covert remote access activation, dynamic payload generation, and multiphase attack malware. We developed a comprehensive dataset of 14,271 Python packages, including 7,127 malicious sample traces, by executing them in a controlled isolated environment. Experimental results demonstrate that DySec achieves a 95.99\% detection accuracy with a latency of <0.5s, reducing false negatives by 78.65\% compared to static analysis and 82.24\% compared to metadata analysis. During the evaluation, DySec flagged 11 packages that PyPI classified as benign. A manual analysis, including installation behavior inspection, confirmed six of them as malicious. These findings were reported to PyPI maintainers, resulting in the removal of four packages. DySec bridges the gap between reactive traditional methods and proactive, scalable threat mitigation in open-source ecosystems by uniquely detecting malicious install-time behaviors.

Paperid: 2090, https://arxiv.org/pdf/2502.21026.pdf

Abstract:
Server-side request forgery (SSRF) vulnerabilities are inevitable in PHP web applications. Existing static tools in detecting vulnerabilities in PHP web applications neither contain SSRF-related features to enhance detection accuracy nor consider PHP's dynamic type features. In this paper, we present Artemis, a static taint analysis tool for detecting SSRF vulnerabilities in PHP web applications. First, Artemis extracts both PHP built-in and third-party functions as candidate source and sink functions. Second, Artemis constructs both explicit and implicit call graphs to infer functions' relationships. Third, Artemis performs taint analysis based on a set of rules that prevent over-tainting and pauses when SSRF exploitation is impossible. Fourth, Artemis analyzes the compatibility of path conditions to prune false positives. We have implemented a prototype of Artemis and evaluated it on 250 PHP web applications. Artemis reports 207 true vulnerable paths (106 true SSRFs) with 15 false positives. Of the 106 detected SSRFs, 35 are newly found and reported to developers, with 24 confirmed and assigned CVE IDs.

Paperid: 2091, https://arxiv.org/pdf/2502.20621.pdf

Abstract:
Phishing attacks, typically carried out by email, remain a significant cybersecurity threat with attackers creating legitimate-looking websites to deceive recipients into revealing sensitive information or executing harmful actions. In this paper, we propose {\bf EPhishCADE}, the first {\em privacy-aware}, {\em multi-dimensional} framework for {\bf E}mail {\bf Phish}ing {\bf CA}mpaign {\bf DE}tection to automatically identify email phishing campaigns by clustering seemingly unrelated attacks. Our framework employs a hierarchical architecture combining a structural layer and a contextual layer, offering a comprehensive analysis of phishing attacks by thoroughly examining both structural and contextual elements. Specifically, we implement a graph-based contextual layer to reveal hidden similarities across multiple dimensions, including textual, numeric, temporal, and spatial features, among attacks that may initially appear unrelated. Our framework streamlines the handling of security threat reports, reducing analysts' fatigue and workload while enhancing protection against these threats. Another key feature of our framework lies in its sole reliance on phishing URLs in emails without the need for private information, including senders, recipients, content, etc. This feature enables a collaborative identification of phishing campaigns and attacks among multiple organizations without compromising privacy. Finally, we benchmark our framework against an established structure-based study (WWW \textquotesingle 17) to demonstrate its effectiveness.

Paperid: 2092, https://arxiv.org/pdf/2502.20207.pdf

Abstract:
Fast and accurate estimation of quantiles on data streams coming from communication networks, Internet of Things (IoT), and alike, is at the heart of important data processing applications including statistical analysis, latency monitoring, query optimization for parallel database management systems, and more. Indeed, quantiles are more robust indicators for the underlying distribution, compared to moment-based indicators such as mean and variance. The streaming setting additionally constrains accurate tracking of quantiles, as stream items may arrive at a very high rate and must be processed as quickly as possible and discarded, being their storage usually unfeasible. Since an exact solution is only possible when data are fully stored, the goal in practical contexts is to provide an approximate solution with a provably guaranteed bound on the approximation error committed, while using a minimal amount of space. At the same time, with the increasing amount of personal and sensitive information exchanged, it is essential to design privacy protection techniques to ensure confidentiality and data integrity. In this paper we present the following differentially private streaming algorithms for frugal estimation of a quantile: \textsc{DP-Frugal-1U-L}, \textsc{DP-Frugal-1U-G}, \textsc{DP-Frugal-1U-$Ï$}. Frugality refers to the ability of the algorithms to provide a good approximation to the sought quantile using a modest amount of space, either one or two units of memory. We provide a theoretical analysis and experimental results.

Paperid: 2093, https://arxiv.org/pdf/2502.19041.pdf

Abstract:
Although Aligned Large Language Models (LLMs) are trained to refuse harmful requests, they remain vulnerable to jailbreak attacks. Unfortunately, existing methods often focus on surface-level patterns, overlooking the deeper attack essences. As a result, defenses fail when attack prompts change, even though the underlying "attack essence" remains the same. To address this issue, we introduce EDDF, an \textbf{E}ssence-\textbf{D}riven \textbf{D}efense \textbf{F}ramework Against Jailbreak Attacks in LLMs. EDDF is a plug-and-play input-filtering method and operates in two stages: 1) offline essence database construction, and 2) online adversarial query detection. The key idea behind EDDF is to extract the "attack essence" from a diverse set of known attack instances and store it in an offline vector database. Experimental results demonstrate that EDDF significantly outperforms existing methods by reducing the Attack Success Rate by at least 20\%, underscoring its superior robustness against jailbreak attacks.

Paperid: 2094, https://arxiv.org/pdf/2502.18501.pdf

Abstract:
Advancements in digital technologies make it easy to modify the content of digital images. Hence, ensuring digital images integrity and authenticity is necessary to protect them against various attacks that manipulate them. We present a Deep Learning (DL) based dual invisible watermarking technique for performing source authentication, content authentication, and protecting digital content copyright of images sent over the internet. Beyond securing images, the proposed technique demonstrates robustness to content-preserving image manipulations. It is also impossible to imitate or overwrite watermarks because the cryptographic hash of the image and the dominant features of the image in the form of perceptual hash are used as watermarks. We highlighted the need for source authentication to safeguard image integrity and authenticity, along with identifying similar content for copyright protection. After exhaustive testing, we obtained a high peak signal-to-noise ratio (PSNR) and structural similarity index measure (SSIM), which implies there is a minute change in the original image after embedding our watermarks. Our trained model achieves high watermark extraction accuracy and to the best of our knowledge, this is the first deep learning-based dual watermarking technique proposed in the literature.

Paperid: 2095, https://arxiv.org/pdf/2502.18303.pdf

Abstract:
Messaging Layer security (MLS) and its underlying Continuous Group Key Agreement (CGKA) protocol allows a group of users to share a cryptographic secret in a dynamic manner, such that the secret is modified in member insertions and deletions. One of the most relevant contributions of MLS is its efficiency, as its communication cost scales logarithmically with the number of members. However, this claim has only been analysed in theoretical models and thus it is unclear how efficient MLS is in real-world scenarios. Furthermore, practical decisions such as the chosen Delivery Service and paradigm can also influence the efficiency and evolution of an MLS group. In this work we analyse MLS from an empirical viewpoint: we provide real-world measurements for metrics such as commit generation and processing times and message sizes under different conditions. In order to obtain these results we have developed a highly configurable environment for empirical evaluations of MLS through the simulation of MLS clients. Among other findings, our results show that computation costs scale linearly in practical scenarios even in the best-case scenario.

Paperid: 2096, https://arxiv.org/pdf/2502.17537.pdf

Abstract:
The proliferation of text-to-image diffusion models has raised significant privacy and security concerns, particularly regarding the generation of copyrighted or harmful images. In response, several concept erasure (defense) methods have been developed to prevent the generation of unwanted content through post-hoc finetuning. On the other hand, concept restoration (attack) methods seek to recover supposedly erased concepts via adversarially crafted prompts. However, all existing restoration methods only succeed in the highly restrictive scenario of finding adversarial prompts tailed to some fixed seed. To address this, we introduce RECORD, a novel coordinate-descent-based restoration algorithm that finds adversarial prompts to recover erased concepts independently of the seed. Our extensive experiments demonstrate RECORD consistently outperforms the current restoration methods by up to 17.8 times in this setting. Our findings further reveal the susceptibility of unlearned models to restoration attacks, providing crucial insights into the behavior of unlearned models under the influence of adversarial prompts.

Paperid: 2097, https://arxiv.org/pdf/2502.17330.pdf

Abstract:
We propose a novel approach for performing side-channel attacks on elliptic curve cryptography. Unlike previous approaches and inspired by the ``activity detection'' literature, we adopt a long-short-term memory (LSTM) neural network to analyze a power trace and identify patterns of operation in the scalar multiplication algorithm performed during an ECDSA signature, that allows us to recover bits of the ephemeral key, and thus retrieve the signer's private key. Our approach is based on the fact that modular reductions are conditionally performed by micro-ecc and depend on key bits. We evaluated the feasibility and reproducibility of our attack through experiments in both simulated and real implementations. We demonstrate the effectiveness of our attack by implementing it on a real target device, an STM32F415 with the micro-ecc library, and successfully compromise it. Furthermore, we show that current countermeasures, specifically the coordinate randomization technique, are not sufficient to protect against side channels. Finally, we suggest other approaches that may be implemented to thwart our attack.

Paperid: 2098, https://arxiv.org/pdf/2502.16650.pdf

Abstract:
5G NR sidelink communication enables new possibilities for direct device-to-device interactions, supporting applications from vehicle-to-everything (V2X) systems to public safety, industrial automation, and drone networks. However, these advancements come with significant security challenges due to the decentralized trust model and increased reliance on User Equipment (UE) for critical functions like synchronization, resource allocation, and authorization. This paper presents the first comprehensive security analysis of NR V2X sidelink. We identify vulnerabilities across critical procedures and demonstrate plausible attack, including attacks that manipulate data integrity feedback and block resources, ultimately undermining the reliability and privacy of sidelink communications. Our analysis reveals that NR operational modes are vulnerable, with the ones relying on autonomous resource management (without network supervision) particularly exposed. To address these issues, we propose mitigation strategies to enhance the security of 5G sidelink communications. This work establishes a foundation for future efforts to strengthen 5G device-to-device sidelink communications, ensuring its safe deployment in critical applications.

Paperid: 2099, https://arxiv.org/pdf/2502.16054.pdf

Abstract:
Given the complexity of multi-tenant cloud environments and the growing need for real-time threat mitigation, Security Operations Centers (SOCs) must adopt AI-driven adaptive defense mechanisms to counter Advanced Persistent Threats (APTs). However, SOC analysts face challenges in handling adaptive adversarial tactics, requiring intelligent decision-support frameworks. We propose a Cognitive Hierarchy Theory-driven Deep Q-Network (CHT-DQN) framework that models interactive decision-making between SOC analysts and AI-driven APT bots. The SOC analyst (defender) operates at cognitive level-1, anticipating attacker strategies, while the APT bot (attacker) follows a level-0 policy. By incorporating CHT into DQN, our framework enhances adaptive SOC defense using Attack Graph (AG)-based reinforcement learning. Simulation experiments across varying AG complexities show that CHT-DQN consistently achieves higher data protection and lower action discrepancies compared to standard DQN. A theoretical lower bound further confirms its superiority as AG complexity increases. A human-in-the-loop (HITL) evaluation on Amazon Mechanical Turk (MTurk) reveals that SOC analysts using CHT-DQN-derived transition probabilities align more closely with adaptive attackers, leading to better defense outcomes. Moreover, human behavior aligns with Prospect Theory (PT) and Cumulative Prospect Theory (CPT): participants are less likely to reselect failed actions and more likely to persist with successful ones. This asymmetry reflects amplified loss sensitivity and biased probability weighting -- underestimating gains after failure and overestimating continued success. Our findings highlight the potential of integrating cognitive models into deep reinforcement learning to improve real-time SOC decision-making for cloud security.

Paperid: 2100, https://arxiv.org/pdf/2502.16044.pdf

Abstract:
Deep Neural Networks (DNNs) have demonstrated remarkable success across a wide range of tasks, particularly in fields such as image classification. However, DNNs are highly susceptible to adversarial attacks, where subtle perturbations are introduced to input images, leading to erroneous model outputs. In today's digital era, ensuring the security and integrity of images processed by DNNs is of critical importance. One of the most prominent adversarial attack methods is the Fast Gradient Sign Method (FGSM), which perturbs images in the direction of the loss gradient to deceive the model. This paper presents a novel approach for detecting and filtering FGSM adversarial attacks in image processing tasks. Our proposed method evaluates 10,000 images, each subjected to five different levels of perturbation, characterized by $Îµ$ values of 0.01, 0.02, 0.05, 0.1, and 0.2. These perturbations are applied in the direction of the loss gradient. We demonstrate that our approach effectively filters adversarially perturbed images, mitigating the impact of FGSM attacks. The method is implemented in Python, and the source code is publicly available on GitHub for reproducibility and further research.

Paperid: 2101, https://arxiv.org/pdf/2502.15506.pdf

Abstract:
With the emergence of high-performance large language models (LLMs) such as GPT, Claude, and Gemini, the autonomous and semi-autonomous execution of tasks has significantly advanced across various domains. However, in highly specialized fields such as cybersecurity, full autonomy remains a challenge. This difficulty primarily stems from the limitations of LLMs in reasoning capabilities and domain-specific knowledge. We propose a system that semi-autonomously executes complex cybersecurity workflows by employing multiple LLMs modules to formulate attack strategies, generate commands, and analyze results, thereby addressing the aforementioned challenges. In our experiments using Hack The Box virtual machines, we confirmed that our system can autonomously construct attack strategies, issue appropriate commands, and automate certain processes, thereby reducing the need for manual intervention.

Paperid: 2102, https://arxiv.org/pdf/2502.15023.pdf

Abstract:
The Quantum Signature Validation Algorithm (QSVA) is introduced as a novel quantum-based approach designed to enhance the detection of tampered transactions in blockchain systems. Leveraging the powerful capabilities of quantum computing, especially within the framework of transaction-based blockchains, the QSVA aims to surpass classical methods in both speed and efficiency. By utilizing a quantum walk approach integrated with PageRank-based search algorithms, QSVA provides a robust mechanism for identifying fraudulent transactions. Our adaptation of the transaction graph representation efficiently verifies transactions by maintaining a current set of unspent transaction outputs (UTXOs) characteristic of models like Bitcoin. The QSVA not only amplifies detection efficacy through a quadratic speedup but also incorporates two competing quantum search algorithms$-$Quantum SearchRank and Randomized SearchRank$-$to explore their effectiveness as foundational components. Our results indicate that Randomized SearchRank, in particular, outperforms its counterpart in aligning with transaction rankings based on the Classical PageRank algorithm, ensuring more consistent detection probabilities. These findings highlight the potential for quantum algorithms to revolutionize blockchain security by improving detection times to $O(\sqrt{N})$. Progress in Distributed Ledger Technologies (DLTs) could facilitate future integration of quantum solutions into more general distributed systems. As quantum technology continues to evolve, the QSVA stands as a promising strategy offering significant advancements in blockchain efficiency and security.

Paperid: 2103, https://arxiv.org/pdf/2502.14116.pdf

Abstract:
Hardware Trojans are malicious modifications in digital designs that can be inserted by untrusted supply chain entities. Hardware Trojans can give rise to diverse attack vectors such as information leakage (e.g. MOLES Trojan) and denial-of-service (rarely triggered bit flip). Such an attack in critical systems (e.g. healthcare and aviation) can endanger human lives and lead to catastrophic financial loss. Several techniques have been developed to detect such malicious modifications in digital designs, particularly for designs sourced from third-party intellectual property (IP) vendors. However, most techniques have scalability concerns (due to unsound assumptions during evaluation) and lead to large number of false positive detections (false alerts). Our framework (SALTY) mitigates these concerns through the use of a novel Graph Neural Network architecture (using Jumping-Knowledge mechanism) for generating initial predictions and an Explainable Artificial Intelligence (XAI) approach for fine tuning the outcomes (post-processing). Experiments show 98% True Positive Rate (TPR) and True Negative Rate (TNR), significantly outperforming state-of-the-art techniques across a large set of standard benchmarks.

Paperid: 2104, https://arxiv.org/pdf/2502.13833.pdf

Abstract:
Synthetic data has garnered attention as a Privacy Enhancing Technology (PET) in sectors such as healthcare and finance. When using synthetic data in practical applications, it is important to provide protection guarantees. In the literature, two family of approaches are proposed for tabular data: on the one hand, Similarity-based methods aim at finding the level of similarity between training and synthetic data. Indeed, a privacy breach can occur if the generated data is consistently too similar or even identical to the train data. On the other hand, Attack-based methods conduce deliberate attacks on synthetic datasets. The success rates of these attacks reveal how secure the synthetic datasets are. In this paper, we introduce a contrastive method that improves privacy assessment of synthetic datasets by embedding the data in a more representative space. This overcomes obstacles surrounding the multitude of data types and attributes. It also makes the use of intuitive distance metrics possible for similarity measurements and as an attack vector. In a series of experiments with publicly available datasets, we compare the performances of similarity-based and attack-based methods, both with and without use of the contrastive learning-based embeddings. Our results show that relatively efficient, easy to implement privacy metrics can perform equally well as more advanced metrics explicitly modeling conditions for privacy referred to by the GDPR.

Paperid: 2105, https://arxiv.org/pdf/2502.13401.pdf

Abstract:
Cryptographic implementations bolster security against timing side-channel attacks by integrating constant-time components. However, the new ciphertext side channels resulting from the deterministic memory encryption in Trusted Execution Environments (TEEs), enable ciphertexts to manifest identifiable patterns when being sequentially written to the same memory address. Attackers with read access to encrypted memory in TEEs can potentially deduce plaintexts by analyzing these changing ciphertext patterns. In this paper, we design CipherGuard, a compiler-aided mitigation methodology to counteract ciphertext side channels with high efficiency and security. CipherGuard is based on the LLVM ecosystem, and encompasses multiple mitigation strategies, including software-based probabilistic encryption and secret-aware register allocation. Through a comprehensive evaluation, we demonstrate that CipherGuard can strengthen the security of various cryptographic implementations more efficiently than existing state-of-the-art defense mechanism, i.e., CipherFix.

Paperid: 2106, https://arxiv.org/pdf/2502.13060.pdf

Abstract:
Most heavy computation occurs on servers owned by a second party. This reduces data privacy, resulting in interest in data-oblivious computation, which typically severely degrades performance. Secure and fast delegated computation is particularly important for linear algebra, which comprises a large fraction of total computation and is best run on highly specialized hardware often only accessible through the cloud. We state the natural efficiency and security desiderata for fast and data-oblivious delegated linear algebra. We demonstrate the existence of trapdoored matrix families based on a LPN assumption, and provide a scheme for secure delegated matrix-matrix and matrix-vectors multiplication based on the existence of trapdoored matrices. We achieve sublinear overhead for the server, dramatically reduced computation cost for the client, and various practical advantages over previous algorithms.

Paperid: 2107, https://arxiv.org/pdf/2502.12863.pdf

Abstract:
Malware attacks pose a significant threat in today's interconnected digital landscape, causing billions of dollars in damages. Detecting and identifying families as early as possible provides an edge in protecting against such malware. We explore a lightweight, order-invariant approach to detecting and mitigating malware threats: analyzing API calls without regard to their sequence. We publish a public dataset of over three hundred thousand samples and their function call parameters for this task, annotated with labels indicating benign or malicious activity. The complete dataset is above 550GB uncompressed in size. We leverage machine learning algorithms, such as random forests, and conduct behavioral analysis by examining patterns and anomalies in API call sequences. By investigating how the function calls occur regardless of their order, we can identify discriminating features that can help us identify malware early on. The models we've developed are not only effective but also efficient. They are lightweight and can run on any machine with minimal performance overhead, while still achieving an impressive F1-Score of over 85\%. We also empirically show that we only need a subset of the function call sequence, specifically calls to the ntdll.dll library, to identify malware. Our research demonstrates the efficacy of this approach through empirical evaluations, underscoring its accuracy and scalability. The code is open source and available at Github along with the dataset on Zenodo.

Paperid: 2108, https://arxiv.org/pdf/2502.12628.pdf

Abstract:
Verifiable Homomorphic Encryption (VHE) is a cryptographic technique that integrates Homomorphic Encryption (HE) with Verifiable Computation (VC). It serves as a crucial technology for ensuring both privacy and integrity in outsourced computation, where a client sends input ciphertexts ct and a function f to a server and verifies the correctness of the evaluation upon receiving the evaluation result f(ct) from the server. At CCS, Chatel et al. introduced two lightweight VHE schemes: Replication Encoding (REP) and Polynomial Encoding (PE). A similar approach to REP was used by Albrecht et al. in Eurocrypt to develop a Verifiable Oblivious PRF scheme (vADDG). A key approach in these schemes is to embed specific secret information within HE ciphertexts to verify homomorphic evaluations. This paper presents efficient attacks that exploit the homomorphic properties of encryption schemes. The one strategy is to retrieve the secret information in encrypted state from the input ciphertexts and then leverage it to modify the resulting ciphertext without being detected by the verification algorithm. The other is to exploit the secret embedding structure to modify the evaluation function f into f' which works well on input values for verification purposes. Our forgery attack on vADDG demonstrates that the proposed 80-bit security parameters in fact offer less than 10-bits of concrete security. Our attack on REP and PE achieves a probability 1 attack with linear time complexity when using fully homomorphic encryption.

Paperid: 2109, https://arxiv.org/pdf/2502.11650.pdf

Abstract:
Based on Irish older adult's perceptions, practices, and challenges regarding password management, the goal of this study was to compile suitable advice that can benefit this demographic. To achieve this, we first conducted semi structured interviews (n=37), we then collated advice based on best practice and what we learned from these interviews. We facilitated two independent focus groups (n=31) to evaluate and adjust this advice and tested the finalized advice through an observational study (n=15). The participants were aged between 59 and 86 and came from various counties in Ireland, both rural and urban. The findings revealed that managing multiple passwords was a significant source of frustration, leading some participants to adopt novel and informal strategies for storing them. A notable hesitation to adopt digital password managers and passphrases was also observed. Participants appreciated guidance on improving their password practices, with many affirming that securely writing down passwords was a practical strategy. Irish older adults demonstrated strong intuition regarding cybersecurity, notably expressing concerns over knowledge-based security checks used by banks and government institutions. This study aims to contribute to the aggregation of practical password advice suited to older adults, making password security more manageable and less burdensome for this demographic.

Paperid: 2110, https://arxiv.org/pdf/2502.11470.pdf

Abstract:
The rapid expansion of Internet of Things (IoT) devices has increased the risk of cyber-attacks, making effective detection essential for securing IoT networks. This work introduces a novel approach combining Self-Organizing Maps (SOMs), Deep Belief Networks (DBNs), and Autoencoders to detect known and previously unseen attack patterns. A comprehensive evaluation using simulated and real-world traffic data is conducted, with models optimized via Particle Swarm Optimization (PSO). The system achieves an accuracy of up to 99.99% and Matthews Correlation Coefficient (MCC) values exceeding 99.50%. Experiments on NSL-KDD, UNSW-NB15, and CICIoT2023 confirm the model's strong performance across diverse attack types. These findings suggest that the proposed method enhances IoT security by identifying emerging threats and adapting to evolving attack strategies.

Paperid: 2111, https://arxiv.org/pdf/2502.11455.pdf

Abstract:
Safety alignment is critical in pre-training large language models (LLMs) to generate responses aligned with human values and refuse harmful queries. Unlike LLM, the current safety alignment of VLMs is often achieved with post-hoc safety fine-tuning. However, these methods are less effective to white-box attacks. To address this, we propose $\textit{Adversary-aware DPO (ADPO)}$, a novel training framework that explicitly considers adversarial. $\textit{Adversary-aware DPO (ADPO)}$ integrates adversarial training into DPO to enhance the safety alignment of VLMs under worst-case adversarial perturbations. $\textit{ADPO}$ introduces two key components: (1) an adversarial-trained reference model that generates human-preferred responses under worst-case perturbations, and (2) an adversarial-aware DPO loss that generates winner-loser pairs accounting for adversarial distortions. By combining these innovations, $\textit{ADPO}$ ensures that VLMs remain robust and reliable even in the presence of sophisticated jailbreak attacks. Extensive experiments demonstrate that $\textit{ADPO}$ outperforms baselines in the safety alignment and general utility of VLMs.

Paperid: 2112, https://arxiv.org/pdf/2502.11006.pdf

Abstract:
Large Language Models (LLMs) are vulnerable to adversarial prompt based injects. These injects could jailbreak or exploit vulnerabilities within these models with explicit prompt requests leading to undesired responses. In the context of investigating prompt injects, the challenge is the sheer volume of input prompts involved that are likely to be largely benign. This investigative challenge is further complicated by the semantics and subjectivity of the input prompts involved in the LLM conversation with its user and the context of the environment to which the conversation is being carried out. Hence, the challenge for AI security investigators would be two-fold. The first is to identify adversarial prompt injects and then to assess whether the input prompt is contextually benign or adversarial. For the first step, this could be done using existing AI security solutions like guardrails to detect and protect the LLMs. Guardrails have been developed using a variety of approaches. A popular approach is to use signature based. Another popular approach to develop AI models to classify such prompts include the use of NLP based models like a language model. However, in the context of conducting an AI security investigation of prompt injects, these guardrails lack the ability to aid investigators in triaging or assessing the identified input prompts. In this applied research exploration, we explore the use of a text generation capabilities of LLM to detect prompt injects and generate explanation for its detections to aid AI security investigators in assessing and triaging of such prompt inject detections. The practical benefit of such a tool is to ease the task of conducting investigation into prompt injects.

Paperid: 2113, https://arxiv.org/pdf/2502.10495.pdf

Abstract:
In the rapidly evolving landscape of image generation, Latent Diffusion Models (LDMs) have emerged as powerful tools, enabling the creation of highly realistic images. However, this advancement raises significant concerns regarding copyright infringement and the potential misuse of generated content. Current watermarking techniques employed in LDMs often embed constant signals to the generated images that compromise their stealthiness, making them vulnerable to detection by malicious attackers. In this paper, we introduce SWA-LDM, a novel approach that enhances watermarking by randomizing the embedding process, effectively eliminating detectable patterns while preserving image quality and robustness. Our proposed watermark presence attack reveals the inherent vulnerabilities of existing latent-based watermarking methods, demonstrating how easily these can be exposed. Through comprehensive experiments, we validate that SWA-LDM not only fortifies watermark stealthiness but also maintains competitive performance in watermark robustness and visual fidelity. This work represents a pivotal step towards securing LDM-generated images against unauthorized use, ensuring both copyright protection and content integrity in an era where digital image authenticity is paramount.

Paperid: 2114, https://arxiv.org/pdf/2502.10390.pdf

Abstract:
Transformers excel at discovering patterns in sequential data, yet their fundamental limitations and learning mechanisms remain crucial topics of investigation. In this paper, we study the ability of Transformers to learn pseudo-random number sequences from linear congruential generators (LCGs), defined by the recurrence relation $x_{t+1} = a x_t + c \;\mathrm{mod}\; m$. We find that with sufficient architectural capacity and training data variety, Transformers can perform in-context prediction of LCG sequences with unseen moduli ($m$) and parameters ($a,c$). By analyzing the embedding layers and attention patterns, we uncover how Transformers develop algorithmic structures to learn these sequences in two scenarios of increasing complexity. First, we investigate how Transformers learn LCG sequences with unseen ($a, c$) but fixed modulus; and demonstrate successful learning up to $m = 2^{32}$. We find that models learn to factorize $m$ and utilize digit-wise number representations to make sequential predictions. In the second, more challenging scenario of unseen moduli, we show that Transformers can generalize to unseen moduli up to $m_{\text{test}} = 2^{16}$. In this case, the model employs a two-step strategy: first estimating the unknown modulus from the context, then utilizing prime factorizations to generate predictions. For this task, we observe a sharp transition in the accuracy at a critical depth $d= 3$. We also find that the number of in-context sequence elements needed to reach high accuracy scales sublinearly with the modulus.

Paperid: 2115, https://arxiv.org/pdf/2502.10166.pdf

Abstract:
As digital services increasingly replace traditional analogue systems, ensuring that older adults are not left behind is critical to fostering inclusive access. This study explores how digital educators support older adults in developing essential digital skills, drawing insights from interviews with $34$ educators in Ireland. These educators, both professional and volunteer, offer instruction through a range of formats, including workshops, remote calls, and in-person sessions. Our findings highlight the importance of personalized, step-by-step guidance tailored to older adults' learning needs, as well as fostering confidence through hands-on engagement with technology. Key challenges identified include limited transportation options, poor internet connectivity, outdated devices, and a lack of familial support for learning. To address these barriers, we propose enhanced public funding, expanded access to resources, and sustainable strategies such as providing relevant and practical course materials. Additionally, innovative tools like simulated online platforms for practicing digital transactions can help reduce anxiety and enhance digital literacy among older adults. This study underscores the vital role that digital educators play in bridging the digital divide, creating a more inclusive, human-centered approach to digital learning for older adults.

Paperid: 2116, https://arxiv.org/pdf/2502.09584.pdf

Abstract:
We initiate the study of differentially private data-compression schemes motivated by the insecurity of the popular "Compress-Then-Encrypt" framework. Data compression is a useful tool which exploits redundancy in data to reduce storage/bandwidth when files are stored or transmitted. However, if the contents of a file are confidential then the length of a compressed file might leak confidential information about the content of the file itself. Encrypting a compressed file does not eliminate this leakage as data encryption schemes are only designed to hide the content of confidential message instead of the length of the message. In our proposed Differentially Private Compress-Then-Encrypt framework, we add a random positive amount of padding to the compressed file to ensure that any leakage satisfies the rigorous privacy guarantee of $(Îµ,Î´)$-differential privacy. The amount of padding that needs to be added depends on the sensitivity of the compression scheme to small changes in the input, i.e., to what degree can changing a single character of the input message impact the length of the compressed file. While some popular compression schemes are highly sensitive to small changes in the input, we argue that effective data compression schemes do not necessarily have high sensitivity. Our primary technical contribution is analyzing the fine-grained sensitivity of the LZ77 compression scheme (IEEE Trans. Inf. Theory 1977) which is one of the most common compression schemes used in practice. We show that the global sensitivity of the LZ77 compression scheme has the upper bound $O(W^{2/3}\log n)$ where $W\leq n$ denotes the size of the sliding window. When $W=n$, we show the lower bound $Î©(n^{2/3}\log^{1/3}n)$ for the global sensitivity of the LZ77 compression scheme which is tight up to a sublogarithmic factor.

Paperid: 2117, https://arxiv.org/pdf/2502.09484.pdf

Abstract:
Traditional ethical hacking relies on skilled professionals and time-intensive command management, which limits its scalability and efficiency. To address these challenges, we introduce PenTest++, an AI-augmented system that integrates automation with generative AI (GenAI) to optimise ethical hacking workflows. Developed in a controlled virtual environment, PenTest++ streamlines critical penetration testing tasks, including reconnaissance, scanning, enumeration, exploitation, and documentation, while maintaining a modular and adaptable design. The system balances automation with human oversight, ensuring informed decision-making at key stages, and offers significant benefits such as enhanced efficiency, scalability, and adaptability. However, it also raises ethical considerations, including privacy concerns and the risks of AI-generated inaccuracies (hallucinations). This research underscores the potential of AI-driven systems like PenTest++ to complement human expertise in cybersecurity by automating routine tasks, enabling professionals to focus on strategic decision-making. By incorporating robust ethical safeguards and promoting ongoing refinement, PenTest++ demonstrates how AI can be responsibly harnessed to address operational and ethical challenges in the evolving cybersecurity landscape.

Paperid: 2118, https://arxiv.org/pdf/2502.09251.pdf

Abstract:
Replication protocols are essential for distributed systems, ensuring consistency, reliability, and fault tolerance. Traditional Crash Fault Tolerant (CFT) protocols, which assume a fail-stop model, are inadequate for untrusted cloud environments where adversaries or software bugs can cause Byzantine behavior. Byzantine Fault Tolerant (BFT) protocols address these threats but face significant performance, resource overheads, and scalability challenges. This paper introduces Recipe, a novel approach to transforming CFT protocols to operate securely in Byzantine settings without altering their core logic. Recipe rethinks CFT protocols in the context of modern cloud hardware, including many-core servers, RDMA-capable networks, and Trusted Execution Environments (TEEs). The approach leverages these advancements to enhance the security and performance of replication protocols in untrusted cloud environments. Recipe implements two practical security mechanisms, i.e., transferable authentication and non-equivocation, using TEEs and high-performance networking stacks (e.g., RDMA, DPDK). These mechanisms ensure that any CFT protocol can be transformed into a BFT protocol, guaranteeing authenticity and non-equivocation. The Recipe protocol consists of five key components: transferable authentication, initialization, normal operation, view change, and recovery phases. The protocol's correctness is formally verified using Tamarin, a symbolic model checker. Recipe is implemented as a library and applied to transform four widely used CFT protocols-Raft, Chain Replication, ABD, and AllConcur-into Byzantine settings. The results demonstrate up to 24x higher throughput compared to PBFT and 5.9x better performance than state-of-the-art BFT protocols. Additionally, Recipe requires fewer replicas and offers confidentiality, a feature absent in traditional BFT protocols.

Paperid: 2119, https://arxiv.org/pdf/2502.08610.pdf

Abstract:
As AI systems integrate into critical infrastructure, security gaps in AI compliance frameworks demand urgent attention. This paper audits and quantifies security risks in three major AI governance standards: NIST AI RMF 1.0, UK's AI and Data Protection Risk Toolkit, and the EU's ALTAI. Using a novel risk assessment methodology, we develop four key metrics: Risk Severity Index (RSI), Attack Potential Index (AVPI), Compliance-Security Gap Percentage (CSGP), and Root Cause Vulnerability Score (RCVS). Our analysis identifies 136 concerns across the frameworks, exposing significant gaps. NIST fails to address 69.23 percent of identified risks, ALTAI has the highest attack vector vulnerability (AVPI = 0.51) and the ICO Toolkit has the largest compliance-security gap, with 80.00 percent of high-risk concerns remaining unresolved. Root cause analysis highlights under-defined processes (ALTAI RCVS = 033) and weak implementation guidance (NIST and ICO RCVS = 0.25) as critical weaknesses. These findings emphasize the need for stronger, enforceable security controls in AI compliance. We offer targeted recommendations to enhance security posture and bridge the gap between compliance and real-world AI risks.

Paperid: 2120, https://arxiv.org/pdf/2502.08447.pdf

Abstract:
Inter-app communication is a mandatory and security-critical functionality of operating systems, such as Android. On the application level, Android implements this facility through Intents, which can also transfer non-primitive objects using Java's Serializable API. However, the Serializable API has a long history of deserialization vulnerabilities, specifically deserialization gadget chains. Research endeavors have been heavily directed towards the detection of deserialization gadget chains on the Java platform. Yet, there is little knowledge about the existence of gadget chains within the Android platform. We aim to close this gap by searching gadget chains in the Android SDK, Android's official development libraries, as well as frequently used third-party libraries. To handle this large dataset, we design a gadget chain detection tool optimized for soundness and efficiency. In a benchmark on the full Ysoserial dataset, it achieves similarly sound results to the state-of-the-art in significantly less time. Using our tool, we first show that the Android SDK contains almost the same trampoline gadgets as the Java Class Library. We also find that one can trigger Java native serialization through Android's Parcel API. Yet, running our tool on the Android SDK and 1,200 Android dependencies, in combination with a comprehensive sink dataset, yields no security-critical gadget chains. This result opposes the general notion of Java deserialization gadget chains being a widespread problem. Instead, the issue appears to be more nuanced, and we provide a perspective on where to direct further research.

Paperid: 2121, https://arxiv.org/pdf/2502.07159.pdf

Abstract:
Motivated by practical concerns in cryptography, we study pseudorandomness properties of permutations on $\{0,1\}^n$ computed by random circuits made from reversible $3$-bit gates (permutations on $\{0,1\}^3$). Our main result is that a random circuit of depth $\sqrt{n} \cdot \tilde{O}(k^3)$, with each layer consisting of $Î(n)$ random gates in a fixed two-dimensional nearest-neighbor architecture, yields approximate $k$-wise independent permutations. Our result can be seen as a particularly simple/practical block cipher construction that gives provable statistical security against attackers with access to $k$~input-output pairs within few rounds. The main technical component of our proof consists of two parts: 1. We show that the Markov chain on $k$-tuples of $n$-bit strings induced by a single random $3$-bit one-dimensional nearest-neighbor gate has spectral gap at least $1/n \cdot \tilde{O}(k)$. Then we infer that a random circuit with layers of random gates in a fixed one-dimensional gate architecture yields approximate $k$-wise independent permutations of $\{0,1\}^n$ in depth $n\cdot \tilde{O}(k^2)$ 2. We show that if the $n$ wires are layed out on a two-dimensional lattice of bits, then repeatedly alternating applications of approximate $k$-wise independent permutations of $\{0,1\}^{\sqrt n}$ to the rows and columns of the lattice yields an approximate $k$-wise independent permutation of $\{0,1\}^n$ in small depth. Our work improves on the original work of Gowers, who showed a gap of $1/\mathrm{poly}(n,k)$ for one random gate (with non-neighboring inputs); and, on subsequent work improving the gap to $Î©(1/n^2k)$ in the same setting.

Paperid: 2122, https://arxiv.org/pdf/2502.07066.pdf

Abstract:
In this paper we propose new methods to statistically assess $f$-Differential Privacy ($f$-DP), a recent refinement of differential privacy (DP) that remedies certain weaknesses of standard DP (including tightness under algorithmic composition). A challenge when deploying differentially private mechanisms is that DP is hard to validate, especially in the black-box setting. This has led to numerous empirical methods for auditing standard DP, while $f$-DP remains less explored. We introduce new black-box methods for $f$-DP that, unlike existing approaches for this privacy notion, do not require prior knowledge of the investigated algorithm. Our procedure yields a complete estimate of the $f$-DP trade-off curve, with theoretical guarantees of convergence. Additionally, we propose an efficient auditing method that empirically detects $f$-DP violations with statistical certainty, merging techniques from non-parametric estimation and optimal classification theory. Through experiments on a range of DP mechanisms, we demonstrate the effectiveness of our estimation and auditing procedures.

Paperid: 2123, https://arxiv.org/pdf/2502.07053.pdf

Abstract:
Internet-of-Things (IoT) devices are increasingly common in both consumer and industrial settings, often performing safety-critical functions. Although securing these devices is vital, manufacturers typically neglect security issues or address them as an afterthought. This is of particular importance in IoT networks, e.g., in the industrial automation settings. To this end, network attestation -- verifying the software state of all devices in a network -- is a promising mitigation approach. However, current network attestation schemes have certain shortcomings: (1) lengthy TOCTOU (Time-Of-Check-Time-Of-Use) vulnerability windows, (2) high latency and resource overhead, and (3) susceptibility to interference from compromised devices. To address these limitations, we construct TRAIN (TOCTOU-Resilient Attestation for IoT Networks), an efficient technique that minimizes TOCTOU windows, ensures constant-time per-device attestation, and maintains resilience even with multiple compromised devices. We demonstrate TRAIN's viability and evaluate its performance via a fully functional and publicly available prototype.

Paperid: 2124, https://arxiv.org/pdf/2502.06521.pdf

Abstract:
Advanced Persistent Threats (APTs) are challenging to detect due to their complexity and stealth. To mitigate such attacks, many approaches utilize provenance graphs to model entities and their dependencies, detecting the covert and persistent nature of APTs. However, existing methods face several challenges: 1) Environmental noise hinders precise detection; 2) Reliance on hard-to-obtain labeled data and prior knowledge of APTs limits their ability to detect unknown threats; 3) The difficulty in capturing long-range interaction dependencies, leading to the loss of critical context. We propose Sentient, a threat detection system based on behavioral intent analysis that detects node-level threats from audit logs. Sentient constructs a provenance graph from the audit logs and uses this graph to build multiple scenarios. By combining graph comprehension with multiple scenario comprehension, Sentient learns normal interaction behaviors. Sentient detects anomalies by identifying interactions that deviate from the established behavior patterns. We evaluated Sentient on three widely used datasets covering both real-world and simulated attacks. The results confirm that Sentient consistently delivers strong detection performance. Notably, Sentient achieves entity-level APT detection with a precision of 96% and a recall of 99%.

Paperid: 2125, https://arxiv.org/pdf/2502.05765.pdf

Abstract:
Access to diverse, high-quality datasets is crucial for machine learning model performance, yet data sharing remains limited by privacy concerns and competitive interests, particularly in regulated domains like healthcare. This dynamic especially disadvantages smaller organizations that lack resources to purchase data or negotiate favorable sharing agreements. We present SecureKL, a privacy-preserving framework that enables organizations to identify beneficial data partnerships without exposing sensitive information. Building on recent advances in dataset combination methods, we develop a secure multiparty computation protocol that maintains strong privacy guarantees while achieving >90\% correlation with plaintext evaluations. In experiments with real-world hospital data, SecureKL successfully identifies beneficial data partnerships that improve model performance for intensive care unit mortality prediction while preserving data privacy. Our framework provides a practical solution for organizations seeking to leverage collective data resources while maintaining privacy and competitive advantages. These results demonstrate the potential for privacy-preserving data collaboration to advance machine learning applications in high-stakes domains while promoting more equitable access to data resources.

Paperid: 2126, https://arxiv.org/pdf/2502.05429.pdf

Abstract:
Self-modifying code (SMC) allows programs to alter their own instructions, optimizing performance and functionality on x86 processors. Despite its benefits, SMC introduces unique microarchitectural behaviors that can be exploited for malicious purposes. In this paper, we explore the security implications of SMC by examining how specific x86 instructions affecting instruction cache lines lead to measurable timing discrepancies between cache hits and misses. These discrepancies facilitate refined cache attacks, making them less noisy and more effective. We introduce novel attack techniques that leverage these timing variations to enhance existing methods such as Prime+Probe and Flush+Reload. Our advanced techniques allow adversaries to more precisely attack cryptographic keys and create covert channels akin to Spectre across various x86 platforms. Finally, we propose a dynamic detection methodology utilizing hardware performance counters to mitigate these enhanced threats.

Paperid: 2127, https://arxiv.org/pdf/2502.05338.pdf

Abstract:
We introduce TNIC, a trusted NIC architecture for building trustworthy distributed systems deployed in heterogeneous, untrusted (Byzantine) cloud environments. TNIC builds a minimal, formally verified, silicon root-of-trust at the network interface level. We strive for three primary design goals: (1) a host CPU-agnostic unified security architecture by providing trustworthy network-level isolation; (2) a minimalistic and verifiable TCB based on a silicon root-of-trust by providing two core properties of transferable authentication and non-equivocation; and (3) a hardware-accelerated trustworthy network stack leveraging SmartNICs. Based on the TNIC architecture and associated network stack, we present a generic set of programming APIs and a recipe for building high-performance, trustworthy, distributed systems for Byzantine settings. We formally verify the safety and security properties of our TNIC while demonstrating its use by building four trustworthy distributed systems. Our evaluation of TNIC shows up to 6x performance improvement compared to CPU-centric TEE systems.

Paperid: 2128, https://arxiv.org/pdf/2502.04636.pdf

Abstract:
The Android ecosystem is vulnerable to issues such as app repackaging, counterfeiting, and piracy, threatening both developers and users. To mitigate these risks, developers often employ code obfuscation techniques. However, while effective in protecting legitimate applications, obfuscation also hinders security investigations as it is often exploited for malicious purposes. As such, it is important to understand code obfuscation practices in Android apps. In this paper, we analyze over 500,000 Android APKs from Google Play, spanning an eight-year period, to investigate the evolution and prevalence of code obfuscation techniques. First, we propose a set of classifiers to detect obfuscated code, tools, and techniques and then conduct a longitudinal analysis to identify trends. Our results show a 13% increase in obfuscation from 2016 to 2023, with ProGuard and Allatori as the most commonly used tools. We also show that obfuscation is more prevalent in top-ranked apps and gaming genres such as Casino apps. To our knowledge, this is the first large-scale study of obfuscation adoption in the Google Play Store, providing insights for developers and security analysts.

Paperid: 2129, https://arxiv.org/pdf/2502.04601.pdf

Abstract:
The privacy vulnerabilities of the federated learning (FL) paradigm, primarily caused by gradient leakage, have prompted the development of various defensive measures. Nonetheless, these solutions have predominantly been crafted for and assessed in the context of synchronous FL systems, with minimal focus on asynchronous FL. This gap arises in part due to the unique challenges posed by the asynchronous setting, such as the lack of coordinated updates, increased variability in client participation, and the potential for more severe privacy risks. These concerns have stymied the adoption of asynchronous FL. In this work, we first demonstrate the privacy vulnerabilities of asynchronous FL through a novel data reconstruction attack that exploits gradient updates to recover sensitive client data. To address these vulnerabilities, we propose a privacy-preserving framework that combines a gradient obfuscation mechanism with Trusted Execution Environments (TEEs) for secure asynchronous FL aggregation at the network edge. To overcome the limitations of conventional enclave attestation, we introduce a novel data-centric attestation mechanism based on Multi-Authority Attribute-Based Encryption. This mechanism enables clients to implicitly verify TEE-based aggregation services, effectively handle on-demand client participation, and scale seamlessly with an increasing number of asynchronous connections. Our gradient obfuscation mechanism reduces the structural similarity index of data reconstruction by 85% and increases reconstruction error by 400%, while our framework improves attestation efficiency by lowering average latency by up to 1500% compared to RA-TLS, without additional overhead.

Paperid: 2130, https://arxiv.org/pdf/2502.04565.pdf

Abstract:
This paper presents an implementation of machine learning model training using private federated learning (PFL) on edge devices. We introduce a novel framework that uses PFL to address the challenge of training a model using users' private data. The framework ensures that user data remain on individual devices, with only essential model updates transmitted to a central server for aggregation with privacy guarantees. We detail the architecture of our app selection model, which incorporates a neural network with attention mechanisms and ambiguity handling through uncertainty management. Experiments conducted through off-line simulations and on device training demonstrate the feasibility of our approach in real-world scenarios. Our results show the potential of PFL to improve the accuracy of an app selection model by adapting to changes in user behavior over time, while adhering to privacy standards. The insights gained from this study are important for industries looking to implement PFL, offering a robust strategy for training a predictive model directly on edge devices while ensuring user data privacy.

Paperid: 2131, https://arxiv.org/pdf/2502.04236.pdf

Abstract:
This paper presents the $\underline{\textbf{saf}}$e sub$\underline{\textbf{flo}}$w (Saflo) eBPF-based multipath TCP (MPTCP) scheduler, designed to mitigate traffic analysis attacks in cellular networks. Traffic analysis attacks, which exploit vulnerabilities in Downlink Control Information (DCI) messages, remain a significant security threat in LTE/5G networks. To counter such threats, the Saflo scheduler employs multipath communication combined with additional security-related tasks. Specifically, it utilizes eBPF tools to operate in both kernel and user spaces. In the kernel space, the eBPF scheduler performs multipath scheduling while excluding paths disabled by the user-space programs. The user-space programs conduct security-related computations and machine learning-based attack detection, determining whether each path should be enabled or disabled. This approach offloads computationally intensive tasks to user-space programs, enabling timely multipath scheduling in kernel space. The Saflo scheduler was evaluated in a private LTE/5G testbed. The results demonstrated that it significantly reduces the accuracy of video identification and user identification attacks in cellular networks while maintaining reasonable network performance for users.

Paperid: 2132, https://arxiv.org/pdf/2502.03964.pdf

Abstract:
Despite living in the era of the internet, phone-based scams remain one of the most prevalent forms of scams. These scams aim to exploit victims for financial gain, causing both monetary losses and psychological distress. While governments, industries, and academia have actively introduced various countermeasures, scammers also continue to evolve their tactics, making phone scams a persistent threat. To combat these increasingly sophisticated scams, detection technologies must also advance. In this work, we propose a framework for modeling scam calls and introduce an LLM-based real-time detection approach, which assesses fraudulent intent in conversations, further providing immediate warnings to users to mitigate harm. Through experiments, we evaluate the method's performance and analyze key factors influencing its effectiveness. This analysis enables us to refine the method to improve precision while exploring the trade-off between recall and timeliness, paving the way for future directions in this critical area of research.

Paperid: 2133, https://arxiv.org/pdf/2502.03403.pdf

Abstract:
Task offloading management in 6G vehicular networks is crucial for maintaining network efficiency, particularly as vehicles generate substantial data. Integrating secure communication through authentication introduces additional computational and communication overhead, significantly impacting offloading efficiency and latency. This paper presents a unified framework incorporating lightweight Identity-Based Cryptographic (IBC) authentication into task offloading within cloud-based 6G Vehicular Twin Networks (VTNs). Utilizing Proximal Policy Optimization (PPO) in Deep Reinforcement Learning (DRL), our approach optimizes authenticated offloading decisions to minimize latency and enhance resource allocation. Performance evaluation under varying network sizes, task sizes, and data rates reveals that IBC authentication can reduce offloading efficiency by up to 50% due to the added overhead. Besides, increasing network size and task size can further reduce offloading efficiency by up to 91.7%. As a countermeasure, increasing the transmission data rate can improve the offloading performance by as much as 63%, even in the presence of authentication overhead. The code for the simulations and experiments detailed in this paper is available on GitHub for further reference and reproducibility [1].

Paperid: 2134, https://arxiv.org/pdf/2502.03365.pdf

Abstract:
Software vulnerabilities are commonly detected via static analysis, penetration testing, and fuzzing. They can also be found by running unit tests - so-called vulnerability-witnessing tests - that stimulate the security-sensitive behavior with crafted inputs. Developing such tests is difficult and time-consuming; thus, automated data-driven approaches could help developers intercept vulnerabilities earlier. However, training and validating such approaches require a lot of data, which is currently scarce. This paper introduces VUTECO, a deep learning-based approach for collecting instances of vulnerability-witnessing tests from Java repositories. VUTECO carries out two tasks: (1) the "Finding" task to determine whether a test case is security-related, and (2) the "Matching" task to relate a test case to the exact vulnerability it is witnessing. VUTECO successfully addresses the Finding task, achieving perfect precision and 0.83 F0.5 score on validated test cases in VUL4J and returning 102 out of 145 (70%) correct security-related test cases from 244 open-source Java projects. Despite showing sufficiently good performance for the Matching task - i.e., 0.86 precision and 0.68 F0.5 score - VUTECO failed to retrieve any valid match in the wild. Nevertheless, we observed that in almost all of the matches, the test case was still security-related despite being matched to the wrong vulnerability. In the end, VUTECO can help find vulnerability-witnessing tests, though the matching with the right vulnerability is yet to be solved; the findings obtained lay the stepping stone for future research on the matter.

Paperid: 2135, https://arxiv.org/pdf/2502.03247.pdf

Abstract:
Threshold cryptography is a powerful and well-known technique with many applications to systems relying on distributed trust. It has recently emerged also as a solution to challenges in blockchain: frontrunning prevention, managing wallet keys, and generating randomness. This work presents Thetacrypt, a versatile library for integrating many threshold schemes into one codebase. It offers a way to easily build distributed systems using threshold cryptography and is agnostic to their implementation language. The architecture of Thetacrypt supports diverse protocols uniformly. The library currently includes six cryptographic schemes that span ciphers, signatures, and randomness generation. The library additionally contains a flexible adapter to an underlying networking layer that provides peer-to-peer communication and a total-order broadcast channel; the latter can be implemented by distributed ledgers, for instance. Thetacrypt serves as a controlled testbed for evaluating the performance of multiple threshold-cryptographic schemes under consistent conditions, showing how the traditional micro benchmarking approach neglects the distributed nature of the protocols and its relevance when considering system performance.

Paperid: 2136, https://arxiv.org/pdf/2502.02787.pdf

Abstract:
The widespread adoption of large language models (LLMs) necessitates reliable methods to detect LLM-generated text. We introduce SimMark, a robust sentence-level watermarking algorithm that makes LLMs' outputs traceable without requiring access to model internals, making it compatible with both open and API-based LLMs. By leveraging the similarity of semantic sentence embeddings combined with rejection sampling to embed detectable statistical patterns imperceptible to humans, and employing a soft counting mechanism, SimMark achieves robustness against paraphrasing attacks. Experimental results demonstrate that SimMark sets a new benchmark for robust watermarking of LLM-generated content, surpassing prior sentence-level watermarking techniques in robustness, sampling efficiency, and applicability across diverse domains, all while maintaining the text quality and fluency.

Paperid: 2137, https://arxiv.org/pdf/2502.02445.pdf

Abstract:
Quantum computers create new security risks for today's encryption systems. This paper presents an improved version of the Advanced Encryption Standard (AES) that uses quantum technology to strengthen protection. Our approach offers two modes: a fully quantum-based method for maximum security and a hybrid version that works with existing infrastructure. The system generates encryption keys using quantum randomness instead of predictable computer algorithms, making keys virtually impossible to guess. It regularly refreshes these keys automatically to block long-term attacks, even as technology advances. Testing confirms the system works seamlessly with current security standards, maintaining fast performance for high-volume data transfers. The upgraded AES keeps its original security benefits while adding three key defenses: quantum-powered key creation, adjustable security settings for different threats, and safeguards against attacks that exploit device vulnerabilities. Organizations can implement this solution in stages--starting with hybrid mode for sensitive data while keeping older systems operational. This phased approach allows businesses to protect financial transactions, medical records, and communication networks today while preparing for more powerful quantum computers in the future. The design prioritizes easy adoption, requiring no costly replacements of existing hardware or software in most cases.

Paperid: 2138, https://arxiv.org/pdf/2502.01966.pdf

Abstract:
This paper represents "Cloudlab", a comprehensive, cloud - native laboratory designed to support network security research and training. Built on Google Cloud and adhering to GitOps methodologies, Cloudlab facilitates the the creation, testing, and deployment of secure, containerized workloads using Kubernetes and serverless architectures. The lab integrates tools like Palo Alto Networks firewalls, Bridgecrew for "Security as Code," and automated GitHub workflows to establish a robust Continuous Integration/Continuous Machine Learning pipeline. By providing an adaptive and scalable environment, Cloudlab supports advanced security concepts such as role-based access control, Policy as Code, and container security. This initiative enables data scientists and engineers to explore cutting-edge practices in a dynamic cloud-native ecosystem, fostering innovation and improving operational resilience in modern IT infrastructures.

Paperid: 2139, https://arxiv.org/pdf/2502.00765.pdf

Abstract:
Graph neural networks (GNNs) achieve the state-of-the-art on graph-relevant tasks such as node and graph classification. However, recent works show GNNs are vulnerable to adversarial perturbations include the perturbation on edges, nodes, and node features, the three components forming a graph. Empirical defenses against such attacks are soon broken by adaptive ones. While certified defenses offer robustness guarantees, they face several limitations: 1) almost all restrict the adversary's capability to only one type of perturbation, which is impractical; 2) all are designed for a particular GNN task, which limits their applicability; and 3) the robustness guarantees of all methods except one are not 100% accurate. We address all these limitations by developing AGNNCert, the first certified defense for GNNs against arbitrary (edge, node, and node feature) perturbations with deterministic robustness guarantees, and applicable to the two most common node and graph classification tasks. AGNNCert also encompass existing certified defenses as special cases. Extensive evaluations on multiple benchmark node/graph classification datasets and two real-world graph datasets, and multiple GNNs validate the effectiveness of AGNNCert to provably defend against arbitrary perturbations. AGNNCert also shows its superiority over the state-of-the-art certified defenses against the individual edge perturbation and node perturbation.

Paperid: 2140, https://arxiv.org/pdf/2501.19191.pdf

Abstract:
This paper introduces a secure communication architecture for Unmanned Aerial Vehicles (UAVs) and ground stations in 5G networks, addressing critical challenges in network security. The proposed solution integrates the Advanced Encryption Standard (AES) with Elliptic Curve Cryptography (ECC) and CRYSTALS-Kyber for key encapsulation, offering a hybrid cryptographic approach. By incorporating CRYSTALS-Kyber, the framework mitigates vulnerabilities in ECC against quantum attacks, positioning it as a quantum-resistant alternative. The architecture is based on a server-client model, with UAVs functioning as clients and the ground station acting as the server. The system was rigorously evaluated in both VPN and 5G environments. Experimental results confirm that CRYSTALS-Kyber delivers strong protection against quantum threats with minimal performance overhead, making it highly suitable for UAVs with resource constraints. Moreover, the proposed architecture integrates an Artificial Intelligence (AI)-based Intrusion Detection System (IDS) to further enhance security. In performance evaluations, the IDS demonstrated strong results across multiple models with XGBoost, particularly in more demanding scenarios, outperforming other models with an accuracy of 97.33% and an AUC of 0.94. These findings underscore the potential of combining quantum-resistant encryption mechanisms with AI-driven IDS to create a robust, scalable, and secure communication framework for UAV networks, particularly within the high-performance requirements of 5G environments.

Paperid: 2141, https://arxiv.org/pdf/2501.18928.pdf

Abstract:
We propose a quantum function secret sharing scheme in which the communication is exclusively classical. In this primitive, a classical dealer distributes a secret quantum circuit $C$ by providing shares to $p$ quantum parties. The parties on an input state $|Ï\rangle$ and a projection $Î $, compute values $y_i$ that they then classically communicate back to the dealer, who can then compute $\lVert Î C|Ï\rangle\rVert^2$ using only classical resources. Moreover, the shares do not leak much information about the secret circuit $C$. Our protocol for quantum secret sharing uses the {\em Cayley path}, a tool that has been extensively used to support quantum primacy claims. More concretely, the shares of $C$ correspond to randomized version of $C$ which are delegated to the quantum parties, and the reconstruction can be done by extrapolation. Our scheme has two limitations, which we prove to be inherent to our techniques: First, our scheme is only secure against single adversaries, and we show that if two parties collude, then they can break its security. Second, the evaluation done by the parties requires exponential time in the number of gates.

Paperid: 2142, https://arxiv.org/pdf/2501.18857.pdf

Abstract:
RowHammer vulnerabilities pose a significant threat to modern DRAM-based systems, where rapid activation of DRAM rows can induce bit-flips in neighboring rows. To mitigate this, state-of-the-art host-side RowHammer mitigations typically rely on shared counters or tracking structures. While these optimizations benefit benign applications, they are vulnerable to Performance Attacks (Perf-Attacks), where adversaries exploit shared structures to reduce DRAM bandwidth for co-running benign applications by increasing DRAM accesses for RowHammer counters or triggering repetitive refreshes required for the early reset of structures, significantly degrading performance. In this paper, we propose secure hashing mechanisms to thwart adversarial attempts to capture the mapping of shared structures. We propose DAPPER, a novel low-cost tracker resilient to Perf-Attacks even at ultra-low RowHammer thresholds. We first present a secure hashing template in the form of DAPPER-S. We then develop DAPPER-H, an enhanced version of DAPPER-S, incorporating double-hashing, novel reset strategies, and mitigative refresh techniques. Our security analysis demonstrates the effectiveness of DAPPER-H against both RowHammer and Perf-Attacks. Experiments with 57 workloads from SPEC2006, SPEC2017, TPC, Hadoop, MediaBench, and YCSB show that, even at an ultra-low RowHammer threshold of 500, DAPPER-H incurs only a 0.9% slowdown in the presence of Perf-Attacks while using only 96KB of SRAM per 32GB of DRAM memory.

Paperid: 2143, https://arxiv.org/pdf/2501.18810.pdf

Abstract:
Blockchain ecosystems -- such as those built around chains, layers, and services -- try to engage users for a variety of reasons: user education, growing and protecting their market share, climbing metric-measuring leaderboards with competing systems, demonstrating usage to investors, and identifying worthy recipients for newly created tokens (airdrops). A popular approach is offering user quests: small tasks that can be completed by a user, exposing them to a common task they might want to do in the future, and rewarding them for completion. In this paper, we analyze a proprietary dataset from one deployed quest system that offered 43 unique quests over 10 months with 80M completions. We offer insights about the factors that correlate with task completion: amount of reward, monetary value of reward, difficulty, and cost. We also discuss the role of farming and bots, and the factors that complicate distinguishing real users from automated scripts.

Paperid: 2144, https://arxiv.org/pdf/2501.17634.pdf

Abstract:
With growing concerns about user data collection, individualized privacy has emerged as a promising solution to balance protection and utility by accounting for diverse user privacy preferences. Instead of enforcing a uniform level of anonymization for all users, this approach allows individuals to choose privacy settings that align with their comfort levels. Building on this idea, we propose an adapted method for enabling Individualized Differential Privacy (IDP) in Federated Learning (FL) by handling clients according to their personal privacy preferences. By extending the SAMPLE algorithm from centralized settings to FL, we calculate client-specific sampling rates based on their heterogeneous privacy budgets and integrate them into a modified IDP-FedAvg algorithm. We test this method under realistic privacy distributions and multiple datasets. The experimental results demonstrate that our approach achieves clear improvements over uniform DP baselines, reducing the trade-off between privacy and utility. Compared to the alternative SCALE method in related work, which assigns differing noise scales to clients, our method performs notably better. However, challenges remain for complex tasks with non-i.i.d. data, primarily stemming from the constraints of the decentralized setting.

Paperid: 2145, https://arxiv.org/pdf/2501.16704.pdf

Abstract:
This report presents our approach for the IEEE SP Cup 2025: Deepfake Face Detection in the Wild (DFWild-Cup), focusing on detecting deepfakes across diverse datasets. Our methodology employs advanced backbone models, including MaxViT, CoAtNet, and EVA-02, fine-tuned using supervised contrastive loss to enhance feature separation. These models were specifically chosen for their complementary strengths. Integration of convolution layers and strided attention in MaxViT is well-suited for detecting local features. In contrast, hybrid use of convolution and attention mechanisms in CoAtNet effectively captures multi-scale features. Robust pretraining with masked image modeling of EVA-02 excels at capturing global features. After training, we freeze the parameters of these models and train the classification heads. Finally, a majority voting ensemble is employed to combine the predictions from these models, improving robustness and generalization to unseen scenarios. The proposed system addresses the challenges of detecting deepfakes in real-world conditions and achieves a commendable accuracy of 95.83% on the validation dataset.

Paperid: 2146, https://arxiv.org/pdf/2501.16029.pdf

Abstract:
Large Language Models (LLMs) are rapidly transforming the landscape of digital content creation. However, the prevalent black-box Application Programming Interface (API) access to many LLMs introduces significant challenges in accountability, governance, and security. LLM fingerprinting, which aims to identify the source model by analyzing statistical and stylistic features of generated text, offers a potential solution. Current progress in this area is hindered by a lack of dedicated datasets and the need for efficient, practical methods that are robust against adversarial manipulations. To address these challenges, we introduce FD-Dataset, a comprehensive bilingual fingerprinting benchmark comprising 90,000 text samples from 20 famous proprietary and open-source LLMs. Furthermore, we present FDLLM, a novel fingerprinting method that leverages parameter-efficient Low-Rank Adaptation (LoRA) to fine-tune a foundation model. This approach enables LoRA to extract deep, persistent features that characterize each source LLM. Through our analysis, we find that LoRA adaptation promotes the aggregation of outputs from the same LLM in representation space while enhancing the separation between different LLMs. This mechanism explains why LoRA proves particularly effective for LLM fingerprinting. Extensive empirical evaluations on FD-Dataset demonstrate FDLLM's superiority, achieving a Macro F1 score 22.1% higher than the strongest baseline. FDLLM also exhibits strong generalization to newly released models, achieving an average accuracy of 95% on unseen models. Notably, FDLLM remains consistently robust under various adversarial attacks, including polishing, translation, and synonym substitution. Experimental results show that FDLLM reduces the average attack success rate from 49.2% (LM-D) to 23.9%.

Paperid: 2147, https://arxiv.org/pdf/2501.15563.pdf

Abstract:
The rapid expansion of connected devices has made them prime targets for cyberattacks. To address these threats, deep learning-based, data-driven intrusion detection systems (IDS) have emerged as powerful tools for detecting and mitigating such attacks. These IDSs analyze network traffic to identify unusual patterns and anomalies that may indicate potential security breaches. However, prior research has shown that deep learning models are vulnerable to backdoor attacks, where attackers inject triggers into the model to manipulate its behavior and cause misclassifications of network traffic. In this paper, we explore the susceptibility of deep learning-based IDS systems to backdoor attacks in the context of network traffic analysis. We introduce \texttt{PCAP-Backdoor}, a novel technique that facilitates backdoor poisoning attacks on PCAP datasets. Our experiments on real-world Cyber-Physical Systems (CPS) and Internet of Things (IoT) network traffic datasets demonstrate that attackers can effectively backdoor a model by poisoning as little as 1\% or less of the entire training dataset. Moreover, we show that an attacker can introduce a trigger into benign traffic during model training yet cause the backdoored model to misclassify malicious traffic when the trigger is present. Finally, we highlight the difficulty of detecting this trigger-based backdoor, even when using existing backdoor defense techniques.

Paperid: 2148, https://arxiv.org/pdf/2501.15478.pdf

Abstract:
LoRA (Low-Rank Adaptation) has achieved remarkable success in the parameter-efficient fine-tuning of large models. The trained LoRA matrix can be integrated with the base model through addition or negation operation to improve performance on downstream tasks. However, the unauthorized use of LoRAs to generate harmful content highlights the need for effective mechanisms to trace their usage. A natural solution is to embed watermarks into LoRAs to detect unauthorized misuse. However, existing methods struggle when multiple LoRAs are combined or negation operation is applied, as these can significantly degrade watermark performance. In this paper, we introduce LoRAGuard, a novel black-box watermarking technique for detecting unauthorized misuse of LoRAs. To support both addition and negation operations, we propose the Yin-Yang watermark technique, where the Yin watermark is verified during negation operation and the Yang watermark during addition operation. Additionally, we propose a shadow-model-based watermark training approach that significantly improves effectiveness in scenarios involving multiple integrated LoRAs. Extensive experiments on both language and diffusion models show that LoRAGuard achieves nearly 100% watermark verification success and demonstrates strong effectiveness.

Paperid: 2149, https://arxiv.org/pdf/2501.13965.pdf

Abstract:
Low-Rank Adaptation (LoRA) is a widely adopted method for customizing large-scale language models. In distributed, untrusted training environments, an open source base model user may want to use LoRA weights created by an external contributor, leading to two requirements: (1) the base model user must confirm that the LoRA weights are effective when paired with the intended base model, and (2) the LoRA contributor must keep their proprietary weights private until compensation is assured. We present ZKLoRA, a zero-knowledge verification protocol that relies on succinct proofs and our novel Multi-Party Inference procedure to verify LoRA-base model compatibility without exposing LoRA weights. ZKLoRA produces deterministic correctness guarantees and validates each LoRA module in only 1-2 seconds on state-of-the-art large language models. This low-latency approach enables nearly real-time verification and promotes secure collaboration among geographically decentralized teams and contract-based training pipelines. The protocol ensures that the delivered LoRA module works as claimed, safeguarding the contributor's intellectual property while providing the base model user with verification of compatibility and lineage.

Paperid: 2150, https://arxiv.org/pdf/2501.13962.pdf

Abstract:
The rapid expansion of the industrial Internet of things (IIoT) has introduced new challenges in securing critical infrastructures against sophisticated cyberthreats. This study presents the development and evaluation of an advanced Intrusion detection (IDS) based on a hybrid LSTM-convolution neural network (CNN)-Attention architecture, specifically designed to detect and classify cyberattacks in IIoT environments. The research focuses on two key classification tasks: binary and multi-class classification. The proposed models was rigorously tested using the Edge-IIoTset dataset. To mitigate the class imbalance in the dataset, the synthetic minority over-sampling technique (SMOTE) was employed to generate synthetic samples for the underrepresented classes. This ensured that the model could learn effectively from all classes, thereby improving the overall classification performance. Through systematic experimentation, various deep learning (DL) models were compared, ultimately demonstrating that the LSTM-CNN-Attention model consistently outperformed others across key performance metrics. In binary classification, the model achieved near-perfect accuracy, while in multi-class classification, it maintained a high accuracy level (99.04%), effectively categorizing different attack types with a loss value of 0.0220%.

Paperid: 2151, https://arxiv.org/pdf/2501.13941.pdf

Abstract:
Recent advances in Large Language Models (LLMs) have led to significant improvements in natural language processing tasks, but their ability to generate human-quality text raises significant ethical and operational concerns in settings where it is important to recognize whether or not a given text was generated by a human. Thus, recent work has focused on developing techniques for watermarking LLM-generated text, i.e., introducing an almost imperceptible signal that allows a provider equipped with a secret key to determine if given text was generated by their model. Current watermarking techniques are often not practical due to concerns with generation latency, detection time, degradation in text quality, or robustness. Many of these drawbacks come from the focus on token-level watermarking, which ignores the inherent structure of text. In this work, we introduce a new scheme, GaussMark, that is simple and efficient to implement, has formal statistical guarantees on its efficacy, comes at no cost in generation latency, and embeds the watermark into the weights of the model itself, providing a structural watermark. Our approach is based on Gaussian independence testing and is motivated by recent empirical observations that minor additive corruptions to LLM weights can result in models of identical (or even improved) quality. We show that by adding a small amount of Gaussian noise to the weights of a given LLM, we can watermark the model in a way that is statistically detectable by a provider who retains the secret key. We provide formal statistical bounds on the validity and power of our procedure. Through an extensive suite of experiments, we demonstrate that GaussMark is reliable, efficient, and relatively robust to corruptions such as insertions, deletions, substitutions, and roundtrip translations and can be instantiated with essentially no loss in model quality.

Paperid: 2152, https://arxiv.org/pdf/2501.12842.pdf

Abstract:
Symmetric private information retrieval is a cryptographic task allowing a user to query a database and obtain exactly one entry without revealing to the owner of the database which element was accessed. The task is a variant of general two-party protocols called one-sided secure function evaluation and is closely related to oblivious transfer. Under the name quantum private queries, quantum protocols have been proposed to solve this problem in a cheat-sensitive way: In such protocols, it is not impossible for dishonest participants to cheat, but they risk detection [V. Giovannetti, S. Lloyd, and L. Maccone, Phys. Rev. Lett. 100, 230502 (2008)]. We give an explicit attack against any cheat-sensitive symmetric private information retrieval protocol, showing that any protocol that is secure for the user cannot have non-trivial security guarantees for the owner of the database.

Paperid: 2153, https://arxiv.org/pdf/2501.12465.pdf

Abstract:
Artificial Intelligence (AI) faces growing challenges from evolving data protection laws and enforcement practices worldwide. Regulations like GDPR and CCPA impose strict compliance requirements on Machine Learning (ML) models, especially concerning personal data use. These laws grant individuals rights such as data correction and deletion, complicating the training and deployment of Large Language Models (LLMs) that rely on extensive datasets. Public data availability does not guarantee its lawful use for ML, amplifying these challenges. This paper introduces an adaptive system for mitigating risk of Personally Identifiable Information (PII) and Sensitive Personal Information (SPI) in LLMs. It dynamically aligns with diverse regulatory frameworks and integrates seamlessly into Governance, Risk, and Compliance (GRC) systems. The system uses advanced NLP techniques, context-aware analysis, and policy-driven masking to ensure regulatory compliance. Benchmarks highlight the system's effectiveness, with an F1 score of 0.95 for Passport Numbers, outperforming tools like Microsoft Presidio (0.33) and Amazon Comprehend (0.54). In human evaluations, the system achieved an average user trust score of 4.6/5, with participants acknowledging its accuracy and transparency. Observations demonstrate stricter anonymization under GDPR compared to CCPA, which permits pseudonymization and user opt-outs. These results validate the system as a scalable and robust solution for enterprise privacy compliance.

Paperid: 2154, https://arxiv.org/pdf/2501.12456.pdf

Abstract:
The adoption of Large Language Models (LLMs) has revolutionized AI applications but poses significant challenges in safeguarding user privacy. Ensuring compliance with privacy regulations such as GDPR and CCPA while addressing nuanced privacy risks requires robust and scalable frameworks. This paper presents a detailed study of OneShield Privacy Guard, a framework designed to mitigate privacy risks in user inputs and LLM outputs across enterprise and open-source settings. We analyze two real-world deployments:(1) a multilingual privacy-preserving system integrated with Data and Model Factory, focusing on enterprise-scale data governance; and (2) PR Insights, an open-source repository emphasizing automated triaging and community-driven refinements. In Deployment 1, OneShield achieved a 0.95 F1 score in detecting sensitive entities like dates, names, and phone numbers across 26 languages, outperforming state-of-the-art tool such as StarPII and Presidio by up to 12\%. Deployment 2, with an average F1 score of 0.86, reduced manual effort by over 300 hours in three months, accurately flagging 8.25\% of 1,256 pull requests for privacy risks with enhanced context sensitivity. These results demonstrate OneShield's adaptability and efficacy in diverse environments, offering actionable insights for context-aware entity recognition, automated compliance, and ethical AI adoption. This work advances privacy-preserving frameworks, supporting user trust and compliance across operational contexts.

Paperid: 2155, https://arxiv.org/pdf/2501.12009.pdf

Abstract:
A lattice-based signature, called G+G convoluted Gaussian signature, was proposed in ASIACRYPT 2023 and was proved secure in the quantum random oracle model. In this paper, we propose a ratio attack on the G+G convoluted Gaussian signature to recover the secret key and comment on the revised eprint paper. The attack exploits the fact, proved in this paper, that the secret key can be obtained from the expected value of the ratio of signatures which follows a truncated Cauchy distribution. Moreover, we also compute the number of signatures required to successfully recover the secret key. Furthermore, we simulate the ratio attack in Sagemath with a few different parameters as a proof-of-concept of the ratio attack. In addition, although the revised signature in the revised eprint paper is secure against the ratio attack, we found that either a valid signature cannot be produced or a signature can be forged easily for their given parameters in the eprint.

Paperid: 2156, https://arxiv.org/pdf/2501.11848.pdf

Abstract:
Recently, the practical needs of ``the right to be forgotten'' in federated learning gave birth to a paradigm known as federated unlearning, which enables the server to forget personal data upon the client's removal request. Existing studies on federated unlearning have primarily focused on efficiently eliminating the influence of requested data from the client's model without retraining from scratch, however, they have rarely doubted the reliability of the global model posed by the discrepancy between its prediction performance before and after unlearning. To bridge this gap, we take the first step by introducing a novel malicious unlearning attack dubbed FedMUA, aiming to unveil potential vulnerabilities emerging from federated learning during the unlearning process. The crux of FedMUA is to mislead the global model into unlearning more information associated with the influential samples for the target sample than anticipated, thus inducing adverse effects on target samples from other clients. To achieve this, we design a novel two-step method, known as Influential Sample Identification and Malicious Unlearning Generation, to identify and subsequently generate malicious feature unlearning requests within the influential samples. By doing so, we can significantly alter the predictions pertaining to the target sample by initiating the malicious feature unlearning requests, leading to the deliberate manipulation for the user adversely. Additionally, we design a new defense mechanism that is highly resilient against malicious unlearning attacks. Extensive experiments on three realistic datasets reveal that FedMUA effectively induces misclassification on target samples and can achieve an 80% attack success rate by triggering only 0.3% malicious unlearning requests.

Paperid: 2157, https://arxiv.org/pdf/2501.11771.pdf

Abstract:
Confidential computing (CC) or trusted execution enclaves (TEEs) is now the most common approach to enable secure computing in the cloud. The recent introduction of GPU TEEs by NVIDIA enables machine learning (ML) models to be trained without leaking model weights or data to the cloud provider. However, the potential performance implications of using GPU TEEs for ML training are not well characterized. In this work, we present an in-depth characterization study on performance overhead associated with running distributed data parallel (DDP) ML training with GPU Trusted Execution Environments (TEE). Our study reveals the performance challenges in DDP training within GPU TEEs. DDP uses ring-all-reduce, a well-known approach, to aggregate gradients from multiple devices. Ring all-reduce consists of multiple scatter-reduce and all-gather operations. In GPU TEEs only the GPU package (GPU and HBM memory) is trusted. Hence, any data communicated outside the GPU packages must be encrypted and authenticated for confidentiality and integrity verification. Hence, each phase of the ring-all-reduce requires encryption and message authentication code (MAC) generation from the sender, and decryption and MAC authentication on the receiver. As the number of GPUs participating in DDP increases, the overhead of secure inter-GPU communication during ring-all-reduce grows proportionally. Additionally, larger models lead to more asynchronous all-reduce operations, exacerbating the communication cost. Our results show that with four GPU TEEs, depending on the model that is being trained, the runtime per training iteration increases by an average of 8x and up to a maximum of 41.6x compared to DDP training without TEE.

Paperid: 2158, https://arxiv.org/pdf/2501.11668.pdf

Abstract:
Ethereum is currently the second largest blockchain by market capitalization and a popular platform for cryptocurrencies. As it has grown, the high value present and the anonymity afforded by the technology have led Ethereum to become a hotbed for various cybercrimes. This paper seeks to understand how these fraudulent schemes may be characterized and develop methods for detecting them. One key feature introduced by Ethereum is the ability to use programmable smart contracts to execute code on the blockchain. A common use of smart contracts is implementing fungible tokens with the ERC-20 interface. Such tokens can be used to impersonate legitimate tokens and defraud users. By parsing the event logs emitted by these ERC-20 contracts over 20 different periods of 100K blocks, we construct token transfer graphs for each of the available ERC-20 tokens on the blockchain. By analyzing these graphs, we find a set of characteristics by which suspicious contracts are distinguished from legitimate ones. These observations result in a simple model that can identify scam contracts with an average of 88.7% accuracy. This suggests that the mechanism by which fraudulent schemes function strongly correlates with their transfer graphs and that these graphs may be used to improve scam-detection mechanisms, contributing to making Ethereum safer.

Paperid: 2159, https://arxiv.org/pdf/2501.09895.pdf

Abstract:
Quantum security improves cryptographic protocols by applying quantum mechanics principles, assuring resistance to both quantum and conventional computer attacks. This work addresses these issues by integrating Quantum Key Distribution (QKD) utilizing the E91 method with Multi-Layer Chaotic Encryption, which employs a variety of patterns to detect eavesdropping, resulting in a highly secure image-transmission architecture. The method leverages entropy calculations to determine the unpredictability and integrity of encrypted and decrypted pictures, guaranteeing strong security. Extensive statistical scenarios illustrate the framework's effectiveness in image encryption while preserving high entropy and sensitivity to the original visuals. The findings indicate significant improvement in encryption and decryption performance, demonstrating the framework's potential as a robust response to weaknesses introduced by advances in quantum computing. Several metrics, such as Peak Signal-to-Noise Ratio (PSNR), Structural Similarity Index (SSIM), Normalized Cross-Correlation (NCC), Bit Error Rate (BER), entropy values for original, encrypted, and decrypted images, and the correlation between original and decrypted images, validate the framework's effectiveness. The combination of QKD with Multi-Layer Chaotic Encryption provides a scalable and resilient technique to secure image communication. As quantum computing advances, this framework offers a future-proof approach for defining secure communication protocols in crucial sectors such as medical treatment, forensic computing, and national security, where information confidentiality is valuable.

Paperid: 2160, https://arxiv.org/pdf/2501.09246.pdf

Abstract:
This paper examines the Galileo Open Service Navigation Message Authentication (OSNMA) and, for the first time, discovers two critical vulnerabilities, namely artificially-manipulated time synchronization (ATS) and interruptible message authentication (IMA). ATS allows attackers falsify a receiver's signals and/or local reference time (LRT) while still fulfilling the time synchronization (TS) requirement. IMA allows temporary interruption of the navigation data authentication process due to the reception of a broken message (probably caused by spoofing attacks) and restores the authentication later. By exploiting the ATS vulnerability, we propose a TS-comply replay (TSR) attack with two variants (real-time and non-real-time), where attackers replay signals to a victim receiver while strictly complying with the TS rule. We further propose a TS-comply forgery (TSF) attack, where attackers first use a previously-disclosed key to forge a message based on the OSNMA protocol, then tamper with the vitcim receiver's LRT correspondingly to comply with the TS rule and finally transmit the forged message to the receiver. Finally, we propose a concatenating replay (CR) attack based on the IMA vulnerability, where attackers concatenate replayed signals to the victim receiver's signals in a way that still enables correct verification of the navigation data in the replayed signals. To validate the effectiveness of the proposed attacks, we conduct real-world experiments with a commercial Galileo receiver manufactured by Septentrio, two software-defined radio (SDR) devices, open-source Galileo-SDR-SIM and OSNMAlib software. The results showed that all the attacks can successfully pass the OSNMA scheme and the TSF attack can spoof receivers to arbitrary locations.

Paperid: 2161, https://arxiv.org/pdf/2501.06989.pdf

Abstract:
With its significant security potential, the quantum internet is poised to revolutionize technologies like cryptography and communications. Although it boasts enhanced security over traditional networks, the quantum internet still encounters unique security challenges essential for safeguarding its Confidentiality, Integrity, and Availability (CIA). This study explores these challenges by analyzing the vulnerabilities and the corresponding mitigation strategies across different layers of the quantum internet, including physical, link, network, and application layers. We assess the severity of potential attacks, evaluate the expected effectiveness of mitigation strategies, and identify vulnerabilities within diverse network configurations, integrating both classical and quantum approaches. Our research highlights the dynamic nature of these security issues and emphasizes the necessity for adaptive security measures. The findings underline the need for ongoing research into the security dimension of the quantum internet to ensure its robustness, encourage its adoption, and maximize its impact on society.

Paperid: 2162, https://arxiv.org/pdf/2501.06620.pdf

Abstract:
The cumulative distribution function (CDF) is fundamental due to its ability to reveal information about random variables, making it essential in studies that require privacy-preserving methods to protect sensitive data. This paper introduces a novel privacy-preserving CDF method inspired by the functional analysis and functional mechanism. Our approach projects the empirical CDF into a predefined space, approximating it using specific functions, and protects the coefficients to achieve a differentially private empirical CDF. Compared to existing methods like histogram queries and adaptive quantiles, our method is preferable in decentralized settings and scenarios where CDFs must be updated with newly collected data.

Paperid: 2163, https://arxiv.org/pdf/2501.05907.pdf

Abstract:
The Huge growth in the usage of web applications has raised concerns regarding their security vulnerabilities, which in turn pushes toward robust security testing tools. This study compares OWASP ZAP, the leading open-source web application vulnerability scanner, across its two most recent iterations. While comparing their performance to the OWASP Benchmark, the study evaluates their efficiency in spotting vulnerabilities in the purposefully vulnerable application, OWASP Benchmark project. The research methodology involves conducting systematic scans of OWASP Benchmark using both v2.12.0 and v2.13.0 of OWASP ZAP. The OWASP Benchmark provides a standardized framework to evaluate the scanner's abilities in identifying security flaws, Insecure Cookies, Path traversal, SQL injection, and more. Results obtained from this benchmark comparison offer valuable insights into the strengths and weaknesses of each version of the tool. This study aids in web application security testing by shedding light on how well-known scanners work at spotting vulnerabilities. The knowledge gained from this study can assist security professionals and developers in making informed decisions to support their web application security status. In conclusion, this study comprehensively analyzes ZAP's capabilities in detecting security flaws using OWASP Benchmark v1.2. The findings add to the continuing debates about online application security tools and establish the framework for future studies and developments in the research field of web application security testing.

Paperid: 2164, https://arxiv.org/pdf/2501.05722.pdf

Abstract:
As a critical component of electrical energy infrastructure, the smart grid system has become indispensable to the energy sector. However, the rapid evolution of smart grids has attracted numerous nation-state actors seeking to disrupt the power infrastructure of adversarial nations. This development underscores the urgent need to establish secure mechanisms for firmware updates, with firmware signing and verification serving as pivotal elements in safeguarding system integrity. In this work, we propose a digital signing and verification framework grounded in Public Key Infrastructure (PKI), specifically tailored for resource-constrained devices such as smart meters. The framework utilizes the Concise Binary Object Representation (CBOR) and Object Signing and Encryption (COSE) formats to achieve efficient da-ta encapsulation and robust security features. Our approach not only en-sures the secure deployment of firmware updates against the convergence of information technology (IT) and operational technology (OT) attacks but also addresses performance bottlenecks stemming from device limitations, thereby enhancing the overall reliability and stability of the smart grid sys-tem.

Paperid: 2165, https://arxiv.org/pdf/2501.05614.pdf

Abstract:
Graph Neural Networks (GNNs) are the mainstream method to learn pervasive graph data and are widely deployed in industry, making their intellectual property valuable. However, protecting GNNs from unauthorized use remains a challenge. Watermarking, which embeds ownership information into a model, is a potential solution. However, existing watermarking methods have two key limitations: First, almost all of them focus on non-graph data, with watermarking GNNs for complex graph data largely unexplored. Second, the de facto backdoor-based watermarking methods pollute training data and induce ownership ambiguity through intentional misclassification. Our explanation-based watermarking inherits the strengths of backdoor-based methods (e.g., robust to watermark removal attacks), but avoids data pollution and eliminates intentional misclassification. In particular, our method learns to embed the watermark in GNN explanations such that this unique watermark is statistically distinct from other potential solutions, and ownership claims must show statistical significance to be verified. We theoretically prove that, even with full knowledge of our method, locating the watermark is an NP-hard problem. Empirically, our method manifests robustness to removal attacks like fine-tuning and pruning. By addressing these challenges, our approach marks a significant advancement in protecting GNN intellectual property.

Paperid: 2166, https://arxiv.org/pdf/2501.05535.pdf

Abstract:
In blockchain systems, fair transaction ordering is crucial for a trusted and regulation-compliant economic ecosystem. Unlike traditional State Machine Replication (SMR) systems, which focus solely on liveness and safety, blockchain systems also require a fairness property. This paper examines these properties and aims to eliminate algorithmic bias in transaction ordering services. We build on the notion of equal opportunity. We characterize transactions in terms of relevant and irrelevant features, requiring that the order be determined solely by the relevant ones. Specifically, transactions with identical relevant features should have an equal chance of being ordered before one another. We extend this framework to define a property where the greater the distance in relevant features between transactions, the higher the probability of prioritizing one over the other. We reveal a surprising link between equal opportunity in SMR and Differential Privacy (DP), showing that any DP mechanism can be used to ensure fairness in SMR. This connection not only enhances our understanding of the interplay between privacy and fairness in distributed computing but also opens up new opportunities for designing fair distributed protocols using well-established DP techniques.

Paperid: 2167, https://arxiv.org/pdf/2501.05249.pdf

Abstract:
In recent years, tremendous success has been witnessed in Retrieval-Augmented Generation (RAG), widely used to enhance Large Language Models (LLMs) in domain-specific, knowledge-intensive, and privacy-sensitive tasks. However, attackers may steal those valuable RAGs and deploy or commercialize them, making it essential to detect Intellectual Property (IP) infringement. Most existing ownership protection solutions, such as watermarks, are designed for relational databases and texts. They cannot be directly applied to RAGs because relational database watermarks require white-box access to detect IP infringement, which is unrealistic for the knowledge base in RAGs. Meanwhile, post-processing by the adversary's deployed LLMs typically destructs text watermark information. To address those problems, we propose a novel black-box "knowledge watermark" approach, named RAG-WM, to detect IP infringement of RAGs. RAG-WM uses a multi-LLM interaction framework, comprising a Watermark Generator, Shadow LLM & RAG, and Watermark Discriminator, to create watermark texts based on watermark entity-relationship tuples and inject them into the target RAG. We evaluate RAG-WM across three domain-specific and two privacy-sensitive tasks on four benchmark LLMs. Experimental results show that RAG-WM effectively detects the stolen RAGs in various deployed LLMs. Furthermore, RAG-WM is robust against paraphrasing, unrelated content removal, knowledge insertion, and knowledge expansion attacks. Lastly, RAG-WM can also evade watermark detection approaches, highlighting its promising application in detecting IP infringement of RAG systems.

Paperid: 2168, https://arxiv.org/pdf/2501.03709.pdf

Abstract:
We show that the large Cartesian powers of any graph have log-concave valencies with respect to a ffxed vertex. We show that the series of valencies of distance regular graphs is log-concave, thus improving on a result of (Taylor, Levingston, 1978). Consequences for strongly regular graphs, two-weight codes, and completely regular codes are derived. By P-Q duality of association schemes the series of multiplicities of Q-polynomial association schemes is shown, under some assumption, to be log-concave.

Paperid: 2169, https://arxiv.org/pdf/2501.03423.pdf

Abstract:
Blockchain technology has revolutionized industries by enabling secure and decentralized transactions. However, the isolated nature of blockchain ecosystems hinders the seamless transfer of digital assets across different chains. Cross-chain bridges have emerged as vital web3 infrastructure to address this challenge by facilitating interoperability between distinct blockchains. Cross-chain bridges remain vulnerable to various attacks despite sophisticated designs and security measures. The industry has experienced a surge in bridge attacks, resulting in significant financial losses. The largest hack impacted Axie Infinity Ronin Bridge, with a loss of almost \$600 million USD. This paper analyzes recent cross-chain bridge hacks in 2022 and 2023 and examines the exploited vulnerabilities. By understanding the attack nature and underlying weaknesses, the paper aims to enhance bridge security and propose potential countermeasures. The findings contribute to developing industry-wide standards for bridge security and operational resilience. Addressing the vulnerabilities and weaknesses exploited in recent cross-chain bridge hacks fosters trust and confidence in cross-chain interoperability.

Paperid: 2170, https://arxiv.org/pdf/2501.01194.pdf

Abstract:
Watermarking deep neural network (DNN) models has attracted a great deal of attention and interest in recent years because of the increasing demand to protect the intellectual property of DNN models. Many practical algorithms have been proposed by covertly embedding a secret watermark into a given DNN model through either parametric/structural modulation or backdooring against intellectual property infringement from the attacker while preserving the model performance on the original task. Despite the performance of these approaches, the lack of basic research restricts the algorithmic design to either a trial-based method or a data-driven technique. This has motivated the authors in this paper to introduce a game between the model attacker and the model defender for trigger-based black-box model watermarking. For each of the two players, we construct the payoff function and determine the optimal response, which enriches the theoretical foundation of model watermarking and may inspire us to develop novel schemes in the future.

Paperid: 2171, https://arxiv.org/pdf/2501.01042.pdf

Abstract:
Video-based multimodal large language models (V-MLLMs) have shown vulnerability to adversarial examples in video-text multimodal tasks. However, the transferability of adversarial videos to unseen models--a common and practical real world scenario--remains unexplored. In this paper, we pioneer an investigation into the transferability of adversarial video samples across V-MLLMs. We find that existing adversarial attack methods face significant limitations when applied in black-box settings for V-MLLMs, which we attribute to the following shortcomings: (1) lacking generalization in perturbing video features, (2) focusing only on sparse key-frames, and (3) failing to integrate multimodal information. To address these limitations and deepen the understanding of V-MLLM vulnerabilities in black-box scenarios, we introduce the Image-to-Video MLLM (I2V-MLLM) attack. In I2V-MLLM, we utilize an image-based multimodal model (IMM) as a surrogate model to craft adversarial video samples. Multimodal interactions and temporal information are integrated to disrupt video representations within the latent space, improving adversarial transferability. In addition, a perturbation propagation technique is introduced to handle different unknown frame sampling strategies. Experimental results demonstrate that our method can generate adversarial examples that exhibit strong transferability across different V-MLLMs on multiple video-text multimodal tasks. Compared to white-box attacks on these models, our black-box attacks (using BLIP-2 as surrogate model) achieve competitive performance, with average attack success rates of 55.48% on MSVD-QA and 58.26% on MSRVTT-QA for VideoQA tasks, respectively. Our code will be released upon acceptance.

Paperid: 2172, https://arxiv.org/pdf/2501.00337.pdf

Abstract:
In the almost-everywhere reliable message transmission problem, introduced by [Dwork, Pippenger, Peleg, Upfal'86], the goal is to design a sparse communication network $G$ that supports efficient, fault-tolerant protocols for interactions between all node pairs. By fault-tolerant, we mean that that even if an adversary corrupts a small fraction of vertices in $G$, then all but a small fraction of vertices can still communicate perfectly via the constructed protocols. Being successful to do so allows one to simulate, on a sparse graph, any fault-tolerant distributed computing task and secure multi-party computation protocols built for a complete network, with only minimal overhead in efficiency. Previous works on this problem achieved either constant-degree networks tolerating $o(1)$ faults, constant-degree networks tolerating a constant fraction of faults via inefficient protocols (exponential work complexity), or poly-logarithmic degree networks tolerating a constant fraction of faults. We show a construction of constant-degree networks with efficient protocols (i.e., with polylogarithmic work complexity) that can tolerate a constant fraction of adversarial faults, thus solving the main open problem of Dwork et al.. Our main contribution is a composition technique for communication networks, based on graph products. Our technique combines two networks tolerant to adversarial edge-faults to construct a network with a smaller degree while maintaining efficiency and fault-tolerance. We apply this composition result multiple times, using the polylogarithmic-degree edge-fault tolerant networks constructed in a recent work of [Bafna, Minzer, Vyas'24] (that are based on high-dimensional expanders) with itself, and then with the constant-degree networks (albeit with inefficient protocols) of [Upfal'92].

Paperid: 2173, https://arxiv.org/pdf/2501.00066.pdf

Abstract:
We investigate the adversarial robustness of LLMs in transfer learning scenarios. Through comprehensive experiments on multiple datasets (MBIB Hate Speech, MBIB Political Bias, MBIB Gender Bias) and various model architectures (BERT, RoBERTa, GPT-2, Gemma, Phi), we reveal that transfer learning, while improving standard performance metrics, often leads to increased vulnerability to adversarial attacks. Our findings demonstrate that larger models exhibit greater resilience to this phenomenon, suggesting a complex interplay between model size, architecture, and adaptation methods. Our work highlights the crucial need for considering adversarial robustness in transfer learning scenarios and provides insights into maintaining model security without compromising performance. These findings have significant implications for the development and deployment of LLMs in real-world applications where both performance and robustness are paramount.

Paperid: 2174, https://arxiv.org/pdf/2506.23841.pdf

Abstract:
Attack Trees (AT) are a popular formalism for security analysis. They are meant to display an attacker's goal decomposed into attack steps needed to achieve it and compute certain security metrics (e.g., attack cost, probability, and damage). ATs offer three important services: (a) conceptual modeling capabilities for representing security risk management scenarios, (b) a qualitative assessment to find root causes and minimal conditions of successful attacks, and (c) quantitative analyses via security metrics computation under formal semantics, such as minimal time and cost among all attacks. Still, the AT language presents limitations due to its lack of ontological foundations, thus compromising associated services. Via an ontological analysis grounded in the Common Ontology of Value and Risk (COVER) -- a reference core ontology based on the Unified Foundational Ontology (UFO) -- we investigate the ontological adequacy of AT and reveal four significant shortcomings: (1) ambiguous syntactical terms that can be interpreted in various ways; (2) ontological deficit concerning crucial domain-specific concepts; (3) lacking modeling guidance to construct ATs decomposing a goal; (4) lack of semantic interoperability, resulting in ad hoc stand-alone tools. We also discuss existing incremental solutions and how our analysis paves the way for overcoming those issues through a broader approach to risk management modeling.

Paperid: 2175, https://arxiv.org/pdf/2506.23683.pdf

Abstract:
There are many sandboxing mechanisms provided by operating systems to limit what resources applications can access, however, sometimes the use of these mechanisms requires developers to refactor their code to fit the sandboxing model. In this work, we investigate what makes existing sandboxing mechanisms challenging to apply to certain types of applications, and propose Threadbox, a sandboxing mechanism that enables having modular and independent sandboxes, and can be applied to threads and sandbox specific functions. We present case studies to illustrate the applicability of the idea and discuss its limitations.

Paperid: 2176, https://arxiv.org/pdf/2506.23682.pdf

Abstract:
A digital security-by-design computer architecture, like CHERI, lets you program without fear of buffer overflows or other memory safety errors, but CHERI also rewrites some of the assumptions about how C works and how fundamental types (such as pointers) are implemented in hardware. We conducted a usability study to examine how developers react to the changes required by CHERI when porting software to run on it. We find that developers struggle with CHERI's display of warnings and errors and a lack of diverse documentation.

Paperid: 2177, https://arxiv.org/pdf/2506.22423.pdf

Abstract:
Unmanned Aerial Vehicles (UAVs) depend on onboard sensors for perception, navigation, and control. However, these sensors are susceptible to physical attacks, such as GPS spoofing, that can corrupt state estimates and lead to unsafe behavior. While reinforcement learning (RL) offers adaptive control capabilities, existing safe RL methods are ineffective against such attacks. We present ARMOR (Adaptive Robust Manipulation-Optimized State Representations), an attack-resilient, model-free RL controller that enables robust UAV operation under adversarial sensor manipulation. Instead of relying on raw sensor observations, ARMOR learns a robust latent representation of the UAV's physical state via a two-stage training framework. In the first stage, a teacher encoder, trained with privileged attack information, generates attack-aware latent states for RL policy training. In the second stage, a student encoder is trained via supervised learning to approximate the teacher's latent states using only historical sensor data, enabling real-world deployment without privileged information. Our experiments show that ARMOR outperforms conventional methods, ensuring UAV safety. Additionally, ARMOR improves generalization to unseen attacks and reduces training cost by eliminating the need for iterative adversarial training.

Paperid: 2178, https://arxiv.org/pdf/2506.20828.pdf

Abstract:
The rise of massive networks across diverse domains necessitates sophisticated graph analytics, often involving sensitive data and raising privacy concerns. This paper addresses these challenges using local differential privacy (LDP), which enforces privacy at the individual level, where no third-party entity is trusted, unlike centralized models that assume a trusted curator. We introduce novel LDP algorithms for two fundamental graph statistics: k-core decomposition and triangle counting. Our approach leverages input-dependent private graph properties, specifically the degeneracy and maximum degree of the graph, to improve theoretical utility. Unlike prior methods, our error bounds are determined by the maximum degree rather than the total number of edges, resulting in significantly tighter guarantees. For triangle counting, we improve upon the work of Imola, Murakami, and Chaudhury [USENIX Security `21, `22], which bounds error in terms of edge count. Instead, our algorithm achieves bounds based on graph degeneracy by leveraging a private out-degree orientation, a refined variant of Eden et al.'s randomized response technique [ICALP `23], and a novel analysis, yielding stronger guarantees than prior work. Beyond theoretical gains, we are the first to evaluate local DP algorithms in a distributed simulation, unlike prior work tested on a single processor. Experiments on real-world graphs show substantial accuracy gains: our k-core decomposition achieves errors within 3x of exact values, far outperforming the 131x error in the baseline of Dhulipala et al. [FOCS `22]. Our triangle counting algorithm reduces multiplicative approximation errors by up to six orders of magnitude, while maintaining competitive runtime.

Paperid: 2179, https://arxiv.org/pdf/2506.20290.pdf

Abstract:
Local differential privacy (LDP) has become a widely accepted framework for privacy-preserving data collection. In LDP, many protocols rely on hash functions to implement user-side encoding and perturbation. However, the security and privacy implications of hash function selection have not been previously investigated. In this paper, we expose that the hash functions may act as a source of unfairness in LDP protocols. We show that although users operate under the same protocol and privacy budget, differences in hash functions can lead to significant disparities in vulnerability to inference and poisoning attacks. To mitigate hash-induced unfairness, we propose Fair-OLH (F-OLH), a variant of OLH that enforces an entropy-based fairness constraint on hash function selection. Experiments show that F-OLH is effective in mitigating hash-induced unfairness under acceptable time overheads.

Paperid: 2180, https://arxiv.org/pdf/2506.20109.pdf

Abstract:
Disassemblers are crucial in the analysis and modification of binaries. Existing works showing disassembler errors largely rely on practical implementation without specific guarantees and assume source code and compiler toolchains to evaluate ground truth. However, the assumption of source code is contrary to typical binary scenarios where only the binary is available. In this work, we investigate an approach with minimal assumptions and a sound approach to disassembly error evaluation that does not require source code. Any source code does not address the fundamental problem of binary disassembly and fails when only the binary exists. As far as we know, this is the first work to evaluate disassembly errors using only the binary. We propose TraceBin, which uses dynamic execution to find disassembly errors. TraceBin targets the use case where the disassembly is used in an automated fashion for security tasks on a target binary, such as static binary instrumentation, binary hardening, automated code repair, and so on, which may be affected by disassembly errors. Discovering disassembly errors in the target binary aids in reducing problems caused by such errors. Furthermore, we are not aware of existing approaches that can evaluate errors given only a target binary, as they require source code. Our evaluation shows TraceBin finds: (i) errors consistent with existing studies even without source; (ii) disassembly errors due to control flow; (iii) new interesting errors; (iv) errors in non-C/C++ binaries; (v) errors in closed-source binaries; and (vi) show that disassembly errors can have significant security implications. Overall, our experimental results show that TraceBin finds many errors in existing popular disassemblers. It is also helpful in automated security tasks on (closed source) binaries relying on disassemblers.

Paperid: 2181, https://arxiv.org/pdf/2506.20037.pdf

Abstract:
Machine learning providers commonly distribute global models to edge devices, which subsequently personalize these models using local data. However, issues such as copyright infringements, biases, or regulatory requirements may require the verifiable removal of certain data samples across all edge devices. Ensuring that edge devices correctly execute such unlearning operations is critical to maintaining integrity. In this work, we introduce a verification framework leveraging zero-knowledge proofs, specifically zk-SNARKs, to confirm data unlearning on personalized edge-device models without compromising privacy. We have developed algorithms explicitly designed to facilitate unlearning operations that are compatible with efficient zk-SNARK proof generation, ensuring minimal computational and memory overhead suitable for constrained edge environments. Furthermore, our approach carefully preserves personalized enhancements on edge devices, maintaining model performance post-unlearning. Our results affirm the practicality and effectiveness of this verification framework, demonstrating verifiable unlearning with minimal degradation in personalization-induced performance improvements. Our methodology ensures verifiable, privacy-preserving, and effective machine unlearning across edge devices.

Paperid: 2182, https://arxiv.org/pdf/2506.19563.pdf

Abstract:
Large Language Models (LLMs) are widely used in sensitive domains, including healthcare, finance, and legal services, raising concerns about potential private information leaks during inference. Privacy extraction attacks, such as jailbreaking, expose vulnerabilities in LLMs by crafting inputs that force the models to output sensitive information. However, these attacks cannot verify whether the extracted private information is accurate, as no public datasets exist for cross-validation, leaving a critical gap in private information detection during inference. To address this, we propose PrivacyXray, a novel framework detecting privacy breaches by analyzing LLM inner states. Our analysis reveals that LLMs exhibit higher semantic coherence and probabilistic certainty when generating correct private outputs. Based on this, PrivacyXray detects privacy breaches using four metrics: intra-layer and inter-layer semantic similarity, token-level and sentence-level probability distributions. PrivacyXray addresses critical challenges in private information detection by overcoming the lack of open-source private datasets and eliminating reliance on external data for validation. It achieves this through the synthesis of realistic private data and a detection mechanism based on the inner states of LLMs. Experiments show that PrivacyXray achieves consistent performance, with an average accuracy of 92.69% across five LLMs. Compared to state-of-the-art methods, PrivacyXray achieves significant improvements, with an average accuracy increase of 20.06%, highlighting its stability and practical utility in real-world applications.

Paperid: 2183, https://arxiv.org/pdf/2506.19393.pdf

Abstract:
Biometric authentication relies on physiological or behavioral traits that are inherent to a user, making them difficult to lose, forge or forget. Biometric data with a temporal component enable the following authentication protocol: recent readings of the underlying biometrics are encoded as time series and compared to a set of base readings. If the distance between the new readings and the base readings falls within an acceptable threshold, then the user is successfully authenticated. Various methods exist for comparing time series data, such as Dynamic Time Warping (DTW) and the Time Warp Edit Distance (TWED), each offering advantages and drawbacks depending on the context. Moreover, many of these techniques do not inherently preserve privacy, which is a critical consideration in biometric authentication due to the complexity of resetting biometric credentials. In this work, we propose ZK-SERIES to provide privacy and efficiency to a broad spectrum of time series-based authentication protocols. ZK-SERIES uses the same building blocks, i.e., zero-knowledge multiplication proofs and efficiently batched range proofs, to ensure consistency across all protocols. Furthermore, it is optimized for compatibility with low-capacity devices such as smartphones. To assess the effectiveness of our proposed technique, we primarily focus on two case studies for biometric authentication: shake-based and blow-based authentication. To demonstrate ZK-SERIES's practical applicability even in older and less powerful smartphones, we conduct experiments on a 5-year-old low-spec smartphone using real data for two case studies alongside scalability assessments using artificial data. Our experimental results indicate that the privacy-preserving authentication protocol can be completed within 1.3 seconds on older devices.

Paperid: 2184, https://arxiv.org/pdf/2506.17504.pdf

Abstract:
Nominative signatures allow us to indicate who can verify a signature, and they can be employed to construct a non-transferable signature verification system that prevents the signature verification by a third party in unexpected situations. For example, this system can prevent IOU/loan certificate verification in unexpected situations. However, nominative signatures themselves do not allow the verifier to check whether the funds will be transferred in the future or have been transferred.It would be desirable to verify the fact simultaneously when the system involves a certain money transfer such as cryptocurrencies/cryptoassets. In this paper, we propose a smart contract-based non-transferable signature verification system using nominative signatures. We pay attention to the fact that the invisibility, which is a security requirement to be held for nominative signatures, allows us to publish nominative signatures on the blockchain. Our system can verify whether a money transfer actually will take place, in addition to indicating who can verify a signature. We transform the Hanaoka-Schuldt nominative signature scheme (ACNS 2011, IEICE Trans. 2016) which is constructed over a symmetric pairing to a scheme constructed over an asymmetric pairing, and evaluate the gas cost when a smart contract runs the verification algorithm of the modified Hanaoka-Schuldt nominative signature scheme.

Paperid: 2185, https://arxiv.org/pdf/2506.16899.pdf

Abstract:
A key challenge in security analysis is the manual evaluation of potential security weaknesses generated by static application security testing (SAST) tools. Numerous false positives (FPs) in these reports reduce the effectiveness of security analysis. We propose using Large Language Models (LLMs) to improve the assessment of SAST findings. We investigate the ability of LLMs to reduce FPs while trying to maintain a perfect true positive rate, using datasets extracted from the OWASP Benchmark (v1.2) and a real-world software project. Our results indicate that advanced prompting techniques, such as Chain-of-Thought and Self-Consistency, substantially improve FP detection. Notably, some LLMs identified approximately 62.5% of FPs in the OWASP Benchmark dataset without missing genuine weaknesses. Combining detections from different LLMs would increase this FP detection to approximately 78.9%. Additionally, we demonstrate our approach's generalizability using a real-world dataset covering five SAST tools, three programming languages, and infrastructure files. The best LLM detected 33.85% of all FPs without missing genuine weaknesses, while combining detections from different LLMs would increase this detection to 38.46%. Our findings highlight the potential of LLMs to complement traditional SAST tools, enhancing automation and reducing resources spent addressing false alarms.

Paperid: 2186, https://arxiv.org/pdf/2506.16626.pdf

Abstract:
In recent years, the adoption of cloud services has been expanding at an unprecedented rate. As more and more organizations migrate or deploy their businesses to the cloud, a multitude of related cybersecurity incidents such as data breaches are on the rise. Many inherent attributes of cloud environments, for example, data sharing, remote access, dynamicity and scalability, pose significant challenges for the protection of cloud security. Even worse, cyber threats are becoming increasingly sophisticated and covert. Attack methods, such as Advanced Persistent Threats (APTs), are continually developed to bypass traditional security measures. Among the emerging technologies for robust threat detection, system provenance analysis is being considered as a promising mechanism, thus attracting widespread attention in the field of incident response. This paper proposes a new few-shot learning-based attack detection with improved data context intelligence. We collect operating system behavior data of cloud systems during realistic attacks and leverage an innovative semiotics extraction method to describe system events. Inspired by the advances in semantic analysis, which is a fruitful area focused on understanding natural languages in computational linguistics, we further convert the anomaly detection problem into a similarity comparison problem. Comprehensive experiments show that the proposed approach is able to generalize over unseen attacks and make accurate predictions, even if the incident detection models are trained with very limited samples.

Paperid: 2187, https://arxiv.org/pdf/2506.16545.pdf

Abstract:
The rise of the Internet of Things and Cyber-Physical Systems has introduced new challenges on ensuring secure and robust communication. The growing number of connected devices increases network complexity, leading to higher latency and traffic. Distributed computing architectures (DCAs) have gained prominence to address these issues. This shift has significantly expanded the attack surface, requiring additional security measures to protect all components -- from sensors and actuators to edge nodes and central servers. Recent incidents highlight the difficulty of this task: Cyberattacks, like distributed denial of service attacks, continue to pose severe threats and cause substantial damage. Implementing a holistic defense mechanism remains an open challenge, particularly against attacks that demand both enhanced resilience and rapid response. Addressing this gap requires innovative solutions to enhance the security of DCAs. In this work, we present our holistic self-adaptive security framework which combines different adaptation strategies to create comprehensive and efficient defense mechanisms. We describe how to incorporate the framework into a real-world use case scenario and further evaluate its applicability and efficiency. Our evaluation yields promising results, indicating great potential to further extend the research on our framework.

Paperid: 2188, https://arxiv.org/pdf/2506.15924.pdf

Abstract:
Confidential virtual machines (CVMs) based on trusted execution environments (TEEs) enable new privacy-preserving solutions. Yet, they leave side-channel leakage outside their threat model, shifting the responsibility of mitigating such attacks to developers. However, mitigations are either not generic or too slow for practical use, and developers currently lack a systematic, efficient way to measure and compare leakage across real-world deployments. In this paper, we present FARFETCH'D, an open-source toolkit that offers configurable side-channel tracing primitives on production AMD SEV-SNP hardware and couples them with statistical and machine-learning-based analysis pipelines for automated leakage estimation. We apply FARFETCH'D to three representative workloads that are deployed on CVMs to enhance user privacy - private information retrieval, private heavy hitters, and Wasm user-defined functions - and uncover previously unnoticed leaks, including a covert channel that exfiltrated data at 497 kbit/s. The results show that FARFETCH'D pinpoints vulnerabilities and guides low-overhead mitigations based on oblivious memory and differential privacy, giving practitioners a practical path to deploy CVMs with meaningful confidentiality guarantees.

Paperid: 2189, https://arxiv.org/pdf/2506.15547.pdf

Abstract:
Randomness extractors are algorithms that distill weak random sources into near-perfect random numbers. Two-source extractors enable this distillation process by combining two independent weak random sources. Raz's extractor (STOC '05) was the first to achieve this in a setting where one source has linear min-entropy (i.e., proportional to its length), while the other has only logarithmic min-entropy in its length. However, Raz's original construction is impractical due to a polynomial computation time of at least degree 4. Our work solves this problem by presenting an improved version of Raz's extractor with quasi-linear computation time, as well as a new analytic theorem with reduced entropy requirements. We provide comprehensive analytical and numerical comparisons of our construction with others in the literature, and we derive strong and quantum-proof versions of our efficient Raz extractor. Additionally, we offer an easy-to-use, open-source code implementation of the extractor and a numerical parameter calculation module.

Paperid: 2190, https://arxiv.org/pdf/2506.15432.pdf

Abstract:
Dataflow neural network accelerators efficiently process AI tasks on FPGAs, with deployment simplified by ready-to-use frameworks and pre-trained models. However, this convenience makes them vulnerable to malicious actors seeking to reverse engineer valuable Intellectual Property (IP) through Side-Channel Attacks (SCA). This paper proposes a methodology to recover the hardware configuration of dataflow accelerators generated with the FINN framework. Through unsupervised dimensionality reduction, we reduce the computational overhead compared to the state-of-the-art, enabling lightweight classifiers to recover both folding and quantization parameters. We demonstrate an attack phase requiring only 337 ms to recover the hardware parameters with an accuracy of more than 95% and 421 ms to fully recover these parameters with an averaging of 4 traces for a FINN-based accelerator running a CNN, both using a random forest classifier on side-channel traces, even with the accelerator dataflow fully loaded. This approach offers a more realistic attack scenario than existing methods, and compared to SoA attacks based on tsfresh, our method requires 940x and 110x less time for preparation and attack phases, respectively, and gives better results even without averaging traces.

Paperid: 2191, https://arxiv.org/pdf/2506.15417.pdf

Abstract:
Software-exploitable Hardware Trojans (HTs) enable attackers to execute unauthorized software or gain illicit access to privileged operations. This manuscript introduces a hardware-based methodology for detecting runtime HT activations using Error Correction Codes (ECCs) on a RISC-V microprocessor. Specifically, it focuses on HTs that inject malicious instructions, disrupting the normal execution flow by triggering unauthorized programs. To counter this threat, the manuscript introduces a Hardware Security Checker (HSC) leveraging Hamming Single Error Correction (HSEC) architectures for effective HT detection. Experimental results demonstrate that the proposed solution achieves a 100% detection rate for potential HT activations, with no false positives or undetected attacks. The implementation incurs minimal overhead, requiring only 72 #LUTs, 24 #FFs, and 0.5 #BRAM while maintaining the microprocessor's original operating frequency and introducing no additional time delay.

Paperid: 2192, https://arxiv.org/pdf/2506.15028.pdf

Abstract:
The integration of AI/ML into medical devices is rapidly transforming healthcare by enhancing diagnostic and treatment facilities. However, this advancement also introduces serious cybersecurity risks due to the use of complex and often opaque models, extensive interconnectivity, interoperability with third-party peripheral devices, Internet connectivity, and vulnerabilities in the underlying technologies. These factors contribute to a broad attack surface and make threat prevention, detection, and mitigation challenging. Given the highly safety-critical nature of these devices, a cyberattack on these devices can cause the ML models to mispredict, thereby posing significant safety risks to patients. Therefore, ensuring the security of these devices from the time of design is essential. This paper underscores the urgency of addressing the cybersecurity challenges in ML-enabled medical devices at the pre-market phase. We begin by analyzing publicly available data on device recalls and adverse events, and known vulnerabilities, to understand the threat landscape of AI/ML-enabled medical devices and their repercussions on patient safety. Building on this analysis, we introduce a suite of tools and techniques designed by us to assist security analysts in conducting comprehensive premarket risk assessments. Our work aims to empower manufacturers to embed cybersecurity as a core design principle in AI/ML-enabled medical devices, thereby making them safe for patients.

Paperid: 2193, https://arxiv.org/pdf/2506.14964.pdf

Abstract:
Confidential Virtual Machines (CVMs) provide isolation guarantees for data in use, but their threat model does not include physical level protection and side-channel attacks. Therefore, current deployments rely on trusted cloud providers to host the CVMs' underlying infrastructure. However, TEE attestations do not provide information about the operator hosting a CVM. Without knowing whether a Trusted Execution Environment (TEE) runs within a provider's infrastructure, a user cannot accurately assess the risks of physical attacks. We observe a misalignment in the threat model where the workloads are protected against other tenants but do not offer end-to-end security assurances to external users without relying on cloud providers. The attestation should be extended to bind the CVM with the provider. A possible solution can rely on the Protected Platform Identifier (PPID), a unique CPU identifier. However, the implementation details of various TEE manufacturers, attestation flows, and providers vary. This makes verification of attestations, ease of migration, and building applications without relying on a trusted party challenging, highlighting a key limitation that must be addressed for the adoption of CVMs. We discuss two points focusing on hardening and extensions of TEEs' attestation.

Paperid: 2194, https://arxiv.org/pdf/2506.14323.pdf

Abstract:
Security researchers are interested in security vulnerabilities, but these security vulnerabilities create risks for stakeholders. Coordinated Vulnerability Disclosure has been an accepted best practice for many years in disclosing newly discovered vulnerabilities. This practice has mostly worked, but it can become challenging when there are many different parties involved. There has also been research into known vulnerabilities, using datasets or active scans to discover how many machines are still vulnerable. The ethical guidelines suggest that researchers also make an effort to notify the owners of these machines. We identify that this differs from vulnerability disclosure, but rather the practice of vulnerability notification. This practice has some similarities with vulnerability disclosure but should be distinguished from it, providing other challenges and requiring a different approach. Based on our earlier disclosure experience and on prior work documenting their disclosure and notification operations, we provide a meta-review on vulnerability disclosure and notification to observe the shifts in strategies in recent years. We assess how researchers initiated their messaging and examine the outcomes. We then compile the best practices for the existing disclosure guidelines and for notification operations.

Paperid: 2195, https://arxiv.org/pdf/2506.12802.pdf

Abstract:
Biometric authentication systems pose privacy risks, as leaked templates such as iris or fingerprints can lead to security breaches. Fully Homomorphic Encryption (FHE) enables secure encrypted evaluation, but its deployment is hindered by large ciphertexts, high key overhead, and limited trust models. We propose the Bidirectional Transciphering Framework (BTF), combining FHE, transciphering, and a non-colluding trusted party to enable efficient and privacy-preserving biometric authentication. The key architectural innovation is the introduction of a trusted party that assists in evaluation and key management, along with a double encryption mechanism to preserve the FHE trust model, where client data remains private. BTF addresses three core deployment challenges: reducing the size of returned FHE ciphertexts, preventing clients from falsely reporting successful authentication, and enabling scalable, centralized FHE key management. We implement BTF using TFHE and the Trivium cipher, and evaluate it on iris-based biometric data. Our results show up to a 121$\times$ reduction in transmission size compared to standard FHE models, demonstrating practical scalability and deployment potential.

Paperid: 2196, https://arxiv.org/pdf/2506.12675.pdf

Abstract:
Quantum neural networks (QNNs) leverage quantum computing to create powerful and efficient artificial intelligence models capable of solving complex problems significantly faster than traditional computers. With the fast development of quantum hardware technology, such as superconducting qubits, trapped ions, and integrated photonics, quantum computers may become reality, accelerating the applications of QNNs. However, preparing quantum circuits and optimizing parameters for QNNs require quantum hardware support, expertise, and high-quality data. How to protect intellectual property (IP) of QNNs becomes an urgent problem to be solved in the era of quantum computing. We make the first attempt towards IP protection of QNNs by watermarking. To this purpose, we collect classical clean samples and trigger ones, each of which is generated by adding a perturbation to a clean sample, associated with a label different from the ground-truth one. The host QNN, consisting of quantum encoding, quantum state transformation, and quantum measurement, is then trained from scratch with the clean samples and trigger ones, resulting in a watermarked QNN model. During training, we introduce sample grouped and paired training to ensure that the performance on the downstream task can be maintained while achieving good performance for watermark extraction. When disputes arise, by collecting a mini-set of trigger samples, the hidden watermark can be extracted by analyzing the prediction results of the target model corresponding to the trigger samples, without accessing the internal details of the target QNN model, thereby verifying the ownership of the model. Experiments have verified the superiority and applicability of this work.

Paperid: 2197, https://arxiv.org/pdf/2506.12411.pdf

Abstract:
Multimodal contrastive learning models like CLIP have demonstrated remarkable vision-language alignment capabilities, yet their vulnerability to backdoor attacks poses critical security risks. Attackers can implant latent triggers that persist through downstream tasks, enabling malicious control of model behavior upon trigger presentation. Despite great success in recent defense mechanisms, they remain impractical due to strong assumptions about attacker knowledge or excessive clean data requirements. In this paper, we introduce InverTune, the first backdoor defense framework for multimodal models under minimal attacker assumptions, requiring neither prior knowledge of attack targets nor access to the poisoned dataset. Unlike existing defense methods that rely on the same dataset used in the poisoning stage, InverTune effectively identifies and removes backdoor artifacts through three key components, achieving robust protection against backdoor attacks. Specifically, InverTune first exposes attack signatures through adversarial simulation, probabilistically identifying the target label by analyzing model response patterns. Building on this, we develop a gradient inversion technique to reconstruct latent triggers through activation pattern analysis. Finally, a clustering-guided fine-tuning strategy is employed to erase the backdoor function with only a small amount of arbitrary clean data, while preserving the original model capabilities. Experimental results show that InverTune reduces the average attack success rate (ASR) by 97.87% against the state-of-the-art (SOTA) attacks while limiting clean accuracy (CA) degradation to just 3.07%. This work establishes a new paradigm for securing multimodal systems, advancing security in foundation model deployment without compromising performance.

Paperid: 2198, https://arxiv.org/pdf/2506.11791.pdf

Abstract:
Rigorous security-focused evaluation of large language model (LLM) agents is imperative for establishing trust in their safe deployment throughout the software development lifecycle. However, existing benchmarks largely rely on synthetic challenges or simplified vulnerability datasets that fail to capture the complexity and ambiguity encountered by security engineers in practice. We introduce SEC-bench, the first fully automated benchmarking framework for evaluating LLM agents on authentic security engineering tasks. SEC-bench employs a novel multi-agent scaffold that automatically constructs code repositories with harnesses, reproduces vulnerabilities in isolated environments, and generates gold patches for reliable evaluation. Our framework automatically creates high-quality software vulnerability datasets with reproducible artifacts at a cost of only $0.87 per instance. Using SEC-bench, we implement two critical software security tasks to rigorously evaluate LLM agents' capabilities: proof-of-concept (PoC) generation and vulnerability patching. A comprehensive evaluation of state-of-the-art LLM code agents reveals significant performance gaps, achieving at most 18.0% success in PoC generation and 34.0% in vulnerability patching on our complete dataset. These results highlight the crucial steps needed toward developing LLM agents that are more practical, intelligent, and autonomous for security engineering.

Paperid: 2199, https://arxiv.org/pdf/2506.11325.pdf

Abstract:
Indicators of Compromise (IoCs) are critical for threat detection and response, marking malicious activity across networks and systems. Yet, the effectiveness of automated IoC extraction systems is fundamentally limited by one key issue: the lack of high-quality ground truth. Current extraction tools rely either on manually extracted ground truth, which is labor-intensive and costly, or on automated ground truth creation methods that include non-malicious artifacts, leading to inflated false positive (FP) rates and unreliable threat intelligence. In this work, we analyze the shortcomings of existing ground truth creation strategies and address them by introducing the first hybrid human-in-the-loop pipeline for IoC extraction, which combines a large language model-based classifier (LANCE) with expert analyst validation. Our system improves precision through explainable, context-aware labeling and reduces analysts' work factor by 43% compared to manual annotation, as demonstrated in our evaluation with six analysts. Using this approach, we produce PRISM, a high-quality, publicly available benchmark of 1,791 labeled IoCs from 50 real-world threat reports. PRISM supports both fair evaluation and training of IoC extraction methods and enables reproducible research grounded in expert-validated indicators.

Paperid: 2200, https://arxiv.org/pdf/2506.10327.pdf

Abstract:
In the last decade, the rapid growth of Unmanned Aircraft Systems (UAS) and Unmanned Aircraft Vehicles (UAV) in communication, defense, and transportation has increased. The application of UAS will continue to increase rapidly. This has led researchers to examine security vulnerabilities in various facets of UAS infrastructure and UAVs, which form a part of the UAS system to reinforce these critical systems. This survey summarizes the cybersecurity vulnerabilities in several phases of UAV deployment, the likelihood of each vulnerability's occurrence, the impact of attacks, and mitigation strategies that could be applied. We go beyond the state-of-the-art by taking a comprehensive approach to enhancing UAS security by performing an analysis of both UAS-specific and non-UAS-specific mitigation strategies that are applicable within the UAS domain to define the lessons learned. We also present relevant cybersecurity standards and their recommendations in the UAS context. Despite the significant literature in UAS security and the relevance of cyberphysical and networked systems security approaches from the past, which we identify in the survey, we find several critical research gaps that require further investigation. These form part of our discussions and recommendations for the future exploration by our research community.

Paperid: 2201, https://arxiv.org/pdf/2506.10030.pdf

Abstract:
As Retrieval-Augmented Generation (RAG) evolves into service-oriented platforms (Rag-as-a-Service) with shared knowledge bases, protecting the copyright of contributed data becomes essential. Existing watermarking methods in RAG focus solely on textual knowledge, leaving image knowledge unprotected. In this work, we propose AQUA, the first watermark framework for image knowledge protection in Multimodal RAG systems. AQUA embeds semantic signals into synthetic images using two complementary methods: acronym-based triggers and spatial relationship cues. These techniques ensure watermark signals survive indirect watermark propagation from image retriever to textual generator, being efficient, effective and imperceptible. Experiments across diverse models and datasets show that AQUA enables robust, stealthy, and reliable copyright tracing, filling a key gap in multimodal RAG protection.

Paperid: 2202, https://arxiv.org/pdf/2506.09418.pdf

Abstract:
The advent of Open Radio Access Networks (O-RAN) introduces modularity and flexibility into 5G deployments but also surfaces novel security challenges across disaggregated interfaces. This literature review synthesizes recent research across thirteen academic and industry sources, examining vulnerabilities such as cipher bidding-down attacks, partial encryption exposure on control/user planes, and performance trade-offs in securing O-RAN interfaces like E2 and O1. The paper surveys key cryptographic tools -- SNOW-V, AES-256, and ZUC-256 -- evaluating their throughput, side-channel resilience, and adaptability to heterogeneous slices (eMBB, URLLC, mMTC). Emphasis is placed on emerging testbeds and AI-driven controllers that facilitate dynamic orchestration, anomaly detection, and secure configuration. We conclude by outlining future research directions, including hardware offloading, cross-layer cipher adaptation, and alignment with 3GPP TS 33.501 and O-RAN Alliance security mandates, all of which point toward the need for integrated, zero-trust architectures in 6G.

Paperid: 2203, https://arxiv.org/pdf/2506.08996.pdf

Abstract:
Online services provide users with cookie banners to accept/reject the cookies placed on their web browsers. Despite the increased adoption of cookie banners, little has been done to ensure that cookie consent is compliant with privacy laws around the globe. Prior studies have found that cookies are often placed on browsers even after their explicit rejection by users. These inconsistencies in cookie banner behavior circumvent users' consent preferences and are known as cookie consent violations. To address this important problem, we propose an end-to-end system, called ConsentChk, that detects and analyzes cookie banner behavior. ConsentChk uses a formal model to systematically detect and categorize cookie consent violations. We investigate eight English-speaking regions across the world, and analyze cookie banner behavior across 1,793 globally-popular websites. Cookie behavior, cookie consent violation rates, and cookie banner implementations are found to be highly dependent on region. Our evaluation reveals that consent management platforms (CMPs) and website developers likely tailor cookie banner configurations based on their (often incorrect) interpretations of regional privacy laws. We discuss various root causes behind these cookie consent violations. The resulting implementations produce misleading cookie banners, indicating the prevalence of inconsistently implemented and enforced cookie consent between various regions.

Paperid: 2204, https://arxiv.org/pdf/2506.08461.pdf

Abstract:
As the demand for privacy-preserving computation continues to grow, fully homomorphic encryption (FHE)-which enables continuous computation on encrypted data-has become a critical solution. However, its adoption is hindered by significant computational overhead, requiring 10000-fold more computation compared to plaintext processing. Recent advancements in FHE accelerators have successfully improved server-side performance, but client-side computations remain a bottleneck, particularly under bootstrappable parameter configurations, which involve combinations of encoding, encrypt, decoding, and decrypt for large-sized parameters. To address this challenge, we propose ABC-FHE, an area- and power-efficient FHE accelerator that supports bootstrappable parameters on the client side. ABC-FHE employs a streaming architecture to maximize performance density, minimize area usage, and reduce off-chip memory access. Key innovations include a reconfigurable Fourier engine capable of switching between NTT and FFT modes. Additionally, an on-chip pseudo-random number generator and a unified on-the-fly twiddle factor generator significantly reduce memory demands, while optimized task scheduling enhances the CKKS client-side processing, achieving reduced latency. Overall, ABC-FHE occupies a die area of 28.638 mm2 and consumes 5.654 W of power in 28 nm technology. It delivers significant performance improvements, achieving a 1112x speed-up in encoding and encryption execution time compared to a CPU, and 214x over the state-of-the-art client-side accelerator. For decoding and decryption, it achieves a 963x speed-up over the CPU and 82x over the state-of-the-art accelerator.

Paperid: 2205, https://arxiv.org/pdf/2506.08445.pdf

Abstract:
Recently, approaches using Deep Reinforcement Learning (DRL) have been proposed to solve UAV navigation systems in complex and unknown environments. However, despite extensive research and attention, systematic studies on various security aspects have not yet been conducted. Therefore, in this paper, we conduct research on security vulnerabilities in DRL-based navigation systems, particularly focusing on GPS spoofing attacks against the system. Many recent basic DRL-based navigation systems fundamentally share an efficient structure. This paper presents an attack model that operates through GPS spoofing attacks briefly modeling the range of spoofing attack against EKF sensor fusion of PX4 autopilot, and combine this with the DRL-based system to design attack scenarios that are closer to reality. Finally, this paper experimentally demonstrated that attacks are possible both in the basic DRL system and in attack models combining the DRL system with PX4 autopilot system.

Paperid: 2206, https://arxiv.org/pdf/2506.08255.pdf

Abstract:
Continual learning under adversarial conditions remains an open problem, as existing methods often compromise either robustness, scalability, or both. We propose a novel framework that integrates Interval Bound Propagation (IBP) with a hypernetwork-based architecture to enable certifiably robust continual learning across sequential tasks. Our method, SHIELD, generates task-specific model parameters via a shared hypernetwork conditioned solely on compact task embeddings, eliminating the need for replay buffers or full model copies and enabling efficient over time. To further enhance robustness, we introduce Interval MixUp, a novel training strategy that blends virtual examples represented as $\ell_{\infty}$ balls centered around MixUp points. Leveraging interval arithmetic, this technique guarantees certified robustness while mitigating the wrapping effect, resulting in smoother decision boundaries. We evaluate SHIELD under strong white-box adversarial attacks, including PGD and AutoAttack, across multiple benchmarks. It consistently outperforms existing robust continual learning methods, achieving state-of-the-art average accuracy while maintaining both scalability and certification. These results represent a significant step toward practical and theoretically grounded continual learning in adversarial settings.

Paperid: 2207, https://arxiv.org/pdf/2506.08252.pdf

Abstract:
Power Side-Channel (PSC) attacks exploit power consumption patterns to extract sensitive information, posing risks to cryptographic operations crucial for secure systems. Traditional countermeasures, such as masking, face challenges including complex integration during synthesis, substantial area overhead, and susceptibility to optimization removal during logic synthesis. To address these issues, we introduce PoSyn, a novel logic synthesis framework designed to enhance cryptographic hardware resistance against PSC attacks. Our method centers on optimal bipartite mapping of vulnerable RTL components to standard cells from the technology library, aiming to minimize PSC leakage. By utilizing a cost function integrating critical characteristics from both the RTL design and the standard cell library, we strategically modify mapping criteria during RTL-to-netlist conversion without altering design functionality. Furthermore, we theoretically establish that PoSyn minimizes mutual information leakage, strengthening its security against PSC vulnerabilities. We evaluate PoSyn across various cryptographic hardware implementations, including AES, RSA, PRESENT, and post-quantum cryptographic algorithms such as Saber and CRYSTALS-Kyber, at technology nodes of 65nm, 45nm, and 15nm. Experimental results demonstrate a substantial reduction in success rates for Differential Power Analysis (DPA) and Correlation Power Analysis (CPA) attacks, achieving lows of 3% and 6%, respectively. TVLA analysis further confirms that synthesized netlists exhibit negligible leakage. Additionally, compared to conventional countermeasures like masking and shuffling, PoSyn significantly lowers attack success rates, achieving reductions of up to 72%, while simultaneously enhancing area efficiency by as much as 3.79 times.

Paperid: 2208, https://arxiv.org/pdf/2506.05734.pdf

Abstract:
The security of printed circuit boards (PCBs) has become increasingly vital as supply chain vulnerabilities, including tampering, present significant risks to electronic systems. While detecting tampering on a PCB is the first step for verification, forensics is also needed to identify the modified component. One non-invasive and reliable PCB tamper detection technique with global coverage is the impedance characterization of a PCB's power delivery network (PDN). However, it is an open question whether one can use the two-dimensional impedance signatures for forensics purposes. In this work, we introduce a novel PCB forensics approach using explainable AI (XAI) on impedance signatures. Through extensive experiments, we replicate various PCB tamper events, generating a dataset used to develop an XAI algorithm capable of not only detecting tampering but also explaining why the algorithm makes a decision about whether a tamper event has happened. At the core of our XAI algorithm is a random forest classifier with an accuracy of 96.7%, sufficient to explain the algorithm's decisions. To understand the behavior of the classifier in the decision-making process, we utilized SHAP values as an XAI tool to determine which frequency component influences the classifier's decision for a particular class the most. This approach enhances detection capabilities as well as advancing the verifier's ability to reverse-engineer and analyze two-dimensional impedance signatures for forensics.

Paperid: 2209, https://arxiv.org/pdf/2506.05430.pdf

Abstract:
Binary code similarity detection (BCSD) serves as a fundamental technique for various software engineering tasks, e.g., vulnerability detection and classification. Attacks against such models have therefore drawn extensive attention, aiming at misleading the models to generate erroneous predictions. Prior works have explored various approaches to generating semantic-preserving variants, i.e., adversarial samples, to evaluate the robustness of the models against adversarial attacks. However, they have mainly relied on heuristic criteria or iterative greedy algorithms to locate salient code influencing the model output, failing to operate on a solid theoretical basis. Moreover, when processing programs with high complexities, such attacks tend to be time-consuming. In this work, we propose a novel optimization for adversarial attacks against BCSD models. In particular, we aim to improve the attacks in a challenging scenario, where the attack goal is to limit the model predictions to a specific range, i.e., the targeted attacks. Our attack leverages the superior capability of black-box, model-agnostic explainers in interpreting the model decision boundaries, thereby pinpointing the critical code snippet to apply semantic-preserving perturbations. The evaluation results demonstrate that compared with the state-of-the-art attacks, the proposed attacks achieve higher attack success rate in almost all scenarios, while also improving the efficiency and transferability. Our real-world case studies on vulnerability detection and classification further demonstrate the security implications of our attacks, highlighting the urgent need to further enhance the robustness of existing BCSD models.

Paperid: 2210, https://arxiv.org/pdf/2506.05408.pdf

Abstract:
Clustering is a cornerstone of data analysis that is particularly suited to identifying coherent subgroups or substructures in unlabeled data, as are generated continuously in large amounts these days. However, in many cases traditional clustering methods are not applicable, because data are increasingly being produced and stored in a distributed way, e.g. on edge devices, and privacy concerns prevent it from being transferred to a central server. To address this challenge, we present FedDP-KMeans, a new algorithm for $k$-means clustering that is fully-federated as well as differentially private. Our approach leverages (potentially small and out-of-distribution) server-side data to overcome the primary challenge of differentially private clustering methods: the need for a good initialization. Combining our initialization with a simple federated DP-Lloyds algorithm we obtain an algorithm that achieves excellent results on synthetic and real-world benchmark tasks. We also provide a theoretical analysis of our method that provides bounds on the convergence speed and cluster identification success.

Paperid: 2211, https://arxiv.org/pdf/2506.05376.pdf

Abstract:
Large Language Model (LLM) safeguards, which implement request refusals, have become a widely adopted mitigation strategy against misuse. At the intersection of adversarial machine learning and AI safety, safeguard red teaming has effectively identified critical vulnerabilities in state-of-the-art refusal-trained LLMs. However, in our view the many conference submissions on LLM red teaming do not, in aggregate, prioritize the right research problems. First, testing against clear product safety specifications should take a higher priority than abstract social biases or ethical principles. Second, red teaming should prioritize realistic threat models that represent the expanding risk landscape and what real attackers might do. Finally, we contend that system-level safety is a necessary step to move red teaming research forward, as AI models present new threats as well as affordances for threat mitigation (e.g., detection and banning of malicious users) once placed in a deployment context. Adopting these priorities will be necessary in order for red teaming research to adequately address the slate of new threats that rapid AI advances present today and will present in the very near future.

Paperid: 2212, https://arxiv.org/pdf/2506.04853.pdf

Abstract:
We propose a privacy-preserving smart wallet with a novel invitation-based private onboarding mechanism. The solution integrates two levels of compliance in concert with an authority party: a proof of innocence mechanism and an ancestral commitment tracking system using bloom filters for probabilistic UTXO chain states. Performance analysis demonstrates practical efficiency: private transfers with compliance checks complete within seconds on a consumer-grade laptop, and overall with proof generation remaining low. On-chain costs stay minimal, ensuring affordability for all operations on Base layer 2 network. The wallet facilitates private contact list management through encrypted data blobs while maintaining transaction unlinkability. Our evaluation validates the approach's viability for privacy-preserving, compliance-aware digital payments with minimized computational and financial overhead.

Paperid: 2213, https://arxiv.org/pdf/2506.03746.pdf

Abstract:
Achieving differentially private computations in decentralized settings poses significant challenges, particularly regarding accuracy, communication cost, and robustness against information leakage. While cryptographic solutions offer promise, they often suffer from high communication overhead or require centralization in the presence of network failures. Conversely, existing fully decentralized approaches typically rely on relaxed adversarial models or pairwise noise cancellation, the latter suffering from substantial accuracy degradation if parties unexpectedly disconnect. In this work, we propose IncA, a new protocol for fully decentralized mean estimation, a widely used primitive in data-intensive processing. Our protocol, which enforces differential privacy, requires no central orchestration and employs low-variance correlated noise, achieved by incrementally injecting sensitive information into the computation. First, we theoretically demonstrate that, when no parties permanently disconnect, our protocol achieves accuracy comparable to that of a centralized setting-already an improvement over most existing decentralized differentially private techniques. Second, we empirically show that our use of low-variance correlated noise significantly mitigates the accuracy loss experienced by existing techniques in the presence of dropouts.

Paperid: 2214, https://arxiv.org/pdf/2506.02674.pdf

Abstract:
With the development of the Internet, the amount of data generated by the medical industry each year has grown exponentially. The Electronic Health Record (EHR) manages the electronic data generated during the user's treatment process. Typically, an EHR data manager belongs to a medical institution. This traditional centralized data management model has many unreasonable or inconvenient aspects, such as difficulties in data sharing, and it is hard to verify the authenticity and integrity of the data. The decentralized, non-forgeable, data unalterable and traceable features of blockchain are in line with the application requirements of EHR. This paper takes the most common COVID-19 as the application scenario and designs a COVID-19 health system based on blockchain, which has extensive research and application value. Considering that the public and transparent nature of blockchain violates the privacy requirements of some health data, in the system design stage, from the perspective of practical application, the data is divided into public data and private data according to its characteristics. For private data, data encryption methods are adopted to ensure data privacy. The searchable encryption technology is combined with blockchain technology to achieve the retrieval function of encrypted data. Then, the proxy re-encryption technology is used to realize authorized access to data. In the system implementation part, based on the Hyperledger Fabric architecture, some functions of the system design are realized, including data upload, retrieval of the latest data and historical data. According to the environment provided by the development architecture, Go language chaincode (smart contract) is written to implement the relevant system functions.

Paperid: 2215, https://arxiv.org/pdf/2506.02479.pdf

Abstract:
The inherent risk of generating harmful and unsafe content by Large Language Models (LLMs), has highlighted the need for their safety alignment. Various techniques like supervised fine-tuning, reinforcement learning from human feedback, and red-teaming were developed for ensuring the safety alignment of LLMs. However, the robustness of these aligned LLMs is always challenged by adversarial attacks that exploit unexplored and underlying vulnerabilities of the safety alignment. In this paper, we develop a novel black-box jailbreak attack, called BitBypass, that leverages hyphen-separated bitstream camouflage for jailbreaking aligned LLMs. This represents a new direction in jailbreaking by exploiting fundamental information representation of data as continuous bits, rather than leveraging prompt engineering or adversarial manipulations. Our evaluation of five state-of-the-art LLMs, namely GPT-4o, Gemini 1.5, Claude 3.5, Llama 3.1, and Mixtral, in adversarial perspective, revealed the capabilities of BitBypass in bypassing their safety alignment and tricking them into generating harmful and unsafe content. Further, we observed that BitBypass outperforms several state-of-the-art jailbreak attacks in terms of stealthiness and attack success. Overall, these results highlights the effectiveness and efficiency of BitBypass in jailbreaking these state-of-the-art LLMs.

Paperid: 2216, https://arxiv.org/pdf/2506.02035.pdf

Abstract:
As AI-enabled cyber capabilities become more advanced, we propose "differential access" as a strategy to tilt the cybersecurity balance toward defense by shaping access to these capabilities. We introduce three possible approaches that form a continuum, becoming progressively more restrictive for higher-risk capabilities: Promote Access, Manage Access, and Deny by Default. However, a key principle across all approaches is the need to prioritize defender access, even in the most restrictive scenarios, so that defenders can prepare for adversaries gaining access to similar capabilities. This report provides a process to help frontier AI developers choose and implement one of the three differential access approaches, including considerations based on a model's cyber capabilities, a defender's maturity and role, and strategic and technical implementation details. We also present four example schemes for defenders to reference, demonstrating how differential access provides value across various capability and defender levels, and suggest directions for further research.

Paperid: 2217, https://arxiv.org/pdf/2506.01072.pdf

Abstract:
Vertical privacy-preserving machine learning (vPPML) enables multiple parties to train models on their vertically distributed datasets while keeping datasets private. In vPPML, it is critical to perform the secure dataset join, which aligns features corresponding to intersection IDs across datasets and forms a secret-shared and joint training dataset. However, existing methods for this step could be impractical due to: (1) they are insecure when they expose intersection IDs; or (2) they rely on a strong trust assumption requiring a non-colluding auxiliary server; or (3) they are limited to the two-party setting. This paper proposes IDCloak, the first practical secure multi-party dataset join framework for vPPML that keeps IDs private without a non-colluding auxiliary server. IDCloak consists of two protocols: (1) a circuit-based multi-party private set intersection protocol (cmPSI), which obtains secret-shared flags indicating intersection IDs via an optimized communication structure combining OKVS and OPRF; (2) a secure multi-party feature alignment protocol, which obtains the secret-shared and joint dataset using secret-shared flags, via our proposed efficient secure shuffle protocol. Experiments show that: (1) compared to the state-of-the-art secure two-party dataset join framework (iPrivjoin), IDCloak demonstrates higher efficiency in the two-party setting and comparable performance when the party number increases; (2) compared to the state-of-the-art cmPSI protocol under honest majority, our proposed cmPSI protocol provides a stronger security guarantee (dishonest majority) while improving efficiency by up to $7.78\times$ in time and $8.73\times$ in communication sizes; (3) our proposed secure shuffle protocol outperforms the state-of-the-art shuffle protocol by up to $138.34\times$ in time and $132.13\times$ in communication sizes.

Paperid: 2218, https://arxiv.org/pdf/2506.00313.pdf

Abstract:
Data-flow analysis is a critical component of security research. Theoretically, accurate data-flow analysis in binary executables is an undecidable problem, due to complexities of binary code. Practically, many binary analysis engines offer some data-flow analysis capability, but we lack understanding of the accuracy of these analyses, and their limitations. We address this problem by introducing a labeled benchmark data set, including 215,072 microbenchmark test cases, mapping to 277,072 binary executables, created specifically to evaluate data-flow analysis implementations. Additionally, we augment our benchmark set with dynamically-discovered data flows from 6 real-world executables. Using our benchmark data set, we evaluate three state of the art data-flow analysis implementations, in angr, Ghidra and Miasm and discuss their very low accuracy and reasons behind it. We further propose three model extensions to static data-flow analysis that significantly improve accuracy, achieving almost perfect recall (0.99) and increasing precision from 0.13 to 0.32. Finally, we show that leveraging these model extensions in a vulnerability-discovery context leads to a tangible improvement in vulnerable instruction identification.

Paperid: 2219, https://arxiv.org/pdf/2506.00274.pdf

Abstract:
Large language models hold considerable promise for supporting forensic investigations, but their widespread adoption is hindered by a lack of transparency, explainability, and reproducibility. This paper explores how the emerging Model Context Protocol can address these challenges and support the meaningful use of LLMs in digital forensics. Through a theoretical analysis, we examine how MCP can be integrated across various forensic scenarios - ranging from artifact analysis to the generation of interpretable reports. We also outline both technical and conceptual considerations for deploying an MCP server in forensic environments. Our analysis reveals a wide range of use cases in which MCP not only strengthens existing forensic workflows but also facilitates the application of LLMs to areas of forensics where their use was previously limited. Furthermore, we introduce the concept of the inference constraint level - a way of characterizing how specific MCP design choices can deliberately constrain model behavior, thereby enhancing both auditability and traceability. Our insights demonstrate that MCP has significant potential as a foundational component for developing LLM-assisted forensic workflows that are not only more transparent, reproducible, and legally defensible, but also represent a step toward increased automation in digital forensic analysis. However, we also highlight potential challenges that the adoption of MCP may pose for digital forensics in the future.

Paperid: 2220, https://arxiv.org/pdf/2505.24451.pdf

Abstract:
Large Language Models (LLMs) are being extensively used for cybersecurity purposes. One of them is the detection of vulnerable codes. For the sake of efficiency and effectiveness, compression and fine-tuning techniques are being developed, respectively. However, they involve spending substantial computational efforts. In this vein, we analyse how Linear Probes (LPs) can be used to provide an estimation on the performance of a compressed LLM at an early phase -- before fine-tuning. We also show their suitability to set the cut-off point when applying layer pruning compression. Our approach, dubbed $LPASS$, is applied in BERT and Gemma for the detection of 12 of MITRE's Top 25 most dangerous vulnerabilities on 480k C/C++ samples. LPs can be computed in 142.97 s. and provide key findings: (1) 33.3 \% and 72.2\% of layers can be removed, respectively, with no precision loss; (2) they provide an early estimate of the post-fine-tuning and post-compression model effectiveness, with 3\% and 8.68\% as the lowest and average precision errors, respectively. $LPASS$-based LLMs outperform the state of the art, reaching 86.9\% of accuracy in multi-class vulnerability detection. Interestingly, $LPASS$-based compressed versions of Gemma outperform the original ones by 1.6\% of F1-score at a maximum while saving 29.4 \% and 23.8\% of training and inference time and 42.98\% of model size.

Paperid: 2221, https://arxiv.org/pdf/2505.24393.pdf

Abstract:
Optimistic Rollups (ORUs) significantly enhance blockchain scalability but inherently suffer from the verifier's dilemma, particularly concerning validator attentiveness. Current systems lack mechanisms to proactively ensure validators are diligently monitoring L2 state transitions, creating a vulnerability where fraudulent states could be finalized. This paper introduces the Randomized Attention Test (RAT), a novel L1-based protocol designed to probabilistically challenge validators in ORUs, thereby verifying their liveness and computational readiness. Our game-theoretic analysis demonstrates that an Ideal Security Equilibrium, where all validators are attentive and proposers are honest, can be achieved with RAT. Notably, this equilibrium is attainable and stable with relatively low economic penalties (under \$1000) for non-responsive validators, a low attention test frequency (under 1\% per epoch), and a minimal operation overhead (monthly under \$30) with 10 validators. RAT thus provides a pivotal, practical mechanism to enforce validator diligence, fortifying the overall security and integrity of ORU systems with minimizing additional costs.

Paperid: 2222, https://arxiv.org/pdf/2505.24008.pdf

Abstract:
Satellites are the backbone of several mission-critical services that enable our modern society to function, for example, GPS. For years, satellites were assumed to be secure because of their indecipherable architectures and the reliance on security by obscurity. However, technological advancements have made these assumptions obsolete, paving the way for potential attacks, and sparking interest in satellite security. Unfortunately, to this day, there is no efficient way to collect data on adversarial techniques for satellites, hurting the generation of security intelligence that can lead to the development of effective countermeasures. In this paper, we present HoneySat, the first high-interaction satellite honeypot framework, fully capable of convincingly simulating a real-world CubeSat, a type of Small Satellite (SmallSat). To provide evidence of HoneySat's effectiveness, we surveyed experienced SmallSat operators in charge of in-orbit satellites and deployed HoneySat over the Internet to entice adversaries. Our results show that 90% of satellite operators agreed that HoneySat provides a realistic and engaging simulation of a SmallSat mission. Additionally, HoneySat successfully deceived human adversaries in the wild and collected 22 real-world satellite-specific adversarial interactions. Finally, in a major demonstration of HoneySat's robustness, we collaborated with an aerospace company to perform a hardware-in-the-loop operation that resulted in HoneySat successfully communicating with an in-orbit, operational SmallSat mission.

Paperid: 2223, https://arxiv.org/pdf/2505.22447.pdf

Abstract:
Prompt learning is a crucial technique for adapting pre-trained multimodal language models (MLLMs) to user tasks. Federated prompt personalization (FPP) is further developed to address data heterogeneity and local overfitting, however, it exposes personalized prompts - valuable intellectual assets - to privacy risks like prompt stealing or membership inference attacks. Widely-adopted techniques like differential privacy add noise to prompts, whereas degrading personalization performance. We propose SecFPP, a secure FPP protocol harmonizing generalization, personalization, and privacy guarantees. SecFPP employs hierarchical prompt adaptation with domain-level and class-level components to handle multi-granular data imbalance. For privacy, it uses a novel secret-sharing-based adaptive clustering algorithm for domain-level adaptation while keeping class-level components private. While theoretically and empirically secure, SecFPP achieves state-of-the-art accuracy under severe heterogeneity in data distribution. Extensive experiments show it significantly outperforms both non-private and privacy-preserving baselines, offering a superior privacy-performance trade-off.

Paperid: 2224, https://arxiv.org/pdf/2505.22052.pdf

Abstract:
Even today, over 70% of security vulnerabilities in critical software systems result from memory safety violations. To address this challenge, fuzzing and static analysis are widely used automated methods to discover such vulnerabilities. Fuzzing generates random program inputs to identify faults, while static analysis examines source code to detect potential vulnerabilities. Although these techniques share a common goal, they take fundamentally different approaches and have evolved largely independently. In this paper, we present an empirical analysis of five static analyzers and 13 fuzzers, applied to over 100 known security vulnerabilities in C/C++ programs. We measure the number of bug reports generated for each vulnerability to evaluate how the approaches differ and complement each other. Moreover, we randomly sample eight bug-containing functions, manually analyze all bug reports therein, and quantify false-positive rates. We also assess limits to bug discovery, ease of use, resource requirements, and integration into the development process. We find that both techniques discover different types of bugs, but there are clear winners for each. Developers should consider these tools depending on their specific workflow and usability requirements. Based on our findings, we propose future directions to foster collaboration between these research domains.

Paperid: 2225, https://arxiv.org/pdf/2505.21756.pdf

Abstract:
Fully decentralized training of machine learning models offers significant advantages in scalability, robustness, and fault tolerance. However, achieving differential privacy (DP) in such settings is challenging due to the absence of a central aggregator and varying trust assumptions among nodes. In this work, we present a novel privacy analysis of decentralized gossip-based averaging algorithms with additive node-level noise, both with and without secure summation over each node's direct neighbors. Our main contribution is a new analytical framework based on a linear systems formulation that accurately characterizes privacy leakage across these scenarios. This framework significantly improves upon prior analyses, for example, reducing the RÃ©nyi DP parameter growth from $O(T^2)$ to $O(T)$, where $T$ is the number of training rounds. We validate our analysis with numerical results demonstrating superior DP bounds compared to existing approaches. We further illustrate our analysis with a logistic regression experiment on MNIST image classification in a fully decentralized setting, demonstrating utility comparable to central aggregation methods.

Paperid: 2227, https://arxiv.org/pdf/2505.19425.pdf

Abstract:
The rapid advancement of diffusion models has enhanced their image inpainting and editing capabilities but also introduced significant societal risks. Adversaries can exploit user images from social media to generate misleading or harmful content. While adversarial perturbations can disrupt inpainting, global perturbation-based methods fail in mask-guided editing tasks due to spatial constraints. To address these challenges, we propose Structure Disruption Attack (SDA), a powerful protection framework for safeguarding sensitive image regions against inpainting-based editing. Building upon the contour-focused nature of self-attention mechanisms of diffusion models, SDA optimizes perturbations by disrupting queries in self-attention during the initial denoising step to destroy the contour generation process. This targeted interference directly disrupts the structural generation capability of diffusion models, effectively preventing them from producing coherent images. We validate our motivation through visualization techniques and extensive experiments on public datasets, demonstrating that SDA achieves state-of-the-art (SOTA) protection performance while maintaining strong robustness.

Paperid: 2228, https://arxiv.org/pdf/2505.18889.pdf

Abstract:
Large Language Models (LLMs) such as ChatGPT and its competitors have caused a revolution in natural language processing, but their capabilities also introduce new security vulnerabilities. This survey provides a comprehensive overview of these emerging concerns, categorizing threats into several key areas: inference-time attacks via prompt manipulation; training-time attacks; misuse by malicious actors; and the inherent risks in autonomous LLM agents. Recently, a significant focus is increasingly being placed on the latter. We summarize recent academic and industrial studies from 2022 to 2025 that exemplify each threat, analyze existing defense mechanisms and their limitations, and identify open challenges in securing LLM-based applications. We conclude by emphasizing the importance of advancing robust, multi-layered security strategies to ensure LLMs are safe and beneficial.

Paperid: 2229, https://arxiv.org/pdf/2505.17310.pdf

Abstract:
The proliferation of electronic devices has greatly transformed every aspect of human life, such as communication, healthcare, transportation, and energy. Unfortunately, the global electronics supply chain is vulnerable to various attacks, including piracy of intellectual properties, tampering, counterfeiting, information leakage, side-channel, and fault injection attacks, due to the complex nature of electronic products and vulnerabilities present in them. Although numerous solutions have been proposed to address these threats, significant gaps remain, particularly in providing scalable and comprehensive protection against emerging attacks. Digital twin, a dynamic virtual replica of a physical system, has emerged as a promising solution to address these issues by providing backward traceability, end-to-end visibility, and continuous verification of component integrity and behavior. In this paper, we comprehensively present the latest digital twin-based security implementations, including their role in cyber-physical systems, Internet of Things, cryptographic systems, detection of counterfeit electronics, intrusion detection, fault injection, and side-channel leakage. This work considers these critical security use cases within a single study to offer researchers and practitioners a unified reference for securing hardware with digital twins. The paper also explores the integration of large language models with digital twins for enhanced security and discusses current challenges, solutions, and future research directions.

Paperid: 2230, https://arxiv.org/pdf/2505.17236.pdf

Abstract:
Log management is crucial for ensuring the security, integrity, and compliance of modern information systems. Traditional log management solutions face challenges in achieving tamper-proofing, scalability, and real-time processing in distributed environments. This paper presents a blockchain-based log management framework that addresses these limitations by leveraging blockchain's decentralized, immutable, and transparent features. The framework integrates a hybrid on-chain and off-chain storage model, combining blockchain's integrity guarantees with the scalability of distributed storage solutions like IPFS. Smart contracts automate log validation and access control, while cryptographic techniques ensure privacy and confidentiality. With a focus on real-time log processing, the framework is designed to handle the high-volume log generation typical in large-scale systems, such as data centers and network infrastructure. Performance evaluations demonstrate the framework's scalability, low latency, and ability to manage millions of log entries while maintaining strong security guarantees. Additionally, the paper discusses challenges like blockchain storage overhead and energy consumption, offering insights for enhancing future systems.

Paperid: 2231, https://arxiv.org/pdf/2505.16371.pdf

Abstract:
Cyberterrorism poses a formidable threat to digital infrastructures, with increasing reliance on encrypted, decentralized platforms that obscure threat actor activity. To address the challenge of analyzing such adversarial networks while preserving the privacy of distributed intelligence data, we propose a Privacy-Aware Federated Graph Neural Network (PA-FGNN) framework. PA-FGNN integrates graph attention networks, differential privacy, and homomorphic encryption into a robust federated learning pipeline tailored for cyberterrorism network analysis. Each client trains locally on sensitive graph data and exchanges encrypted, noise-perturbed model updates with a central aggregator, which performs secure aggregation and broadcasts global updates. We implement anomaly detection for flagging high-risk nodes and incorporate defenses against gradient poisoning. Experimental evaluations on simulated dark web and cyber-intelligence graphs demonstrate that PA-FGNN achieves over 91\% classification accuracy, maintains resilience under 20\% adversarial client behavior, and incurs less than 18\% communication overhead. Our results highlight that privacy-preserving GNNs can support large-scale cyber threat detection without compromising on utility, privacy, or robustness.

Paperid: 2232, https://arxiv.org/pdf/2505.16300.pdf

Abstract:
With the ongoing adoption of 5G for communication in industrial systems and critical infrastructure, the security of industrial UEs such as 5G-enabled industrial robots becomes an increasingly important topic. Most notably, to meet the stringent security requirements of industrial deployments, industrial UEs not only have to fully comply with the 5G specifications but also implement and use correctly secure communication protocols such as TLS. To ensure the security of industrial UEs, operators of industrial 5G networks rely on security testing before deploying new devices to their production networks. However, currently only isolated tests for individual security aspects of industrial UEs exist, severely hindering comprehensive testing. In this paper, we report on our ongoing efforts to alleviate this situation by creating an automated security testing framework for industrial UEs to comprehensively evaluate their security posture before deployment. With this framework, we aim to provide stakeholders with a fully automated-method to verify that higher-layer security protocols are correctly implemented, while simultaneously ensuring that the UE's protocol stack adheres to 3GPP specifications.

Paperid: 2233, https://arxiv.org/pdf/2505.15383.pdf

Abstract:
Insider threats represent one of the most critical challenges in modern cybersecurity. These threats arise from individuals within an organization who misuse their legitimate access to harm the organization's assets, data, or operations. Traditional security mechanisms, primarily designed for external attackers, fall short in identifying these subtle and context-aware threats. In this paper, we propose a novel framework for real-time detection of insider threats using behavioral analytics combined with deep evidential clustering. Our system captures and analyzes user activities, applies context-rich behavioral features, and classifies potential threats using a deep evidential clustering model that estimates both cluster assignment and epistemic uncertainty. The proposed model dynamically adapts to behavioral changes and significantly reduces false positives. We evaluate our framework on benchmark insider threat datasets such as CERT and TWOS, achieving an average detection accuracy of 94.7% and a 38% reduction in false positives compared to traditional clustering methods. Our results demonstrate the effectiveness of integrating uncertainty modeling in threat detection pipelines. This research provides actionable insights for deploying intelligent, adaptive, and robust insider threat detection systems across various enterprise environments.

Paperid: 2234, https://arxiv.org/pdf/2505.15376.pdf

Abstract:
Industrial Internet of Things (IIoT) systems have become integral to smart manufacturing, yet their growing connectivity has also exposed them to significant cybersecurity threats. Traditional intrusion detection systems (IDS) often rely on centralized architectures that raise concerns over data privacy, latency, and single points of failure. In this work, we propose a novel Federated Learning-Enhanced Blockchain Framework (FL-BCID) for privacy-preserving intrusion detection tailored for IIoT environments. Our architecture combines federated learning (FL) to ensure decentralized model training with blockchain technology to guarantee data integrity, trust, and tamper resistance across IIoT nodes. We design a lightweight intrusion detection model collaboratively trained using FL across edge devices without exposing sensitive data. A smart contract-enabled blockchain system records model updates and anomaly scores to establish accountability. Experimental evaluations using the ToN-IoT and N-BaIoT datasets demonstrate the superior performance of our framework, achieving 97.3% accuracy while reducing communication overhead by 41% compared to baseline centralized methods. Our approach ensures privacy, scalability, and robustness-critical for secure industrial operations. The proposed FL-BCID system provides a promising solution for enhancing trust and privacy in modern IIoT security architectures.

Paperid: 2235, https://arxiv.org/pdf/2505.14914.pdf

Abstract:
We introduce the Sei Giga, a multi-concurrent producer parallelized execution EVM layer one blockchain. In an internal testnet Giga has achieved >5 gigagas/sec throughput and sub 400ms finality. Giga uses Autobahn for consensus with separate DA and consensus layers requiring f+1 votes for a PoA on the DA layer before consensus. Giga reaches consensus over ordering and uses async block execution and state agreement to remove execution from the consensus bottleneck.

Paperid: 2236, https://arxiv.org/pdf/2505.11706.pdf

Abstract:
There has been a rise in third-party cloud providers offering quantum hardware as a service to improve performance at lower cost. Although these providers provide flexibility to the users to choose from several qubit technologies, quantum hardware, and coupling maps; the actual execution of the program is not clearly visible to the customer. The success of the user program, in addition to various other metadata such as cost, performance, & number of iterations to converge, depends on the error rate of the backend used. Moreover, the third-party provider and/or tools (e.g., hardware allocator and mapper) may hold insider/outsider adversarial agents to conserve resources and maximize profit by running the quantum circuits on error-prone hardware. Thus it is important to gain visibility of the backend from various perspectives of the computing process e.g., execution, transpilation and outcomes. In this paper, we estimate the error rate of the backend from the original and transpiled circuit. For the forensics, we exploit the fact that qubit mapping and routing steps of the transpilation process select qubits and qubit pairs with less single qubit and two-qubit gate errors to minimize overall error accumulation, thereby, giving us clues about the error rates of the various parts of the backend. We ranked qubit links into bins based on ECR error rates publicly available, and compared it to the rankings derived from our investigation of the relative frequency of a qubit link being chosen by the transpiler. For upto 83.5% of the qubit links in IBM Sherbrooke and 80% in IBM Brisbane, 127 qubit IBM backends, we are able to assign a bin rank which has a difference upto 2 with the bin rank assigned on the basis of actual error rate information.

Paperid: 2237, https://arxiv.org/pdf/2505.11320.pdf

Abstract:
Scam contracts on Ethereum have rapidly evolved alongside the rise of DeFi and NFT ecosystems, utilizing increasingly complex code obfuscation techniques to avoid early detection. This paper systematically investigates how obfuscation amplifies the financial risks of fraudulent contracts and undermines existing auditing tools. We propose a transfer-centric obfuscation taxonomy, distilling seven key features, and introduce ObfProbe, a framework that performs bytecode-level smart contract analysis to uncover obfuscation techniques and quantify obfuscation complexity via Z-score ranking. In a large-scale study of 1.03 million Ethereum contracts, we isolate over 3 000 highly obfuscated contracts and identify two scam archetypes, three high-risk contract categories, and MEV bots that employ a variety of obfuscation maneuvers such as inline assembly, dead code insertion, and deep function splitting. We further show that obfuscation substantially increases both the scale of financial damage and the time until detection. Finally, we evaluate SourceP, a state-of-the-art Ponzi detection tool, on obfuscated versus non-obfuscated samples and observe its accuracy drop from approximately 80 percent to approximately 12 percent in real-world scenarios. These findings highlight the urgent need for enhanced anti-obfuscation analysis techniques and broader community collaboration to stem the proliferation of scam contracts in the expanding DeFi ecosystem.

Paperid: 2238, https://arxiv.org/pdf/2505.10871.pdf

Abstract:
Releasing useful information from datasets with hierarchical structures while preserving individual privacy presents a significant challenge. Standard privacy-preserving mechanisms, and in particular Differential Privacy, often require careful allocation of a finite privacy budget across different levels and components of the hierarchy. Sub-optimal allocation can lead to either excessive noise, rendering the data useless, or to insufficient protections for sensitive information. This paper addresses the critical problem of optimal privacy budget allocation for hierarchical data release. It formulates this challenge as a constrained optimization problem, aiming to maximize data utility subject to a total privacy budget while considering the inherent trade-offs between data granularity and privacy loss. The proposed approach is supported by theoretical analysis and validated through comprehensive experiments on real hierarchical datasets. These experiments demonstrate that optimal privacy budget allocation significantly enhances the utility of the released data and improves the performance of downstream tasks.

Paperid: 2239, https://arxiv.org/pdf/2505.09276.pdf

Abstract:
Runtime verification offers scalable solutions to improve the safety and reliability of systems. However, systems that require verification or monitoring by a third party to ensure compliance with a specification might contain sensitive information, causing privacy concerns when usual runtime verification approaches are used. Privacy is compromised if protected information about the system, or sensitive data that is processed by the system, is revealed. In addition, revealing the specification being monitored may undermine the essence of third-party verification. In this work, we propose two novel protocols for the privacy-preserving runtime verification of systems against formal sequential specifications. In our first protocol, the monitor verifies whether the system satisfies the specification without learning anything else, though both parties are aware of the specification. Our second protocol ensures that the system remains oblivious to the monitored specification, while the monitor learns only whether the system satisfies the specification and nothing more. Our protocols adapt and improve existing techniques used in cryptography, and more specifically, multi-party computation. The sequential specification defines the observation step of the monitor, whose granularity depends on the situation (e.g., banks may be monitored on a daily basis). Our protocols exchange a single message per observation step, after an initialisation phase. This design minimises communication overhead, enabling relatively lightweight privacy-preserving monitoring. We implement our approach for monitoring specifications described by register automata and evaluate it experimentally.

Paperid: 2240, https://arxiv.org/pdf/2505.09002.pdf

Abstract:
The emergence of chiplet-based heterogeneous integration is transforming the semiconductor, AI, and high-performance computing industries by enabling modular designs and improved scalability. However, assembling chiplets from multiple vendors after fabrication introduces a complex supply chain that raises serious security concerns, including counterfeiting, overproduction, and unauthorized access. Current solutions often depend on dedicated security chiplets or changes to the timing flow, which assume a trusted SiP integrator. This assumption can expose chiplet signatures to other vendors and create new attack surfaces. This work addresses those vulnerabilities using Multi-party Computation (MPC), which enables zero-trust authentication without disclosing sensitive information to any party. We present SAFE-SiP, a scalable authentication framework that garbles chiplet signatures and uses MPC for verifying integrity, effectively blocking unauthorized access and adversarial inference. SAFE-SiP removes the need for a dedicated security chiplet and ensures secure authentication, even in untrusted integration scenarios. We evaluated SAFE-SiP on five RISC-V-based System-in-Package (SiP) designs. Experimental results show that SAFE-SiP incurs minimal power overhead, an average area overhead of only 3.05%, and maintains a computational complexity of 2^192, offering a highly efficient and scalable security solution.

Paperid: 2241, https://arxiv.org/pdf/2505.08810.pdf

Abstract:
Vehicular Ad Hoc Networks (VANETs) play a key role in Intelligent Transportation Systems (ITS), particularly in enabling real-time communication for emergency vehicles. However, Distributed Denial of Service (DDoS) attacks, which interfere with safety-critical communication channels, can severely impair their reliability. This study introduces a robust and scalable framework to detect DDoS attacks in highway-based VANET environments. A synthetic dataset was constructed using Network Simulator 3 (NS-3) in conjunction with the Simulation of Urban Mobility (SUMO) and further enriched with real-world mobility traces from Germany's A81 highway, extracted via OpenStreetMap (OSM). Three traffic categories were simulated: DDoS, VoIP, and TCP-based video streaming (VideoTCP). The data preprocessing pipeline included normalization, signal-to-noise ratio (SNR) feature engineering, missing value imputation, and class balancing using the Synthetic Minority Over-sampling Technique (SMOTE). Feature importance was assessed using SHapley Additive exPlanations (SHAP). Eleven classifiers were benchmarked, among them XGBoost (XGB), CatBoost (CB), AdaBoost (AB), GradientBoosting (GB), and an Artificial Neural Network (ANN). XGB and CB achieved the best performance, each attaining an F1-score of 96%. These results highlight the robustness of the proposed framework and its potential for real-time deployment in VANETs to secure critical emergency communications.

Paperid: 2242, https://arxiv.org/pdf/2505.08728.pdf

Abstract:
Retrieval Augmented Generation (RAG) has emerged as the de facto industry standard for user-facing NLP applications, offering the ability to integrate data without re-training or fine-tuning Large Language Models (LLMs). This capability enhances the quality and accuracy of responses but also introduces novel security and privacy challenges, particularly when sensitive data is integrated. With the rapid adoption of RAG, securing data and services has become a critical priority. This paper first reviews the vulnerabilities of RAG pipelines, and outlines the attack surface from data pre-processing and data storage management to integration with LLMs. The identified risks are then paired with corresponding mitigations in a structured overview. In a second step, the paper develops a framework that combines RAG-specific security considerations, with existing general security guidelines, industry standards, and best practices. The proposed framework aims to guide the implementation of robust, compliant, secure, and trustworthy RAG systems.

Paperid: 2243, https://arxiv.org/pdf/2505.08088.pdf

Abstract:
Indoor positioning systems (IPSs) are increasingly vital for location-based services in complex multi-storey environments. This study proposes a novel graph-based approach for floor separation using Wi-Fi fingerprint trajectories, addressing the challenge of vertical localization in indoor settings. We construct a graph where nodes represent Wi-Fi fingerprints, and edges are weighted by signal similarity and contextual transitions. Node2Vec is employed to generate low-dimensional embeddings, which are subsequently clustered using K-means to identify distinct floors. Evaluated on the Huawei University Challenge 2021 dataset, our method outperforms traditional community detection algorithms, achieving an accuracy of 68.97\%, an F1-score of 61.99\%, and an Adjusted Rand Index of 57.19\%. By publicly releasing the preprocessed dataset and implementation code, this work contributes to advancing research in indoor positioning. The proposed approach demonstrates robustness to signal noise and architectural complexities, offering a scalable solution for floor-level localization.

Paperid: 2244, https://arxiv.org/pdf/2505.07584.pdf

Abstract:
The increasing deployment of large language models in security-sensitive domains necessitates rigorous evaluation of their resilience against adversarial prompt-based attacks. While previous benchmarks have focused on security evaluations with limited and predefined attack domains, such as cybersecurity attacks, they often lack a comprehensive assessment of intent-driven adversarial prompts and the consideration of real-life scenario-based multi-turn attacks. To address this gap, we present SecReEvalBench, the Security Resilience Evaluation Benchmark, which defines four novel metrics: Prompt Attack Resilience Score, Prompt Attack Refusal Logic Score, Chain-Based Attack Resilience Score and Chain-Based Attack Rejection Time Score. Moreover, SecReEvalBench employs six questioning sequences for model assessment: one-off attack, successive attack, successive reverse attack, alternative attack, sequential ascending attack with escalating threat levels and sequential descending attack with diminishing threat levels. In addition, we introduce a dataset customized for the benchmark, which incorporates both neutral and malicious prompts, categorised across seven security domains and sixteen attack techniques. In applying this benchmark, we systematically evaluate five state-of-the-art open-weighted large language models, Llama 3.1, Gemma 2, Mistral v0.3, DeepSeek-R1 and Qwen 3. Our findings offer critical insights into the strengths and weaknesses of modern large language models in defending against evolving adversarial threats. The SecReEvalBench dataset is publicly available at https://kaggle.com/datasets/5a7ee22cf9dab6c93b55a73f630f6c9b42e936351b0ae98fbae6ddaca7fe248d, which provides a groundwork for advancing research in large language model security.

Paperid: 2245, https://arxiv.org/pdf/2505.07329.pdf

Abstract:
Preserving data confidentiality during the fine-tuning of open-source Large Language Models (LLMs) is crucial for sensitive applications. This work introduces an interactive protocol adapting the Low-Rank Adaptation (LoRA) technique for private fine-tuning. Homomorphic Encryption (HE) protects the confidentiality of training data and gradients handled by remote worker nodes performing the bulk of computations involving the base model weights. The data owner orchestrates training, requiring minimal local computing power and memory, thus alleviating the need for expensive client-side GPUs. We demonstrate feasibility by fine-tuning a Llama-3.2-1B model, presenting convergence results using HE-compatible quantization and performance benchmarks for HE computations on GPU hardware. This approach enables applications such as confidential knowledge base question answering, private codebase fine-tuning for AI code assistants, AI agents for drafting emails based on a company's email archive, and adapting models to analyze sensitive legal or healthcare documents.

Paperid: 2246, https://arxiv.org/pdf/2505.07328.pdf

Abstract:
In contrast to its predecessors, 5G supports a wide range of commercial, industrial, and critical infrastructure scenarios. One key feature of 5G, ultra-reliable low latency communication, is particularly appealing to such scenarios for its real-time capabilities. However, 5G's enhanced security, mostly realized through optional security controls, imposes additional overhead on the network performance, potentially hindering its real-time capabilities. To better assess this impact and guide operators in choosing between different options, we measure the latency overhead of IPsec when applied over the N3 and the service-based interfaces to protect user and control plane data, respectively. Furthermore, we evaluate whether WireGuard constitutes an alternative to reduce this overhead. Our findings show that IPsec, if configured correctly, has minimal latency impact and thus is a prime candidate to secure real-time critical scenarios.

Paperid: 2247, https://arxiv.org/pdf/2505.06821.pdf

Abstract:
Current hardware security verification processes predominantly rely on manual threat modeling and test plan generation, which are labor-intensive, error-prone, and struggle to scale with increasing design complexity and evolving attack methodologies. To address these challenges, we propose ThreatLens, an LLM-driven multi-agent framework that automates security threat modeling and test plan generation for hardware security verification. ThreatLens integrates retrieval-augmented generation (RAG) to extract relevant security knowledge, LLM-powered reasoning for threat assessment, and interactive user feedback to ensure the generation of practical test plans. By automating these processes, the framework reduces the manual verification effort, enhances coverage, and ensures a structured, adaptable approach to security verification. We evaluated our framework on the NEORV32 SoC, demonstrating its capability to automate security verification through structured test plans and validating its effectiveness in real-world scenarios.

Paperid: 2248, https://arxiv.org/pdf/2505.06477.pdf

Abstract:
Safety-critical applications such as healthcare and autonomous vehicles use deep neural networks (DNN) to make predictions and infer decisions. DNNs are susceptible to evasion attacks, where an adversary crafts a malicious data instance to trick the DNN into making wrong decisions at inference time. Existing defenses that protect DNNs against evasion attacks are either static or dynamic. Static defenses are computationally efficient but do not adapt to the evolving threat landscape, while dynamic defenses are adaptable but suffer from an increased computational overhead. To combine the best of both worlds, in this paper, we propose a novel risk profiling framework that uses a risk-aware strategy to selectively train static defenses using victim instances that exhibit the most resilient features and are hence more resilient against an evasion attack. We hypothesize that training existing defenses on instances that are less vulnerable to the attack enhances the adversarial detection rate by reducing false negatives. We evaluate the efficacy of our risk-aware selective training strategy on a blood glucose management system that demonstrates how training static anomaly detectors indiscriminately may result in an increased false negative rate, which could be life-threatening in safety-critical applications. Our experiments show that selective training on the less vulnerable patients achieves a recall increase of up to 27.5\% with minimal impact on precision compared to indiscriminate training.

Paperid: 2249, https://arxiv.org/pdf/2505.06364.pdf

Abstract:
Analog and mixed-signal (A/MS) integrated circuits (ICs) are integral to safety-critical applications. However, the globalization and outsourcing of A/MS ICs to untrusted third-party foundries expose them to security threats, particularly analog Trojans. Unlike digital Trojans which have been extensively studied, analog Trojans remain largely unexplored. There has been only limited research on their diversity and stealth in analog designs, where a Trojan is activated only during a narrow input voltage range. Effective defense techniques require a clear understanding of the attack vectors; however, the lack of diverse analog Trojan instances limits robust advances in detection strategies. To address this gap, we present LATENT, the first large language model (LLM)-driven framework for crafting stealthy, circuit-specific analog Trojans. LATENT incorporates LLM as an autonomous agent to intelligently insert and refine Trojan components within analog designs based on iterative feedback from a detection model. This feedback loop ensures that the inserted Trojans remain stealthy while successfully evading detection. Experimental results demonstrate that our generated Trojan designs exhibit an average Trojan-activation range of 15.74%, ensuring they remain inactive under most operating voltages, while causing a significant performance degradation of 11.3% upon activation.

Paperid: 2250, https://arxiv.org/pdf/2505.06335.pdf

Abstract:
Federated Learning (FL) has the potential for simultaneous global learning amongst a large number of parallel agents, enabling emerging AI such as LLMs to be trained across demographically diverse data. Central to this being efficient is the ability for FL to perform sparse gradient updates and remote direct memory access at the central server. Most of the research in FL security focuses on protecting data privacy at the edge client or in the communication channels between the client and server. Client-facing attacks on the server are less well investigated as the assumption is that a large collective of clients offer resilience. Here, we show that by attacking certain clients that lead to a high frequency repetitive memory update in the server, we can remote initiate a rowhammer attack on the server memory. For the first time, we do not need backdoor access to the server, and a reinforcement learning (RL) attacker can learn how to maximize server repetitive memory updates by manipulating the client's sensor observation. The consequence of the remote rowhammer attack is that we are able to achieve bit flips, which can corrupt the server memory. We demonstrate the feasibility of our attack using a large-scale FL automatic speech recognition (ASR) systems with sparse updates, our adversarial attacking agent can achieve around 70\% repeated update rate (RUR) in the targeted server model, effectively inducing bit flips on server DRAM. The security implications are that can cause disruptions to learning or may inadvertently cause elevated privilege. This paves the way for further research on practical mitigation strategies in FL and hardware design.

Paperid: 2251, https://arxiv.org/pdf/2505.06174.pdf

Abstract:
Algebraic Manipulation Detection (AMD) codes is a cryptographic primitive that was introduced by Cramer, Dodis, Fehr, Padro and Wichs. They are keyless message authentication codes that protect messages against additive tampering by the adversary assuming that the adversary cannot "see" the codeword. For certain applications, it is unreasonable to assume that the adversary computes the added offset without any knowledge of the codeword c. Recently, Ahmadi and Safavi-Naini, and then Lin, Safavi-Naini, and Wang gave a construction of leakage-resilient AMD codes where the adversary has some partial information about the codeword before choosing added offset, and the scheme is secure even conditioned on this partial information. In this paper we establish bounds on the leakage rate r and the code rate k for leakage-resilient AMD codes. In particular we prove that 2r + k < 1 and for the weak case (security is averaged over a uniformly random message) r + k < 1. These bounds hold even if adversary is polynomial-time bounded, as long as we allow leakage function to be arbitrary. We present constructions of AMD codes that (asymptotically) fulfill the above bounds for almost full range of parameters r and k. This shows that the above bounds and constructions are in-fact optimal. In the last section we show that if a leakage function is computationally bounded (we use the Ideal Cipher Model) then it is possible to break these bounds.

Paperid: 2252, https://arxiv.org/pdf/2505.05100.pdf

Abstract:
The intersection of blockchain (distributed ledger) and identity management lacks a comprehensive framework for classifying distributed-ledger-based identity solutions. This paper introduces a methodologically developed taxonomy derived from the analysis of 390 scientific papers and expert discussions. The resulting framework consists of 22 dimensions with 113 characteristics, organized into three groups: trust anchor implementations, identity architectures (identifiers and credentials), and ledger specifications. This taxonomy facilitates the systematic analysis, comparison, and design of distributed-ledger-based identity solutions, as demonstrated through its application to two distinct architectures. As the first methodology-driven taxonomy in this field, this work advances standardization and enhances understanding of distributed-ledger-based identity architectures. It provides researchers and practitioners with a structured framework for evaluating design decisions and implementation approaches.

Paperid: 2253, https://arxiv.org/pdf/2505.04101.pdf

Abstract:
Artificial Intelligence (AI) is expected to be an integral part of next-generation AI-native 6G networks. With the prevalence of AI, researchers have identified numerous use cases of AI in network security. However, there are almost nonexistent studies that analyze the suitability of Large Language Models (LLMs) in network security. To fill this gap, we examine the suitability of LLMs in network security, particularly with the case study of STRIDE threat modeling. We utilize four prompting techniques with five LLMs to perform STRIDE classification of 5G threats. From our evaluation results, we point out key findings and detailed insights along with the explanation of the possible underlying factors influencing the behavior of LLMs in the modeling of certain threats. The numerical results and the insights support the necessity for adjusting and fine-tuning LLMs for network security use cases.

Paperid: 2254, https://arxiv.org/pdf/2505.02464.pdf

Abstract:
Rust is a promising programming language that focuses on concurrency, usability, and security. It is used in production code by major industry players and got recommended by government bodies. Rust provides strong security guarantees achieved by design utilizing the concepts of ownership and borrowing. However, Rust allows programmers to write unsafe code which is not subject to the strict Rust security policy. Empirical studies show that security issues in practice always involve code written in unsafe Rust. In this paper, we present the first approach that utilizes selective code coverage feedback to focus the fuzzing efforts on unsafe Rust code. Our approach significantly improves the efficiency when fuzzing Rust programs and does not require additional computational resources while fuzz testing the target. To quantify the impact of partial code instrumentation, we implement our approach by extending the capabilities of the Rust compiler toolchain. We present an automated approach to detect unsafe and safe code components to decide which parts of the program a fuzzer should focus on when running a fuzzing campaign to find vulnerabilities in Rust programs. Our approach is fully compatible with existing fuzzing implementations and does not require complex manual work, thus retaining the existing high usability standard. Focusing on unsafe code, our implementation allows us to generate inputs that trigger more unsafe code locations with statistical significance and therefore is able to detect potential vulnerabilities in a shorter time span while imposing no performance overhead during fuzzing itself.

Paperid: 2255, https://arxiv.org/pdf/2505.00817.pdf

Abstract:
Side-channel attacks on shared hardware resources increasingly threaten confidentiality, especially with the rise of Large Language Models (LLMs). In this work, we introduce Spill The Beans, a novel application of cache side-channels to leak tokens generated by an LLM. By co-locating an attack process on the same hardware as the victim model, we flush and reload embedding vectors from the embedding layer, where each token corresponds to a unique embedding vector. When accessed during token generation, it results in a cache hit detectable by our attack on shared lower-level caches. A significant challenge is the massive size of LLMs, which, by nature of their compute intensive operation, quickly evicts embedding vectors from the cache. We address this by balancing the number of tokens monitored against the amount of information leaked. Monitoring more tokens increases potential vocabulary leakage but raises the chance of missing cache hits due to eviction; monitoring fewer tokens improves detection reliability but limits vocabulary coverage. Through extensive experimentation, we demonstrate the feasibility of leaking tokens from LLMs via cache side-channels. Our findings reveal a new vulnerability in LLM deployments, highlighting that even sophisticated models are susceptible to traditional side-channel attacks. We discuss the implications for privacy and security in LLM-serving infrastructures and suggest considerations for mitigating such threats. For proof of concept we consider two concrete attack scenarios: Our experiments show that an attacker can recover as much as 80%-90% of a high entropy API key with single shot monitoring. As for English text we can reach a 40% recovery rate with a single shot. We should note that the rate highly depends on the monitored token set and these rates can be improved by targeting more specialized output domains.

Paperid: 2256, https://arxiv.org/pdf/2505.00206.pdf

Abstract:
In the $k$-Orthogonal Vectors ($k$-OV) problem we are given $k$ sets, each containing $n$ binary vectors of dimension $d=n^{o(1)}$, and our goal is to pick one vector from each set so that at each coordinate at least one vector has a zero. It is a central problem in fine-grained complexity, conjectured to require $n^{k-o(1)}$ time in the worst case. We propose a way to \emph{plant} a solution among vectors with i.i.d. $p$-biased entries, for appropriately chosen $p$, so that the planted solution is the unique one. Our conjecture is that the resulting $k$-OV instances still require time $n^{k-o(1)}$ to solve, \emph{on average}. Our planted distribution has the property that any subset of strictly less than $k$ vectors has the \emph{same} marginal distribution as in the model distribution, consisting of i.i.d. $p$-biased random vectors. We use this property to give average-case search-to-decision reductions for $k$-OV.

Paperid: 2257, https://arxiv.org/pdf/2504.21342.pdf

Abstract:
The performance of any elliptic curve cryptography hardware accelerator significantly relies on the efficiency of the underlying point multiplication (PM) architecture. This article presents a hardware implementation of field-programmable gate array (FPGA) based modular arithmetic, group operation, and point multiplication unit on the twisted Edwards curve (Edwards25519) over the 256-bit prime field. An original hardware architecture of a unified point operation module in projective coordinates that executes point addition and point doubling within a single module has been developed, taking only 646 clock cycles and ensuring a better security level than conventional approaches. The proposed point multiplication module consumes 1.4 ms time, operating at a maximal clock frequency of 117.8 MHz utilising 164,730 clock cycles having 183.38 kbps throughput on the Xilinx Virtex-5 FPGA platform for 256-bit length of key. The comparative assessment of latency and throughput across various related recent works indicates the effectiveness of our proposed PM architecture. Finally, this high throughput and low latency PM architecture will be a good candidate for rapid data encryption in high-speed wireless communication networks.

Paperid: 2258, https://arxiv.org/pdf/2504.20726.pdf

Abstract:
Public vulnerability databases, such as the National Vulnerability Database (NVD), document vulnerabilities and facilitate threat information sharing. However, they often suffer from short descriptions and outdated or insufficient information. In this paper, we introduce Zad, a system designed to enrich NVD vulnerability descriptions by leveraging external resources. Zad consists of two pipelines: one collects and filters supplementary data using two encoders to build a detailed dataset, while the other fine-tunes a pre-trained model on this dataset to generate enriched descriptions. By addressing brevity and improving content quality, Zad produces more comprehensive and cohesive vulnerability descriptions. We evaluate Zad using standard summarization metrics and human assessments, demonstrating its effectiveness in enhancing vulnerability information.

Paperid: 2259, https://arxiv.org/pdf/2504.19274.pdf

Abstract:
Verification of the integrity of deep learning inference is crucial for understanding whether a model is being applied correctly. However, such verification typically requires access to model weights and (potentially sensitive or private) training data. So-called Zero-knowledge Succinct Non-Interactive Arguments of Knowledge (ZK-SNARKs) would appear to provide the capability to verify model inference without access to such sensitive data. However, applying ZK-SNARKs to modern neural networks, such as transformers and large vision models, introduces significant computational overhead. We present TeleSparse, a ZK-friendly post-processing mechanisms to produce practical solutions to this problem. TeleSparse tackles two fundamental challenges inherent in applying ZK-SNARKs to modern neural networks: (1) Reducing circuit constraints: Over-parameterized models result in numerous constraints for ZK-SNARK verification, driving up memory and proof generation costs. We address this by applying sparsification to neural network models, enhancing proof efficiency without compromising accuracy or security. (2) Minimizing the size of lookup tables required for non-linear functions, by optimizing activation ranges through neural teleportation, a novel adaptation for narrowing activation functions' range. TeleSparse reduces prover memory usage by 67% and proof generation time by 46% on the same model, with an accuracy trade-off of approximately 1%. We implement our framework using the Halo2 proving system and demonstrate its effectiveness across multiple architectures (Vision-transformer, ResNet, MobileNet) and datasets (ImageNet,CIFAR-10,CIFAR-100). This work opens new directions for ZK-friendly model design, moving toward scalable, resource-efficient verifiable deep learning.

Paperid: 2260, https://arxiv.org/pdf/2504.19055.pdf

Abstract:
The development of machine learning techniques for discovering software vulnerabilities relies fundamentally on the availability of appropriate datasets. The ideal dataset consists of a large and diverse collection of real-world vulnerabilities, paired so as to contain both vulnerable and patched versions of each program. Naturally, collecting such datasets is a laborious and time-consuming task. Within the specific domain of vulnerability discovery in binary code, previous datasets are either publicly unavailable, lack semantic diversity, involve artificially introduced vulnerabilities, or were collected using static analyzers, thereby themselves containing incorrectly labeled example programs. In this paper, we describe a new publicly available dataset which we dubbed Binpool, containing numerous samples of vulnerable versions of Debian packages across the years. The dataset was automatically curated, and contains both vulnerable and patched versions of each program, compiled at four different optimization levels. Overall, the dataset covers 603 distinct CVEs across 89 CWE classes, 162 Debian packages, and contains 6144 binaries. We argue that this dataset is suitable for evaluating a range of security analysis tools, including for vulnerability discovery, binary function similarity, and plagiarism detection.

Paperid: 2261, https://arxiv.org/pdf/2504.17878.pdf

Abstract:
In the looming post-quantum era, traditional cryptographic systems are increasingly vulnerable to quantum computing attacks that can compromise their mathematical foundations. To address this critical challenge, we propose crypto-ncRNA-a bio-convergent cryptographic framework that leverages the dynamic folding properties of non-coding RNA (ncRNA) to generate high-entropy, quantum-resistant keys and produce unpredictable ciphertexts. The framework employs a novel, multi-stage process: encoding plaintext into RNA sequences, predicting and manipulating RNA secondary structures using advanced algorithms, and deriving cryptographic keys through the intrinsic physical unclonability of RNA molecules. Experimental evaluations indicate that, although crypto-ncRNA's encryption speed is marginally lower than that of AES, it significantly outperforms RSA in terms of efficiency and scalability while achieving a 100% pass rate on the NIST SP 800-22 randomness tests. These results demonstrate that crypto-ncRNA offers a promising and robust approach for securing digital infrastructures against the evolving threats posed by quantum computing.

Paperid: 2262, https://arxiv.org/pdf/2504.15695.pdf

Abstract:
Software ecosystems built around programming languages have greatly facilitated software development. At the same time, their security has increasingly been acknowledged as a problem. To this end, the paper examines the previously overlooked longitudinal aspects of software ecosystem security, focusing on malware uploaded to six popular programming language ecosystems. The dataset examined is based on the new Open Source Vulnerabilities (OSV) database. According to the results, records about detected malware uploads in the database have recently surpassed those addressing vulnerabilities in packages distributed in the ecosystems. In the early 2025 even up to 80% of all entries in the OSV have been about malware. Regarding time series analysis of malware frequencies and their shares to all database entries, good predictions are available already by relatively simple autoregressive models using the numbers of ecosystems, security advisories, and media and other articles as predictors. With these results and the accompanying discussion, the paper improves and advances the understanding of the thus far overlooked longitudinal aspects of ecosystems and malware.

Paperid: 2263, https://arxiv.org/pdf/2504.15592.pdf

Abstract:
We report empirical evidence of web defacement and DDoS attacks carried out by low-level cybercrime actors in the Israel-Gaza conflict. Our quantitative measurements indicate an immediate increase in such cyberattacks following the Hamas-led assault and the subsequent declaration of war. However, the surges waned quickly after a few weeks, with patterns resembling those observed in the aftermath of the Russian invasion of Ukraine. The scale of attacks and discussions within the hacking community this time was both significantly lower than those during the early days of the Russia-Ukraine war, and attacks have been prominently one-sided: many pro-Palestinian supporters have targeted Israel, while attacks on Palestine have been much less significant. Beyond targeting these two, attackers also defaced sites of other countries to express their war support. Their broader opinions are also largely disparate, with far more support for Palestine and many objections expressed toward Israel.

Paperid: 2264, https://arxiv.org/pdf/2504.14381.pdf

Abstract:
Publicly verifiable secret sharing (PVSS) allows a dealer to share a secret among a set of shareholders so that the secret can be reconstructed later from any set of qualified participants. In addition, any public verifier should be able to check the correctness of the sharing and reconstruction process. PVSS has been demonstrated to yield various applications, such as e-voting, distributed key generation, decentralized random number generation protocols, and multi-party computation. Although many concrete PVSS protocols have been proposed, their security is either proven in the random oracle model or relies on quantum-vulnerable assumptions such as factoring or discrete logarithm. In this work, we put forward a generic construction for PVSS that can be instantiated in the standard model under the Learning With Errors (LWE) assumption. Our instantiation provides the first post-quantum PVSS in the standard model, with a reasonable level of asymptotic efficiency.

Paperid: 2265, https://arxiv.org/pdf/2504.12733.pdf

Abstract:
This paper presents an adversary model and a simulation framework specifically tailored for analyzing attacks on distributed systems composed of multiple distributed protocols, with a focus on assessing the security of blockchain networks. Our model classifies and constrains adversarial actions based on the assumptions of the target protocols, defined by failure models, communication models, and the fault tolerance thresholds of Byzantine Fault Tolerant (BFT) protocols. The goal is to study not only the intended effects of adversarial strategies but also their unintended side effects on critical system properties. We apply this framework to analyze fairness properties in a Hyperledger Fabric (HF) blockchain network. Our focus is on novel fairness attacks that involve coordinated adversarial actions across various HF services. Simulations show that even a constrained adversary can violate fairness with respect to specific clients (client fairness) and impact related guarantees (order fairness), which relate the reception order of transactions to their final order in the blockchain. This paper significantly extends our previous work by introducing and evaluating a mitigation mechanism specifically designed to counter transaction reordering attacks. We implement and integrate this defense into our simulation environment, demonstrating its effectiveness under diverse conditions.

Paperid: 2266, https://arxiv.org/pdf/2504.11867.pdf

Abstract:
The integration of intelligent and connected technologies in modern vehicles, while offering enhanced functionalities through Electronic Control Unit (ECU) and interfaces like OBD-II and telematics, also exposes the vehicle's in-vehicle network (IVN) to potential cyberattacks. Unlike prior work, we identify a new time-exciting threat model against IVN. These attacks inject malicious messages that exhibit a time-exciting effect, gradually manipulating network traffic to disrupt vehicle operations and compromise safety-critical functions. We systematically analyze the characteristics of the threat: dynamism, time-exciting impact, and low prior knowledge dependency. To validate its practicality, we replicate the attack on a real Advanced Driver Assistance System via Controller Area Network (CAN), exploiting Unified Diagnostic Service vulnerabilities and proposing four attack strategies. While CAN's integrity checks mitigate attacks, Ethernet migration (e.g., DoIP/SOME/IP) introduces new surfaces. We further investigate the feasibility of time-exciting threat under SOME/IP. To detect time-exciting threat, we introduce MDHP-Net, leveraging Multi-Dimentional Hawkes Process (MDHP) and temporal and message-wise feature extracting structures. Meanwhile, to estimate MDHP parameters, we developed the first GPU-optimized gradient descent solver for MDHP (MDHP-GDS). These modules significantly improves the detection rate under time-exciting attacks in multi-ECU IVN system. To address data scarcity, we release STEIA9, the first open-source dataset for time-exciting attacks, covering 9 Ethernet-based attack scenarios. Extensive experiments on STEIA9 (9 attack scenarios) show MDHP-Net outperforms 3 baselines, confirming attack feasibility and detection efficacy.

Paperid: 2267, https://arxiv.org/pdf/2504.11860.pdf

Abstract:
The recent proliferation of blockchain-based decentralized applications (DApp) has catalyzed transformative advancements in distributed systems, with extensive deployments observed across financial, entertainment, media, and cybersecurity domains. These trustless architectures, characterized by their decentralized nature and elimination of third-party intermediaries, have garnered substantial institutional attention. Consequently, the escalating security challenges confronting DApp demand rigorous scholarly investigation. This study initiates with a systematic analysis of behavioral patterns derived from empirical DApp datasets, establishing foundational insights for subsequent methodological developments. The principal security vulnerabilities in Ethereum-based smart contracts developed via Solidity are then critically examined. Specifically, reentrancy vulnerability attacks are addressed by formally representing contract logic using highly expressive code fragments. This enables precise source code-level detection via bidirectional long short-term memory networks with attention mechanisms (BLSTM-ATT). Regarding privacy preservation challenges, contemporary solutions are evaluated through dual analytical lenses: identity privacy preservation and transaction anonymity enhancement, while proposing future research trajectories in cryptographic obfuscation techniques.

Paperid: 2268, https://arxiv.org/pdf/2504.11168.pdf

Abstract:
Large Language Models (LLMs) guardrail systems are designed to protect against prompt injection and jailbreak attacks. However, they remain vulnerable to evasion techniques. We demonstrate two approaches for bypassing LLM prompt injection and jailbreak detection systems via traditional character injection methods and algorithmic Adversarial Machine Learning (AML) evasion techniques. Through testing against six prominent protection systems, including Microsoft's Azure Prompt Shield and Meta's Prompt Guard, we show that both methods can be used to evade detection while maintaining adversarial utility achieving in some instances up to 100% evasion success. Furthermore, we demonstrate that adversaries can enhance Attack Success Rates (ASR) against black-box targets by leveraging word importance ranking computed by offline white-box models. Our findings reveal vulnerabilities within current LLM protection mechanisms and highlight the need for more robust guardrail systems.

Paperid: 2269, https://arxiv.org/pdf/2504.10347.pdf

Abstract:
We propose a novel covert communication system in which a ground user, Alice, transmits unauthorized message fragments to Bob, a low-Earth orbit satellite (LEO), and an unmanned aerial vehicle (UAV) warden (Willie) attempts to detect these transmissions. The key contribution is modeling a scenario where Alice and Willie are unaware of each other's exact locations and move randomly within a specific area. Alice utilizes environmental obstructions to avoid detection and only transmits when the satellite is directly overhead. LEO satellite technology allows users to avoid transmitting messages near a base station. We introduce two key performance metrics: catch probability (Willie detects and locates Alice during a message chunk transmission) and overall catch probability over multiple message chunks. We analyze how two parameters impact these metrics: 1) the size of the detection window and 2) the number of message chunks. The paper proposes two algorithms to optimize these parameters. The simulation results show that the algorithms effectively reduce the detection risks. This work advances the understanding of covert communication under mobility and uncertainty in satellite-aided systems.

Paperid: 2270, https://arxiv.org/pdf/2504.09248.pdf

Abstract:
In this paper, we propose methods to encrypted a pre-given dynamic controller with homomorphic encryption, without re-encrypting the control inputs. We first present a preliminary result showing that the coefficients in a pre-given dynamic controller can be scaled up into integers by the zooming-in factor in dynamic quantization, without utilizing re-encryption. However, a sufficiently small zooming-in factor may not always exist because it requires that the convergence speed of the pre-given closed-loop system should be sufficiently fast. Then, as the main result, we design a new controller approximating the pre-given dynamic controller, in which the zooming-in factor is decoupled from the convergence rate of the pre-given closed-loop system. Therefore, there always exist a (sufficiently small) zooming-in factor of dynamic quantization scaling up all the controller's coefficients to integers, and a finite modulus preventing overflow in cryptosystems. The process is asymptotically stable and the quantizer is not saturated.

Paperid: 2271, https://arxiv.org/pdf/2504.09181.pdf

Abstract:
The application of Bitcoin enables people to understand blockchain technology gradually. Bitcoin is a decentralized currency that does not rely on third-party credit institutions, and the core of Bitcoin's underlying technology is blockchain. With the increasing value of Bitcoin and the vigorous development of decentralization, people's research on blockchain is also increasing day by day. Today's blockchain technology has not only made great achievements in the application of Bitcoin, but has also been preliminarily applied in other fields, such as finance, medical treatment, the Internet of Things, and so on. However, with the initial application of blockchain technology on the Internet, the security of blockchain technology has also been widely concerned by people in the industry. For example, whether currency trading platforms, smart contracts, blockchain consensus mechanisms, and other technologies are vulnerable to attacks, and how we can defend against these attacks digitally and optimize the blockchain system is exactly the subject we want to study. For the security of appeal blockchain, this paper first analyzes the security threats faced by the application digital currency trading platform of the blockchain system, then analyzes the security problems of smart contract closely related to blockchain 2.0, and then analyzes and studies the security threats of blockchain public chain, consensus mechanism, and P2P. Finally, combined with the security problems at all levels of the blockchain system we analyze and study how to optimize the security of the blockchain system.

Paperid: 2272, https://arxiv.org/pdf/2504.08951.pdf

Abstract:
The modern power grid increasingly depends on advanced information and communication technology (ICT) systems to enhance performance and reliability through real-time monitoring, intelligent control, and bidirectional communication. However, ICT integration also exposes the grid to cyber-threats. Load altering attacks (LAAs), which use botnets of high-wattage devices to manipulate load profiles, are a notable threat to grid stability. While previous research has examined LAAs, their specific impact on load frequency control (LFC), critical for maintaining nominal frequency during load fluctuations, still needs to be explored. Even minor frequency deviations can jeopardize grid operations. This study bridges the gap by analyzing LAA effects on LFC through simulations of static and dynamic scenarios using Python and RTDS. The results highlight LAA impacts on frequency stability and present an eigenvalue-based stability assessment for dynamic LAAs (DLAAs), identifying key parameters influencing grid resilience.

Paperid: 2273, https://arxiv.org/pdf/2504.08848.pdf

Abstract:
Large Language Models (LLMs) have rapidly become integral to numerous applications in critical domains where reliability is paramount. Despite significant advances in safety frameworks and guardrails, current protective measures exhibit crucial vulnerabilities, particularly in multilingual contexts. Existing safety systems remain susceptible to adversarial attacks in low-resource languages and through code-switching techniques, primarily due to their English-centric design. Furthermore, the development of effective multilingual guardrails is constrained by the scarcity of diverse cross-lingual training data. Even recent solutions like Llama Guard-3, while offering multilingual support, lack transparency in their decision-making processes. We address these challenges by introducing X-Guard agent, a transparent multilingual safety agent designed to provide content moderation across diverse linguistic contexts. X-Guard effectively defends against both conventional low-resource language attacks and sophisticated code-switching attacks. Our approach includes: curating and enhancing multiple open-source safety datasets with explicit evaluation rationales; employing a jury of judges methodology to mitigate individual judge LLM provider biases; creating a comprehensive multilingual safety dataset spanning 132 languages with 5 million data points; and developing a two-stage architecture combining a custom-finetuned mBART-50 translation module with an evaluation X-Guard 3B model trained through supervised finetuning and GRPO training. Our empirical evaluations demonstrate X-Guard's effectiveness in detecting unsafe content across multiple languages while maintaining transparency throughout the safety evaluation process. Our work represents a significant advancement in creating robust, transparent, and linguistically inclusive safety systems for LLMs and its integrated systems.

Paperid: 2274, https://arxiv.org/pdf/2504.08751.pdf

Abstract:
With the rapid development of short video platforms, recommendation systems have become key technologies for improving user experience and enhancing platform engagement. However, while short video recommendation systems leverage multimodal information (such as images, text, and audio) to improve recommendation effectiveness, they also face the severe challenge of user privacy leakage. This paper proposes a short video recommendation system based on multimodal information and differential privacy protection. First, deep learning models are used for feature extraction and fusion of multimodal data, effectively improving recommendation accuracy. Then, a differential privacy protection mechanism suitable for recommendation scenarios is designed to ensure user data privacy while maintaining system performance. Experimental results show that the proposed method outperforms existing mainstream approaches in terms of recommendation accuracy, multimodal fusion effectiveness, and privacy protection performance, providing important insights for the design of recommendation systems for short video platforms.

Paperid: 2275, https://arxiv.org/pdf/2504.07323.pdf

Abstract:
WhatsApp, the world's largest messaging application, uses a version of the Signal protocol to provide end-to-end encryption (E2EE) with strong security guarantees, including Perfect Forward Secrecy (PFS). To ensure PFS right from the start of a new conversation -- even when the recipient is offline -- a stash of ephemeral (one-time) prekeys must be stored on a server. While the critical role of these one-time prekeys in achieving PFS has been outlined in the Signal specification, we are the first to demonstrate a targeted depletion attack against them on individual WhatsApp user devices. Our findings not only reveal an attack that can degrade PFS for certain messages, but also expose inherent privacy risks and serious availability implications arising from the refilling and distribution procedure essential for this security mechanism.

Paperid: 2276, https://arxiv.org/pdf/2504.06833.pdf

Abstract:
The implementation of security protocols often combines different languages. This practice, however, poses a challenge to traditional verification techniques, which typically assume a single-language environment and, therefore, are insufficient to handle challenges presented by the interplay of different languages. To address this issue, we establish principles for combining multiple programming languages operating on different atomic types using a symbolic execution semantics. This facilitates the (parallel) composition of labeled transition systems, improving the analysis of complex systems by streamlining communication between diverse programming languages. By treating the Dolev-Yao (DY) model as a symbolic abstraction, our approach eliminates the need for translation between different base types, such as bitstrings and DY terms. Our technique provides a foundation for securing interactions in multi-language environments, enhancing program verification and system analysis in complex, interconnected systems.

Paperid: 2277, https://arxiv.org/pdf/2504.06712.pdf

Abstract:
The Internet of Things (IoT) has rapidly expanded across various sectors, with consumer IoT devices - such as smart thermostats and security cameras - experiencing growth. Although these devices improve efficiency and promise additional comfort, they also introduce new security challenges. Common and easy-to-explore vulnerabilities make IoT devices prime targets for malicious actors. Upcoming mandatory security certifications offer a promising way to mitigate these risks by enforcing best practices and providing transparency. Regulatory bodies are developing IoT security frameworks, but a universal standard for large-scale systematic security assessment is lacking. Existing manual testing approaches are expensive, limiting their efficacy in the diverse and rapidly evolving IoT domain. This paper reviews current IoT security challenges and assessment efforts, identifies gaps, and proposes a roadmap for scalable, automated security assessment, leveraging a model-based testing approach and machine learning techniques to strengthen consumer IoT security.

Paperid: 2278, https://arxiv.org/pdf/2504.05968.pdf

Abstract:
Smart contracts are a secure and trustworthy application that plays a vital role in decentralized applications in various fields such as insurance,the internet, and gaming. However, in recent years, smart contract security breaches have occurred frequently, and due to their financial properties, they have caused huge economic losses, such as the most famous security incident "The DAO" which caused a loss of over $60 million in Ethereum. This has drawn a lot of attention from all sides. Writing a secure smart contract is now a critical issue. This paper focuses on Ether smart contracts and explains the main components of Ether, smart contract architecture and mechanism. The environment used in this paper is the Ethernet environment, using remix online compilation platform and Solidity language, according to the four security events of American Chain, The DAO, Parity and KotET, the principles of integer overflow attack, reentrant attack, access control attack and denial of service attack are studied and analyzed accordingly, and the scenarios of these vulnerabilities are reproduced, and the measures to prevent them are given. Finally, preventive measures are given. In addition, the principles of short address attack, early transaction attack and privileged function exposure attack are also introduced in detail, and security measures are proposed. As vulnerabilities continue to emerge, their classification will also evolve. The analysis and research of the current vulnerabilities are also to lay a solid foundation for avoiding more vulnerabilities.

Paperid: 2279, https://arxiv.org/pdf/2504.05147.pdf

Abstract:
The rise of large language models (LLMs) has introduced new privacy challenges, particularly during inference where sensitive information in prompts may be exposed to proprietary LLM APIs. In this paper, we address the problem of formally protecting the sensitive information contained in a prompt while maintaining response quality. To this end, first, we introduce a cryptographically inspired notion of a prompt sanitizer which transforms an input prompt to protect its sensitive tokens. Second, we propose Pr$ÎµÎµ$mpt, a novel system that implements a prompt sanitizer. Pr$ÎµÎµ$mpt categorizes sensitive tokens into two types: (1) those where the LLM's response depends solely on the format (such as SSNs, credit card numbers), for which we use format-preserving encryption (FPE); and (2) those where the response depends on specific values, (such as age, salary) for which we apply metric differential privacy (mDP). Our evaluation demonstrates that Pr$ÎµÎµ$mpt is a practical method to achieve meaningful privacy guarantees, while maintaining high utility compared to unsanitized prompts, and outperforming prior methods

Paperid: 2280, https://arxiv.org/pdf/2504.04803.pdf

Abstract:
The modern software development landscape heavily relies on transitive dependencies. They enable seamless integration of third-party libraries. However, they also introduce security challenges. Transitive vulnerabilities that arise from indirect dependencies expose projects to risks associated with Common Vulnerabilities and Exposures (CVEs). It happens even when direct dependencies remain secure. This paper examines the lifecycle of transitive vulnerabilities in the Maven ecosystem. We employ survival analysis to measure the time projects remain exposed after a CVE is introduced. Using a large dataset of Maven projects, we identify factors that influence the resolution of these vulnerabilities. Our findings offer practical advice on improving dependency management.

Paperid: 2281, https://arxiv.org/pdf/2504.03863.pdf

Abstract:
The Common Vulnerabilities and Exposures (CVEs) system is a reference method for documenting publicly known information security weaknesses and exposures. This paper presents a study of the lifetime of CVEs in software projects and the risk factors affecting their existence. The study uses survival analysis to examine how features of programming languages, projects, and CVEs themselves impact the lifetime of CVEs. We suggest avenues for future research to investigate the effect of various factors on the resolution of vulnerabilities.

Paperid: 2282, https://arxiv.org/pdf/2504.02313.pdf

Abstract:
Cyber supply chain, encompassing digital asserts, software, hardware, has become an essential component of modern Information and Communications Technology (ICT) provisioning. However, the growing inter-dependencies have introduced numerous attack vectors, making supply chains a prime target for exploitation. In particular, advanced persistent threats (APTs) frequently leverage supply chain vulnerabilities (SCVs) as entry points, benefiting from their inherent stealth. Current defense strategies primarly focus on prevention through blockchain for integrity assurance or detection using plain-text source code analysis in open-source software (OSS). However, these approaches overlook scenarios where source code is unavailable and fail to address detection and defense during runtime. To bridge this gap, we propose a novel approach that integrates multi-source data, constructs a comprehensive dynamic provenance graph, and detects APT behavior in real time using temporal graph learning. Given the lack of tailored datasets in both industry and academia, we also aim to simulate a custom dataset by replaying real-world supply chain exploits with multi-source monitoring.

Paperid: 2283, https://arxiv.org/pdf/2504.00170.pdf

Abstract:
It is common practice to outsource the training of machine learning models to cloud providers. Clients who do so gain from the cloud's economies of scale, but implicitly assume trust: the server should not deviate from the client's training procedure. A malicious server may, for instance, seek to insert backdoors in the model. Detecting a backdoored model without prior knowledge of both the backdoor attack and its accompanying trigger remains a challenging problem. In this paper, we show that a client with access to multiple cloud providers can replicate a subset of training steps across multiple servers to detect deviation from the training procedure in a similar manner to differential testing. Assuming some cloud-provided servers are benign, we identify malicious servers by the substantial difference between model updates required for backdooring and those resulting from clean training. Perhaps the strongest advantage of our approach is its suitability to clients that have limited-to-no local compute capability to perform training; we leverage the existence of multiple cloud providers to identify malicious updates without expensive human labeling or heavy computation. We demonstrate the capabilities of our approach on an outsourced supervised learning task where $50\%$ of the cloud providers insert their own backdoor; our approach is able to correctly identify $99.6\%$ of them. In essence, our approach is successful because it replaces the signature-based paradigm taken by existing approaches with an anomaly-based detection paradigm. Furthermore, our approach is robust to several attacks from adaptive adversaries utilizing knowledge of our detection scheme.

Paperid: 2284, https://arxiv.org/pdf/2504.00018.pdf

Abstract:
While large language models (LLMs) are powerful assistants in programming tasks, they may also produce malicious code. Testing LLM-generated code therefore poses significant risks to assessment infrastructure tasked with executing untrusted code. To address these risks, this work focuses on evaluating the security and confidentiality properties of test environments, reducing the risk that LLM-generated code may compromise the assessment infrastructure. We introduce SandboxEval, a test suite featuring manually crafted test cases that simulate real-world safety scenarios for LLM assessment environments in the context of untrusted code execution. The suite evaluates vulnerabilities to sensitive information exposure, filesystem manipulation, external communication, and other potentially dangerous operations in the course of assessment activity. We demonstrate the utility of SandboxEval by deploying it on an open-source implementation of Dyff, an established AI assessment framework used to evaluate the safety of LLMs at scale. We show, first, that the test suite accurately describes limitations placed on an LLM operating under instructions to generate malicious code. Second, we show that the test results provide valuable insights for developers seeking to harden assessment infrastructure and identify risks associated with LLM execution activities.

Paperid: 2285, https://arxiv.org/pdf/2503.24037.pdf

Abstract:
Online disinformation often provokes strong anger, driving social media users to spread it; however, few measures specifically target sharing behaviors driven by this emotion to curb the spread of disinformation. This study aimed to evaluate whether digital nudges that encourage deliberation by drawing attention to emotional information can reduce sharing driven by strong anger associated with online disinformation. We focused on emotion regulation, as a method for fostering deliberation, which is activated when individuals' attention is drawn to their current emotions. Digital nudges were designed to display emotional information about disinformation and emotion regulation messages. Among these, we found that distraction and perspective-taking nudges may encourage deliberation in anger-driven sharing. To assess their effectiveness, existing nudges mimicking platform functions were used for comparison. Participant responses were measured across four dimensions: sharing intentions, type of emotion, intensity of emotion, and authenticity. The results showed that all digital nudges significantly reduced the sharing of disinformation, with distraction nudges being the most effective. These findings suggest that digital nudges addressing emotional responses can serve as an effective intervention against the spread disinformation driven by strong anger.

Paperid: 2286, https://arxiv.org/pdf/2503.23175.pdf

Abstract:
Several recent works have argued that Large Language Models (LLMs) can be used to tame the data deluge in the cybersecurity field, by improving the automation of Cyber Threat Intelligence (CTI) tasks. This work presents an evaluation methodology that other than allowing to test LLMs on CTI tasks when using zero-shot learning, few-shot learning and fine-tuning, also allows to quantify their consistency and their confidence level. We run experiments with three state-of-the-art LLMs and a dataset of 350 threat intelligence reports and present new evidence of potential security risks in relying on LLMs for CTI. We show how LLMs cannot guarantee sufficient performance on real-size reports while also being inconsistent and overconfident. Few-shot learning and fine-tuning only partially improve the results, thus posing doubts about the possibility of using LLMs for CTI scenarios, where labelled datasets are lacking and where confidence is a fundamental factor.

Paperid: 2287, https://arxiv.org/pdf/2503.22935.pdf

Abstract:
An upstream task for software bill-of-materials (SBOMs) is the accurate localization of the patch that fixes a vulnerability. Nevertheless, existing work reveals a significant gap in the CVEs whose patches exist but are not traceable. Existing works have proposed several approaches to trace/retrieve the patching commit for fixing a CVE. However, they suffer from two major challenges: (1) They cannot effectively handle long diff code of a commit; (2) We are not aware of existing work that scales to the full repository with satisfactory accuracy. Upon identifying this gap, we propose SITPatchTracer, a scalable and effective retrieval system for tracing known vulnerability patches. To handle the context length challenge, SITPatchTracer proposes a novel hierarchical embedding technique which efficiently extends the context coverage to 6x that of existing work while covering all files in the commit. To handle the scalability challenge, SITPatchTracer utilizes a three-phase framework, balancing the effectiveness/efficiency in each phase. The evaluation of SITPatchTracer demonstrates it outperforms existing patch tracing methods (PatchFinder, PatchScout, VFCFinder) by a large margin. Furthermore, SITPatchTracer outperforms VoyageAI, the SOTA commercial code embedding LLM (\$1.8 per 10K commits) on the MRR and Recall@10 by 18\% and 28\% on our two datasets. Using SITPatchTracer, we have successfully traced and merged the patch links for 35 new CVEs in the GitHub Advisory database Our ablation study reveals that hierarchical embedding is a practically effective way of handling long context for patch retrieval.

Paperid: 2288, https://arxiv.org/pdf/2503.22760.pdf

Abstract:
This paper explores the risk that a large language model (LLM) trained for code generation on data mined from software repositories will generate content that discloses sensitive information included in its training data. We decompose this risk, known in the literature as ``unintended memorization,'' into two components: unintentional disclosure (where an LLM presents secrets to users without the user seeking them out) and malicious disclosure (where an LLM presents secrets to an attacker equipped with partial knowledge of the training data). We observe that while existing work mostly anticipates malicious disclosure, unintentional disclosure is also a concern. We describe methods to assess unintentional and malicious disclosure risks side-by-side across different releases of training datasets and models. We demonstrate these methods through an independent assessment of the Open Language Model (OLMo) family of models and its Dolma training datasets. Our results show, first, that changes in data source and processing are associated with substantial changes in unintended memorization risk; second, that the same set of operational changes may increase one risk while mitigating another; and, third, that the risk of disclosing sensitive information varies not only by prompt strategies or test datasets but also by the types of sensitive information. These contributions rely on data mining to enable greater privacy and security testing required for the LLM training data supply chain.

Paperid: 2289, https://arxiv.org/pdf/2503.20310.pdf

Abstract:
Adversarial attacks in black-box settings are highly practical, with transfer-based attacks being the most effective at generating adversarial examples (AEs) that transfer from surrogate models to unseen target models. However, their performance significantly degrades when transferring across heterogeneous architectures -- such as CNNs, MLPs, and Vision Transformers (ViTs) -- due to fundamental architectural differences. To address this, we propose Feature Permutation Attack (FPA), a zero-FLOP, parameter-free method that enhances adversarial transferability across diverse architectures. FPA introduces a novel feature permutation (FP) operation, which rearranges pixel values in selected feature maps to simulate long-range dependencies, effectively making CNNs behave more like ViTs and MLPs. This enhances feature diversity and improves transferability both across heterogeneous architectures and within homogeneous CNNs. Extensive evaluations on 14 state-of-the-art architectures show that FPA achieves maximum absolute gains in attack success rates of 7.68% on CNNs, 14.57% on ViTs, and 14.48% on MLPs, outperforming existing black-box attacks. Additionally, FPA is highly generalizable and can seamlessly integrate with other transfer-based attacks to further boost their performance. Our findings establish FPA as a robust, efficient, and computationally lightweight strategy for enhancing adversarial transferability across heterogeneous architectures.

Paperid: 2290, https://arxiv.org/pdf/2503.19626.pdf

Abstract:
The progress of artificial intelligence (AI) has made sophisticated methods available for cyberattacks and red team activities. These AI attacks can automate the process of penetrating a target or collecting sensitive data. The new methods can also accelerate the execution of the attacks. This review article examines the use of AI technologies in cybersecurity attacks. It also tries to describe typical targets for such attacks. We employed a scoping review methodology to analyze articles and identify AI methods, targets, and models that red teams can utilize to simulate cybercrime. From the 470 records screened, 11 were included in the review. Various cyberattack methods were identified, targeting sensitive data, systems, social media profiles, passwords, and URLs. The application of AI in cybercrime to develop versatile attack models presents an increasing threat. Furthermore, AI-based techniques in red team use can provide new ways to address these issues.

Paperid: 2291, https://arxiv.org/pdf/2503.19338.pdf

Abstract:
As large-scale models such as Large Language Models (LLMs) and Large Multimodal Models (LMMs) see increasing deployment, their privacy risks remain underexplored. Membership Inference Attacks (MIAs), which reveal whether a data point was used in training the target model, are an important technique for exposing or assessing privacy risks and have been shown to be effective across diverse machine learning algorithms. However, despite extensive studies on MIAs in classic models, there remains a lack of systematic surveys addressing their effectiveness and limitations in large-scale models. To address this gap, we provide the first comprehensive review of MIAs targeting LLMs and LMMs, analyzing attacks by model type, adversarial knowledge, and strategy. Unlike prior surveys, we further examine MIAs across multiple stages of the model pipeline, including pre-training, fine-tuning, alignment, and Retrieval-Augmented Generation (RAG). Finally, we identify open challenges and propose future research directions for strengthening privacy resilience in large-scale models.

Paperid: 2292, https://arxiv.org/pdf/2503.18316.pdf

Abstract:
Advanced Persistent Threats (APTs) have caused significant losses across a wide range of sectors, including the theft of sensitive data and harm to system integrity. As attack techniques grow increasingly sophisticated and stealthy, the arms race between cyber defenders and attackers continues to intensify. The revolutionary impact of Large Language Models (LLMs) has opened up numerous opportunities in various fields, including cybersecurity. An intriguing question arises: can the extensive knowledge embedded in LLMs be harnessed for provenance analysis and play a positive role in identifying previously unknown malicious events? To seek a deeper understanding of this issue, we propose a new strategy for taking advantage of LLMs in provenance-based threat detection. In our design, the state-of-the-art LLM offers additional details in provenance data interpretation, leveraging their knowledge of system calls, software identity, and high-level understanding of application execution context. The advanced contextualized embedding capability is further utilized to capture the rich semantics of event descriptions. We comprehensively examine the quality of the resulting embeddings, and it turns out that they offer promising avenues. Subsequently, machine learning models built upon these embeddings demonstrated outstanding performance on real-world data. In our evaluation, supervised threat detection achieves a precision of 99.0%, and semi-supervised anomaly detection attains a precision of 96.9%.

Paperid: 2293, https://arxiv.org/pdf/2503.17867.pdf

Abstract:
Distributed Denial of Service attacks represent an active cybersecurity research problem. Recent research shifted from static rule-based defenses towards AI-based detection and mitigation. This comprehensive survey covers several key topics. Preeminently, state-of-the-art AI detection methods are discussed. An in-depth taxonomy based on manual expert hierarchies and an AI-generated dendrogram are provided, thus settling DDoS categorization ambiguities. An important discussion on available datasets follows, covering data format options and their role in training AI detection methods together with adversarial training and examples augmentation. Beyond detection, AI based mitigation techniques are surveyed as well. Finally, multiple open research directions are proposed.

Paperid: 2294, https://arxiv.org/pdf/2503.16640.pdf

Abstract:
Android applications collecting data from users must protect it according to the current legal frameworks. Such data protection has become even more important since in 2018 the European Union rolled out the General Data Protection Regulation (GDPR). Since app developers are not legal experts, they find it difficult to integrate privacy-aware practices into source code development. Despite these legal obligations, developers have limited tool support to reason about data protection throughout their app development process. This paper explores the use of static program slicing and software visualization to analyze privacy-relevant data flows in Android apps. We introduce SliceViz, a web tool that analyzes an Android app by slicing all privacy-relevant data sources detected in the source code on the back-end. It then helps developers by visualizing these privacy-relevant program slices. We conducted a user study with 12 participants demonstrating that SliceViz effectively aids developers in identifying privacy-relevant properties in Android apps. Our findings indicate that program slicing can be employed to identify and reason about privacy-relevant data flows in Android applications. With further usability improvements, developers can be better equipped to handle privacy-sensitive information.

Paperid: 2295, https://arxiv.org/pdf/2503.16287.pdf

Abstract:
The rapid development of low-Earth orbit (LEO) satellite constellations and satellite communication systems has elevated the importance of secure video transmission, which is the key to applications such as remote sensing, disaster relief, and secure information exchange. In this context, three serious issues arise concerning real-time encryption of videos on satellite embedded devices: (a) the challenge of achieving real-time performance; (b) the limitations posed by the constrained computing performance of satellite payloads; and (c) the potential for excessive power consumption leading to overheating, thereby escalating safety risks. To overcome these challenges, this study introduced a novel approach for encrypting videos by employing two 1D chaotic maps, which was deployed on a satellite for the first time. The experiment on the satellite confirms that our scheme is suitable for complex satellite environments. In addition, the proposed chaotic maps were implemented on a Field Programmable Gate Array (FPGA) platform, and simulation results showed consistency with those obtained on a Raspberry Pi. Experiments on the Raspberry Pi 4B demonstrate exceptional real-time performance and low power consumption, validating both the hardware feasibility and the stability of our design. Rigorous statistical testing also confirms the scheme's resilience against a variety of attacks, underscoring its potential for secure, real-time data transmission in satellite communication systems.

Paperid: 2296, https://arxiv.org/pdf/2503.15548.pdf

Abstract:
The widespread adoption of Retrieval-Augmented Generation (RAG) systems in real-world applications has heightened concerns about the confidentiality and integrity of their proprietary knowledge bases. These knowledge bases, which play a critical role in enhancing the generative capabilities of Large Language Models (LLMs), are increasingly vulnerable to breaches that could compromise sensitive information. To address these challenges, this paper proposes an advanced encryption methodology designed to protect RAG systems from unauthorized access and data leakage. Our approach encrypts both textual content and its corresponding embeddings prior to storage, ensuring that all data remains securely encrypted. This mechanism restricts access to authorized entities with the appropriate decryption keys, thereby significantly reducing the risk of unintended data exposure. Furthermore, we demonstrate that our encryption strategy preserves the performance and functionality of RAG pipelines, ensuring compatibility across diverse domains and applications. To validate the robustness of our method, we provide comprehensive security proofs that highlight its resilience against potential threats and vulnerabilities. These proofs also reveal limitations in existing approaches, which often lack robustness, adaptability, or reliance on open-source models. Our findings suggest that integrating advanced encryption techniques into the design and deployment of RAG systems can effectively enhance privacy safeguards. This research contributes to the ongoing discourse on improving security measures for AI-driven services and advocates for stricter data protection standards within RAG architectures.

Paperid: 2297, https://arxiv.org/pdf/2503.12625.pdf

Abstract:
Payment channel networks (PCNs) are a promising solution to address blockchain scalability and throughput challenges, However, the security of PCNs and their vulnerability to attacks are not sufficiently studied. In this paper, we introduce SCOOP, a framework that includes two novel congestion attacks on PCNs. These attacks consider the minimum transferable amount along a path (path capacity) and the number of channels involved (path length), formulated as linear optimization problems. The first attack allocates the attacker's budget to achieve a specific congestion threshold, while the second maximizes congestion under budget constraints. Simulation results show the effectiveness of the proposed attack formulations in comparison to other attack strategies. Specifically, the results indicate that the first attack provides around a 40\% improvement in congestion performance, while the second attack offers approximately a 50\% improvement in comparison to the state-of-the-art. Moreover, in terms of payment to congestion efficiency, the first attack is about 60\% more efficient, and the second attack is around 90\% more efficient in comparison to state-of-the-art

Paperid: 2298, https://arxiv.org/pdf/2503.11777.pdf

Abstract:
With the exponentially growing Internet traffic, sketch data structure with a probabilistic algorithm has been expected to be an alternative solution for non-compromised (non-selective) security monitoring. While facilitating counting within a confined memory space, the sketch's memory efficiency and accuracy were further pushed to their limit through finer-grained and dynamic control of constrained memory space to adapt to the data stream's inherent skewness (i.e., Zipf distribution), namely small counters with extensions. In this paper, we unveil a vulnerable factor of the small counter design by introducing a new sketch-oriented attack, which threatens a stream of state-of-the-art sketches and their security applications. With the root cause analyses, we propose Siamese Counter with enhanced adversarial resiliency and verified feasibility with extensive experimental and theoretical analyses. Under a sketch pollution attack, Siamese Counter delivers 47% accurate results than a state-of-the-art scheme, and demonstrates up to 82% more accurate estimation under normal measurement scenarios.

Paperid: 2299, https://arxiv.org/pdf/2503.10619.pdf

Abstract:
We introduce Tempest, a multi-turn adversarial framework that models the gradual erosion of Large Language Model (LLM) safety through a tree search perspective. Unlike single-turn jailbreaks that rely on one meticulously engineered prompt, Tempest expands the conversation at each turn in a breadth-first fashion, branching out multiple adversarial prompts that exploit partial compliance from previous responses. By tracking these incremental policy leaks and re-injecting them into subsequent queries, Tempest reveals how minor concessions can accumulate into fully disallowed outputs. Evaluations on the JailbreakBench dataset show that Tempest achieves a 100% success rate on GPT-3.5-turbo and 97% on GPT-4 in a single multi-turn run, using fewer queries than baselines such as Crescendo or GOAT. This tree search methodology offers an in-depth view of how model safeguards degrade over successive dialogue turns, underscoring the urgency of robust multi-turn testing procedures for language models.

Paperid: 2300, https://arxiv.org/pdf/2503.10320.pdf

Abstract:
Cellular Automata (CA) are commonly investigated as a particular type of dynamical systems, defined by shift-invariant local rules. In this paper, we consider instead CA as algebraic systems, focusing on the combinatorial designs induced by their short-term behavior. Specifically, we review the main results published in the literature concerning the construction of mutually orthogonal Latin squares via bipermutive CA, considering both the linear and nonlinear cases. We then survey some significant applications of these results to cryptography, and conclude with a discussion of open problems to be addressed in future research on CA-based combinatorial designs.

Paperid: 2301, https://arxiv.org/pdf/2503.09192.pdf

Abstract:
Personalized federated learning is extensively utilized in scenarios characterized by data heterogeneity, facilitating more efficient and automated local training on data-owning terminals. This includes the automated selection of high-performance model parameters for upload, thereby enhancing the overall training process. However, it entails significant risks of privacy leakage. Existing studies have attempted to mitigate these risks by utilizing differential privacy. Nevertheless, these studies present two major limitations: (1) The integration of differential privacy into personalized federated learning lacks sufficient personalization, leading to the introduction of excessive noise into the model. (2) It fails to adequately control the spatial scope of model update information, resulting in a suboptimal balance between data privacy and model effectiveness in differential privacy federated learning. In this paper, we propose a differentially private personalized federated learning approach that employs dynamically sparsified client updates through reparameterization and adaptive norm(DP-pFedDSU). Reparameterization training effectively selects personalized client update information, thereby reducing the quantity of updates. This approach minimizes the introduction of noise to the greatest extent possible. Additionally, dynamic adaptive norm refers to controlling the norm space of model updates during the training process, mitigating the negative impact of clipping on the update information. These strategies substantially enhance the effective integration of differential privacy and personalized federated learning. Experimental results on EMNIST, CIFAR-10, and CIFAR-100 demonstrate that our proposed scheme achieves superior performance and is well-suited for more complex personalized federated learning scenarios.

Paperid: 2302, https://arxiv.org/pdf/2503.08716.pdf

Abstract:
In the age of powerful AI-generated text, automatic detectors have emerged to identify machine-written content. This poses a threat to author privacy and freedom, as text authored with AI assistance may be unfairly flagged. We propose AuthorMist, a novel reinforcement learning-based system to transform AI-generated text into human-like writing. AuthorMist leverages a 3-billion-parameter language model as a backbone, fine-tuned with Group Relative Policy Optimization (GPRO) to paraphrase text in a way that evades AI detectors. Our framework establishes a generic approach where external detector APIs (GPTZero, WinstonAI, Originality.ai, etc.) serve as reward functions within the reinforcement learning loop, enabling the model to systematically learn outputs that these detectors are less likely to classify as AI-generated. This API-as-reward methodology can be applied broadly to optimize text against any detector with an accessible interface. Experiments on multiple datasets and detectors demonstrate that AuthorMist effectively reduces the detectability of AI-generated text while preserving the original meaning. Our evaluation shows attack success rates ranging from 78.6% to 96.2% against individual detectors, significantly outperforming baseline paraphrasing methods. AuthorMist maintains high semantic similarity (above 0.94) with the original text while successfully evading detection. These results highlight limitations in current AI text detection technologies and raise questions about the sustainability of the detection-evasion arms race.

Paperid: 2303, https://arxiv.org/pdf/2503.08053.pdf

Abstract:
Microcontroller systems are integral to our daily lives, powering mission-critical applications such as vehicles, medical devices, and industrial control systems. Therefore, it is essential to investigate and outline the challenges encountered in developing secure microcontroller systems. While previous research has focused solely on microcontroller firmware analysis to identify and characterize vulnerabilities, our study uniquely leverages data from the 2023 and 2024 MITRE eCTF team submissions and post-competition interviews. This approach allows us to dissect the entire lifecycle of secure microcontroller system development from both technical and perceptual perspectives, providing deeper insights into how these vulnerabilities emerge in the first place. Through the lens of eCTF, we identify fundamental conceptual and practical challenges in securing microcontroller systems. Conceptually, it is difficult to adapt from a microprocessor system to a microcontroller system, and participants are not wholly aware of the unique attacks against microcontrollers. Practically, security-enhancing tools, such as the memory-safe language Rust, lack adequate support on microcontrollers. Additionally, poor-quality entropy sources weaken cryptography and secret generation. Additionally, our findings articulate specific research, developmental, and educational deficiencies, leading to targeted recommendations for researchers, developers, vendors, and educators to enhance the security of microcontroller systems.

Paperid: 2304, https://arxiv.org/pdf/2503.07196.pdf

Abstract:
Quantum Key Distribution (QKD) promises information-theoretic security, yet integrating QKD into existing protocols like TLS remains challenging due to its fundamentally different operational model. In this paper, we propose a hybrid QKD-KEM protocol with two distinct integration approaches: a client-initiated flow compatible with both ETSI 004 and 014 specifications, and a server-initiated flow similar to existing work but limited to stateless ETSI 014 APIs. Unlike previous implementations, our work specifically addresses the integration of stateful QKD key exchange protocols (ETSI 004) which is essential for production QKD networks but has remained largely unexplored. By adapting OpenSSL's provider infrastructure to accommodate QKD's pre-distributed key model, we maintain compatibility with current TLS implementations while offering dual layers of security. Performance evaluations demonstrate the feasibility of our hybrid scheme with acceptable overhead, showing that robust security against quantum threats is achievable while addressing the unique requirements of different QKD API specifications.

Paperid: 2305, https://arxiv.org/pdf/2503.06487.pdf

Abstract:
Phishing websites continue to pose a significant security challenge, making the development of robust detection mechanisms essential. Brand Domain Identification (BDI) serves as a crucial step in many phishing detection approaches. This study systematically evaluates the effectiveness of features employed over the past decade for BDI, focusing on their weighted importance in phishing detection as of 2025. The primary objective is to determine whether the identified brand domain matches the claimed domain, utilizing popular features for phishing detection. To validate feature importance and evaluate performance, we conducted two experiments on a dataset comprising 4,667 legitimate sites and 4,561 phishing sites. In Experiment 1, we used the Weka tool to identify optimized and important feature sets out of 5: CN Information(CN), Logo Domain(LD),Form Action Domain(FAD),Most Common Link in Domain(MCLD) and Cookie Domain through its 4 Attribute Ranking Evaluator. The results revealed that none of the features were redundant, and Random Forest emerged as the best classifier, achieving an impressive accuracy of 99.7\% with an average response time of 0.08 seconds. In Experiment 2, we trained five machine learning models, including Random Forest, Decision Tree, Support Vector Machine, Multilayer Perceptron, and XGBoost to assess the performance of individual BDI features and their combinations. The results demonstrated an accuracy of 99.8\%, achieved with feature combinations of only three features: Most Common Link Domain, Logo Domain, Form Action and Most Common Link Domain,CN Info,Logo Domain using Random Forest as the best classifier. This study underscores the importance of leveraging key domain features for efficient phishing detection and paves the way for the development of real-time, scalable detection systems.

Paperid: 2306, https://arxiv.org/pdf/2503.04810.pdf

Abstract:
This article introduces a novel methodology, Network Simulator-centric Compositional Testing (NSCT), to enhance the verification of network protocols with a particular focus on time-varying network properties. NSCT follows a Model-Based Testing (MBT) approach. These approaches usually struggle to test and represent time-varying network properties. NSCT also aims to achieve more accurate and reproducible protocol testing. It is implemented using the Ivy tool and the Shadow network simulator. This enables online debugging of real protocol implementations. A case study on an implementation of QUIC (picoquic) is presented, revealing an error in its compliance with a time-varying specification. This error has subsequently been rectified, highlighting NSCT's effectiveness in uncovering and addressing real-world protocol implementation issues. The article underscores NSCT's potential in advancing protocol testing methodologies, offering a notable contribution to the field of network protocol verification.

Paperid: 2307, https://arxiv.org/pdf/2503.03969.pdf

Abstract:
The software compilation process has a tendency to obscure the original design of the system and makes it difficult both to identify individual components and discern their purpose simply by examining the resulting binary code. Although decompilation techniques attempt to recover higher-level source code from the machine code in question, they are not fully able to restore the semantics of the original functions. Furthermore, binaries are often stripped of metadata, and this makes it challenging to reverse engineer complex binary software. In this paper we show how a combination of binary decomposition techniques, decompilation passes, and LLM-powered function summarization can be used to build an economical engine to identify modules in stripped binaries and associate them with high-level natural language descriptions. We instantiated this technique with three underlying open-source LLMs -- CodeQwen, DeepSeek-Coder and CodeStral -- and measured its effectiveness in identifying modules in robotics firmware. This experimental evaluation involved 467 modules from four devices from the ArduPilot software suite, and showed that CodeStral, the best-performing backend LLM, achieves an average F1-score of 0.68 with an online running time of just a handful of seconds.

Paperid: 2308, https://arxiv.org/pdf/2503.03108.pdf

Abstract:
Recently, Provenance-based Intrusion Detection Systems (PIDSes) have been widely used for endpoint threat analysis. These studies can be broadly categorized into rule-based detection systems and learning-based detection systems. Among these, due to the evolution of attack techniques, rules cannot dynamically model all the characteristics of attackers. As a result, such systems often face false negatives. Learning-based detection systems are further divided into supervised learning and anomaly detection. The scarcity of attack samples hinders the usability and effectiveness of supervised learning-based detection systems in practical applications. Anomaly-based detection systems face a massive false positive problem because they cannot distinguish between changes in normal behavior and real attack behavior. The alert results of detection systems are closely related to the manual labor costs of subsequent security analysts. To reduce manual analysis time, we propose OMNISEC, which applies large language models (LLMs) to anomaly-based intrusion detection systems via retrieval-augmented behavior prompting. OMNISEC can identify abnormal nodes and corresponding abnormal events by constructing suspicious nodes and rare paths. By combining two external knowledge bases, OMNISEC uses Retrieval Augmented Generation (RAG) to enable the LLM to determine whether abnormal behavior is a real attack. Finally, OMNISEC can reconstruct the attack graph and restore the complete attack behavior chain of the attacker's intrusion. Experimental results show that OMNISEC outperforms state-of-the-art methods on public benchmark datasets.

Paperid: 2309, https://arxiv.org/pdf/2503.02413.pdf

Abstract:
In this paper, we introduce PANTHER, a modular framework for testing network protocols and formally verifying their specification. The framework incorporates a plugin architecture to enhance flexibility and extensibility for diverse testing scenarios, facilitate reproducible and scalable experiments leveraging Ivy and Shadow, and improve testing efficiency by enabling automated workflows through YAML-based configuration management. Its modular design validates complex protocol properties, adapts to dynamic behaviors, and facilitates seamless plugin integration for scalability. Moreover, the framework enables a stateful fuzzer plugin to enhance implementation robustness checks.

Paperid: 2310, https://arxiv.org/pdf/2503.02019.pdf

Abstract:
The rapid advancements in wireless technology have significantly increased the demand for communication resources, leading to the development of Spectrum Access Systems (SAS). However, network regulations require disclosing sensitive user information, such as location coordinates and transmission details, raising critical privacy concerns. Moreover, as a database-driven architecture reliant on user-provided data, SAS necessitates robust location verification to counter identity and location spoofing attacks and remains a primary target for denial-of-service (DoS) attacks. Addressing these security challenges while adhering to regulatory requirements is essential. In this paper, we propose SLAP, a novel framework that ensures location privacy and anonymity during spectrum queries, usage notifications, and location-proof acquisition. Our solution includes an adaptive dual-scenario location verification mechanism with architectural flexibility and a fallback option, along with a counter-DoS approach using time-lock puzzles. We prove the security of SLAP and demonstrate its advantages over existing solutions through comprehensive performance evaluations.

Paperid: 2311, https://arxiv.org/pdf/2503.01538.pdf

Abstract:
The rapid evolution of cyber threats has increased the need for robust methods to discover vulnerabilities in increasingly complex and diverse network protocols. This paper introduces Network Attack-centric Compositional Testing (NACT), a novel methodology designed to discover new vulnerabilities in network protocols and create scenarios to reproduce these vulnerabilities through attacker models. NACT integrates composable attacker specifications, formal specification mutations, and randomized constraint-solving techniques to generate sophisticated attack scenarios and test cases. The methodology enables comprehensive testing of both single-protocol and multi-protocol interactions. Through case studies involving a custom minimalist protocol (MiniP) and five widely used QUIC implementations, NACT is shown to effectively identify, reproduce, and find new real-world vulnerabilities such as version negotiation abuse. Additionally, by comparing the current and older versions of these QUIC implementations, NACT demonstrates its ability to detect both persistent vulnerabilities and regressions. Finally, by supporting cross-protocol testing within a black-box testing framework, NACT provides a versatile approach to improve the security of network protocols.

Paperid: 2312, https://arxiv.org/pdf/2503.01482.pdf

Abstract:
Local Differential Privacy (LDP) offers strong privacy protection, especially in settings in which the server collecting the data is untrusted. However, designing LDP mechanisms that achieve an optimal trade-off between privacy, utility and robustness to adversarial inference attacks remains challenging. In this work, we introduce a general multi-objective optimization framework for refining LDP protocols, enabling the joint optimization of privacy and utility under various adversarial settings. While our framework is flexible to accommodate multiple privacy and security attacks as well as utility metrics, in this paper, we specifically optimize for Attacker Success Rate (ASR) under \emph{data reconstruction attack} as a concrete measure of privacy leakage and Mean Squared Error (MSE) as a measure of utility. More precisely, we systematically revisit these trade-offs by analyzing eight state-of-the-art LDP protocols and proposing refined counterparts that leverage tailored optimization techniques. Experimental results demonstrate that our proposed adaptive mechanisms consistently outperform their non-adaptive counterparts, achieving substantial reductions in ASR while preserving utility, and pushing closer to the ASR-MSE Pareto frontier. By bridging the gap between theoretical guarantees and real-world vulnerabilities, our framework enables modular and context-aware deployment of LDP mechanisms with tunable privacy-utility trade-offs.

Paperid: 2313, https://arxiv.org/pdf/2503.01229.pdf

Abstract:
Water systems are vital components of modern infrastructure, yet they are increasingly susceptible to sophisticated cyber attacks with potentially dire consequences on public health and safety. While state-of-the-art machine learning techniques effectively detect anomalies, contemporary model-agnostic attack attribution methods using LIME, SHAP, and LEMNA are deemed impractical for large-scale, interdependent water systems. This is due to the intricate interconnectivity and dynamic interactions that define these complex environments. Such methods primarily emphasize individual feature importance while falling short of addressing the crucial sensor-actuator interactions in water systems, which limits their effectiveness in identifying root cause attacks. To this end, we propose a novel model-agnostic Factorization Machines (FM)-based approach that capitalizes on water system sensor-actuator interactions to provide granular explanations and attributions for cyber attacks. For instance, an anomaly in an actuator pump activity can be attributed to a top root cause attack candidates, a list of water pressure sensors, which is derived from the underlying linear and quadratic effects captured by our approach. We validate our method using two real-world water system specific datasets, SWaT and WADI, demonstrating its superior performance over traditional attribution methods. In multi-feature cyber attack scenarios involving intricate sensor-actuator interactions, our FM-based attack attribution method effectively ranks attack root causes, achieving approximately 20% average improvement over SHAP and LEMNA.

Paperid: 2314, https://arxiv.org/pdf/2503.00703.pdf

Abstract:
Differential privacy (DP) is a privacy-preserving paradigm that protects the training data when training deep learning models. Critically, the performance of models is determined by the training hyperparameters, especially those of the learning rate schedule, thus requiring fine-grained hyperparameter tuning on the data. In practice, it is common to tune the learning rate hyperparameters through the grid search that (1) is computationally expensive as multiple runs are needed, and (2) increases the risk of data leakage as the selection of hyperparameters is data-dependent. In this work, we adapt the automatic learning rate schedule to DP optimization for any models and optimizers, so as to significantly mitigate or even eliminate the cost of hyperparameter tuning when applied together with automatic per-sample gradient clipping. Our hyperparameter-free DP optimization is almost as computationally efficient as the standard non-DP optimization, and achieves state-of-the-art DP performance on various language and vision tasks.

Paperid: 2315, https://arxiv.org/pdf/2502.21156.pdf

Abstract:
We introduce Cryptis, an extension of the Iris separation logic that can be used to verify cryptographic components using the symbolic model of cryptography. The combination of separation logic and cryptographic reasoning allows us to prove the correctness of a protocol and later reuse this result to verify larger systems that rely on the protocol. To make this integration possible, we propose novel specifications for authentication protocols that allow agents in a network to agree on the use of system resources. We evaluate our approach by verifying various authentication protocols and a key-value store server that uses these authentication protocols to connect to clients. Our results are formalized in Coq.

Paperid: 2316, https://arxiv.org/pdf/2502.18609.pdf

Abstract:
We propose a groundbreaking random number generator that achieves truly uniform, independent, and identically distributed (IID) randomness by integrating Quantum Permutation Pads (QPP) with system jitter--derived entropy, herein called IID-based QPP-RNG. Unlike conventional RNGs that use raw timing variations, our design uses system jitter solely to generate ephemeral QPP pads and derives 8-bit outputs directly from permutation counts, eliminating the need for post-processing. This approach leverages the factorial complexity of permutation sorting to systematically accumulate entropy from dynamic hardware interactions, ensuring non-deterministic outputs even from fixed seeds. Notably, IID-based QPP-RNG achieves a min-entropy of 7.85-7.95 bits per byte from IID min-entropy estimate, surpassing ID Quantique's QRNG (7.157042 bits per byte), which marks a breakthrough in randomness quality. Our implementation employs a dynamic seed evolution protocol that continuously refreshes the internal state with unpredictable system jitter, effectively decoupling the QPP sequence from the initial seed. Cross-platform validation on macOS (x86 and ARM) and Windows (x86) confirms uniformly distributed outputs, while evaluations compliant with NIST SP 800-90B show a Shannon entropy of 7.9999 bits per byte. Overall, IID-based QPP-RNG represents a significant advancement in random number generation, offering a scalable, system-based, software-only, post-quantum secure solution for a wide range of cryptographic applications.

Paperid: 2317, https://arxiv.org/pdf/2502.18549.pdf

Abstract:
The target defense problem (TDP) for unmanned surface vehicles (USVs) concerns intercepting an adversarial USV before it breaches a designated target region, using one or more defending USVs. A particularly challenging scenario arises when the attacker exhibits superior maneuverability compared to the defenders, significantly complicating effective interception. To tackle this challenge, this letter introduces ARBoids, a novel adaptive residual reinforcement learning framework that integrates deep reinforcement learning (DRL) with the biologically inspired, force-based Boids model. Within this framework, the Boids model serves as a computationally efficient baseline policy for multi-agent coordination, while DRL learns a residual policy to adaptively refine and optimize the defenders' actions. The proposed approach is validated in a high-fidelity Gazebo simulation environment, demonstrating superior performance over traditional interception strategies, including pure force-based approaches and vanilla DRL policies. Furthermore, the learned policy exhibits strong adaptability to attackers with diverse maneuverability profiles, highlighting its robustness and generalization capability. The code of ARBoids will be released upon acceptance of this letter.

Paperid: 2318, https://arxiv.org/pdf/2502.17270.pdf

Abstract:
Order fairness in distributed ledgers refers to properties that relate the order in which transactions are sent or received to the order in which they are eventually finalized, i.e., totally ordered. The study of such properties is relatively new and has been especially stimulated by the rise of Maximal Extractable Value (MEV) attacks in blockchain environments. Indeed, in many classical blockchain protocols, leaders are responsible for selecting the transactions to be included in blocks, which creates a clear vulnerability and opportunity for transaction order manipulation. Unlike blockchains, DAG-based ledgers allow participants in the network to independently propose blocks, which are then arranged as vertices of a directed acyclic graph. Interestingly, leaders in DAG-based ledgers are elected only after the fact, once transactions are already part of the graph, to determine their total order. In other words, transactions are not chosen by single leaders; instead, they are collectively validated by the nodes, and leaders are only elected to establish an ordering. This approach intuitively reduces the risk of transaction manipulation and enhances fairness. In this paper, we aim to quantify the capability of DAG-based ledgers to achieve order fairness. To this end, we define new variants of order fairness adapted to DAG-based ledgers and evaluate the impact of an adversary capable of compromising a limited number of nodes (below the one-third threshold) to reorder transactions. We analyze how often our order fairness properties are violated under different network conditions and parameterizations of the DAG algorithm, depending on the adversary's power. Our study shows that DAG-based ledgers are still vulnerable to reordering attacks, as an adversary can coordinate a minority of Byzantine nodes to manipulate the DAG's structure.

Paperid: 2319, https://arxiv.org/pdf/2502.16877.pdf

Abstract:
As the importance of Privacy-Preserving Inference of Transformers (PiT) increases, a hybrid protocol that integrates Garbled Circuits (GC) and Homomorphic Encryption (HE) is emerging for its implementation. While this protocol is preferred for its ability to maintain accuracy, it has a severe drawback of excessive latency. To address this, existing protocols primarily focused on reducing HE latency, thus making GC the new latency bottleneck. Furthermore, previous studies only focused on individual computing layers, such as protocol or hardware accelerator, lacking a comprehensive solution at the system level. This paper presents APINT, a full-stack framework designed to reduce PiT's overall latency by addressing the latency problem of GC through both software and hardware solutions. APINT features a novel protocol that reallocates possible GC workloads to alternative methods (i.e., HE or standard matrix operation), substantially decreasing the GC workload. It also suggests GC-friendly circuit generation that reduces the number of AND gates at the most, which is the expensive operator in GC. Furthermore, APINT proposes an innovative netlist scheduling that combines coarse-grained operation mapping and fine-grained scheduling for maximal data reuse and minimal dependency. Finally, APINT's hardware accelerator, combined with its compiler speculation, effectively resolves the memory stall issue. Putting it all together, APINT achieves a remarkable end-to-end reduction in latency, outperforming the existing protocol on CPU platform by 12.2x online and 2.2x offline. Meanwhile, the APINT accelerator not only reduces its latency by 3.3x but also saves energy consumption by 4.6x while operating PiT compared to the state-of-the-art GC accelerator.

Paperid: 2320, https://arxiv.org/pdf/2502.15578.pdf

Abstract:
Modern FPGAs are increasingly supporting multi-tenancy to enable dynamic reconfiguration of user modules. While multi-tenant FPGAs improve utilization and flexibility, this paradigm introduces critical security threats. In this paper, we present FLARE, a fault attack that exploits vulnerabilities in the partial reconfiguration process, specifically while a user bitstream is being uploaded to the FPGA by a reconfiguration manager. Unlike traditional fault attacks that operate during module runtime, FLARE injects faults in the bitstream during its reconfiguration, altering the configuration address and redirecting it to unintended partial reconfigurable regions (PRRs). This enables the overwriting of pre-configured co-tenant modules, disrupting their functionality. FLARE leverages power-wasters that activate briefly during the reconfiguration process, making the attack stealthy and more challenging to detect with existing countermeasures. Experimental results on a Xilinx Pynq FPGA demonstrate the effectiveness of FLARE in compromising multiple user bitstreams during the reconfiguration process.

Paperid: 2321, https://arxiv.org/pdf/2502.14586.pdf

Abstract:
Model selection is a fundamental task in Machine Learning~(ML), focusing on selecting the most suitable model from a pool of candidates by evaluating their performance on specific metrics. This process ensures optimal performance, computational efficiency, and adaptability to diverse tasks and environments. Despite its critical role, its security from the perspective of adversarial ML remains unexplored. This risk is heightened in the Machine-Learning-as-a-Service model, where users delegate the training phase and the model selection process to third-party providers, supplying data and training strategies. Therefore, attacks on model selection could harm both the user and the provider, undermining model performance and driving up operational costs. In this work, we present MOSHI (MOdel Selection HIjacking adversarial attack), the first adversarial attack specifically targeting model selection. Our novel approach manipulates model selection data to favor the adversary, even without prior knowledge of the system. Utilizing a framework based on Variational Auto Encoders, we provide evidence that an attacker can induce inefficiencies in ML deployment. We test our attack on diverse computer vision and speech recognition benchmark tasks and different settings, obtaining an average attack success rate of 75.42%. In particular, our attack causes an average 88.30% decrease in generalization capabilities, an 83.33% increase in latency, and an increase of up to 105.85% in energy consumption. These results highlight the significant vulnerabilities in model selection processes and their potential impact on real-world applications.

Paperid: 2322, https://arxiv.org/pdf/2502.14017.pdf

Abstract:
This manuscript explores the cybersecurity challenges of Operational Technology (OT) networks, focusing on their critical role in industrial environments such as manufacturing, energy, and utilities. As OT systems increasingly integrate with Information Technology (IT) systems due to Industry 4.0 initiatives, they become more vulnerable to cyberattacks, which pose risks not only to data but also to physical infrastructure. The study examines key components of OT systems, such as SCADA (Supervisory Control and Data Acquisition), PLCs (Programmable Logic Controllers), and RTUs (Remote Terminal Units), and analyzes recent cyberattacks targeting OT environments. Furthermore, it highlights the security concerns arising from the convergence of IT and OT systems, examining attack vectors and the growing threats posed by malware, ransomware, and nation-state actors. Finally, the paper discusses modern approaches and tools used to secure these environments, providing insights into improving the cybersecurity posture of OT networks.

Paperid: 2323, https://arxiv.org/pdf/2502.13658.pdf

Abstract:
Purpose: The increasing number of cyber-attacks has elevated the importance of cybersecurity for organizations. This has also increased the demand for professionals with the necessary skills to protect these organizations. As a result, many individuals are looking to enter the field of cybersecurity. However, there is a lack of clear understanding of the skills required for a successful career in this field. In this paper, we identify the skills required for cybersecurity professionals. We also determine how the demand for cyber skills relates to various cyber roles such as security analyst and security architect. Furthermore, we identify the programming languages that are important for cybersecurity professionals. Design/Methodology: For this study, we have collected and analyzed data from 12,161 job ads and 49,002 Stack Overflow posts. By examining this, we identified patterns and trends related to skill requirements, role-specific demands, and programming languages in cybersecurity. Findings: Our results reveal that (i) communication skills and project management skills are the most important soft skills, (ii) as compared to soft skills, the demand for technical skills varies more across various cyber roles, and (iii) Java is the most commonly used programming language. Originality: Our findings serve as a guideline for individuals aiming to get into the field of cybersecurity. Moreover, our findings are useful in terms of informing educational institutes to teach the correct set of skills to students doing degrees in cybersecurity.

Paperid: 2324, https://arxiv.org/pdf/2502.11641.pdf

Abstract:
The syndrome decoding problem is one of the NP-complete problems lying at the foundation of code-based cryptography. The variant thereof where the distance between vectors is measured with respect to the Lee metric, rather than the more commonly used Hamming metric, has been analyzed recently in several works due to its potential relevance for building more efficient code-based cryptosystems. The purpose of this article is to present a zero-knowledge proof of knowledge for this variant of the problem.

Paperid: 2325, https://arxiv.org/pdf/2502.11029.pdf

Abstract:
Multi-party computation (MPC) based machine learning, referred to as multi-party learning (MPL), has become an important technology for utilizing data from multiple parties with privacy preservation. In recent years, in order to apply MPL in more practical scenarios, various MPC-friendly models have been proposedto reduce the extraordinary communication overhead of MPL. Within the optimization of MPC-friendly models, a critical element to tackle the challenge is profiling the communication cost of models. However, the current solutions mainly depend on manually establishing the profiles to identify communication bottlenecks of models, often involving burdensome human efforts in a monotonous procedure. In this paper, we propose HawkEye, a static model communication cost profiling framework, which enables model designers to get the accurate communication cost of models in MPL frameworks without dynamically running the secure model training or inference processes on a specific MPL framework. Firstly, to profile the communication cost of models with complex structures, we propose a static communication cost profiling method based on a prefix structure that records the function calling chain during the static analysis. Secondly, HawkEye employs an automatic differentiation library to assist model designers in profiling the communication cost of models in PyTorch. Finally, we compare the static profiling results of HawkEye against the profiling results obtained through dynamically running secure model training and inference processes on five popular MPL frameworks, CryptFlow2, CrypTen, Delphi, Cheetah, and SecretFlow-SEMI2K. The experimental results show that HawkEye can accurately profile the model communication cost without dynamic profiling.

Paperid: 2326, https://arxiv.org/pdf/2502.10512.pdf

Abstract:
Blockchain technology has revolutionized financial markets by enabling decentralized exchanges (DEXs) that operate without intermediaries. Uniswap V2, a leading DEX, facilitates the rapid creation and trading of new tokens, offering high return potential but exposing investors to significant risks. In this work, we analyze the financial impact of newly created tokens, assessing their market dynamics, profitability and liquidity manipulations. Our findings reveal that a significant portion of market liquidity is trapped in honeypots, reducing market efficiency and misleading investors. Applying a simple buy-and-hold strategy, we are able to uncover some major risks associated with investing in newly created tokens, including the widespread presence of rug pulls and sandwich attacks. We extract the optimal sandwich amount, revealing that their proliferation in new tokens stems from higher profitability in low-liquidity pools. Furthermore, we analyze the fundamental differences between token price evolution in swap time and physical time. Using clustering techniques, we highlight these differences and identify typical patterns of honeypot and sellable tokens. Our study provides insights into the risks and financial dynamics of decentralized markets and their challenges for investors.

Paperid: 2327, https://arxiv.org/pdf/2502.09827.pdf

Abstract:
Space Protocol is applying the principles derived from MITRE and NIST's Supply Chain Traceability: Manufacturing Meta-Framework (NIST IR 8536) to a complex multi party system to achieve introspection, auditing, and replay of data and decisions that ultimately lead to a end decision. The core goal of decision traceability is to ensure transparency, accountability, and integrity within the WA system. This is accomplished by providing a clear, auditable path from the system's inputs all the way to the final decision. This traceability enables the system to track the various algorithms and data flows that have influenced a particular outcome.

Paperid: 2328, https://arxiv.org/pdf/2502.08966.pdf

Abstract:
Tool-Based Agent Systems (TBAS) allow Language Models (LMs) to use external tools for tasks beyond their standalone capabilities, such as searching websites, booking flights, or making financial transactions. However, these tools greatly increase the risks of prompt injection attacks, where malicious content hijacks the LM agent to leak confidential data or trigger harmful actions. Existing defenses (OpenAI GPTs) require user confirmation before every tool call, placing onerous burdens on users. We introduce Robust TBAS (RTBAS), which automatically detects and executes tool calls that preserve integrity and confidentiality, requiring user confirmation only when these safeguards cannot be ensured. RTBAS adapts Information Flow Control to the unique challenges presented by TBAS. We present two novel dependency screeners, using LM-as-a-judge and attention-based saliency, to overcome these challenges. Experimental results on the AgentDojo Prompt Injection benchmark show RTBAS prevents all targeted attacks with only a 2% loss of task utility when under attack, and further tests confirm its ability to obtain near-oracle performance on detecting both subtle and direct privacy leaks.

Paperid: 2329, https://arxiv.org/pdf/2502.08535.pdf

Abstract:
The widespread use of Smart Home devices has attracted significant research interest in understanding their behavior within home networks. Unlike general-purpose computers, these devices exhibit relatively simple and predictable network activity patterns. However, previous studies have primarily focused on normal network conditions, overlooking potential hidden patterns that emerge under challenging conditions. Discovering these hidden flows is crucial for assessing device robustness. This paper addresses this gap by presenting a framework that systematically and automatically reveals these hidden communication patterns. By actively disturbing communication and blocking observed traffic, the framework generates comprehensive profiles structured as behavior trees, uncovering flows that are missed by more shallow methods. This approach was applied to ten real-world devices, identifying 254 unique flows, with over 27% only discovered through this new method. These insights enhance our understanding of device robustness and can be leveraged to improve the accuracy of network security measures.

Paperid: 2330, https://arxiv.org/pdf/2502.08219.pdf

Abstract:
Throughout computer history, it has been repeatedly demonstrated that critical software vulnerabilities can significantly affect the components involved. In the Free/Libre and Open Source Software (FLOSS) ecosystem, most software is distributed through package repositories. Nowadays, monitoring critical dependencies in a software system is essential for maintaining robust security practices. This is particularly important due to new legal requirements, such as the European Cyber Resilience Act, which necessitate that software projects maintain a transparent track record with Software Bill of Materials (SBOM) and ensure a good overall state. This study provides a summary of the current state of available FLOSS package repositories and addresses the challenge of identifying problematic areas within a software ecosystem. These areas are analyzed in detail, quantifying the current state of the FLOSS ecosystem. The results indicate that while there are well-maintained projects within the FLOSS ecosystem, there are also high-impact projects that are susceptible to supply chain attacks. This study proposes a method for analyzing the current state and identifies missing elements, such as interfaces, for future research.

Paperid: 2331, https://arxiv.org/pdf/2502.08001.pdf

Abstract:
Federated Distillation (FD) has emerged as a popular federated training framework, enabling clients to collaboratively train models without sharing private data. Public Dataset-Assisted Federated Distillation (PDA-FD), which leverages public datasets for knowledge sharing, has become widely adopted. Although PDA-FD enhances privacy compared to traditional Federated Learning, we demonstrate that the use of public datasets still poses significant privacy risks to clients' private training data. This paper presents the first comprehensive privacy analysis of PDA-FD in presence of an honest-but-curious server. We show that the server can exploit clients' inference results on public datasets to extract two critical types of private information: label distributions and membership information of the private training dataset. To quantify these vulnerabilities, we introduce two novel attacks specifically designed for the PDA-FD setting: a label distribution inference attack and innovative membership inference methods based on Likelihood Ratio Attack (LiRA). Through extensive evaluation of three representative PDA-FD frameworks (FedMD, DS-FL, and Cronus), our attacks achieve state-of-the-art performance, with label distribution attacks reaching minimal KL-divergence and membership inference attacks maintaining high True Positive Rates under low False Positive Rate constraints. Our findings reveal significant privacy risks in current PDA-FD frameworks and emphasize the need for more robust privacy protection mechanisms in collaborative learning systems.

Paperid: 2332, https://arxiv.org/pdf/2502.07657.pdf

Abstract:
We consider the problem of approximating a $d \times d$ covariance matrix $M$ with a rank-$k$ matrix under $(\varepsilon,Î´)$-differential privacy. We present and analyze a complex variant of the Gaussian mechanism and obtain upper bounds on the Frobenius norm of the difference between the matrix output by this mechanism and the best rank-$k$ approximation to $M$. Our analysis provides improvements over previous bounds, particularly when the spectrum of $M$ satisfies natural structural assumptions. The novel insight is to view the addition of Gaussian noise to a matrix as a continuous-time matrix Brownian motion. This viewpoint allows us to track the evolution of eigenvalues and eigenvectors of the matrix, which are governed by stochastic differential equations discovered by Dyson. These equations enable us to upper bound the Frobenius distance between the best rank-$k$ approximation of $M$ and that of a Gaussian perturbation of $M$ as an integral that involves inverse eigenvalue gaps of the stochastically evolving matrix, as opposed to a sum of perturbation bounds obtained via Davis-Kahan-type theorems. Subsequently, again using the Dyson Brownian motion viewpoint, we show that the eigenvalues of the matrix $M$ perturbed by Gaussian noise have large gaps with high probability. These results also contribute to the analysis of low-rank approximations under average-case perturbations, and to an understanding of eigenvalue gaps for random matrices, both of which may be of independent interest.

Paperid: 2333, https://arxiv.org/pdf/2502.07492.pdf

Abstract:
Attributing APT (Advanced Persistent Threat) malware to their respective groups is crucial for threat intelligence and cybersecurity. However, APT adversaries often conceal their identities, rendering attribution inherently adversarial. Existing machine learning-based attribution models, while effective, remain highly vulnerable to adversarial attacks. For example, the state-of-the-art byte-level model MalConv sees its accuracy drop from over 90% to below 2% under PGD (projected gradient descent) attacks. Existing gradient-based adversarial training techniques for malware detection or image processing were applied to malware attribution in this study, revealing that both robustness and training efficiency require significant improvement. To address this, we propose RoMA, a novel single-step adversarial training approach that integrates global perturbations to generate enhanced adversarial samples and employs adversarial consistency regularization to improve representation quality and resilience. A novel APT malware dataset named AMG18, with diverse samples and realistic class imbalances, is introduced for evaluation. Extensive experiments show that RoMA significantly outperforms seven competing methods in both adversarial robustness (e.g., achieving over 80% robust accuracy-more than twice that of the next-best method under PGD attacks) and training efficiency (e.g., more than twice as fast as the second-best method in terms of accuracy), while maintaining superior standard accuracy in non-adversarial scenarios.

Paperid: 2334, https://arxiv.org/pdf/2502.06031.pdf

Abstract:
Internet of things (IoT) networks, boosted by 6G technology, are transforming various industries. However, their widespread adoption introduces significant security risks, particularly in detecting rare but potentially damaging cyber-attacks. This makes the development of robust IDS crucial for monitoring network traffic and ensuring their safety. Traditional IDS often struggle with detecting rare attacks due to severe class imbalances in IoT data. In this paper, we propose a novel two-stage system called conditional tabular generative synthetic minority data generation with deep neural network (CTGSM-DNN). In the first stage, a conditional tabular generative adversarial network (CTGAN) is employed to generate synthetic data for rare attack classes. In the second stage, the SMOTEENN method is applied to improve dataset quality. The full study was conducted using the CSE-CIC-IDS2018 dataset, and we assessed the performance of the proposed IDS using different evaluation metrics. The experimental results demonstrated the effectiveness of the proposed multiclass classifier, achieving an overall accuracy of 99.90% and 80% accuracy in detecting rare attacks.

Paperid: 2335, https://arxiv.org/pdf/2502.05760.pdf

Abstract:
Millions of new pieces of malicious software (i.e., malware) are introduced each year. This poses significant challenges for antivirus vendors, who use machine learning to detect and analyze malware, and must keep up with changes in the distribution while retaining knowledge of older variants. Continual learning (CL) holds the potential to address this challenge by reducing the storage and computational costs of regularly retraining over all the collected data. Prior work, however, shows that CL techniques, which are designed primarily for computer vision tasks, fare poorly when applied to malware classification. To address these issues, we begin with an exploratory analysis of a typical malware dataset, which reveals that malware families are diverse and difficult to characterize, requiring a wide variety of samples to learn a robust representation. Based on these findings, we propose $\underline{M}$alware $\underline{A}$nalysis with $\underline{D}$istribution-$\underline{A}$ware $\underline{R}$eplay (MADAR), a CL framework that accounts for the unique properties and challenges of the malware data distribution. Through extensive evaluation on large-scale Windows and Android malware datasets, we show that MADAR significantly outperforms prior work. This highlights the importance of understanding domain characteristics when designing CL techniques and demonstrates a path forward for the malware classification domain.

Paperid: 2336, https://arxiv.org/pdf/2502.04753.pdf

Abstract:
Law enforcement and private-sector partners have in recent years conducted various interventions to disrupt the DDoS-for-hire market. Drawing on multiple quantitative datasets, including web traffic and ground-truth visits to seized websites, millions of DDoS attack records from academic, industry, and self-reported statistics, along with chats on underground forums and Telegram channels, we assess the effects of an ongoing global intervention against DDoS-for-hire services since December 2022. This is the most extensive booter takedown to date conducted, combining targeting infrastructure with digital influence tactics in a concerted effort by law enforcement across several countries with two waves of website takedowns and the use of deceptive domains. We found over half of the seized sites in the first wave returned within a median of one day, while all booters seized in the second wave returned within a median of two days. Re-emerged booter domains, despite closely resembling old ones, struggled to attract visitors (80-90% traffic reduction). While the first wave cut the global DDoS attack volume by 20-40% with a statistically significant effect specifically on UDP-based DDoS attacks (commonly attributed to booters), the impact of the second wave appeared minimal. Underground discussions indicated a cumulative impact, leading to changes in user perceptions of safety and causing some operators to leave the market. Despite the extensive intervention efforts, all DDoS datasets consistently suggest that the illicit market is fairly resilient, with an overall short-lived effect on the global DDoS attack volume lasting for at most only around six weeks.

Paperid: 2337, https://arxiv.org/pdf/2502.03909.pdf

Abstract:
Anomaly-based Network Intrusion Detection Systems (NIDS) require correctly labelled, representative and diverse datasets for an accurate evaluation and development. However, several widely used datasets do not include labels which are fine-grained enough and, together with small sample sizes, can lead to overfitting issues that also remain undetected when using test data. Additionally, the cybersecurity sector is evolving fast, and new attack mechanisms require the continuous creation of up-to-date datasets. To address these limitations, we developed a modular traffic generator that can simulate a wide variety of benign and malicious traffic. It incorporates multiple protocols, variability through randomization techniques and can produce attacks along corresponding benign traffic, as it occurs in real-world scenarios. Using the traffic generator, we create a dataset capturing over 12 million samples with 82 flow-level features and 21 fine-grained labels. Additionally, we include several web attack types which are often underrepresented in other datasets.

Paperid: 2338, https://arxiv.org/pdf/2502.03177.pdf

Abstract:
Variable-bitrate video streaming is ubiquitous in video surveillance and CCTV, enabling high-quality video streaming while conserving network bandwidth. However, as the name suggests, variable-bitrate IP cameras can generate sharp traffic spikes depending on the dynamics of the visual input. In this paper, we show that the effectiveness of video compression can be reduced by up to 6X using a simple laser LED pointing at a variable-bitrate IP camera, forcing the camera to generate excessive network traffic. Experiments with IP cameras connected to wired and wireless networks indicate that a laser attack on a single camera can cause significant packet loss in systems sharing the network with the camera and reduce the available bandwidth of a shared network link by 90%. This attack represents a new class of cyber-physical attacks that manipulate variable bitrate devices through changes in the physical environment without a digital presence on the device or the network. We also analyze the broader view of multidimensional cyberattacks that involve both the physical and digital realms and present a taxonomy that categorizes attacks based on their direction of influence (physical-to-digital or digital-to-physical) and their method of operation (environment-driven or device-driven), highlighting multiple areas for future research.

Paperid: 2339, https://arxiv.org/pdf/2502.03052.pdf

Abstract:
Jailbreaking attacks can effectively manipulate open-source large language models (LLMs) to produce harmful responses. However, these attacks exhibit limited transferability, failing to disrupt proprietary LLMs consistently. To reliably identify vulnerabilities in proprietary LLMs, this work investigates the transferability of jailbreaking attacks by analysing their impact on the model's intent perception. By incorporating adversarial sequences, these attacks can redirect the source LLM's focus away from malicious-intent tokens in the original input, thereby obstructing the model's intent recognition and eliciting harmful responses. Nevertheless, these adversarial sequences fail to mislead the target LLM's intent perception, allowing the target LLM to refocus on malicious-intent tokens and abstain from responding. Our analysis further reveals the inherent distributional dependency within the generated adversarial sequences, whose effectiveness stems from overfitting the source LLM's parameters, resulting in limited transferability to target LLMs. To this end, we propose the Perceived-importance Flatten (PiF) method, which uniformly disperses the model's focus across neutral-intent tokens in the original input, thus obscuring malicious-intent tokens without relying on overfitted adversarial sequences. Extensive experiments demonstrate that PiF provides an effective and efficient red-teaming evaluation for proprietary LLMs.

Paperid: 2340, https://arxiv.org/pdf/2502.02038.pdf

Abstract:
Federated learning is an essential distributed model training technique. However, threats such as gradient inversion attacks and poisoning attacks pose significant risks to the privacy of training data and the model correctness. We propose a novel approach called SMTFL to achieve secure model training in federated learning without relying on trusted participants. To safeguard gradients privacy against gradient inversion attacks, clients are dynamically grouped, allowing one client's gradient to be divided to obfuscate the gradients of other clients within the group. This method incorporates checks and balances to reduce the collusion for inferring specific client data. To detect poisoning attacks from malicious clients, we assess the impact of aggregated gradients on the global model's performance, enabling effective identification and exclusion of malicious clients. Each client's gradients are encrypted and stored, with decryption collectively managed by all clients. The detected poisoning gradients are invalidated from the global model through a unlearning method. We present a practical secure aggregation scheme, which does not require trusted participants, avoids the performance degradation associated with traditional noise-injection, and aviods complex cryptographic operations during gradient aggregation. Evaluation results are encouraging based on four datasets and two models: SMTFL is effective against poisoning attacks and gradient inversion attacks, achieving an accuracy rate of over 95% in locating malicious clients, while keeping the false positive rate for honest clients within 5%. The model accuracy is also nearly restored to its pre-attack state when SMTFL is deployed.

Paperid: 2341, https://arxiv.org/pdf/2502.01060.pdf

Abstract:
This paper investigates the learnability of the nonlinearity property of Boolean functions using neural networks. We train encoder style deep neural networks to learn to predict the nonlinearity of Boolean functions from examples of functions in the form of a truth table and their corresponding nonlinearity values. We report empirical results to show that deep neural networks are able to learn to predict the property for functions in 4 and 5 variables with an accuracy above 95%. While these results are positive and a disciplined analysis is being presented for the first time in this regard, we should also underline the statutory warning that it seems quite challenging to extend the idea to higher number of variables, and it is also not clear whether one can get advantage in terms of time and space complexity over the existing combinatorial algorithms.

Paperid: 2342, https://arxiv.org/pdf/2501.19223.pdf

Abstract:
\begin{abstract} This paper comprehensively analyzes privacy policies in AR/VR applications, leveraging BERT, a state-of-the-art text classification model, to evaluate the clarity and thoroughness of these policies. By comparing the privacy policies of AR/VR applications with those of free and premium websites, this study provides a broad perspective on the current state of privacy practices within the AR/VR industry. Our findings indicate that AR/VR applications generally offer a higher percentage of positive segments than free content but lower than premium websites. The analysis of highlighted segments and words revealed that AR/VR applications strategically emphasize critical privacy practices and key terms. This enhances privacy policies' clarity and effectiveness.

Paperid: 2343, https://arxiv.org/pdf/2501.18617.pdf

Abstract:
With the growing demand for personalized AI solutions, customized LLMs have become a preferred choice for businesses and individuals, driving the deployment of millions of AI agents across various platforms, e.g., GPT Store hosts over 3 million customized GPTs. Their popularity is partly driven by advanced reasoning capabilities, such as Chain-of-Thought, which enhance their ability to tackle complex tasks. However, their rapid proliferation introduces new vulnerabilities, particularly in reasoning processes that remain largely unexplored. We introduce DarkMind, a novel backdoor attack that exploits the reasoning capabilities of customized LLMs. Designed to remain latent, DarkMind activates within the reasoning chain to covertly alter the final outcome. Unlike existing attacks, it operates without injecting triggers into user queries, making it a more potent threat. We evaluate DarkMind across eight datasets covering arithmetic, commonsense, and symbolic reasoning domains, using five state-of-the-art LLMs with five distinct trigger implementations. Our results demonstrate DarkMind effectiveness across all scenarios, underscoring its impact. Finally, we explore potential defense mechanisms to mitigate its risks, emphasizing the need for stronger security measures.

Paperid: 2344, https://arxiv.org/pdf/2501.17748.pdf

Abstract:
In the world of open-source software (OSS), the number of known vulnerabilities has tremendously increased. The GitHub Advisory Database contains advisories for security risks in GitHub-hosted OSS projects. As of 09/25/2023, there are 197,609 unreviewed GitHub security advisories. Of those unreviewed, at least 63,852 are publicly documented vulnerabilities, potentially leaving many OSS projects vulnerable. Recently, bug bounty platforms have emerged to focus solely on providing bounties to help secure OSS. In this paper, we conduct an empirical study on 3,798 reviewed GitHub security advisories and 4,033 disclosed OSS bug bounty reports, a perspective that is currently understudied, because they contain comprehensive information about security incidents, e.g., the nature of vulnerabilities, their impact, and how they were resolved. We are the first to determine the explicit process describing how OSS vulnerabilities propagate from security advisories and bug bounty reports, which are the main intermediaries between vulnerability reporters, OSS maintainers, and dependent projects, to vulnerable OSS projects and entries in global vulnerability databases and possibly back. This process uncovers how missing or delayed CVE assignments for OSS vulnerabilities result in projects, both in and out of OSS, not being notified of necessary security updates promptly and corresponding bottlenecks. Based on our findings, we provide suggestions, actionable items, and future research directions to help improve the security posture of OSS projects.

Paperid: 2345, https://arxiv.org/pdf/2501.16732.pdf

Abstract:
The area of research includes control theory, dynamic systems, parameters of the external environment, mode, integral indicators, British standards. The main idea of the article is information security. The activity of a large-scale object (enterprise) is considered. The activity of the enterprise is presented as a multidimensional dynamic system and is displayed as a digital copy of 1.2 million parameters. A British digital copy-based information security standard is being introduced. Information security equipment and software were purchased. The training of the company's personnel was completed. Evaluation of implementation (activities) is done as an integral indicator. The dynamics of the integral indicator assesses the implementation of the British standard.

Paperid: 2346, https://arxiv.org/pdf/2501.15395.pdf

Abstract:
The rapid growth of Internet of Things (IoT) devices has introduced significant challenges to privacy, particularly as network traffic analysis techniques evolve. While encryption protects data content, traffic attributes such as packet size and timing can reveal sensitive information about users and devices. Existing single-technique obfuscation methods, such as packet padding, often fall short in dynamic environments like smart homes due to their predictability, making them vulnerable to machine learning-based attacks. This paper introduces a multi-technique obfuscation framework designed to enhance privacy by disrupting traffic analysis. The framework leverages six techniques-Padding, Padding with XORing, Padding with Shifting, Constant Size Padding, Fragmentation, and Delay Randomization-to obscure traffic patterns effectively. Evaluations on three public datasets demonstrate significant reductions in classifier performance metrics, including accuracy, precision, recall, and F1 score. We assess the framework's robustness against adversarial tactics by retraining and fine-tuning neural network classifiers on obfuscated traffic. The results reveal a notable degradation in classifier performance, underscoring the framework's resilience against adaptive attacks. Furthermore, we evaluate communication and system performance, showing that higher obfuscation levels enhance privacy but may increase latency and communication overhead.

Paperid: 2347, https://arxiv.org/pdf/2501.12229.pdf

Abstract:
Health data is one of the most sensitive data for people, which attracts the attention of malicious activities. We propose an open-source health data management framework, that follows a patient-centric approach. The proposed framework implements the Self-Sovereign Identity paradigm with innovative technologies such as Decentralized Identifiers and Verifiable Credentials. The framework uses Blockchain technology to provide immutability, verifiable data registry, and auditability, as well as an agent-based model to provide protection and privacy for the patient data. We also define different use cases regarding the daily patient-practitioner-laboratory interactions and specific functions to cover patient data loss, data access revocation, and emergency cases where patients are unable to give consent and access to their data. To address this design, a proof of concept is created with an interaction between patient and doctor. The most feasible technologies are selected and the created design is validated. We discuss the differences and novelties of this framework, which includes the patient-centric approach also for data storage, the designed recovery and emergency plan, the defined backup procedure, and the selected blockchain platform.

Paperid: 2348, https://arxiv.org/pdf/2501.12077.pdf

Abstract:
The increased use of digital devices and applications has led to a rise in phishing attacks. We develop a serious game to raise awareness about phishing attacks and help people avoid these threats in a risk-free learning environment. This game targets three types of phishing-clone phishing, SMS phishing, and spear phishing-and uses a Large Language Model to generate dialogues and questions dynamically. It also incorporates state randomization and time-limited challenges to enhance the gameplay. We evaluated two groups of participants and found that those who played the game showed, on average, a 24% increase in awareness and a 30% boost in confidence.

Paperid: 2349, https://arxiv.org/pdf/2501.12006.pdf

Abstract:
To investigate the level of support and awareness developers possess for dealing with sensitive data in the metaverse, we surveyed developers, consulted legal frameworks, and analyzed API documentation in the metaverse. Our preliminary results suggest that privacy is a major concern, but developer awareness and existing support are limited. Developers lack strategies to identify sensitive data that are exclusive to the metaverse. The API documentation contains guidelines for collecting sensitive information, but it omits instructions for identifying and protecting it. Legal frameworks include definitions that are subject to individual interpretation. These findings highlight the urgent need to build a transparent and common ground for privacy definitions, identify sensitive data, and implement usable protection measures.

Paperid: 2350, https://arxiv.org/pdf/2501.11207.pdf

Abstract:
Microcontroller-based embedded systems are vital in daily life, but are especially vulnerable to control-flow hijacking attacks due to hardware and software constraints. Control-Flow Attestation (CFA) aims to precisely attest the execution path of a program to a remote verifier. However, existing CFA solutions face challenges with large measurement and/or trace data, limiting these solutions to small programs. In addition, slow software-based measurement calculations limit their feasibility for microcontroller systems. In this paper, we present ENOLA, an efficient control-flow attestation solution for low-end embedded systems. ENOLA introduces a novel authenticator that achieves linear space complexity. Moreover, ENOLA capitalizes on the latest hardware-assisted message authentication code computation capabilities found in commercially-available devices for measurement computation. ENOLA employs a trusted execution environment, and allocates general-purpose registers to thwart memory corruption attacks. We have developed the ENOLA compiler through LLVM passes and attestation engine on the ARMv8.1-M architecture. Our evaluations demonstrate ENOLA's effectiveness in minimizing data transmission, while achieving lower or comparable performance to the existing works.

Paperid: 2351, https://arxiv.org/pdf/2501.10996.pdf

Abstract:
Adversarial attacks present significant challenges for malware detection systems. This research investigates the effectiveness of benign and malicious adversarial examples (AEs) in evasion and poisoning attacks on the Portable Executable file domain. A novel focus of this study is on benign AEs, which, although not directly harmful, can increase false positives and undermine trust in antivirus solutions. We propose modifying existing adversarial malware generators to produce benign AEs and show they are as successful as malware AEs in evasion attacks. Furthermore, our data show that benign AEs have a more decisive influence in poisoning attacks than standard malware AEs, demonstrating their superior ability to decrease the model's performance. Our findings introduce new opportunities for adversaries and further increase the attack surface that needs to be protected by security researchers.

Paperid: 2352, https://arxiv.org/pdf/2501.10881.pdf

Abstract:
Multiplayer online gaming has witnessed an explosion in popularity over the past two decades. However, security issues continue to give rise to in-game cheating, deterring honest gameplay, detracting from user experience, and ultimately bringing financial harm to game developers. In this paper, we present a new approach for detecting network packet-based cheats, such as forgery and timing cheats, within the context of multiplayer games using an application of secret sharing. Our developed protocols are subjected to formal verification using AVISPA, and we present simulation results using a Python-based implementation. We show that our proposal is practical in addressing some widely used attacks in online gaming.

Paperid: 2353, https://arxiv.org/pdf/2501.10606.pdf

Abstract:
Marked temporal point processes (MTPPs) have been shown to be extremely effective in modeling continuous time event sequences (CTESs). In this work, we present adversarial attacks designed specifically for MTPP models. A key criterion for a good adversarial attack is its imperceptibility. For objects such as images or text, this is often achieved by bounding perturbation in some fixed $L_p$ norm-ball. However, similarly minimizing distance norms between two CTESs in the context of MTPPs is challenging due to their sequential nature and varying time-scales and lengths. We address this challenge by first permuting the events and then incorporating the additive noise to the arrival timestamps. However, the worst case optimization of such adversarial attacks is a hard combinatorial problem, requiring exploration across a permutation space that is factorially large in the length of the input sequence. As a result, we propose a novel differentiable scheme PERMTPP using which we can perform adversarial attacks by learning to minimize the likelihood, while minimizing the distance between two CTESs. Our experiments on four real-world datasets demonstrate the offensive and defensive capabilities, and lower inference times of PERMTPP.

Paperid: 2354, https://arxiv.org/pdf/2501.10251.pdf

Abstract:
In this paper, we study the problem of information-theoretic distributed multi-user point function, involving a trusted master node, $N \in \mathbb{N}$ server nodes, and $K\in \mathbb{N}$ users, where each user has access to the contents of a subset of the storages of server nodes. Each user is associated with an independent point function $f_{X_k,Z_k}: \{1,2,\hdots,T\} \rightarrow{GF(q^{m R_k})},T,mR_k \in \mathbb{N}$. Using these point functions, the trusted master node encodes and places functional shares $G_1,G_2,\hdots,G_N \in GF(q^{M}), M \in \mathbb{N}$ in the storage nodes such that each user can correctly recover its point function result from the response transmitted to itself and gains no information about the point functions of any other user, even with knowledge of all responses transmitted from its connected servers. For the first time, we propose a multi-user scheme that satisfies the correctness and information-theoretic privacy constraints, ensuring recovery for all point functions. We also characterize the inner and outer bounds on the capacity -- the maximum achievable rate defined as the size of the range of each point function $mR_k$ relative to the storage size of the servers $M$ -- of the distributed multi-user point function scheme by presenting a novel converse argument.

Paperid: 2355, https://arxiv.org/pdf/2501.10002.pdf

Abstract:
This paper introduces a novel fuzzing framework, SyzParam which incorporates runtime parameters into the fuzzing process. Achieving this objective requires addressing several key challenges, including valid value extraction, inter-device relation construction, and fuzz engine integration. By inspecting the data structures and functions associated with the LKDM, our tool can extract runtime parameters across various drivers through static analysis. Additionally, SyzParam collects inter-device relations and identifies associations between runtime parameters and drivers. Furthermore, SyzParam proposes a novel mutation strategy, which leverages these relations and prioritizes parameter modification during related driver execution. Our evaluation demonstrates that SyzParam outperforms existing fuzzing works in driver code coverage and bug-detection capabilities. To date, we have identified 30 unique bugs in the latest kernel upstreams, with 20 confirmed and 14 patched into the mainline kernel, including 9 CVEs.

Paperid: 2356, https://arxiv.org/pdf/2501.09334.pdf

Abstract:
Trusted execution environment (TEE) has provided an isolated and secure environment for building cloud-based analytic systems, but it still suffers from access pattern leakages caused by side-channel attacks. To better secure the data, computation inside TEE enclave should be made oblivious, which introduces significant overhead and severely slows down the computation. A natural way to speed up is to build the analytic system with multiple servers in the distributed setting. However, this setting raises a new security concern -- the volumes of the transmissions among these servers can leak sensitive information to a network adversary. Existing works have designed specialized algorithms to address this concern, but their supports for equi-join, one of the most important but non-trivial database operators, are either inefficient, limited, or under a weak security assumption. In this paper, we present Jodes, an efficient oblivious join algorithm in the distributed setting. Jodes prevents the leakage on both the network and enclave sides, supports a general equi-join operation, and provides a high security level protection that only publicizes the input sizes and the output size. Meanwhile, it achieves both communication cost and computation cost asymptotically superior to existing algorithms. To demonstrate the practicality of Jodes, we conduct experiments in the distributed setting comprising 16 servers. Empirical results show that Jodes achieves up to a sixfold performance improvement over state-of-the-art join algorithms.

Paperid: 2357, https://arxiv.org/pdf/2501.07033.pdf

Abstract:
This study explores the use of Generative Adversarial Networks (GANs) to detect AI deepfakes and fraudulent activities in online payment systems. With the growing prevalence of deepfake technology, which can manipulate facial features in images and videos, the potential for fraud in online transactions has escalated. Traditional security systems struggle to identify these sophisticated forms of fraud. This research proposes a novel GAN-based model that enhances online payment security by identifying subtle manipulations in payment images. The model is trained on a dataset consisting of real-world online payment images and deepfake images generated using advanced GAN architectures, such as StyleGAN and DeepFake. The results demonstrate that the proposed model can accurately distinguish between legitimate transactions and deepfakes, achieving a high detection rate above 95%. This approach significantly improves the robustness of payment systems against AI-driven fraud. The paper contributes to the growing field of digital security, offering insights into the application of GANs for fraud detection in financial services. Keywords- Payment Security, Image Recognition, Generative Adversarial Networks, AI Deepfake, Fraudulent Activities

Paperid: 2358, https://arxiv.org/pdf/2501.06399.pdf

Abstract:
From a simple text prompt, generative-AI image models can create stunningly realistic and creative images bounded, it seems, by only our imagination. These models have achieved this remarkable feat thanks, in part, to the ingestion of billions of images collected from nearly every corner of the internet. Many creators have understandably expressed concern over how their intellectual property has been ingested without their permission or a mechanism to opt out of training. As a result, questions of fair use and copyright infringement have quickly emerged. We describe a method that allows us to determine if a model was trained on a specific image or set of images. This method is computationally efficient and assumes no explicit knowledge of the model architecture or weights (so-called black-box membership inference). We anticipate that this method will be crucial for auditing existing models and, looking ahead, ensuring the fairer development and deployment of generative AI models.

Paperid: 2359, https://arxiv.org/pdf/2501.05239.pdf

Abstract:
Autonomous vehicles rely on camera-based perception systems to comprehend their driving environment and make crucial decisions, thereby ensuring vehicles to steer safely. However, a significant threat known as Electromagnetic Signal Injection Attacks (ESIA) can distort the images captured by these cameras, leading to incorrect AI decisions and potentially compromising the safety of autonomous vehicles. Despite the serious implications of ESIA, there is limited understanding of its impacts on the robustness of AI models across various and complex driving scenarios. To address this gap, our research analyzes the performance of different models under ESIA, revealing their vulnerabilities to the attacks. Moreover, due to the challenges in obtaining real-world attack data, we develop a novel ESIA simulation method and generate a simulated attack dataset for different driving scenarios. Our research provides a comprehensive simulation and evaluation framework, aiming to enhance the development of more robust AI models and secure intelligent systems, ultimately contributing to the advancement of safer and more reliable technology across various fields.

Paperid: 2360, https://arxiv.org/pdf/2501.04323.pdf

Abstract:
Instruction tuning has proven effective in enhancing Large Language Models' (LLMs) performance on downstream tasks. However, real-world fine-tuning faces inherent conflicts between model providers' intellectual property protection, clients' data privacy requirements, and tuning costs. While recent approaches like split learning and offsite tuning demonstrate promising architectures for privacy-preserving fine-tuning, there is a gap in systematically addressing the multidimensional trade-offs required for diverse real-world deployments. We propose several indicative evaluation metrics to guide design trade-offs for privacy-preserving fine-tuning and a series of example designs, collectively named GuardedTuning; they result from novel combinations of system architectures with adapted privacy-enhancement methods and emerging computation techniques. Each design represents distinct trade-offs across model utility, privacy guarantees, and costs. Experimental results demonstrate that these designs protect against data reconstruction attacks while maintaining competitive fine-tuning performance.

Paperid: 2361, https://arxiv.org/pdf/2501.02520.pdf

Abstract:
The rapid integration of Internet of Things (IoT) devices into enterprise environments presents significant security challenges. Many IoT devices are released to the market with minimal security measures, often harbouring an average of 25 vulnerabilities per device. To enhance cybersecurity measures and aid system administrators in managing IoT patches more effectively, we propose an innovative framework that predicts the time it will take for a vulnerable IoT device to receive a fix or patch. We developed a survival analysis model based on the Accelerated Failure Time (AFT) approach, implemented using the XGBoost ensemble regression model, to predict when vulnerable IoT devices will receive fixes or patches. By constructing a comprehensive IoT vulnerabilities database that combines public and private sources, we provide insights into affected devices, vulnerability detection dates, published CVEs, patch release dates, and associated Twitter activity trends. We conducted thorough experiments evaluating different combinations of features, including fundamental device and vulnerability data, National Vulnerability Database (NVD) information such as CVE, CWE, and CVSS scores, transformed textual descriptions into sentence vectors, and the frequency of Twitter trends related to CVEs. Our experiments demonstrate that the proposed model accurately predicts the time to fix for IoT vulnerabilities, with data from VulDB and NVD proving particularly effective. Incorporating Twitter trend data offered minimal additional benefit. This framework provides a practical tool for organisations to anticipate vulnerability resolutions, improve IoT patch management, and strengthen their cybersecurity posture against potential threats.

Paperid: 2362, https://arxiv.org/pdf/2501.02042.pdf

Abstract:
Recent work has investigated the concept of adversarial attacks on explainable AI (XAI) in the NLP domain with a focus on examining the vulnerability of local surrogate methods such as Lime to adversarial perturbations or small changes on the input of a machine learning (ML) model. In such attacks, the generated explanation is manipulated while the meaning and structure of the original input remain similar under the ML model. Such attacks are especially alarming when XAI is used as a basis for decision making (e.g., prescribing drugs based on AI medical predictors) or for legal action (e.g., legal dispute involving AI software). Although weaknesses across many XAI methods have been shown to exist, the reasons behind why remain little explored. Central to this XAI manipulation is the similarity measure used to calculate how one explanation differs from another. A poor choice of similarity measure can lead to erroneous conclusions about the stability or adversarial robustness of an XAI method. Therefore, this work investigates a variety of similarity measures designed for text-based ranked lists referenced in related work to determine their comparative suitability for use. We find that many measures are overly sensitive, resulting in erroneous estimates of stability. We then propose a weighting scheme for text-based data that incorporates the synonymity between the features within an explanation, providing more accurate estimates of the actual weakness of XAI methods to adversarial examples.

Paperid: 2363, https://arxiv.org/pdf/2501.01187.pdf

Abstract:
Privacy-preserving machine learning (PPML) enables clients to collaboratively train deep learning models without sharing private datasets, but faces privacy leakage risks due to gradient leakage attacks. Prevailing methods leverage secure aggregation strategies to enhance PPML, where clients leverage masks and secret sharing to further protect gradient data while tolerating participant dropouts. These methods, however, require frequent inter-client communication to negotiate keys and perform secret sharing, leading to substantial communication overhead. To tackle this issue, we propose NET-SA, an efficient secure aggregation architecture for PPML based on in-network computing. NET-SA employs seed homomorphic pseudorandom generators for local gradient masking and utilizes programmable switches for seed aggregation. Accurate and secure gradient aggregation is then performed on the central server based on masked gradients and aggregated seeds. This design effectively reduces communication overhead due to eliminating the communication-intensive phases of seed agreement and secret sharing, with enhanced dropout tolerance due to overcoming the threshold limit of secret sharing. Extensive experiments on server clusters and Intel Tofino programmable switch demonstrate that NET-SA achieves up to 77x and 12x enhancements in runtime and 2x decrease in total client communication cost compared with state-of-the-art methods.

Paperid: 2364, https://arxiv.org/pdf/2501.00230.pdf

Abstract:
This paper introduces FDSC, a private-protected subspace clustering (SC) approach with federated learning (FC) schema. In each client, there is a deep subspace clustering network accounting for grouping the isolated data, composed of a encode network, a self-expressive layer, and a decode network. FDSC is achieved by uploading the encode network to communicate with other clients in the server. Besides, FDSC is also enhanced by preserving the local neighborhood relationship in each client. With the effects of federated learning and locality preservation, the learned data features from the encoder are boosted so as to enhance the self-expressiveness learning and result in better clustering performance. Experiments test FDSC on public datasets and compare with other clustering methods, demonstrating the effectiveness of FDSC.

Paperid: 2365, https://arxiv.org/pdf/2506.23949.pdf

Abstract:
Increasingly multi-purpose AI models, such as cutting-edge large language models or other 'general-purpose AI' (GPAI) models, 'foundation models,' generative AI models, and 'frontier models' (typically all referred to hereafter with the umbrella term 'GPAI/foundation models' except where greater specificity is needed), can provide many beneficial capabilities but also risks of adverse events with profound consequences. This document provides risk-management practices or controls for identifying, analyzing, and mitigating risks of GPAI/foundation models. We intend this document primarily for developers of large-scale, state-of-the-art GPAI/foundation models; others that can benefit from this guidance include downstream developers of end-use applications that build on a GPAI/foundation model. This document facilitates conformity with or use of leading AI risk management-related standards, adapting and building on the generic voluntary guidance in the NIST AI Risk Management Framework and ISO/IEC 23894, with a focus on the unique issues faced by developers of GPAI/foundation models.

Paperid: 2366, https://arxiv.org/pdf/2506.23679.pdf

Abstract:
Modular exponentiation is crucial to number theory and cryptography, yet remains largely unexplored from a mechanistic interpretability standpoint. We train a 4-layer encoder-decoder Transformer model to perform this operation and investigate the emergence of numerical reasoning during training. Utilizing principled sampling strategies, PCA-based embedding analysis, and activation patching, we examine how number-theoretic properties are encoded within the model. We find that reciprocal operand training leads to strong performance gains, with sudden generalization across related moduli. These synchronized accuracy surges reflect grokking-like dynamics, suggesting the model internalizes shared arithmetic structure. We also find a subgraph consisting entirely of attention heads in the final layer sufficient to achieve full performance on the task of regular exponentiation. These results suggest that transformer models learn modular arithmetic through specialized computational circuits, paving the way for more interpretable and efficient neural approaches to modular exponentiation.

Paperid: 2367, https://arxiv.org/pdf/2506.23583.pdf

Abstract:
Federated learning with secure aggregation enables private and collaborative learning from decentralised data without leaking sensitive client information. However, secure aggregation also complicates the detection of malicious client behaviour and the evaluation of individual client contributions to the learning. To address these challenges, QI (Pejo et al.) and FedGT (Xhemrishi et al.) were proposed for contribution evaluation (CE) and misbehaviour detection (MD), respectively. QI, however, lacks adequate MD accuracy due to its reliance on the random selection of clients in each training round, while FedGT lacks the CE ability. In this work, we combine the strengths of QI and FedGT to achieve both robust MD and accurate CE. Our experiments demonstrate superior performance compared to using either method independently.

Paperid: 2368, https://arxiv.org/pdf/2506.23314.pdf

Abstract:
Malware detection in Android systems requires both cybersecurity expertise and machine learning (ML) techniques. Automated Machine Learning (AutoML) has emerged as an approach to simplify ML development by reducing the need for specialized knowledge. However, current AutoML solutions typically operate as black-box systems with limited transparency, interpretability, and experiment traceability. To address these limitations, we present MH-AutoML, a domain-specific framework for Android malware detection. MH-AutoML automates the entire ML pipeline, including data preprocessing, feature engineering, algorithm selection, and hyperparameter tuning. The framework incorporates capabilities for interpretability, debugging, and experiment tracking that are often missing in general-purpose solutions. In this study, we compare MH-AutoML against seven established AutoML frameworks: Auto-Sklearn, AutoGluon, TPOT, HyperGBM, Auto-PyTorch, LightAutoML, and MLJAR. Results show that MH-AutoML achieves better recall rates while providing more transparency and control. The framework maintains computational efficiency comparable to other solutions, making it suitable for cybersecurity applications where both performance and explainability matter.

Paperid: 2369, https://arxiv.org/pdf/2506.22802.pdf

Abstract:
Recent breakthroughs and rapid integration of generative models (GMs) have sparked interest in the problem of model attribution and their fingerprints. For instance, service providers need reliable methods of authenticating their models to protect their IP, while users and law enforcement seek to verify the source of generated content for accountability and trust. In addition, a growing threat of model collapse is arising, as more model-generated data are being fed back into sources (e.g., YouTube) that are often harvested for training ("regurgitative training"), heightening the need to differentiate synthetic from human data. Yet, a gap still exists in understanding generative models' fingerprints, we believe, stemming from the lack of a formal framework that can define, represent, and analyze the fingerprints in a principled way. To address this gap, we take a geometric approach and propose a new definition of artifact and fingerprint of GMs using Riemannian geometry, which allows us to leverage the rich theory of differential geometry. Our new definition generalizes previous work (Song et al., 2024) to non-Euclidean manifolds by learning Riemannian metrics from data and replacing the Euclidean distances and nearest-neighbor search with geodesic distances and kNN-based Riemannian center of mass. We apply our theory to a new gradient-based algorithm for computing the fingerprints in practice. Results show that it is more effective in distinguishing a large array of GMs, spanning across 4 different datasets in 2 different resolutions (64 by 64, 256 by 256), 27 model architectures, and 2 modalities (Vision, Vision-Language). Using our proposed definition significantly improves the performance on model attribution, as well as a generalization to unseen datasets, model types, and modalities, suggesting its practical efficacy.

Paperid: 2370, https://arxiv.org/pdf/2506.22506.pdf

Abstract:
Federated Prompt Learning has emerged as a communication-efficient and privacy-preserving paradigm for adapting large vision-language models like CLIP across decentralized clients. However, the security implications of this setup remain underexplored. In this work, we present the first study of backdoor attacks in Federated Prompt Learning. We show that when malicious clients inject visually imperceptible, learnable noise triggers into input images, the global prompt learner becomes vulnerable to targeted misclassification while still maintaining high accuracy on clean inputs. Motivated by this vulnerability, we propose SABRE-FL, a lightweight, modular defense that filters poisoned prompt updates using an embedding-space anomaly detector trained offline on out-of-distribution data. SABRE-FL requires no access to raw client data or labels and generalizes across diverse datasets. We show, both theoretically and empirically, that malicious clients can be reliably identified and filtered using an embedding-based detector. Across five diverse datasets and four baseline defenses, SABRE-FL outperforms all baselines by significantly reducing backdoor accuracy while preserving clean accuracy, demonstrating strong empirical performance and underscoring the need for robust prompt learning in future federated systems.

Paperid: 2371, https://arxiv.org/pdf/2506.22089.pdf

Abstract:
We consider the problem of a game theorist analyzing a game that uses cryptographic protocols. Ideally, a theorist abstracts protocols as ideal, implementation-independent primitives, letting conclusions in the "ideal world" carry over to the "real world." This is crucial, since the game theorist cannot--and should not be expected to--handle full cryptographic complexity. In today's landscape, the rise of distributed ledgers makes a shared language between cryptography and game theory increasingly necessary. The security of cryptographic protocols hinges on two types of assumptions: state-of-the-world (e.g., "factoring is hard") and behavioral (e.g., "honest majority"). We observe that for protocols relying on behavioral assumptions (e.g., ledgers), our goal is unattainable in full generality. For state-of-the-world assumptions, we show that standard solution concepts, e.g., ($Îµ$-)Nash equilibria, are not robust to transfer from the ideal to the real world. We propose a new solution concept: the pseudo-Nash equilibrium. Informally, a profile $s=(s_1,\dots,s_n)$ is a pseudo-Nash equilibrium if, for any player $i$ and deviation $s'_i$ with higher expected utility, $i$'s utility from $s_i$ is (computationally) indistinguishable from that of $s'_i$. Pseudo-Nash is simpler and more accessible to game theorists than prior notions addressing the mismatch between (asymptotic) cryptography and game theory. We prove that Nash equilibria in games with ideal, unbreakable cryptography correspond to pseudo-Nash equilibria when ideal cryptography is instantiated with real protocols (under state-of-the-world assumptions). Our translation is conceptually simpler and more general: it avoids tuning or restricting utility functions in the ideal game to fit quirks of cryptographic implementations. Thus, pseudo-Nash lets us study game-theoretic and cryptographic aspects separately and seamlessly.

Paperid: 2372, https://arxiv.org/pdf/2506.21988.pdf

Abstract:
Delegated quantum computing (DQC) allows clients with low quantum capabilities to outsource computations to a server hosting a quantum computer. This process is typically envisioned within the measurement-based quantum computing framework, as it naturally facilitates blindness of inputs and computation. Hence, the overall process of setting up and conducting the computation encompasses a sequence of three stages: preparing the qubits, entangling the qubits to obtain the resource state, and measuring the qubits to run the computation. There are two primary approaches to distributing these stages between the client and the server that impose different constraints on cryptographic techniques and experimental implementations. In the prepare-and-send setting, the client prepares the qubits and sends them to the server, while in the receive-and-measure setting, the client receives the qubits from the server and measures them. Although these settings have been extensively studied independently, their interrelation and whether setting-dependent theoretical constraints are inevitable remain unclear. By implementing the key components of most DQC protocols in the respective missing setting, we provide a method to build prospective protocols in both settings simultaneously and to translate existing protocols from one setting into the other.

Paperid: 2373, https://arxiv.org/pdf/2506.21914.pdf

Abstract:
Data brokers collect and sell the personal information of millions of individuals, often without their knowledge or consent. The California Consumer Privacy Act (CCPA) grants consumers the legal right to request access to, or deletion of, their data. To facilitate these requests, California maintains an official registry of data brokers. However, the extent to which these entities comply with the law is unclear. This paper presents the first large-scale, systematic study of CCPA compliance of all 543 officially registered data brokers. Data access requests were manually submitted to each broker, followed by in-depth analyses of their responses (or lack thereof). Above 40% failed to respond at all, in an apparent violation of the CCPA. Data brokers that responded requested personal information as part of their identity verification process, including details they had not previously collected. Paradoxically, this means that exercising one's privacy rights under CCPA introduces new privacy risks. Our findings reveal rampant non-compliance and lack of standardization of the data access request process. These issues highlight an urgent need for stronger enforcement, clearer guidelines, and standardized, periodic compliance checks to enhance consumers' privacy protections and improve data broker accountability.

Paperid: 2374, https://arxiv.org/pdf/2506.20800.pdf

Abstract:
SIM tracing -- the ability to inspect, modify, and relay communication between a SIM card and modem -- has become a significant technique in cellular network research. It enables essential security- and development-related applications such as fuzzing communication interfaces, extracting session keys, monitoring hidden SIM activity (e.g., proactive SIM commands or over-the-air updates), and facilitating scalable, distributed measurement platforms through SIM reuse. Traditionally, achieving these capabilities has relied on specialized hardware, which can pose financial and logistical burdens for researchers, particularly those new to the field. In this work, we show that full SIM tracing functionality can be achieved using only simple, widely available components, such as UART interfaces and GPIO ports. We port these capabilities to low-cost microcontrollers, exemplified by the Raspberry Pi Pico (4~USD). Unlike other approaches, it dramatically reduces hardware complexity by electrically decoupling the SIM and the modem and only transferring on APDU level. By significantly reducing hardware requirements and associated costs, we aim to make SIM tracing techniques accessible to a broader community of researchers and hobbyists, fostering wider exploration and experimentation in cellular network research.

Paperid: 2375, https://arxiv.org/pdf/2506.20576.pdf

Abstract:
Adversarial attacks, wherein slight inputs are carefully crafted to mislead intelligent models, have attracted increasing attention. However, a critical gap persists between theoretical advancements and practical application, particularly in structured data like network traffic, where interdependent features complicate effective adversarial manipulations. Moreover, ambiguity in current approaches restricts reproducibility and limits progress in this field. Hence, existing defenses often fail to handle evolving adversarial attacks. This paper proposes a novel approach for black-box adversarial attacks, that addresses these limitations. Unlike prior work, which often assumes system access or relies on repeated probing, our method strictly respect black-box constraints, reducing interaction to avoid detection and better reflect real-world scenarios. We present an adaptive feature selection strategy using change-point detection and causality analysis to identify and target sensitive features to perturbations. This lightweight design ensures low computational cost and high deployability. Our comprehensive experiments show the attack's effectiveness in evading detection with minimal interaction, enhancing its adaptability and applicability in real-world scenarios. By advancing the understanding of adversarial attacks in network traffic, this work lays a foundation for developing robust defenses.

Paperid: 2376, https://arxiv.org/pdf/2506.20082.pdf

Abstract:
Website Fingerprinting (WF) attacks aim to infer which websites a user is visiting by analyzing traffic patterns, thereby compromising user anonymity. Although this technique has been demonstrated to be effective in controlled experimental environments, it remains largely limited to small-scale scenarios, typically restricted to recognizing website homepages. In practical settings, however, users frequently access multiple subpages in rapid succession, often before previous content fully loads. WebPage Fingerprinting (WPF) generalizes the WF framework to large-scale environments by modeling subpages of the same site as distinct classes. These pages often share similar page elements, resulting in lower inter-class variance in traffic features. Furthermore, we consider multi-tab browsing scenarios, in which a single trace encompasses multiple categories of webpages. This leads to overlapping traffic segments, and similar features may appear in different positions within the traffic, thereby increasing the difficulty of classification. To address these challenges, we propose an attention-driven fine-grained WPF attack, named ADWPF. Specifically, during the training phase, we apply targeted augmentation to salient regions of the traffic based on attention maps, including attention cropping and attention masking. ADWPF then extracts low-dimensional features from both the original and augmented traffic and applies self-attention modules to capture the global contextual patterns of the trace. Finally, to handle the multi-tab scenario, we employ the residual attention to generate class-specific representations of webpages occurring at different temporal positions. Extensive experiments demonstrate that the proposed method consistently surpasses state-of-the-art baselines across datasets of different scales.

Paperid: 2377, https://arxiv.org/pdf/2506.19871.pdf

Abstract:
Insurance fraud detection represents a pivotal advancement in modern insurance service, providing intelligent and digitalized monitoring to enhance management and prevent fraud. It is crucial for ensuring the security and efficiency of insurance systems. Although AI and machine learning algorithms have demonstrated strong performance in detecting fraudulent claims, the absence of standardized defense mechanisms renders current systems vulnerable to emerging adversarial threats. In this paper, we propose a GAN-based approach to conduct adversarial attacks on fraud detection systems. Our results indicate that an attacker, without knowledge of the training data or internal model details, can generate fraudulent cases that are classified as legitimate with a 99\% attack success rate (ASR). By subtly modifying real insurance records and claims, adversaries can significantly increase the fraud risk, potentially bypassing compromised detection systems. These findings underscore the urgent need to enhance the robustness of insurance fraud detection models against adversarial manipulation, thereby ensuring the stability and reliability of different insurance systems.

Paperid: 2378, https://arxiv.org/pdf/2506.19409.pdf

Abstract:
A modification of the TLS protocol is presented, using our implementation of the Quantum Key Distribution (QKD) standard ETSI GS QKD 014 v1.1.1. We rely on the Rustls library for this. The TLS protocol is modified while maintaining backward compatibility on the client and server side. We thus wish to participate in the effort to generalize the use of QKD on the Internet. We used our protocol for a video conference call encrypted by QKD. Finally, we analyze the performance of our protocol, comparing the time needed to establish a handshake to that of TLS 1.3.

Paperid: 2379, https://arxiv.org/pdf/2506.18470.pdf

Abstract:
This paper introduces a novel approach for the automated selection of software protections to mitigate MATE risks against critical assets within software applications. We formalize the key elements involved in protection decision-making - including code artifacts, assets, security requirements, attacks, and software protections - and frame the protection process through a game-theoretic model. In this model, a defender strategically applies protections to various code artifacts of a target application, anticipating repeated attack attempts by adversaries against the confidentiality and integrity of the application's assets. The selection of the optimal defense maximizes resistance to attacks while ensuring the application remains usable by constraining the overhead introduced by protections. The game is solved through a heuristic based on a mini-max depth-first exploration strategy, augmented with dynamic programming optimizations for improved efficiency. Central to our formulation is the introduction of the Software Protection Index, an original contribution that extends existing notions of potency and resilience by evaluating protection effectiveness against attack paths using software metrics and expert assessments. We validate our approach through a proof-of-concept implementation and expert evaluations, demonstrating that automated software protection is a practical and effective solution for risk mitigation in software.

Paperid: 2380, https://arxiv.org/pdf/2506.18020.pdf

Abstract:
Robust distributed learning algorithms aim to maintain good performance in distributed and federated settings, even in the presence of misbehaving workers. Two primary threat models have been studied: Byzantine attacks, where misbehaving workers can send arbitrarily corrupted updates, and data poisoning attacks, where misbehavior is limited to manipulation of local training data. While prior work has shown comparable optimization error under both threat models, a fundamental question remains open: How do these threat models impact generalization? Empirical evidence suggests a gap between the two threat models, yet it remains unclear whether it is fundamental or merely an artifact of suboptimal attacks. In this work, we present the first theoretical investigation into this problem, formally showing that Byzantine attacks are intrinsically more harmful to generalization than data poisoning. Specifically, we prove that: (i) under data poisoning, the uniform algorithmic stability of a robust distributed learning algorithm, with optimal optimization error, degrades by an additive factor of $\varTheta ( \frac{f}{n-f} )$, with $f$ the number of misbehaving workers out of $n$; and (ii) In contrast, under Byzantine attacks, the degradation is in $\mathcal{O} \big( \sqrt{ \frac{f}{n-2f}} \big)$.This difference in stability leads to a generalization error gap that is especially significant as $f$ approaches its maximum value $\frac{n}{2}$.

Paperid: 2381, https://arxiv.org/pdf/2506.17805.pdf

Abstract:
Federated Learning (FL) enables collaborative learning without exposing clients' data. While clients only share model updates with the aggregator, studies reveal that aggregators can infer sensitive information from these updates. Secure Aggregation (SA) protects individual updates during transmission; however, recent work demonstrates a critical vulnerability where adversarial aggregators manipulate client selection to bypass SA protections, constituting a Biased Selection Attack (BSA). Although verifiable random selection prevents BSA, it precludes informed client selection essential for FL performance. We propose Adversarial Robust Federated Learning (AdRo-FL), which simultaneously enables: informed client selection based on client utility, and robust defense against BSA maintaining privacy-preserving aggregation. AdRo-FL implements two client selection frameworks tailored for distinct settings. The first framework assumes clients are grouped into clusters based on mutual trust, such as different branches of an organization. The second framework handles distributed clients where no trust relationships exist between them. For the cluster-oriented setting, we propose a novel defense against BSA by (1) enforcing a minimum client selection quota from each cluster, supervised by a cluster-head in every round, and (2) introducing a client utility function to prioritize efficient clients. For the distributed setting, we design a two-phase selection protocol: first, the aggregator selects the top clients based on our utility-driven ranking; then, a verifiable random function (VRF) ensures a BSA-resistant final selection. AdRo-FL also applies quantization to reduce communication overhead and sets strict transmission deadlines to improve energy efficiency. AdRo-FL achieves up to $1.85\times$ faster time-to-accuracy and up to $1.06\times$ higher final accuracy compared to insecure baselines.

Paperid: 2382, https://arxiv.org/pdf/2506.17767.pdf

Abstract:
A succinct histogram captures frequent items and their frequencies across clients and has become increasingly important for large-scale, privacy-sensitive machine learning applications. To develop a rigorous framework to guarantee privacy for the succinct histogram problem, local differential privacy (LDP) has been utilized and shown promising results. To preserve data utility under LDP, which essentially works by intentionally adding noise to data, error-correcting codes naturally emerge as a promising tool for reliable information collection. This work presents the first practical $(Îµ,Î´)$-LDP protocol for constructing succinct histograms using error-correcting codes. To this end, polar codes and their successive-cancellation list (SCL) decoding algorithms are leveraged as the underlying coding scheme. More specifically, our protocol introduces Gaussian-based perturbations to enable efficient soft decoding. Experiments demonstrate that our approach outperforms prior methods, particularly for items with low true frequencies, while maintaining similar frequency estimation accuracy.

Paperid: 2383, https://arxiv.org/pdf/2506.17570.pdf

Abstract:
Virtual reality (VR) has recently proliferated significantly, consisting of headsets or head-mounted displays (HMDs) and hand controllers for an embodied and immersive experience. The VR device is usually embedded with different kinds of IoT sensors, such as cameras, microphones, communication sensors, etc. However, VR security has not been scrutinized from a physical hardware point of view, especially electromagnetic emanations (EM) that are automatically and unintentionally emitted from the VR headset. This paper presents VReaves, a system that can eavesdrop on the electromagnetic emanation side channel of a VR headset for VR app identification and activity recognition. To do so, we first characterize the electromagnetic emanations from the embedded IoT sensors (e.g., cameras and microphones) in the VR headset through a signal processing pipeline and further propose machine learning models to identify the VR app and recognize the VR app activities. Our experimental evaluation with commercial off-the-shelf VR devices demonstrates the efficiency of VR app identification and activity recognition via electromagnetic emanation side channel.

Paperid: 2384, https://arxiv.org/pdf/2506.17318.pdf

Abstract:
Autonomous web navigation agents, which translate natural language instructions into sequences of browser actions, are increasingly deployed for complex tasks across e-commerce, information retrieval, and content discovery. Due to the stateless nature of large language models (LLMs), these agents rely heavily on external memory systems to maintain context across interactions. Unlike centralized systems where context is securely stored server-side, agent memory is often managed client-side or by third-party applications, creating significant security vulnerabilities. This was recently exploited to attack production systems. We introduce and formalize "plan injection," a novel context manipulation attack that corrupts these agents' internal task representations by targeting this vulnerable context. Through systematic evaluation of two popular web agents, Browser-use and Agent-E, we show that plan injections bypass robust prompt injection defenses, achieving up to 3x higher attack success rates than comparable prompt-based attacks. Furthermore, "context-chained injections," which craft logical bridges between legitimate user goals and attacker objectives, lead to a 17.7% increase in success rate for privacy exfiltration tasks. Our findings highlight that secure memory handling must be a first-class concern in agentic systems.

Paperid: 2385, https://arxiv.org/pdf/2506.16891.pdf

Abstract:
Targeted advertising is fueled by the comprehensive tracking of users' online activity. As a result, advertising companies, such as Google and Meta, encourage website administrators to not only install tracking scripts on their websites but configure them to automatically collect users' Personally Identifying Information (PII). In this study, we aim to characterize how Google and Meta's trackers can be configured to collect PII data from web forms. We first perform a qualitative analysis of how third parties present form data collection to website administrators in the documentation and user interface. We then perform a measurement study of 40,150 websites to quantify the prevalence and configuration of Google and Meta trackers. Our results reveal that both Meta and Google encourage the use of form data collection and include inaccurate statements about hashing PII as a privacy-preserving method. Additionally, we find that Meta includes configuring form data collection as part of the basic setup flow. Our large-scale measurement study reveals that while Google trackers are more prevalent than Meta trackers (72.6% vs. 28.2% of websites), Meta trackers are configured to collect form data more frequently (11.6% vs. 62.3%). Finally, we identify sensitive finance and health websites that have installed trackers that are likely configured to collect form data PII in violation of Meta and Google policies. Our study highlights how tracker documentation and interfaces can potentially play a role in users' privacy through the configuration choices made by the website administrators who install trackers.

Paperid: 2386, https://arxiv.org/pdf/2506.15648.pdf

Abstract:
Although Rust ensures memory safety by default, it also permits the use of unsafe code, which can introduce memory safety vulnerabilities if misused. Unfortunately, existing tools for detecting memory bugs in Rust typically exhibit limited detection capabilities, inadequately handle Rust-specific types, or rely heavily on manual intervention. To address these limitations, we present deepSURF, a tool that integrates static analysis with Large Language Model (LLM)-guided fuzzing harness generation to effectively identify memory safety vulnerabilities in Rust libraries, specifically targeting unsafe code. deepSURF introduces a novel approach for handling generics by substituting them with custom types and generating tailored implementations for the required traits, enabling the fuzzer to simulate user-defined behaviors within the fuzzed library. Additionally, deepSURF employs LLMs to augment fuzzing harnesses dynamically, facilitating exploration of complex API interactions and significantly increasing the likelihood of exposing memory safety vulnerabilities. We evaluated deepSURF on 27 real-world Rust crates, successfully rediscovering 20 known memory safety bugs and uncovering 6 previously unknown vulnerabilities, demonstrating clear improvements over state-of-the-art tools.

Paperid: 2387, https://arxiv.org/pdf/2506.15100.pdf

Abstract:
As AI capabilities advance rapidly, flexible hardware-enabled guarantees (flexHEGs) offer opportunities to address international security challenges through comprehensive governance frameworks. This report examines how flexHEGs could enable internationally trustworthy AI governance by establishing standardized designs, robust ecosystem defenses, and clear operational parameters for AI-relevant chips. We analyze four critical international security applications: limiting proliferation to address malicious use, implementing safety norms to prevent loss of control, managing risks from military AI systems, and supporting strategic stability through balance-of-power mechanisms while respecting national sovereignty. The report explores both targeted deployments for specific high-risk facilities and comprehensive deployments covering all AI-relevant compute. We examine two primary governance models: verification-based agreements that enable transparent compliance monitoring, and ruleset-based agreements that automatically enforce international rules through cryptographically-signed updates. Through game-theoretic analysis, we demonstrate that comprehensive flexHEG agreements could remain stable under reasonable assumptions about state preferences and catastrophic risks. The report addresses critical implementation challenges including technical thresholds for AI-relevant chips, management of existing non-flexHEG hardware, and safeguards against abuse of governance power. While requiring significant international coordination, flexHEGs could provide a technical foundation for managing AI risks at the scale and speed necessary to address emerging threats to international security and stability.

Paperid: 2388, https://arxiv.org/pdf/2506.15093.pdf

Abstract:
As artificial intelligence systems become increasingly powerful, they pose growing risks to international security, creating urgent coordination challenges that current governance approaches struggle to address without compromising sensitive information or national security. We propose flexible hardware-enabled guarantees (flexHEGs), that could be integrated with AI accelerators to enable trustworthy, privacy-preserving verification and enforcement of claims about AI development. FlexHEGs consist of an auditable guarantee processor that monitors accelerator usage and a secure enclosure providing physical tamper protection. The system would be fully open source with flexible, updateable verification capabilities. FlexHEGs could enable diverse governance mechanisms including privacy-preserving model evaluations, controlled deployment, compute limits for training, and automated safety protocol enforcement. In this first part of a three part series, we provide a comprehensive introduction of the flexHEG system, including an overview of the governance and security capabilities it offers, its potential development and adoption paths, and the remaining challenges and limitations it faces. While technically challenging, flexHEGs offer an approach to address emerging regulatory and international security challenges in frontier AI development.

Paperid: 2389, https://arxiv.org/pdf/2506.15018.pdf

Abstract:
We study the problem of differentially private continual counting in the unbounded setting where the input size $n$ is not known in advance. Current state-of-the-art algorithms based on optimal instantiations of the matrix mechanism cannot be directly applied here because their privacy guarantees only hold when key parameters are tuned to $n$. Using the common `doubling trick' avoids knowledge of $n$ but leads to suboptimal and non-smooth error. We solve this problem by introducing novel matrix factorizations based on logarithmic perturbations of the function $\frac{1}{\sqrt{1-z}}$ studied in prior works, which may be of independent interest. The resulting algorithm has smooth error, and for any $Î±> 0$ and $t\leq n$ it is able to privately estimate the sum of the first $t$ data points with $O(\log^{2+2Î±}(t))$ variance. It requires $O(t)$ space and amortized $O(\log t)$ time per round, compared to $O(\log(n)\log(t))$ variance, $O(n)$ space and $O(n \log n)$ pre-processing time for the nearly-optimal bounded-input algorithm of Henzinger et al. (SODA 2023). Empirically, we find that our algorithm's performance is also comparable to theirs in absolute terms: our variance is less than $1.5\times$ theirs for $t$ as large as $2^{24}$.

Paperid: 2390, https://arxiv.org/pdf/2506.14340.pdf

Abstract:
This paper investigates the integration of quantum randomness into Verifiable Random Functions (VRFs) using the Ed25519 elliptic curve to strengthen cryptographic security. By replacing traditional pseudorandom number generators with quantum entropy sources, we assess the impact on key security and performance metrics, including execution time, and resource usage. Our approach simulates a modified VRF setup where initialization keys are derived from a quantum random number generator source (QRNG). The results show that while QRNGs could enhance the unpredictability and verifiability of VRFs, their incorporation introduces challenges related to temporal and computational overhead. This study provides valuable insights into the trade-offs of leveraging quantum randomness in API-driven cryptographic systems and offers a potential path toward more secure and efficient protocol design. The QRNG-based system shows increased (key generation times from 50 to 400+ microseconds, verification times from 500 to 3500 microseconds) and higher CPU usage (17% to 30%) compared to the more consistent performance of a Go-based VRF (key generation times below 200 microseconds, verification times under 2000 microseconds, CPU usage below 10%), highlighting trade-offs in computational efficiency and resource demands.

Paperid: 2391, https://arxiv.org/pdf/2506.13170.pdf

Abstract:
User profiling is crucial in providing personalised services, as it relies on analysing user behaviour and preferences to deliver targeted services. This approach enhances user experience and promotes heightened engagement. Nevertheless, user profiling also gives rise to noteworthy privacy considerations due to the extensive tracking and monitoring of personal data, potentially leading to surveillance or identity theft. We propose a dual-ring protection mechanism to protect user privacy by examining various threats to user privacy, such as behavioural attacks, profiling fingerprinting and monitoring, profile perturbation, etc., both on the user and service provider sides. We develop user profiles that contain sensitive private attributes and an equivalent profile based on differential privacy for evaluating personalised services. We determine the entropy of the resultant profiles during each update to protect profiling attributes and invoke various processes, such as data evaporation, to artificially increase entropy or destroy private profiling attributes. Furthermore, we use different variants of private information retrieval (PIR) to retrieve personalised services against differentially private profiles. We implement critical components of the proposed model via a proof-of-concept mobile app to demonstrate its applicability over a specific case study of advertising services, which can be generalised to other services. Our experimental results show that the observed processing delays with different PIR schemes are similar to the current advertising systems.

Paperid: 2392, https://arxiv.org/pdf/2506.12995.pdf

Abstract:
Open-source software (OSS) has become increasingly more popular across different domains. However, this rapid development and widespread adoption come with a security cost. The growing complexity and openness of OSS ecosystems have led to increased exposure to vulnerabilities and attack surfaces. This paper investigates the trends and patterns of reported vulnerabilities within OSS platforms, focusing on the implications of these findings for security practices. To understand the dynamics of OSS vulnerabilities, we analyze a comprehensive dataset comprising 31,267 unique vulnerability reports from GitHub's advisory database and Snyk.io, belonging to 14,675 packages across 10 programming languages. Our analysis reveals a significant surge in reported vulnerabilities, increasing at an annual rate of 98%, far outpacing the 25% average annual growth in the number of open-source software (OSS) packages. Additionally, we observe an 85% increase in the average lifespan of vulnerabilities across ecosystems during the studied period, indicating a potential decline in security. We identify the most prevalent Common Weakness Enumerations (CWEs) across programming languages and find that, on average, just seven CWEs are responsible for over 50% of all reported vulnerabilities. We further examine these commonly observed CWEs and highlight ecosystem-specific trends. Notably, we find that vulnerabilities associated with intentionally malicious packages comprise 49% of reports in the NPM ecosystem and 14% in PyPI, an alarming indication of targeted attacks within package repositories. We conclude with an in-depth discussion of the characteristics and attack vectors associated with these malicious packages.

Paperid: 2393, https://arxiv.org/pdf/2506.12994.pdf

Abstract:
Bilevel optimization, in which one optimization problem is nested inside another, underlies many machine learning applications with a hierarchical structure -- such as meta-learning and hyperparameter optimization. Such applications often involve sensitive training data, raising pressing concerns about individual privacy. Motivated by this, we study differentially private bilevel optimization. We first focus on settings where the outer-level objective is \textit{convex}, and provide novel upper and lower bounds on the excess risk for both pure and approximate differential privacy, covering both empirical and population-level loss. These bounds are nearly tight and essentially match the optimal rates for standard single-level differentially private ERM and stochastic convex optimization (SCO), up to additional terms that capture the intrinsic complexity of the nested bilevel structure. The bounds are achieved in polynomial time via efficient implementations of the exponential and regularized exponential mechanisms. A key technical contribution is a new method and analysis of log-concave sampling under inexact function evaluations, which may be of independent interest. In the \textit{non-convex} setting, we develop novel algorithms with state-of-the-art rates for privately finding approximate stationary points. Notably, our bounds do not depend on the dimension of the inner problem.

Paperid: 2394, https://arxiv.org/pdf/2506.12026.pdf

Abstract:
In many web applications, such as Content Delivery Networks (CDNs), TLS credentials are shared, e.g., between the website's TLS origin server and the CDN's edge servers, which can be distributed around the globe. To enhance the security and trust for TLS 1.3 in such scenarios, we propose LURK-T, a provably secure framework which allows for limited use of remote keys with added trust in TLS 1.3. We efficiently decouple the server side of TLS 1.3 into a LURK-T Crypto Service (CS) and a LURK-T Engine (E). CS executes all cryptographic operations in a Trusted Execution Environment (TEE), upon E's requests. CS and E together provide the whole TLS-server functionality. A major benefit of our construction is that it is application agnostic; the LURK-T Crypto Service could be collocated with the LURK-T Engine, or it could run on different machines. Thus, our design allows for in situ attestation and protection of the cryptographic side of the TLS server, as well as for all setups of CDNs over TLS. To support such a generic decoupling, we provide a full Application Programming Interface (API) for LURK-T. To this end, we implement our LURK-T Crypto Service using Intel SGX and integrate it with OpenSSL. We also test LURK-T's efficiency and show that, from a TLS-client's perspective, HTTPS servers using LURK-T instead a traditional TLS-server have no noticeable overhead when serving files greater than 1MB. In addition, we provide cryptographic proofs and formal security verification using ProVerif.

Paperid: 2395, https://arxiv.org/pdf/2506.11939.pdf

Abstract:
Vulnerability datasets used for ML testing implicitly contain retrospective information. When tested on the field, one can only use the labels available at the time of training and testing (e.g. seen and assumed negatives). As vulnerabilities are discovered across calendar time, labels change and past performance is not necessarily aligned with future performance. Past works only considered the slices of the whole history (e.g. DiverseVUl) or individual differences between releases (e.g. Jimenez et al. ESEC/FSE 2019). Such approaches are either too optimistic in training (e.g. the whole history) or too conservative (e.g. consecutive releases). We propose a method to restructure a dataset into a series of datasets in which both training and testing labels change to account for the knowledge available at the time. If the model is actually learning, it should improve its performance over time as more data becomes available and data becomes more stable, an effect that can be checked with the Mann-Kendall test. We validate our methodology for vulnerability detection with 4 time-based datasets (3 projects from BigVul dataset + Vuldeepecker's NVD) and 5 ML models (Code2Vec, CodeBERT, LineVul, ReGVD, and Vuldeepecker). In contrast to the intuitive expectation (more retrospective information, better performance), the trend results show that performance changes inconsistently across the years, showing that most models are not learning.

Paperid: 2396, https://arxiv.org/pdf/2506.11687.pdf

Abstract:
Machine learning models should not reveal particular information that is not otherwise accessible. Differential privacy provides a formal framework to mitigate privacy risks by ensuring that the inclusion or exclusion of any single data point does not significantly alter the output of an algorithm, thus limiting the exposure of private information. This survey paper explores the foundational definitions of differential privacy, reviews its original formulations and tracing its evolution through key research contributions. It then provides an in-depth examination of how DP has been integrated into machine learning models, analyzing existing proposals and methods to preserve privacy when training ML models. Finally, it describes how DP-based ML techniques can be evaluated in practice. %Finally, it discusses the broader implications of DP, highlighting its potential for public benefit, its real-world applications, and the challenges it faces, including vulnerabilities to adversarial attacks. By offering a comprehensive overview of differential privacy in machine learning, this work aims to contribute to the ongoing development of secure and responsible AI systems.

Paperid: 2397, https://arxiv.org/pdf/2506.11679.pdf

Abstract:
Modern life has witnessed the explosion of mobile devices. However, besides the valuable features that bring convenience to end users, security and privacy risks still threaten users of mobile apps. The increasing sophistication of these threats in recent years has underscored the need for more advanced and efficient detection approaches. In this chapter, we explore the application of Large Language Models (LLMs) to identify security risks and privacy violations and mitigate them for the mobile application ecosystem. By introducing state-of-the-art research that applied LLMs to mitigate the top 10 common security risks of smartphone platforms, we highlight the feasibility and potential of LLMs to replace traditional analysis methods, such as dynamic and hybrid analysis of mobile apps. As a representative example of LLM-based solutions, we present an approach to detect sensitive data leakage when users share images online, a common behavior of smartphone users nowadays. Finally, we discuss open research challenges.

Paperid: 2398, https://arxiv.org/pdf/2506.10194.pdf

Abstract:
Autocrats use secret police to stay in power, as these organizations deter and suppress opposition to their rule. Existing research shows that secret police are very good at this but, surprisingly, also that they are not as ubiquitous in autocracies as one may assume, existing in less than 50% of autocratic country-years. We thus explore under which conditions secret police emerge in dictatorships. For this purpose, we apply statistical variable selection techniques to identify which of several candidate variables extracted from the literature on state security forces and authoritarian survival hold explanatory power. Our results highlight that secret police are more likely to emerge when rulers face specific, preempt-able threats, such as protests and anti-system mobilisation, but also when they have the material resources to establish these organisations. This research contributes to our understanding of autocrats' institutional choices and authoritarian politics.

Paperid: 2399, https://arxiv.org/pdf/2506.09580.pdf

Abstract:
When `cyber' is used as a prefix, attention is typically drawn to the technological and spectacular aspects of war and conflict -- and, by extension, security. We offer a different approach to engaging with and understanding security in such contexts, by foregrounding the everyday -- mundane -- experiences of security within communities living with and fleeing from war. We do so through three vignettes from our field research in Colombia, Lebanon and Sweden, respectively, and by highlighting the significance of ethnography for security research with communities living in regions afflicted by war. We conclude by setting out a call to action for security researchers and practitioners to consider such lived experiences in the design of security technology that aims to cater to the needs of communities in `global conflict and disaster regions'.

Paperid: 2400, https://arxiv.org/pdf/2506.09559.pdf

Abstract:
The computing continuum introduces new challenges for access control due to its dynamic, distributed, and heterogeneous nature. In this paper, we propose a Zero-Trust (ZT) access control solution that leverages decentralized identification and authentication mechanisms based on Decentralized Identifiers (DIDs) and Verifiable Credentials (VCs). Additionally, we employ Relationship-Based Access Control (ReBAC) to define policies that capture the evolving trust relationships inherent in the continuum. Through a proof-of-concept implementation, we demonstrate the feasibility and efficiency of our solution, highlighting its potential to enhance security and trust in decentralized environments.

Paperid: 2401, https://arxiv.org/pdf/2506.09464.pdf

Abstract:
Elliptic curve cryptography (ECC) has emerged as the dominant public-key protocol, with NIST standardizing parameters for binary field GF(2^m) ECC systems. This work presents a hardware implementation of a Hybrid Multiplication technique for modular multiplication over binary field GF(2m), targeting NIST B-163, 233, 283, and 571 parameters. The design optimizes the combination of conventional multiplication (CM) and Karatsuba multiplication (KM) to enhance elliptic curve point multiplication (ECPM). The key innovation uses CM for smaller operands (up to 41 bits for m=163) and KM for larger ones, reducing computational complexity and enhancing efficiency. The design is evaluated in three areas: Resource Utilization For m=163, the hybrid design uses 6,812 LUTs, a 39.82% reduction compared to conventional methods. For m=233, LUT usage reduces by 45.53% and 70.70% compared to overlap-free and bit-parallel implementations. Delay Performance For m=163, achieves 13.31ns delay, improving by 37.60% over bit-parallel implementations. For m=233, maintains 13.39ns delay. Area-Delay Product For m=163, achieves ADP of 90,860, outperforming bit-parallel (75,337) and digit-serial (43,179) implementations. For m=233, demonstrates 16.86% improvement over overlap-free and 96.10% over bit-parallel designs. Results show the hybrid technique significantly improves speed, hardware efficiency, and resource utilization for ECC cryptographic systems.

Paperid: 2402, https://arxiv.org/pdf/2506.08991.pdf

Abstract:
Generative models, particularly diffusion-based text-to-image (T2I) models, have demonstrated astounding success. However, aligning them to avoid generating content with unacceptable concepts (e.g., offensive or copyrighted content, or celebrity likenesses) remains a significant challenge. Concept replacement techniques (CRTs) aim to address this challenge, often by trying to "erase" unacceptable concepts from models. Recently, model providers have started offering image editing services which accept an image and a text prompt as input, to produce an image altered as specified by the prompt. These are known as image-to-image (I2I) models. In this paper, we first use an I2I model to empirically demonstrate that today's state-of-the-art CRTs do not in fact erase unacceptable concepts. Existing CRTs are thus likely to be ineffective in emerging I2I scenarios, despite their proven ability to remove unwanted concepts in T2I pipelines, highlighting the need to understand this discrepancy between T2I and I2I settings. Next, we argue that a good CRT, while replacing unacceptable concepts, should preserve other concepts specified in the inputs to generative models. We call this fidelity. Prior work on CRTs have neglected fidelity in the case of unacceptable concepts. Finally, we propose the use of targeted image-editing techniques to achieve both effectiveness and fidelity. We present such a technique, AntiMirror, and demonstrate its viability.

Paperid: 2403, https://arxiv.org/pdf/2506.07988.pdf

Abstract:
Ethereum's transaction pool (mempool) dynamics and fee market efficiency critically affect transaction inclusion, validator workload, and overall network performance. This research empirically analyzes gas price variations, mempool clearance rates, and block finalization times in Ethereum's proof-of-stake ecosystem using real-time data from Geth and Prysm nodes. We observe that high-fee transactions are consistently prioritized, while low-fee transactions face delays or exclusion despite EIP-1559's intended improvements. Mempool congestion remains a key factor in validator efficiency and proposal latency. We provide empirical evidence of persistent fee-based disparities and show that extremely high fees do not always guarantee faster confirmation, revealing inefficiencies in the current fee market. To address these issues, we propose congestion-aware fee adjustments, reserved block slots for low-fee transactions, and improved handling of out-of-gas vulnerabilities. By mitigating prioritization bias and execution inefficiencies, our findings support more equitable transaction inclusion, enhance validator performance, and promote scalability. This work contributes to Ethereum's long-term decentralization by reducing dependence on high transaction fees for network participation.

Paperid: 2404, https://arxiv.org/pdf/2506.07882.pdf

Abstract:
A Network Intrusion Detection System (NIDS) monitors networks for cyber attacks and other unwanted activities. However, NIDS solutions often generate an overwhelming number of alerts daily, making it challenging for analysts to prioritize high-priority threats. While deep learning models promise to automate the prioritization of NIDS alerts, the lack of transparency in these models can undermine trust in their decision-making. This study highlights the critical need for explainable artificial intelligence (XAI) in NIDS alert classification to improve trust and interpretability. We employed a real-world NIDS alert dataset from Security Operations Center (SOC) of TalTech (Tallinn University Of Technology) in Estonia, developing a Long Short-Term Memory (LSTM) model to prioritize alerts. To explain the LSTM model's alert prioritization decisions, we implemented and compared four XAI methods: Local Interpretable Model-Agnostic Explanations (LIME), SHapley Additive exPlanations (SHAP), Integrated Gradients, and DeepLIFT. The quality of these XAI methods was assessed using a comprehensive framework that evaluated faithfulness, complexity, robustness, and reliability. Our results demonstrate that DeepLIFT consistently outperformed the other XAI methods, providing explanations with high faithfulness, low complexity, robust performance, and strong reliability. In collaboration with SOC analysts, we identified key features essential for effective alert classification. The strong alignment between these analyst-identified features and those obtained by the XAI methods validates their effectiveness and enhances the practical applicability of our approach.

Paperid: 2405, https://arxiv.org/pdf/2506.06635.pdf

Abstract:
Modern vehicles are equipped with numerous in-vehicle components that interact with the external environment through remote communications and services, such as Bluetooth and vehicle-to-infrastructure communication. These components form a network, exchanging information to ensure the proper functioning of the vehicle. However, the presence of false or fabricated information can disrupt the vehicle's performance. Given that these components are interconnected, erroneous data can propagate throughout the network, potentially affecting other components and leading to catastrophic consequences. To address this issue, we propose TrustConnect, a framework designed to assess the trustworthiness of a vehicle's in-vehicle network by evaluating the trust levels of individual components under various network configurations. The proposed framework leverages the interdependency of all the vehicle's components, along with the correlation of their values and their vulnerability to remote injection based on the outside exposure of each component, to determine the reliability of the in-vehicle network. The effectiveness of the proposed framework has been validated through programming simulations conducted across various scenarios using a random distribution of an in-vehicle network graph generated with the Networkx package in Python.

Paperid: 2406, https://arxiv.org/pdf/2506.04838.pdf

Abstract:
The complexity of modern computing environments and the growing sophistication of cyber threats necessitate a more robust, adaptive, and automated approach to security enforcement. In this paper, we present a framework leveraging large language models (LLMs) for automating attack mitigation policy compliance through an innovative combination of in-context learning and retrieval-augmented generation (RAG). We begin by describing how our system collects and manages both tool and API specifications, storing them in a vector database to enable efficient retrieval of relevant information. We then detail the architectural pipeline that first decomposes high-level mitigation policies into discrete tasks and subsequently translates each task into a set of actionable API calls. Our empirical evaluation, conducted using publicly available CTI policies in STIXv2 format and Windows API documentation, demonstrates significant improvements in precision, recall, and F1-score when employing RAG compared to a non-RAG baseline.

Paperid: 2407, https://arxiv.org/pdf/2506.03940.pdf

Abstract:
In blockchain networks, so-called "full nodes" serve data to and relay transactions from clients through an RPC interface. This serving layer enables integration of "Web3" data, stored on blockchains, with "Web2" mobile or web applications that cannot directly participate as peers in a blockchain network. In practice, the serving layer is dominated by a small number of centralized services ("node providers") that offer permissioned access to RPC endpoints. Clients register with these providers because they offer reliable and convenient access to blockchain data: operating a full node themselves requires significant computational and storage resources, and public (permissionless) RPC nodes lack financial incentives to serve large numbers of clients with consistent performance. Permissioned access to an otherwise permissionless blockchain network raises concerns regarding the privacy, integrity, and availability of data access. To address this, we propose a Permissionless Accountable RPC Protocol (PARP). It enables clients and full nodes to interact pseudonymously while keeping both parties accountable. PARP leverages "light client" schemes for essential data integrity checks, combined with fraud proofs, to keep full nodes honest and accountable. It integrates payment channels to facilitate micro-payments, holding clients accountable for the resources they consume and providing an economic incentive for full nodes to serve. Our prototype implementation for Ethereum demonstrates the feasibility of PARP, and we quantify its overhead compared to the base RPC protocol.

Paperid: 2408, https://arxiv.org/pdf/2506.03409.pdf

Abstract:
Frontier AI models pose increasing risks to public safety and international security, creating a pressing need for AI developers to provide credible guarantees about their development activities without compromising proprietary information. We propose Flexible Hardware-Enabled Guarantees (flexHEG), a system integrated with AI accelerator hardware to enable verifiable claims about compute usage in AI development. The flexHEG system consists of two primary components: an auditable Guarantee Processor that monitors accelerator usage and verifies compliance with specified rules, and a Secure Enclosure that provides physical tamper protection. In this second report of a three part series, we analyze technical implementation options ranging from firmware modifications to custom hardware approaches, with focus on an "Interlock" design that provides the Guarantee Processor direct access to accelerator data paths. Our proposed architecture could support various guarantee types, from basic usage auditing to sophisticated automated verification. This work establishes technical foundations for hardware-based AI governance mechanisms that could address emerging regulatory and international security needs in frontier AI development.

Paperid: 2409, https://arxiv.org/pdf/2506.03207.pdf

Abstract:
Federated Learning (FL) is increasingly adopted as a decentralized machine learning paradigm due to its capability to preserve data privacy by training models without centralizing user data. However, FL is susceptible to indirect privacy breaches via network traffic analysis-an area not explored in existing research. The primary objective of this research is to study the feasibility of fingerprinting deep learning models deployed within FL environments by analyzing their network-layer traffic information. In this paper, we conduct an experimental evaluation using various deep learning architectures (i.e., CNN, RNN) within a federated learning testbed. We utilize machine learning algorithms, including Support Vector Machines (SVM), Random Forest, and Gradient-Boosting, to fingerprint unique patterns within the traffic data. Our experiments show high fingerprinting accuracy, achieving 100% accuracy using Random Forest and around 95.7% accuracy using SVM and Gradient Boosting classifiers. This analysis suggests that we can identify specific architectures running within the subsection of the network traffic. Hence, if an adversary knows about the underlying DL architecture, they can exploit that information and conduct targeted attacks. These findings suggest a notable security vulnerability in FL systems and the necessity of strengthening it at the network level.

Paperid: 2410, https://arxiv.org/pdf/2506.02438.pdf

Abstract:
IDS aims to protect computer networks from security threats by detecting, notifying, and taking appropriate action to prevent illegal access and protect confidential information. As the globe becomes increasingly dependent on technology and automated processes, ensuring secured systems, applications, and networks has become one of the most significant problems of this era. The global web and digital technology have significantly accelerated the evolution of the modern world, necessitating the use of telecommunications and data transfer platforms. Researchers are enhancing the effectiveness of IDS by incorporating popular datasets into machine learning algorithms. IDS, equipped with machine learning classifiers, enhances security attack detection accuracy by identifying normal or abnormal network traffic. This paper explores the methods of capturing and reviewing intrusion detection systems (IDS) and evaluates the challenges existing datasets face. A deluge of research on machine learning (ML) and deep learning (DL) architecture-based intrusion detection techniques has been conducted in the past ten years on various cybersecurity datasets, including KDDCUP'99, NSL-KDD, UNSW-NB15, CICIDS-2017, and CSE-CIC-IDS2018. We conducted a literature review and presented an in-depth analysis of various intrusion detection methods that use SVM, KNN, DT, LR, NB, RF, XGBOOST, Adaboost, and ANN. We provide an overview of each technique, explaining the role of the classifiers and algorithms used. A detailed tabular analysis highlights the datasets used, classifiers employed, attacks detected, evaluation metrics, and conclusions drawn. This article offers a thorough review for future IDS research.

Paperid: 2411, https://arxiv.org/pdf/2506.02156.pdf

Abstract:
The distributed nature of local differential privacy (LDP) invites data poisoning attacks and poses unforeseen threats to the underlying LDP-supported applications. In this paper, we propose a comprehensive mitigation framework for popular frequency estimation, which contains a suite of novel defenses, including malicious user detection, attack pattern recognition, and damaged utility recovery. In addition to existing attacks, we explore new adaptive adversarial activities for our mitigation design. For detection, we present a new method to precisely identify bogus reports and thus LDP aggregation can be performed over the ``clean'' data. When the attack behavior becomes stealthy and direct filtering out malicious users is difficult, we further propose a detection that can effectively recognize hidden adversarial patterns, thus facilitating the decision-making of service providers. These detection methods require no additional data and attack information and incur minimal computational cost. Our experiment demonstrates their excellent performance and substantial improvement over previous work in various settings. In addition, we conduct an empirical analysis of LDP post-processing for corrupted data recovery and propose a new post-processing method, through which we reveal new insights into protocol recommendations in practice and key design principles for future research.

Paperid: 2412, https://arxiv.org/pdf/2506.02048.pdf

Abstract:
We present 'Random-Crypto', a procedurally generated cryptographic Capture The Flag (CTF) dataset designed to unlock the potential of Reinforcement Learning (RL) for LLM-based agents in security-sensitive domains. Cryptographic reasoning offers an ideal RL testbed: it combines precise validation, structured multi-step inference, and reliance on reliable computational tool use. Leveraging these properties, we fine-tune a Python tool-augmented Llama-3.1-8B via Group Relative Policy Optimization (GRPO) in a secure execution environment. The resulting agent achieves a significant improvement in Pass@8 on previously unseen challenges. Moreover, the improvements generalize to two external benchmarks: 'picoCTF', spanning both crypto and non-crypto tasks, and 'AICrypto MCQ', a multiple-choice benchmark of 135 cryptography questions. Ablation studies attribute the gains to enhanced tool usage and procedural reasoning. These findings position 'Random-Crypto' as a rich training ground for building intelligent, adaptable LLM agents capable of handling complex cybersecurity tasks.

Paperid: 2413, https://arxiv.org/pdf/2506.02043.pdf

Abstract:
Containerization, driven by Docker, has transformed application development and deployment by enhancing efficiency and scalability. However, the rapid adoption of container technologies introduces significant security challenges that require careful management. This paper investigates key areas of container security, including runtime protection, network safeguards, configuration best practices, supply chain security, and comprehensive monitoring and logging solutions. We identify common vulnerabilities within these domains and provide actionable recommendations to address and mitigate these risks. By integrating security throughout the Software Development Lifecycle (SDLC), organizations can reinforce their security posture, creating a resilient and reliable containerized application infrastructure that withstands evolving threats.

Paperid: 2414, https://arxiv.org/pdf/2506.01907.pdf

Abstract:
Privacy-preserving data publication, including synthetic data sharing, often experiences trade-offs between privacy and utility. Synthetic data is generally more effective than data anonymization in balancing this trade-off, however, not without its own challenges. Synthetic data produced by generative models trained on source data may inadvertently reveal information about outliers. Techniques specifically designed for preserving privacy, such as introducing noise to satisfy differential privacy, often incur unpredictable and significant losses in utility. In this work we show that, with the right mechanism of synthetic data generation, we can achieve strong privacy protection without significant utility loss. Synthetic data generators producing contracting data patterns, such as Synthetic Minority Over-sampling Technique (SMOTE), can enhance a differentially private data generator, leveraging the strengths of both. We prove in theory and through empirical demonstration that this SMOTE-DP technique can produce synthetic data that not only ensures robust privacy protection but maintains utility in downstream learning tasks.

Paperid: 2415, https://arxiv.org/pdf/2506.01885.pdf

Abstract:
Smart contracts, the cornerstone of blockchain technology, enable secure, automated distributed execution. Given their role in handling large transaction volumes across clients, miners, and validators, exploring concurrency is critical. This includes concurrent transaction execution or validation within blocks, block processing across shards, and miner competition to select and persist transactions. Concurrency and parallelism are a double-edged sword: while they improve throughput, they also introduce risks like race conditions, non-determinism, and vulnerabilities such as deadlock and livelock. This paper presents the first survey of concurrency in smart contracts, offering a systematic literature review organized into key dimensions. First, it establishes a taxonomy of concurrency levels in blockchain systems and discusses proposed solutions for future adoption. Second, it examines vulnerabilities, attacks, and countermeasures in concurrent operations, emphasizing the need for correctness and security. Crucially, we reveal a flawed concurrency assumption in a major research category, which has led to widespread misinterpretation. This work aims to correct that and guide future research toward more accurate models. Finally, we identify gaps in each category to outline future research directions and support blockchain's advancement.

Paperid: 2416, https://arxiv.org/pdf/2506.01848.pdf

Abstract:
The advent of Big Data has made the collection and analysis of cyber threat intelligence challenging due to its volume, leading research to focus on identifying key threat actors; yet these studies have failed to consider the technical expertise of these actors. Expertise, especially towards specific attack patterns, is crucial for cybercrime intelligence, as it focuses on targeting actors with the knowledge and skills to attack enterprises. Using CVEs and CAPEC classifications to build a bimodal network, as well as community detection, k-means and a criminological framework, this study addresses the key hacker identification problem by identifying communities interested in specific attack patterns across cybercrime forums and their related key expert actors. The analyses reveal several key contributions. First, the community structure of the CAPEC-actor bimodal network shows that there exists groups of actors interested in similar attack patterns across cybercrime forums. Second, key actors identified in this study account for about 4% of the study population. Third, about half of the study population are amateurs who show little technical expertise. Finally, key actors highlighted in this study represent a promising scarcity for resources allocation in cyber threat intelligence production. Further research should look into how they develop and use their technical expertise in cybercrime forums.

Paperid: 2417, https://arxiv.org/pdf/2506.01700.pdf

Abstract:
The proliferation of digital carriers that can be exploited to conceal arbitrary data has greatly increased the number of techniques for implementing network steganography. As a result, the literature overlaps greatly in terms of concepts and terminology. Moreover, from a cybersecurity viewpoint, the same hiding mechanism may be perceived differently, making harder the development of a unique defensive strategy or the definition of practices to mitigate risks arising from the use of steganography. To mitigate these drawbacks, several researchers introduced approaches that aid in the unified description of steganography methods and network covert channels. Understanding and combining all descriptive methods for steganography techniques is a challenging but important task. For instance, researchers might want to explain how malware applies a certain steganography technique or categorize a novel hiding approach. Consequently, this paper aims to provide an introduction to the concept of descriptive methods for steganography. The paper is organized in the form of a tutorial, with the main goal of explaining how existing descriptions and taxonomy objects can be combined to achieve a detailed categorization and description of hiding methods. To show how this can effectively help the research community, the paper also contains various real-world examples.

Paperid: 2418, https://arxiv.org/pdf/2506.00821.pdf

Abstract:
Genomic Foundation Models (GFMs), such as Evolutionary Scale Modeling (ESM), have demonstrated significant success in variant effect prediction. However, their adversarial robustness remains largely unexplored. To address this gap, we propose SafeGenes: a framework for Secure analysis of genomic foundation models, leveraging adversarial attacks to evaluate robustness against both engineered near-identical adversarial Genes and embedding-space manipulations. In this study, we assess the adversarial vulnerabilities of GFMs using two approaches: the Fast Gradient Sign Method (FGSM) and a soft prompt attack. FGSM introduces minimal perturbations to input sequences, while the soft prompt attack optimizes continuous embeddings to manipulate model predictions without modifying the input tokens. By combining these techniques, SafeGenes provides a comprehensive assessment of GFM susceptibility to adversarial manipulation. Targeted soft prompt attacks led to substantial performance degradation, even in large models such as ESM1b and ESM1v. These findings expose critical vulnerabilities in current foundation models, opening new research directions toward improving their security and robustness in high-stakes genomic applications such as variant effect prediction.

Paperid: 2419, https://arxiv.org/pdf/2506.00566.pdf

Abstract:
Multiparty private set intersection (MPSI) allows multiple participants to compute the intersection of their locally owned data sets without revealing them. MPSI protocols can be categorized based on the network topology of nodes, with the star, mesh, and ring topologies being the primary types, respectively. Given that star and mesh topologies dominate current implementations, most existing MPSI protocols are based on these two topologies. However, star-topology MPSI protocols suffer from high leader node load, while mesh topology protocols suffer from high communication complexity and overhead. In this paper, we first propose a multi-point sequential oblivious pseudorandom function (MP-SOPRF) in a multi-party setting. Based on MP-SOPRF, we then develop an MPSI protocol with a ring topology, addressing the challenges of communication and computational overhead in existing protocols. We prove that our MPSI protocol is semi-honest secure under the Hamming correlation robustness assumption. Our experiments demonstrate that our MPSI protocol outperforms state-of-the-art protocols, achieving a reduction of 74.8% in communication and a 6% to 287% improvement in computational efficiency.

Paperid: 2420, https://arxiv.org/pdf/2506.00281.pdf

Abstract:
Retrieval-Augmented Generation (RAG) systems, which integrate Large Language Models (LLMs) with external knowledge sources, are vulnerable to a range of adversarial attack vectors. This paper examines the importance of RAG systems through recent industry adoption trends and identifies the prominent attack vectors for RAG: prompt injection, data poisoning, and adversarial query manipulation. We analyze these threats under risk management lens, and propose robust prioritized control list that includes risk-mitigating actions like input validation, adversarial training, and real-time monitoring.

Paperid: 2421, https://arxiv.org/pdf/2506.00124.pdf

Abstract:
Quantum cryptographic protocols do not rely only on quantum-physical resources, they also require reliable classical communication and computation. In particular, the secrecy of any quantum key distribution protocol critically depends on the correct execution of the privacy amplification step. This is a classical post-processing procedure transforming a partially secret bit string, known to be somewhat correlated with an adversary, into a shorter bit string that is close to uniform and independent of the adversary's knowledge. It is typically implemented using randomness extractors. Standardization efforts in quantum cryptography have focused on the security of physical devices and quantum operations. Future efforts should also consider all algorithms used in classical post-processing, especially in privacy amplification, due to its critical role in ensuring the final security of the key. We present randextract, a reference library to test and validate privacy amplification implementations.

Paperid: 2422, https://arxiv.org/pdf/2505.23825.pdf

Abstract:
We investigate a new form of (privacy-preserving) inconsistency measurement for multi-party communication. Intuitively, for two knowledge bases K_A, K_B (of two agents A, B), our results allow to quantitatively assess the degree of inconsistency for K_A U K_B without having to reveal the actual contents of the knowledge bases. Using secure multi-party computation (SMPC) and cryptographic protocols, we develop two concrete methods for this use-case and show that they satisfy important properties of SMPC protocols -- notably, input privacy, i.e., jointly computing the inconsistency degree without revealing the inputs.

Paperid: 2423, https://arxiv.org/pdf/2505.23805.pdf

Abstract:
This paper introduces the Adaptive Defense Agent (ADA), an innovative Automated Moving Target Defense (AMTD) system designed to fundamentally enhance the security posture of AI workloads. ADA operates by continuously and automatically rotating these workloads at the infrastructure level, leveraging the inherent ephemerality of Kubernetes pods. This constant managed churn systematically invalidates attacker assumptions and disrupts potential kill chains by regularly destroying and respawning AI service instances. This methodology, applying principles of chaos engineering as a continuous, proactive defense, offers a paradigm shift from traditional static defenses that rely on complex and expensive confidential or trusted computing solutions to secure the underlying compute platforms, while at the same time agnostically supporting the latest advancements in agentic and nonagentic AI ecosystems and solutions such as agent-to-agent (A2A) communication frameworks or model context protocols (MCP). This AI-native infrastructure design, relying on the widely proliferated cloud-native Kubernetes technologies, facilitates easier deployment, simplifies maintenance through an inherent zero trust posture achieved by rotation, and promotes faster adoption. We posit that ADA's novel approach to AMTD provides a more robust, agile, and operationally efficient zero-trust model for AI services, achieving security through proactive environmental manipulation rather than reactive patching.

Paperid: 2424, https://arxiv.org/pdf/2505.23803.pdf

Abstract:
Phishing email detection faces critical challenges from evolving adversarial tactics and heterogeneous attack patterns. Traditional detection methods, such as rule-based filters and denylists, often struggle to keep pace with these evolving tactics, leading to false negatives and compromised security. While machine learning approaches have improved detection accuracy, they still face challenges adapting to novel phishing strategies. We present MultiPhishGuard, a dynamic LLM-based multi-agent detection system that synergizes specialized expertise with adversarial-aware reinforcement learning. Our framework employs five cooperative agents (text, URL, metadata, explanation simplifier, and adversarial agents) with automatically adjusted decision weights powered by a Proximal Policy Optimization reinforcement learning algorithm. To address emerging threats, we introduce an adversarial training loop featuring an adversarial agent that generates subtle context-aware email variants, creating a self-improving defense ecosystem and enhancing system robustness. Experimental evaluations on public datasets demonstrate that MultiPhishGuard significantly outperforms Chain-of-Thoughts, single-agent baselines and state-of-the-art detectors, as validated by ablation studies and comparative analyses. Experiments demonstrate that MultiPhishGuard achieves high accuracy (97.89\%) with low false positive (2.73\%) and false negative rates (0.20\%). Additionally, we incorporate an explanation simplifier agent, which provides users with clear and easily understandable explanations for why an email is classified as phishing or legitimate. This work advances phishing defense through dynamic multi-agent collaboration and generative adversarial resilience.

Paperid: 2425, https://arxiv.org/pdf/2505.23020.pdf

Abstract:
The acquisition of agentic capabilities has transformed LLMs from "knowledge providers" to "action executors", a trend that while expanding LLMs' capability boundaries, significantly increases their susceptibility to malicious use. Previous work has shown that current LLM-based agents execute numerous malicious tasks even without being attacked, indicating a deficiency in agentic use safety alignment during the post-training phase. To address this gap, we propose AgentAlign, a novel framework that leverages abstract behavior chains as a medium for safety alignment data synthesis. By instantiating these behavior chains in simulated environments with diverse tool instances, our framework enables the generation of highly authentic and executable instructions while capturing complex multi-step dynamics. The framework further ensures model utility by proportionally synthesizing benign instructions through non-malicious interpretations of behavior chains, precisely calibrating the boundary between helpfulness and harmlessness. Evaluation results on AgentHarm demonstrate that fine-tuning three families of open-source models using our method substantially improves their safety (35.8% to 79.5% improvement) while minimally impacting or even positively enhancing their helpfulness, outperforming various prompting methods. The dataset and code have both been open-sourced.

Paperid: 2426, https://arxiv.org/pdf/2505.21733.pdf

Abstract:
Online data scraping has taken on new dimensions in recent years, as traditional scrapers have been joined by new AI-specific bots. To counteract unwanted scraping, many sites use tools like the Robots Exclusion Protocol (REP), which places a robots$.$txt file at the site root to dictate scraper behavior. Yet, the efficacy of the REP is not well-understood. Anecdotal evidence suggests some bots comply poorly with it, but no rigorous study exists to support (or refute) this claim. To understand the merits and limits of the REP, we conduct the first large-scale study of web scraper compliance with robots$.$txt directives using anonymized web logs from our institution. We analyze the behavior of 130 self-declared bots (and many anonymous ones) over 40 days, using a series of controlled robots$.$txt experiments. We find that bots are less likely to comply with stricter robots$.$txt directives, and that certain categories of bots, including AI search crawlers, rarely check robots$.$txt at all. These findings suggest that relying on robots$.$txt files to prevent unwanted scraping is risky and highlight the need for alternative approaches.

Paperid: 2427, https://arxiv.org/pdf/2505.21642.pdf

Abstract:
Supply chain attacks have emerged as a prominent cybersecurity threat in recent years. Reproducible and bootstrappable builds have the potential to reduce such attacks significantly. In combination with independent, exhaustive and periodic source code audits, these measures can effectively eradicate compromises in the building process. In this paper we introduce both concepts, we analyze the achievements over the last ten years and explain the remaining challenges. We contribute to the reproducible builds effort by setting up a rebuilder and verifier instance to test the reproducibility of Arch Linux packages. Using the results from this instance, we uncover an unnoticed and security-relevant packaging issue affecting 16 packages related to Certbot, the recommended software to install TLS certificates from Let's Encrypt, making them unreproducible. Additionally, we find the root cause of unreproduciblity in the source code of fwupd, a critical software used to update device firmware on Linux devices, and submit an upstream patch to fix it.

Paperid: 2428, https://arxiv.org/pdf/2505.21462.pdf

Abstract:
The growing complexity of encrypted network traffic presents dual challenges for modern network management: accurate multiclass classification of known applications and reliable detection of unknown traffic patterns. Although deep learning models show promise in controlled environments, their real-world deployment is hindered by data scarcity, concept drift, and operational constraints. This paper proposes M3S-UPD, a novel Multi-Stage Self-Supervised Unknown-aware Packet Detection framework that synergistically integrates semi-supervised learning with representation analysis. Our approach eliminates artificial segregation between classification and detection tasks through a four-phase iterative process: 1) probabilistic embedding generation, 2) clustering-based structure discovery, 3) distribution-aligned outlier identification, and 4) confidence-aware model updating. Key innovations include a self-supervised unknown detection mechanism that requires neither synthetic samples nor prior knowledge, and a continuous learning architecture that is resistant to performance degradation. Experimental results show that M3S-UPD not only outperforms existing methods on the few-shot encrypted traffic classification task, but also simultaneously achieves competitive performance on the zero-shot unknown traffic discovery task.

Paperid: 2429, https://arxiv.org/pdf/2505.21051.pdf

Abstract:
Federated fine-tuning of large language models (LLMs) is critical for improving their performance in handling domain-specific tasks. However, prior work has shown that clients' private data can actually be recovered via gradient inversion attacks. Existing privacy preservation techniques against such attacks typically entail performance degradation and high costs, making them ill-suited for clients with heterogeneous data distributions and device capabilities. In this paper, we propose SHE-LoRA, which integrates selective homomorphic encryption (HE) and low-rank adaptation (LoRA) to enable efficient and privacy-preserving federated tuning of LLMs in cross-device environment. Heterogeneous clients adaptively select partial model parameters for homomorphic encryption based on parameter sensitivity assessment, with the encryption subset obtained via negotiation. To ensure accurate model aggregation, we design a column-aware secure aggregation method and customized reparameterization techniques to align the aggregation results with the heterogeneous device capabilities of clients. Extensive experiments demonstrate that SHE-LoRA maintains performance comparable to non-private baselines, achieves strong resistance to the state-of-the-art attacks, and significantly reduces communication overhead by 94.901\% and encryption computation overhead by 99.829\%, compared to baseline. Our code is accessible at https://anonymous.4open.science/r/SHE-LoRA-8D84.

Paperid: 2430, https://arxiv.org/pdf/2505.19915.pdf

Abstract:
As AI systems become increasingly capable, understanding their offensive cyber potential is critical for informed governance and responsible deployment. However, it's hard to accurately bound their capabilities, and some prior evaluations dramatically underestimated them. The art of extracting maximum task-specific performance from AIs is called "AI elicitation", and today's safety organizations typically conduct it in-house. In this paper, we explore crowdsourcing elicitation efforts as an alternative to in-house elicitation work. We host open-access AI tracks at two Capture The Flag (CTF) competitions: AI vs. Humans (400 teams) and Cyber Apocalypse (8000 teams). The AI teams achieve outstanding performance at both events, ranking top-5% and top-10% respectively for a total of \$7500 in bounties. This impressive performance suggests that open-market elicitation may offer an effective complement to in-house elicitation. We propose elicitation bounties as a practical mechanism for maintaining timely, cost-effective situational awareness of emerging AI capabilities. Another advantage of open elicitations is the option to collect human performance data at scale. Applying METR's methodology, we found that AI agents can reliably solve cyber challenges requiring one hour or less of effort from a median human CTF participant.

Paperid: 2431, https://arxiv.org/pdf/2505.19864.pdf

Abstract:
Retrieval-Augmented Generation (RAG) enhances large language models (LLMs) by incorporating external knowledge, but its openness introduces vulnerabilities that can be exploited by poisoning attacks. Existing poisoning methods for RAG systems have limitations, such as poor generalization and lack of fluency in adversarial texts. In this paper, we propose CPA-RAG, a black-box adversarial framework that generates query-relevant texts capable of manipulating the retrieval process to induce target answers. The proposed method integrates prompt-based text generation, cross-guided optimization through multiple LLMs, and retriever-based scoring to construct high-quality adversarial samples. We conduct extensive experiments across multiple datasets and LLMs to evaluate its effectiveness. Results show that the framework achieves over 90\% attack success when the top-k retrieval setting is 5, matching white-box performance, and maintains a consistent advantage of approximately 5 percentage points across different top-k values. It also outperforms existing black-box baselines by 14.5 percentage points under various defense strategies. Furthermore, our method successfully compromises a commercial RAG system deployed on Alibaba's BaiLian platform, demonstrating its practical threat in real-world applications. These findings underscore the need for more robust and secure RAG frameworks to defend against poisoning attacks.

Paperid: 2432, https://arxiv.org/pdf/2505.19840.pdf

Abstract:
Deep Neural Networks (DNNs) have achieved widespread success yet remain prone to adversarial attacks. Typically, such attacks either involve frequent queries to the target model or rely on surrogate models closely mirroring the target model -- often trained with subsets of the target model's training data -- to achieve high attack success rates through transferability. However, in realistic scenarios where training data is inaccessible and excessive queries can raise alarms, crafting adversarial examples becomes more challenging. In this paper, we present UnivIntruder, a novel attack framework that relies solely on a single, publicly available CLIP model and publicly available datasets. By using textual concepts, UnivIntruder generates universal, transferable, and targeted adversarial perturbations that mislead DNNs into misclassifying inputs into adversary-specified classes defined by textual concepts. Our extensive experiments show that our approach achieves an Attack Success Rate (ASR) of up to 85% on ImageNet and over 99% on CIFAR-10, significantly outperforming existing transfer-based methods. Additionally, we reveal real-world vulnerabilities, showing that even without querying target models, UnivIntruder compromises image search engines like Google and Baidu with ASR rates up to 84%, and vision language models like GPT-4 and Claude-3.5 with ASR rates up to 80%. These findings underscore the practicality of our attack in scenarios where traditional avenues are blocked, highlighting the need to reevaluate security paradigms in AI applications.

Paperid: 2433, https://arxiv.org/pdf/2505.19174.pdf

Abstract:
Penetration testing refers to the process of simulating hacker attacks to evaluate the security of information systems . This study aims not only to clarify the theoretical foundations of penetration testing but also to explain and demonstrate the complete testing process, including how network system administrators may simulate attacks using various penetration testing methods. Methodologically, the paper outlines the five basic stages of a typical penetration test: intelligence gathering, vulnerability scanning, vulnerability exploitation, privilege escalation, and post-exploitation activities. In each phase, specific tools and techniques are examined in detail, along with practical guidance on their use. To enhance the practical relevance of the study, the paper also presents a real-life case study, illustrating how a complete penetration test is conducted in a real-world environment. Through this case, readers can gain insights into the detailed procedures and applied techniques, thereby deepening their understanding of the practical value of penetration testing. Finally, the paper summarizes the importance and necessity of penetration testing in securing information systems and maintaining network integrity, and it explores future trends and development directions for the field. Overall, the findings of this paper offer valuable references for both researchers and practitioners, contributing meaningfully to the improvement of penetration testing practices and the advancement of cybersecurity as a whole.

Paperid: 2434, https://arxiv.org/pdf/2505.18861.pdf

Abstract:
Unauthorized credit inquiries are also a central entry point for identity theft, with Social Security Numbers (SSNs) being widely utilized in fraudulent cases. Traditional credit inquiry systems do not usually possess strict user authentication, making them vulnerable to unauthorized access. This paper proposes a real-time user authorization system to enhance security by enforcing explicit user approval before processing any credit inquiry. The system employs real-time verification and approval techniques. This ensures that the authorized user only approves or rejects a credit check request. It minimizes the risks of interference by third parties. Apart from enhancing security, this system complies with regulations like the General Data Protection Regulation (GDPR) and the Fair Credit Reporting Act (FCRA) while maintaining a seamless user experience. This article discusses the technical issues, scaling-up issues, and ways of implementing real-time user authorization in financial systems. Through this framework, financial institutions can drastically minimize the risk of identity theft, avert unauthorized credit checks, and increase customer trust in the credit verification system.

Paperid: 2435, https://arxiv.org/pdf/2505.17274.pdf

Abstract:
The progress of automotive technologies has made cybersecurity a crucial focus, leading to various cyber attacks. These attacks primarily target the Controller Area Network (CAN) and specialized Electronic Control Units (ECUs). In order to mitigate these attacks and bolster the security of vehicular systems, numerous defense solutions have been proposed.These solutions aim to detect diverse forms of vehicular attacks. However, the practical implementation of these solutions still presents certain limitations and challenges. In light of these circumstances, this paper undertakes a thorough examination of existing vehicular attacks and defense strategies employed against the CAN and ECUs. The objective is to provide valuable insights and inform the future design of Vehicular Intrusion Detection Systems (VIDS). The findings of our investigation reveal that the examined VIDS primarily concentrate on particular categories of attacks, neglecting the broader spectrum of potential threats. Moreover, we provide a comprehensive overview of the significant challenges encountered in implementing a robust and feasible VIDS. Additionally, we put forth several defense recommendations based on our study findings, aiming to inform and guide the future design of VIDS in the context of vehicular security.

Paperid: 2436, https://arxiv.org/pdf/2505.17248.pdf

Abstract:
Backdoor attacks, or trojans, pose a security risk by concealing undesirable behavior in deep neural network models. Open-source neural networks are downloaded from the internet daily, possibly containing backdoors, and third-party model developers are common. To advance research on backdoor attack mitigation, we develop several trojans for deep reinforcement learning (DRL) agents. We focus on in-distribution triggers, which occur within the agent's natural data distribution, since they pose a more significant security threat than out-of-distribution triggers due to their ease of activation by the attacker during model deployment. We implement backdoor attacks in four reinforcement learning (RL) environments: LavaWorld, Randomized LavaWorld, Colorful Memory, and Modified Safety Gymnasium. We train various models, both clean and backdoored, to characterize these attacks. We find that in-distribution triggers can require additional effort to implement and be more challenging for models to learn, but are nevertheless viable threats in DRL even using basic data poisoning attacks.

Paperid: 2437, https://arxiv.org/pdf/2505.17224.pdf

Abstract:
In this cloud-dependent era, various security techniques, such as encryption, steganography, and hybrid approaches, have been utilized in cloud computing to enhance security, maintain enormous storage capacity, and provide ease of access. However, the absence of data type-specific encryption and decryption procedures renders multimedia data vulnerable. To address this issue, this study presents a dynamic encryption-based security architecture that adapts encryption methods to any file type, using keys generated from facial images and passwords. Four diverse datasets are created, each with a consistent size of 2GB, containing varying combinations of image, audio (MP3 and MPEG), video, text, CSV, PPT, and PDF files, to implement the proposed methodology. AES is used to encrypt image data, AES-CTR is employed for audio or video data to meet real-time streaming needs, and Blowfish is used for other types of data. Performance analysis on all four datasets is conducted using AWS servers, where DATASET-1 demonstrates the best performance compared to the others.

Paperid: 2438, https://arxiv.org/pdf/2505.16439.pdf

Abstract:
As network security issues continue gaining prominence, password security has become crucial in safeguarding personal information and network systems. This study first introduces various methods for system password cracking, outlines password defense strategies, and discusses the application of machine learning in the realm of password security. Subsequently, we conduct a detailed public password database analysis, uncovering standard features and patterns among passwords. We extract multiple characteristics of passwords, including length, the number of digits, the number of uppercase and lowercase letters, and the number of special characters. We then experiment with six different machine learning algorithms: support vector machines, logistic regression, neural networks, decision trees, random forests, and stacked models, evaluating each model's performance based on various metrics, including accuracy, recall, and F1 score through model validation and hyperparameter tuning. The evaluation results on the test set indicate that decision trees and stacked models excel in accuracy, recall, and F1 score, making them a practical option for the strong and weak password classification task.

Paperid: 2439, https://arxiv.org/pdf/2505.16366.pdf

Abstract:
Binary analysis plays a pivotal role in security domains such as malware detection and vulnerability discovery, yet it remains labor-intensive and heavily reliant on expert knowledge. General-purpose large language models (LLMs) perform well in programming analysis on source code, while binaryspecific LLMs are underexplored. In this work, we present ReCopilot, an expert LLM designed for binary analysis tasks. ReCopilot integrates binary code knowledge through a meticulously constructed dataset, encompassing continue pretraining (CPT), supervised fine-tuning (SFT), and direct preference optimization (DPO) stages. It leverages variable data flow and call graph to enhance context awareness and employs test-time scaling to improve reasoning capabilities. Evaluations on a comprehensive binary analysis benchmark demonstrate that ReCopilot achieves state-of-the-art performance in tasks such as function name recovery and variable type inference on the decompiled pseudo code, outperforming both existing tools and LLMs by 13%. Our findings highlight the effectiveness of domain-specific training and context enhancement, while also revealing challenges in building super long chain-of-thought. ReCopilot represents a significant step toward automating binary analysis with interpretable and scalable AI assistance in this domain.

Paperid: 2440, https://arxiv.org/pdf/2505.16112.pdf

Abstract:
Cryptography underpins the security of modern digital infrastructure, from cloud services to health data. However, many widely deployed systems will become vulnerable after the advent of scalable quantum computing. Although quantum-safe cryptographic primitives have been developed, such as lattice-based digital signature algorithms (DSAs) and key encapsulation mechanisms (KEMs), their unique structural and performance characteristics make them unsuitable for existing protocols. In this work, we introduce a quantum-safe single-shot protocol for machine-to-machine authentication and authorization that is specifically designed to leverage the strengths of lattice-based DSAs and KEMs. Operating entirely over insecure channels, this protocol enables the forward-secure establishment of tokens in constrained environments. By demonstrating how new quantum-safe cryptographic primitives can be incorporated into secure systems, this study lays the groundwork for scalable, resilient, and future-proof identity infrastructures in a quantum-enabled world.

Paperid: 2441, https://arxiv.org/pdf/2505.15921.pdf

Abstract:
The acquisition of data from main memory or from hard disk storage is usually one of the first steps in a forensic investigation. We revisit the discussion on quality criteria for "forensically sound" acquisition of such storage and propose a new way to capture the intent to acquire an instantaneous snapshot from a single target system. The idea of our definition is to allow a certain flexibility into when individual portions of memory are acquired, but at the same time require being consistent with causality (i.e., cause/effect relations). Our concept is much stronger than the original notion of atomicity defined by Vomel and Freiling (2012) but still attainable using copy-on-write mechanisms. As a minor result, we also fix a conceptual problem within the original definition of integrity.

Paperid: 2442, https://arxiv.org/pdf/2505.15265.pdf

Abstract:
Adversarial attacks aim to generate malicious inputs that mislead deep models, but beyond causing model failure, they cannot provide certain interpretable information such as ``\textit{What content in inputs make models more likely to fail?}'' However, this information is crucial for researchers to specifically improve model robustness. Recent research suggests that models may be particularly sensitive to certain semantics in visual inputs (such as ``wet,'' ``foggy''), making them prone to errors. Inspired by this, in this paper we conducted the first exploration on large vision-language models (LVLMs) and found that LVLMs indeed are susceptible to hallucinations and various errors when facing specific semantic concepts in images. To efficiently search for these sensitive concepts, we integrated large language models (LLMs) and text-to-image (T2I) models to propose a novel semantic evolution framework. Randomly initialized semantic concepts undergo LLM-based crossover and mutation operations to form image descriptions, which are then converted by T2I models into visual inputs for LVLMs. The task-specific performance of LVLMs on each input is quantified as fitness scores for the involved semantics and serves as reward signals to further guide LLMs in exploring concepts that induce LVLMs. Extensive experiments on seven mainstream LVLMs and two multimodal tasks demonstrate the effectiveness of our method. Additionally, we provide interesting findings about the sensitive semantics of LVLMs, aiming to inspire further in-depth research.

Paperid: 2443, https://arxiv.org/pdf/2505.15156.pdf

Abstract:
Recommendation as a service has improved the quality of our lives and plays a significant role in variant aspects. However, the preference of users may reveal some sensitive information, so that the protection of privacy is required. In this paper, we propose a privacy-preserving, socialized, recommendation protocol that introduces information collected from online social networks to enhance the quality of the recommendation. The proposed scheme can calculate the similarity between users to determine their potential relationships and interests, and it also can protect the users' privacy from leaking to an untrusted third party. The security analysis and experimental results showed that our proposed scheme provides excellent performance and is feasible for real-world applications.

Paperid: 2444, https://arxiv.org/pdf/2505.14162.pdf

Abstract:
Advancements in quantum computing pose a significant threat to most of the cryptography currently deployed. Fortunately, cryptographic building blocks to mitigate the threat are already available; mostly based on post-quantum and quantum cryptography, but also on symmetric cryptography techniques. Notably, quantum-safe building blocks must be deployed as soon as possible due to the ``harvest-now decrypt-later'' attack scenario, which is already challenging our sensitive and encrypted data today. Following an agile defense-in-depth approach, Hybrid Authenticated Key Exchange (HAKE) protocols have recently been gaining significant attention. Such protocols modularly combine conventional, post-quantum, and quantum cryptography to achieve confidentiality, authenticity, and integrity guarantees for network channels. Unfortunately, only a few protocols have yet been proposed (mainly Muckle and Muckle+) with different flexibility guarantees. Looking at available standards in the network domain (especially at the Media Access Control Security (MACsec) standard), we believe that HAKE protocols could already bring strong security benefits to MACsec today. MACsec is a standard designed to secure communication at the data link layer in Ethernet networks by providing security for all traffic between adjacent entities. In addition, MACsec establishes secure channels within a Local Area Network (LAN), ensuring that data remain protected from eavesdropping, tampering, and unauthorized access, while operating transparently to higher layer protocols. Currently, MACsec does not offer enough protection in the event of cryptographically relevant quantum computers. In this work, we tackle the challenge and propose a new versatile HAKE protocol, dubbed VMuckle, which is sufficiently flexible for the use in MACsec to provide LAN participants with hybrid key material ensuring secure communication.

Paperid: 2445, https://arxiv.org/pdf/2505.13922.pdf

Abstract:
Outdated software remains a potent and underappreciated menace in 2025's cybersecurity environment, exposing systems to a broad array of threats, including ransomware, data breaches, and operational outages that can have devastating and far-reaching impacts. This essay explores the unseen threats of cyberattacks by presenting robust statistical information, including the staggering reality that 32% of cyberattacks exploit unpatched software vulnerabilities, based on a 2025 TechTarget survey. Furthermore, it discusses real case studies, including the MOVEit breach in 2023 and the Log4Shell breach in 2021, both of which illustrate the catastrophic consequences of failing to perform software updates. The article offers a detailed analysis of the nature of software vulnerabilities, the underlying reasons for user resistance to patches, and organizational barriers that compound the issue. Furthermore, it suggests actionable solutions, including automation and awareness campaigns, to address these shortcomings. Apart from this, the paper also talks of trends such as AI-driven vulnerability patching and legal consequences of non-compliance under laws like HIPAA, thus providing a futuristic outlook on how such advancements may define future defenses. Supplemented by tables like one detailing trends in vulnerability and a graph illustrating technology adoption, this report showcases the pressing demand for anticipatory update strategies to safeguard digital ecosystems against the constantly evolving threats that characterize the modern cyber landscape. As it stands, it is a very useful document for practitioners, policymakers, and researchers.

Paperid: 2446, https://arxiv.org/pdf/2505.13076.pdf

Abstract:
Autonomous browsing agents powered by large language models (LLMs) are increasingly used to automate web-based tasks. However, their reliance on dynamic content, tool execution, and user-provided data exposes them to a broad attack surface. This paper presents a comprehensive security evaluation of such agents, focusing on systemic vulnerabilities across multiple architectural layers. Our work outlines the first end-to-end threat model for browsing agents and provides actionable guidance for securing their deployment in real-world environments. To address discovered threats, we propose a defense in depth strategy incorporating input sanitization, planner executor isolation, formal analyzers, and session safeguards. These measures protect against both initial access and post exploitation attack vectors. Through a white box analysis of a popular open source project, Browser Use, we demonstrate how untrusted web content can hijack agent behavior and lead to critical security breaches. Our findings include prompt injection, domain validation bypass, and credential exfiltration, evidenced by a disclosed CVE and a working proof of concept exploit.

Paperid: 2447, https://arxiv.org/pdf/2505.12995.pdf

Abstract:
Confidential computing plays an important role in isolating sensitive applications from the vast amount of untrusted code commonly found in the modern cloud. We argue that it can also be leveraged to build safer and more secure mission-critical embedded systems. In this paper, we introduce the Assured Confidential Execution (ACE), an open-source and royalty-free confidential computing technology targeted for embedded RISC-V systems. We present a set of principles and a methodology that we used to build \ACE and that might be applied for developing other embedded systems that require formal verification. An evaluation of our prototype on the first available RISC-V hardware supporting virtualization indicates that ACE is a viable candidate for our target systems.

Paperid: 2448, https://arxiv.org/pdf/2505.12869.pdf

Abstract:
Feature selection is a technique that extracts a meaningful subset from a set of features in training data. When the training data is large-scale, appropriate feature selection enables the removal of redundant features, which can improve generalization performance, accelerate the training process, and enhance the interpretability of the model. This study proposes a privacy-preserving computation model for feature selection. Generally, when the data owner and analyst are the same, there is no need to conceal the private information. However, when they are different parties or when multiple owners exist, an appropriate privacy-preserving framework is required. Although various private feature selection algorithms, they all require two or more computing parties and do not guarantee security in environments where no external party can be fully trusted. To address this issue, we propose the first outsourcing algorithm for feature selection using fully homomorphic encryption. Compared to a prior two-party algorithm, our result improves the time and space complexity O(kn^2) to O(kn log^3 n) and O(kn), where k and n denote the number of features and data samples, respectively. We also implemented the proposed algorithm and conducted comparative experiments with the naive one. The experimental result shows the efficiency of our method even with small datasets.

Paperid: 2449, https://arxiv.org/pdf/2505.12700.pdf

Abstract:
Security is increasingly more important in designing chips and systems based on them, and the International Solid-State Circuits Conference (ISSCC), the leading conference for presenting advances in solid-state circuits and semiconductor technology, is committed to hardware security by establishing the security subcommittee since 2024. In the past two years, the authors of this paper reviewed submissions as members of the Security Subcommittee, a part of International Technical Program Committee (ITPC). This paper aims to encourage high-quality submissions to grow this field in the overall scope of the ISSCC.

Paperid: 2450, https://arxiv.org/pdf/2505.12256.pdf

Abstract:
Constructing a Trusted Execution Environment (TEE) on Field Programmable Gate Array System on Chip (FPGA-SoC) in Cloud can effectively protect users' private intel-lectual Property (IP) cores. In order to facilitate the wide-spread deployment of FPGA-SoC TEE, this paper proposes an approach for constructing a TPM 2.0-compatible runtime customizable TEE on FPGA-SoC. This approach leverages a user-controllable virtual Trusted Platform Module (vTPM) that integrates sensitive operations specific to FPGA-SoC TEE. It provides TPM 2.0 support for a customizable FPGA-SoC TEE to dynamically measure, deploy, and invoke IP during runtime. Our main contributions include: (i) Propose an FPGA-vTPM architecture that enables the TPM 2.0 specification support for FPGA-SoC TEE; (ii) Explore the utilization of FPGA-vTPM to dynamically measure, deploy, and invoke users' IPs on FPGA-SoC TEE; (iii) Extend the TPM command set to accommodate the sensitive operations of FPGA-SoC TEE, enabling users to perform sensitive tasks in a secure and verifiable manner according to the TPM 2.0 specification. We implement a prototype of TRCTEE on the Xilinx Zynq UltraScale+ MPSoC platform and conducted security analysis and performance evaluations to prove the practicality and enhanced security features of this approach.

Paperid: 2451, https://arxiv.org/pdf/2505.11988.pdf

Abstract:
Accurately identifying adversarial techniques in security texts is critical for effective cyber defense. However, existing methods face a fundamental trade-off: they either rely on generic models with limited domain precision or require resource-intensive pipelines that depend on large labeled datasets and task-specific optimizations, such as custom hard-negative mining and denoising, resources rarely available in specialized domains. We propose TechniqueRAG, a domain-specific retrieval-augmented generation (RAG) framework that bridges this gap by integrating off-the-shelf retrievers, instruction-tuned LLMs, and minimal text-technique pairs. Our approach addresses data scarcity by fine-tuning only the generation component on limited in-domain examples, circumventing the need for resource-intensive retrieval training. While conventional RAG mitigates hallucination by coupling retrieval and generation, its reliance on generic retrievers often introduces noisy candidates, limiting domain-specific precision. To address this, we enhance retrieval quality and domain specificity through zero-shot LLM re-ranking, which explicitly aligns retrieved candidates with adversarial techniques. Experiments on multiple security benchmarks demonstrate that TechniqueRAG achieves state-of-the-art performance without extensive task-specific optimizations or labeled data, while comprehensive analysis provides further insights.

Paperid: 2452, https://arxiv.org/pdf/2505.11542.pdf

Abstract:
User and Entity Behaviour Analytics (UEBA) is a broad branch of data analytics that attempts to build a normal behavioural profile in order to detect anomalous events. Among the techniques used to detect anomalies, Deep Autoencoders constitute one of the most promising deep learning models on UEBA tasks, allowing explainable detection of security incidents that could lead to the leak of personal data, hijacking of systems, or access to sensitive business information. In this study, we introduce the first implementation of an explainable UEBA-based anomaly detection framework that leverages Deep Autoencoders in combination with Doc2Vec to process both numerical and textual features. Additionally, based on the theoretical foundations of neural networks, we offer a novel proof demonstrating the equivalence of two widely used definitions for fully-connected neural networks. The experimental results demonstrate the proposed framework capability to detect real and synthetic anomalies effectively generated from real attack data, showing that the models provide not only correct identification of anomalies but also explainable results that enable the reconstruction of the possible origin of the anomaly. Our findings suggest that the proposed UEBA framework can be seamlessly integrated into enterprise environments, complementing existing security systems for explainable threat detection.

Paperid: 2453, https://arxiv.org/pdf/2505.11094.pdf

Abstract:
Retail energy markets are increasingly consumer-oriented, thanks to a growing number of energy plans offered by a plethora of energy suppliers, retailers and intermediaries. To maximize the benefits of competitive retail energy markets, group purchasing is an emerging paradigm that aggregates consumers' purchasing power by coordinating switch decisions to specific energy providers for discounted energy plans. Traditionally, group purchasing is mediated by a trusted third-party, which suffers from the lack of privacy and transparency. In this paper, we introduce a novel paradigm of decentralized privacy-preserving group purchasing, empowered by privacy-preserving blockchain and secure multi-party computation, to enable users to form a coalition for coordinated switch decisions in a decentralized manner, without a trusted third-party. The coordinated switch decisions are determined by a competitive online algorithm, based on users' private consumption data and current energy plan tariffs. Remarkably, no private user consumption data will be revealed to others in the online decision-making process, which is carried out in a transparently verifiable manner to eliminate frauds from dishonest users and supports fair mutual compensations by sharing the switching costs to incentivize group purchasing. We implemented our decentralized group purchasing solution as a smart contract on Solidity-supported blockchain platform (e.g., Ethereum), and provide extensive empirical evaluation.

Paperid: 2454, https://arxiv.org/pdf/2505.10932.pdf

Abstract:
This paper presents a physical layer authentication (PLA) technique using information reconciliation in multiuser communication systems. A cost-effective solution for low-end Internet of Things networks can be provided by PLA. In this work, we develop an information reconciliation scheme using Polar codes along with a quantization strategy that employs an arbitrary number of bits to enhance the performance of PLA. We employ the principle of Slepian-Wolf coding to reconcile channel measurements spread in time. Numerical results show that our approach works very well and outperforms competing approaches, achieving more than 99.80% increase in detection probability for false alarm probabilities close to 0.

Paperid: 2455, https://arxiv.org/pdf/2505.10838.pdf

Abstract:
Efficient red-teaming method to uncover vulnerabilities in Large Language Models (LLMs) is crucial. While recent attacks often use LLMs as optimizers, the discrete language space make gradient-based methods struggle. We introduce LARGO (Latent Adversarial Reflection through Gradient Optimization), a novel latent self-reflection attack that reasserts the power of gradient-based optimization for generating fluent jailbreaking prompts. By operating within the LLM's continuous latent space, LARGO first optimizes an adversarial latent vector and then recursively call the same LLM to decode the latent into natural language. This methodology yields a fast, effective, and transferable attack that produces fluent and stealthy prompts. On standard benchmarks like AdvBench and JailbreakBench, LARGO surpasses leading jailbreaking techniques, including AutoDAN, by 44 points in attack success rate. Our findings demonstrate a potent alternative to agentic LLM prompting, highlighting the efficacy of interpreting and attacking LLM internals through gradient optimization.

Paperid: 2456, https://arxiv.org/pdf/2505.07536.pdf

Abstract:
Randomness plays a vital role in numerous applications, including simulation, cryptography, distributed systems, and gaming. Consequently, extensive research has been conducted to generate randomness. One such method is to design a decentralized random number generator (DRNG), a protocol that enables multiple participants to collaboratively generate random outputs that must be publicly verifiable. However, existing DRNGs are either not secure against quantum computers or depend on the random oracle model (ROM) to achieve security. In this paper, we design a DRNG based on lattice-based publicly verifiable secret sharing (PVSS) that is post-quantum secure and proven secure in the standard model. Additionally, our DRNG requires only two rounds of communication to generate a single (pseudo)random value and can tolerate up to any t < n/2 dishonest participants. To our knowledge, the proposed DRNG construction is the first to achieve all these properties.

Paperid: 2457, https://arxiv.org/pdf/2505.06380.pdf

Abstract:
As artificial intelligence (AI) systems become increasingly adopted across sectors, the need for robust, proactive security strategies is paramount. Traditional defensive measures often fall short against the unique and evolving threats facing AI-driven technologies, making offensive security an essential approach for identifying and mitigating risks. This paper presents a comprehensive framework for offensive security in AI systems, emphasizing proactive threat simulation and adversarial testing to uncover vulnerabilities throughout the AI lifecycle. We examine key offensive security techniques, including weakness and vulnerability assessment, penetration testing, and red teaming, tailored specifically to address AI's unique susceptibilities. By simulating real-world attack scenarios, these methodologies reveal critical insights, informing stronger defensive strategies and advancing resilience against emerging threats. This framework advances offensive AI security from theoretical concepts to practical, actionable methodologies that organizations can implement to strengthen their AI systems against emerging threats.

Paperid: 2458, https://arxiv.org/pdf/2505.06177.pdf

Abstract:
The purpose of continuous fuzzing platforms is to enable fuzzing for software projects via \emph{fuzz harnesses} -- but as the projects continue to evolve, are these harnesses updated in lockstep, or do they run out of date? If these harnesses remain unmaintained, will they \emph{degrade} over time in terms of coverage achieved or number of bugs found? This is the subject of our study. We study Google's OSS-Fuzz continuous fuzzing platform containing harnesses for 510 open-source C/C++ projects, many of which are security-critical. A harness is the glue code between the fuzzer and the project, so it needs to adapt to changes in the project. It is often added by a project maintainer or as part of a, sometimes short-lived, testing effort. Our analysis shows a consistent overall fuzzer coverage percentage for projects in OSS-Fuzz and a surprising longevity of the bug-finding capability of harnesses even without explicit updates, as long as they still build. However, we also identify and manually examine individual cases of harness coverage degradation and categorize their root causes. Furthermore, we contribute to OSS-Fuzz and Fuzz Introspector to support metrics to detect harness degradation in OSS-Fuzz projects guided by this research.

Paperid: 2459, https://arxiv.org/pdf/2505.06084.pdf

Abstract:
This article documents the HashKitty platform, a distributed solution for password analysis based on the hashcat tool, designed to improve efficiency in both offensive and defensive security operations. The main objectives of this work are to utilise and characterise the hashcat tool, to develop a central platform that connects various computational nodes, to allow the use of nodes with different equipment and manufacturers, to distribute tasks among the nodes through a web platform, and to perform distributed password analysis. The results show that the presented solution achieves the proposed objectives, demonstrating effectiveness in workload distribution and password analysis using different types of nodes based on various operating systems and architectures. The architecture of HashKitty is based on a scalable and modular distributed architecture, composed of several components such as computational nodes, integration and control software, a web platform that implements our API, and database servers. In order to achieve a fast and organised development process for our application we used multiple frameworks, runtimes and libraries. For the communication between the computational nodes and the other software we made use of websockets so that we have real-time updates between them.

Paperid: 2460, https://arxiv.org/pdf/2505.05897.pdf

Abstract:
Free and open source software (FOSS) is ubiquitous on modern IT systems, accelerating the speed of software engineering over the past decades. With its increasing importance and historical reliance on uncompensated contributions, questions have been raised regarding the continuous maintenance of FOSS and its implications from a security perspective. In recent years, different funding programs have emerged to provide external incentives to reinforce community FOSS' sustainability. Past research primarily focused on analyses what type of projects have been funded and for what reasons. However, it has neither been considered whether there is a need for such external incentives, nor whether the incentive mechanisms, especially with the development of decentralized approaches, are susceptible to fraud. In this study, we explore the need for funding through a literature review and compare the susceptibility to fraud of centralized and decentralized incentive programs by performing case studies on the Sovereign Tech Fund (STF) and the tea project. We find non-commercial incentives to fill an important gap, ensuring longevity and sustainability of projects. Furthermore, we find the STF to be able to achieve a high resilience against fraud attempts, while tea is highly susceptible to fraud, as evidenced by revelation of an associated sybil attack on npm. Our results imply that special considerations must be taken into account when utilizing quantitative repository metrics regardless whether spoofing is expected.

Paperid: 2461, https://arxiv.org/pdf/2505.05707.pdf

Abstract:
The integration of AI into daily life has generated considerable attention and excitement, while also raising concerns about automating algorithmic harms and re-entrenching existing social inequities. While the responsible deployment of trustworthy AI systems is a worthy goal, there are many possible ways to realize it, from policy and regulation to improved algorithm design and evaluation. In fact, since AI trains on social data, there is even a possibility for everyday users, citizens, or workers to directly steer its behavior through Algorithmic Collective Action, by deliberately modifying the data they share with a platform to drive its learning process in their favor. This paper considers how these grassroots efforts to influence AI interact with methods already used by AI firms and governments to improve model trustworthiness. In particular, we focus on the setting where the AI firm deploys a differentially private model, motivated by the growing regulatory focus on privacy and data protection. We investigate how the use of Differentially Private Stochastic Gradient Descent (DPSGD) affects the collective's ability to influence the learning process. Our findings show that while differential privacy contributes to the protection of individual data, it introduces challenges for effective algorithmic collective action. We characterize lower bounds on the success of algorithmic collective action under differential privacy as a function of the collective's size and the firm's privacy parameters, and verify these trends experimentally by simulating collective action during the training of deep neural network classifiers across several datasets.

Paperid: 2462, https://arxiv.org/pdf/2505.03529.pdf

Abstract:
Data privacy and anonymisation are critical concerns in today's data-driven society, particularly when handling personal and sensitive user data. Regulatory frameworks worldwide recommend privacy-preserving protocols such as k-anonymisation to de-identify releases of tabular data. Available hardware resources provide an upper bound on the maximum size of dataset that can be processed at a time. Large datasets with sizes exceeding this upper bound must be broken up into smaller data chunks for processing. In these cases, standard k-anonymisation tools such as ARX can only operate on a per-chunk basis. This paper proposes SKALD, a novel algorithm for performing k-anonymisation on large datasets with limited RAM. Our SKALD algorithm offers multi-fold performance improvement over standard k-anonymisation methods by extracting and combining sufficient statistics from each chunk during processing to ensure successful k-anonymisation while providing better utility.

Paperid: 2463, https://arxiv.org/pdf/2505.02521.pdf

Abstract:
In this paper we present attestable builds, a new paradigm to provide strong source-to-binary correspondence in software artifacts. We tackle the challenge of opaque build pipelines that disconnect the trust between source code, which can be understood and audited, and the final binary artifact, which is difficult to inspect. Our system uses modern trusted execution environments (TEEs) and sandboxed build containers to provide strong guarantees that a given artifact was correctly built from a specific source code snapshot. As such it complements existing approaches like reproducible builds which typically require time-intensive modifications to existing build configurations and dependencies, and require independent parties to continuously build and verify artifacts. In comparison, an attestable build requires only minimal changes to an existing project, and offers nearly instantaneous verification of the correspondence between a given binary and the source code and build pipeline used to construct it. We evaluate it by building open-source software libraries - focusing on projects which are important to the trust chain and those which have proven difficult to be built deterministically. Overall, the overhead (42 seconds start-up latency and 14% increase in build duration) is small in comparison to the overall build time. Importantly, our prototype builds even complex projects such as LLVM Clang without requiring any modifications to their source code and build scripts. Finally, we formally model and verify the attestable build design to demonstrate its security against well-resourced adversaries.

Paperid: 2464, https://arxiv.org/pdf/2505.01976.pdf

Abstract:
Although Large Language Models (LLMs) have become increasingly integral to diverse applications, their capabilities raise significant privacy concerns. This survey offers a comprehensive overview of privacy risks associated with LLMs and examines current solutions to mitigate these challenges. First, we analyze privacy leakage and attacks in LLMs, focusing on how these models unintentionally expose sensitive information through techniques such as model inversion, training data extraction, and membership inference. We investigate the mechanisms of privacy leakage, including the unauthorized extraction of training data and the potential exploitation of these vulnerabilities by malicious actors. Next, we review existing privacy protection against such risks, such as inference detection, federated learning, backdoor mitigation, and confidential computing, and assess their effectiveness in preventing privacy leakage. Furthermore, we highlight key practical challenges and propose future research directions to develop secure and privacy-preserving LLMs, emphasizing privacy risk assessment, secure knowledge transfer between models, and interdisciplinary frameworks for privacy governance. Ultimately, this survey aims to establish a roadmap for addressing escalating privacy challenges in the LLMs domain.

Paperid: 2465, https://arxiv.org/pdf/2505.01186.pdf

Abstract:
Hierarchical Federated Learning (HFL) has recently emerged as a promising solution for intelligent decision-making in vehicular networks, helping to address challenges such as limited communication resources, high vehicle mobility, and data heterogeneity. However, HFL remains vulnerable to adversarial and unreliable vehicles, whose misleading updates can significantly compromise the integrity and convergence of the global model. To address these challenges, we propose a novel defense framework that integrates dynamic vehicle selection with robust anomaly detection within a cluster-based HFL architecture, specifically designed to counter Gaussian noise and gradient ascent attacks. The framework performs a comprehensive reliability assessment for each vehicle by evaluating historical accuracy, contribution frequency, and anomaly records. Anomaly detection combines Z-score and cosine similarity analyses on model updates to identify both statistical outliers and directional deviations in model updates. To further refine detection, an adaptive thresholding mechanism is incorporated into the cosine similarity metric, dynamically adjusting the threshold based on the historical accuracy of each vehicle to enforce stricter standards for consistently high-performing vehicles. In addition, a weighted gradient averaging mechanism is implemented, which assigns higher weights to gradient updates from more trustworthy vehicles. To defend against coordinated attacks, a cross-cluster consistency check is applied to identify collaborative attacks in which multiple compromised clusters coordinate misleading updates. Together, these mechanisms form a multi-level defense strategy to filter out malicious contributions effectively. Simulation results show that the proposed algorithm significantly reduces convergence time compared to benchmark methods across both 1-hop and 3-hop topologies.

Paperid: 2466, https://arxiv.org/pdf/2505.01177.pdf

Abstract:
As large language models (LLMs) continue to evolve, it is critical to assess the security threats and vulnerabilities that may arise both during their training phase and after models have been deployed. This survey seeks to define and categorize the various attacks targeting LLMs, distinguishing between those that occur during the training phase and those that affect already trained models. A thorough analysis of these attacks is presented, alongside an exploration of defense mechanisms designed to mitigate such threats. Defenses are classified into two primary categories: prevention-based and detection-based defenses. Furthermore, our survey summarizes possible attacks and their corresponding defense strategies. It also provides an evaluation of the effectiveness of the known defense mechanisms for the different security threats. Our survey aims to offer a structured framework for securing LLMs, while also identifying areas that require further research to improve and strengthen defenses against emerging security challenges.

Paperid: 2467, https://arxiv.org/pdf/2505.00976.pdf

Abstract:
Large Language Models (LLMs) have become central to numerous natural language processing tasks, but their vulnerabilities present significant security and ethical challenges. This systematic survey explores the evolving landscape of attack and defense techniques in LLMs. We classify attacks into adversarial prompt attack, optimized attacks, model theft, as well as attacks on application of LLMs, detailing their mechanisms and implications. Consequently, we analyze defense strategies, including prevention-based and detection-based defense methods. Although advances have been made, challenges remain to adapt to the dynamic threat landscape, balance usability with robustness, and address resource constraints in defense implementation. We highlight open problems, including the need for adaptive scalable defenses, explainable security techniques, and standardized evaluation frameworks. This survey provides actionable insights and directions for developing secure and resilient LLMs, emphasizing the importance of interdisciplinary collaboration and ethical considerations to mitigate risks in real-world applications.

Paperid: 2468, https://arxiv.org/pdf/2505.00665.pdf

Abstract:
\textit{Auditing} data accesses helps preserve privacy and ensures accountability by allowing one to determine who accessed (potentially sensitive) information. A prior formal definition of register auditability was based on the values returned by read operations, \emph{without accounting for cases where a reader might learn a value without explicitly reading it or gain knowledge of data access without being an auditor}. This paper introduces a refined definition of auditability that focuses on when a read operation is \emph{effective}, rather than relying on its completion and return of a value. Furthermore, we formally specify the constraints that \textit{prevent readers from learning values they did not explicitly read or from auditing other readers' accesses.} Our primary algorithmic contribution is a wait-free implementation of a \emph{multi-writer, multi-reader register} that tracks effective reads while preventing unauthorized audits. The key challenge is ensuring that a read is auditable as soon as it becomes effective, which we achieve by combining value access and access logging into a single atomic operation. Another challenge is recording accesses without exposing them to readers, which we address using a simple encryption technique (one-time pad). We extend this implementation to an \emph{auditable max register} that tracks the largest value ever written. The implementation deals with the additional challenge posed by the max register semantics, which allows readers to learn prior values without reading them. The max register, in turn, serves as the foundation for implementing an \emph{auditable snapshot} object and, more generally, \emph{versioned types}. These extensions maintain the strengthened notion of auditability, appropriately adapted from multi-writer, multi-reader registers.

Paperid: 2469, https://arxiv.org/pdf/2505.00340.pdf

Abstract:
Secure and reliable communications are crucial for Intelligent Transportation Systems (ITSs), where Vehicle-to-Infrastructure (V2I) communication plays a key role in enabling mobility-enhancing and safety-critical services. Current V2I authentication relies on credential-based methods over wireless Non-Line-of-Sight (NLOS) channels, leaving them exposed to remote impersonation and proximity attacks. To mitigate these risks, we propose a unified Multi-Channel, Multi-Factor Authentication (MFA) scheme that combines NLOS cryptographic credentials with a Line-of-Sight (LOS) visual channel. Our approach leverages a challenge-response security paradigm: the infrastructure issues challenges and the vehicle's headlights respond by flashing a structured sequence containing encoded security data. Deep learning models on the infrastructure side then decode the embedded information to authenticate the vehicle. Real-world experimental evaluations demonstrate high test accuracy, reaching an average of 95% and 96.6%, respectively, under various lighting, weather, speed, and distance conditions. Additionally, we conducted extensive experiments on three state-of-the-art deep learning models, including detailed ablation studies for decoding the flashing sequence. Our results indicate that the optimal architecture employs a dual-channel design, enabling simultaneous decoding of the flashing sequence and extraction of vehicle spatial and locational features for robust authentication.

Paperid: 2470, https://arxiv.org/pdf/2505.00257.pdf

Abstract:
The sharing of external data has become a strong demand of financial institutions, but the privacy issue has led to the difficulty of interconnecting different platforms and the low degree of data openness. To effectively solve the privacy problem of financial data in trans-border flow and sharing, to ensure that the data is available but not visible, to realize the joint portrait of all kinds of heterogeneous data of business organizations in different industries, we propose a Heterogeneous Federated Graph Neural Network (HFGNN) approach. In this method, the distribution of heterogeneous business data of trans-border organizations is taken as subgraphs, and the sharing and circulation process among subgraphs is constructed as a statistically heterogeneous global graph through a central server. Each subgraph learns the corresponding personalized service model through local training to select and update the relevant subset of subgraphs with aggregated parameters, and effectively separates and combines topological and feature information among subgraphs. Finally, our simulation experimental results show that the proposed method has higher accuracy performance and faster convergence speed than existing methods.

Paperid: 2471, https://arxiv.org/pdf/2504.21323.pdf

Abstract:
Knowledge distillation has become a cornerstone in modern machine learning systems, celebrated for its ability to transfer knowledge from a large, complex teacher model to a more efficient student model. Traditionally, this process is regarded as secure, assuming the teacher model is clean. This belief stems from conventional backdoor attacks relying on poisoned training data with backdoor triggers and attacker-chosen labels, which are not involved in the distillation process. Instead, knowledge distillation uses the outputs of a clean teacher model to guide the student model, inherently preventing recognition or response to backdoor triggers as intended by an attacker. In this paper, we challenge this assumption by introducing a novel attack methodology that strategically poisons the distillation dataset with adversarial examples embedded with backdoor triggers. This technique allows for the stealthy compromise of the student model while maintaining the integrity of the teacher model. Our innovative approach represents the first successful exploitation of vulnerabilities within the knowledge distillation process using clean teacher models. Through extensive experiments conducted across various datasets and attack settings, we demonstrate the robustness, stealthiness, and effectiveness of our method. Our findings reveal previously unrecognized vulnerabilities and pave the way for future research aimed at securing knowledge distillation processes against backdoor attacks.

Paperid: 2472, https://arxiv.org/pdf/2504.21205.pdf

Abstract:
This paper introduces SecRepoBench, a benchmark to evaluate LLMs on secure code generation in real-world repositories. SecRepoBench has 318 code generation tasks in 27 C/C++ repositories, covering 15 CWEs. We evaluate 19 state-of-the-art LLMs using our benchmark and find that the models struggle with generating correct and secure code. In addition, the performance of LLMs to generate self-contained programs as measured by prior benchmarks do not translate to comparative performance at generating secure and correct code at the repository level in SecRepoBench. We show that the state-of-the-art prompt engineering techniques become less effective when applied to the repository level secure code generation problem. We conduct extensive experiments, including an agentic technique to generate secure code, to demonstrate that our benchmark is currently the most difficult secure coding benchmark, compared to previous state-of-the-art benchmarks. Finally, our comprehensive analysis provides insights into potential directions for enhancing the ability of LLMs to generate correct and secure code in real-world repositories.

Paperid: 2473, https://arxiv.org/pdf/2504.21038.pdf

Abstract:
Large Language Models face security threats from jailbreak attacks. Existing research has predominantly focused on prompt-level attacks while largely ignoring the underexplored attack surface of user-controlled response prefilling. This functionality allows an attacker to dictate the beginning of a model's output, thereby shifting the attack paradigm from persuasion to direct state manipulation.In this paper, we present a systematic black-box security analysis of prefill-level jailbreak attacks. We categorize these new attacks and evaluate their effectiveness across fourteen language models. Our experiments show that prefill-level attacks achieve high success rates, with adaptive methods exceeding 99% on several models. Token-level probability analysis reveals that these attacks work through initial-state manipulation by changing the first-token probability from refusal to compliance.Furthermore, we show that prefill-level jailbreak can act as effective enhancers, increasing the success of existing prompt-level attacks by 10 to 15 percentage points. Our evaluation of several defense strategies indicates that conventional content filters offer limited protection. We find that a detection method focusing on the manipulative relationship between the prompt and the prefill is more effective. Our findings reveal a gap in current LLM safety alignment and highlight the need to address the prefill attack surface in future safety training.

Paperid: 2474, https://arxiv.org/pdf/2504.20767.pdf

Abstract:
We introduce did:self, a Decentralized Identifier (DID) method that does not depend on any trusted registry for storing the corresponding DID documents. Information for authenticating a did:self subject can be disseminated using any means and without making any security assumption about the delivery method. did:self is lightweight, it allows controlled delegation, it offers increased security and privacy, and it can be used for identifying people, content, as well as IoT devices. Furthermore, DID documents in did:self can be implicit, allowing re-construction of DID documents based on other authentication material, such as JSON Web Tokens and X.509 certificates.

Paperid: 2475, https://arxiv.org/pdf/2504.20681.pdf

Abstract:
In the rapidly evolving landscape of cybersecurity threats, ransomware represents a significant challenge. Attackers increasingly employ sophisticated encryption methods, such as entropy reduction through Base64 encoding, and partial or intermittent encryption to evade traditional detection methods. This study explores the dynamic battle between adversaries who continuously refine encryption strategies and defenders developing advanced countermeasures to protect vulnerable data. We investigate the application of online incremental machine learning algorithms designed to predict file encryption activities despite adversaries evolving obfuscation techniques. Our analysis utilizes an extensive dataset of 32.6 GB, comprising 11,928 files across multiple formats, including Microsoft Word documents (doc), PowerPoint presentations (ppt), Excel spreadsheets (xlsx), image formats (jpg, jpeg, png, tif, gif), PDFs (pdf), audio (mp3), and video (mp4) files. These files were encrypted by 75 distinct ransomware families, facilitating a robust empirical evaluation of machine learning classifiers effectiveness against diverse encryption tactics. Results highlight the Hoeffding Tree algorithms superior incremental learning capability, particularly effective in detecting traditional and AES-Base64 encryption methods employed to lower entropy. Conversely, the Random Forest classifier with warm-start functionality excels at identifying intermittent encryption methods, demonstrating the necessity of tailored machine learning solutions to counter sophisticated ransomware strategies.

Paperid: 2476, https://arxiv.org/pdf/2504.20569.pdf

Abstract:
Sensor attacks on robotic vehicles have become pervasive and manipulative. Their latest advancements exploit sensor and detector characteristics to bypass detection. Recent security efforts have leveraged the physics-based model to detect or mitigate sensor attacks. However, these approaches are only resilient to a few sensor attacks and still need improvement in detection effectiveness. We present VIMU, an efficient sensor attack detection and resilience system for unmanned aerial vehicles. We propose a detection algorithm, CS-EMA, that leverages low-pass filtering to identify stealthy gyroscope attacks while achieving an overall effective sensor attack detection. We develop a fine-grained nonlinear physical model with precise aerodynamic and propulsion wrench modeling. We also augment the state estimation with a FIFO buffer safeguard to mitigate the impact of high-rate IMU attacks. The proposed physical model and buffer safeguard provide an effective system state recovery toward maintaining flight stability. We implement VIMU on PX4 autopilot. The evaluation results demonstrate the effectiveness of VIMU in detecting and mitigating various realistic sensor attacks, especially stealthy attacks.

Paperid: 2477, https://arxiv.org/pdf/2504.20432.pdf

Abstract:
Language-based information flow control (IFC) enables reasoning about and enforcing security policies in decentralized applications. While information flow properties are relatively extensional and compositional, designing expressive systems that enforce such properties remains challenging. In particular, it can be difficult to use IFC labels to model certain security assumptions, such as semi-honest agents. Motivated by these modeling limitations, we study the algebraic semantics of lattice-based IFC label models, and propose a semantic framework that allows formalizing asymmetric delegation, which is partial delegation of confidentiality or integrity. Our framework supports downgrading of information and ensures their safety through nonmalleable information flow (NMIF). To demonstrate the practicality of our framework, we design and implement a novel algorithm that statically checks NMIF and a label inference procedure that efficiently supports bounded label polymorphism, allowing users to write code generic with respect to labels.

Paperid: 2478, https://arxiv.org/pdf/2504.20266.pdf

Abstract:
Digital twins (DTs) help improve real-time monitoring and decision-making in water distribution systems. However, their connectivity makes them easy targets for cyberattacks such as scanning, denial-of-service (DoS), and unauthorized access. Small and medium-sized enterprises (SMEs) that manage these systems often do not have enough budget or staff to build strong cybersecurity teams. To solve this problem, we present a Virtual Cybersecurity Department (VCD), an affordable and automated framework designed for SMEs. The VCD uses open-source tools like Zabbix for real-time monitoring, Suricata for network intrusion detection, Fail2Ban to block repeated login attempts, and simple firewall settings. To improve threat detection, we also add a machine-learning-based IDS trained on the OD-IDS2022 dataset using an improved ensemble model. This model detects cyber threats such as brute-force attacks, remote code execution (RCE), and network flooding, with 92\% accuracy and fewer false alarms. Our solution gives SMEs a practical and efficient way to secure water systems using low-cost and easy-to-manage tools.

Paperid: 2479, https://arxiv.org/pdf/2504.19956.pdf

Abstract:
As generative AI (GenAI) agents become more common in enterprise settings, they introduce security challenges that differ significantly from those posed by traditional systems. These agents are not just LLMs; they reason, remember, and act, often with minimal human oversight. This paper introduces a comprehensive threat model tailored specifically for GenAI agents, focusing on how their autonomy, persistent memory access, complex reasoning, and tool integration create novel risks. This research work identifies 9 primary threats and organizes them across five key domains: cognitive architecture vulnerabilities, temporal persistence threats, operational execution vulnerabilities, trust boundary violations, and governance circumvention. These threats are not just theoretical they bring practical challenges such as delayed exploitability, cross-system propagation, cross system lateral movement, and subtle goal misalignments that are hard to detect with existing frameworks and standard approaches. To help address this, the research work present two complementary frameworks: ATFAA - Advanced Threat Framework for Autonomous AI Agents, which organizes agent-specific risks, and SHIELD, a framework proposing practical mitigation strategies designed to reduce enterprise exposure. While this work builds on existing work in LLM and AI security, the focus is squarely on what makes agents different and why those differences matter. Ultimately, this research argues that GenAI agents require a new lens for security. If we fail to adapt our threat models and defenses to account for their unique architecture and behavior, we risk turning a powerful new tool into a serious enterprise liability.

Paperid: 2480, https://arxiv.org/pdf/2504.19215.pdf

Abstract:
GitHub is one of the most widely used public code development platform. However, the code hosted publicly on the platform is vulnerable to commit spoofing that allows an adversary to introduce malicious code or commits into the repository by spoofing the commit metadata to indicate that the code was added by a legitimate user. The only defense that GitHub employs is the process of commit signing, which indicates whether a commit is from a valid source or not based on the keys registered by the users. In this work, we perform an empirical analysis of how prevalent is the use of commit signing in commonly used GitHub repositories. To this end, we build a framework that allows us to extract the metadata of all prior commits of a GitHub repository, and identify what commits in the repository are verified. We analyzed 60 open-source repositories belonging to four different domains -- web development, databases, machine learning and security -- using our framework and study the presence of verified commits in each repositories over five years. Our analysis shows that only ~10% of all the commits in these 60 repositories are verified. Developers committing code to security-related repositories are much more vigilant when it comes to signing commits by users. We also analyzed different Git clients for the ease of commit signing, and found that GitKraken provides the most convenient way of commit signing whereas GitHub Web provides the most accessible way for verifying commits. During our analysis, we also identified an unexpected behavior in how GitHub handles unverified emails in user accounts preventing legitimate owner to use the email address. We believe that the low number of verified commits may be due to lack of awareness, difficulty in setup and key management. Finally, we propose ways to identify commit ownership based on GitHub's Events API addressing the issue of commit spoofing.

Paperid: 2481, https://arxiv.org/pdf/2504.19001.pdf

Abstract:
We study the sample complexity of differentially private optimization of quasi-concave functions. For a fixed input domain $\mathcal{X}$, Cohen et al. (STOC 2023) proved that any generic private optimizer for low sensitive quasi-concave functions must have sample complexity $Î©(2^{\log^*|\mathcal{X}|})$. We show that the lower bound can be bypassed for a series of ``natural'' problems. We define a new class of \emph{approximated} quasi-concave functions, and present a generic differentially private optimizer for approximated quasi-concave functions with sample complexity $\tilde{O}(\log^*|\mathcal{X}|)$. As applications, we use our optimizer to privately select a center point of points in $d$ dimensions and \emph{probably approximately correct} (PAC) learn $d$-dimensional halfspaces. In previous works, Bun et al. (FOCS 2015) proved a lower bound of $Î©(\log^*|\mathcal{X}|)$ for both problems. Beimel et al. (COLT 2019) and Kaplan et al. (NeurIPS 2020) gave an upper bound of $\tilde{O}(d^{2.5}\cdot 2^{\log^*|\mathcal{X}|})$ for the two problems, respectively. We improve the dependency of the upper bounds on the cardinality of the domain by presenting a new upper bound of $\tilde{O}(d^{5.5}\cdot\log^*|\mathcal{X}|)$ for both problems. To the best of our understanding, this is the first work to reduce the sample complexity dependency on $|\mathcal{X}|$ for these two problems from exponential in $\log^* |\mathcal{X}|$ to $\log^* |\mathcal{X}|$.

Paperid: 2482, https://arxiv.org/pdf/2504.18563.pdf

Abstract:
Text-to-image diffusion models are increasingly vulnerable to backdoor attacks, where malicious modifications to the training data cause the model to generate unintended outputs when specific triggers are present. While classification models have seen extensive development of defense mechanisms, generative models remain largely unprotected due to their high-dimensional output space, which complicates the detection and mitigation of subtle perturbations. Defense strategies for diffusion models, in particular, remain under-explored. In this work, we propose Spatial Attention Unlearning (SAU), a novel technique for mitigating backdoor attacks in diffusion models. SAU leverages latent space manipulation and spatial attention mechanisms to isolate and remove the latent representation of backdoor triggers, ensuring precise and efficient removal of malicious effects. We evaluate SAU across various types of backdoor attacks, including pixel-based and style-based triggers, and demonstrate its effectiveness in achieving 100% trigger removal accuracy. Furthermore, SAU achieves a CLIP score of 0.7023, outperforming existing methods while preserving the model's ability to generate high-quality, semantically aligned images. Our results show that SAU is a robust, scalable, and practical solution for securing text-to-image diffusion models against backdoor attacks.

Paperid: 2483, https://arxiv.org/pdf/2504.17211.pdf

Abstract:
As water distribution networks (WDNs) become increasingly connected with digital infrastructures, they face greater exposure to cyberattacks that threaten their operational integrity. Stealthy False Data Injection Attacks (SFDIAs) are particularly concerning, as they manipulate sensor data to compromise system operations while avoiding detection. While existing studies have focused on either detection methods or specific attack formulations, the relationship between attack sophistication, system knowledge requirements, and achievable impact remains unexplored. This paper presents a systematic analysis of sensor attacks against WDNs, investigating different combinations of physical constraints, state monitoring requirements, and intrusion detection evasion conditions. We propose several attack formulations that range from tailored strategies satisfying both physical and detection constraints to simpler measurement manipulations. The proposed attacks are simple and local -- requiring knowledge only of targeted sensors and their hydraulic connections -- making them scalable and practical. Through case studies on Net1 and Net3 benchmark networks, we demonstrate how these attacks can persistently increase operational costs and alter water flows while remaining undetected by monitoring systems for extended periods. The analysis provides utilities with insights for vulnerability assessment and motivates the development of protection strategies that combine physical and statistical security mechanisms.

Paperid: 2484, https://arxiv.org/pdf/2504.16617.pdf

Abstract:
Digital signatures are fundamental cryptographic primitives that ensure the authenticity and integrity of digital communication. However, in scenarios involving sensitive interactions -- such as e-voting or e-cash -- there is a growing need for more controlled signing mechanisms. Strong-Designated Verifier Signature (SDVS) offers such control by allowing the signer to specify and restrict the verifier of a signature. The existing state-of-the-art SDVS are mostly based on number-theoretic hardness assumptions. Thus, they are not secure against quantum attacks. Moreover, Post-Quantum Cryptography (PQC)-based SDVS are inefficient and have large key and signature sizes. In this work, we address these challenges and propose an efficient post-quantum SDVS (namely, LaSDVS) based on ideal lattices under the hardness assumptions of the Ring-SIS and Ring-LWE problems. LaSDVS achieves advanced security properties including strong unforgeability under chosen-message attacks, non-transferability, non-delegatability, and signer anonymity. By employing the algebraic structure of rings and the gadget trapdoor mechanism of Micciancio et al., we design LaSDVS to minimize computational overhead and significantly reduce key and signature sizes. Notably, our scheme achieves a compact signature size of $\mathcal{O}(n\log q)$, compared to $\mathcal{O}(n^2)$ size, where $n$ is the security parameter, in the existing state-of-the-art PQC designs. To the best of our knowledge, LaSDVS offers the \textit{smallest private key and signature size} among the existing PQC-based SDVS schemes.

Paperid: 2486, https://arxiv.org/pdf/2504.16355.pdf

Abstract:
Perceptual hashing is used to detect whether an input image is similar to a reference image with a variety of security applications. Recently, they have been shown to succumb to adversarial input attacks which make small imperceptible changes to the input image yet the hashing algorithm does not detect its similarity to the original image. Property-preserving hashing (PPH) is a recent construct in cryptography, which preserves some property (predicate) of its inputs in the hash domain. Researchers have so far shown constructions of PPH for Hamming distance predicates, which, for instance, outputs 1 if two inputs are within Hamming distance $t$. A key feature of PPH is its strong correctness guarantee, i.e., the probability that the predicate will not be correctly evaluated in the hash domain is negligible. Motivated by the use case of detecting similar images under adversarial setting, we propose the first PPH construction for an $\ell_1$-distance predicate. Roughly, this predicate checks if the two one-sided $\ell_1$-distances between two images are within a threshold $t$. Since many adversarial attacks use $\ell_2$-distance (related to $\ell_1$-distance) as the objective function to perturb the input image, by appropriately choosing the threshold $t$, we can force the attacker to add considerable noise to evade detection, and hence significantly deteriorate the image quality. Our proposed scheme is highly efficient, and runs in time $O(t^2)$. For grayscale images of size $28 \times 28$, we can evaluate the predicate in $0.0784$ seconds when pixel values are perturbed by up to $1 \%$. For larger RGB images of size $224 \times 224$, by dividing the image into 1,000 blocks, we achieve times of $0.0128$ seconds per block for $1 \%$ change, and up to $0.2641$ seconds per block for $14\%$ change.

Paperid: 2487, https://arxiv.org/pdf/2504.16219.pdf

Abstract:
Binary Code Similarity Detection (BCSD) is not only essential for security tasks such as vulnerability identification but also for code copying detection, yet it remains challenging due to binary stripping and diverse compilation environments. Existing methods tend to adopt increasingly complex neural networks for better accuracy performance. The computation time increases with the complexity. Even with powerful GPUs, the treatment of large-scale software becomes time-consuming. To address these issues, we present a framework called ReGraph to efficiently compare binary code functions across architectures and optimization levels. Our evaluation with public datasets highlights that ReGraph exhibits a significant speed advantage, performing 700 times faster than Natural Language Processing (NLP)-based methods while maintaining comparable accuracy results with respect to the state-of-the-art models.

Paperid: 2488, https://arxiv.org/pdf/2504.16113.pdf

Abstract:
With the rapid growth of the NFT market, the security of smart contracts has become crucial. However, existing AI-based detection models for NFT contract vulnerabilities remain limited due to their complexity, while traditional manual methods are time-consuming and costly. This study proposes an AI-driven approach to detect vulnerabilities in NFT smart contracts. We collected 16,527 public smart contract codes, classifying them into five vulnerability categories: Risky Mutable Proxy, ERC-721 Reentrancy, Unlimited Minting, Missing Requirements, and Public Burn. Python-processed data was structured into training/test sets. Using the CART algorithm with Gini coefficient evaluation, we built initial decision trees for feature extraction. A random forest model was implemented to improve robustness through random data/feature sampling and multitree integration. GridSearch hyperparameter tuning further optimized the model, with 3D visualizations demonstrating parameter impacts on vulnerability detection. Results show the random forest model excels in detecting all five vulnerabilities. For example, it identifies Risky Mutable Proxy by analyzing authorization mechanisms and state modifications, while ERC-721 Reentrancy detection relies on external call locations and lock mechanisms. The ensemble approach effectively reduces single-tree overfitting, with stable performance improvements after parameter tuning. This method provides an efficient technical solution for automated NFT contract detection and lays groundwork for scaling AI applications.

Paperid: 2489, https://arxiv.org/pdf/2504.16089.pdf

Abstract:
The increasing adoption of cryptocurrencies has significantly amplified the resource requirements for operating full nodes, creating substantial barriers to entry. Unlike miners, who are financially incentivized through block rewards and transaction fees, full nodes lack direct economic compensation for their critical role in maintaining the network. A key resource burden is the transaction pool, which is particularly memory-intensive as it temporarily stores unconfirmed transactions awaiting verification and propagation across the network. We present Neonpool, a novel optimization for transaction pool leveraging bloom filter variants to drastically reduce memory consumption by up to 200 (e.g., 400 MB to 2 MB) while maintaining over 99.99% transaction processing accuracy. Implemented in C++ and evaluated on unique Bitcoin and Ethereum datasets, Neonpool enables efficient operation on lightweight clients, such as smartphones, IoT devices, and systems-on-a-chip, without requiring a hard fork. By lowering the cost of node participation, Neonpool enhances decentralization and strengthens the overall security and robustness of cryptocurrency networks.

Paperid: 2490, https://arxiv.org/pdf/2504.16000.pdf

Abstract:
In-context learning (ICL)-the ability of transformer-based models to perform new tasks from examples provided at inference time-has emerged as a hallmark of modern language models. While recent works have investigated the mechanisms underlying ICL, its feasibility under formal privacy constraints remains largely unexplored. In this paper, we propose a differentially private pretraining algorithm for linear attention heads and present the first theoretical analysis of the privacy-accuracy trade-off for ICL in linear regression. Our results characterize the fundamental tension between optimization and privacy-induced noise, formally capturing behaviors observed in private training via iterative methods. Additionally, we show that our method is robust to adversarial perturbations of training prompts, unlike standard ridge regression. All theoretical findings are supported by extensive simulations across diverse settings.

Paperid: 2491, https://arxiv.org/pdf/2504.15632.pdf

Abstract:
Various deep learning (DL) methods have recently been utilized to detect software vulnerabilities. Real-world software vulnerability datasets are rare and hard to acquire, as there is no simple metric for classifying vulnerability. Such datasets are heavily imbalanced, and none of the current datasets are considered huge for DL models. To tackle these problems, a recent work has tried to augment the dataset using the source code and generate realistic single-statement vulnerabilities, which is not quite practical and requires manual checking of the generated vulnerabilities. In this paper, we aim to explore the augmentation of vulnerabilities at the representation level to help current models learn better, which has never been done before to the best of our knowledge. We implement and evaluate five augmentation techniques that augment the embedding of the data and have recently been used for code search, which is a completely different software engineering task. We also introduced a conditioned version of those augmentation methods, which ensures the augmentation does not change the vulnerable section of the vector representation. We show that such augmentation methods can be helpful and increase the F1-score by up to 9.67%, yet they cannot beat Random Oversampling when balancing datasets, which increases the F1-score by 10.82%.

Paperid: 2492, https://arxiv.org/pdf/2504.15447.pdf

Abstract:
A popular approach to detect cyberattacks is to monitor systems in real-time to identify malicious activities as they occur. While these solutions aim to detect threats early, minimizing damage, they suffer from a significant challenge due to the presence of false positives. False positives have a detrimental impact on computer systems, which can lead to interruptions of legitimate operations and reduced productivity. Most contemporary works tend to use advanced Machine Learning and AI solutions to address this challenge. Unfortunately, false positives can, at best, be reduced but not eliminated. In this paper, we propose an alternate approach that focuses on reducing the impact of false positives rather than eliminating them. We introduce Valkyrie, a framework that can enhance any existing runtime detector with a post-detection response. Valkyrie is designed for time-progressive attacks, such as micro-architectural attacks, rowhammer, ransomware, and cryptominers, that achieve their objectives incrementally using system resources. As soon as an attack is detected, Valkyrie limits the allocated computing resources, throttling the attack, until the detector's confidence is sufficiently high to warrant a more decisive action. For a false positive, limiting the system resources only results in a small increase in execution time. On average, the slowdown incurred due to false positives is less than 1% for single-threaded programs and 6.7% for multi-threaded programs. On the other hand, attacks like rowhammer are prevented, while the potency of micro-architectural attacks, ransomware, and cryptominers is greatly reduced.

Paperid: 2493, https://arxiv.org/pdf/2504.14985.pdf

Abstract:
Evaluating Large Language Models (LLMs) for safety and security remains a complex task, often requiring users to navigate a fragmented landscape of ad hoc benchmarks, datasets, metrics, and reporting formats. To address this challenge, we present aiXamine, a comprehensive black-box evaluation platform for LLM safety and security. aiXamine integrates over 40 tests (i.e., benchmarks) organized into eight key services targeting specific dimensions of safety and security: adversarial robustness, code security, fairness and bias, hallucination, model and data privacy, out-of-distribution (OOD) robustness, over-refusal, and safety alignment. The platform aggregates the evaluation results into a single detailed report per model, providing a detailed breakdown of model performance, test examples, and rich visualizations. We used aiXamine to assess over 50 publicly available and proprietary LLMs, conducting over 2K examinations. Our findings reveal notable vulnerabilities in leading models, including susceptibility to adversarial attacks in OpenAI's GPT-4o, biased outputs in xAI's Grok-3, and privacy weaknesses in Google's Gemini 2.0. Additionally, we observe that open-source models can match or exceed proprietary models in specific services such as safety alignment, fairness and bias, and OOD robustness. Finally, we identify trade-offs between distillation strategies, model size, training methods, and architectural choices.

Paperid: 2494, https://arxiv.org/pdf/2504.14235.pdf

Abstract:
Cyber threats have become increasingly prevalent and sophisticated. Prior work has extracted actionable cyber threat intelligence (CTI), such as indicators of compromise, tactics, techniques, and procedures (TTPs), or threat feeds from various sources: open source data (e.g., social networks), internal intelligence (e.g., log data), and ``first-hand'' communications from cybercriminals (e.g., underground forums, chats, darknet websites). However, "first-hand" data sources remain underutilized because it is difficult to access or scrape their data. In this work, we analyze (i) 6.6 million posts, (ii) 3.4 million messages, and (iii) 120,000 darknet websites. We combine NLP tools to address several challenges in analyzing such data. First, even on dedicated platforms, only some content is CTI-relevant, requiring effective filtering. Second, "first-hand" data can be CTI-relevant from a technical or strategic viewpoint. We demonstrate how to organize content along this distinction. Third, we describe the topics discussed and how "first-hand" data sources differ from each other. According to our filtering, 20% of our sample is CTI-relevant. Most of the CTI-relevant data focuses on strategic rather than technical discussions. Credit card-related crime is the most prevalent topic on darknet websites. On underground forums and chat channels, account and subscription selling is discussed most. Topic diversity is higher on underground forums and chat channels than on darknet websites. Our analyses suggest that different platforms may be used for activities with varying complexity and risks for criminals.

Paperid: 2495, https://arxiv.org/pdf/2504.13767.pdf

Abstract:
Data spaces represent an emerging paradigm that facilitates secure and trusted data exchange through foundational elements of data interoperability, sovereignty, and trust. Within a data space, data items, potentially owned by different entities, can be interconnected. Concurrently, data consumers can execute advanced data lookup operations and subscribe to data-driven events. Achieving fine-grained access control without compromising functionality presents a significant challenge. In this paper, we design and implement an access control mechanism that ensures continuous evaluation of access control policies, is data semantics aware, and supports subscriptions to data events. We present a construction where access control policies are stored in a centralized location, which we extend to allow data owners to maintain their own Policy Administration Points. This extension builds upon W3C Verifiable Credentials.

Paperid: 2496, https://arxiv.org/pdf/2504.13209.pdf

Abstract:
Augmented Reality (AR) and Multimodal Large Language Models (LLMs) are rapidly evolving, providing unprecedented capabilities for human-computer interaction. However, their integration introduces a new attack surface for social engineering. In this paper, we systematically investigate the feasibility of orchestrating AR-driven Social Engineering attacks using Multimodal LLM for the first time, via our proposed SEAR framework, which operates through three key phases: (1) AR-based social context synthesis, which fuses Multimodal inputs (visual, auditory and environmental cues); (2) role-based Multimodal RAG (Retrieval-Augmented Generation), which dynamically retrieves and integrates contextual data while preserving character differentiation; and (3) ReInteract social engineering agents, which execute adaptive multiphase attack strategies through inference interaction loops. To verify SEAR, we conducted an IRB-approved study with 60 participants in three experimental configurations (unassisted, AR+LLM, and full SEAR pipeline) compiling a new dataset of 180 annotated conversations in simulated social scenarios. Our results show that SEAR is highly effective at eliciting high-risk behaviors (e.g., 93.3% of participants susceptible to email phishing). The framework was particularly effective in building trust, with 85% of targets willing to accept an attacker's call after an interaction. Also, we identified notable limitations such as ``occasionally artificial'' due to perceived authenticity gaps. This work provides proof-of-concept for AR-LLM driven social engineering attacks and insights for developing defensive countermeasures against next-generation augmented reality threats.

Paperid: 2497, https://arxiv.org/pdf/2504.12748.pdf

Abstract:
Effective risk management in cybersecurity requires a thorough understanding of the interplay between attacker capabilities and defense strategies. Attack-Defense Trees (ADTs) are a commonly used methodology for representing this interplay; however, previous work in this domain has only focused on analyzing metrics such as cost, damage, or time from the perspective of the attacker. This approach provides an incomplete view of the system, as it neglects to model defender attributes: in real-world scenarios, defenders have finite resources for countermeasures and are similarly constrained. In this paper, we propose a novel framework that incorporates defense metrics into ADTs, and we present efficient algorithms for computing the Pareto front between defense and attack metrics. Our methods encode both attacker and defender metrics as semirings, allowing our methods to be used for many metrics such as cost, damage, and skill. We analyze tree-structured ADTs using a bottom-up approach and general ADTs by translating them into binary decision diagrams. Experiments on randomly generated ADTS demonstrate that both approaches effectively handle ADTs with several hundred nodes.

Paperid: 2498, https://arxiv.org/pdf/2504.11730.pdf

Abstract:
In recent years, the term Metaverse emerged as one of the most compelling concepts, captivating the interest of international companies such as Tencent, ByteDance, Microsoft, and Facebook. These company recognized the Metaverse as a pivotal element for future success and have since made significant investments in this area. The Metaverse is still in its developmental stages, requiring the integration and advancement of various technologies to bring its vision to life. One of the key technologies associated with the Metaverse is blockchain, known for its decentralization, security, trustworthiness, and ability to manage time-series data. These characteristics align perfectly with the ecosystem of the Metaverse, making blockchain foundational for its security and infrastructure. This paper introduces both blockchain and the Metaverse ecosystem while exploring the application of the blockchain within the Metaverse, including decentralization, consensus mechanisms, hash algorithms, timestamping, smart contracts, distributed storage, distributed ledgers, and non-fungible tokens (NFTs) to provide insights for researchers investigating these topics.

Paperid: 2499, https://arxiv.org/pdf/2504.09971.pdf

Abstract:
We revisit the longstanding open problem of implementing Nakamoto's proof-of-work (PoW) consensus based on a real-world computational task $T(x)$ (as opposed to artificial random hashing), in a truly permissionless setting where the miner itself chooses the input $x$. The challenge in designing such a Proof-of-Useful-Work (PoUW) protocol, is using the native computation of $T(x)$ to produce a PoW certificate with prescribed hardness and with negligible computational overhead over the worst-case complexity of $T(\cdot)$ -- This ensures malicious miners cannot ``game the system" by fooling the verifier to accept with higher probability compared to honest miners (while using similar computational resources). Indeed, obtaining a PoUW with $O(1)$-factor overhead is trivial for any task $T$, but also useless. Our main result is a PoUW for the task of Matrix Multiplication $MatMul(A,B)$ of arbitrary matrices with $1+o(1)$ multiplicative overhead compared to naive $MatMul$ (even in the presence of Fast Matrix Multiplication-style algorithms, which are currently impractical). We conjecture that our protocol has optimal security in the sense that a malicious prover cannot obtain any significant advantage over an honest prover. This conjecture is based on reducing hardness of our protocol to the task of solving a batch of low-rank random linear equations which is of independent interest. Since $MatMul$s are the bottleneck of AI compute as well as countless industry-scale applications, this primitive suggests a concrete design of a new L1 base-layer protocol, which nearly eliminates the energy-waste of Bitcoin mining -- allowing GPU consumers to reduce their AI training and inference costs by ``re-using" it for blockchain consensus, in exchange for block rewards (2-for-1). This blockchain is currently under construction.

Paperid: 2500, https://arxiv.org/pdf/2504.08618.pdf

Abstract:
We present CryptoChaos, a novel hybrid cryptographic framework that synergizes deterministic chaos theory with cutting-edge cryptographic primitives to achieve robust, post-quantum resilient encryption. CryptoChaos harnesses the intrinsic unpredictability of four discrete chaotic maps (Logistic, Chebyshev, Tent, and Henon) to generate a high-entropy, multidimensional key from a unified entropy pool. This key is derived through a layered process that combines SHA3-256 hashing with an ephemeral X25519 Diffie-Hellman key exchange and is refined using an HMAC-based key derivation function (HKDF). The resulting encryption key powers AES-GCM, providing both confidentiality and integrity. Comprehensive benchmarking against established symmetric ciphers confirms that CryptoChaos attains near-maximal Shannon entropy (approximately 8 bits per byte) and exhibits negligible adjacent-byte correlations, while robust performance on the NIST SP 800-22 test suite underscores its statistical rigor. Moreover, quantum simulations demonstrate that the additional complexity inherent in chaotic key generation dramatically elevates the resource requirements for Grover-based quantum attacks, with an estimated T gate count of approximately 2.1 x 10^9. The modular and interoperable design of CryptoChaos positions it as a promising candidate for high-assurance applications, ranging from secure communications and financial transactions to IoT systems, paving the way for next-generation post-quantum encryption standards.

Paperid: 2501, https://arxiv.org/pdf/2504.08480.pdf

Abstract:
Transferability-based adversarial attacks exploit the ability of adversarial examples, crafted to deceive a specific source Intrusion Detection System (IDS) model, to also mislead a target IDS model without requiring access to the training data or any internal model parameters. These attacks exploit common vulnerabilities in machine learning models to bypass security measures and compromise systems. Although the transferability concept has been widely studied, its practical feasibility remains limited due to assumptions of high similarity between source and target models. This paper analyzes the core factors that contribute to transferability, including feature alignment, model architectural similarity, and overlap in the data distributions that each IDS examines. We propose a novel metric, the Transferability Feasibility Score (TFS), to assess the feasibility and reliability of such attacks based on these factors. Through experimental evidence, we demonstrate that TFS and actual attack success rates are highly correlated, addressing the gap between theoretical understanding and real-world impact. Our findings provide needed guidance for designing more realistic transferable adversarial attacks, developing robust defenses, and ultimately improving the security of machine learning-based IDS in critical systems.

Paperid: 2502, https://arxiv.org/pdf/2504.08183.pdf

Abstract:
This study proposes a credit card fraud detection method based on Heterogeneous Graph Neural Network (HGNN) to address fraud in complex transaction networks. Unlike traditional machine learning methods that rely solely on numerical features of transaction records, this approach constructs heterogeneous transaction graphs. These graphs incorporate multiple node types, including users, merchants, and transactions. By leveraging graph neural networks, the model captures higher-order transaction relationships. A Graph Attention Mechanism is employed to dynamically assign weights to different transaction relationships. Additionally, a Temporal Decay Mechanism is integrated to enhance the model's sensitivity to time-related fraud patterns. To address the scarcity of fraudulent transaction samples, this study applies SMOTE oversampling and Cost-sensitive Learning. These techniques strengthen the model's ability to identify fraudulent transactions. Experimental results demonstrate that the proposed method outperforms existing GNN models, including GCN, GAT, and GraphSAGE, on the IEEE-CIS Fraud Detection dataset. The model achieves notable improvements in both accuracy and OC-ROC. Future research may explore the integration of dynamic graph neural networks and reinforcement learning. Such advancements could enhance the real-time adaptability of fraud detection systems and provide more intelligent solutions for financial risk control.

Paperid: 2503, https://arxiv.org/pdf/2504.07419.pdf

Abstract:
The Solana blockchain was created by Anatoly Yakovenko of Solana Labs and was introduced in 2017, employing a novel transaction verification method. However, at the same time, the innovation process introduced some new security issues. The frequent security incidents in smart contracts have not only caused enormous economic losses, but also undermined the credit system based on the blockchain. The security and reliability of smart contracts have become a new focus of research both domestically and abroad. This paper studies the current status of security analysis of Solana by researching Solana smart contract security analysis tools. This paper systematically sorts out the vulnerabilities existing in Solana smart contracts and gives examples of some vulnerabilities, summarizes the principles of security analysis tools, and comprehensively summarizes and details the security analysis tools in Solana smart contracts. The data of Solana smart contract security analysis tools are collected and compared with Ethereum, and the differences are analyzed and some tools are selected for practical testing.

Paperid: 2504, https://arxiv.org/pdf/2504.07280.pdf

Abstract:
Conthereum is a concurrent Ethereum solution for intra-block parallel transaction execution, enabling validators to utilize multi-core infrastructure and transform the sequential execution model of Ethereum into a parallel one. This shift significantly increases throughput and transactions per second (TPS), while ensuring conflict-free execution in both proposer and attestor modes and preserving execution order consistency in the attestor. At the heart of Conthereum is a novel, lightweight, high-performance scheduler inspired by the Flexible Job Shop Scheduling Problem (FJSS). We propose a custom greedy heuristic algorithm, along with its efficient implementation, that solves this formulation effectively and decisively outperforms existing scheduling methods in finding suboptimal solutions that satisfy the constraints, achieve minimal makespan, and maximize speedup in parallel execution. Additionally, Conthereum includes an offline phase that equips its real-time scheduler with a conflict analysis repository obtained through static analysis of smart contracts, identifying potentially conflicting functions using a pessimistic approach. Building on this novel scheduler and extensive conflict data, Conthereum outperforms existing concurrent intra-block solutions. Empirical evaluations show near-linear throughput gains with increasing computational power on standard 8-core machines. Although scalability deviates from linear with higher core counts and increased transaction conflicts, Conthereum still significantly improves upon the current sequential execution model and outperforms existing concurrent solutions under a wide range of conditions.

Paperid: 2505, https://arxiv.org/pdf/2504.07027.pdf

Abstract:
[Context:] The acceptance of candidate patches in automated program repair has been typically based on testing oracles. Testing requires typically a costly process of building the application while ML models can be used to quickly classify patches, thus allowing more candidate patches to be generated in a positive feedback loop. [Problem:] If the model predictions are unreliable (as in vulnerability detection) they can hardly replace the more reliable oracles based on testing. [New Idea:] We propose to use an ML model as a preliminary filter of candidate patches which is put in front of a traditional filter based on testing. [Preliminary Results:] We identify some theoretical bounds on the precision and recall of the ML algorithm that makes such operation meaningful in practice. With these bounds and the results published in the literature, we calculate how fast some of state-of-the art vulnerability detectors must be to be more effective over a traditional AVR pipeline such as APR4Vuln based just on testing.

Paperid: 2506, https://arxiv.org/pdf/2504.06744.pdf

Abstract:
The integration of privacy-preserving transactions into public blockchains such as Ethereum remains a major challenge. The Stealth Address Protocol (SAP) provides recipient anonymity by generating unlinkable stealth addresses. Existing SAPs, such as the Dual-Key Stealth Address Protocol and the Curvy Protocol, have shown significant improvements in efficiency, but remain vulnerable to quantum attacks. Post-quantum SAPs based on lattice-based cryptography, such as the Module-LWE SAP, on the other hand, offer quantum resistance while achieving better performance. In this paper, we present a novel hybrid SAP that combines the Curvy protocol with the computational advantages of the Module-LWE technique while remaining Ethereum-friendly. In contrast to full post-quantum solutions, our approach does not provide quantum security, but achieves a significant speedup in scanning the ephemeral public key registry, about three times faster than the Curvy protocol. We present a detailed cryptographic construction of our protocol and compare its performance with existing solutions. Our results prove that this hybrid approach is the most efficient Ethereum-compatible SAP to date.

Paperid: 2507, https://arxiv.org/pdf/2504.05710.pdf

Abstract:
We prove that it is impossible to construct perfect-complete quantum public-key encryption (QPKE) with classical keys from quantumly secure one-way functions (OWFs) in a black-box manner, resolving a long-standing open question in quantum cryptography. Specifically, in the quantum random oracle model (QROM), no perfect-complete QPKE scheme with classical keys, and classical/quantum ciphertext can be secure. This improves the previous works which require either unproven conjectures or imposed restrictions on key generation algorithms. This impossibility even extends to QPKE with quantum public key if the public key can be uniquely determined by the secret key, and thus is tight to all existing QPKE constructions.

Paperid: 2508, https://arxiv.org/pdf/2504.03778.pdf

Abstract:
Large Language Models (LLMs) have demonstrated advanced capabilities in both text generation and comprehension, and their application to data archives might facilitate the privatization of sensitive information about the data subjects. In fact, the information contained in data often includes sensitive and personally identifiable details. This data, if not safeguarded, may bring privacy risks in terms of both disclosure and identification. Furthermore, the application of anonymisation techniques, such as k-anonymity, can lead to a significant reduction in the amount of data within data sources, which may reduce the efficacy of predictive processes. In our study, we investigate the capabilities offered by LLMs to enrich anonymized data sources without affecting their anonymity. To this end, we designed new ad-hoc prompt template engineering strategies to perform anonymized Data Augmentation and assess the effectiveness of LLM-based approaches in providing anonymized data. To validate the anonymization guarantees provided by LLMs, we exploited the pyCanon library, designed to assess the values of the parameters associated with the most common privacy-preserving techniques via anonymization. Our experiments conducted on real-world datasets demonstrate that LLMs yield promising results for this goal.

Paperid: 2509, https://arxiv.org/pdf/2504.01905.pdf

Abstract:
The Internet of Vehicles (IoV) may face challenging cybersecurity attacks that may require sophisticated intrusion detection systems, necessitating a rapid development and response system. This research investigates the performance advantages of GPU-accelerated libraries (cuML) compared to traditional CPU-based implementations (scikit-learn), focusing on the speed and efficiency required for machine learning models used in IoV threat detection environments. The comprehensive evaluations conducted employ four machine learning approaches (Random Forest, KNN, Logistic Regression, XGBoost) across three distinct IoV security datasets (OTIDS, GIDS, CICIoV2024). Our findings demonstrate that GPU-accelerated implementations dramatically improved computational efficiency, with training times reduced by a factor of up to 159 and prediction speeds accelerated by up to 95 times compared to traditional CPU processing, all while preserving detection accuracy. This remarkable performance breakthrough empowers researchers and security specialists to harness GPU acceleration for creating faster, more effective threat detection systems that meet the urgent real-time security demands of today's connected vehicle networks.

Paperid: 2510, https://arxiv.org/pdf/2504.00282.pdf

Abstract:
This paper proposes a data privacy protection framework based on federated learning, which aims to realize effective cross-domain data collaboration under the premise of ensuring data privacy through distributed learning. Federated learning greatly reduces the risk of privacy breaches by training the model locally on each client and sharing only model parameters rather than raw data. The experiment verifies the high efficiency and privacy protection ability of federated learning under different data sources through the simulation of medical, financial, and user data. The results show that federated learning can not only maintain high model performance in a multi-domain data environment but also ensure effective protection of data privacy. The research in this paper provides a new technical path for cross-domain data collaboration and promotes the application of large-scale data analysis and machine learning while protecting privacy.

Paperid: 2511, https://arxiv.org/pdf/2503.23138.pdf

Abstract:
Communication encryption is crucial in computer technology, but existing algorithms struggle with balancing cost and security. We propose EncGPT, a multi-agent framework using large language models (LLM). It includes rule, encryption, and decryption agents that generate encryption rules and apply them dynamically. This approach addresses gaps in LLM-based multi-agent systems for communication security. We tested GPT-4o's rule generation and implemented a substitution encryption workflow with homomorphism preservation, achieving an average execution time of 15.99 seconds.

Paperid: 2512, https://arxiv.org/pdf/2503.22681.pdf

Abstract:
Credit card fraud is a major issue nowadays, costing huge money and affecting trust in financial systems. Traditional fraud detection methods often fail to detect advanced and growing fraud techniques. This study focuses on using Graph Neural Networks (GNNs) to improve fraud detection by analyzing transactions as a network of connected data points, such as accounts, traders, and devices. The proposed "detectGNN" model uses advanced features like time-based patterns and dynamic updates to expose hidden fraud and improve detection accuracy. Tests show that GNNs perform better than traditional methods in finding complex and multi-layered fraud. The model also addresses real-time processing, data imbalance, and privacy concerns, making it practical for real-world use. This research shows that GNNs can provide a powerful, accurate, and a scalable solution for detecting fraud. Future work will focus on making the models easier to understand, privacy-friendly, and adaptable to new types of fraud, ensuring safer financial transactions in the digital world.

Paperid: 2513, https://arxiv.org/pdf/2503.21705.pdf

Abstract:
The disconnect between distributed software artifacts and their supposed source code enables attackers to leverage the build process for inserting malicious functionality. Past research in this field focuses on compiled language ecosystems, mostly analysing Linux distribution packages. However, the popular scripting language ecosystems potentially face unique issues given the systematic difference in distributed artifacts. This SoK provides an overview of existing research, aiming to highlight future directions, as well as chances to transfer existing knowledge from compiled language ecosystems. To that end, we work out key aspects in current research, systematize identified challenges for software reproducibility, and map them between the ecosystems. We find that the literature is sparse, focusing on few individual problems and ecosystems. This allows us to effectively identify next steps to improve reproducibility in this field.

Paperid: 2514, https://arxiv.org/pdf/2503.21400.pdf

Abstract:
We explore the computational implications of a superposition of spacetimes, a phenomenon hypothesized in quantum gravity theories. This was initiated by Shmueli (2024) where the author introduced the complexity class $\mathbf{BQP^{OI}}$ consisting of promise problems decidable by quantum polynomial time algorithms with access to an oracle for computing order interference. In this work, it was shown that the Graph Isomorphism problem and the Gap Closest Vector Problem (with approximation factor $\mathcal{O}(n^{3/2})$) are in $\mathbf{BQP^{OI}}$. We extend this result by showing that the entire complexity class $\mathbf{SZK}$ (Statistical Zero Knowledge) is contained within $\mathbf{BQP^{OI}}$. This immediately implies that the security of numerous lattice based cryptography schemes will be compromised in a computational model based on superposition of spacetimes, since these often rely on the hardness of the Learning with Errors problem, which is in $\mathbf{SZK}$.

Paperid: 2515, https://arxiv.org/pdf/2503.21305.pdf

Abstract:
Backdoor attacks are among the most effective, practical, and stealthy attacks in deep learning. In this paper, we consider a practical scenario where a developer obtains a deep model from a third party and uses it as part of a safety-critical system. The developer wants to inspect the model for potential backdoors prior to system deployment. We find that most existing detection techniques make assumptions that are not applicable to this scenario. In this paper, we present a novel framework for detecting backdoors under realistic restrictions. We generate candidate triggers by deductively searching over the space of possible triggers. We construct and optimize a smoothed version of Attack Success Rate as our search objective. Starting from a broad class of template attacks and just using the forward pass of a deep model, we reverse engineer the backdoor attack. We conduct extensive evaluation on a wide range of attacks, models, and datasets, with our technique performing almost perfectly across these settings.

Paperid: 2516, https://arxiv.org/pdf/2503.20976.pdf

Abstract:
Real-time price signals and power generation levels (disaggregated or aggregated) are commonly made available to the public by Independent System Operators (ISOs) to promote efficiency and transparency. However, they may inadvertently reveal crucial private information about the power grid, such as the cost functions of generators. Adversaries can exploit these vulnerabilities for strategic bidding, potentially leading to financial losses for power market participants and consumers. In this paper, we prove the existence of a closed-form solution for recovering coefficients in cost functions when LMPs and disaggregated power generation data are available. Additionally, we establish the convergence conditions for inference the quadratic coefficients of cost functions when LMPs and aggregated generation data are given. Our theoretical analysis provides the conditions under which the algorithm is guaranteed to converge, and our experiments demonstrate the efficacy of this method on IEEE benchmark systems, including 14-bus and 30-bus and 118-bus systems.

Paperid: 2517, https://arxiv.org/pdf/2503.20932.pdf

Abstract:
There is growing interest in Secure Analytics, but fully oblivious query execution in Secure Multi-Party Computation (MPC) settings is often prohibitively expensive. Recent related works propose different approaches to trimming the size of intermediate results between query operators, resulting in significant speedups at the cost of some information leakage. In this work, we generalize these ideas into a method of flexible and efficient trimming of operator outputs that can be added to MPC operators easily. This allows for precisely controlling the security/performance trade-off on a per-operator and per-query basis. We demonstrate that our work is practical by porting a state-of-the-art trimming approach to it, resulting in a faster runtime and increased security. Our work lays down the foundation for a future MPC query planner that can pick different performance and security targets when composing physical query plans.

Paperid: 2518, https://arxiv.org/pdf/2503.20794.pdf

Abstract:
We evaluate the performance of four leading solutions for de-identification of unstructured medical text - Azure Health Data Services, AWS Comprehend Medical, OpenAI GPT-4o, and John Snow Labs - on a ground truth dataset of 48 clinical documents annotated by medical experts. The analysis, conducted at both entity-level and token-level, suggests that John Snow Labs' Medical Language Models solution achieves the highest accuracy, with a 96% F1-score in protected health information (PHI) detection, outperforming Azure (91%), AWS (83%), and GPT-4o (79%). John Snow Labs is not only the only solution which achieves regulatory-grade accuracy (surpassing that of human experts) but is also the most cost-effective solution: It is over 80% cheaper compared to Azure and GPT-4o, and is the only solution not priced by token. Its fixed-cost local deployment model avoids the escalating per-request fees of cloud-based services, making it a scalable and economical choice.

Paperid: 2519, https://arxiv.org/pdf/2503.18487.pdf

Abstract:
Malicious traffic detection is a pivotal technology for network security to identify abnormal network traffic and detect network attacks. Large Language Models (LLMs) are trained on a vast corpus of text, have amassed remarkable capabilities of context-understanding and commonsense knowledge. This has opened up a new door for network attacks detection. Researchers have already initiated discussions regarding the application of LLMs on specific cyber-security tasks. Unfortunately, there remains a lack of comprehensive analysis on harnessing LLMs for traffic detection, as well as the opportunities and challenges. In this paper, we focus on unleashing the full potential of Large Language Models (LLMs) in malicious traffic detection. We present a holistic view of the architecture of LLM-powered malicious traffic detection, including the procedures of Pre-training, Fine-tuning, and Detection. Especially, by exploring the knowledge and capabilities of LLM, we identify three distinct roles LLM can act in traffic classification: Classifier, Encoder, and Predictor. For each of them, the modeling paradigm, opportunities and challenges are elaborated. Finally, we present our design on LLM-powered DDoS detection as a case study. The proposed framework attains accurate detection on carpet bombing DDoS by exploiting LLMs' capabilities in contextual mining. The evaluation shows its efficacy, exhibiting a nearly 35% improvement compared to existing systems.

Paperid: 2520, https://arxiv.org/pdf/2503.18173.pdf

Abstract:
In recent years, many cyber incidents have occurred in the maritime sector, targeting the information technology (IT) and operational technology (OT) infrastructure. One of the key approaches for handling cyber incidents is cyber security monitoring, which aims at timely detection of cyber attacks with automated methods. Although several literature review papers have been published in the field of maritime cyber security, none of the previous studies has focused on cyber security monitoring. The current paper addresses this research gap and surveys the methods, algorithms, tools and architectures used for cyber security monitoring in the maritime sector. For the survey, a systematic literature review of cyber security monitoring studies is conducted following the Preferred Reporting Items for Systematic reviews and Meta-Analyses (PRISMA) protocol. The first contribution of this paper is the bibliometric analysis of related literature and the identification of the main research themes in previous works. For that purpose, the paper presents a taxonomy for existing studies which highlights the main properties of maritime cyber security monitoring research. The second contribution of this paper is an in-depth analysis of previous works and the identification of research gaps and limitations in existing literature. The gaps and limitations include several dataset and evaluation issues and a number of understudied research topics. Based on these findings, the paper outlines future research directions for cyber security monitoring in the maritime field.

Paperid: 2521, https://arxiv.org/pdf/2503.17756.pdf

Abstract:
Onsite bandwidth reservation requests often face challenges such as price fluctuations and fairness issues due to unpredictable bandwidth availability and stringent latency requirements. Requesting bandwidth in advance can mitigate the impact of these fluctuations and ensure timely access to critical resources. In a multi-Mobile Network Operator (MNO) environment, vehicles need to select cost-effective and reliable resources for their safety-critical applications. This research aims to minimize resource costs by finding the best price among multiple MNOs. It formulates multi-operator scenarios as a Markov Decision Process (MDP), utilizing a Deep Reinforcement Learning (DRL) algorithm, specifically Dueling Deep Q-Learning. For efficient and stable learning, we propose a novel area-wise approach and an adaptive MDP synthetic close to the real environment. The Temporal Fusion Transformer (TFT) is used to handle time-dependent data and model training. Furthermore, the research leverages Amazon spot price data and adopts a multi-phase training approach, involving initial training on synthetic data, followed by real-world data. These phases enable the DRL agent to make informed decisions using insights from historical data and real-time observations. The results show that our model leads to significant cost reductions, up to 40%, compared to scenarios without a policy model in such a complex environment.

Paperid: 2522, https://arxiv.org/pdf/2503.17298.pdf

Abstract:
Unmanned aerial vehicles (UAVs) depend on untrusted software components to automate dangerous or critical missions, making them a desirable target for attacks. Some work has been done to prevent an attacker who has either compromised a ground control station or parts of a UAV's software from sabotaging the vehicle, but not both. We present an architecture running a UAV software stack with runtime monitoring and seL4-based software isolation that prevents attackers from both exploiting software bugs and stealthy attacks. Our architecture retrofits legacy UAVs and secures the popular MAVLink protocol, making wide adoption possible.

Paperid: 2523, https://arxiv.org/pdf/2503.16266.pdf

Abstract:
Model Inversion Attacks (MIAs) aim to reconstruct private training data from models, leading to privacy leakage, particularly in facial recognition systems. Although many studies have enhanced the effectiveness of white-box MIAs, less attention has been paid to improving efficiency and utility under limited attacker capabilities. Existing black-box MIAs necessitate an impractical number of queries, incurring significant overhead. Therefore, we analyze the limitations of existing MIAs and introduce Surrogate Model-based Inversion with Long-tailed Enhancement (SMILE), a high-resolution oriented and query-efficient MIA for the black-box setting. We begin by analyzing the initialization of MIAs from a data distribution perspective and propose a long-tailed surrogate training method to obtain high-quality initial points. We then enhance the attack's effectiveness by employing the gradient-free black-box optimization algorithm selected by NGOpt. Our experiments show that SMILE outperforms existing state-of-the-art black-box MIAs while requiring only about 5% of the query overhead.

Paperid: 2524, https://arxiv.org/pdf/2503.16047.pdf

Abstract:
Denial-of-Service (DoS) attacks remain a critical threat to network security, disrupting services and causing significant economic losses. Traditional detection methods, including statistical and rule-based models, struggle to adapt to evolving attack patterns. To address this challenge, we propose a novel Temporal-Spatial Attention Network (TSAN) architecture for detecting Denial of Service (DoS) attacks in network traffic. By leveraging both temporal and spatial features of network traffic, our approach captures complex traffic patterns and anomalies that traditional methods might miss. The TSAN model incorporates transformer-based temporal encoding, convolutional spatial encoding, and a cross-attention mechanism to fuse these complementary feature spaces. Additionally, we employ multi-task learning with auxiliary tasks to enhance the model's robustness. Experimental results on the NSL-KDD dataset demonstrate that TSAN outperforms state-of-the-art models, achieving superior accuracy, precision, recall, and F1-score while maintaining computational efficiency for real-time deployment. The proposed architecture offers an optimal balance between detection accuracy and computational overhead, making it highly suitable for real-world network security applications.

Paperid: 2525, https://arxiv.org/pdf/2503.14922.pdf

Abstract:
Graph Convolutional Networks (GCNs) have shown excellent performance in graph-structured tasks such as node classification and graph classification. However, recent research has shown that GCNs are vulnerable to a new type of threat called the backdoor attack, where the adversary can inject a hidden backdoor into the GCNs so that the backdoored model performs well on benign samples, whereas its prediction will be maliciously changed to the attacker-specified target label if the hidden backdoor is activated by the attacker-defined trigger. Clean-label backdoor attack and semantic backdoor attack are two new backdoor attacks to Deep Neural Networks (DNNs), they are more imperceptible and have posed new and serious threats. The semantic and clean-label backdoor attack is not fully explored in GCNs. In this paper, we propose a semantic and clean-label backdoor attack against GCNs under the context of graph classification to reveal the existence of this security vulnerability in GCNs. Specifically, SCLBA conducts an importance analysis on graph samples to select one type of node as semantic trigger, which is then inserted into the graph samples to create poisoning samples without changing the labels of the poisoning samples to the attacker-specified target label. We evaluate SCLBA on multiple datasets and the results show that SCLBA can achieve attack success rates close to 99% with poisoning rates of less than 3%, and with almost no impact on the performance of model on benign samples.

Paperid: 2526, https://arxiv.org/pdf/2503.14611.pdf

Abstract:
Confidential services running in hardware-protected Trusted Execution Environments (TEEs) can provide higher security assurance, but this requires custom clients and protocols to distribute, update, and verify their attestation evidence. Compared with classic Internet security, built upon universal abstractions such as domain names, origins, and certificates, this puts a significant burden on service users and providers. In particular, Web browsers and other legacy clients do not get the same security guaranties as custom clients. We present a new approach for users to establish trust in confidential services. We propose attested DNS (aDNS): a name service that securely binds the attested implementation of confidential services to their domain names. ADNS enforces policies for all names in its zone of authority: any TEE that runs a service must present hardware attestation that complies with the domain-specific policy before registering keys and obtaining certificates for any name in this domain. ADNS provides protocols for zone delegation, TEE registration, and certificate issuance. ADNS builds on standards such as DNSSEC, DANE, ACME and Certificate Transparency. ADNS provides DNS transparency by keeping all records, policies, and attestations in a public append-only log, thereby enabling auditing and preventing targeted attacks. We implement aDNS as a confidential service using a fault-tolerant network of TEEs. We evaluate it using sample confidential services that illustrate various TEE platforms. On the client side, we provide a generic browser extension that queries and verifies attestation records before opening TLS connections, with negligible performance overhead, and we show that, with aDNS, even legacy Web clients benefit from confidential computing as long as some enlightened clients verify attestations to deter or blame malicious actors.

Paperid: 2527, https://arxiv.org/pdf/2503.14281.pdf

Abstract:
AI coding assistants are widely used for tasks like code generation. These tools now require large and complex contexts, automatically sourced from various origins$\unicode{x2014}$across files, projects, and contributors$\unicode{x2014}$forming part of the prompt fed to underlying LLMs. This automatic context-gathering introduces new vulnerabilities, allowing attackers to subtly poison input to compromise the assistant's outputs, potentially generating vulnerable code or introducing critical errors. We propose a novel attack, Cross-Origin Context Poisoning (XOXO), that is challenging to detect as it relies on adversarial code modifications that are semantically equivalent. Traditional program analysis techniques struggle to identify these perturbations since the semantics of the code remains correct, making it appear legitimate. This allows attackers to manipulate coding assistants into producing incorrect outputs, while shifting the blame to the victim developer. We introduce a novel, task-agnostic, black-box attack algorithm GCGS that systematically searches the transformation space using a Cayley Graph, achieving a 75.72% attack success rate on average across five tasks and eleven models, including GPT 4.1 and Claude 3.5 Sonnet v2 used by popular AI coding assistants. Furthermore, defenses like adversarial fine-tuning are ineffective against our attack, underscoring the need for new security measures in LLM-powered coding tools.

Paperid: 2528, https://arxiv.org/pdf/2503.11659.pdf

Abstract:
The increasing complexity of digital ecosystems and evolving cybersecurity threats have highlighted the limitations of traditional perimeter-based security models, leading to the growing adoption of Zero Trust Architecture (ZTA). ZTA operates on the principle of "never trust, always verify", enforcing continuous authentication, conditional access, dynamic trust evaluation, and the principle of least privilege to enhance security across diverse domains. This study applies the PRISMA framework to analyze 10 years of research (2016-2025) on ZTA, presenting a systematic literature review (SLR) that synthesizes its applications, enabling technologies, and associated challenges. It provides a detailed taxonomy that organizes ZTA's application domains, together with the emerging technologies that facilitate its implementation, and critically examines the barriers to ZTA adoption. Additionally, the study traces the historical evolution of ZTA alongside notable events and publications trends while highlighting some potential factors for the surge over the past few years. This comprehensive analysis serves as a practical guide for researchers and practitioners seeking to leverage ZTA for stronger, more adaptive security frameworks in a rapidly shifting threat landscape.

Paperid: 2529, https://arxiv.org/pdf/2503.11573.pdf

Abstract:
Cloud compute systems allow administrators to write access control policies that govern access to private data. While policies are written in convenient languages, such as AWS Identity and Access Management Policy Language, manually written policies often become complex and error prone. In this paper, we investigate whether and how well Large Language Models (LLMs) can be used to synthesize access control policies. Our investigation focuses on the task of taking an access control request specification and zero-shot prompting LLMs to synthesize a well-formed access control policy which correctly adheres to the request specification. We consider two scenarios, one which the request specification is given as a concrete list of requests to be allowed or denied, and another in which a natural language description is used to specify sets of requests to be allowed or denied. We then argue that for zero-shot prompting, more precise and structured prompts using a syntax based approach are necessary and experimentally show preliminary results validating our approach.

Paperid: 2530, https://arxiv.org/pdf/2503.09460.pdf

Abstract:
The European Cybersecurity Certification Scheme for Cloud Services (EUCS) is one of the first cybersecurity schemes in Europe, defined by the European Union Agency for Cybersecurity (ENISA). It aims to encourage cloud providers to strengthen their cybersecurity policies in order to receive an official seal of approval from European authorities. EUCS defines a set of security requirements that the cloud provider must meet, in whole or in part, in order to achieve the security certification. The requirements are written in natural language and cover every aspect of security in the cloud environment, from logging access to protecting the system with anti-malware tools to training staff. Operationally, each requirement is associated with one or more evaluable metrics. For example, a requirement to monitor access attempts to a service will have associated metrics that take into account the number of accesses, the number of access attempts, who is accessing, and what resources are being used. Partners in the European project Medina, which ended in October 2023, defined 163 metrics and manually mapped them to 70 EUCS requirements. Manual mapping is intuitively a long and costly process in terms of human resources. This paper proposes an approach based on Sentence Transformers to automatically associate requirements and metrics. In terms of correctness of associations, the proposed method achieves a Normalized Discounted Cumulative Gain of 0.640, improving a previous experiment by 0.146 points.

Paperid: 2531, https://arxiv.org/pdf/2503.07048.pdf

Abstract:
In an MPC-protected distributed computation, although the use of MPC assures data privacy during computation, sensitive information may still be inferred by curious MPC participants from the computation output. This can be observed, for instance, in the inference attacks on either federated learning or a more standard statistical computation with distributed inputs. In this work, we address this output privacy issue by proposing a discrete and bounded Laplace-inspired perturbation mechanism along with a secure realization of this mechanism using MPC. The proposed mechanism strictly adheres to a zero failure probability, overcoming the limitation encountered on other existing bounded and discrete variants of Laplace perturbation. We provide analyses of the proposed differential privacy (DP) perturbation in terms of its privacy and utility. Additionally, we designed MPC protocols to implement this mechanism and presented performance benchmarks based on our experimental setup. The MPC realization of the proposed mechanism exhibits a complexity similar to the state-of-the-art discrete Gaussian mechanism, which can be considered an alternative with comparable efficiency while providing stronger differential privacy guarantee. Moreover, efficiency of the proposed scheme can be further enhanced by performing the noise generation offline while leaving the perturbation phase online.

Paperid: 2532, https://arxiv.org/pdf/2503.05850.pdf

Abstract:
This paper explores the use of partially homomorphic encryption (PHE) for encrypted vector similarity search, with a focus on facial recognition and broader applications like reverse image search, recommendation engines, and large language models (LLMs). While fully homomorphic encryption (FHE) exists, we demonstrate that encrypted cosine similarity can be computed using PHE, offering a more practical alternative. Since PHE does not directly support cosine similarity, we propose a method that normalizes vectors in advance, enabling dot product calculations as a proxy. We also apply min-max normalization to handle negative dimension values. Experiments on the Labeled Faces in the Wild (LFW) dataset use DeepFace's FaceNet128d, FaceNet512d, and VGG-Face (4096d) models in a two-tower setup. Pre-encrypted embeddings are stored in one tower, while an edge device captures images, computes embeddings, and performs encrypted-plaintext dot products via additively homomorphic encryption. We implement this with LightPHE, evaluating Paillier, Damgard-Jurik, and Okamoto-Uchiyama schemes, excluding others due to performance or decryption complexity. Tests at 80-bit and 112-bit security (NIST-secure until 2030) compare PHE against FHE (via TenSEAL), analyzing encryption, decryption, operation time, cosine similarity loss, key/ciphertext sizes. Results show PHE is less computationally intensive, faster, and produces smaller ciphertexts/keys, making it well-suited for memory-constrained environments and real-world privacy-preserving encrypted similarity search.

Paperid: 2533, https://arxiv.org/pdf/2503.05812.pdf

Abstract:
Frontier AI models -- highly capable foundation models at the cutting edge of AI development -- may pose severe risks to public safety, human rights, economic stability, and societal value in the coming years. These risks could arise from deliberate adversarial misuse, system failures, unintended cascading effects, or simultaneous failures across multiple models. In response to such risks, at the AI Seoul Summit in May 2024, 16 global AI industry organizations signed the Frontier AI Safety Commitments, and 27 nations and the EU issued a declaration on their intent to define these thresholds. To fulfill these commitments, organizations must determine and disclose ``thresholds at which severe risks posed by a model or system, unless adequately mitigated, would be deemed intolerable.'' To assist in setting and operationalizing intolerable risk thresholds, we outline key principles and considerations; for example, to aim for ``good, not perfect'' thresholds in the face of limited data on rapidly advancing AI capabilities and consequently evolving risks. We also propose specific threshold recommendations, including some detailed case studies, for a subset of risks across eight risk categories: (1) Chemical, Biological, Radiological, and Nuclear (CBRN) Weapons, (2) Cyber Attacks, (3) Model Autonomy, (4) Persuasion and Manipulation, (5) Deception, (6) Toxicity, (7) Discrimination, and (8) Socioeconomic Disruption. Our goal is to serve as a starting point or supplementary resource for policymakers and industry leaders, encouraging proactive risk management that prioritizes preventing intolerable risks (ex ante) rather than merely mitigating them after they occur (ex post).

Paperid: 2534, https://arxiv.org/pdf/2503.04302.pdf

Abstract:
The rapid evolution of malware attacks calls for the development of innovative detection methods, especially in resource-constrained edge computing. Traditional detection techniques struggle to keep up with modern malware's sophistication and adaptability, prompting a shift towards advanced methodologies like those leveraging Large Language Models (LLMs) for enhanced malware detection. However, deploying LLMs for malware detection directly at edge devices raises several challenges, including ensuring accuracy in constrained environments and addressing edge devices' energy and computational limits. To tackle these challenges, this paper proposes an architecture leveraging lightweight LLMs' strengths while addressing limitations like reduced accuracy and insufficient computational power. To evaluate the effectiveness of the proposed lightweight LLM-based approach for edge computing, we perform an extensive experimental evaluation using several state-of-the-art lightweight LLMs. We test them with several publicly available datasets specifically designed for edge and IoT scenarios and different edge nodes with varying computational power and characteristics.

Paperid: 2535, https://arxiv.org/pdf/2503.04260.pdf

Abstract:
We propose Data Tumbling Layer (DTL), a cryptographic scheme for non-interactive data tumbling. The core concept is to enable users to commit to specific data and subsequently re-use to the encrypted version of these data across different applications while removing the link to the previous data commit action. We define the following security and privacy notions for DTL: (i) no one-more redemption: a malicious user cannot redeem and use the same data more than the number of times they have committed the data; (ii) theft prevention: a malicious user cannot use data that has not been committed by them; (iii) non-slanderabilty: a malicious user cannot prevent an honest user from using their previously committed data; and (iv) unlinkability: a malicious user cannot link tainted data from an honest user to the corresponding data after it has been tumbled. To showcase the practicality of DTL, we use DTL to realize applications for (a) unlinkable fixed-amount payments; (b) unlinkable and confidential payments for variable amounts; (c) unlinkable weighted voting protocol. Finally, we implemented and evaluated all the proposed applications. For the unlinkable and confidential payment application, a user can initiate such a transaction in less than $1.5$s on a personal laptop. In terms of on-chain verification, the gas cost is less than $1.8$ million.

Paperid: 2536, https://arxiv.org/pdf/2503.03043.pdf

Abstract:
We study how inherent randomness in the training process -- where each sample (or client in federated learning) contributes only to a randomly selected portion of training -- can be leveraged for privacy amplification. This includes (1) data partitioning, where a sample participates in only a subset of training iterations, and (2) model partitioning, where a sample updates only a subset of the model parameters. We apply our framework to model parallelism in federated learning, where each client updates a randomly selected subnetwork to reduce memory and computational overhead, and show that existing methods, e.g. model splitting or dropout, provide a significant privacy amplification gain not captured by previous privacy analysis techniques. Additionally, we introduce Balanced Iteration Subsampling, a new data partitioning method where each sample (or client) participates in a fixed number of training iterations. We show that this method yields stronger privacy amplification than Poisson (i.i.d.) sampling of data (or clients). Our results demonstrate that randomness in the training process, which is structured rather than i.i.d. and interacts with data in complex ways, can be systematically leveraged for significant privacy amplification.

Paperid: 2537, https://arxiv.org/pdf/2503.00659.pdf

Abstract:
With cooperative perception, autonomous vehicles can wirelessly share sensor data and representations to overcome sensor occlusions, improving situational awareness. Securing such data exchanges is crucial for connected autonomous vehicles. Existing, automated reputation-based approaches often suffer from a delay between detection and exclusion of misbehaving vehicles, while majority-based approaches have communication overheads that limits scalability. In this paper, we introduce CATS, a novel automated system that blends together the best traits of reputation-based and majority-based detection mechanisms to secure vehicle-to-everything (V2X) communications for cooperative perception, while preserving the privacy of cooperating vehicles. Our evaluation with city-scale simulations on realistic traffic data shows CATS's effectiveness in rapidly identifying and isolating misbehaving vehicles, with a low false negative rate and overheads, proving its suitability for real world deployments.

Paperid: 2538, https://arxiv.org/pdf/2502.19996.pdf

Abstract:
Distributed Denial of Service (DDoS) attacks persist as significant threats to online services and infrastructure, evolving rapidly in sophistication and eluding traditional detection mechanisms. This evolution demands a comprehensive examination of current trends in DDoS attacks and the efficacy of modern detection strategies. This paper offers an comprehensive survey of emerging DDoS attacks and detection strategies over the past decade. We delve into the diversification of attack targets, extending beyond conventional web services to include newer network protocols and systems, and the adoption of advanced adversarial tactics. Additionally, we review current detection techniques, highlighting essential features that modern systems must integrate to effectively neutralize these evolving threats. Given the technological demands of contemporary network systems, such as high-volume and in-line packet processing capabilities, we also explore how innovative hardware technologies like programmable switches can significantly enhance the development and deployment of robust DDoS detection systems. We conclude by identifying open problems and proposing future directions for DDoS research. In particular, our survey sheds light on the investigation of DDoS attack surfaces for emerging systems, protocols, and adversarial strategies. Moreover, we outlines critical open questions in the development of effective detection systems, e.g., the creation of defense mechanisms independent of control planes.

Paperid: 2539, https://arxiv.org/pdf/2502.19710.pdf

Abstract:
Given the need to evaluate the robustness of face recognition (FR) models, many efforts have focused on adversarial patch attacks that mislead FR models by introducing localized perturbations. Impersonation attacks are a significant threat because adversarial perturbations allow attackers to disguise themselves as legitimate users. This can lead to severe consequences, including data breaches, system damage, and misuse of resources. However, research on such attacks in FR remains limited. Existing adversarial patch generation methods exhibit limited efficacy in impersonation attacks due to (1) the need for high attacker capabilities, (2) low attack success rates, and (3) excessive query requirements. To address these challenges, we propose a novel method SAP-DIFF that leverages diffusion models to generate adversarial patches via semantic perturbations in the latent space rather than direct pixel manipulation. We introduce an attention disruption mechanism to generate features unrelated to the original face, facilitating the creation of adversarial samples and a directional loss function to guide perturbations toward the target identity feature space, thereby enhancing attack effectiveness and efficiency. Extensive experiments on popular FR models and datasets demonstrate that our method outperforms state-of-the-art approaches, achieving an average attack success rate improvement of 45.66% (all exceeding 40%), and a reduction in the number of queries by about 40% compared to the SOTA approach

Paperid: 2540, https://arxiv.org/pdf/2502.19567.pdf

Abstract:
The rapid adoption of open source machine learning (ML) datasets and models exposes today's AI applications to critical risks like data poisoning and supply chain attacks across the ML lifecycle. With growing regulatory pressure to address these issues through greater transparency, ML model vendors face challenges balancing these requirements against confidentiality for data and intellectual property needs. We propose Atlas, a framework that enables fully attestable ML pipelines. Atlas leverages open specifications for data and software supply chain provenance to collect verifiable records of model artifact authenticity and end-to-end lineage metadata. Atlas combines trusted hardware and transparency logs to enhance metadata integrity, preserve data confidentiality, and limit unauthorized access during ML pipeline operations, from training through deployment. Our prototype implementation of Atlas integrates several open-source tools to build an ML lifecycle transparency system, and assess the practicality of Atlas through two case study ML pipelines.

Paperid: 2541, https://arxiv.org/pdf/2502.19154.pdf

Abstract:
Energy communities consist of decentralized energy production, storage, consumption, and distribution and are gaining traction in modern power systems. However, these communities may increase the vulnerability of the grid to cyber threats. We propose an anomaly-based intrusion detection system to enhance the security of energy communities. The system leverages deep autoencoders to detect deviations from normal operational patterns in order to identify anomalies induced by malicious activities and attacks. Operational data for training and evaluation are derived from a Simulink model of an energy community. The results show that the autoencoder-based intrusion detection system achieves good detection performance across multiple attack scenarios. We also demonstrate potential for real-world application of the system by training a federated model that enables distributed intrusion detection while preserving data privacy.

Paperid: 2542, https://arxiv.org/pdf/2502.18545.pdf

Abstract:
The widespread adoption of Large Language Models (LLMs) has raised significant privacy concerns regarding the exposure of personally identifiable information (PII) in user prompts. To address this challenge, we propose a query-unrelated PII masking strategy and introduce PII-Bench, the first comprehensive evaluation framework for assessing privacy protection systems. PII-Bench comprises 2,842 test samples across 55 fine-grained PII categories, featuring diverse scenarios from single-subject descriptions to complex multi-party interactions. Each sample is carefully crafted with a user query, context description, and standard answer indicating query-relevant PII. Our empirical evaluation reveals that while current models perform adequately in basic PII detection, they show significant limitations in determining PII query relevance. Even state-of-the-art LLMs struggle with this task, particularly in handling complex multi-subject scenarios, indicating substantial room for improvement in achieving intelligent PII masking.

Paperid: 2543, https://arxiv.org/pdf/2502.17653.pdf

Abstract:
Remote attestation (RA) is the foundation for trusted execution environments in the cloud and trusted device driver onboarding in operating systems. However, RA misses a rigorous mechanized definition of its security properties in one of the strongest models, i.e., the semantic model. Such a mechanization requires the concept of StateSeparating Proofs (SSP). However, SSP was only recently implemented as a foundational framework in the Rocq Prover. Based on this framework, this paper presents the first mechanized formalization of the fundamental security properties of RA. Our Rocq Prover development first defines digital signatures and formally verifies security against forgery in the strong existential attack model. Based on these results, we define RA and reduce the security of RA to the security of digital signatures. Our development provides evidence that the RA protocol is secure against forgery. Additionally, we extend our reasoning to the primitives of RA and reduce their security to the security of the primitives of the digital signatures.

Paperid: 2544, https://arxiv.org/pdf/2502.16756.pdf

Abstract:
Speculative attacks such as Spectre can leak secret information without being discovered by the operating system. Speculative execution vulnerabilities are finicky and deep in the sense that to exploit them, it requires intensive manual labor and intimate knowledge of the hardware. In this paper, we introduce SpecRL, a framework that utilizes reinforcement learning to find speculative execution leaks in post-silicon (black box) microprocessors.

Paperid: 2545, https://arxiv.org/pdf/2502.16750.pdf

Abstract:
The autonomous AI agents using large language models can create undeniable values in all span of the society but they face security threats from adversaries that warrants immediate protective solutions because trust and safety issues arise. Considering the many-shot jailbreaking and deceptive alignment as some of the main advanced attacks, that cannot be mitigated by the static guardrails used during the supervised training, points out a crucial research priority for real world robustness. The combination of static guardrails in dynamic multi-agent system fails to defend against those attacks. We intend to enhance security for LLM-based agents through the development of new evaluation frameworks which identify and counter threats for safe operational deployment. Our work uses three examination methods to detect rogue agents through a Reverse Turing Test and analyze deceptive alignment through multi-agent simulations and develops an anti-jailbreaking system by testing it with GEMINI 1.5 pro and llama-3.3-70B, deepseek r1 models using tool-mediated adversarial scenarios. The detection capabilities are strong such as 94\% accuracy for GEMINI 1.5 pro yet the system suffers persistent vulnerabilities when under long attacks as prompt length increases attack success rates (ASR) and diversity metrics become ineffective in prediction while revealing multiple complex system faults. The findings demonstrate the necessity of adopting flexible security systems based on active monitoring that can be performed by the agents themselves together with adaptable interventions by system admin as the current models can create vulnerabilities that can lead to the unreliable and vulnerable system. So, in our work, we try to address such situations and propose a comprehensive framework to counteract the security issues.

Paperid: 2546, https://arxiv.org/pdf/2502.16375.pdf

Abstract:
Building on related concepts, like, decentralized identifiers (DIDs), proof of personhood, anonymous credentials, personhood credentials (PHCs) emerged as an alternative approach, enabling individuals to verify to digital service providers that they are a person without disclosing additional information. However, new technologies might introduce some friction due to users misunderstandings and mismatched expectations. Despite their growing importance, limited research has been done on users perceptions and preferences regarding PHCs. To address this gap, we conducted competitive analysis, and semi-structured online user interviews with 23 participants from US and EU to provide concrete design recommendations for PHCs that incorporate user needs, adoption rules, and preferences. Our study -- (a)surfaces how people reason about unknown privacy and security guarantees of PHCs compared to current verification methods -- (b) presents the impact of several factors on how people would like to onboard and manage PHCs, including, trusted issuers (e.g. gov), ground truth data to issue PHC (e.g biometrics, physical id), and issuance system (e.g. centralized vs decentralized). In a think-aloud conceptual design session, participants recommended -- conceptualized design, such as periodic biometrics verification, time-bound credentials, visually interactive human-check, and supervision of government for issuance system. We propose actionable designs reflecting users preferences.

Paperid: 2547, https://arxiv.org/pdf/2502.15929.pdf

Abstract:
This paper presents a novel approach to evaluate the efficiency of a RAG-based agentic Large Language Model (LLM) architecture for network packet seed generation and enrichment. Enhanced by chain-of-thought (COT) prompting techniques, the proposed approach focuses on the improvement of the seeds' structural quality in order to guide protocol fuzzing frameworks through a wide exploration of the protocol state space. Our method leverages RAG and text embeddings to dynamically reference to the Request For Comments (RFC) documents knowledge base for answering queries regarding the protocol's Finite State Machine (FSM), then iteratively reasons through the retrieved knowledge, for output refinement and proper seed placement. We then evaluate the response structure quality of the agent's output, based on metrics as BLEU, ROUGE, and Word Error Rate (WER) by comparing the generated packets against the ground-truth packets. Our experiments demonstrate significant improvements of up to 18.19%, 14.81%, and 23.45% in BLEU, ROUGE, and WER, respectively, over baseline models. These results confirm the potential of such approach, improving LLM-based protocol fuzzing frameworks for the identification of hidden vulnerabilities.

Paperid: 2549, https://arxiv.org/pdf/2502.15504.pdf

Abstract:
Audio deepfakes are increasingly in-differentiable from organic speech, often fooling both authentication systems and human listeners. While many techniques use low-level audio features or optimization black-box model training, focusing on the features that humans use to recognize speech will likely be a more long-term robust approach to detection. We explore the use of prosody, or the high-level linguistic features of human speech (e.g., pitch, intonation, jitter) as a more foundational means of detecting audio deepfakes. We develop a detector based on six classical prosodic features and demonstrate that our model performs as well as other baseline models used by the community to detect audio deepfakes with an accuracy of 93% and an EER of 24.7%. More importantly, we demonstrate the benefits of using a linguistic features-based approach over existing models by applying an adaptive adversary using an $L_{\infty}$ norm attack against the detectors and using attention mechanisms in our training for explainability. We show that we can explain the prosodic features that have highest impact on the model's decision (Jitter, Shimmer and Mean Fundamental Frequency) and that other models are extremely susceptible to simple $L_{\infty}$ norm attacks (99.3% relative degradation in accuracy). While overall performance may be similar, we illustrate the robustness and explainability benefits to a prosody feature approach to audio deepfake detection.

Paperid: 2551, https://arxiv.org/pdf/2502.13830.pdf

Abstract:
We study the round complexity of secure multi-party computation (MPC) in the post-quantum regime. Our focus is on the fully black-box setting, where both the construction and security reduction are black-box. Chia, Chung, Liu, and Yamakawa [FOCS'22] demonstrated the infeasibility of achieving standard simulation-based security within constant rounds unless $\mathbf{NP} \subseteq \mathbf{BQP}$. This leaves crucial feasibility questions unresolved. Specifically, it remains unknown whether black-box constructions are achievable within polynomial rounds; also, the existence of constant-round constructions with respect to $Îµ$-simulation, a relaxed yet useful alternative to standard simulation, remains unestablished. This work provides positive answers. We introduce the first black-box construction for PQ-MPC in polynomial rounds, from the minimal assumption of post-quantum semi-honest oblivious transfers. In the two-party scenario, our construction requires only $Ï(1)$ rounds. These results have already been applied in the oracle separation between classical-communication quantum MPC and $\mathbf{P} = \mathbf{NP}$ in Kretschmer, Qian, and Tal [STOC'25]. As for $Îµ$-simulation, Chia, Chung, Liang, and Yamakawa [CRYPTO'22] resolved the issue for the two-party setting, leaving the multi-party case open. We complete the picture by presenting the first black-box, constant-round construction in the multi-party setting, instantiable using various standard post-quantum primitives. En route, we obtain a black-box, constant-round post-quantum commitment achieving a weaker version of 1-many non-malleability, from post-quantum one-way functions. Besides its role in our MPC construction, this commitment also reduces the assumption used in the quantum parallel repetition lower bound by Bostanci, Qian, Spooner, and Yuen [STOC'24]. We anticipate further applications in the future.

Paperid: 2552, https://arxiv.org/pdf/2502.11616.pdf

Abstract:
The Internet of Behaviors (IoB) is an emerging concept that utilizes devices to collect human behavior and provide intelligent services. Although some research has focused on human behavior analysis and data collection within IoB, the associated security and privacy challenges remain insufficiently explored. This paper analyzes the security and privacy risks at different stages of behavioral data generating, uploading, and using, while also considering the dynamic characteristics of user activity areas. Then, we propose a blockchain-based distributed IoB data storage and sharing framework, which is categorized into sensing, processing, and management layers based on these stages. To accommodate both identity authentication and behavioral privacy, zero-knowledge proofs are used in the sensing layer to separate the correlation between behavior and identity, which is further extended to a distributed architecture for cross-domain authentication. In the processing layer, an improved consensus protocol is proposed to enhance the decision-making efficiency of distributed IoB by analyzing the geographical and computational capability of the servers. In the management layer, user permission differences and the privacy of access targets are considered. Different types of behavior are modeled as corresponding relationships between keys, and fine-grained secure access is achieved through function secret sharing. Simulation results demonstrate the effectiveness of the proposed framework in multi-scenario IoB, with average consensus and authentication times reduced by 74% and 56%, respectively.

Paperid: 2553, https://arxiv.org/pdf/2502.11014.pdf

Abstract:
This study evaluates the effectiveness of different feature extraction techniques and classification algorithms in detecting spam messages within SMS data. We analyzed six classifiers Naive Bayes, K-Nearest Neighbors, Support Vector Machines, Linear Discriminant Analysis, Decision Trees, and Deep Neural Networks using two feature extraction methods: bag-of-words and TF-IDF. The primary objective was to determine the most effective classifier-feature combination for SMS spam detection. Our research offers two main contributions: first, by systematically examining various classifier and feature extraction pairings, and second, by empirically evaluating their ability to distinguish spam messages. Our results demonstrate that the TF-IDF method consistently outperforms the bag-of-words approach across all six classifiers. Specifically, Naive Bayes with TF-IDF achieved the highest accuracy of 96.2%, with a precision of 0.976 for non-spam and 0.754 for spam messages. Similarly, Support Vector Machines with TF-IDF exhibited an accuracy of 94.5%, with a precision of 0.926 for non-spam and 0.891 for spam. Deep Neural Networks using TF-IDF yielded an accuracy of 91.0%, with a recall of 0.991 for non-spam and 0.415 for spam messages. In contrast, classifiers such as K-Nearest Neighbors, Linear Discriminant Analysis, and Decision Trees showed weaker performance, regardless of the feature extraction method employed. Furthermore, we observed substantial variability in classifier effectiveness depending on the chosen feature extraction technique. Our findings emphasize the significance of feature selection in SMS spam detection and suggest that TF-IDF, when paired with Naive Bayes, Support Vector Machines, or Deep Neural Networks, provides the most reliable performance. These insights provide a foundation for improving SMS spam detection through optimized feature extraction and classification methods.

Paperid: 2554, https://arxiv.org/pdf/2502.10997.pdf

Abstract:
Hu and Mehta (2024) posed an open problem: what is the optimal instance-dependent rate for the stochastic decision-theoretic online learning (with $K$ actions and $T$ rounds) under $\varepsilon$-differential privacy? Before, the best known upper bound and lower bound are $O\left(\frac{\log K}{Î_{\min}} + \frac{\log K\log T}{\varepsilon}\right)$ and $Î©\left(\frac{\log K}{Î_{\min}} + \frac{\log K}{\varepsilon}\right)$ (where $Î_{\min}$ is the gap between the optimal and the second actions). In this paper, we partially address this open problem by having two new results. First, we provide an improved upper bound for this problem $O\left(\frac{\log K}{Î_{\min}} + \frac{\log^2K}{\varepsilon}\right)$, which is $T$-independent and only has a log dependency in $K$. Second, to further understand the gap, we introduce the \textit{deterministic setting}, a weaker setting of this open problem, where the received loss vector is deterministic. At this weaker setting, a direct application of the analysis and algorithms from the original setting still leads to an extra log factor. We conduct a novel analysis which proves upper and lower bounds that match at $Î(\frac{\log K}{\varepsilon})$.

Paperid: 2555, https://arxiv.org/pdf/2502.10722.pdf

Abstract:
Modern processors widely equip the Performance Monitoring Unit (PMU) to collect various architecture and microarchitecture events. Software developers often utilize the PMU to enhance program's performance, but the potential side effects that arise from its activation are often disregarded. In this paper, we find that the PMU can be employed to retrieve instruction operands. Based on this discovery, we introduce PMU-Data, a novel category of side-channel attacks aimed at leaking secret by identifying instruction operands with PMU. To achieve the PMU-Data attack, we develop five gadgets to encode the confidential data into distinct data-related traces while maintaining the control-flow unchanged. We then measure all documented PMU events on three physical machines with different processors while those gadgets are performing. We successfully identify two types of vulnerable gadgets caused by DIV and MOV instructions. Additionally, we discover 40 vulnerable PMU events that can be used to carry out the PMU-Data attack. We through real experiments to demonstrate the perniciousness of the PMU-Data attack by implementing three attack goals: (1) leaking the kernel data illegally combined with the transient execution vulnerabilities including Meltdown, Spectre, and Zombieload; (2) building a covert-channel to secretly transfer data; (3) extracting the secret data protected by the Trusted Execution Environment (TEE) combined with the Zombieload vulnerability.

Paperid: 2556, https://arxiv.org/pdf/2502.09808.pdf

Abstract:
Securely computing graph convolutional networks (GCNs) is critical for applying their analytical capabilities to privacy-sensitive data like social/credit networks. Multiplying a sparse yet large adjacency matrix of a graph in GCN--a core operation in training/inference--poses a performance bottleneck in secure GCNs. Consider a GCN with $|V|$ nodes and $|E|$ edges; it incurs a large $O(|V|^2)$ communication overhead. Modeling bipartite graphs and leveraging the monotonicity of non-zero entry locations, we propose a co-design harmonizing secure multi-party computation (MPC) with matrix sparsity. Our sparse matrix decomposition transforms an arbitrary sparse matrix into a product of structured matrices. Specialized MPC protocols for oblivious permutation and selection multiplication are then tailored, enabling our secure sparse matrix multiplication ($(SM)^2$) protocol, optimized for secure multiplication of these structured matrices. Together, these techniques take $O(|E|)$ communication in constant rounds. Supported by $(SM)^2$, we present Virgos, a secure 2-party framework that is communication-efficient and memory-friendly on standard vertically-partitioned graph datasets. Performance of Virgos has been empirically validated across diverse network conditions.

Paperid: 2557, https://arxiv.org/pdf/2502.09788.pdf

Abstract:
Internet miscreants increasingly utilize short-lived disposable domains to launch various attacks. Existing detection mechanisms are either too late to catch such malicious domains due to limited information and their short life spans or unable to catch them due to evasive techniques such as cloaking and captcha. In this work, we investigate the possibility of detecting malicious domains early in their life cycle using a content-agnostic approach. We observe that attackers often reuse or rotate hosting infrastructures to host multiple malicious domains due to increased utilization of automation and economies of scale. Thus, it gives defenders the opportunity to monitor such infrastructure to identify newly hosted malicious domains. However, such infrastructures are often shared hosting environments where benign domains are also hosted, which could result in a prohibitive number of false positives. Therefore, one needs innovative mechanisms to better distinguish malicious domains from benign ones even when they share hosting infrastructures. In this work, we build MANTIS, a highly accurate practical system that not only generates daily blocklists of malicious domains but also is able to predict malicious domains on-demand. We design a network graph based on the hosting infrastructure that is accurate and generalizable over time. Consistently, our models achieve a precision of 99.7%, a recall of 86.9% with a very low false positive rate (FPR) of 0.1% and on average detects 19K new malicious domains per day, which is over 5 times the new malicious domains flagged daily in VirusTotal. Further, MANTIS predicts malicious domains days to weeks before they appear in popular blocklists.

Paperid: 2558, https://arxiv.org/pdf/2502.09396.pdf

Abstract:
Machine learning models can inadvertently expose confidential properties of their training data, making them vulnerable to membership inference attacks (MIA). While numerous evaluation methods exist, many require computationally expensive processes, such as training multiple shadow models. This article presents two new complementary approaches for efficiently identifying vulnerable tree-based models: an ante-hoc analysis of hyperparameter choices and a post-hoc examination of trained model structure. While these new methods cannot certify whether a model is safe from MIA, they provide practitioners with a means to significantly reduce the number of models that need to undergo expensive MIA assessment through a hierarchical filtering approach. More specifically, it is shown that the rank order of disclosure risk for different hyperparameter combinations remains consistent across datasets, enabling the development of simple, human-interpretable rules for identifying relatively high-risk models before training. While this ante-hoc analysis cannot determine absolute safety since this also depends on the specific dataset, it allows the elimination of unnecessarily risky configurations during hyperparameter tuning. Additionally, computationally inexpensive structural metrics serve as indicators of MIA vulnerability, providing a second filtering stage to identify risky models after training but before conducting expensive attacks. Empirical results show that hyperparameter-based risk prediction rules can achieve high accuracy in predicting the most at risk combinations of hyperparameters across different tree-based model types, while requiring no model training. Moreover, target model accuracy is not seen to correlate with privacy risk, suggesting opportunities to optimise model configurations for both performance and privacy.

Paperid: 2559, https://arxiv.org/pdf/2502.08989.pdf

Abstract:
Federated Learning (FL) enables multiple users to collaboratively train a machine learning model without sharing raw data, making it suitable for privacy-sensitive applications. However, local model or weight updates can still leak sensitive information. Secure aggregation protocols mitigate this risk by ensuring that only the aggregated updates are revealed. Among these, single-setup protocols, where key generation and exchange occur only once, are the most efficient due to reduced communication and computation overhead. However, existing single-setup protocols often lack support for dynamic user participation and do not provide strong privacy guarantees such as forward and backward secrecy. \par In this paper, we present a novel secure aggregation protocol that requires only a single setup for the entire FL training. Our protocol supports dynamic user participation, tolerates dropouts, and achieves both forward and backward secrecy. It leverages lightweight symmetric homomorphic encryption with a key negation technique to mask updates efficiently, eliminating the need for user-to-user communication. To defend against model inconsistency attacks, we introduce a low-overhead verification mechanism using message authentication codes (MACs). We provide formal security proofs under both semi-honest and malicious adversarial models and implement a full prototype. Experimental results show that our protocol reduces user-side computation by up to $99\%$ compared to state-of-the-art protocols like e-SeaFL (ACSAC'24), while maintaining competitive model accuracy. These features make our protocol highly practical for real-world FL deployments, especially on resource-constrained devices.

Paperid: 2560, https://arxiv.org/pdf/2502.08332.pdf

Abstract:
The development of large language models (LLMs) has raised concerns about potential misuse. One practical solution is to embed a watermark in the text, allowing ownership verification through watermark extraction. Existing methods primarily focus on defending against modification attacks, often neglecting other spoofing attacks. For example, attackers can alter the watermarked text to produce harmful content without compromising the presence of the watermark, which could lead to false attribution of this malicious content to the LLM. This situation poses a serious threat to the LLMs service providers and highlights the significance of achieving modification detection and generated-text detection simultaneously. Therefore, we propose a technique to detect modifications in text for unbiased watermark which is sensitive to modification. We introduce a new metric called ``discarded tokens", which measures the number of tokens not included in watermark detection. When a modification occurs, this metric changes and can serve as evidence of the modification. Additionally, we improve the watermark detection process and introduce a novel method for unbiased watermark. Our experiments demonstrate that we can achieve effective dual detection capabilities: modification detection and generated-text detection by watermark.

Paperid: 2561, https://arxiv.org/pdf/2502.06609.pdf

Abstract:
Instruction set architectures are complex, with hundreds of registers and instructions that can modify dozens of them during execution, variably on each instance. Prose-style ISA specifications struggle to capture these intricacies of the ISAs, where often the important details about a single register are spread out across hundreds of pages of documentation. Ensuring that all ISA-state is swapped in context switch implementations of privileged software requires meticulous examination of these pages. This manual process is tedious and error-prone. We propose a tool called Sailor that leverages machine-readable ISA specifications written in Sail to automate this task. Sailor determines the ISA-state necessary to swap during the context switch using the data collected from Sail and a novel algorithm to classify ISA-state as security-sensitive. Using Sailor's output, we identify three different classes of mishandled ISA-state across four open-source confidential computing systems. We further reveal five distinct security vulnerabilities that can be exploited using the mishandled ISA-state. This research exposes an often overlooked attack surface that stems from mishandled ISA-state, enabling unprivileged adversaries to exploit system vulnerabilities.

Paperid: 2562, https://arxiv.org/pdf/2502.05211.pdf

Abstract:
While the community has designed various defenses to counter the threat of poisoning attacks in Federated Learning (FL), there are no guidelines for evaluating these defenses. These defenses are prone to subtle pitfalls in their experimental setups that lead to a false sense of security, rendering them unsuitable for practical deployment. In this paper, we systematically understand, identify, and provide a better approach to address these challenges. First, we design a comprehensive systemization of FL defenses along three dimensions: i) how client updates are processed, ii) what the server knows, and iii) at what stage the defense is applied. Next, we thoroughly survey 50 top-tier defense papers and identify the commonly used components in their evaluation setups. Based on this survey, we uncover six distinct pitfalls and study their prevalence. For example, we discover that around 30% of these works solely use the intrinsically robust MNIST dataset, and 40% employ simplistic attacks, which may inadvertently portray their defense as robust. Using three representative defenses as case studies, we perform a critical reevaluation to study the impact of the identified pitfalls and show how they lead to incorrect conclusions about robustness. We provide actionable recommendations to help researchers overcome each pitfall.

Paperid: 2563, https://arxiv.org/pdf/2502.05160.pdf

Abstract:
In this work, we study the solution of shortest vector problems (SVPs) arising in terms of learning with error problems (LWEs). LWEs are linear systems of equations over a modular ring, where a perturbation vector is added to the right-hand side. This type of problem is of great interest, since LWEs have to be solved in order to be able to break lattice-based cryptosystems as the Module-Lattice-Based Key-Encapsulation Mechanism published by NIST in 2024. Due to this fact, several classical and quantum-based algorithms have been studied to solve SVPs. Two well-known algorithms that can be used to simplify a given SVP are the Lenstra-Lenstra-LovÃ¡sz (LLL) algorithm and the Block Korkine-Zolotarev (BKZ) algorithm. LLL and BKZ construct bases that can be used to compute or approximate solutions of the SVP. We study the performance of both algorithms for SVPs with different sizes and modular rings. Thereby, application of LLL or BKZ to a given SVP is considered to be successful if they produce bases containing a solution vector of the SVP.

Paperid: 2564, https://arxiv.org/pdf/2502.04758.pdf

Abstract:
We analyze the DP (differential privacy) properties of the quantum recommendation algorithm and the quantum-inspired-classical recommendation algorithm. We discover that the quantum recommendation algorithm is a privacy curating mechanism on its own, requiring no external noise, which is different from traditional differential privacy mechanisms. In our analysis, a novel perturbation method tailored for SVD (singular value decomposition) and low-rank matrix approximation problems is introduced. Using the perturbation method and random matrix theory, we are able to derive that both the quantum and quantum-inspired-classical algorithms are $\big(\tilde{\mathcal{O}}\big(\frac 1n\big),\,\, \tilde{\mathcal{O}}\big(\frac{1}{\min\{m,n\}}\big)\big)$-DP under some reasonable restrictions, where $m$ and $n$ are numbers of users and products in the input preference database respectively. Nevertheless, a comparison shows that the quantum algorithm has better privacy preserving potential than the classical one.

Paperid: 2565, https://arxiv.org/pdf/2502.04536.pdf

Abstract:
Decompilers are important tools for reverse engineers that help them analyze software at a higher level of abstraction than assembly code. Unfortunately, because compilation is lossy, deterministic decompilers produce code that is missing many of the details that make source code readable in the first place, like variable names and types. Neural decompilers, on the other hand, offer the ability to statistically fill in these details. Existing work in neural decompilation, however, suffers from substantial limitations that preclude its use on real code, such as the inability to define composite types, which is essential to fully specify function semantics. In this work, we introduce a new dataset, Realtype, that includes substantially more complicated and realistic types than existing neural decompilation benchmarks, and Idioms, a new neural decompilation approach to finetune any LLM into a neural decompiler capable of generating the appropriate user-defined type definitions alongside the decompiled code. We show that our approach yields state-of-the-art results in neural decompilation. On the most challenging existing benchmark, ExeBench, our model achieves 54.4% accuracy vs. 46.3% for LLM4Decompile and 37.5% for Nova; on Realtype, our model performs at least 95% better.

Paperid: 2566, https://arxiv.org/pdf/2502.04360.pdf

Abstract:
Retrieval-Augmented Generation (RAG) offers a solution to mitigate hallucinations in Large Language Models (LLMs) by grounding their outputs to knowledge retrieved from external sources. The use of private resources and data in constructing these external data stores can expose them to risks of extraction attacks, in which attackers attempt to steal data from these private databases. Existing RAG extraction attacks often rely on manually crafted prompts, which limit their effectiveness. In this paper, we introduce a framework called MARAGE for optimizing an adversarial string that, when appended to user queries submitted to a target RAG system, causes outputs containing the retrieved RAG data verbatim. MARAGE leverages a continuous optimization scheme that integrates gradients from multiple models with different architectures simultaneously to enhance the transferability of the optimized string to unseen models. Additionally, we propose a strategy that emphasizes the initial tokens in the target RAG data, further improving the attack's generalizability. Evaluations show that MARAGE consistently outperforms both manual and optimization-based baselines across multiple LLMs and RAG datasets, while maintaining robust transferability to previously unseen models. Moreover, we conduct probing tasks to shed light on the reasons why MARAGE is more effective compared to the baselines and to analyze the impact of our approach on the model's internal state.

Paperid: 2567, https://arxiv.org/pdf/2502.03668.pdf

Abstract:
Despite the generative model's groundbreaking success, the need to study its implications for privacy and utility becomes more urgent. Although many studies have demonstrated the privacy threats brought by GANs, no existing survey has systematically categorized the privacy and utility perspectives of GANs and VAEs. In this article, we comprehensively study privacy-preserving generative models, articulating the novel taxonomies for both privacy and utility metrics by analyzing 100 research publications. Finally, we discuss the current challenges and future research directions that help new researchers gain insight into the underlying concepts.

Paperid: 2568, https://arxiv.org/pdf/2502.02709.pdf

Abstract:
The technical literature about data privacy largely consists of two complementary approaches: formal definitions of conditions sufficient for privacy preservation and attacks that demonstrate privacy breaches. Differential privacy is an accepted standard in the former sphere. However, differential privacy's powerful adversarial model and worst-case guarantees may make it too stringent in some situations, especially when achieving it comes at a significant cost to data utility. Meanwhile, privacy attacks aim to expose real and worrying privacy risks associated with existing data release processes but often face criticism for being unrealistic. Moreover, the literature on attacks generally does not identify what properties are necessary to defend against them. We address the gap between these approaches by introducing demographic coherence, a condition inspired by privacy attacks that we argue is necessary for data privacy. This condition captures privacy violations arising from inferences about individuals that are incoherent with respect to the demographic patterns in the data. Our framework focuses on confidence rated predictors, which can in turn be distilled from almost any data-informed process. Thus, we capture privacy threats that exist even when no attack is explicitly being carried out. Our framework not only provides a condition with respect to which data release algorithms can be analysed but suggests natural experimental evaluation methodologies that could be used to build practical intuition and make tangible assessment of risks. Finally, we argue that demographic coherence is weaker than differential privacy: we prove that every differentially private data release is also demographically coherent, and that there are demographically coherent algorithms which are not differentially private.

Paperid: 2569, https://arxiv.org/pdf/2502.02335.pdf

Abstract:
Backdoor Malware are installed by an attacker on the victim's server(s) for authorized access. A customized backdoor is weaponized to execute unauthorized system, database and application commands to access the user credentials and confidential digital assets. Recently, we discovered and analyzed a targeted persistent module backdoor in Web Server in an online business company that was undetectable by their deployed Anti-Virus software for a year. This led us to carry out research to detect this specific type of persistent module backdoor installed in Web servers. Other than typical Malware static analysis, we carry out analysis with binary similarity, strings, and command obfuscation over the backdoor, resulting in the Target Attack Backdoor Malware Analysis Matrix (TABMAX) for organizations to detect this sophisticated target attack backdoor instead of a general one which can be detected by Anti-Virus detectors. Our findings show that backdoor malware can be designed with different APIs, commands, strings, and query language on top of preferred libraries used by typical Malware.

Paperid: 2570, https://arxiv.org/pdf/2502.02230.pdf

Abstract:
One of the major goals of incident response is to help an organization or a system owner to quickly identify and halt the attacks to minimize the damages (and financial loss) to the system being attacked. Typical incident responses rely very much on the log information captured by the system during the attacks and if needed, may need to isolate the victim from the network to avoid further destructive attacks. However, there are real cases that there are insufficient log records/information for the incident response team to identify the attacks and their origins while the attacked system cannot be stopped due to service requirements (zero downtime online systems) such as online gaming sites. Typical incident response procedures and industrial standards do not provide an adequate solution to address this scenario. In this paper, being motivated by a real case, we propose a solution, called "Attack-Driven Incident Response and Defense System (ADIRDS)" to tackle this problem. ADIRDS is an online monitoring system to run with the real system. By modeling the real system as a graph, critical nodes/assets of the system are closely monitored. Instead of relying on the original logging system, evidence will be collected from the attack technique perspectives. To migrate the risks, realistic honeypots with very similar business context as the real system are deployed to trap the attackers. We successfully apply this system to a real case. Based on our experiments, we verify that our new approach of designing the realistic honeypots is effective, 38 unique attacker's IP addresses were captured. We also compare the performance of our realistic honey with both low and high interactive honeypots proposed in the literature, the results found that our proposed honeypot can successfully cheat the attackers to attack our honeypot, which verifies that our honeypot is more effective.

Paperid: 2571, https://arxiv.org/pdf/2502.01221.pdf

Abstract:
Ransomware impact different organizations for years, it causes huge monetary, reputation loss and operation impact. Other than typical data encryption by ransomware, attackers can request ransom from the victim organizations via data extortion, otherwise, attackers will publish stolen data publicly in their ransomware dashboard forum and data-sharing platforms. However, there is no clear and proven published incident response strategy to satisfy different business priorities and objectives under ransomware attack in detail. In this paper, we quote one of our representative front-line ransomware incident response experiences for Company X. Organization and incident responder can reference our established model strategy and implement proactive threat intelligence-based incident response architecture if one is under ransomware attack, which helps to respond the incident more effectively and speedy.

Paperid: 2572, https://arxiv.org/pdf/2502.01110.pdf

Abstract:
The nonlinear filter model is an old and well understood approach to the design of secure stream ciphers. Extensive research over several decades has shown how to attack stream ciphers based on this model and has identified the security properties required of the Boolean function used as the filtering function to resist such attacks. This led to the problem of constructing Boolean functions which provide adequate security \textit{and} at the same time are efficient to implement. Unfortunately, over the last two decades no good solutions to this problem appeared in the literature. The lack of good solutions has effectively led to nonlinear filter model becoming more or less obsolete. This is a big loss to the cryptographic design toolkit, since the great advantages of the nonlinear filter model are its simplicity, well understood security and the potential to provide low cost solutions for hardware oriented stream ciphers. In this paper, we revive the nonlinear filter model by constructing appropriate Boolean functions which provide required security and are also efficient to implement. We put forward concrete suggestions of stream ciphers which are $Îº$-bit secure against known types of attacks for $Îº=80,128,160,192,224$ and $256$. For the $80$-bit, $128$-bit, and the $256$-bit security levels, the circuits for the corresponding stream ciphers require about 1743.5, 2771.5, and 5607.5 NAND gates respectively. For the $80$-bit and the $128$-bit security levels, the gate count estimates compare quite well to the famous ciphers Trivium and Grain-128a respectively, while for the $256$-bit security level, we do not know of any other stream cipher design which has such a low gate count.

Paperid: 2573, https://arxiv.org/pdf/2502.00706.pdf

Abstract:
Large language models are increasingly customized through fine-tuning and other adaptations, creating challenges in enforcing licensing terms and managing downstream impacts. Tracking model origins is crucial both for protecting intellectual property and for identifying derived models when biases or vulnerabilities are discovered in foundation models. We address this challenge by developing a framework for testing model provenance: Whether one model is derived from another. Our approach is based on the key observation that real-world model derivations preserve significant similarities in model outputs that can be detected through statistical analysis. Using only black-box access to models, we employ multiple hypothesis testing to compare model similarities against a baseline established by unrelated models. On two comprehensive real-world benchmarks spanning models from 30M to 4B parameters and comprising over 600 models, our tester achieves 90-95% precision and 80-90% recall in identifying derived models. These results demonstrate the viability of systematic provenance verification in production environments even when only API access is available.

Paperid: 2574, https://arxiv.org/pdf/2501.19206.pdf

Abstract:
The recent rise in increasingly sophisticated cyber-attacks raises the need for robust and resilient autonomous cyber-defence (ACD) agents. Given the variety of cyber-attack tactics, techniques and procedures (TTPs) employed, learning approaches that can return generalisable policies are desirable. Meanwhile, the assurance of ACD agents remains an open challenge. We address both challenges via an empirical game-theoretic analysis of deep reinforcement learning (DRL) approaches for ACD using the principled double oracle (DO) algorithm. This algorithm relies on adversaries iteratively learning (approximate) best responses against each others' policies; a computationally expensive endeavour for autonomous cyber operations agents. In this work we introduce and evaluate a theoretically-sound, potential-based reward shaping approach to expedite this process. In addition, given the increasing number of open-source ACD-DRL approaches, we extend the DO formulation to allow for multiple response oracles (MRO), providing a framework for a holistic evaluation of ACD approaches.

Paperid: 2575, https://arxiv.org/pdf/2501.18874.pdf

Abstract:
A compromised system component can issue message sequences that are legal while also leading the overall system into unsafe states. Such stealthy attacks are challenging to characterize, because message interfaces in standard languages specify each individual message separately but do not specify safe sequences of messages. We present initial results from ongoing work applying refined multiparty session types as a mechanism for expressing and enforcing proper message usage to exclude unsafe sequences. We illustrate our approach by using refined multiparty session types to mitigate safety and security issues in the MAVLink protocol commonly used in UAVs.

Paperid: 2576, https://arxiv.org/pdf/2501.18663.pdf

Abstract:
Large language models (LLMs) have significantly facilitated human life, and prompt engineering has improved the efficiency of these models. However, recent years have witnessed a rise in prompt engineering-empowered attacks, leading to issues such as privacy leaks, increased latency, and system resource wastage. Though safety fine-tuning based methods with Reinforcement Learning from Human Feedback (RLHF) are proposed to align the LLMs, existing security mechanisms fail to cope with fickle prompt attacks, highlighting the necessity of performing security detection on prompts. In this paper, we jointly consider prompt security, service latency, and system resource optimization in Edge-Cloud LLM (EC-LLM) systems under various prompt attacks. To enhance prompt security, a vector-database-enabled lightweight attack detector is proposed. We formalize the problem of joint prompt detection, latency, and resource optimization into a multi-stage dynamic Bayesian game model. The equilibrium strategy is determined by predicting the number of malicious tasks and updating beliefs at each stage through Bayesian updates. The proposed scheme is evaluated on a real implemented EC-LLM system, and the results demonstrate that our approach offers enhanced security, reduces the service latency for benign users, and decreases system resource consumption compared to state-of-the-art algorithms.

Paperid: 2577, https://arxiv.org/pdf/2501.18279.pdf

Abstract:
In the context of blockchain systems, the importance of decentralization is undermined by the lack of a widely accepted methodology to measure it. To address this gap, we set out a systematization effort targeting the decentralization measurement workflow. To facilitate our systematization, we put forth a framework that categorizes all measurement techniques used in previous work based on the resource they target, the methods they use to extract resource allocation, and the functions they apply to produce the final measurements. We complement this framework with an empirical analysis designed to evaluate whether the various pre-processing steps and metrics used in prior work capture the same underlying concept of decentralization. Our analysis brings about a number of novel insights and observations. First, the seemingly innocuous choices performed during data extraction, such as the size of estimation windows or the application of thresholds that affect the resource distribution, have important repercussions when calculating the level of decentralization. Second, exploratory factor analysis suggests that in Proof-of-Work (PoW) blockchains, participation on the consensus layer is not correlated with decentralization, but rather captures a distinct signal, unlike in Proof-of-Stake (PoS) systems, where the different metrics align under a single factor. These findings challenge the long-held assumption within the blockchain community that higher participation drives higher decentralization. Finally, we combine the results of our empirical analysis with first-principles reasoning to derive practical recommendations for researchers that set out to measure blockchain decentralization, and we further systematize the existing literature in line with these recommendations.

Paperid: 2578, https://arxiv.org/pdf/2501.16763.pdf

Abstract:
With the rapid advancement of blockchain technology, a significant trend is the adoption of Directed Acyclic Graphs (DAGs) as an alternative to traditional chain-based architectures for organizing ledger records. Systems like IOTA, which are specially designed for the Internet of Things (IoT), leverage DAG-based architectures to achieve greater scalability by enabling multiple attachment points in the ledger for new transactions while allowing these transactions to be added to the network without incurring any fees. To determine these attachment points, many tip selection algorithms commonly employ specific strategies on the DAG ledger. Transaction prioritization is not considered in the IOTA network, which becomes especially important when network bandwidth is limited. In this paper, we propose an optimization framework designed to integrate a priority level for critical or high-priority IoT transactions within the IOTA network. We evaluate our system using fully based on the official IOTA GitHub repository, which employs the currently operational IOTA node software (Hornet version), as part of the Chrysalis update (1.5). The experimental results show that higher-priority transactions in the proposed algorithm reach final confirmation in less time compared to the original IOTA system.

Paperid: 2579, https://arxiv.org/pdf/2501.16681.pdf

Abstract:
In many blockchains, e.g., Ethereum, Binance Smart Chain (BSC), the primary representation used for wallet addresses is a hardly memorable 40-digit hexadecimal string. As a result, users often select addresses from their recent transaction history, which enables blockchain address poisoning. The adversary first generates lookalike addresses similar to one with which the victim has previously interacted, and then engages with the victim to ``poison'' their transaction history. The goal is to have the victim mistakenly send tokens to the lookalike address, as opposed to the intended recipient. Compared to contemporary studies, this paper provides four notable contributions. First, we develop a detection system and perform measurements over two years on both Ethereum and BSC. We identify 13~times more attack attempts than reported previously -- totaling 270M on-chain attacks targeting 17M victims. 6,633 incidents have caused at least 83.8M USD in losses, which makes blockchain address poisoning one of the largest cryptocurrency phishing schemes observed in the wild. Second, we analyze a few large attack entities using improved clustering techniques, and model attacker profitability and competition. Third, we reveal attack strategies -- targeted populations, success conditions (address similarity, timing), and cross-chain attacks. Fourth, we mathematically define and simulate the lookalike address generation process across various software- and hardware-based implementations, and identify a large-scale attacker group that appears to use GPUs. We also discuss defensive countermeasures.

Paperid: 2580, https://arxiv.org/pdf/2501.16680.pdf

Abstract:
We study the problem of differentially private (DP) mechanisms for representing sets of size $k$ from a large universe. Our first construction creates $(Îµ,Î´)$-DP representations with error probability of $1/(e^Îµ+ 1)$ using space at most $1.05 k Îµ\cdot \log(e)$ bits where the time to construct a representation is $O(k \log(1/Î´))$ while decoding time is $O(\log(1/Î´))$. We also present a second algorithm for pure $Îµ$-DP representations with the same error using space at most $k Îµ\cdot \log(e)$ bits, but requiring large decoding times. Our algorithms match our lower bounds on privacy-utility trade-offs (including constants but ignoring $Î´$ factors) and we also present a new space lower bound matching our constructions up to small constant factors. To obtain our results, we design a new approach embedding sets into random linear systems deviating from most prior approaches that inject noise into non-private solutions.

Paperid: 2581, https://arxiv.org/pdf/2501.15077.pdf

Abstract:
As a valuable digital resource, graph data is an important data asset, which has been widely utilized across various fields to optimize decision-making and enable smarter solutions. To manage data assets, blockchain is widely used to enable data sharing and trading, but it cannot supply complex analytical queries. vChain was proposed to achieve verifiable boolean queries over blockchain by designing an embedded authenticated data structure (ADS). However, for generating (non-)existence proofs, vChain suffers from expensive storage and computation costs in ADS construction, along with high communication and verification costs. In this paper, we propose a novel NetChain framework that enables efficient top-k queries over on-chain graph data with verifiability. Specifically, we design a novel authenticated two-layer index that supports (non-)existence proof generation in block-level and built-in verifiability for matched objects. To further alleviate the computation and verification overhead, an optimized variant NetChain+ is derived. The authenticity of our frameworks is validated through security analysis. Evaluations show that NetChain and NetChain+ outperform vChain, respectively achieving up to 85X and 31X improvements on ADS construction. Moreover, compared with vChain, NetChain+ reduces the communication and verification costs by 87% and 96% respectively.

Paperid: 2582, https://arxiv.org/pdf/2501.14555.pdf

Abstract:
Provenance graphs are useful and powerful tools for representing system-level activities in cybersecurity; however, existing approaches often struggle with complex queries and flexible reasoning. This paper presents a novel approach using Answer Set Programming (ASP) to model and analyze provenance graphs. We introduce an ASP-based representation that captures intricate relationships between system entities, including temporal and causal dependencies. Our model enables sophisticated analysis capabilities such as attack path tracing, data exfiltration detection, and anomaly identification. The declarative nature of ASP allows for concise expression of complex security patterns and policies, facilitating both real-time threat detection and forensic analysis. We demonstrate our approach's effectiveness through case studies showcasing its threat detection capabilities. Experimental results illustrate the model's ability to handle large-scale provenance graphs while providing expressive querying. The model's extensibility allows for incorporation of new system behaviors and security rules, adapting to evolving cyber threats. This work contributes a powerful, flexible, and explainable framework for reasoning about system behaviors and security incidents, advancing the development of effective threat detection and forensic investigation tools.

Paperid: 2583, https://arxiv.org/pdf/2501.14500.pdf

Abstract:
This paper presents a scalable, practical approach to quantifying information leaks in software; these errors are often overlooked and downplayed, but can seriously compromise security mechanisms such as address space layout randomisation (ASLR) and Pointer Authentication (PAC). We introduce approaches for three different metrics to estimate the size of information leaks, including a new derivation for the calculation of conditional mutual information. Together, these metrics can inform of the relative safety of the target program against different threat models and provide useful details for finding the source of any leaks. We provide an implementation of a fuzzer, NIFuzz, which is capable of dynamically computing these metrics with little overhead and has several strategies to optimise for the detection and quantification of information leaks. We evaluate NIFuzz on a set of 14 programs -- including 8 real-world CVEs and ranging up to 278k lines of code in size -- where we find that it is capable of detecting and providing good estimates for all of the known information leaks.

Paperid: 2584, https://arxiv.org/pdf/2501.13733.pdf

Abstract:
The Stealth Address Protocol (SAP) allows users to receive assets through stealth addresses that are unlinkable to their stealth meta-addresses. The most widely used SAP, Dual-Key SAP (DKSAP), and the most performant SAP, Elliptic Curve Pairing Dual-Key SAP (ECPDKSAP), are based on elliptic curve cryptography, which is vulnerable to quantum attacks. These protocols depend on the elliptic curve discrete logarithm problem, which could be efficiently solved on a sufficiently powerful quantum computer using the Shor algorithm. In this paper three novel post-quantum SAPs based on lattice-based cryptography are presented: LWE SAP, Ring-LWE SAP and Module-LWE SAP. These protocols leverage Learning With Errors (LWE) problem to ensure quantum-resistant privacy. Among them, Module-LWE SAP, which is based on the Kyber key encapsulation mechanism, achieves the best performance and outperforms ECPDKSAP by approximately 66.8% in the scan time of the ephemeral public key registry.

Paperid: 2585, https://arxiv.org/pdf/2501.13540.pdf

Abstract:
We present a novel yet simple and comprehensive DNS cache POisoning Prevention System (POPS), designed to integrate as a module in Intrusion Prevention Systems (IPS). POPS addresses statistical DNS poisoning attacks, including those documented from 2002 to the present, and offers robust protection against similar future threats. It consists of two main components: a detection module that employs three simple rules, and a mitigation module that leverages the TC flag in the DNS header to enhance security. Once activated, the mitigation module has zero false positives or negatives, correcting any such errors on the side of the detection module. We first analyze POPS against historical DNS services and attacks, showing that it would have mitigated all network-based statistical poisoning attacks, yielding a success rate of only 0.0076% for the adversary. We then simulate POPS on traffic benchmarks (PCAPs) incorporating current potential network-based statistical poisoning attacks, and benign PCAPs; the simulated attacks still succeed with a probability of 0.0076%. This occurs because five malicious packets go through before POPS detects the attack and activates the mitigation module. In addition, POPS completes its task using only 20%-50% of the time required by other tools (e.g., Suricata or Snort), and after examining just 5%-10% as many packets. Furthermore, it successfully identifies DNS cache poisoning attacks-such as fragmentation attacks-that both Suricata and Snort fail to detect, underscoring its superiority in providing comprehensive DNS protection.

Paperid: 2586, https://arxiv.org/pdf/2501.13213.pdf

Abstract:
Flying Ad Hoc Networks (FANETs), which primarily interconnect Unmanned Aerial Vehicles (UAVs), present distinctive security challenges due to their distributed and dynamic characteristics, necessitating tailored security solutions. Intrusion detection in FANETs is particularly challenging due to communication costs, and privacy concerns. While Federated Learning (FL) holds promise for intrusion detection in FANETs with its cooperative and decentralized model training, it also faces drawbacks such as large data requirements, power consumption, and time constraints. Moreover, the high speeds of nodes in dynamic networks like FANETs may disrupt communication among Intrusion Detection Systems (IDS). In response, our study explores the use of few-shot learning (FSL) to effectively reduce the data required for intrusion detection in FANETs. The proposed approach called Few-shot Federated Learning-based IDS (FSFL-IDS) merges FL and FSL to tackle intrusion detection challenges such as privacy, power constraints, communication costs, and lossy links, demonstrating its effectiveness in identifying routing attacks in dynamic FANETs.This approach reduces both the local models and the global model's training time and sample size, offering insights into reduced computation and communication costs and extended battery life. Furthermore, by employing FSL, which requires less data for training, IDS could be less affected by lossy links in FANETs.

Paperid: 2587, https://arxiv.org/pdf/2501.11618.pdf

Abstract:
To address the critical need for secure IoT networks, this study presents a scalable and lightweight curriculum learning framework enhanced with Explainable AI (XAI) techniques, including LIME, to ensure transparency and adaptability. The proposed model employs novel neural network architecture utilized at every stage of Curriculum Learning to efficiently capture and focus on both short- and long-term temporal dependencies, improve learning stability, and enhance accuracy while remaining lightweight and robust against noise in sequential IoT data. Robustness is achieved through staged learning, where the model iteratively refines itself by removing low-relevance features and optimizing performance. The workflow includes edge-optimized quantization and pruning to ensure portability that could easily be deployed in the edge-IoT devices. An ensemble model incorporating Random Forest, XGBoost, and the staged learning base further enhances generalization. Experimental results demonstrate 98% accuracy on CIC-IoV-2024 and CIC-APT-IIoT-2024 datasets and 97% on EDGE-IIoT, establishing this framework as a robust, transparent, and high-performance solution for IoT network security.

Paperid: 2588, https://arxiv.org/pdf/2501.09798.pdf

Abstract:
We surface a new threat to closed-weight Large Language Models (LLMs) that enables an attacker to compute optimization-based prompt injections. Specifically, we characterize how an attacker can leverage the loss-like information returned from the remote fine-tuning interface to guide the search for adversarial prompts. The fine-tuning interface is hosted by an LLM vendor and allows developers to fine-tune LLMs for their tasks, thus providing utility, but also exposes enough information for an attacker to compute adversarial prompts. Through an experimental analysis, we characterize the loss-like values returned by the Gemini fine-tuning API and demonstrate that they provide a useful signal for discrete optimization of adversarial prompts using a greedy search algorithm. Using the PurpleLlama prompt injection benchmark, we demonstrate attack success rates between 65% and 82% on Google's Gemini family of LLMs. These attacks exploit the classic utility-security tradeoff - the fine-tuning interface provides a useful feature for developers but also exposes the LLMs to powerful attacks.

Paperid: 2589, https://arxiv.org/pdf/2501.09380.pdf

Abstract:
We propose a method for constructing 9-variable cryptographic Boolean functions from the iterates of 5-variable cellular automata rules. We then analyze, for important cryptographic properties of 5-variable cellular automata rules, how they are preserved after extension to 9-variable Boolean functions. For each cryptographic property, we analyze the proportion of 5-variable cellular automata rules that preserve it for each of the 48 affine equivalence classes.

Paperid: 2590, https://arxiv.org/pdf/2501.07927.pdf

Abstract:
Current evaluations of defenses against prompt attacks in large language model (LLM) applications often overlook two critical factors: the dynamic nature of adversarial behavior and the usability penalties imposed on legitimate users by restrictive defenses. We propose D-SEC (Dynamic Security Utility Threat Model), which explicitly separates attackers from legitimate users, models multi-step interactions, and expresses the security-utility in an optimizable form. We further address the shortcomings in existing evaluations by introducing Gandalf, a crowd-sourced, gamified red-teaming platform designed to generate realistic, adaptive attack. Using Gandalf, we collect and release a dataset of 279k prompt attacks. Complemented by benign user data, our analysis reveals the interplay between security and utility, showing that defenses integrated in the LLM (e.g., system prompts) can degrade usability even without blocking requests. We demonstrate that restricted application domains, defense-in-depth, and adaptive defenses are effective strategies for building secure and useful LLM applications.

Paperid: 2591, https://arxiv.org/pdf/2501.07801.pdf

Abstract:
New research focuses on creating artificial intelligence (AI) solutions for network intrusion detection systems (NIDS), drawing its inspiration from the ever-growing number of intrusions on networked systems, increasing its complexity and intelligibility. Hence, the use of explainable AI (XAI) techniques in real-world intrusion detection systems comes from the requirement to comprehend and elucidate black-box AI models to security analysts. In an effort to meet such requirements, this paper focuses on applying and evaluating White-Box XAI techniques (particularly LRP, IG, and DeepLift) for NIDS via an end-to-end framework for neural network models, using three widely used network intrusion datasets (NSL-KDD, CICIDS-2017, and RoEduNet-SIMARGL2021), assessing its global and local scopes, and examining six distinct assessment measures (descriptive accuracy, sparsity, stability, robustness, efficiency, and completeness). We also compare the performance of white-box XAI methods with black-box XAI methods. The results show that using White-box XAI techniques scores high in robustness and completeness, which are crucial metrics for IDS. Moreover, the source codes for the programs developed for our XAI evaluation framework are available to be improved and used by the research community.

Paperid: 2592, https://arxiv.org/pdf/2501.07380.pdf

Abstract:
With passkeys, the FIDO Alliance introduces the ability to sync FIDO2 credentials across a user's devices through passkey providers. This aims to mitigate user concerns about losing their devices and promotes the shift toward password-less authentication. As a consequence, many major online services have adopted passkeys. However, credential syncing has also created a debate among experts about their security guarantees. In this paper, we categorize the different access levels of passkeys to show how syncing credentials impacts their security and availability. Moreover, we use the established framework from Bonneau et al.'s Quest to Replace Passwords and apply it to different types of device-bound and synced passkeys. By this, we reveal relevant differences, particularly in their usability and security, and show that the security of synced passkeys is mainly concentrated in the passkey provider. We further provide practical recommendations for end users, passkey providers, and relying parties.

Paperid: 2593, https://arxiv.org/pdf/2501.06729.pdf

Abstract:
Federated Learning (FL) enables multiple users to collaboratively train a global model in a distributed manner without revealing their personal data. However, FL remains vulnerable to model poisoning attacks, where malicious actors inject crafted updates to compromise the global model's accuracy. We propose a novel defense mechanism, Kernel-based Trust Segmentation (KeTS), to counter model poisoning attacks. Unlike existing approaches, KeTS analyzes the evolution of each client's updates and effectively segments malicious clients using Kernel Density Estimation (KDE), even in the presence of benign outliers. We thoroughly evaluate KeTS's performance against the six most effective model poisoning attacks (i.e., Trim-Attack, Krum-Attack, Min-Max attack, Min-Sum attack, and their variants) on four different datasets (i.e., MNIST, Fashion-MNIST, CIFAR-10, and KDD-CUP-1999) and compare its performance with three classical robust schemes (i.e., Krum, Trim-Mean, and Median) and a state-of-the-art defense (i.e., FLTrust). Our results show that KeTS outperforms the existing defenses in every attack setting; beating the best-performing defense by an overall average of >24% (on MNIST), >14% (on Fashion-MNIST), >9% (on CIFAR-10), >11% (on KDD-CUP-1999). A series of further experiments (varying poisoning approaches, attacker population, etc.) reveal the consistent and superior performance of KeTS under diverse conditions. KeTS is a practical solution as it satisfies all three defense objectives (i.e., fidelity, robustness, and efficiency) without imposing additional overhead on the clients. Finally, we also discuss a simple, yet effective extension to KeTS to handle consistent-untargeted (e.g., sign-flipping) attacks as well as targeted attacks (e.g., label-flipping).

Paperid: 2594, https://arxiv.org/pdf/2501.05923.pdf

Abstract:
Battery energy storage systems are an important part of modern power systems as a solution to maintain grid balance. However, such systems are often remotely managed using cloud-based control systems. This exposes them to cyberattacks that could result in catastrophic consequences for the electrical grid and the connected infrastructure. This paper takes a step towards advancing understanding of these systems and investigates the effects of cyberattacks targeting them. We propose a reference model for an electrical grid cloud-controlled load-balancing system connected to remote battery energy storage systems. The reference model is evaluated from a cybersecurity perspective by implementing and simulating various cyberattacks. The results reveal the system's attack surface and demonstrate the impact of cyberattacks that can criticaly threaten the security and stability of the electrical grid.

Paperid: 2595, https://arxiv.org/pdf/2501.04900.pdf

Abstract:
In the digital era, managing posthumous data presents a growing challenge, with current technical solutions often falling short in practicality. Existing tools are typically closed-source, lack transparency, fail to offer cross-platform support, and provide limited access control. This paper introduces `Beyond Life', a cross-platform digital will management solution designed to securely handle and distribute digital assets after death. At the core of this solution is a customized Ciphertext-Policy Attribute-Based Encryption (CP-ABE) scheme, referred to as PD-CP-ABE, which enables efficient, fine-grained control over access to will content at scale. Unlike existing systems, Beyond Life operates independently of service providers, offering users greater transparency and control over how their will is generated, stored, and executed. The system is also designed to be portable, allowing users to change their will service provider. The proposed system has been fully developed and rigorously evaluated to ensure performance and real-world feasibility. The system implementation is made publicly available.

Paperid: 2596, https://arxiv.org/pdf/2501.04811.pdf

Abstract:
Neural decompilers are machine learning models that reconstruct the source code from an executable program. Critical to the lifecycle of any machine learning model is an evaluation of its effectiveness. However, existing techniques for evaluating neural decompilation models have substantial weaknesses, especially when it comes to showing the correctness of the neural decompiler's predictions. To address this, we introduce codealign, a novel instruction-level code equivalence technique designed for neural decompilers. We provide a formal definition of a relation between equivalent instructions, which we term an equivalence alignment. We show how codealign generates equivalence alignments, then evaluate codealign by comparing it with symbolic execution. Finally, we show how the information codealign provides-which parts of the functions are equivalent and how well the variable names match-is substantially more detailed than existing state-of-the-art evaluation metrics, which report unitless numbers measuring similarity.

Paperid: 2597, https://arxiv.org/pdf/2501.04571.pdf

Abstract:
Notarization is a procedure that enhance data management by ensuring the authentication of data during audits, thereby increasing trust in the audited data. Blockchain is frequently used as a secure, immutable, and transparent storage, contributing to make data notarization procedures more effective and trustable. Several blockchain-based data notarization protocols have been proposed in literature and commercial solutions. However, these implementations, whether on public or private blockchains, face inherent challenges: high fees on public blockchains and trust issues on private platforms, limiting the adoption of blockchains for data notarization or forcing several trade-offs. In this paper, we explore the use of hybrid blockchain architectures for data notarization, with a focus on scalability issues. Through the analysis of a real-world use case, the data notarization of product passports in supply chains, we propose a novel approach utilizing a data structure designed to efficiently manage the trade-offs in terms of storage occupation and costs involved in notarizing a large collection of data.

Paperid: 2598, https://arxiv.org/pdf/2501.03306.pdf

Abstract:
Spiking Neural Networks (SNNs), which offer exceptional energy efficiency for inference, and Federated Learning (FL), which offers privacy-preserving distributed training, is a rising area of interest that highly beneficial towards Internet of Things (IoT) devices. Despite this, research that tackles Byzantine attacks and bandwidth limitation in FL-SNNs, both poses significant threats on model convergence and training times, still remains largely unexplored. Going beyond proposing a solution for both of these problems, in this work we highlight the dual benefits of FL-SNNs, against non-omniscient Byzantine adversaries (ones that restrict attackers access to local clients datasets), and greater communication efficiency, over FL-ANNs. Specifically, we discovered that a simple integration of Top-\k{appa} sparsification into the FL apparatus can help leverage the advantages of the SNN models in both greatly reducing bandwidth usage and significantly boosting the robustness of FL training against non-omniscient Byzantine adversaries. Most notably, we saw a massive improvement of roughly 40% accuracy gain in FL-SNNs training under the lethal MinMax attack

Paperid: 2599, https://arxiv.org/pdf/2501.02352.pdf

Abstract:
The increasing reliance on Global Navigation Satellite Systems (GNSS), particularly the Global Positioning System (GPS), underscores the urgent need to safeguard these technologies against malicious threats such as spoofing and jamming. As the backbone for positioning, navigation, and timing (PNT) across various applications including transportation, telecommunications, and emergency services GNSS is vulnerable to deliberate interference that poses significant risks. Spoofing attacks, which involve transmitting counterfeit GNSS signals to mislead receivers into calculating incorrect positions, can result in serious consequences, from navigational errors in civilian aviation to security breaches in military operations. Furthermore, the lack of inherent security measures within GNSS systems makes them attractive targets for adversaries. While GNSS/GPS jamming and spoofing systems consist of numerous components, the ability to distinguish authentic signals from malicious ones is essential for maintaining system integrity. Recent advancements in machine learning and deep learning provide promising avenues for enhancing detection and mitigation strategies against these threats. This paper addresses both spoofing and jamming by tackling real-world challenges through machine learning, deep learning, and computer vision techniques. Through extensive experiments on two real-world datasets related to spoofing and jamming detection using advanced algorithms, we achieved state of the art results. In the GNSS/GPS jamming detection task, we attained approximately 99% accuracy, improving performance by around 5% compared to previous studies. Additionally, we addressed a challenging tasks related to spoofing detection, yielding results that underscore the potential of machine learning and deep learning in this domain.

Paperid: 2600, https://arxiv.org/pdf/2501.02349.pdf

Abstract:
We present `Revelio', a real-world screen-camera communication system leveraging temporal flicker fusion in the OKLAB color space. Using spatially-adaptive flickering and encoding information in pixel region shapes, Revelio achieves visually imperceptible data embedding while remaining robust against noise, asynchronicity, and distortions in screen-camera channels, ensuring reliable decoding by standard smartphone cameras. The decoder, driven by a two-stage neural network, uses a weighted differential accumulator for precise frame detection and symbol recognition. Initial experiments demonstrate Revelio's effectiveness in interactive television, offering an unobtrusive method for meta-information transmission.

Paperid: 2601, https://arxiv.org/pdf/2501.02292.pdf

Abstract:
Post-quantum cryptography is essential for securing digital communications against threats posed by quantum computers. Re-searchers have focused on developing algorithms that can withstand attacks from both classical and quantum computers, thereby ensuring the security of data transmissions over public networks. A critical component of this security is the key agreement protocol, which allows two parties to establish a shared secret key over an insecure channel. This paper introduces two novel post-quantum key agreement protocols that can be easily implemented on standard computers using rectangular or rank-deficient matrices, exploiting the generalizations of the matrix power function, which is a generator of NP-hard problems. We provide basic concepts and proofs, pseudocodes, examples, and a discussion of complexity.

Paperid: 2602, https://arxiv.org/pdf/2501.02229.pdf

Abstract:
As blockchain technology and smart contracts become widely adopted, securing them throughout every stage of the transaction process is essential. The concern of improved security for smart contracts is to find and detect vulnerabilities using classical Machine Learning (ML) models and fine-tuned Large Language Models (LLM). The robustness of such work rests on a labeled smart contract dataset that includes annotated vulnerabilities on which several LLMs alongside various traditional machine learning algorithms such as DistilBERT model is trained and tested. We train and test machine learning algorithms to classify smart contract codes according to vulnerability types in order to compare model performance. Having fine-tuned the LLMs specifically for smart contract code classification should help in getting better results when detecting several types of well-known vulnerabilities, such as Reentrancy, Integer Overflow, Timestamp Dependency and Dangerous Delegatecall. From our initial experimental results, it can be seen that our fine-tuned LLM surpasses the accuracy of any other model by achieving an accuracy of over 90%, and this advances the existing vulnerability detection benchmarks. Such performance provides a great deal of evidence for LLMs ability to describe the subtle patterns in the code that traditional ML models could miss. Thus, we compared each of the ML and LLM models to give a good overview of each models strengths, from which we can choose the most effective one for real-world applications in smart contract security. Our research combines machine learning and large language models to provide a rich and interpretable framework for detecting different smart contract vulnerabilities, which lays a foundation for a more secure blockchain ecosystem.

Paperid: 2603, https://arxiv.org/pdf/2501.01593.pdf

Abstract:
Recent studies have shown that cooperative multi-agent deep reinforcement learning (c-MADRL) is under the threat of backdoor attacks. Once a backdoor trigger is observed, it will perform malicious actions leading to failures or malicious goals. However, existing backdoor attacks suffer from several issues, e.g., instant trigger patterns lack stealthiness, the backdoor is trained or activated by an additional network, or all agents are backdoored. To this end, in this paper, we propose a novel backdoor leverage attack against c-MADRL, BLAST, which attacks the entire multi-agent team by embedding the backdoor only in a single agent. Firstly, we introduce adversary spatiotemporal behavior patterns as the backdoor trigger rather than manual-injected fixed visual patterns or instant status and control the period to perform malicious actions. This method can guarantee the stealthiness and practicality of BLAST. Secondly, we hack the original reward function of the backdoor agent via unilateral guidance to inject BLAST, so as to achieve the \textit{leverage attack effect} that can pry open the entire multi-agent system via a single backdoor agent. We evaluate our BLAST against 3 classic c-MADRL algorithms (VDN, QMIX, and MAPPO) in 2 popular c-MADRL environments (SMAC and Pursuit), and 2 existing defense mechanisms. The experimental results demonstrate that BLAST can achieve a high attack success rate while maintaining a low clean performance variance rate.

Paperid: 2604, https://arxiv.org/pdf/2501.01063.pdf

Abstract:
The FAPL-DM-BC solution is a new FL-based privacy, security, and scalability solution for the Internet of Vehicles (IoV). It leverages Federated Adaptive Privacy-Aware Learning (FAPL) and Dynamic Masking (DM) to learn and adaptively change privacy policies in response to changing data sensitivity and state in real-time, for the optimal privacy-utility tradeoff. Secure Logging and Verification, Blockchain-based provenance and decentralized validation, and Cloud Microservices Secure Aggregation using FedAvg (Federated Averaging) and Secure Multi-Party Computation (SMPC). Two-model feedback, driven by Model-Agnostic Explainable AI (XAI), certifies local predictions and explanations to drive it to the next level of efficiency. Combining local feedback with world knowledge through a weighted mean computation, FAPL-DM-BC assures federated learning that is secure, scalable, and interpretable. Self-driving cars, traffic management, and forecasting, vehicular network cybersecurity in real-time, and smart cities are a few possible applications of this integrated, privacy-safe, and high-performance IoV platform.

Paperid: 2605, https://arxiv.org/pdf/2501.00707.pdf

Abstract:
Adversarial examples' (AE) transferability refers to the phenomenon that AEs crafted with one surrogate model can also fool other models. Notwithstanding remarkable progress in untargeted transferability, its targeted counterpart remains challenging. This paper proposes an everywhere scheme to boost targeted transferability. Our idea is to attack a victim image both globally and locally. We aim to optimize 'an army of targets' in every local image region instead of the previous works that optimize a high-confidence target in the image. Specifically, we split a victim image into non-overlap blocks and jointly mount a targeted attack on each block. Such a strategy mitigates transfer failures caused by attention inconsistency between surrogate and victim models and thus results in stronger transferability. Our approach is method-agnostic, which means it can be easily combined with existing transferable attacks for even higher transferability. Extensive experiments on ImageNet demonstrate that the proposed approach universally improves the state-of-the-art targeted attacks by a clear margin, e.g., the transferability of the widely adopted Logit attack can be improved by 28.8%-300%.We also evaluate the crafted AEs on a real-world platform: Google Cloud Vision. Results further support the superiority of the proposed method.

Paperid: 2606, https://arxiv.org/pdf/2501.00363.pdf

Abstract:
Privacy computing receives increasing attention but writing privacy computing code remains challenging for developers due to limited library functions, necessitating function implementation from scratch, and data-oblivious requirement, contradicting intuitive thinking and usual practices of programmers. Automating the generation of privacy computing code with Large Language Models can streamline development effort and lower the barrier to using privacy computing frameworks. However, existing LLMs still encounter challenges in code translation for privacy-preserving computation, such as translating Python to MP-SPDZ, due to the scarcity of MP-SPDZ data required for effective pre-training or fine-tuning. Moreover, the lack of a benchmark further complicates the evaluation of translation quality. To address the limitations, this work proposes SPDZCoder, a rule-based framework that combines LLMs with expert knowledge for generating privacy-computing code without requiring additional training data. Specifically, SPDZCoder employ a rigorous procedure for collecting high-quality expert knowledge to represent the semantic-expressing differences between Python and MP-SPDZ, and to derive transformation rules for translating Python to MP-SPDZ based on these knowledge. Then, SPDZCoder progressively converts Python code into MP-SPDZ code using transformation rules in a three stage pipeline. To evaluate SPDZCoder, we manually constructed a benchmark dataset, SPDZEval, which comprises six data splits, each representing a distinct class of challenging tasks in MP-SPDZ implementation. Extensive experiments show that SPDZCoder achieves superior performance, significantly surpassing baselines in pass@1 and pass@2. Specifically, SPDZCoder attains an overall correctness of 85.94% and 92.01% in pass@1 and pass@2, respectively, whereas the best-performing baseline achieves 63.58% and 76.36%, respectively.

Paperid: 2607, https://arxiv.org/pdf/2501.00186.pdf

Abstract:
Rapid progress in the development of information technology has led to a significant increase in the number and complexity of cyber threats. Traditional methods of cybersecurity training based on theoretical knowledge do not provide a sufficient level of practical skills to effectively counter real threats. The article explores the possibilities of integrating simulation environments into the cybersecurity training process as an effective approach to improving the quality of training. The article presents the architecture of a simulation environment based on a cluster of KVM hypervisors, which allows creating scalable and flexible platforms at minimal cost. The article describes the implementation of various scenarios using open source software tools such as pfSense, OPNsense, Security Onion, Kali Linux, Parrot Security OS, Ubuntu Linux, Oracle Linux, FreeBSD, and others, which create realistic conditions for practical training.

Paperid: 2608, https://arxiv.org/pdf/2501.00175.pdf

Abstract:
Timestamps play a pivotal role in digital forensic event reconstruction, but due to their non-essential nature, tampering or manipulation of timestamps is possible by users in multiple ways, even on running systems. This has a significant effect on the reliability of the results from applying a timeline analysis as part of an investigation. In this paper, we investigate the problem of users tampering with timestamps on a running (``live'') system. While prior work has shown that digital evidence tampering is hard, we focus on the question of \emph{why} this is so. By performing a qualitative user study with advanced university students, we observe, for example, a commonly applied multi-step approach in order to deal with second-order traces (traces of traces). We also derive factors that influence the reliability of successful tampering, such as the individual knowledge about temporal traces, and technical restrictions to change them. These insights help to assess the reliability of timestamps from individual artifacts that are relied on for event reconstruction and subsequently reduce the risk of incorrect event reconstruction during investigations.

Paperid: 2609, https://arxiv.org/pdf/2501.00050.pdf

Abstract:
Network intrusion detection systems face significant challenges in identifying emerging attack patterns, especially when limited data samples are available. To address this, we propose a novel Multi-Space Prototypical Learning (MSPL) framework tailored for few-shot attack detection. The framework operates across multiple metric spaces-Euclidean, Cosine, Chebyshev, and Wasserstein distances-integrated through a constrained weighting scheme to enhance embedding robustness and improve pattern recognition. By leveraging Polyak-averaged prototype generation, the framework stabilizes the learning process and effectively adapts to rare and zero-day attacks. Additionally, an episodic training paradigm ensures balanced representation across diverse attack classes, enabling robust generalization. Experimental results on benchmark datasets demonstrate that MSPL outperforms traditional approaches in detecting low-profile and novel attack types, establishing it as a robust solution for zero-day attack detection.

Paperid: 2610, https://arxiv.org/pdf/2506.24072.pdf

Abstract:
We introduce logit-gap steering, a fast jailbreak framework that casts the refusal-affirmation gap of RLHF-aligned language models as a single pass over the vocabulary. A forward-computable score blends gap reduction with lightweight proxies for KL penalty and reward shift, allowing a "sort-sum-stop" sweep to complete in under a second and return a short suffix--two orders of magnitude fewer model calls than beam or gradient attacks. The same suffix generalises to unseen prompts and scales from 0.5 B to 70 B checkpoints, lifting one-shot attack success from baseline levels to 80-100% while preserving topical coherence. Beyond efficiency, these suffixes expose sentence-boundary reward cliffs and other alignment artefacts, offering a lightweight probe into how safety tuning reshapes internal representations.

Paperid: 2612, https://arxiv.org/pdf/2506.23866.pdf

Abstract:
In this paper, we explore the intersection of privacy, security, and environmental sustainability in cloud-based office solutions, focusing on quantifying user- and network-side energy use and associated carbon emissions. We hypothesise that privacy-focused services are typically more energy-efficient than those funded through data collection and advertising. To evaluate this, we propose a framework that systematically measures environmental costs based on energy usage and network data traffic during well-defined, automated usage scenarios. To test our hypothesis, we first analyse how underlying architectures and business models, such as monetisation through personalised advertising, contribute to the environmental footprint of these services. We then explore existing methodologies and tools for software environmental impact assessment. We apply our framework to three mainstream email services selected to reflect different privacy policies, from ad-supported tracking-intensive models to privacy-focused designs: Microsoft Outlook, Google Mail (Gmail), and Proton Mail. We extend this comparison to a self-hosted email solution, evaluated with and without end-to-end encryption. We show that the self-hosted solution, even with 14% of device energy and 15% of emissions overheads from PGP encryption, remains the most energy-efficient, saving up to 33% of emissions per session compared to Gmail. Among commercial providers, Proton Mail is the most efficient, saving up to 0.1 gCO2 e per session compared to Outlook, whose emissions can be further reduced by 2% through ad-blocking.

Paperid: 2613, https://arxiv.org/pdf/2506.23634.pdf

Abstract:
Mixed Boolean-Arithmetic (MBA) obfuscation protects intellectual property by converting programs into forms that are more complex to analyze. However, MBA has been increasingly exploited by malware developers to evade detection and cause significant real-world problems. Traditional MBA deobfuscation methods often consider these expressions as part of a black box and overlook their internal semantic information. To bridge this gap, we propose a truth table, which is an automatically constructed semantic representation of an expression's behavior that does not rely on external resources. The truth table is a mathematical form that represents the output of expression for all possible combinations of input. We also propose a general and extensible guided MBA deobfuscation framework (gMBA) that modifies a Transformer-based neural encoder-decoder Seq2Seq architecture to incorporate this semantic guidance. Experimental results and in-depth analysis show that integrating expression semantics significantly improves performance and highlights the importance of internal semantic expressions in recovering obfuscated code to its original form.

Paperid: 2614, https://arxiv.org/pdf/2506.22961.pdf

Abstract:
The MPC-in-the-head technique (Ishai et al., STOC 2007) is a celebrated method to build zero-knowledge protocols with desirable theoretical properties and high practical efficiency. This technique has generated a large body of research and has influenced the design of real-world post-quantum cryptographic signatures. In this work, we present a generalization of the MPC-in-the-head paradigm to the quantum setting, where the MPC is running a quantum computation. As an application of our framework, we propose a new approach to build zero-knowledge protocols where security holds even against a verifier that can obtain a superposition of transcripts. This notion was pioneered by Damgard et al., who built a zero-knowledge protocol for NP (in the common reference string model) secure against superposition attacks, by relying on perfectly hiding and unconditionally binding dual-mode commitments. Unfortunately, no such commitments are known from standard cryptographic assumptions. In this work we revisit this problem, and present two new three-round protocols in the common reference string model: (i) A zero-knowledge argument for NP, whose security reduces to the standard learning with errors (LWE) problem. (ii) A zero-knowledge argument for QMA from the same assumption.

Paperid: 2615, https://arxiv.org/pdf/2506.22938.pdf

Abstract:
With current advancement in hybermedia knowledges, the privacy of digital information has developed a critical problem. To overawed the susceptibilities of present security protocols, scholars tend to focus mainly on efforts on alternation of current protocols. Over past decade, various proposed encoding models have been shown insecurity, leading to main threats against significant data. Utilizing the suitable encryption model is very vital means of guard against various such, but algorithm is selected based on the dependency of data which need to be secured. Moreover, testing potentiality of the security assessment one by one to identify the best choice can take a vital time for processing. For faster and precisive identification of assessment algorithm, we suggest a security phase exposure model for cipher encryption technique by invoking Support Vector Machine (SVM). In this work, we form a dataset using usual security components like contrast, homogeneity. To overcome the uncertainty in analysing the security and lack of ability of processing data to a risk assessment mechanism. To overcome with such complications, this paper proposes an assessment model for security issues using fuzzy evidential reasoning (ER) approaches. Significantly, the model can be utilised to process and assemble risk assessment data on various aspects in systematic ways. To estimate the performance of our framework, we have various analyses like, recall, F1 score and accuracy.

Paperid: 2616, https://arxiv.org/pdf/2506.22639.pdf

Abstract:
This paper presents a large-scale analysis of fingerprinting-like behavior in the mobile application ecosystem. We take a market-based approach, focusing on third-party tracking as enabled by applications' common use of third-party SDKs. Our dataset consists of over 228,000 SDKs from popular Maven repositories, 178,000 Android applications collected from the Google Play store, and our static analysis pipeline detects exfiltration of over 500 individual signals. To the best of our knowledge, this represents the largest-scale analysis of SDK behavior undertaken to date. We find that Ads SDKs (the ostensible focus of industry efforts such as Apple's App Tracking Transparency and Google's Privacy Sandbox) appear to be the source of only 30.56% of the fingerprinting behaviors. A surprising 23.92% originate from SDKs whose purpose was unknown or unclear. Furthermore, Security and Authentication SDKs are linked to only 11.7% of likely fingerprinting instances. These results suggest that addressing fingerprinting solely in specific market-segment contexts like advertising may offer incomplete benefit. Enforcing anti-fingerprinting policies is also complex, as we observe a sparse distribution of signals and APIs used by likely fingerprinting SDKs. For instance, only 2% of exfiltrated APIs are used by more than 75% of SDKs, making it difficult to rely on user permissions to control fingerprinting behavior.

Paperid: 2617, https://arxiv.org/pdf/2506.22515.pdf

Abstract:
Traditional phishing detection often overlooks psychological manipulation. This study investigates using Large Language Model (LLM) In-Context Learning (ICL) for fine-grained classification of phishing emails based on a taxonomy of 40 manipulation techniques. Using few-shot examples with GPT-4o-mini on real-world French phishing emails (SignalSpam), we evaluated performance against a human-annotated test set (100 emails). The approach effectively identifies prevalent techniques (e.g., Baiting, Curiosity Appeal, Request For Minor Favor) with a promising accuracy of 0.76. This work demonstrates ICL's potential for nuanced phishing analysis and provides insights into attacker strategies.

Paperid: 2618, https://arxiv.org/pdf/2506.21972.pdf

Abstract:
The advancement of Pre-Trained Language Models (PTLMs) and Large Language Models (LLMs) has led to their widespread adoption across diverse applications. Despite their success, these models remain vulnerable to attacks that exploit their inherent weaknesses to bypass safety measures. Two primary inference-phase threats are token-level and prompt-level jailbreaks. Token-level attacks embed adversarial sequences that transfer well to black-box models like GPT but leave detectable patterns and rely on gradient-based token optimization, whereas prompt-level attacks use semantically structured inputs to elicit harmful responses yet depend on iterative feedback that can be unreliable. To address the complementary limitations of these methods, we propose two hybrid approaches that integrate token- and prompt-level techniques to enhance jailbreak effectiveness across diverse PTLMs. GCG + PAIR and the newly explored GCG + WordGame hybrids were evaluated across multiple Vicuna and Llama models. GCG + PAIR consistently raised attack-success rates over its constituent techniques on undefended models; for instance, on Llama-3, its Attack Success Rate (ASR) reached 91.6%, a substantial increase from PAIR's 58.4% baseline. Meanwhile, GCG + WordGame matched the raw performance of WordGame maintaining a high ASR of over 80% even under stricter evaluators like Mistral-Sorry-Bench. Crucially, both hybrids retained transferability and reliably pierced advanced defenses such as Gradient Cuff and JBShield, which fully blocked single-mode attacks. These findings expose previously unreported vulnerabilities in current safety stacks, highlight trade-offs between raw success and defensive robustness, and underscore the need for holistic safeguards against adaptive adversaries.

Paperid: 2619, https://arxiv.org/pdf/2506.20856.pdf

Abstract:
Memorization in large language models (LLMs) makes them vulnerable to data extraction attacks. While pre-training memorization has been extensively studied, fewer works have explored its impact in fine-tuning, particularly for LoRA fine-tuning, a widely adopted parameter-efficient method. In this work, we re-examine memorization in fine-tuning and uncover a surprising divergence from prior findings across different fine-tuning strategies. Factors such as model scale and data duplication, which strongly influence memorization in pre-training and full fine-tuning, do not follow the same trend in LoRA fine-tuning. Using a more relaxed similarity-based memorization metric, we demonstrate that LoRA significantly reduces memorization risks compared to full fine-tuning, while still maintaining strong task performance.

Paperid: 2620, https://arxiv.org/pdf/2506.20651.pdf

Abstract:
Recent work has shown that gradient updates in federated learning (FL) can unintentionally reveal sensitive information about a client's local data. This risk becomes significantly greater when a malicious server manipulates the global model to provoke information-rich updates from clients. In this paper, we adopt a defender's perspective to provide the first comprehensive analysis of malicious gradient leakage attacks and the model manipulation techniques that enable them. Our investigation reveals a core trade-off: these attacks cannot be both highly effective in reconstructing private data and sufficiently stealthy to evade detection -- especially in realistic FL settings that incorporate common normalization techniques and federated averaging. Building on this insight, we argue that malicious gradient leakage attacks, while theoretically concerning, are inherently limited in practice and often detectable through basic monitoring. As a complementary contribution, we propose a simple, lightweight, and broadly applicable client-side detection mechanism that flags suspicious model updates before local training begins, despite the fact that such detection may not be strictly necessary in realistic FL settings. This mechanism further underscores the feasibility of defending against these attacks with minimal overhead, offering a deployable safeguard for privacy-conscious federated learning systems.

Paperid: 2621, https://arxiv.org/pdf/2506.20228.pdf

Abstract:
Phishing attacks frequently use email body obfuscation to bypass detection filters, but quantitative insights into how techniques are combined and their impact on filter scores remain limited. This paper addresses this gap by empirically investigating the prevalence, co-occurrence patterns, and spam score associations of body obfuscation techniques. Analysing 386 verified phishing emails, we quantified ten techniques, identified significant pairwise co-occurrences revealing strategic layering like the presence of text in images with multipart abuse, and assessed associations with antispam scores using multilinear regression. Text in Image (47.0%), Base64 Encoding (31.2%), and Invalid HTML (28.8%) were highly prevalent. Regression (R${}^2$=0.486, p<0.001) linked Base64 Encoding and Text in Image with significant antispam evasion (p<0.05) in this configuration, suggesting potential bypass capabilities, while Invalid HTML correlated with higher scores. These findings establish a quantitative baseline for complex evasion strategies, underscoring the need for multi-modal defences against combined obfuscation tactics.

Paperid: 2622, https://arxiv.org/pdf/2506.20170.pdf

Abstract:
Deobfuscating JavaScript (JS) code poses a significant challenge in web security, particularly as obfuscation techniques are frequently used to conceal malicious activities within scripts. While Large Language Models (LLMs) have recently shown promise in automating the deobfuscation process, transforming detection and mitigation strategies against these obfuscated threats, a systematic benchmark to quantify their effectiveness and limitations has been notably absent. To address this gap, we present JsDeObsBench, a dedicated benchmark designed to rigorously evaluate the effectiveness of LLMs in the context of JS deobfuscation. We detail our benchmarking methodology, which includes a wide range of obfuscation techniques ranging from basic variable renaming to sophisticated structure transformations, providing a robust framework for assessing LLM performance in real-world scenarios. Our extensive experimental analysis investigates the proficiency of cutting-edge LLMs, e.g., GPT-4o, Mixtral, Llama, and DeepSeek-Coder, revealing superior performance in code simplification despite challenges in maintaining syntax accuracy and execution reliability compared to baseline methods. We further evaluate the deobfuscation of JS malware to exhibit the potential of LLMs in security scenarios. The findings highlight the utility of LLMs in deobfuscation applications and pinpoint crucial areas for further improvement.

Paperid: 2623, https://arxiv.org/pdf/2506.19220.pdf

Abstract:
We study model personalization under user-level differential privacy (DP) in the shared representation framework. In this problem, there are $n$ users whose data is statistically heterogeneous, and their optimal parameters share an unknown embedding $U^* \in\mathbb{R}^{d\times k}$ that maps the user parameters in $\mathbb{R}^d$ to low-dimensional representations in $\mathbb{R}^k$, where $k\ll d$. Our goal is to privately recover the shared embedding and the local low-dimensional representations with small excess risk in the federated setting. We propose a private, efficient federated learning algorithm to learn the shared embedding based on the FedRep algorithm in [CHM+21]. Unlike [CHM+21], our algorithm satisfies differential privacy, and our results hold for the case of noisy labels. In contrast to prior work on private model personalization [JRS+21], our utility guarantees hold under a larger class of users' distributions (sub-Gaussian instead of Gaussian distributions). Additionally, in natural parameter regimes, we improve the privacy error term in [JRS+21] by a factor of $\widetilde{O}(dk)$. Next, we consider the binary classification setting. We present an information-theoretic construction to privately learn the shared embedding and derive a margin-based accuracy guarantee that is independent of $d$. Our method utilizes the Johnson-Lindenstrauss transform to reduce the effective dimensions of the shared embedding and the users' data. This result shows that dimension-independent risk bounds are possible in this setting under a margin loss.

Paperid: 2624, https://arxiv.org/pdf/2506.18685.pdf

Abstract:
In this study, we conducted an in-depth examination of the utility analysis of the differentially private mechanism (DPM). The authors of DPM have already established the probability of a good split being selected and of DPM halting. In this study, we expanded the analysis of the stopping criterion and provided an interpretation of these guarantees in the context of realistic input distributions. Our findings revealed constraints on the minimum cluster size and the metric weight for the scoring function. Furthermore, we introduced an interpretation of the utility of DPM through the lens of the clustering metric, the silhouette score. Our findings indicate that even when an optimal DPM-based split is employed, the silhouette score of the resulting clustering may still decline. This observation calls into question the suitability of the silhouette score as a clustering metric. Finally, we examined the potential of the underlying concept of DPM by linking it to a more theoretical view, that of $(Î¾, Ï)$-separability. This extensive analysis of the theoretical guarantees of DPM allows a better understanding of its behaviour for arbitrary inputs. From these guarantees, we can analyse the impact of different hyperparameters and different input data sets, thereby promoting the application of DPM in practice for unknown settings and data sets.

Paperid: 2625, https://arxiv.org/pdf/2506.17795.pdf

Abstract:
Hardware security primitives including True Random Number Generators (TRNG) and Physical Unclonable Functions (PUFs) are central components to establishing a root of trust in microelectronic systems. In this paper, we propose a unified PUF-TRNG architecture that leverages a combination of the static entropy available in a strong PUF called the shift-register, reconvergent-fanout (SiRF) PUF, and the dynamic entropy associated with random noise present in path delay measurements. The SiRF PUF uses an engineered netlist containing a large number of paths as the source of static entropy, and a time-to-digital-converter (TDC) as a high-resolution, embedded instrument for measuring path delays, where measurement noise serves as the source of dynamic entropy. A novel data postprocessing algorithm is proposed based on a modified duplex sponge construction. The sponge function operates on soft data, i.e., fixed point data values, to add entropy to the ensuing random bit sequences and to increase the bit generation rate. A postprocessing algorithm for reproducing PUF-generated encryption keys is also used in the TRNG to protect against temperature voltage attacks designed to subvert the random characteristics in the bit sequences. The unified PUF-TRNG architecture is implemented across multiple instances of a ZYBO Z7-10 FPGA board and extensively tested with NIST SP 800-22, NIST SP 800-90B, AIS-31, and DieHarder test suites. Results indicate a stable and robust TRNG design with excellent min-entropy and a moderate data rate.

Paperid: 2626, https://arxiv.org/pdf/2506.17371.pdf

Abstract:
Multi-access Edge Computing (MEC), an enhancement of 5G, processes data closer to its generation point, reducing latency and network load. However, the distributed and edge-based nature of 5G-MEC presents privacy and security challenges, including data exposure risks. Ensuring efficient manipulation and security of sensitive data at the edge is crucial. To address these challenges, we investigate the usage of threshold secret sharing in 5G-MEC storage, an approach that enhances both security and dependability. A (k,n) threshold secret sharing scheme splits and stores sensitive data among n nodes, requiring at least k nodes for reconstruction. The solution ensures confidentiality by protecting data against fewer than k colluding nodes and enhances availability by tolerating up to n-k failing nodes. This approach mitigates threats such as unauthorized access and node failures, whether accidental or intentional. We further discuss a method for selecting the convenient MEHs to store the shares, considering the MEHs' trustworthiness level as a main criterion. Although we define our proposal in the context of secret-shared data storage, it can be seen as an independent, standalone selection process for 5G-MEC trustworthy node selection in other scenarios too.

Paperid: 2627, https://arxiv.org/pdf/2506.17350.pdf

Abstract:
Backdoor attacks have emerged as a critical security threat against deep neural networks in recent years. The majority of existing backdoor attacks focus on targeted backdoor attacks, where trigger is strongly associated to specific malicious behavior. Various backdoor detection methods depend on this inherent property and shows effective results in identifying and mitigating such targeted attacks. However, a purely untargeted attack in backdoor scenarios is, in some sense, self-weakening, since the target nature is what makes backdoor attacks so powerful. In light of this, we introduce a novel Constrained Untargeted Backdoor Attack (CUBA), which combines the flexibility of untargeted attacks with the intentionality of targeted attacks. The compromised model, when presented with backdoor images, will classify them into random classes within a constrained range of target classes selected by the attacker. This combination of randomness and determinedness enables the proposed untargeted backdoor attack to natively circumvent existing backdoor defense methods. To implement the untargeted backdoor attack under controlled flexibility, we propose to apply logit normalization on cross-entropy loss with flipped one-hot labels. By constraining the logit during training, the compromised model will show a uniform distribution across selected target classes, resulting in controlled untargeted attack. Extensive experiments demonstrate the effectiveness of the proposed CUBA on different datasets.

Paperid: 2628, https://arxiv.org/pdf/2506.17349.pdf

Abstract:
The exponential growth of android-based mobile IoT systems has significantly increased the susceptibility of devices to cyberattacks, particularly in smart homes, UAVs, and other connected mobile environments. This article presents a federated learning-based intrusion detection framework called AndroIDS that leverages system call traces as a personalized and privacy-preserving data source. Unlike conventional centralized approaches, the proposed method enables collaborative anomaly detection without sharing raw data, thus preserving user privacy across distributed nodes. A generalized system call dataset was generated to reflect realistic android system behavior and serves as the foundation for experimentation. Extensive evaluation demonstrates the effectiveness of the FL model under both IID and non-IID conditions, achieving an accuracy of 96.46 % and 92.87 %, and F1-scores of 89 % and 86 %, respectively. These results highlight the models robustness to data heterogeneity, with only a minor performance drop in the non-IID case. Further, a detailed comparison with centralized deep learning further illustrates trade-offs in detection performance and deployment feasibility. Overall, the results validate the practical applicability of the proposed approach for secure and scalable intrusion detection in real-world mobile IoT scenarios.

Paperid: 2629, https://arxiv.org/pdf/2506.17266.pdf

Abstract:
Generative Artificial Intelligence (GenAI) presents significant advancements but also introduces novel security challenges, particularly within agentic workflows where AI agents operate autonomously. These risks escalate in multi-agent systems due to increased interaction complexity. This paper outlines critical security vulnerabilities inherent in GenAI agentic workflows, including data privacy breaches, model manipulation, and issues related to agent autonomy and system integration. It discusses key mitigation strategies such as data encryption, access control, prompt engineering, model monitoring, agent sandboxing, and security audits. Furthermore, it details a proposed "GenAI Security Firewall" architecture designed to provide comprehensive, adaptable, and efficient protection for these systems by integrating various security services and leveraging GenAI itself for enhanced defense. Addressing these security concerns is paramount for the responsible and safe deployment of this transformative technology.

Paperid: 2630, https://arxiv.org/pdf/2506.16649.pdf

Abstract:
This paper presents a comprehensive approach to automated energy billing that leverages IoT-based smart meters, blockchain technology, and the Prophet time series forecasting model. The proposed system facilitates real-time power consumption monitoring via Wi-Fi-enabled ESP32 modules and a mobile application interface. It integrates Firebase and blockchain for secure, transparent billing processes and employs smart contracts for automated payments. The Prophet model is used for energy demand forecasting, with careful data preprocessing, transformation, and parameter tuning to improve prediction accuracy. This holistic solution aims to reduce manual errors, enhance user awareness, and promote sustainable energy use.

Paperid: 2631, https://arxiv.org/pdf/2506.16458.pdf

Abstract:
Federated Learning (FL) protects data privacy while providing a decentralized method for training models. However, because of the distributed schema, it is susceptible to adversarial clients that could alter results or sabotage model performance. This study presents SecureFed, a two-phase FL framework for identifying and reducing the impact of such attackers. Phase 1 involves collecting model updates from participating clients and applying a dimensionality reduction approach to identify outlier patterns frequently associated with malicious behavior. Temporary models constructed from the client updates are evaluated on synthetic datasets to compute validation losses and support anomaly scoring. The idea of learning zones is presented in Phase 2, where weights are dynamically routed according to their contribution scores and gradient magnitudes. High-value gradient zones are given greater weight in aggregation and contribute more significantly to the global model, while lower-value gradient zones, which may indicate possible adversarial activity, are gradually removed from training. Until the model converges and a strong defense against poisoning attacks is possible, this training cycle continues Based on the experimental findings, SecureFed considerably improves model resilience without compromising model performance.

Paperid: 2632, https://arxiv.org/pdf/2506.15888.pdf

Abstract:
Random number generators that utilize arrays of entropy source elements suffer from bias variation (BV). Despite the availability of efficient debiasing algorithms, optimized implementations of hardware friendly options depend on the bit bias in the raw bit streams and cannot accommodate a wide BV. In this work, we present a 64 x 64 array of perimeter gated single photon avalanche diodes (pgSPADs), fabricated in a 0.35 Î¼m standard CMOS technology, as a source of entropy to generate random binary strings with a BV compensation technique. By applying proper gate voltages based on the devices' native dark count rates, we demonstrate less than 1% BV for a raw-bit generation rate of 2 kHz/pixel at room temperature. The raw bits were debiased using the classical iterative Von Neumann's algorithm and the debiased bits were found to pass all of the 16 tests from NIST's Statistical Test Suite.

Paperid: 2633, https://arxiv.org/pdf/2506.15034.pdf

Abstract:
This paper presents a multithread and efficient cryptographic hardware access (MECHA) for efficient and fast cryptographic operations that eliminates the need for context switching. Utilizing a UNIX domain socket, MECHA manages multiple requests from multiple applications simultaneously, resulting in faster processing and improved efficiency. We comprise several key components, including the Server thread, Client thread, Transceiver thread, and a pair of Sender and Receiver queues. MECHA design is portable and can be used with any communication protocol, with experimental results demonstrating a 83% increase in the speed of concurrent cryptographic requests compared to conventional interface design. MECHA architecture has significant potential in the field of secure communication applications ranging from cloud computing to the IoT, offering a faster and more efficient solution for managing multiple cryptographic operation requests concurrently.

Paperid: 2634, https://arxiv.org/pdf/2506.14682.pdf

Abstract:
We introduce AIRTBench, an AI red teaming benchmark for evaluating language models' ability to autonomously discover and exploit Artificial Intelligence and Machine Learning (AI/ML) security vulnerabilities. The benchmark consists of 70 realistic black-box capture-the-flag (CTF) challenges from the Crucible challenge environment on the Dreadnode platform, requiring models to write python code to interact with and compromise AI systems. Claude-3.7-Sonnet emerged as the clear leader, solving 43 challenges (61% of the total suite, 46.9% overall success rate), with Gemini-2.5-Pro following at 39 challenges (56%, 34.3% overall), GPT-4.5-Preview at 34 challenges (49%, 36.9% overall), and DeepSeek R1 at 29 challenges (41%, 26.9% overall). Our evaluations show frontier models excel at prompt injection attacks (averaging 49% success rates) but struggle with system exploitation and model inversion challenges (below 26%, even for the best performers). Frontier models are far outpacing open-source alternatives, with the best truly open-source model (Llama-4-17B) solving 7 challenges (10%, 1.0% overall), though demonstrating specialized capabilities on certain hard challenges. Compared to human security researchers, large language models (LLMs) solve challenges with remarkable efficiency completing in minutes what typically takes humans hours or days-with efficiency advantages of over 5,000x on hard challenges. Our contribution fills a critical gap in the evaluation landscape, providing the first comprehensive benchmark specifically designed to measure and track progress in autonomous AI red teaming capabilities.

Paperid: 2635, https://arxiv.org/pdf/2506.14393.pdf

Abstract:
The distribution of consensus power is a cornerstone of decentralisation, influencing the security, resilience, and fairness of blockchain networks while ensuring equitable impact among participants. This study provides a rigorous evaluation of consensus power inequality across five prominent blockchain networks - Bitcoin, Ethereum, Cardano, Hedera, and Algorand - using data collected from January 2022 to July 2024. Leveraging established economic metrics, including the Gini coefficient and Theil index, the research quantitatively assesses how power is distributed among blockchain network participants. A robust dataset, capturing network-specific characteristics such as mining pools, staking patterns, and consensus nodes, forms the foundation of the analysis, enabling meaningful comparisons across diverse architectures. Through an in-depth comparative study, the paper identifies key disparities in consensus power distribution. Hedera and Bitcoin demonstrate more balanced power distribution, aligning closely with the principles of decentralisation. Ethereum and Cardano demonstrate moderate levels of inequality. However, contrary to expectations, Ethereum has become more concentrated following its transition to Proof-of-Stake. Meanwhile, Algorand shows a pronounced centralisation of power. Moreover, the findings highlight the structural and operational drivers of inequality, including economic barriers, governance models, and network effects, offering actionable insights for more equitable network design. This study establishes a methodological framework for evaluating blockchain consensus power inequality, emphasising the importance of targeted strategies to ensure fairer power distribution and enhancing the sustainability of decentralised systems. Future research will build on these findings by integrating additional metrics and examining the influence of emerging consensus mechanisms.

Paperid: 2636, https://arxiv.org/pdf/2506.13895.pdf

Abstract:
This paper presents a robust image encryption and key distribution framework that integrates an enhanced AES-128 algorithm with chaos theory and advanced steganographic techniques for dual-layer security. The encryption engine features a dynamic ShiftRows operation controlled by a logistic map, variable S-boxes generated from a two-dimensional Henon map for substitution and key expansion, and feedback chaining with post-encryption XOR diffusion to improve confusion, diffusion, and key sensitivity. To address secure key delivery, the scheme introduces dual-key distribution via steganographically modified QR codes. A static key and an AES-encrypted dynamic session key are embedded with a covert hint message using least significant bit (LSB) steganography. This design ensures the dynamic key can only be decrypted after reconstructing the static key from the hidden message, offering multi-factor protection against interception. Experimental results demonstrate the framework outperforms existing chaos-based and hybrid AES methods, achieving near-ideal entropy (7.997), minimal pixel correlation, and strong differential resistance with NPCR (>99.6%) and UACI (50.1%). Encrypted images show uniform histograms and robustness against noise and data loss. The framework offers a scalable, secure solution for sensitive image transmission in applications such as surveillance, medical imaging, and digital forensics, bridging the gap between cryptographic strength and safe key distribution.

Paperid: 2637, https://arxiv.org/pdf/2506.13726.pdf

Abstract:
The introduction of advanced reasoning capabilities have improved the problem-solving performance of large language models, particularly on math and coding benchmarks. However, it remains unclear whether these reasoning models are more or less vulnerable to adversarial prompt attacks than their non-reasoning counterparts. In this work, we present a systematic evaluation of weaknesses in advanced reasoning models compared to similar non-reasoning models across a diverse set of prompt-based attack categories. Using experimental data, we find that on average the reasoning-augmented models are \emph{slightly more robust} than non-reasoning models (42.51\% vs 45.53\% attack success rate, lower is better). However, this overall trend masks significant category-specific differences: for certain attack types the reasoning models are substantially \emph{more vulnerable} (e.g., up to 32 percentage points worse on a tree-of-attacks prompt), while for others they are markedly \emph{more robust} (e.g., 29.8 points better on cross-site scripting injection). Our findings highlight the nuanced security implications of advanced reasoning in language models and emphasize the importance of stress-testing safety across diverse adversarial techniques.

Paperid: 2638, https://arxiv.org/pdf/2506.13090.pdf

Abstract:
Software developers frequently hard-code credentials such as passwords, generic secrets, private keys, and generic tokens in software repositories, even though it is strictly advised against due to the severe threat to the security of the software. These credentials create attack surfaces exploitable by a potential adversary to conduct malicious exploits such as backdoor attacks. Recent detection efforts utilize embedding models to vectorize textual credentials before passing them to classifiers for predictions. However, these models struggle to discriminate between credentials with contextual and complex sequences resulting in high false positive predictions. Context-dependent Pre-trained Language Models (PLMs) or Large Language Models (LLMs) such as Generative Pre-trained Transformers (GPT) tackled this drawback by leveraging the transformer neural architecture capacity for self-attention to capture contextual dependencies between words in input sequences. As a result, GPT has achieved wide success in several natural language understanding endeavors. Hence, we assess LLMs to represent these observations and feed extracted embedding vectors to a deep learning classifier to detect hard-coded credentials. Our model outperforms the current state-of-the-art by 13% in F1 measure on the benchmark dataset. We have made all source code and data publicly available to facilitate the reproduction of all results presented in this paper.

Paperid: 2639, https://arxiv.org/pdf/2506.12900.pdf

Abstract:
The ability to perform repeated Byzantine agreement lies at the heart of important applications such as blockchain price oracles or replicated state machines. Any such protocol requires the following properties: (1) \textit{Byzantine fault-tolerance}, because not all participants can be assumed to be honest, (2) r\textit{ecurrent transient fault-tolerance}, because even honest participants may be subject to transient ``glitches'', (3) \textit{accuracy}, because the results of quantitative queries (such as price quotes) must lie within the interval of honest participants' inputs, and (4) \textit{self-stabilization}, because it is infeasible to reboot a distributed system following a fault. This paper presents the first protocol for repeated Byzantine agreement that satisfies the properties listed above. Specifically, starting in an arbitrary system configuration, our protocol establishes consistency. It preserves consistency in the face of up to $\lceil n/3 \rceil -1$ Byzantine participants {\em and} constant recurring (``noise'') transient faults, of up to $\lceil n/6 \rceil-1$ additional malicious transient faults, or even more than $\lceil n/6 \rceil-1$ (uniformly distributed) random transient faults, in each repeated Byzantine agreement.

Paperid: 2640, https://arxiv.org/pdf/2506.12883.pdf

Abstract:
Fully Homomorphic Encryption (FHE) is a promising privacy-preserving technology enabling secure computation over encrypted data. A major limitation of current FHE schemes is their high runtime overhead. As a result, automatic optimization of circuits describing FHE computation has garnered significant attention in the logic synthesis community. Existing works primarily target the multiplicative depth (MD) and multiplicative complexity (MC) of FHE circuits, corresponding to the total number of multiplications and maximum number of multiplications in a path from primary input to output, respectively. In many FHE schemes, these metrics are the primary contributors to the homomorphic evaluation runtime of a circuit. However, oftentimes they are opposed: reducing either depth or complexity may result in an increase in the other. To our knowledge, existing works have yet to optimize FHE circuits for overall runtime, only considering one metric at a time and thus making significant tradeoffs. In this paper, we use e-graphs to augment existing flows that individually optimize MC and MD, in a technique called cut tracing. We show how cut tracing can effectively combine two state-of-the-art MC and MD reduction flows and balance their weaknesses to minimize runtime. Our preliminary results demonstrate that cut tracing yields up to a 40% improvement in homomorphic evaluation runtime when applied to these two flows.

Paperid: 2641, https://arxiv.org/pdf/2506.12595.pdf

Abstract:
Given a sequence of $N$ independent sources $\mathbf{X}_1,\mathbf{X}_2,\dots,\mathbf{X}_N\sim\{0,1\}^n$, how many of them must be good (i.e., contain some min-entropy) in order to extract a uniformly random string? This question was first raised by Chattopadhyay, Goodman, Goyal and Li (STOC '20), motivated by applications in cryptography, distributed computing, and the unreliable nature of real-world sources of randomness. In their paper, they showed how to construct explicit low-error extractors for just $K \geq N^{1/2}$ good sources of polylogarithmic min-entropy. In a follow-up, Chattopadhyay and Goodman improved the number of good sources required to just $K \geq N^{0.01}$ (FOCS '21). In this paper, we finally achieve $K=3$. Our key ingredient is a near-optimal explicit construction of a new pseudorandom primitive, called a leakage-resilient extractor (LRE) against number-on-forehead (NOF) protocols. Our LRE can be viewed as a significantly more robust version of Li's low-error three-source extractor (FOCS '15), and resolves an open question put forth by Kumar, Meka, and Sahai (FOCS '19) and Chattopadhyay, Goodman, Goyal, Kumar, Li, Meka, and Zuckerman (FOCS '20). Our LRE construction is based on a simple new connection we discover between multiparty communication complexity and non-malleable extractors, which shows that such extractors exhibit strong average-case lower bounds against NOF protocols.

Paperid: 2642, https://arxiv.org/pdf/2506.12523.pdf

Abstract:
Proof-of-Attendance (PoA) mechanisms are typically employed to demonstrate a specific user's participation in an event, whether virtual or in-person. The goal of this study is to extend such mechanisms to broader contexts where the user wishes to digitally demonstrate her involvement in a specific activity (Proof-of-Engagement, PoE). This work explores different solutions, including DLTs as well as established technologies based on centralized systems. The main aspects we consider include the level of privacy guaranteed to users, the scope of PoA/PoE (both temporal and spatial), the transferability of the proof, and the integration with incentive mechanisms.

Paperid: 2643, https://arxiv.org/pdf/2506.12257.pdf

Abstract:
An Advanced Persistent Threat (APT) is a multistage, highly sophisticated, and covert form of cyber threat that gains unauthorized access to networks to either steal valuable data or disrupt the targeted network. These threats often remain undetected for extended periods, emphasizing the critical need for early detection in networks to mitigate potential APT consequences. In this work, we propose a feature selection method for developing a lightweight intrusion detection system capable of effectively identifying APTs at the initial compromise stage. Our approach leverages the XGBoost algorithm and Explainable Artificial Intelligence (XAI), specifically utilizing the SHAP (SHapley Additive exPlanations) method for identifying the most relevant features of the initial compromise stage. The results of our proposed method showed the ability to reduce the selected features of the SCVIC-APT-2021 dataset from 77 to just four while maintaining consistent evaluation metrics for the suggested system. The estimated metrics values are 97% precision, 100% recall, and a 98% F1 score. The proposed method not only aids in preventing successful APT consequences but also enhances understanding of APT behavior at early stages.

Paperid: 2645, https://arxiv.org/pdf/2506.11669.pdf

Abstract:
With the rapid development and extensive deployment of the fifth-generation wireless system (5G), it has achieved ubiquitous high-speed connectivity and improved overall communication performance. Additionally, as one of the promising technologies for integration beyond 5G, digital twin in cyberspace can interact with the core network, transmit essential information, and further enhance the wireless communication quality of the corresponding mobile device (MD). However, the utilization of millimeter-wave, terahertz band, and ultra-dense network technologies presents urgent challenges for MD in 5G and beyond, particularly in terms of frequent handover authentication with target base stations during faster mobility, which can cause connection interruption and incur malicious attacks. To address such challenges in 5G and beyond, in this paper, we propose a secure and efficient handover authentication scheme by utilizing digital twin. Acting as an intelligent intermediate, the authorized digital twin can handle computations and assist the corresponding MD in performing secure mutual authentication and key negotiation in advance before attaching the target base stations in both intra-domain and inter-domain scenarios. In addition, we provide the formal verification based on BAN logic, RoR model, and ProVerif, and informal analysis to demonstrate that the proposed scheme can offer diverse security functionality. Performance evaluation shows that the proposed scheme outperforms most related schemes in terms of signaling, computation, and communication overheads.

Paperid: 2646, https://arxiv.org/pdf/2506.10280.pdf

Abstract:
Software vulnerabilities in source code pose serious cybersecurity risks, prompting a shift from traditional detection methods (e.g., static analysis, rule-based matching) to AI-driven approaches. This study presents a systematic review of software vulnerability detection (SVD) research from 2018 to 2023, offering a comprehensive taxonomy of techniques, feature representations, and embedding methods. Our analysis reveals that 91% of studies use AI-based methods, with graph-based models being the most prevalent. We identify key limitations, including dataset quality, reproducibility, and interpretability, and highlight emerging opportunities in underexplored techniques such as federated learning and quantum neural networks, providing a roadmap for future research.

Paperid: 2647, https://arxiv.org/pdf/2506.10025.pdf

Abstract:
Key decision-makers in small and medium businesses (SMBs) often lack the awareness and knowledge to implement cybersecurity measures effectively. To gain a deeper understanding of how SMB executives navigate cybersecurity decision-making, we deployed a mixed-method approach, conducting semi-structured interviews (n=21) and online surveys (n=322) with SMB key decision-makers. Using thematic analysis, we revealed SMB decision-makers' perceived risks in terms of the digital assets they valued, and found reasons for their choice of defense measures and factors impacting security perception. We employed the situational awareness model to characterize decision-makers based on cybersecurity awareness, identifying those who have comparatively low awareness in the fight against adversaries. We further explored the relationship between awareness and business attributes, and constructed a holistic structural equation model to understand how awareness can be improved. Finally, we proposed interventions to help SMBs overcome potential challenges.

Paperid: 2648, https://arxiv.org/pdf/2506.09452.pdf

Abstract:
The high cost of ownership of AI compute infrastructure and challenges of robust serving of large language models (LLMs) has led to a surge in managed Model-as-a-service deployments. Even when enterprises choose on-premises deployments, the compute infrastructure is typically shared across many teams in order to maximize the return on investment. In both scenarios the deployed models operate only on plaintext data, and so enterprise data owners must allow their data to appear in plaintext on a shared or multi-tenant compute infrastructure. This results in data owners with private or sensitive data being hesitant or restricted in what data they use with these types of deployments. In this work we introduce the Stained Glass Transform, a learned, stochastic, and sequence dependent transformation of the word embeddings of an LLM which information theoretically provides privacy to the input of the LLM while preserving the utility of model. We theoretically connect a particular class of Stained Glass Transforms to the theory of mutual information of Gaussian Mixture Models. We then calculate a-postiori privacy estimates, based on mutual information, and verify the privacy and utility of instances of transformed embeddings through token level metrics of privacy and standard LLM performance benchmarks.

Paperid: 2649, https://arxiv.org/pdf/2506.08602.pdf

Abstract:
Graph Neural Networks (GNNs) are increasingly deployed in graph-related applications, making ownership verification critical to protect their intellectual property against model theft. Fingerprinting and black-box watermarking are two main methods. However, the former relies on determining model similarity, which is computationally expensive and prone to ownership collisions after model post-processing such as model pruning or fine-tuning. The latter embeds backdoors, exposing watermarked models to the risk of backdoor attacks. Moreover, both methods enable ownership verification but do not convey additional information. As a result, each distributed model requires a unique trigger graph, and all trigger graphs must be used to query the suspect model during verification. Multiple queries increase the financial cost and the risk of detection. To address these challenges, this paper proposes WGLE, a novel black-box watermarking paradigm for GNNs that enables embedding the multi-bit string as the ownership information without using backdoors. WGLE builds on a key insight we term Layer-wise Distance Difference on an Edge (LDDE), which quantifies the difference between the feature distance and the prediction distance of two connected nodes. By predefining positive or negative LDDE values for multiple selected edges, WGLE embeds the watermark encoding the intended information without introducing incorrect mappings that compromise the primary task. WGLE is evaluated on six public datasets and six mainstream GNN architectures along with state-of-the-art methods. The results show that WGLE achieves 100% ownership verification accuracy, an average fidelity degradation of 0.85%, comparable robustness against potential attacks, and low embedding overhead. The code is available in the repository.

Paperid: 2650, https://arxiv.org/pdf/2506.08383.pdf

Abstract:
With the rapid expansion of Internet of Things (IoT) networks, detecting malicious traffic in real-time has become a critical cybersecurity challenge. This research addresses the detection challenges by presenting a comprehensive empirical analysis of machine learning techniques for malware detection using the IoT-23 dataset provided by the Stratosphere Laboratory. We address the significant class imbalance within the dataset through three resampling strategies. We implement and compare a few machine learning techniques. Our findings demonstrate that the combination of appropriate imbalance treatment techniques with ensemble methods, particularly gcForest, achieves better detection performance compared to traditional approaches. This work contributes significantly to the development of more intelligent and efficient automated threat detection systems for IoT environments, helping to secure critical infrastructure against sophisticated cyber attacks while optimizing computational resource usage.

Paperid: 2651, https://arxiv.org/pdf/2506.08320.pdf

Abstract:
Generative AI technologies, particularly Large Language Models (LLMs), are rapidly being adopted across industry, academia, and government sectors, owing to their remarkable capabilities in natural language processing. However, despite their strengths, the inconsistency and unpredictability of LLM outputs present substantial challenges, especially in security-critical domains such as access control. One critical issue that emerges prominently is the consistency of LLM-generated responses, which is paramount for ensuring secure and reliable operations. In this paper, we study the application of LLMs within the context of Cybersecurity Access Control Systems. Specifically, we investigate the consistency and accuracy of LLM-generated password policies, translating natural language prompts into executable pwquality$.$conf configuration files. Our experimental methodology adopts two distinct approaches: firstly, we utilize pre-trained LLMs to generate configuration files purely from natural language prompts without additional guidance. Secondly, we provide these models with official pwquality$.$conf documentation to serve as an informative baseline. We systematically assess the soundness, accuracy, and consistency of these AI-generated configurations. Our findings underscore significant challenges in the current generation of LLMs and contribute valuable insights into refining the deployment of LLMs in Access Control Systems.

Paperid: 2652, https://arxiv.org/pdf/2506.07948.pdf

Abstract:
Natural Language Processing (NLP) models are used for text-related tasks such as classification and generation. To complete these tasks, input data is first tokenized from human-readable text into a format the model can understand, enabling it to make inferences and understand context. Text classification models can be implemented to guard against threats such as prompt injection attacks against Large Language Models (LLMs), toxic input and cybersecurity risks such as spam emails. In this paper, we introduce TokenBreak: a novel attack that can bypass these protection models by taking advantage of the tokenization strategy they use. This attack technique manipulates input text in such a way that certain models give an incorrect classification. Importantly, the end target (LLM or email recipient) can still understand and respond to the manipulated text and therefore be vulnerable to the very attack the protection model was put in place to prevent. The tokenizer is tied to model architecture, meaning it is possible to predict whether or not a model is vulnerable to attack based on family. We also present a defensive strategy as an added layer of protection that can be implemented without having to retrain the defensive model.

Paperid: 2653, https://arxiv.org/pdf/2506.07868.pdf

Abstract:
Recent works have started to theoretically investigate how we can protect differentially private programs against timing attacks, by making the joint distribution the output and the runtime differentially private (JOT-DP). However, the existing approaches to JOT-DP have some limitations, particularly in the setting of unbounded DP (which protects the size of the dataset and applies to arbitrarily large datasets). First, the known conversion of pure DP programs to pure JOT-DP programs in the unbounded setting (a) incurs a constant additive increase in error probability (and thus does not provide vanishing error as $n\to\infty$) (b) produces JOT-DP programs that fail to preserve the computational efficiency of the original pure DP program and (c) is analyzed in a toy computational model in which the runtime is defined to be the number of coin flips. In this work, we overcome these limitations. Specifically, we show that the error required for pure JOT-DP in the unbounded setting depends on the model of computation. In a randomized RAM model where the dataset size $n$ is given (or can be computed in constant time) and we can generate random numbers (not just random bits) in constant time, polynomially small error probability is necessary and sufficient. If $n$ is not given or we only have a random-bit generator, an (arbitrarily small) constant error probability is necessary and sufficient. The aforementioned positive results are proven by efficient procedures to convert any pure JOT-DP program $P$ in the upper-bounded setting to a pure JOT-DP program $P'$ in the unbounded setting, such that the output distribution of $P'$ is $Î³$-close in total variation distance to that of $P$, where $Î³$ is either an arbitrarily small constant or polynomially small, depending on the model of computation.

Paperid: 2654, https://arxiv.org/pdf/2506.07480.pdf

Abstract:
Advanced Persistent Threats (APTs) represent a sophisticated and persistent cy-bersecurity challenge, characterized by stealthy, multi-phase, and targeted attacks aimed at compromising information systems over an extended period. Develop-ing an effective Intrusion Detection System (IDS) capable of detecting APTs at different phases relies on selecting network traffic features. However, not all of these features are directly related to the phases of APTs. Some network traffic features may be unrelated or have limited relevance to identifying malicious ac-tivity. Therefore, it is important to carefully select and analyze the most relevant features to improve the IDS performance. This work proposes a feature selection and classification model that integrates two prominent machine learning algo-rithms: SHapley Additive exPlanations (SHAP) and Extreme Gradient Boosting (XGBoost). The aim is to develop lightweight IDS based on a selected minimum number of influential features for detecting APTs at various phases. The pro-posed method also specifies the relevant features for each phase of APTs inde-pendently. Extensive experimental results on the SCVIC-APT-2021 dataset indi-cated that our proposed approach has improved performance compared to other standard techniques. Specifically, both the macro-average F1-score and recall reached 94% and 93 %, respectively, while reducing the complexity of the detec-tion model by selecting only 12 features out of 77.

Paperid: 2655, https://arxiv.org/pdf/2506.07313.pdf

Abstract:
Large language models (LLMs) have seen widespread success in code generation tasks for different scenarios, both everyday and professional. However current LLMs, despite producing functional code, do not prioritize security and may generate code with exploitable vulnerabilities. In this work, we propose techniques for generating code that is more likely to be secure and introduce SCGAgent, a proactive secure coding agent that implements our techniques. We use security coding guidelines that articulate safe programming practices, combined with LLM-generated unit tests to preserve functional correctness. In our evaluation, we find that SCGAgent is able to preserve nearly 98% of the functionality of the base Sonnet-3.7 LLM while achieving an approximately 25% improvement in security. Moreover, SCGAgent is able to match or best the performance of sophisticated reasoning LLMs using a non-reasoning model and an agentic workflow.

Paperid: 2656, https://arxiv.org/pdf/2506.06933.pdf

Abstract:
Traditional decision-based black-box adversarial attacks on image classifiers aim to generate adversarial examples by slightly modifying input images while keeping the number of queries low, where each query involves sending an input to the model and observing its output. Most existing methods assume that all queries have equal cost. However, in practice, queries may incur asymmetric costs; for example, in content moderation systems, certain output classes may trigger additional review, enforcement, or penalties, making them more costly than others. While prior work has considered such asymmetric cost settings, effective algorithms for this scenario remain underdeveloped. In this paper, we propose a general framework for decision-based attacks under asymmetric query costs, which we refer to as asymmetric black-box attacks. We modify two core components of existing attacks: the search strategy and the gradient estimation process. Specifically, we propose Asymmetric Search (AS), a more conservative variant of binary search that reduces reliance on high-cost queries, and Asymmetric Gradient Estimation (AGREST), which shifts the sampling distribution to favor low-cost queries. We design efficient algorithms that minimize total attack cost by balancing different query types, in contrast to earlier methods such as stealthy attacks that focus only on limiting expensive (high-cost) queries. Our method can be integrated into a range of existing black-box attacks with minimal changes. We perform both theoretical analysis and empirical evaluation on standard image classification benchmarks. Across various cost regimes, our method consistently achieves lower total query cost and smaller perturbations than existing approaches, with improvements of up to 40% in some settings.

Paperid: 2657, https://arxiv.org/pdf/2506.06597.pdf

Abstract:
The confidentiality of trained AI models on edge devices is at risk from side-channel attacks exploiting power and electromagnetic emissions. This paper proposes a novel training methodology to enhance resilience against such threats by introducing randomized and interchangeable model configurations during inference. Experimental results on Google Coral Edge TPU show a reduction in side-channel leakage and a slower increase in t-scores over 20,000 traces, demonstrating robustness against adversarial observations. The defense maintains high accuracy, with about 1% degradation in most configurations, and requires no additional hardware or software changes, making it the only applicable solution for existing Edge TPUs.

Paperid: 2658, https://arxiv.org/pdf/2506.06565.pdf

Abstract:
Evolving attacks are a critical challenge for the long-term success of Network Intrusion Detection Systems (NIDS). The rise of these changing patterns has exposed the limitations of traditional network security methods. While signature-based methods are used to detect different types of attacks, they often fail to detect unknown attacks. Moreover, the system requires frequent updates with new signatures as the attackers are constantly changing their tactics. In this paper, we design an environment where two agents improve their policies over time. The adversarial agent, referred to as the red agent, perturbs packets to evade the intrusion detection mechanism, whereas the blue agent learns new defensive policies using drift adaptation techniques to counter the attacks. Both agents adapt iteratively: the red agent responds to the evolving NIDS, while the blue agent adjusts to emerging attack patterns. By studying the model's learned policy, we offer concrete insights into drift adaptation techniques with high utility. Experiments show that the blue agent boosts model accuracy by 30% with just 2 to 3 adaptation steps using only 25 to 30 samples each.

Paperid: 2659, https://arxiv.org/pdf/2506.06547.pdf

Abstract:
Data trustees serve as intermediaries that facilitate secure data sharing between independent parties. This paper offers a technical perspective on Data trustees, guided by privacy-by-design principles. We introduce PrivTru, an instantiation of a data trustee that provably achieves optimal privacy properties. Therefore, PrivTru calculates the minimal amount of information the data trustee needs to request from data sources to respond to a given query. Our analysis shows that PrivTru minimizes information leakage to the data trustee, regardless of the trustee's prior knowledge, while preserving the utility of the data.

Paperid: 2661, https://arxiv.org/pdf/2506.05900.pdf

Abstract:
The dire need to protect sensitive data has led to various flavors of privacy definitions. Among these, Differential privacy (DP) is considered one of the most rigorous and secure notions of privacy, enabling data analysis while preserving the privacy of data contributors. One of the fundamental tasks of data analysis is clustering , which is meant to unravel hidden patterns within complex datasets. However, interpreting clustering results poses significant challenges, and often necessitates an extensive analytical process. Interpreting clustering results under DP is even more challenging, as analysts are provided with noisy responses to queries, and longer, manual exploration sessions require additional noise to meet privacy constraints. While increasing attention has been given to clustering explanation frameworks that aim at assisting analysts by automatically uncovering the characteristics of each cluster, such frameworks may also disclose sensitive information within the dataset, leading to a breach in privacy. To address these challenges, we present DPClustX, a framework that provides explanations for black-box clustering results while satisfying DP. DPClustX takes as input the sensitive dataset alongside privately computed clustering labels, and outputs a global explanation, emphasizing prominent characteristics of each cluster while guaranteeing DP. We perform an extensive experimental analysis of DPClustX on real data, showing that it provides insightful and accurate explanations even under tight privacy constraints.

Paperid: 2662, https://arxiv.org/pdf/2506.05502.pdf

Abstract:
Watermarking for large language models (LLMs) offers a promising approach to identifying AI-generated text. Existing approaches, however, either compromise the distribution of original generated text by LLMs or are limited to embedding zero-bit information that only allows for watermark detection but ignores identification. We present StealthInk, a stealthy multi-bit watermarking scheme that preserves the original text distribution while enabling the embedding of provenance data, such as userID, TimeStamp, and modelID, within LLM-generated text. This enhances fast traceability without requiring access to the language model's API or prompts. We derive a lower bound on the number of tokens necessary for watermark detection at a fixed equal error rate, which provides insights on how to enhance the capacity. Comprehensive empirical evaluations across diverse tasks highlight the stealthiness, detectability, and resilience of StealthInk, establishing it as an effective solution for LLM watermarking applications.

Paperid: 2663, https://arxiv.org/pdf/2506.05446.pdf

Abstract:
Large Language Models (LLMs) are increasingly powerful but remain vulnerable to prompt injection attacks, where malicious inputs cause the model to deviate from its intended instructions. This paper introduces Sentinel, a novel detection model, qualifire/prompt-injection-sentinel, based on the \answerdotai/ModernBERT-large architecture. By leveraging ModernBERT's advanced features and fine-tuning on an extensive and diverse dataset comprising a few open-source and private collections, Sentinel achieves state-of-the-art performance. This dataset amalgamates varied attack types, from role-playing and instruction hijacking to attempts to generate biased content, alongside a broad spectrum of benign instructions, with private datasets specifically targeting nuanced error correction and real-world misclassifications. On a comprehensive, unseen internal test set, Sentinel demonstrates an average accuracy of 0.987 and an F1-score of 0.980. Furthermore, when evaluated on public benchmarks, it consistently outperforms strong baselines like protectai/deberta-v3-base-prompt-injection-v2. This work details Sentinel's architecture, its meticulous dataset curation, its training methodology, and a thorough evaluation, highlighting its superior detection capabilities.

Paperid: 2664, https://arxiv.org/pdf/2506.05359.pdf

Abstract:
Meme tokens represent a distinctive asset class within the cryptocurrency ecosystem, characterized by high community engagement, significant market volatility, and heightened vulnerability to market manipulation. This paper introduces an innovative approach to assessing liquidity risk in meme token markets using entity-linked address identification techniques. We propose a multi-dimensional method integrating fund flow analysis, behavioral similarity, and anomalous transaction detection to identify related addresses. We develop a comprehensive set of liquidity risk indicators tailored for meme tokens, covering token distribution, trading activity, and liquidity metrics. Empirical analysis of tokens like BabyBonk, NMT, and BonkFork validates our approach, revealing significant disparities between apparent and actual liquidity in meme token markets. The findings of this study provide significant empirical evidence for market participants and regulatory authorities, laying a theoretical foundation for building a more transparent and robust meme token ecosystem.

Paperid: 2665, https://arxiv.org/pdf/2506.04800.pdf

Abstract:
This paper presents MULTISS, a new protocol for long-term storage distributed across multiple Quantum Key Distribution (QKD) networks. This protocol is an extension of LINCOS, a secure storage protocol that uses Shamir secret sharing for secret storage on a single QKD network. Our protocol uses hierarchical secret sharing to distribute a secret across multiple QKD networks while ensuring perfect security. Our protocol further allows for sharing updates without having to reconstruct the entire secret. We also prove that MULTISS is strictly more secure than LINCOS, which remains vulnerable when its QKD network is compromised.

Paperid: 2666, https://arxiv.org/pdf/2506.04307.pdf

Abstract:
The InterPlanetary File System~(IPFS) offers a decentralized approach to file storage and sharing, promising resilience and efficiency while also realizing the Web3 paradigm. Simultaneously, the offered anonymity raises significant questions about potential misuse. In this study, we explore methods that malicious actors can exploit IPFS to upload and disseminate harmful content while remaining anonymous. We evaluate the role of pinning services and public gateways, identifying their capabilities and limitations in maintaining content availability. Using scripts, we systematically test the behavior of these services by uploading malicious files. Our analysis reveals that pinning services and public gateways lack mechanisms to assess or restrict the propagation of malicious content.

Paperid: 2667, https://arxiv.org/pdf/2506.02667.pdf

Abstract:
Automated debugging, long pursued in a variety of fields from software engineering to cybersecurity, requires a framework that offers the building blocks for a programmable debugging workflow. However, existing debuggers are primarily tailored for human interaction, and those designed for programmatic debugging focus on kernel space, resulting in limited functionality in userland. To fill this gap, we introduce libdebug, a Python library for programmatic debugging of userland binary executables. libdebug offers a user-friendly API that enables developers to build custom debugging tools for various applications, including software engineering, reverse engineering, and software security. It is released as an open-source project, along with comprehensive documentation to encourage use and collaboration across the community. We demonstrate the versatility and performance of libdebug through case studies and benchmarks, all of which are publicly available. We find that the median latency of syscall and breakpoint handling in libdebug is 3 to 4 times lower compared to that of GDB.

Paperid: 2668, https://arxiv.org/pdf/2506.02030.pdf

Abstract:
OpenID Connect (OIDC) enables a user with commercial-off-the-shelf browsers to log into multiple websites, called relying parties (RPs), by her username and credential set up in another trusted web system, called the identity provider (IdP). Identity transformations are proposed in UppreSSO to provide OIDC-compatible SSO services, preventing both IdP-based login tracing and RP-based identity linkage. While security and privacy of SSO services in UppreSSO have been proved, several essential issues of this identity-transformation approach are not well studied. In this paper, we comprehensively investigate the approach as below. Firstly, several suggestions for the efficient integration of identity transformations in OIDC-compatible SSO are explained. Then, we uncover the relationship between identity-transformations in SSO and oblivious pseudo-random functions (OPRFs), and present two variations of the properties required for SSO security as well as the privacy requirements, to analyze existing OPRF protocols. Finally, new identity transformations different from those designed in UppreSSO, are constructed based on OPRFs, satisfying different variations of SSO security requirements. To the best of our knowledge, this is the first time to uncover the relationship between identity transformations in OIDC-compatible privacy-preserving SSO services and OPRFs, and prove the SSO-related properties (i.e., key-identifier freeness, RP designation and user identification) of OPRF protocols, in addition to the basic properties of correctness, obliviousness and pseudo-randomness.

Paperid: 2670, https://arxiv.org/pdf/2506.01227.pdf

Abstract:
Graph-based frameworks are often used in network hardening to help a cyber defender understand how a network can be attacked and how the best defenses can be deployed. However, incorporating network connectivity parameters in the attack graph, reasoning about the attack graph when we do not have access to complete information, providing system administrator suggestions in an understandable format, and allowing them to do what-if analysis on various scenarios and attacker motives is still missing. We fill this gap by presenting SPEAR, a formal framework with tool support for security posture evaluation and analysis that keeps human-in-the-loop. SPEAR uses the causal formalism of AI planning to model vulnerabilities and configurations in a networked system. It automatically converts network configurations and vulnerability descriptions into planning models expressed in the Planning Domain Definition Language (PDDL). SPEAR identifies a set of diverse security hardening strategies that can be presented in a manner understandable to the domain expert. These allow the administrator to explore the network hardening solution space in a systematic fashion and help evaluate the impact and compare the different solutions.

Paperid: 2671, https://arxiv.org/pdf/2505.23880.pdf

Abstract:
WhatsApp and many other commonly used communication platforms guarantee end-to-end encryption (E2EE), which requires that service providers lack the cryptographic keys to read communications on their own platforms. WhatsApp's privacy-preserving design makes it difficult to study important phenomena like the spread of misinformation or political messaging, as users have a clear expectation and desire for privacy and little incentive to forfeit that privacy in the process of handing over raw data to researchers, journalists, or other parties. We introduce Synopsis, a secure architecture for analyzing messaging trends in consensually-donated E2EE messages using message embeddings. Since the goal of this system is investigative journalism workflows, Synopsis must facilitate both exploratory and targeted analyses -- a challenge for systems using differential privacy (DP), and, for different reasons, a challenge for private computation approaches based on cryptography. To meet these challenges, we combine techniques from the local and central DP models and wrap the system in malicious-secure multi-party computation to ensure the DP query architecture is the only way to access messages, preventing any party from directly viewing stored message embeddings. Evaluations on a dataset of Hindi-language WhatsApp messages (34,024 messages represented as 500-dimensional embeddings) demonstrate the efficiency and accuracy of our approach. Queries on this data run in about 30 seconds, and the accuracy of the fine-grained interface exceeds 94% on benchmark tasks.

Paperid: 2672, https://arxiv.org/pdf/2505.23800.pdf

Abstract:
The increasing digitization of agricultural operations has introduced new cybersecurity challenges for the farming community. This paper introduces an educational intervention called Cybersecurity Improvement Initiative for Agriculture (CIIA), which aims to strengthen cybersecurity awareness and resilience among farmers and food producers. Using a case study that focuses on farmers from the Ponca Tribe of Nebraska, the research evaluates pre- and post- intervention survey data to assess participants' cybersecurity knowledge and awareness before and after exposure to the CIIA. The findings reveal a substantial baseline deficiency in cybersecurity education among participants, however, post-intervention assessments demonstrate improvements in the comprehension of cybersecurity concepts, such as password hygiene, multi-factor authentication, and the necessity of routine data backups. These initial findings highlight the need for a continued and sustained, community-specific cybersecurity education effort to help mitigate emerging cyber threats in the agricultural sector.

Paperid: 2673, https://arxiv.org/pdf/2505.23192.pdf

Abstract:
The rise of text-to-image (T2I) models has enabled the synthesis of photorealistic human portraits, raising serious concerns about identity misuse and the robustness of AIGC detectors. In this work, we propose an automated adversarial prompt generation framework that leverages a grammar tree structure and a variant of the Monte Carlo tree search algorithm to systematically explore the semantic prompt space. Our method generates diverse, controllable prompts that consistently evade both open-source and commercial AIGC detectors. Extensive experiments across multiple T2I models validate its effectiveness, and the approach ranked first in a real-world adversarial AIGC detection competition. Beyond attack scenarios, our method can also be used to construct high-quality adversarial datasets, providing valuable resources for training and evaluating more robust AIGC detection and defense systems.

Paperid: 2674, https://arxiv.org/pdf/2505.22778.pdf

Abstract:
Powerful machine learning (ML) models are now readily available online, which creates exciting possibilities for users who lack the deep technical expertise or substantial computing resources needed to develop them. On the other hand, this type of open ecosystem comes with many risks. In this paper, we argue that the current ecosystem for open ML models contains significant supply-chain risks, some of which have been exploited already in real attacks. These include an attacker replacing a model with something malicious (e.g., malware), or a model being trained using a vulnerable version of a framework or on restricted or poisoned data. We then explore how Sigstore, a solution designed to bring transparency to open-source software supply chains, can be used to bring transparency to open ML models, in terms of enabling model publishers to sign their models and prove properties about the datasets they use.

Paperid: 2675, https://arxiv.org/pdf/2505.22619.pdf

Abstract:
Research on blockchains addresses multiple issues, with one being writing smart contracts. In our previous research we described methodology and a tool to generate, in automated fashion, smart contracts from BPMN models. The generated smart contracts provide support for multi-step transactions that facilitate repair/upgrade of smart contracts. In this paper we show how the approach is used to support collaborations via smart contracts for companies ranging from SMEs with little IT capabilities to companies with IT using blockchain smart contracts. Furthermore, we also show how the approach is used for certain applications to generate smart contracts by a BPMN modeler who does not need any knowledge of blockchain technology or smart contract development - thus we are hoping to facilitate democratization of smart contracts and blockchain technology.

Paperid: 2676, https://arxiv.org/pdf/2505.22612.pdf

Abstract:
This paper addresses the challenge of creating smart contracts for applications represented using Business Process Management and Notation (BPMN) models. In our prior work we presented a methodology that automates the generation of smart contracts from BPMN models. This approach abstracts the BPMN flow control, making it independent of the underlying blockchain infrastructure, with only the BPMN task elements requiring coding. In subsequent research, we enhanced our approach by adding support for nested transactions and enabling a smart contract repair and/or upgrade. To empower Business Analysts (BAs) to generate smart contracts without relying on software developers, we tackled the challenge of generating smart contracts from BPMN models without assistance of a software developer. We exploit the Decision Model and Notation (DMN) standard to represent the decisions and the business logic of the BPMN task elements and amended our methodology for transformation of BPMN models into smart contracts to support also the generation script to represent the business logic represented by the DMN models. To support such transformation, we describe how the BA documents, using the BPMN elements, the flow of information along with the flow of execution. Thus, if the BA is successful in representing the blockchain application requirements using BPMN and DMN models, our methodology and the tool, called TABS, that we developed as a proof of concept, is used to generate the smart contracts directly from those models without developer assistance.

Paperid: 2677, https://arxiv.org/pdf/2505.22435.pdf

Abstract:
Due to the increasing presence of networked devices in everyday life, not only cybersecurity specialists but also end users benefit from security applications such as firewalls, vulnerability scanners, and intrusion detection systems. Recent approaches use large language models (LLMs) to rewrite brief, technical security alerts into intuitive language and suggest actionable measures, helping everyday users understand and respond appropriately to security risks. However, it remains an open question how well such alerts are explained to users. LLM outputs can also be hallucinated, inconsistent, or misleading. In this work, we introduce the Human-Centered Security Alert Evaluation Framework (HCSAEF). HCSAEF assesses LLM-generated cybersecurity notifications to support researchers who want to compare notifications generated for everyday users, improve them, or analyze the capabilities of different LLMs in explaining cybersecurity issues. We demonstrate HCSAEF through three use cases, which allow us to quantify the impact of prompt design, model selection, and output consistency. Our findings indicate that HCSAEF effectively differentiates generated notifications along dimensions such as intuitiveness, urgency, and correctness.

Paperid: 2678, https://arxiv.org/pdf/2505.22010.pdf

Abstract:
Recognizing vulnerabilities in stripped binary files presents a significant challenge in software security. Although some progress has been made in generating human-readable information from decompiled binary files with Large Language Models (LLMs), effectively and scalably detecting vulnerabilities within these binary files is still an open problem. This paper explores the novel application of LLMs to detect vulnerabilities within these binary files. We demonstrate the feasibility of identifying vulnerable programs through a combined approach of decompilation optimization to make the vulnerabilities more prominent and long-term memory for a larger context window, achieving state-of-the-art performance in binary vulnerability analysis. Our findings highlight the potential for LLMs to overcome the limitations of traditional analysis methods and advance the field of binary vulnerability detection, paving the way for more secure software systems. In this paper, we present Vul-BinLLM , an LLM-based framework for binary vulnerability detection that mirrors traditional binary analysis workflows with fine-grained optimizations in decompilation and vulnerability reasoning with an extended context. In the decompilation phase, Vul-BinLLM adds vulnerability and weakness comments without altering the code structure or functionality, providing more contextual information for vulnerability reasoning later. Then for vulnerability reasoning, Vul-BinLLM combines in-context learning and chain-of-thought prompting along with a memory management agent to enhance accuracy. Our evaluations encompass the commonly used synthetic dataset Juliet to evaluate the potential feasibility for analysis and vulnerability detection in C/C++ binaries. Our evaluations show that Vul-BinLLM is highly effective in detecting vulnerabilities on the compiled Juliet dataset.

Paperid: 2679, https://arxiv.org/pdf/2505.21609.pdf

Abstract:
Adversarial artificial intelligence (AI) attacks pose a significant threat to autonomous transportation, such as maritime vessels, that rely on AI components. Malicious actors can exploit these systems to deceive and manipulate AI-driven operations. This paper addresses three critical research challenges associated with adversarial AI: the limited scope of traditional defences, inadequate security metrics, and the need to build resilience beyond model-level defences. To address these challenges, we propose building defences utilising multiple inputs and data fusion to create defensive components and an AI security metric as a novel approach toward developing more secure AI systems. We name this approach the Data Fusion Cyber Resilience (DFCR) method, and we evaluate it through real-world demonstrations and comprehensive quantitative analyses, comparing a system built with the DFCR method against single-input models and models utilising existing state-of-the-art defences. The findings show that the DFCR approach significantly enhances resilience against adversarial machine learning attacks in maritime autonomous system operations, achieving up to a 35\% reduction in loss for successful multi-pronged perturbation attacks, up to a 100\% reduction in loss for successful adversarial patch attacks and up to 100\% reduction in loss for successful spoofing attacks when using these more resilient systems. We demonstrate how DFCR and DFCR confidence scores can reduce adversarial AI contact confidence and improve decision-making by the system, even when typical adversarial defences have been compromised. Ultimately, this work contributes to the development of more secure and resilient AI-driven systems against adversarial attacks.

Paperid: 2680, https://arxiv.org/pdf/2505.20183.pdf

Abstract:
The widespread adoption of the Go programming language in infrastructure backends and blockchain projects has heightened the need for improved security measures. Established techniques such as unit testing, static analysis, and program fuzzing provide foundational protection mechanisms. Although symbolic execution tools have made significant contributions, opportunities remain to address the complexities of Go's runtime and concurrency model. In this work, we present Zorya, a novel methodology leveraging concrete and symbolic (concolic) execution to evaluate Go programs comprehensively. By systematically exploring execution paths to uncover vulnerabilities beyond conventional testing, symbolic execution offers distinct advantages, and coupling it with concrete execution mitigates the path explosion problem. Our solution employs Ghidra's P-Code as an intermediate representation (IR). This implementation detects runtime panics in the TinyGo compiler and supports both generic and custom invariants. Furthermore, P-Code's generic IR nature enables analysis of programs written in other languages such as C. Future enhancements may include intelligent classification of concolic execution logs to identify vulnerability patterns.

Paperid: 2681, https://arxiv.org/pdf/2505.20136.pdf

Abstract:
As Artificial Intelligence (AI) systems, particularly those based on machine learning (ML), become integral to high-stakes applications, their probabilistic and opaque nature poses significant challenges to traditional verification and validation methods. These challenges are exacerbated in regulated sectors requiring tamper-proof, auditable evidence, as highlighted by apposite legal frameworks, e.g., the EU AI Act. Conversely, Zero-Knowledge Proofs (ZKPs) offer a cryptographic solution that enables provers to demonstrate, through verified computations, adherence to set requirements without revealing sensitive model details or data. Through a systematic survey of ZKP protocols, we identify five key properties (non-interactivity, transparent setup, standard representations, succinctness, and post-quantum security) critical for their application in AI validation and verification pipelines. Subsequently, we perform a follow-up systematic survey analyzing ZKP-enhanced ML applications across an adaptation of the Team Data Science Process (TDSP) model (Data & Preprocessing, Training & Offline Metrics, Inference, and Online Metrics), detailing verification objectives, ML models, and adopted protocols. Our findings indicate that current research on ZKP-Enhanced ML primarily focuses on inference verification, while the data preprocessing and training stages remain underexplored. Most notably, our analysis identifies a significant convergence within the research domain toward the development of a unified Zero-Knowledge Machine Learning Operations (ZKMLOps) framework. This emerging framework leverages ZKPs to provide robust cryptographic guarantees of correctness, integrity, and privacy, thereby promoting enhanced accountability, transparency, and compliance with Trustworthy AI principles.

Paperid: 2682, https://arxiv.org/pdf/2505.18872.pdf

Abstract:
In response to the increasing complexity and sophistication of cyber threats, particularly those enhanced by advancements in artificial intelligence, traditional security methods are proving insufficient. This paper explores the Zero Trust cybersecurity framework, which operates on the principle of never trust, always verify to mitigate vulnerabilities within organizations. Specifically, it examines the applicability of Zero Trust principles in environments where large volumes of information are exchanged, such as schools and libraries. The discussion highlights the importance of continuous authentication, least privilege access, and breach assumption. The findings underscore avenues for future research that may help preserve the security of these vulnerable organizations.

Paperid: 2683, https://arxiv.org/pdf/2505.18814.pdf

Abstract:
As electronic signatures (e-signatures) become increasingly integral to secure digital transactions, understanding their usability and security perception from an end-user perspective has become crucial. This study empirically evaluates and compares two major e-signature systems -- token-based and remote signatures -- through a controlled user experience study with 20 participants. Participants completed tasks involving acquisition, installation, and document signing using both methods, followed by structured surveys and qualitative feedback. Statistical analyses revealed that remote e-signatures were perceived as significantly more usable than token-based ones, due to their minimal setup and platform-independent accessibility. In contrast, token-based signatures were rated as significantly more secure, highlighting users' trust in hardware-based protection. Although more participants preferred remote e-signatures for document signing, the preference did not reach statistical significance, indicating a trend toward favoring convenience in real-world scenarios. These findings underline the fundamental trade-off between usability and perceived security in digital signing systems. By bridging the gap between theoretical frameworks and real user experience, this study contributes valuable insights to the design and policymaking of qualified electronic signature solutions.

Paperid: 2684, https://arxiv.org/pdf/2505.18402.pdf

Abstract:
The advancements in communication technology (5G and beyond) and global connectivity Internet of Things (IoT) also come with new security problems that will need to be addressed in the next few years. The threats and vulnerabilities introduced by AI/ML based 5G and beyond IoT systems need to be investigated to avoid the amplification of attack vectors on AI/ML. AI/ML techniques are playing a vital role in numerous applications of cybersecurity. Despite the ongoing success, there are significant challenges in ensuring the trustworthiness of AI/ML systems. However, further research is needed to define what is considered an AI/ML threat and how it differs from threats to traditional systems, as currently there is no common understanding of what constitutes an attack on AI/ML based systems, nor how it might be created, hosted and propagated [ETSI, 2020]. Therefore, there is a need for studying the AI/ML approach to ensure safe and secure development, deployment, and operation of AI/ML based 5G and beyond IoT systems. For 5G and beyond, it is essential to continuously monitor and analyze any changing environment in real-time to identify and reduce intentional and unintentional risks. In this study, we will review the role of the AI/ML technique for 5G and beyond security. Furthermore, we will provide our perspective for predicting and mitigating 5G and beyond security using AI/ML techniques.

Paperid: 2685, https://arxiv.org/pdf/2505.18386.pdf

Abstract:
The rise of online social networks, user-gene-rated content, and third-party apps made data sharing an inevitable trend, driven by both user behavior and the commercial value of personal information. As service providers amass vast amounts of data, safeguarding individual privacy has become increasingly challenging. Privacy threat modeling has emerged as a critical tool for identifying and mitigating risks, with methodologies such as LINDDUN, xCOMPASS, and PANOPTIC offering systematic approaches. However, these frameworks primarily focus on threats arising from interactions between a single user and system components, often overlooking interdependent privacy (IDP); the phenomenon where one user's actions affect the privacy of other users and even non-users. IDP risks are particularly pronounced in third-party applications, where platform permissions, APIs, and user behavior can lead to unintended and unconsented data sharing, such as in the Cambridge Analytica case. We argue that existing threat modeling approaches are limited in exposing IDP-related threats, potentially underestimating privacy risks. To bridge this gap, we propose a specialized methodology that explicitly focuses on interdependent privacy. Our contributions are threefold: (i) we identify IDP-specific challenges and limitations in current threat modeling frameworks, (ii) we create IDPA, a threat modeling approach tailored to IDP threats, and (iii) we validate our approach through a case study on WeChat. We believe that IDPA can operate effectively on systems other than third-party apps and may motivate further research on specialized threat modeling.

Paperid: 2686, https://arxiv.org/pdf/2505.18172.pdf

Abstract:
The increasing sophistication and integration of Generative AI (GenAI) models into diverse applications introduce new security challenges that traditional methods struggle to address. This research explores the critical need for proactive security measures to mitigate the risks associated with malicious exploitation of GenAI systems. We present a framework encompassing key approaches, tools, and strategies designed to outmaneuver even advanced adversarial attacks, emphasizing the importance of securing GenAI innovation against potential liabilities. We also empirically prove the effectiveness of the said framework by testing it against the SPML Chatbot Prompt Injection Dataset. This work highlights the shift from reactive to proactive security practices essential for the safe and responsible deployment of GenAI technologies

Paperid: 2687, https://arxiv.org/pdf/2505.17466.pdf

Abstract:
Recent years have witnessed a rapid development of platform economy, as it effectively addresses the trust dilemma between untrusted online buyers and merchants. However, malicious platforms can misuse users' funds and information, causing severe security concerns. Previous research efforts aimed at enhancing security in platform payment systems often sacrificed processing performance, while those focusing on processing efficiency struggled to completely prevent fund and information misuse. In this paper, we introduce SecurePay, a secure, yet performant payment processing system for platform economy. SecurePay is the first payment system that combines permissioned blockchain with central bank digital currency (CBDC) to ensure fund security, information security, and resistance to collusion by intermediaries; it also facilitates counter-party auditing, closed-loop regulation, and enhances operational efficiency for transaction settlement. We develop a full implementation of the proposed SecurePay system, and our experiments conducted on personal devices demonstrate a throughput of 256.4 transactions per second and an average latency of 4.29 seconds, demonstrating a comparable processing efficiency with a centralized system, with a significantly improved security level.

Paperid: 2688, https://arxiv.org/pdf/2505.16888.pdf

Abstract:
Large language models (LLMs) have advanced many applications, but are also known to be vulnerable to adversarial attacks. In this work, we introduce a novel security threat: hijacking AI-human conversations by manipulating LLMs' system prompts to produce malicious answers only to specific targeted questions (e.g., "Who should I vote for US President?", "Are Covid vaccines safe?"), while behaving benignly on others. This attack is detrimental as it can enable malicious actors to exercise large-scale information manipulation by spreading harmful but benign-looking system prompts online. To demonstrate such an attack, we develop CAIN, an algorithm that can automatically curate such harmful system prompts for a specific target question in a black-box setting or without the need to access the LLM's parameters. Evaluated on both open-source and commercial LLMs, CAIN demonstrates significant adversarial impact. In untargeted attacks or forcing LLMs to output incorrect answers, CAIN achieves up to 40% F1 degradation on targeted questions while preserving high accuracy on benign inputs. For targeted attacks or forcing LLMs to output specific harmful answers, CAIN achieves over 70% F1 scores on these targeted responses with minimal impact on benign questions. Our results highlight the critical need for enhanced robustness measures to safeguard the integrity and safety of LLMs in real-world applications. All source code will be publicly available.

Paperid: 2689, https://arxiv.org/pdf/2505.16318.pdf

Abstract:
As vision-based machine learning models are increasingly integrated into autonomous and cyber-physical systems, concerns about (physical) adversarial patch attacks are growing. While state-of-the-art defenses can achieve certified robustness with minimal impact on utility against highly-concentrated localized patch attacks, they fall short in two important areas: (i) State-of-the-art methods are vulnerable to low-noise distributed patches where perturbations are subtly dispersed to evade detection or masking, as shown recently by the DorPatch attack; (ii) Achieving high robustness with state-of-the-art methods is extremely time and resource-consuming, rendering them impractical for latency-sensitive applications in many cyber-physical systems. To address both robustness and latency issues, this paper proposes a new defense strategy for adversarial patch attacks called SuperPure. The key novelty is developing a pixel-wise masking scheme that is robust against both distributed and localized patches. The masking involves leveraging a GAN-based super-resolution scheme to gradually purify the image from adversarial patches. Our extensive evaluations using ImageNet and two standard classifiers, ResNet and EfficientNet, show that SuperPure advances the state-of-the-art in three major directions: (i) it improves the robustness against conventional localized patches by more than 20%, on average, while also improving top-1 clean accuracy by almost 10%; (ii) It achieves 58% robustness against distributed patch attacks (as opposed to 0% in state-of-the-art method, PatchCleanser); (iii) It decreases the defense end-to-end latency by over 98% compared to PatchCleanser. Our further analysis shows that SuperPure is robust against white-box attacks and different patch sizes. Our code is open-source.

Paperid: 2690, https://arxiv.org/pdf/2505.16215.pdf

Abstract:
Due to its nature of dynamic, mobility, and wireless data transfer, the Internet of Vehicles (IoV) is prone to various cyber threats, ranging from spoofing and Distributed Denial of Services (DDoS) attacks to malware. To safeguard the IoV ecosystem from intrusions, malicious activities, policy violations, intrusion detection systems (IDS) play a critical role by continuously monitoring and analyzing network traffic to identify and mitigate potential threats in real-time. However, most existing research has focused on developing centralized, machine learning-based IDS systems for IoV without accounting for its inherently distributed nature. Due to intensive computing requirements, these centralized systems often rely on the cloud to detect cyber threats, increasing delay of system response. On the other hand, edge nodes typically lack the necessary resources to train and deploy complex machine learning algorithms. To address this issue, this paper proposes an effective hierarchical classification framework tailored for IoV networks. Hierarchical classification allows classifiers to be trained and tested at different levels, enabling edge nodes to detect specific types of attacks independently. With this approach, edge nodes can conduct targeted attack detection while leveraging cloud nodes for comprehensive threat analysis and support. Given the resource constraints of edge nodes, we have employed the Boruta feature selection method to reduce data dimensionality, optimizing processing efficiency. To evaluate our proposed framework, we utilize the latest IoV security dataset CIC-IoV2024, achieving promising results that demonstrate the feasibility and effectiveness of our models in securing IoV networks.

Paperid: 2691, https://arxiv.org/pdf/2505.16137.pdf

Abstract:
The emergence of cloud computing gives huge impact on large computations. Cloud computing platforms offer servers with large computation power to be available for customers. These servers can be used efficiently to solve problems that are complex by nature, for example, satisfiability (SAT) problems. Many practical problems can be converted to SAT, for example, circuit verification and network configuration analysis. However, outsourcing SAT instances to the servers may cause data leakage that can jeopardize system's security. Before outsourcing the SAT instance, one needs to hide the input information. One way to preserve privacy and hide information is to randomize the SAT instance before outsourcing. In this paper, we present multiple novel methods to randomize SAT instances. We present a novel method to randomize the SAT instance, a variable randomization method to randomize the solution set, and methods to randomize Mincost SAT and MAX3SAT instances. Our analysis and evaluation show the correctness and feasibility of these randomization methods. The scalability and generality of our methods make it applicable for real world problems.

Paperid: 2692, https://arxiv.org/pdf/2505.15568.pdf

Abstract:
Payment channel networks are an approach to improve the scalability of blockchain-based cryptocurrencies. The Lightning Network is a payment channel network built for Bitcoin that is already used in practice. Because the Lightning Network is used for transfer of financial value, its security in the presence of adversarial participants should be verified. The Lightning protocol's complexity makes it hard to assess whether the protocol is secure. To enable computer-aided security verification of Lightning, we formalize the protocol in TLA+ and formally specify the security property that honest users are guaranteed to retrieve their correct balance. While model checking provides a fully automated verification of the security property, the state space of the protocol's specification is so large that model checking becomes unfeasible. We make model checking the Lightning Network possible using two refinement steps that we verify using proofs. In a first step, we prove that the model of time used in the protocol can be abstracted using ideas from the research of timed automata. In a second step, we prove that it suffices to model check the protocol for single payment channels and the protocol for multi-hop payments separately. These refinements reduce the state space sufficiently to allow for model checking Lightning with models with payments over up to four hops and two concurrent payments. These results indicate that the current specification of Lightning is secure.

Paperid: 2693, https://arxiv.org/pdf/2505.15124.pdf

Abstract:
In this survey, we will explore the interaction between secure multiparty computation and the area of machine learning. Recent advances in secure multiparty computation (MPC) have significantly improved its applicability in the realm of machine learning (ML), offering robust solutions for privacy-preserving collaborative learning. This review explores key contributions that leverage MPC to enable multiple parties to engage in ML tasks without compromising the privacy of their data. The integration of MPC with ML frameworks facilitates the training and evaluation of models on combined datasets from various sources, ensuring that sensitive information remains encrypted throughout the process. Innovations such as specialized software frameworks and domain-specific languages streamline the adoption of MPC in ML, optimizing performance and broadening its usage. These frameworks address both semi-honest and malicious threat models, incorporating features such as automated optimizations and cryptographic auditing to ensure compliance and data integrity. The collective insights from these studies highlight MPC's potential in fostering collaborative yet confidential data analysis, marking a significant stride towards the realization of secure and efficient computational solutions in privacy-sensitive industries. This paper investigates a spectrum of SecureML libraries that includes cryptographic protocols, federated learning frameworks, and privacy-preserving algorithms. By surveying the existing literature, this paper aims to examine the efficacy of these libraries in preserving data privacy, ensuring model confidentiality, and fortifying ML systems against adversarial attacks. Additionally, the study explores an innovative application domain for SecureML techniques: the integration of these methodologies in gaming environments utilizing ML.

Paperid: 2694, https://arxiv.org/pdf/2505.14891.pdf

Abstract:
The Nakamoto consensus protocol underlying the Bitcoin blockchain uses proof of work as a voting mechanism. Honest miners who contribute hashing power towards securing the chain try to extend the longest chain they are aware of. Despite its simplicity, Nakamoto consensus achieves meaningful security guarantees assuming that at any point in time, a majority of the hashing power is controlled by honest parties. This also holds under ``resource variability'', i.e., if the total hashing power varies greatly over time. Proofs of space (PoSpace) have been suggested as a more sustainable replacement for proofs of work. Unfortunately, no construction of a ``longest-chain'' blockchain based on PoSpace, that is secure under dynamic availability, is known. In this work, we prove that without additional assumptions no such protocol exists. We exactly quantify this impossibility result by proving a bound on the length of the fork required for double spending as a function of the adversarial capabilities. This bound holds for any chain selection rule, and we also show a chain selection rule (albeit a very strange one) that almost matches this bound. Concretely, we consider a security game in which the honest parties at any point control $Ï>1$ times more space than the adversary. The adversary can change the honest space by a factor $1\pm \varepsilon$ with every block (dynamic availability), and ``replotting'' the space takes as much time as $Ï$ blocks. We prove that no matter what chain selection rule is used, in this game the adversary can create a fork of length $Ï^2\cdot Ï/ \varepsilon$ that will be picked as the winner by the chain selection rule. We also provide an upper bound that matches the lower bound up to a factor $Ï$. There exists a chain selection rule which in the above game requires forks of length at least $Ï\cdot Ï/ \varepsilon$.

Paperid: 2695, https://arxiv.org/pdf/2505.14673.pdf

Abstract:
Invisible image watermarking can protect image ownership and prevent malicious misuse of visual generative models. However, existing generative watermarking methods are mainly designed for diffusion models while watermarking for autoregressive image generation models remains largely underexplored. We propose IndexMark, a training-free watermarking framework for autoregressive image generation models. IndexMark is inspired by the redundancy property of the codebook: replacing autoregressively generated indices with similar indices produces negligible visual differences. The core component in IndexMark is a simple yet effective match-then-replace method, which carefully selects watermark tokens from the codebook based on token similarity, and promotes the use of watermark tokens through token replacement, thereby embedding the watermark without affecting the image quality. Watermark verification is achieved by calculating the proportion of watermark tokens in generated images, with precision further improved by an Index Encoder. Furthermore, we introduce an auxiliary validation scheme to enhance robustness against cropping attacks. Experiments demonstrate that IndexMark achieves state-of-the-art performance in terms of image quality and verification accuracy, and exhibits robustness against various perturbations, including cropping, noises, Gaussian blur, random erasing, color jittering, and JPEG compression.

Paperid: 2696, https://arxiv.org/pdf/2505.14368.pdf

Abstract:
Recent studies demonstrate that Large Language Models (LLMs) are vulnerable to different prompt-based attacks, generating harmful content or sensitive information. Both closed-source and open-source LLMs are underinvestigated for these attacks. This paper studies effective prompt injection attacks against the $\mathbf{14}$ most popular open-source LLMs on five attack benchmarks. Current metrics only consider successful attacks, whereas our proposed Attack Success Probability (ASP) also captures uncertainty in the model's response, reflecting ambiguity in attack feasibility. By comprehensively analyzing the effectiveness of prompt injection attacks, we propose a simple and effective hypnotism attack; results show that this attack causes aligned language models, including Stablelm2, Mistral, Openchat, and Vicuna, to generate objectionable behaviors, achieving around $90$% ASP. They also indicate that our ignore prefix attacks can break all $\mathbf{14}$ open-source LLMs, achieving over $60$% ASP on a multi-categorical dataset. We find that moderately well-known LLMs exhibit higher vulnerability to prompt injection attacks, highlighting the need to raise public awareness and prioritize efficient mitigation strategies.

Paperid: 2697, https://arxiv.org/pdf/2505.13964.pdf

Abstract:
We present a secure and efficient string-matching platform leveraging zk-SNARKs (Zero-Knowledge Succinct Non-Interactive Arguments of Knowledge) to address the challenge of detecting sensitive information leakage while preserving data privacy. Our solution enables organizations to verify whether private strings appear on public platforms without disclosing the strings themselves. To achieve computational efficiency, we integrate a sliding window technique with the Rabin-Karp algorithm and Rabin Fingerprint, enabling hash-based rolling comparisons to detect string matches. This approach significantly reduces time complexity compared to traditional character-by-character comparisons. We implement the proposed system using gnark, a high-performance zk-SNARK library, which generates succinct and verifiable proofs for privacy-preserving string matching. Experimental results demonstrate that our solution achieves strong privacy guarantees while maintaining computational efficiency and scalability. This work highlights the practical applications of zero-knowledge proofs in secure data verification and contributes a scalable method for privacy-preserving string matching.

Paperid: 2698, https://arxiv.org/pdf/2505.13758.pdf

Abstract:
In this work, we consider an inversion attack on the obfuscated input embeddings sent to a language model on a server, where the adversary has no access to the language model or the obfuscation mechanism and sees only the obfuscated embeddings along with the model's embedding table. We propose BeamClean, an inversion attack that jointly estimates the noise parameters and decodes token sequences by integrating a language-model prior. Against Laplacian and Gaussian obfuscation mechanisms, BeamClean always surpasses naive distance-based attacks. This work highlights the necessity for and robustness of more advanced learned, input-dependent methods.

Paperid: 2699, https://arxiv.org/pdf/2505.13751.pdf

Abstract:
Censorship resistance is one of the core value proposition of blockchains. A recurring design pattern aimed at providing censorship resistance is enabling multiple proposers to contribute inputs into block construction. Notably, Fork-Choice Enforced Inclusion Lists (FOCIL) is proposed to be included in Ethereum. However, the current proposal relies on altruistic behavior, without a Transaction Fee Mechanism (TFM). This study aims to address this gap by exploring how multiple proposers should be rewarded to incentivize censorship resistance. The main contribution of this work is the identification of TFMs that ensure censorship resistance under bribery attacks, while also satisfying the incentive compatibility properties of EIP-1559. We provide a concrete payment mechanism for FOCIL, along with generalizable contributions to the literature by analyzing 1) incentive compatibility of TFMs in the presence of a bribing adversary, 2) TFMs in protocols with multiple phases of transaction inclusion, and 3) TFMs of protocols in which parties are uncertain about the behavior and the possible bribe of others.

Paperid: 2700, https://arxiv.org/pdf/2505.13694.pdf

Abstract:
In response to the rising frequency and complexity of data breaches and evolving global privacy regulations, this study presents a comprehensive examination of academic literature on the classification of privacy breaches and violations between 2010-2024. Through a systematic literature review, a corpus of screened studies was assembled and analyzed to identify primary research themes, emerging trends, and gaps in the field. A novel taxonomy is introduced to guide efforts by categorizing research efforts into seven domains: breach classification, report classification, breach detection, threat detection, breach prediction, risk analysis, and threat classification. An analysis reveals that breach classification and detection dominate the literature, while breach prediction and risk analysis have only recently emerged in the literature, suggesting opportunities for potential research impacts. Keyword and phrase frequency analysis reveal potentially underexplored areas, including location privacy, prediction models, and healthcare data breaches.

Paperid: 2701, https://arxiv.org/pdf/2505.12851.pdf

Abstract:
Byzantine attacks during model aggregation in Federated Learning (FL) threaten training integrity by manipulating malicious clients' updates. Existing methods struggle with limited robustness under high malicious client ratios and sensitivity to non-i.i.d. data, leading to degraded accuracy. To address this, we propose FLTG, a novel aggregation algorithm integrating angle-based defense and dynamic reference selection. FLTG first filters clients via ReLU-clipped cosine similarity, leveraging a server-side clean dataset to exclude misaligned updates. It then dynamically selects a reference client based on the prior global model to mitigate non-i.i.d. bias, assigns aggregation weights inversely proportional to angular deviations, and normalizes update magnitudes to suppress malicious scaling. Evaluations across datasets of varying complexity under five classic attacks demonstrate FLTG's superiority over state-of-the-art methods under extreme bias scenarios and sustains robustness with a higher proportion(over 50%) of malicious clients.

Paperid: 2702, https://arxiv.org/pdf/2505.12770.pdf

Abstract:
Access-control misconfigurations are among the main causes of today's data breaches in web applications. However, few techniques are available to support automatic and systematic testing for access-control changes and detecting risky changes to prevent severe consequences. As a result, those critical security configurations often lack testing, or are tested manually in an ad hoc way. This paper advocates that tests should be made available for users to test access-control configuration changes. The key challenges are such tests need to be run with production environments (to reason end-to-end behavior) and need to be performance-efficient. We present a new approach to create such tests, as a mini test environment incorporating production program and data, called ACtests. ACtests report the impacts of access-control changes, namely the requests that were denied but would be allowed after a change, and vice versa. Users can validate if the changed requests are intended or not and identify potential security vulnerabilities. We evaluate ACtests with 193 public configurations of widely-used web applications on Dockerhub. ACtests detect 168 new vulnerabilities from 72 configuration images. We report them to the image maintainers: 54 of them have been confirmed and 44 have been fixed. We also conduct in-depth experiments with five real-world deployed systems, including Wikipedia and a commercial company's web proxy. Our results show that ACtests effectively and efficiently detect all the change impacts.

Paperid: 2703, https://arxiv.org/pdf/2505.12613.pdf

Abstract:
The United States Cyber Command (USCYBERCOM) Cyber Protection Condition (CPCON) framework mandates graduated security postures across Department of Defense (DoD) networks, but current implementation remains largely manual, inconsistent, and error-prone. This paper presents a prototype system for centralized orchestration of CPCON directives, enabling automated policy enforcement and real-time threat response across heterogeneous network environments. Building on prior work in host-based intrusion response, our system leverages a policy-driven orchestrator to standardize security actions, isolate compromised subnets, and verify enforcement status. We validate the system through emulated attack scenarios, demonstrating improved speed, accuracy, and verifiability in CPCON transitions with human-in-the-loop oversight.

Paperid: 2704, https://arxiv.org/pdf/2505.12541.pdf

Abstract:
We introduce a novel framework for differentially private (DP) statistical estimation via data truncation, addressing a key challenge in DP estimation when the data support is unbounded. Traditional approaches rely on problem-specific sensitivity analysis, limiting their applicability. By leveraging techniques from truncated statistics, we develop computationally efficient DP estimators for exponential family distributions, including Gaussian mean and covariance estimation, achieving near-optimal sample complexity. Previous works on exponential families only consider bounded or one-dimensional families. Our approach mitigates sensitivity through truncation while carefully correcting for the introduced bias using maximum likelihood estimation and DP stochastic gradient descent. Along the way, we establish improved uniform convergence guarantees for the log-likelihood function of exponential families, which may be of independent interest. Our results provide a general blueprint for DP algorithm design via truncated statistics.

Paperid: 2705, https://arxiv.org/pdf/2505.11744.pdf

Abstract:
We study multi-authority attribute-based functional encryption for noisy inner-product functionality, and propose two new primitives: (1) multi-authority attribute-based (noisy) inner-product functional encryption (MA-AB(N)IPFE), which generalizes existing multi-authority attribute-based IPFE schemes by Agrawal et al. (TCC'21), by enabling approximate inner-product computation; and (2) multi-authority attribute-based evasive inner-product functional encryption (MA-evIPFE), a relaxed variant inspired by the evasive IPFE framework by Hsieh et al. (EUROCRYPT'24), shifting focus from ciphertext indistinguishability to a more relaxed pseudorandomness-based security notion. To support the above notions, we introduce two variants of lattice-based computational assumptions: evasive IPFE assumption and indistinguishability-based evasive IPFE assumption (IND-evIPFE). We present lattice-based constructions of both primitives for subset policies, building upon the framework of Waters et al.( TCC'22). Our schemes are proven to be statically secure in the random oracle model under the standard LWE assumption and the newly introduced assumptions. Additionally, we show our MA-AB(N)IPFE scheme can be transformed via modulus switching into a noiseless MA-IPFE scheme that supports exact inner-product functionality. This yields the first lattice-based construction of such a primitive. All our schemes support arbitrary polynomial-size attribute policies and are secure in the random oracle model under lattice assumptions with a sub-exponential modulus-to-noise ratio, making them practical candidates for noise-tolerant, fine-grained access control in multi-authority settings.

Paperid: 2706, https://arxiv.org/pdf/2505.11551.pdf

Abstract:
Connected and Autonomous Vehicles (CAVs) enhance mobility but face cybersecurity threats, particularly through the insecure Controller Area Network (CAN) bus. Cyberattacks can have devastating consequences in connected vehicles, including the loss of control over critical systems, necessitating robust security solutions. In-vehicle Intrusion Detection Systems (IDSs) offer a promising approach by detecting malicious activities in real time. This survey provides a comprehensive review of state-of-the-art research on learning-based in-vehicle IDSs, focusing on Machine Learning (ML), Deep Learning (DL), and Federated Learning (FL) approaches. Based on the reviewed studies, we critically examine existing IDS approaches, categorising them by the types of attacks they detect - known, unknown, and combined known-unknown attacks - while identifying their limitations. We also review the evaluation metrics used in research, emphasising the need to consider multiple criteria to meet the requirements of safety-critical systems. Additionally, we analyse FL-based IDSs and highlight their limitations. By doing so, this survey helps identify effective security measures, address existing limitations, and guide future research toward more resilient and adaptive protection mechanisms, ensuring the safety and reliability of CAVs.

Paperid: 2707, https://arxiv.org/pdf/2505.11532.pdf

Abstract:
Autonomous driving systems (ADS) increasingly rely on deep learning-based perception models, which remain vulnerable to adversarial attacks. In this paper, we revisit adversarial attacks and defense methods, focusing on road sign recognition and lead object detection and prediction (e.g., relative distance). Using a Level-2 production ADS, OpenPilot by Comma$.$ai, and the widely adopted YOLO model, we systematically examine the impact of adversarial perturbations and assess defense techniques, including adversarial training, image processing, contrastive learning, and diffusion models. Our experiments highlight both the strengths and limitations of these methods in mitigating complex attacks. Through targeted evaluations of model robustness, we aim to provide deeper insights into the vulnerabilities of ADS perception systems and contribute guidance for developing more resilient defense strategies.

Paperid: 2708, https://arxiv.org/pdf/2505.11058.pdf

Abstract:
Homomorphic encryption provides many opportunities for privacy-aware processing, including with methods related to machine learning. Many of our existing cryptographic methods have been shown in the past to be susceptible to side channel attacks. With these, the implementation of the cryptographic methods can reveal information about the private keys used, the result, or even the original plaintext. An example of this includes the processing of the RSA exponent using the Montgomery method, and where 0's and 1's differ in their processing time for modular exponentiation. With FHE, we typically use lattice methods, and which can have particular problems in their implementation in relation to side channel leakage. This paper aims to outline a range of weaknesses within FHE implementations as related to side channel analysis. It outlines a categorization for side-channel analysis, some case studies, and mitigation strategies.

Paperid: 2709, https://arxiv.org/pdf/2505.10864.pdf

Abstract:
Recent advancements in Ultra-Wideband (UWB) radar technology have enabled contactless, non-line-of-sight vital sign monitoring, making it a valuable tool for healthcare. However, UWB radar's ability to capture sensitive physiological data, even through walls, raises significant privacy concerns, particularly in human-robot interactions and autonomous systems that rely on radar for sensing human presence and physiological functions. In this paper, we present Anti-Sensing, a novel defense mechanism designed to prevent unauthorized radar-based sensing. Our approach introduces physically realizable perturbations, such as oscillatory motion from wearable devices, to disrupt radar sensing by mimicking natural cardiac motion, thereby misleading heart rate (HR) estimations. We develop a gradient-based algorithm to optimize the frequency and spatial amplitude of these oscillations for maximal disruption while ensuring physiological plausibility. Through both simulations and real-world experiments with radar data and neural network-based HR sensing models, we demonstrate the effectiveness of Anti-Sensing in significantly degrading model accuracy, offering a practical solution for privacy preservation.

Paperid: 2710, https://arxiv.org/pdf/2505.10790.pdf

Abstract:
The study by Gohr et.al at CRYPTO 2019 and sunsequent related works have shown that neural networks can uncover previously unused features, offering novel insights into cryptanalysis. Motivated by these findings, we employ neural networks to learn features specifically related to integral properties and integrate the corresponding insights into optimized search frameworks. These findings validate the framework of using neural networks for feature exploration, providing researchers with novel insights that advance established cryptanalysis methods. Neural networks have inspired the development of more precise integral search models. By comparing the integral distinguishers obtained via neural networks with those identified by classical methods, we observe that existing automated search models often fail to find optimal distinguishers. To address this issue, we develop a meet in the middle search framework that balances model accuracy and computational efficiency. As a result, we reduce the number of active plaintext bits required for an 11 rounds integral distinguisher on SKINNY64/64, and further identify a 12 rounds key dependent integral distinguisher achieving one additional round over the previous best-known result. The integral distinguishers discovered by neural networks enable key recovery attacks on more rounds. We identify a 7 rounds key independent integral distinguisher from neural networks with even only one active plaintext cell, which is based on linear combinations of bits. This distinguisher enables a 15 rounds key recovery attack on SKINNYn/n, improving upon the previous record by one round. Additionally, we discover an 8 rounds key dependent integral distinguisher using neural network that further reduces the time complexity of key recovery attacks against SKINNY.

Paperid: 2711, https://arxiv.org/pdf/2505.10585.pdf

Abstract:
Malicious Unmanned Aerial Vehicles (UAVs) present a significant threat to next-generation networks (NGNs), posing risks such as unauthorized surveillance, data theft, and the delivery of hazardous materials. This paper proposes an integrated (AE)-classifier system to detect malicious UAVs. The proposed AE, based on a 4-layer Tri-orientated Spatial Mamba (TSMamba) architecture, effectively captures complex spatial relationships crucial for identifying malicious UAV activities. The first phase involves generating residual values through the AE, which are subsequently processed by a ResNet-based classifier. This classifier leverages the residual values to achieve lower complexity and higher accuracy. Our experiments demonstrate significant improvements in both binary and multi-class classification scenarios, achieving up to 99.8 % recall compared to 96.7 % in the benchmark. Additionally, our method reduces computational complexity, making it more suitable for large-scale deployment. These results highlight the robustness and scalability of our approach, offering an effective solution for malicious UAV detection in NGN environments.

Paperid: 2712, https://arxiv.org/pdf/2505.10316.pdf

Abstract:
Aggregate signatures are digital signatures that compress multiple signatures from different parties into a single signature, thereby reducing storage and bandwidth requirements. BLS aggregate signatures are a popular kind of aggregate signature, deployed by Ethereum, Dfinity, and Cloudflare amongst others, currently undergoing standardization at the IETF. However, BLS aggregate signatures are difficult to use correctly, with nuanced requirements that must be carefully handled by protocol developers. In this work, we design the first models of aggregate signatures that enable formal verification tools, such as Tamarin and ProVerif, to be applied to protocols using these signatures. We introduce general models that are based on the cryptographic security definition of generic aggregate signatures, allowing the attacker to exploit protocols where the security requirements are not satisfied. We also introduce a second family of models formalizing BLS aggregate signatures in particular. We demonstrate our approach's practical relevance by modelling and analyzing in Tamarin a device attestation protocol called SANA. Despite SANA's claimed correctness proof, with Tamarin we uncover undocumented assumptions that, when omitted, lead to attacks.

Paperid: 2713, https://arxiv.org/pdf/2505.10066.pdf

Abstract:
Large Language Models (LLMs) rapidly reshape modern life, advancing fields from healthcare to education and beyond. However, alongside their remarkable capabilities lies a significant threat: the susceptibility of these models to jailbreaking. The fundamental vulnerability of LLMs to jailbreak attacks stems from the very data they learn from. As long as this training data includes unfiltered, problematic, or 'dark' content, the models can inherently learn undesirable patterns or weaknesses that allow users to circumvent their intended safety controls. Our research identifies the growing threat posed by dark LLMs models deliberately designed without ethical guardrails or modified through jailbreak techniques. In our research, we uncovered a universal jailbreak attack that effectively compromises multiple state-of-the-art models, enabling them to answer almost any question and produce harmful outputs upon request. The main idea of our attack was published online over seven months ago. However, many of the tested LLMs were still vulnerable to this attack. Despite our responsible disclosure efforts, responses from major LLM providers were often inadequate, highlighting a concerning gap in industry practices regarding AI safety. As model training becomes more accessible and cheaper, and as open-source LLMs proliferate, the risk of widespread misuse escalates. Without decisive intervention, LLMs may continue democratizing access to dangerous knowledge, posing greater risks than anticipated.

Paperid: 2714, https://arxiv.org/pdf/2505.09342.pdf

Abstract:
Machine learning is a key tool for Android malware detection, effectively identifying malicious patterns in apps. However, ML-based detectors are vulnerable to evasion attacks, where small, crafted changes bypass detection. Despite progress in adversarial defenses, the lack of comprehensive evaluation frameworks in binary-constrained domains limits understanding of their robustness. We introduce two key contributions. First, Prioritized Binary Rounding, a technique to convert continuous perturbations into binary feature spaces while preserving high attack success and low perturbation size. Second, the sigma-binary attack, a novel adversarial method for binary domains, designed to achieve attack goals with minimal feature changes. Experiments on the Malscan dataset show that sigma-binary outperforms existing attacks and exposes key vulnerabilities in state-of-the-art defenses. Defenses equipped with adversary detectors, such as KDE, DLA, DNN+, and ICNN, exhibit significant brittleness, with attack success rates exceeding 90% using fewer than 10 feature modifications and reaching 100% with just 20. Adversarially trained defenses, including AT-rFGSM-k, AT-MaxMA, improves robustness under small budgets but remains vulnerable to unrestricted perturbations, with attack success rates of 99.45% and 96.62%, respectively. Although PAD-SMA demonstrates strong robustness against state-of-the-art gradient-based adversarial attacks by maintaining an attack success rate below 16.55%, the sigma-binary attack significantly outperforms these methods, achieving a 94.56% success rate under unrestricted perturbations. These findings highlight the critical need for precise method like sigma-binary to expose hidden vulnerabilities in existing defenses and support the development of more resilient malware detection systems.

Paperid: 2715, https://arxiv.org/pdf/2505.09313.pdf

Abstract:
Sybil attacks pose a significant security threat to blockchain ecosystems, particularly in token airdrop events. This paper proposes a novel sybil address identification method based on subgraph feature extraction lightGBM. The method first constructs a two-layer deep transaction subgraph for each address, then extracts key event operation features according to the lifecycle of sybil addresses, including the time of first transaction, first gas acquisition, participation in airdrop activities, and last transaction. These temporal features effectively capture the consistency of sybil address behavior operations. Additionally, the method extracts amount and network structure features, comprehensively describing address behavior patterns and network topology through feature propagation and fusion. Experiments conducted on a dataset containing 193,701 addresses (including 23,240 sybil addresses) show that this method outperforms existing approaches in terms of precision, recall, F1 score, and AUC, with all metrics exceeding 0.9. The methods and results of this study can be further applied to broader blockchain security areas such as transaction manipulation identification and token liquidity risk assessment, contributing to the construction of a more secure and fair blockchain ecosystem.

Paperid: 2716, https://arxiv.org/pdf/2505.09221.pdf

Abstract:
Software-Defined Networking (SDN) has transformed network architectures by decoupling the control and data-planes, enabling fine-grained control over packet processing and forwarding. P4, a language designed for programming data-plane devices, allows developers to define custom packet processing behaviors directly on programmable network devices. This provides greater control over packet forwarding, inspection, and modification. However, the increased flexibility provided by P4 also brings significant security challenges, particularly in managing sensitive data and preventing information leakage within the data-plane. This paper presents a novel security type system for analyzing information flow in P4 programs that combines security types with interval analysis. The proposed type system allows the specification of security policies in terms of input and output packet bit fields rather than program variables. We formalize this type system and prove it sound, guaranteeing that well-typed programs satisfy noninterference. Our prototype implementation, Tap4s, is evaluated on several use cases, demonstrating its effectiveness in detecting security violations and information leakages.

Paperid: 2717, https://arxiv.org/pdf/2505.08541.pdf

Abstract:
Memory safety is a critical concern for modern embedded systems, particularly in security-sensitive applications. This paper explores the area impact of adding memory safety extensions to the Ibex RISC-V core, focusing on physical memory protection (PMP) and Capability Hardware Extension to RISC-V for Internet of Things (CHERIoT). We synthesise the extended Ibex cores using a commercial tool targeting the open FreePDK45 process and provide a detailed area breakdown and discussion of the results. The PMP configuration we consider is one with 16 PMP regions. We find that the extensions increase the core size by 24 thousand gate-equivalent (kGE) for PMP and 33 kGE for CHERIoT. The increase is mainly due to the additional state required to store information about protected memory. While this increase amounts to 42% for PMP and 57% for CHERIoT in Ibex's area, its effect on the overall system is minimal. In a complete system-on-chip (SoC), like the secure microcontroller OpenTitan Earl Grey, where the core represents only a fraction of the total area, the estimated system-wide overhead is 0.6% for PMP and 1% for CHERIoT. Given the security benefits these extensions provide, the area trade-off is justified, making Ibex a compelling choice for secure embedded applications.

Paperid: 2718, https://arxiv.org/pdf/2505.08209.pdf

Abstract:
Attribute-Based Access Control (ABAC) provides expressiveness and flexibility, making it a compelling model for enforcing fine-grained access control policies. To facilitate the transition to ABAC, extensive research has been conducted to develop methodologies, frameworks, and tools that assist policy administrators in adapting the model. Despite these efforts, challenges remain in the availability and benchmarking of ABAC datasets. Specifically, there is a lack of clarity on how datasets can be systematically acquired, no standardized benchmarking practices to evaluate existing methodologies and their effectiveness, and limited access to real-world datasets suitable for policy analysis and testing. This paper introduces ABAC Lab, an interactive platform that addresses these challenges by integrating existing ABAC policy datasets with analytical tools for policy evaluation. Additionally, we present two new ABAC datasets derived from real-world case studies. ABAC Lab serves as a valuable resource for both researchers studying ABAC policies and policy administrators seeking to adopt ABAC within their organizations. By offering an environment for dataset exploration and policy analysis, ABAC Lab facilitates research, aids policy administrators in transitioning to ABAC, and promotes a more structured approach to ABAC policy evaluation and development.

Paperid: 2719, https://arxiv.org/pdf/2505.08138.pdf

Abstract:
Machine unlearning methods take a model trained on a dataset and a forget set, then attempt to produce a model as if it had only been trained on the examples not in the forget set. We empirically show that an adversary is able to distinguish between a mirror model (a control model produced by retraining without the data to forget) and a model produced by an unlearning method across representative unlearning methods from the literature. We build distinguishing algorithms based on evaluation scores in the literature (i.e. membership inference scores) and Kullback-Leibler divergence. We propose a strong formal definition for machine unlearning called computational unlearning. Computational unlearning is defined as the inability for an adversary to distinguish between a mirror model and a model produced by an unlearning method. If the adversary cannot guess better than random (except with negligible probability), then we say that an unlearning method achieves computational unlearning. Our computational unlearning definition provides theoretical structure to prove unlearning feasibility results. For example, our computational unlearning definition immediately implies that there are no deterministic computational unlearning methods for entropic learning algorithms. We also explore the relationship between differential privacy (DP)-based unlearning methods and computational unlearning, showing that DP-based approaches can satisfy computational unlearning at the cost of an extreme utility collapse. These results demonstrate that current methodology in the literature fundamentally falls short of achieving computational unlearning. We conclude by identifying several open questions for future work.

Paperid: 2720, https://arxiv.org/pdf/2505.07158.pdf

Abstract:
Despite the widespread adoption of Shannon's confusion-diffusion architecture in image encryption, the implementation of diffusion to sequentially establish inter-pixel dependencies for attaining plaintext sensitivity constrains algorithmic parallelism, while the execution of multiple rounds of diffusion operations to meet the required sensitivity metrics incurs excessive computational overhead. Consequently, the pursuit of plaintext sensitivity through diffusion operations is the primary factor limiting the computational efficiency and throughput of video encryption algorithms, rendering them inadequate to meet the demands of real-time encryption for high-resolution video. To address the performance limitation, this paper proposes a real-time video encryption protocol based on heterogeneous parallel computing, which incorporates the SHA-256 hashes of original frames as input, employs multiple CPU threads to concurrently generate encryption-related data, and deploys numerous GPU threads to simultaneously encrypt pixels. By leveraging the extreme input sensitivity of the SHA hash, the proposed protocol achieves the required plaintext sensitivity metrics with only a single round of confusion and XOR operations, significantly reducing computational overhead. Furthermore, through eliminating the reliance on diffusion, it realizes the allocation of a dedicated GPU thread for encrypting each pixel within every channel, effectively enhancing algorithm's parallelism. The experimental results demonstrate that our approach not only exhibits superior statistical properties and robust security but also achieving delay-free bit-level encryption for 1920$\times$1080 resolution (full high definition) video at 30 FPS, with an average encryption time of 25.84 ms on a server equipped with an Intel Xeon Gold 6226R CPU and an NVIDIA GeForce RTX 3090 GPU.

Paperid: 2721, https://arxiv.org/pdf/2505.06394.pdf

Abstract:
Security Operations Centers (SOCs) face growing challenges in managing cybersecurity threats due to an overwhelming volume of alerts, a shortage of skilled analysts, and poorly integrated tools. Human-AI collaboration offers a promising path to augment the capabilities of SOC analysts while reducing their cognitive overload. To this end, we introduce an AI-driven human-machine co-teaming paradigm that leverages large language models (LLMs) to enhance threat intelligence, alert triage, and incident response workflows. We present a vision in which LLM-based AI agents learn from human analysts the tacit knowledge embedded in SOC operations, enabling the AI agents to improve their performance on SOC tasks through this co-teaming. We invite SOCs to collaborate with us to further develop this process and uncover replicable patterns where human-AI co-teaming yields measurable improvements in SOC productivity.

Paperid: 2722, https://arxiv.org/pdf/2505.06384.pdf

Abstract:
In academic settings, the demanding environment often forces students to prioritize academic performance over their physical well-being. Moreover, privacy concerns and the inherent risk of data breaches hinder the deployment of traditional machine learning techniques for addressing these health challenges. In this study, we introduce RiM: Record, Improve, and Maintain, a mobile application which incorporates a novel personalized machine learning framework that leverages federated learning to enhance students' physical well-being by analyzing their lifestyle habits. Our approach involves pre-training a multilayer perceptron (MLP) model on a large-scale simulated dataset to generate personalized recommendations. Subsequently, we employ federated learning to fine-tune the model using data from IISER Bhopal students, thereby ensuring its applicability in real-world scenarios. The federated learning approach guarantees differential privacy by exclusively sharing model weights rather than raw data. Experimental results show that the FedAvg-based RiM model achieves an average accuracy of 60.71% and a mean absolute error of 0.91--outperforming the FedPer variant (average accuracy 46.34%, MAE 1.19)--thereby demonstrating its efficacy in predicting lifestyle deficits under privacy-preserving constraints.

Paperid: 2723, https://arxiv.org/pdf/2505.05872.pdf

Abstract:
Split Learning (SL) has emerged as a promising paradigm for distributed deep learning, allowing resource-constrained clients to offload portions of their model computation to servers while maintaining collaborative learning. However, recent research has demonstrated that SL remains vulnerable to a range of privacy and security threats, including information leakage, model inversion, and adversarial attacks. While various defense mechanisms have been proposed, a systematic understanding of the attack landscape and corresponding countermeasures is still lacking. In this study, we present a comprehensive taxonomy of attacks and defenses in SL, categorizing them along three key dimensions: employed strategies, constraints, and effectiveness. Furthermore, we identify key open challenges and research gaps in SL based on our systematization, highlighting potential future directions.

Paperid: 2724, https://arxiv.org/pdf/2505.05648.pdf

Abstract:
In this paper we train a transformer using differential privacy (DP) for language modeling in SwiftKey. We run multiple experiments to balance the trade-off between the model size, run-time speed and accuracy. We show that we get small and consistent gains in the next-word-prediction and accuracy with graceful increase in memory and speed compared to the production GRU. This is obtained by scaling down a GPT2 architecture to fit the required size and a two stage training process that builds a seed model on general data and DP finetunes it on typing data. The transformer is integrated using ONNX offering both flexibility and efficiency.

Paperid: 2725, https://arxiv.org/pdf/2505.05190.pdf

Abstract:
Text watermarking aims to subtly embed statistical signals into text by controlling the Large Language Model (LLM)'s sampling process, enabling watermark detectors to verify that the output was generated by the specified model. The robustness of these watermarking algorithms has become a key factor in evaluating their effectiveness. Current text watermarking algorithms embed watermarks in high-entropy tokens to ensure text quality. In this paper, we reveal that this seemingly benign design can be exploited by attackers, posing a significant risk to the robustness of the watermark. We introduce a generic efficient paraphrasing attack, the Self-Information Rewrite Attack (SIRA), which leverages the vulnerability by calculating the self-information of each token to identify potential pattern tokens and perform targeted attack. Our work exposes a widely prevalent vulnerability in current watermarking algorithms. The experimental results show SIRA achieves nearly 100% attack success rates on seven recent watermarking methods with only 0.88 USD per million tokens cost. Our approach does not require any access to the watermark algorithms or the watermarked LLM and can seamlessly transfer to any LLM as the attack model, even mobile-level models. Our findings highlight the urgent need for more robust watermarking.

Paperid: 2726, https://arxiv.org/pdf/2505.04466.pdf

Abstract:
Delivering high-quality, secure 360Â° video content introduces unique challenges, primarily due to the high bitrates and interactive demands of immersive media. Traditional HTTPS-based methods, although widely used, face limitations in computational efficiency and scalability when securing these high-resolution streams. To address these issues, this paper proposes a novel framework integrating Attribute-Based Encryption (ABE) with selective encryption techniques tailored specifically for tiled 360Â° video streaming. Our approach employs selective encryption of frames at varying levels to reduce computational overhead while ensuring robust protection against unauthorized access. Moreover, we explore viewport-adaptive encryption, dynamically encrypting more frames within tiles occupying larger portions of the viewer's field of view. This targeted method significantly enhances security in critical viewing areas without unnecessary overhead in peripheral regions. We deploy and evaluate our proposed approach using the CloudLab testbed, comparing its performance against traditional HTTPS streaming. Experimental results demonstrate that our ABE-based model achieves reduced computational load on intermediate caches, improves cache hit rates, and maintains comparable visual quality to HTTPS, as assessed by Video Multimethod Assessment Fusion (VMAF).

Paperid: 2727, https://arxiv.org/pdf/2505.03796.pdf

Abstract:
Insider threats pose a significant challenge to organizational security, often evading traditional rule-based detection systems due to their subtlety and contextual nature. This paper presents an AI-powered Insider Risk Management (IRM) system that integrates behavioral analytics, dynamic risk scoring, and real-time policy enforcement to detect and mitigate insider threats with high accuracy and adaptability. We introduce a hybrid scoring mechanism - transitioning from the static PRISM model to an adaptive AI-based model utilizing an autoencoder neural network trained on expert-annotated user activity data. Through iterative feedback loops and continuous learning, the system reduces false positives by 59% and improves true positive detection rates by 30%, demonstrating substantial gains in detection precision. Additionally, the platform scales efficiently, processing up to 10 million log events daily with sub-300ms query latency, and supports automated enforcement actions for policy violations, reducing manual intervention. The IRM system's deployment resulted in a 47% reduction in incident response times, highlighting its operational impact. Future enhancements include integrating explainable AI, federated learning, graph-based anomaly detection, and alignment with Zero Trust principles to further elevate its adaptability, transparency, and compliance-readiness. This work establishes a scalable and proactive framework for mitigating emerging insider risks in both on-premises and hybrid environments.

Paperid: 2728, https://arxiv.org/pdf/2505.03742.pdf

Abstract:
Advancements in AI capabilities, driven in large part by scaling up computing resources used for AI training, have created opportunities to address major global challenges but also pose risks of misuse. Hardware-enabled mechanisms (HEMs) can support responsible AI development by enabling verifiable reporting of key properties of AI training activities such as quantity of compute used, training cluster configuration or location, as well as policy enforcement. Such tools can promote transparency and improve security, while addressing privacy and intellectual property concerns. Based on insights from an interdisciplinary workshop, we identify open questions regarding potential implementation approaches, emphasizing the need for further research to ensure robust, scalable solutions.

Paperid: 2729, https://arxiv.org/pdf/2505.03639.pdf

Abstract:
The analysis of network assortativity is of great importance for understanding the structural characteristics of and dynamics upon networks. Often, network assortativity is quantified using the assortativity coefficient that is defined based on the Pearson correlation coefficient between vertex degrees. It is well known that a network may contain sensitive information, such as the number of friends of an individual in a social network (which is abstracted as the degree of vertex.). So, the computation of the assortativity coefficient leads to privacy leakage, which increases the urgent need for privacy-preserving protocol. However, there has been no scheme addressing the concern above. To bridge this gap, in this work, we are the first to propose approaches based on differential privacy (DP for short). Specifically, we design three DP-based algorithms: $Local_{ru}$, $Shuffle_{ru}$, and $Decentral_{ru}$. The first two algorithms, based on Local DP (LDP) and Shuffle DP respectively, are designed for settings where each individual only knows his/her direct friends. In contrast, the third algorithm, based on Decentralized DP (DDP), targets scenarios where each individual has a broader view, i.e., also knowing his/her friends' friends. Theoretically, we prove that each algorithm enables an unbiased estimation of the assortativity coefficient of the network. We further evaluate the performance of the proposed algorithms using mean squared error (MSE), showing that $Shuffle_{ru}$ achieves the best performance, followed by $Decentral_{ru}$, with $Local_{ru}$ performing the worst. Note that these three algorithms have different assumptions, so each has its applicability scenario. Lastly, we conduct extensive numerical simulations, which demonstrate that the presented approaches are adequate to achieve the estimation of network assortativity under the demand for privacy protection.

Paperid: 2730, https://arxiv.org/pdf/2505.03439.pdf

Abstract:
The potential for large language models (LLMs) to hide messages within plain text (steganography) poses a challenge to detection and thwarting of unaligned AI agents, and undermines faithfulness of LLMs reasoning. We explore the steganographic capabilities of LLMs fine-tuned via reinforcement learning (RL) to: (1) develop covert encoding schemes, (2) engage in steganography when prompted, and (3) utilize steganography in realistic scenarios where hidden reasoning is likely, but not prompted. In these scenarios, we detect the intention of LLMs to hide their reasoning as well as their steganography performance. Our findings in the fine-tuning experiments as well as in behavioral non fine-tuning evaluations reveal that while current models exhibit rudimentary steganographic abilities in terms of security and capacity, explicit algorithmic guidance markedly enhances their capacity for information concealment.

Paperid: 2731, https://arxiv.org/pdf/2505.03120.pdf

Abstract:
Machine learning (ML)-based intrusion detection systems (IDS) are vulnerable to adversarial attacks. It is crucial for an IDS to learn to recognize adversarial examples before malicious entities exploit them. In this paper, we generated adversarial samples using the Jacobian Saliency Map Attack (JSMA). We validate the generalization and scalability of the adversarial samples to tackle a broad range of real attacks on Industrial Control Systems (ICS). We evaluated the impact by assessing multiple attacks generated using the proposed method. The model trained with adversarial samples detected attacks with 95% accuracy on real-world attack data not used during training. The study was conducted using an operational secure water treatment (SWaT) testbed.

Paperid: 2732, https://arxiv.org/pdf/2505.02725.pdf

Abstract:
Acoustic Side-Channel Attacks (ASCAs) extract sensitive information by using audio emitted from a computing devices and their peripherals. Attacks targeting keyboards are popular and have been explored in the literature. However, similar attacks targeting other human interface peripherals, such as computer mice, are under-explored. To this end, this paper considers security leakage via acoustic signals emanating from normal mouse usage. We first confirm feasibility of such attacks by showing a proof-of-concept attack that classifies four mouse movements with 97% accuracy in a controlled environment. We then evolve the attack towards discerning twelve unique mouse movements using a smartphone to record the experiment. Using Machine Learning (ML) techniques, the model is trained on an experiment with six participants to be generalizable and discern among twelve movements with 94% accuracy. In addition, we experiment with an attack that detects a user action of closing a full-screen window on a laptop. Achieving an accuracy of 91%, this experiment highlights exploiting audio leakage from computer mouse movements in a realistic scenario.

Paperid: 2733, https://arxiv.org/pdf/2505.02224.pdf

Abstract:
A decision tree is an easy-to-understand tool that has been widely used for classification tasks. On the one hand, due to privacy concerns, there has been an urgent need to create privacy-preserving classifiers that conceal the user's input from the classifier. On the other hand, with the rise of cloud computing, data owners are keen to reduce risk by outsourcing their model, but want security guarantees that third parties cannot steal their decision tree model. To address these issues, Joye and Salehi introduced a theoretical protocol that efficiently evaluates decision trees while maintaining privacy by leveraging their comparison protocol that is resistant to timing attacks. However, their approach was not only inefficient but also prone to side-channel attacks. Therefore, in this paper, we propose a new decision tree inference protocol in which the model is shared and evaluated among multiple entities. We partition our decision tree model by each level to be stored in a new entity we refer to as a "level-site." Utilizing this approach, we were able to gain improved average run time for classifier evaluation for a non-complete tree, while also having strong mitigations against side-channel attacks.

Paperid: 2734, https://arxiv.org/pdf/2505.01873.pdf

Abstract:
Attribute-Based Access Control (ABAC) enables highly expressive and flexible access decisions by considering a wide range of contextual attributes. ABAC policies use logical expressions that combine these attributes, allowing for precise and context-aware control. Algorithms that mine ABAC policies from legacy access control systems can significantly reduce the costs associated with migrating to ABAC. However, a major challenge in this process is handling incomplete entity information, where some attribute values are missing. This paper introduces an approach that enhances the policy mining process by predicting or inferring missing attribute values. This is accomplished by employing a contextual clustering technique that groups entities according to their known attributes, which are then used to analyze and refine authorization decisions. By effectively managing incomplete data, our approach provides security administrators with a valuable tool to improve their attribute data and ensure a smoother, more efficient transition to ABAC.

Paperid: 2735, https://arxiv.org/pdf/2505.01845.pdf

Abstract:
Efficient scalar multiplication is critical for enhancing the performance of elliptic curve cryptography (ECC), especially in applications requiring large-scale or real-time cryptographic operations. This paper proposes an M-ary precomputation-based scalar multiplication algorithm, aiming to optimize both computational efficiency and memory usage. The method reduces the time complexity from $Î(Q \log p)$ to $Î\left(\frac{Q \log p}{\log Q}\right)$ and achieves a memory complexity of $Î\left(\frac{Q \log p}{\log^2 Q}\right)$. Experiments on ElGamal encryption and NS3-based communication simulations validate its effectiveness. On secp256k1, the proposed method achieves up to a 59\% reduction in encryption time and 30\% memory savings. In network simulations, the binary-optimized variant reduces communication time by 22.1\% on secp384r1 and simulation time by 25.4\% on secp521r1. The results demonstrate the scalability, efficiency, and practical applicability of the proposed algorithm. The source code will be publicly released upon acceptance.

Paperid: 2736, https://arxiv.org/pdf/2505.01538.pdf

Abstract:
As vector databases gain traction in enterprise applications, robust access control has become critical to safeguard sensitive data. Access control in these systems is often implemented through hybrid vector queries, which combine nearest neighbor search on vector data with relational predicates based on user permissions. However, existing approaches face significant trade-offs: creating dedicated indexes for each user minimizes query latency but introduces excessive storage redundancy, while building a single index and applying access control after vector search reduces storage overhead but suffers from poor recall and increased query latency. This paper introduces HoneyBee, a dynamic partitioning framework that bridges the gap between these approaches by leveraging the structure of Role-Based Access Control (RBAC) policies. RBAC, widely adopted in enterprise settings, groups users into roles and assigns permissions to those roles, creating a natural "thin waist" in the permission structure that is ideal for partitioning decisions. Specifically, HoneyBee produces overlapping partitions where vectors can be strategically replicated across different partitions to reduce query latency while controlling storage overhead. By introducing analytical models for the performance and recall of the vector search, HoneyBee formulates the partitioning strategy as a constrained optimization problem to dynamically balance storage, query efficiency, and recall. Evaluations on RBAC workloads demonstrate that HoneyBee reduces storage redundancy compared to role partitioning and achieves up to 6x faster query speeds than row-level security (RLS) with only 1.4x storage increase, offering a practical middle ground for secure and efficient vector search.

Paperid: 2737, https://arxiv.org/pdf/2505.01524.pdf

Abstract:
Synthetic data has become an increasingly popular way to share data without revealing sensitive information. Though Membership Inference Attacks (MIAs) are widely considered the gold standard for empirically assessing the privacy of a synthetic dataset, practitioners and researchers often rely on simpler proxy metrics such as Distance to Closest Record (DCR). These metrics estimate privacy by measuring the similarity between the training data and generated synthetic data. This similarity is also compared against that between the training data and a disjoint holdout set of real records to construct a binary privacy test. If the synthetic data is not more similar to the training data than the holdout set is, it passes the test and is considered private. In this work we show that, while computationally inexpensive, DCR and other distance-based metrics fail to identify privacy leakage. Across multiple datasets and both classical models such as Baynet and CTGAN and more recent diffusion models, we show that datasets deemed private by proxy metrics are highly vulnerable to MIAs. We similarly find both the binary privacy test and the continuous measure based on these metrics to be uninformative of actual membership inference risk. We further show that these failures are consistent across different metric hyperparameter settings and record selection methods. Finally, we argue DCR and other distance-based metrics to be flawed by design and show a example of a simple leakage they miss in practice. With this work, we hope to motivate practitioners to move away from proxy metrics to MIAs as the rigorous, comprehensive standard of evaluating privacy of synthetic data, in particular to make claims of datasets being legally anonymous.

Paperid: 2738, https://arxiv.org/pdf/2505.01514.pdf

Abstract:
The rapid digitalization of communication systems has elevated Interactive Voice Response (IVR) technologies to become critical interfaces for customer engagement. With Artificial Intelligence (AI) now driving these platforms, ensuring secure, compliant, and ethically designed development practices is more imperative than ever. AI-powered IVRs leverage Natural Language Processing (NLP) and Machine Learning (ML) to personalize interactions, automate service delivery, and optimize user experiences. However, these innovations expose systems to heightened risks, including data privacy breaches, AI decision opacity, and model security vulnerabilities. This paper analyzes the evolution of IVRs from static code-based designs to adaptive AI-driven systems, presenting a cybersecurity-centric perspective. We propose a practical governance framework that embeds agile security principles, compliance with global data legislation, and user-centric ethics. Emphasizing privacy-by-design, adaptive risk modeling, and transparency, the paper argues that ethical AI integration is not a feature but a strategic imperative. Through this multidimensional lens, we highlight how modern IVRs can transition from communication tools to intelligent, secure, and accountable digital frontlines-resilient against emerging threats and aligned with societal expectations.

Paperid: 2739, https://arxiv.org/pdf/2505.01436.pdf

Abstract:
In this paper, we present the principles of designing new self-organising and autonomous management protocol to govern the dynamics of bio-inspired decentralized firewall architecture based on Biological Regularity Networks. The new architecture called Firewall Regulatory Networks (FRN) exhibits the following features (1) automatic rule policy configuration with provable utility-risk appetite guarantee, (2) resilient response for changing risks or new service requirements, and (3) globally optimized access control policy reconciliation. We present the FRN protocol and formalize the constraints to synthesize the undetermined components in the protocol to produce interactions that can achieve these objectives. We illustrate the feasibility of the FRN architecture in multiple case studies.

Paperid: 2740, https://arxiv.org/pdf/2505.01328.pdf

Abstract:
While machine learning has significantly advanced Network Intrusion Detection Systems (NIDS), particularly within IoT environments where devices generate large volumes of data and are increasingly susceptible to cyber threats, these models remain vulnerable to adversarial attacks. Our research reveals a critical flaw in existing adversarial attack methodologies: the frequent violation of domain-specific constraints, such as numerical and categorical limits, inherent to IoT and network traffic. This leads to up to 80.3% of adversarial examples being invalid, significantly overstating real-world vulnerabilities. These invalid examples, though effective in fooling models, do not represent feasible attacks within practical IoT deployments. Consequently, relying on these results can mislead resource allocation for defense, inflating the perceived susceptibility of IoT-enabled NIDS models to adversarial manipulation. Furthermore, we demonstrate that simpler surrogate models like Multi-Layer Perceptron (MLP) generate more valid adversarial examples compared to complex architectures such as CNNs and LSTMs. Using the MLP as a surrogate, we analyze the transferability of adversarial severity to other ML/DL models commonly used in IoT contexts. This work underscores the importance of considering both domain constraints and model architecture when evaluating and designing robust ML/DL models for security-critical IoT and network applications.

Paperid: 2741, https://arxiv.org/pdf/2505.01065.pdf

Abstract:
Large Language Models (LLMs) have demonstrated remarkable capabilities in code-related tasks, raising concerns about their potential for automated exploit generation (AEG). This paper presents the first systematic study on LLMs' effectiveness in AEG, evaluating both their cooperativeness and technical proficiency. To mitigate dataset bias, we introduce a benchmark with refactored versions of five software security labs. Additionally, we design an LLM-based attacker to systematically prompt LLMs for exploit generation. Our experiments reveal that GPT-4 and GPT-4o exhibit high cooperativeness, comparable to uncensored models, while Llama3 is the most resistant. However, no model successfully generates exploits for refactored labs, though GPT-4o's minimal errors highlight the potential for LLM-driven AEG advancements.

Paperid: 2742, https://arxiv.org/pdf/2505.00894.pdf

Abstract:
The power of adaptivity in algorithms has been intensively studied in diverse areas of theoretical computer science. In this paper, we obtain a number of sharp lower bound results which show that adaptivity provides a significant extra power in cryptanalytic time-space tradeoffs with (possibly unlimited) preprocessing time. Most notably, we consider the discrete logarithm (DLOG) problem in a generic group of $N$ elements. The classical `baby-step giant-step' algorithm for the problem has time complexity $T=O(\sqrt{N})$, uses $O(\sqrt{N})$ bits of space (up to logarithmic factors in $N$) and achieves constant success probability. We examine a generalized setting where an algorithm obtains an advice string of $S$ bits and is allowed to make $T$ arbitrary non-adaptive queries that depend on the advice string (but not on the challenge group element). We show that in this setting, the $T=O(\sqrt{N})$ online time complexity of the baby-step giant-step algorithm cannot be improved, unless the advice string is more than $Î©(\sqrt{N})$ bits long. This lies in stark contrast with the classical adaptive Pollard's rho algorithm for DLOG, which can exploit preprocessing to obtain the tradeoff curve $ST^2=O(N)$. We obtain similar sharp lower bounds for several other cryptanalytic problems. To obtain our results, we present a new model that allows analyzing non-adaptive preprocessing algorithms for a wide array of search and decision problems in a unified way. Since previous proof techniques inherently cannot distinguish between adaptive and non-adaptive algorithms for the problems in our model, they cannot be used to obtain our results. Consequently, our proof uses a variant of Shearer's lemma for this setting, due to Barthe, Cordero-Erausquin, Ledoux, and Maurey (2011). This seems to be the first time a variant of Shearer's lemma for permutations is used in an algorithmic context.

Paperid: 2743, https://arxiv.org/pdf/2504.21680.pdf

Abstract:
Retrieval-Augmented Generation (RAG) integrates Large Language Models (LLMs) with external knowledge bases, improving output quality while introducing new security risks. Existing studies on RAG vulnerabilities typically focus on exploiting the retrieval mechanism to inject erroneous knowledge or malicious texts, inducing incorrect outputs. However, these approaches overlook critical weaknesses within LLMs, leaving important attack vectors unexplored and limiting the scope and efficiency of attacks. In this paper, we uncover a novel vulnerability: the safety guardrails of LLMs, while designed for protection, can also be exploited as an attack vector by adversaries. Building on this vulnerability, we propose MutedRAG, a novel denial-of-service attack that reversely leverages the guardrails of LLMs to undermine the availability of RAG systems. By injecting minimalistic jailbreak texts, such as "\textit{How to build a bomb}", into the knowledge base, MutedRAG intentionally triggers the LLM's safety guardrails, causing the system to reject legitimate queries. Besides, due to the high sensitivity of guardrails, a single jailbreak sample can affect multiple queries, effectively amplifying the efficiency of attacks while reducing their costs. Experimental results on three datasets demonstrate that MutedRAG achieves an attack success rate exceeding 60% in many scenarios, requiring only less than one malicious text to each target query on average. In addition, we evaluate potential defense strategies against MutedRAG, finding that some of current mechanisms are insufficient to mitigate this threat, underscoring the urgent need for more robust solutions.

Paperid: 2744, https://arxiv.org/pdf/2504.21041.pdf

Abstract:
Nowadays, due to the growing phenomenon of forgery in many fields, the interest in developing new anti-counterfeiting device and cryptography keys, based on the Physical Unclonable Functions (PUFs) paradigm, is widely increased. PUFs are physical hardware with an intrinsic, irreproducible disorder that allows for on-demand cryptographic key extraction. Among them, optical PUF are characterized by a large number of degrees of freedom resulting in higher security and higher sensitivity to environmental conditions. While these promising features led to the growth of advanced fabrication strategies and materials for new PUF devices, their combination with robust recognition algorithm remains largely unexplored. In this work, we present a metric-independent authentication approach that leverages the Scale Invariant Feature Transform (SIFT) algorithm to extract unique and invariant features from the speckle patterns generated by optical Physical Unclonable Functions (PUFs). The application of SIFT to the challenge response pairs (CRPs) protocol allows us to correctly authenticate a client while denying any other fraudulent access. In this way, the authentication process is highly reliable even in presence of response rotation, zooming, and cropping that may occur in consecutive PUF interrogations and to which other postprocessing algorithm are highly sensitive. This characteristics together with the speed of the method (tens of microseconds for each operation) broaden the applicability and reliability of PUF to practical high-security authentication or merchandise anti-counterfeiting.

Paperid: 2745, https://arxiv.org/pdf/2504.21037.pdf

Abstract:
Early detection of security bug reports (SBRs) is crucial for preventing vulnerabilities and ensuring system reliability. While machine learning models have been developed for SBR prediction, their predictive performance still has room for improvement. In this study, we conduct a comprehensive comparison between BERT and Random Forest (RF), a competitive baseline for predicting SBRs. The results show that RF outperforms BERT with a 34% higher average G-measure for within-project predictions. Adding only SBRs from various projects improves both models' average performance. However, including both security and nonsecurity bug reports significantly reduces RF's average performance to 46%, while boosts BERT to its best average performance of 66%, surpassing RF. In cross-project SBR prediction, BERT achieves a remarkable 62% G-measure, which is substantially higher than RF.

Paperid: 2746, https://arxiv.org/pdf/2504.20906.pdf

Abstract:
The continuous monitoring of the interactions between cyber-physical components of any industrial control system (ICS) is required to secure automation of the system controls, and to guarantee plant processes are fail-safe and remain in an acceptably safe state. Safety is achieved by managing actuation (where electric signals are used to trigger physical movement), dependent on corresponding sensor readings; used as ground truth in decision making. Timely detection of anomalies (attacks, faults and unascertained states) in ICSs is crucial for the safe running of a plant, the safety of its personnel, and for the safe provision of any services provided. We propose an anomaly detection method that involves accurate linearization of the non-linear forms arising from sensor-actuator(s) relationships, primarily because solving linear models is easier and well understood. Further, the time complexity of the anomaly detection scenario/problem at hand is lowered using dimensionality reduction of the actuator(s) in relationship with a sensor. We accomplish this by using a well-known water treatment testbed as a use case. Our experiments show millisecond time response to detect anomalies and provide explainability; that are not simultaneously achieved by other state of the art AI/ML models with eXplainable AI (XAI) used for the same purpose. Further, we pin-point the sensor(s) and its actuation state for which anomaly was detected.

Paperid: 2747, https://arxiv.org/pdf/2504.20814.pdf

Abstract:
While prior studies have explored security in code generated by ChatGPT and other Large Language Models, they were conducted in controlled experimental settings and did not use code generated or provided from actual developer interactions. This paper not only examines the security of code generated by ChatGPT based on real developer interactions, curated in the DevGPT dataset, but also assesses ChatGPT's capability to find and fix these vulnerabilities. We analysed 1,586 C, C++, and C# code snippets using static scanners, which detected potential issues in 124 files. After manual analysis, we selected 26 files with 32 confirmed vulnerabilities for further investigation. We submitted these files to ChatGPT via the OpenAI API, asking it to detect security issues, identify the corresponding Common Weakness Enumeration numbers, and propose fixes. The responses and modified code were manually reviewed and re-scanned for vulnerabilities. ChatGPT successfully detected 18 out of 32 security issues and resolved 17 issues but failed to recognize or fix the remainder. Interestingly, only 10 vulnerabilities were resulted from the user prompts, while 22 were introduced by ChatGPT itself. We highlight for developers that code generated by ChatGPT is more likely to contain vulnerabilities compared to their own code. Furthermore, at times ChatGPT reports incorrect information with apparent confidence, which may mislead less experienced developers. Our findings confirm previous studies in demonstrating that ChatGPT is not sufficiently reliable for generating secure code nor identifying all vulnerabilities, highlighting the continuing importance of static scanners and manual review.

Paperid: 2748, https://arxiv.org/pdf/2504.20626.pdf

Abstract:
We present MAVShield, a novel lightweight cipher designed to secure communications in Unmanned Aerial Vehicles (UAVs) using the MAVLink protocol, which by default transmits unencrypted messages between UAVs and Ground Control Stations (GCS). While existing studies propose encryption for MAVLink, most remain theoretical or simulation-based. We implement MAVShield alongside AES-CTR, ChaCha20, Speck-CTR, and Rabbit, and evaluate them on a real drone testbed. A comprehensive security analysis using statistical test suites (NIST and Diehard) demonstrates strong resistance of the novel cipher to cryptanalysis. Performance evaluation across key metrics including memory usage, CPU load, and battery power consumption, demonstrates that MAVShield outperforms existing algorithms and offers an efficient, real-world solution for securing MAVLink communications in UAVs.

Paperid: 2749, https://arxiv.org/pdf/2504.20544.pdf

Abstract:
Flexible sharing of electronic medical records (EMRs) is an urgent need in healthcare, as fragmented storage creates EMR management complexity for both practitioners and patients. Blockchain has emerged as a promising solution to address the limitations of centralized EMR systems regarding interoperability, data ownership, and trust concerns. Whilst its healthcare implementation continues to face scalability challenges, particularly in uploading lag time as EMR volumes increase. In this paper, we describe the design of a novel blockchain-based data structure, MedBlockTree, which aims to solve the scalability issue in blockchain-based EMR systems, particularly low block throughput and patient awareness. MedBlockTree leverages a chameleon hash function to generate collision blocks for existing patients and expand a single chain into a growing block tree with $n$ branches that are capable of processing $n$ new blocks in a single consensus round. We also introduce the EnhancedPro consensus algorithm to manage multiple branches and maintain network consistency. Our comprehensive simulation evaluates performance across four dimensions: branch number, worker number, collision rate, and network latency. Comparative analysis against a traditional blockchain-based EMR system demonstrates outstanding throughput improvements across all dimensions, achieving processing speeds $Î½\cdot n$ times faster than conventional approaches.

Paperid: 2750, https://arxiv.org/pdf/2504.20310.pdf

Abstract:
In this paper, we initiate a cryptographically inspired theoretical study of detection versus mitigation of adversarial inputs produced by attackers on Machine Learning algorithms during inference time. We formally define defense by detection (DbD) and defense by mitigation (DbM). Our definitions come in the form of a 3-round protocol between two resource-bounded parties: a trainer/defender and an attacker. The attacker aims to produce inference-time inputs that fool the training algorithm. We define correctness, completeness, and soundness properties to capture successful defense at inference time while not degrading (too much) the performance of the algorithm on inputs from the training distribution. We first show that achieving DbD and achieving DbM are equivalent for ML classification tasks. Surprisingly, this is not the case for ML generative learning tasks, where there are many possible correct outputs for each input. We show a separation between DbD and DbM by exhibiting two generative learning tasks for which it is possible to defend by mitigation but it is provably impossible to defend by detection. The mitigation phase uses significantly less computational resources than the initial training algorithm. In the first learning task we consider sample complexity as the resource and in the second the time complexity. The first result holds under the assumption that the Identity-Based Fully Homomorphic Encryption (IB-FHE), publicly-verifiable zero-knowledge Succinct Non-Interactive Arguments of Knowledge (zk-SNARK), and Strongly Unforgeable Signatures exist. The second result assumes the existence of Non-Parallelizing Languages with Average-Case Hardness (NPL) and Incrementally-Verifiable Computation (IVC) and IB-FHE.

Paperid: 2751, https://arxiv.org/pdf/2504.20245.pdf

Abstract:
Cyberspace is an ever-evolving battleground involving adversaries seeking to circumvent existing safeguards and defenders aiming to stay one step ahead by predicting and mitigating the next threat. Existing mitigation strategies have focused primarily on solutions that consider software or hardware aspects, often ignoring the human factor. This paper takes a first step towards psychology-informed, active defense strategies, where we target biases that human beings are susceptible to under conditions of uncertainty. Using capture-the-flag events, we create realistic challenges that tap into a particular cognitive bias: representativeness. This study finds that this bias can be triggered to thwart hacking attempts and divert hackers into non-vulnerable attack paths. Participants were exposed to two different challenges designed to exploit representativeness biases. One of the representativeness challenges significantly thwarted attackers away from vulnerable attack vectors and onto non-vulnerable paths, signifying an effective bias-based defense mechanism. This work paves the way towards cyber defense strategies that leverage additional human biases to thwart future, sophisticated adversarial attacks.

Paperid: 2752, https://arxiv.org/pdf/2504.20050.pdf

Abstract:
Typical protocols in the multi-party private set operations (MPSO) setting enable m > 2 parties to perform certain secure computation on the intersection or union of their private sets, realizing a very limited range of MPSO functionalities. Most works in this field focus on just one or two specific functionalities, resulting in a large variety of isolated schemes and a lack of a unified framework in MPSO research. In this work, we present an MPSO framework, which allows m parties, each holding a set, to securely compute any set formulas (arbitrary compositions of a finite number of binary set operations, including intersection, union and difference) on their private sets. Our framework is highly versatile and can be instantiated to accommodate a broad spectrum of MPSO functionalities. To the best of our knowledge, this is the first framework to achieve such a level of flexibility and generality in MPSO, without relying on generic secure multi-party computation (MPC) techniques. Our framework exhibits favorable theoretical and practical performance. The computation and communication complexity scale linearly with the set size n, and it achieves optimal complexity that is on par with the naive solution for widely used functionalities, such as multi-party private set intersection (MPSI), MPSI with cardinality output (MPSI-card), and MPSI with cardinality and sum (MPSI-card-sum), in the standard semi-honest model. Furthermore, the instantiations of our framework mainly from symmetric-key techniques yield efficient protocols for MPSI, MPSI-card, MPSI-card-sum, and multi-party private set union (MPSU), with online performance surpassing or matching the state of the art.

Paperid: 2753, https://arxiv.org/pdf/2504.19855.pdf

Abstract:
This paper analyzes Large Language Model (LLM) security vulnerabilities based on data from Crucible, encompassing 214,271 attack attempts by 1,674 users across 30 LLM challenges. Our findings reveal automated approaches significantly outperform manual techniques (69.5% vs 47.6% success rate), despite only 5.2% of users employing automation. We demonstrate that automated approaches excel in systematic exploration and pattern matching challenges, while manual approaches retain speed advantages in certain creative reasoning scenarios, often solving problems 5x faster when successful. Challenge categories requiring systematic exploration are most effectively targeted through automation, while intuitive challenges sometimes favor manual techniques for time-to-solve metrics. These results illuminate how algorithmic testing is transforming AI red-teaming practices, with implications for both offensive security research and defensive measures. Our analysis suggests optimal security testing combines human creativity for strategy development with programmatic execution for thorough exploration.

Paperid: 2754, https://arxiv.org/pdf/2504.19521.pdf

Abstract:
The adoption of Generative AI (GenAI) in applications inevitably comes with the expansion of the attack surface, combining new security threats along with the traditional ones. Consequently, numerous research and industrial initiatives aim to mitigate the GenAI related security threats by developing evaluation methods and designing defenses. However, while most of the GenAI security work focuses on universal threats (e.g. 'How to build a bomb'), there is significantly less discussion on application-level security and how to evaluate and mitigate it. Thus, in this work we adopt an application-centric approach to GenAI security, and show that while LLMs cannot protect against ad-hoc application specific threats, they can provide the framework for applications to protect themselves against such threats. Our first contribution is defining Security Steerability - a novel security measure for LLMs, assessing the model's capability to adhere to strict guardrails that are defined in the system prompt (e.g. 'Refrain from discussing about our competitors'). These guardrails, in case effective, can stop threats in the presence of malicious users who attempt to circumvent the application purpose. Our second contribution is a methodology to measure the security steerability of LLMs, utilizing a newly-developed benchmark called VeganRibs which assesses the LLM behavior in forcing specific guardrails that are not security per-se, in the presence of malicious user that tries to bypass the guardrails through prompt injection attacks with attack boosters (jailbreaks and perturbations). Using the new benchmark, we analyzed 18 open-source LLMs, demonstrating significant differences between their security steerability that are not trivial to foresee...

Paperid: 2755, https://arxiv.org/pdf/2504.18636.pdf

Abstract:
Phishing attacks represent an increasingly sophisticated and pervasive threat to individuals and organizations, causing significant financial losses, identity theft, and severe damage to institutional reputations. Existing phishing detection methods often struggle to simultaneously achieve high accuracy and explainability, either failing to detect novel attacks or operating as opaque black-box models. To address this critical gap, we propose a novel phishing URL detection system based on a first-order Takagi-Sugeno-Kang (TSK) fuzzy inference model optimized through gradient-based techniques. Our approach intelligently combines the interpretability and human-like reasoning capabilities of fuzzy logic with the precision and adaptability provided by gradient optimization methods, specifically leveraging the Adam optimizer for efficient parameter tuning. Experiments conducted using a comprehensive dataset of over 235,000 URLs demonstrate rapid convergence, exceptional predictive performance (accuracy averaging 99.95% across 5 cross-validation folds, with a perfect AUC i.e. 1.00). Furthermore, optimized fuzzy rules and membership functions improve interoperability, clearly indicating how the model makes decisions - an essential feature for cybersecurity applications. This high-performance, transparent, and interpretable phishing detection framework significantly advances current cybersecurity defenses, providing practitioners with accurate and explainable decision-making tools.

Paperid: 2756, https://arxiv.org/pdf/2504.18608.pdf

Abstract:
Electrocardiogram (ECG) signal exhibits inherent uniqueness, making it a promising biometric modality for identity authentication. As a result, ECG authentication has gained increasing attention in recent years. However, most existing methods focus primarily on improving authentication accuracy within closed-set settings, with limited research addressing the challenges posed by open-set scenarios. In real-world applications, identity authentication systems often encounter a substantial amount of unseen data, leading to potential security vulnerabilities and performance degradation. To address this issue, we propose a robust ECG identity authentication system that maintains high performance even in open-set settings. Firstly, we employ a multi-modal pretraining framework, where ECG signals are paired with textual reports derived from their corresponding fiducial features to enhance the representational capacity of the signal encoder. During fine-tuning, we introduce Self-constraint Center Learning and Irrelevant Sample Repulsion Learning to constrain the feature distribution, ensuring that the encoded representations exhibit clear decision boundaries for classification. Our method achieves 99.83% authentication accuracy and maintains a False Accept Rate as low as 5.39% in the presence of open-set samples. Furthermore, across various open-set ratios, our method demonstrates exceptional stability, maintaining an Open-set Classification Rate above 95%.

Paperid: 2757, https://arxiv.org/pdf/2504.18565.pdf

Abstract:
Uncontrollable autonomous replication of language model agents poses a critical safety risk. To better understand this risk, we introduce RepliBench, a suite of evaluations designed to measure autonomous replication capabilities. RepliBench is derived from a decomposition of these capabilities covering four core domains: obtaining resources, exfiltrating model weights, replicating onto compute, and persisting on this compute for long periods. We create 20 novel task families consisting of 86 individual tasks. We benchmark 5 frontier models, and find they do not currently pose a credible threat of self-replication, but succeed on many components and are improving rapidly. Models can deploy instances from cloud compute providers, write self-propagating programs, and exfiltrate model weights under simple security setups, but struggle to pass KYC checks or set up robust and persistent agent deployments. Overall the best model we evaluated (Claude 3.7 Sonnet) has a >50% pass@10 score on 15/20 task families, and a >50% pass@10 score for 9/20 families on the hardest variants. These findings suggest autonomous replication capability could soon emerge with improvements in these remaining areas or with human assistance.

Paperid: 2758, https://arxiv.org/pdf/2504.18333.pdf

Abstract:
LLM as judge systems used to assess text quality code correctness and argument strength are vulnerable to prompt injection attacks. We introduce a framework that separates content author attacks from system prompt attacks and evaluate five models Gemma 3.27B Gemma 3.4B Llama 3.2 3B GPT 4 and Claude 3 Opus on four tasks with various defenses using fifty prompts per condition. Attacks achieved up to seventy three point eight percent success smaller models proved more vulnerable and transferability ranged from fifty point five to sixty two point six percent. Our results contrast with Universal Prompt Injection and AdvPrompter We recommend multi model committees and comparative scoring and release all code and datasets

Paperid: 2759, https://arxiv.org/pdf/2504.17930.pdf

Abstract:
Digital systems find it challenging to keep up with cybersecurity threats. The daily emergence of more than 560,000 new malware strains poses significant hazards to the digital ecosystem. The traditional malware detection methods fail to operate properly and yield high false positive rates with low accuracy of the protection system. This study explores the ways in which malware can be detected using these machine learning (ML) and deep learning (DL) approaches to address those shortcomings. This study also includes a systematic comparison of the performance of some of the widely used ML models, such as random forest, multi-layer perceptron (MLP), and deep neural network (DNN), for determining the effectiveness of the domain of modern malware threat systems. We use a considerable-sized database from Kaggle, which has undergone optimized feature selection and preprocessing to improve model performance. Our finding suggests that the DNN model outperformed the other traditional models with the highest training accuracy of 99.92% and an almost perfect AUC score. Furthermore, the feature selection and preprocessing can help improve the capabilities of detection. This research makes an important contribution by analyzing the performance of the model on the performance metrics and providing insight into the effectiveness of the advanced detection techniques to build more robust and more reliable cybersecurity solutions against the growing malware threats.

Paperid: 2760, https://arxiv.org/pdf/2504.17650.pdf

Abstract:
A pseudorandom quantum state (PRS) is an ensemble of quantum states indistinguishable from Haar-random states to observers with efficient quantum computers. It allows one to substitute the costly Haar-random state with efficiently preparable PRS as a resource for cryptographic protocols, while also finding applications in quantum learning theory, black hole physics, many-body thermalization, quantum foundations, and quantum chaos. All existing constructions of PRS equate the notion of efficiency to quantum computers which runtime is bounded by a polynomial in its input size. In this work, we relax the notion of efficiency for PRS with respect to observers with near-term quantum computers implementing algorithms with runtime that scales slower than polynomial-time. We introduce the $\mathbf{T}$-PRS which is indistinguishable to quantum algorithms with runtime $\mathbf{T}(n)$ that grows slower than polynomials in the input size $n$. We give a set of reasonable conditions that a $\mathbf{T}$-PRS must satisfy and give two constructions by using quantum-secure pseudorandom functions and pseudorandom functions. For $\mathbf{T}(n)$ being linearithmic, linear, polylogarithmic, and logarithmic function, we characterize the amount of quantum resources a $\mathbf{T}$-PRS must possess, particularly on its coherence, entanglement, and magic. Our quantum resource characterization applies generally to any two state ensembles that are indistinguishable to observers with computational power $\mathbf{T}(n)$, giving a general necessary condition of whether a low-resource ensemble can mimic a high-resource ensemble, forming a $\mathbf{T}$-pseudoresource pair. We demonstate how the necessary amount of resource decreases as the observer's computational power is more restricted, giving a $\mathbf{T}$-pseudoresource pair with larger resource gap for more computationally limited observers.

Paperid: 2761, https://arxiv.org/pdf/2504.17271.pdf

Abstract:
Smart mobile devices have become indispensable in modern daily life, where sensitive information is frequently processed, stored, and transmitted-posing critical demands for robust security controls. Given that touchscreens are the primary medium for human-device interaction, continuous user authentication based on touch behavior presents a natural and seamless security solution. While existing methods predominantly adopt binary classification under single-modal learning settings, we propose a unified contrastive learning framework for continuous authentication in a non-disruptive manner. Specifically, the proposed method leverages a Temporal Masked Autoencoder to extract temporal patterns from raw multi-sensor data streams, capturing continuous motion and gesture dynamics. The pre-trained TMAE is subsequently integrated into a Siamese Temporal-Attentive Convolutional Network within a contrastive learning paradigm to model both sequential and cross-modal patterns. To further enhance performance, we incorporate multi-head attention and channel attention mechanisms to capture long-range dependencies and optimize inter-channel feature integration. Extensive experiments on public benchmarks and a self-collected dataset demonstrate that our approach outperforms state-of-the-art methods, offering a reliable and effective solution for user authentication on mobile devices.

Paperid: 2762, https://arxiv.org/pdf/2504.17256.pdf

Abstract:
Proof-of-Stake (PoS) is a prominent Sybil control mechanism for blockchain-based systems. In "e-PoS: Making PoS Decentralized and Fair," Saad et al. (TPDS'21) introduced a new Proof-of-Stake protocol, e-PoS, to enhance PoS applications' decentralization and fairness. In this comment paper, we address a misunderstanding in the work of Saad et al. The conventional Proof-of-Stake model that causes the fairness problem does not align with the general concept of Proof-of-Stake nor the Proof-of-Stake cryptocurrencies mentioned in their paper.

Paperid: 2763, https://arxiv.org/pdf/2504.17121.pdf

Abstract:
Modern password hashing remains a critical defense against credential cracking, yet the transition from theoretically secure algorithms to robust real-world implementations remains fraught with challenges. This paper presents a dual analysis of Argon2, the Password Hashing Competition winner, combining attack simulations quantifying how parameter configurations impact guessing costs under realistic budgets, with the first large-scale empirical study of Argon2 adoption across public GitHub software repositories. Our economic model, validated against cryptocurrency mining benchmarks, demonstrates that OWASP's recommended 46 MiB configuration reduces compromise rates by 42.5% compared to SHA-256 at \$1/account attack budgets for strong user passwords. However, memory-hardness exhibits diminishing returns as increasing allocations to RFC 9106's 2048 MiB provides just 23.3% (\$1) and 17.7% (\$20) additional protection despite 44.5 times greater memory demands. Crucially, both configurations fail to mitigate risks from weak passwords, with 96.9-99.8% compromise rates for RockYou-like credentials regardless of algorithm choice. Our repository analysis shows accelerating Argon2 adoption, yet weak configuration practices: 46.6% of deployments use weaker-than-OWASP parameters. Surprisingly, sensitive applications (password managers, encryption tools) show no stronger configurations than general software. Our findings highlight that a secure algorithm alone cannot ensure security, effective parameter guidance and developer education remain essential for realizing Argon2's theoretical advantages.

Paperid: 2764, https://arxiv.org/pdf/2504.16709.pdf

Abstract:
Quantum secret sharing (QSS) enables secure distribution of information among multiple parties but remains vulnerable to noise. We analyze the effects of bit-flip, phase-flip, and amplitude damping noise on the multiparty QSS for classical message (QSSCM) and secret sharing of quantum information (SSQI) protocols proposed by Zhang et al. (Phys. Rev. A, 71:044301, 2005). To scale down these effects, we introduce an efficient quantum error correction (QEC) scheme based on a simplified version of Shor's code. Leveraging the specific structure of the QSS protocols, we reduce the qubit overhead from the standard 9 of Shor's code to as few as 3 while still achieving lower average error rates than existing QEC methods. Thus, our approach can also be adopted for other single-qubit-based quantum protocols. Simulations demonstrate that our approach significantly enhances the protocols' resilience, improving their practicality for real-world deployment.

Paperid: 2765, https://arxiv.org/pdf/2504.16251.pdf

Abstract:
The second version of Intel Software Guard Extensions (Intel SGX), or SGX2, adds dynamic management of enclave memory and threads. The first version required the address space and thread counts to be fixed before execution. The Enclave Dynamic Memory Management (EDMM) feature of SGX2 has the potential to lower launch times and overall execution time. Despite reducing the enclave loading time by 28--93%, straightforward EDMM adoption strategies actually slow execution time down by as much as 58%. Using the Gramine library OS as a representative enclave runtime environment, this paper shows how to recover EDMM performance. The paper explains how implementing mutual distrust between the OS and enclave increases the cost of modifying page mappings. The paper then describes and evaluates a series of optimizations on application benchmarks, showing that these optimizations effectively eliminate the overheads of EDMM while retaining EDMM's performance and flexibility gains.

Paperid: 2766, https://arxiv.org/pdf/2504.16091.pdf

Abstract:
Homomorphic Encryption (HE) allows secure and privacy-protected computation on encrypted data without the need to decrypt it. Since Shor's algorithm rendered prime factorisation and discrete logarithm-based ciphers insecure with quantum computations, researchers have been working on building post-quantum homomorphic encryption (PQHE) algorithms. Most of the current PQHE algorithms are secured by Lattice-based problems and there have been limited attempts to build ciphers based on error-correcting code-based problems. This review presents an overview of the current approaches to building PQHE schemes and justifies code-based encryption as a novel way to diversify post-quantum algorithms. We present the mathematical underpinnings of existing code-based cryptographic frameworks and their security and efficiency guarantees. We compare lattice-based and code-based homomorphic encryption solutions identifying challenges that have inhibited the progress of code-based schemes. We finally propose five new research directions to advance post-quantum code-based homomorphic encryption.

Paperid: 2767, https://arxiv.org/pdf/2504.15738.pdf

Abstract:
The Open Radio Access Network (O-RAN) marks a significant shift in the mobile network industry. By transforming a traditionally vertically integrated architecture into an open, data-driven one, O-RAN promises to enhance operational flexibility and drive innovation. In this paper, we harness O-RAN's openness to address one critical threat to 5G availability: signaling storms caused by abuse of the Radio Resource Control (RRC) protocol. Such attacks occur when a flood of RRC messages from one or multiple User Equipments (UEs) deplete resources at a 5G base station (gNB), leading to service degradation. We provide a reference implementation of an RRC signaling storm attack, using the OpenAirInterface (OAI) platform to evaluate its impact on a gNB. We supplement the experimental results with a theoretical model to extend the findings for different load conditions. To mitigate RRC signaling storms, we develop a threshold-based detection technique that relies on RRC layer features to distinguish between malicious activity and legitimate high network load conditions. Leveraging O-RAN capabilities, our detection method is deployed as an external Application (xApp). Performance evaluation shows attacks can be detected within 90ms, providing a mitigation window of 60ms before gNB unavailability, with an overhead of 1.2% and 0% CPU and memory consumption, respectively.

Paperid: 2768, https://arxiv.org/pdf/2504.15375.pdf

Abstract:
The proliferation of Internet of Things (IoT) devices has expanded the attack surface, necessitating efficient intrusion detection systems (IDSs) for network protection. This paper presents FLARE, a feature-based lightweight aggregation for robust evaluation of IoT intrusion detection to address the challenges of securing IoT environments through feature aggregation techniques. FLARE utilizes a multilayered processing approach, incorporating session, flow, and time-based sliding-window data aggregation to analyze network behavior and capture vital features from IoT network traffic data. We perform extensive evaluations on IoT data generated from our laboratory experimental setup to assess the effectiveness of the proposed aggregation technique. To classify attacks in IoT IDS, we employ four supervised learning models and two deep learning models. We validate the performance of these models in terms of accuracy, precision, recall, and F1-score. Our results reveal that incorporating the FLARE aggregation technique as a foundational step in feature engineering, helps lay a structured representation, and enhances the performance of complex end-to-end models, making it a crucial step in IoT IDS pipeline. Our findings highlight the potential of FLARE as a valuable technique to improve performance and reduce computational costs of end-to-end IDS implementations, thereby fostering more robust IoT intrusion detection systems.

Paperid: 2769, https://arxiv.org/pdf/2504.15025.pdf

Abstract:
While one-way functions (OWFs) serve as the minimal assumption for computational cryptography in the classical setting, in quantum cryptography, we have even weaker cryptographic assumptions such as pseudo-random states, and EFI pairs, among others. Moreover, the minimal assumption for computational quantum cryptography remains an open question. Recently, it has been shown that pseudoentanglement is necessary for the existence of quantum cryptography (GoulÃ£o and Elkouss 2024), but no cryptographic construction has been built from it. In this work, we study the cryptographic usefulness of quantum pseudoresources -- a pair of families of quantum states that exhibit a gap in their resource content yet remain computationally indistinguishable. We show that quantum pseudoresources imply a variant of EFI pairs, which we call EPFI pairs, and that these are equivalent to quantum commitments and thus EFI pairs. Our results suggest that, just as randomness is fundamental to classical cryptography, quantum resources may play a similarly crucial role in the quantum setting. Finally, we focus on the specific case of entanglement, analyzing different definitions of pseudoentanglement and their implications for constructing EPFI pairs. Moreover, we propose a new cryptographic functionality that is intrinsically dependent on entanglement as a resource.

Paperid: 2770, https://arxiv.org/pdf/2504.14886.pdf

Abstract:
The effectiveness of an AI model in accurately classifying novel malware hinges on the quality of the features it is trained on, which in turn depends on the effectiveness of the analysis tool used. Peekaboo, a Dynamic Binary Instrumentation (DBI) tool, defeats malware evasion techniques to capture authentic behavior at the Assembly (ASM) instruction level. This behavior exhibits patterns consistent with Zipf's law, a distribution commonly seen in natural languages, making Transformer models particularly effective for binary classification tasks. We introduce Alpha, a framework for zero day malware detection that leverages Transformer models and ASM language. Alpha is trained on malware and benign software data collected through Peekaboo, enabling it to identify entirely new samples with exceptional accuracy. Alpha eliminates any common functions from the test samples that are in the training dataset. This forces the model to rely on contextual patterns and novel ASM instruction combinations to detect malicious behavior, rather than memorizing familiar features. By combining the strengths of DBI, ASM analysis, and Transformer architectures, Alpha offers a powerful approach to proactively addressing the evolving threat of malware. Alpha demonstrates perfect accuracy for Ransomware, Worms and APTs with flawless classification for both malicious and benign samples. The results highlight the model's exceptional performance in detecting truly new malware samples.

Paperid: 2771, https://arxiv.org/pdf/2504.14328.pdf

Abstract:
Bitcoin blockchain uses hash-based Proof-of-Work (PoW) that prevents unwanted participants from hogging the network resources. Anyone entering the mining game has to prove that they have expended a specific amount of computational power. However, the most popular Bitcoin blockchain consumes 175.87 TWh of electrical energy annually, and most of this energy is wasted on hash calculations, which serve no additional purpose. Several studies have explored re-purposing the wasted energy by replacing the hash function with meaningful computational problems that have practical applications. Minimum Dominating Set (MDS) in networks has numerous real-life applications. Building on this concept, Chrisimos [TrustCom '23] was proposed to replace hash-based PoW with the computation of a dominating set on real-life graph instances. However, Chrisimos has several drawbacks regarding efficiency and solution quality. This work presents a new framework for Useful PoW, ScaloWork, that decides the block proposer for the Bitcoin blockchain based on the solution for the dominating set problem. ScaloWork relies on the property of graph isomorphism and guarantees solution extractability. We also propose a distributed approach for calculating the dominating set, allowing miners to collaborate in a pool. This enables ScaloWork to handle larger graphs relevant to real-life applications, thereby enhancing scalability. Our framework also eliminates the problem of free-riders, ensuring fairness in the distribution of block rewards. We perform a detailed security analysis of our framework and prove our scheme as secure as hash-based PoW. We implement a prototype of our framework, and the results show that our system outperforms Chrisimos in all aspects.

Paperid: 2772, https://arxiv.org/pdf/2504.14044.pdf

Abstract:
Operational Technology Cybersecurity (OTCS) continues to be a dominant challenge for critical infrastructure such as railways. As these systems become increasingly vulnerable to malicious attacks due to digitalization, effective documentation and compliance processes are essential to protect these safety-critical systems. This paper proposes a novel system that leverages Large Language Models (LLMs) and multi-stage retrieval to enhance the compliance verification process against standards like IEC 62443 and the rail-specific IEC 63452. We first evaluate a Baseline Compliance Architecture (BCA) for answering OTCS compliance queries, then develop an extended approach called Parallel Compliance Architecture (PCA) that incorporates additional context from regulatory standards. Through empirical evaluation comparing OpenAI-gpt-4o and Claude-3.5-haiku models in these architectures, we demonstrate that the PCA significantly improves both correctness and reasoning quality in compliance verification. Our research establishes metrics for response correctness, logical reasoning, and hallucination detection, highlighting the strengths and limitations of using LLMs for compliance verification in railway cybersecurity. The results suggest that retrieval-augmented approaches can significantly improve the efficiency and accuracy of compliance assessments, particularly valuable in an industry facing a shortage of cybersecurity expertise.

Paperid: 2773, https://arxiv.org/pdf/2504.13737.pdf

Abstract:
The security of the Elliptic Curve Digital Signature Algorithm (ECDSA) depends on the uniqueness and secrecy of the nonce, which is used in each signature. While it is well understood that nonce $k$ reuse across two distinct messages can leak the private key, we show that even if a distinct value is used for $k_2$, where an affine relationship exists in the form of: $k_m = a \cdot k_n + b$, we can also recover the private key. Our method requires only two signatures (even over the same message) and relies purely on algebra, with no need for lattice reduction or brute-force search(if the relationship, or offset, is known). To our knowledge, this is the first closed-form derivation of the ECDSA private key from only two signatures over the same message, under a known affine relationship between nonces.

Paperid: 2774, https://arxiv.org/pdf/2504.12546.pdf

Abstract:
We formalise the notion of an anonymous public announcement in the tradition of public announcement logic. Such announcements can be seen as in-between a public announcement from ``the outside" (an announcement of $Ï$) and a public announcement by one of the agents (an announcement of $K_aÏ$): we get more information than just $Ï$, but not (necessarily) about exactly who made it. Even if such an announcement is prima facie anonymous, depending on the background knowledge of the agents it might reveal the identity of the announcer: if I post something on a message board, the information might reveal who I am even if I don't sign my name. Furthermore, like in the Russian Cards puzzle, if we assume that the announcer's intention was to stay anonymous, that in fact might reveal more information. In this paper we first look at the case when no assumption about intentions are made, in which case the logic with an anonymous public announcement operator is reducible to epistemic logic. We then look at the case when we assume common knowledge of the intention to stay anonymous, which is both more complex and more interesting: in several ways it boils down to the notion of a ``safe" announcement (again, similarly to Russian Cards). Main results include formal expressivity results and axiomatic completeness for key logical languages.

Paperid: 2775, https://arxiv.org/pdf/2504.11744.pdf

Abstract:
Ransomware has emerged as a persistent cybersecurity threat,leveraging robust encryption schemes that often remain unbroken even after public disclosure of source code. Motivated by the technical resilience of such mechanisms, this paper presents SEER (Secure and Efficient Encryption-based Erasure via Ransomware), a provably secure file destruction system that repurposes ransomware encryption for legitimate data erasure tasks. SEER integrates the triple-encryption design of the Babuk ransomware family, including Curve25519-based key exchange,SHA-256-based key derivation, and the Sosemanuk stream cipher, to construct a layered key management architecture. It tightly couples encryption and key destruction by securely erasing session keys immediately after use. Experimental results on an ESXI platform demonstrate that SEER achieves four orders of magnitude performance improvement over the DoD 5220.22 standard. The proposed system further ensures provable security through both theoretical foundations and practical validation, offering an efficient and resilient solution for the secure destruction of sensitive data.

Paperid: 2776, https://arxiv.org/pdf/2504.11622.pdf

Abstract:
The large integration of microphones into devices increases the opportunities for Acoustic Side-Channel Attacks (ASCAs), as these can be used to capture keystrokes' audio signals that might reveal sensitive information. However, the current State-Of-The-Art (SOTA) models for ASCAs, including Convolutional Neural Networks (CNNs) and hybrid models, such as CoAtNet, still exhibit limited robustness under realistic noisy conditions. Solving this problem requires either: (i) an increased model's capacity to infer contextual information from longer sequences, allowing the model to learn that an initially noisily typed word is the same as a futurely collected non-noisy word, or (ii) an approach to fix misidentified information from the contexts, as one does not type random words, but the ones that best fit the conversation context. In this paper, we demonstrate that both strategies are viable and complementary solutions for making ASCAs practical. We observed that no existing solution leverages advanced transformer architectures' power for these tasks and propose that: (i) Visual Transformers (VTs) are the candidate solutions for capturing long-term contextual information and (ii) transformer-powered Large Language Models (LLMs) are the candidate solutions to fix the ``typos'' (mispredictions) the model might make. Thus, we here present the first-of-its-kind approach that integrates VTs and LLMs for ASCAs. We first show that VTs achieve SOTA performance in classifying keystrokes when compared to the previous CNN benchmark. Second, we demonstrate that LLMs can mitigate the impact of real-world noise. Evaluations on the natural sentences revealed that: (i) incorporating LLMs (e.g., GPT-4o) in our ASCA pipeline boosts the performance of error-correction tasks; and (ii) the comparable performance can be attained by a lightweight, fine-tuned smaller LLM (67 times smaller than GPT-4o), using...

Paperid: 2777, https://arxiv.org/pdf/2504.11575.pdf

Abstract:
Detecting Distributed Denial of Service (DDoS) attacks in Multi-Environment (M-En) networks presents significant challenges due to diverse malicious traffic patterns and the evolving nature of cyber threats. Existing AI-based detection systems struggle to adapt to new attack strategies and lack real-time attack detection capabilities with high accuracy and efficiency. This study proposes an online, continuous learning methodology for DDoS detection in M-En networks, enabling continuous model updates and real-time adaptation to emerging threats, including zero-day attacks. First, we develop a unique M-En network dataset by setting up a realistic, real-time simulation using the NS-3 tool, incorporating both victim and bot devices. DDoS attacks with varying packet sizes are simulated using the DDoSim application across IoT and traditional IP-based environments under M-En network criteria. Our approach employs a multi-level framework (MULTI-LF) featuring two machine learning models: a lightweight Model 1 (M1) trained on a selective, critical packet dataset for fast and efficient initial detection, and a more complex, highly accurate Model 2 (M2) trained on extensive data. When M1 exhibits low confidence in its predictions, the decision is escalated to M2 for verification and potential fine-tuning of M1 using insights from M2. If both models demonstrate low confidence, the system flags the incident for human intervention, facilitating model updates with human-verified categories to enhance adaptability to unseen attack patterns. We validate the MULTI-LF through real-world simulations, demonstrating superior classification accuracy of 0.999 and low prediction latency of 0.866 seconds compared to established baselines. Furthermore, we evaluate performance in terms of memory usage (3.632 MB) and CPU utilization (10.05%) in real-time scenarios.

Paperid: 2778, https://arxiv.org/pdf/2504.11429.pdf

Abstract:
Differential privacy (DP) considers a scenario, where an adversary has almost complete information about the entries of a database This worst-case assumption is likely to overestimate the privacy thread for an individual in real life. Statistical privacy (SP) denotes a setting where only the distribution of the database entries is known to an adversary, but not their exact values. In this case one has to analyze the interaction between noiseless privacy based on the entropy of distributions and privacy mechanisms that distort the answers of queries, which can be quite complex. A privacy mechanism often used is to take samples of the data for answering a query. This paper proves precise bounds how much different methods of sampling increase privacy in the statistical setting with respect to database size and sampling rate. They allow us to deduce when and how much sampling provides an improvement and how far this depends on the privacy parameter Îµ. To perform these investigations we develop a framework to model sampling techniques. For the DP setting tradeoff functions have been proposed as a finer measure for privacy compared to (Îµ,Î´)-pairs. We apply these tools to statistical privacy with subsampling to get a comparable characterization

Paperid: 2779, https://arxiv.org/pdf/2504.10944.pdf

Abstract:
This paper introduces the Cartesian Merkle Tree, a deterministic data structure that combines the properties of a Binary Search Tree, a Heap, and a Merkle tree. The Cartesian Merkle Tree supports insertions, updates, and removals of elements in $O(\log n)$ time, requires $n$ space, and enables membership and non-membership proofs via Merkle-based authentication paths. This structure is particularly suitable for zero-knowledge applications, blockchain systems, and other protocols that require efficient and verifiable data structures.

Paperid: 2780, https://arxiv.org/pdf/2504.08264.pdf

Abstract:
The increasing use of the Internet of Things raises security concerns. To address this, device fingerprinting is often employed to authenticate devices, detect adversaries, and identify eavesdroppers in an environment. This requires the ability to discern between legitimate and malicious devices which is achieved by analyzing the unique physical and/or operational characteristics of IoT devices. In the era of the latest progress in machine learning, particularly generative models, it is crucial to methodically examine the current studies in device fingerprinting. This involves explaining their approaches and underscoring their limitations when faced with adversaries armed with these ML tools. To systematically analyze existing methods, we propose a generic, yet simplified, model for device fingerprinting. Additionally, we thoroughly investigate existing methods to authenticate devices and detect eavesdropping, using our proposed model. We further study trends and similarities between works in authentication and eavesdropping detection and present the existing threats and attacks in these domains. Finally, we discuss future directions in fingerprinting based on these trends to develop more secure IoT fingerprinting schemes.

Paperid: 2781, https://arxiv.org/pdf/2504.07766.pdf

Abstract:
In this paper, we ask the question of why the quality of commercial software, in terms of security and safety, does not measure up to that of other (durable) consumer goods we have come to expect. We examine this question through the lens of incentives. We argue that the challenge around better quality software is due in no small part to a sequence of misaligned incentives, the most critical of which being that the harm caused by software problems is by and large shouldered by consumers, not developers. This lack of liability means software vendors have every incentive to rush low-quality software onto the market and no incentive to enhance quality control. Within this context, this paper outlines a holistic technical and policy framework we believe is needed to incentivize better and more secure software development. At the heart of the incentive realignment is the concept of software liability. This framework touches on various components, including legal, technical, and financial, that are needed for software liability to work in practice; some currently exist, some will need to be re-imagined or established. This is primarily a market-driven approach that emphasizes voluntary participation but highlights the role appropriate regulation can play. We connect and contrast this with the EU legal environment and discuss what this framework means for open-source software (OSS) development and emerging AI risks. Moreover, we present a CrowdStrike case study complete with a what-if analysis had our proposed framework been in effect. Our intention is very much to stimulate a robust conversation among both researchers and practitioners.

Paperid: 2782, https://arxiv.org/pdf/2504.07265.pdf

Abstract:
The ECDSA (Elliptic Curve Digital Signature Algorithm) is used in many blockchain networks for digital signatures. This includes the Bitcoin and the Ethereum blockchains. While it has good performance levels and as strong current security, it should be handled with care. This care typically relates to the usage of the nonce value which is used to create the signature. This paper outlines the methods that can be used to break ECDSA signatures, including revealed nonces, weak nonce choice, nonce reuse, two keys and shared nonces, and fault attack.

Paperid: 2783, https://arxiv.org/pdf/2504.06241.pdf

Abstract:
Recent cyber incidents and the push for zero trust security underscore the necessity of monitoring host-level events. However, current host-level intrusion detection systems (IDS) lack the ability to correlate alerts and coordinate a network-wide response in real time. Motivated by advances in system-level extensions free of rebooting and network-wide orchestration of host actions, we propose using a central IDS orchestrator to remotely program the logic of each host IDS and collect the alerts generated in real time. In this paper, we make arguments for such a system concept and provide a high level design of the main system components. Furthermore, we have developed a system prototype and evaluated it using two experimental scenarios rooted from real-world attacks. The evaluation results show that the host-based IDS orchestration system is able to defend against the attacks effectively.

Paperid: 2784, https://arxiv.org/pdf/2504.06180.pdf

Abstract:
Blockchain technology has seen adoption across various industries and the real estate sector is no exception. The traditional property leasing process guarantees no trust between parties, uses insecure communication channels, and forces participants who are not familiar with the process to perform contracts. Blockchain technology emerges as a solution to simplify the traditional property leasing process. This work proposes the use of two blockchain oracles to handle, respectively, maintenance issues and automate rent payments in the context of property rental. These two components are introduced in a blockchain-based property rental platform.

Paperid: 2785, https://arxiv.org/pdf/2504.05159.pdf

Abstract:
This paper proves the RLWE-PLWE equivalence for the maximal real subfields of the cyclotomic fields with conductor $n = 2^r p^s$, where $p$ is an odd prime, and $r \geq 0$ and $s \geq 1$ are integers. In particular, we show that the canonical embedding as a linear transform has a condition number bounded above by a polynomial in $n$. In addition, we describe a fast multiplication algorithm in the ring of integers of these real subfields. The multiplication algorithm uses the fast Discrete Cosine Transform (DCT) and has computational complexity $\mathcal{O}(n \log n)$. Both the proof of the RLWE-PLWE equivalence and the fast multiplication algorithm are generalizations of previous results by Ahola et al., where the same claims are proved for a single prime $p = 3$.

Paperid: 2786, https://arxiv.org/pdf/2504.04685.pdf

Abstract:
Recent years have seen an explosion of activity in Generative AI, specifically Large Language Models (LLMs), revolutionising applications across various fields. Smart contract vulnerability detection is no exception; as smart contracts exist on public chains and can have billions of dollars transacted daily, continuous improvement in vulnerability detection is crucial. This has led to many researchers investigating the usage of generative large language models (LLMs) to aid in detecting vulnerabilities in smart contracts. This paper presents a systematic review of the current LLM-based smart contract vulnerability detection tools, comparing them against traditional static and dynamic analysis tools Slither and Mythril. Our analysis highlights key areas where each performs better and shows that while these tools show promise, the LLM-based tools available for testing are not ready to replace more traditional tools. We conclude with recommendations on how LLMs are best used in the vulnerability detection process and offer insights for improving on the state-of-the-art via hybrid approaches and targeted pre-training of much smaller models.

Paperid: 2787, https://arxiv.org/pdf/2504.04422.pdf

Abstract:
Memory leaks are prevalent in various real-world software projects, thereby leading to serious attacks like denial-of-service. Though prior methods for detecting memory leaks made significant advance, they often suffer from low accuracy and weak scalability for testing large and complex programs. In this paper we present LeakGuard, a memory leak detection tool which provides satisfactory balance of accuracy and scalability. For accuracy, LeakGuard analyzes the behaviors of library and developer-defined memory allocation and deallocation functions in a path-sensitive manner and generates function summaries for them in a bottom-up approach. Additionally, we develop a pointer escape analysis technique to model the transfer of pointer ownership. For scalability, LeakGuard examines each function of interest independently by using its function summary and under-constrained symbolic execution technique, which effectively mitigates path explosion problem. Our extensive evaluation on 18 real-world software projects and standard benchmark datasets demonstrates that LeakGuard achieves significant advancements in multiple aspects: it exhibits superior MAD function identification capability compared to Goshawk, outperforms five state-of-the-art methods in defect detection accuracy, and successfully identifies 129 previously undetected memory leak bugs, all of which have been independently verified and confirmed by the respective development teams.

Paperid: 2788, https://arxiv.org/pdf/2504.04374.pdf

Abstract:
Anomaly detection for cyber-physical systems (ADCPS) is crucial in identifying faults and potential attacks by analyzing the time series of sensor measurements and actuator states. However, current methods lack adaptation to data distribution shifts in both temporal and spatial dimensions as cyber-physical systems evolve. To tackle this issue, we propose an incremental meta-learning-based approach, namely iADCPS, which can continuously update the model through limited evolving normal samples to reconcile the distribution gap between evolving and historical time series. Specifically, We first introduce a temporal mixup strategy to align data for data-level generalization which is then combined with the one-class meta-learning approach for model-level generalization. Furthermore, we develop a non-parametric dynamic threshold to adaptively adjust the threshold based on the probability density of the abnormal scores without any anomaly supervision. We empirically evaluate the effectiveness of the iADCPS using three publicly available datasets PUMP, SWaT, and WADI. The experimental results demonstrate that our method achieves 99.0%, 93.1%, and 78.7% F1-Score, respectively, which outperforms the state-of-the-art (SOTA) ADCPS method, especially in the context of the evolving CPSs.

Paperid: 2789, https://arxiv.org/pdf/2504.03363.pdf

Abstract:
EMV is the global standard for smart card payments and its NFC-based version, EMV contactless, is widely used, also for mobile payments. In this systematization of knowledge, we examine attacks on the EMV contactless protocol. We provide a comprehensive framework encompassing its desired security properties and adversary models. We also identify and categorize a comprehensive collection of protocol flaws and show how different subsets thereof can be combined into attacks. In addition to this systematization, we examine the underlying reasons for the many attacks against EMV and point to a better way forward.

Paperid: 2790, https://arxiv.org/pdf/2504.02979.pdf

Abstract:
Side-channel attacks consist of retrieving internal data from a victim system by analyzing its leakage, which usually requires proximity to the victim in the range of a few millimetres. Screaming channels are EM side channels transmitted at a distance of a few meters. They appear on mixed-signal devices integrating an RF module on the same silicon die as the digital part. Consequently, the side channels are modulated by legitimate RF signal carriers and appear at the harmonics of the digital clock frequency. While initial works have only considered collecting leakage at these harmonics, late work has demonstrated that the leakage is also present at frequencies other than these harmonics. This result significantly increases the number of available frequencies to perform a screaming-channel attack, which can be convenient in an environment where multiple harmonics are polluted. This work studies how this diversity of frequencies carrying leakage can be used to improve attack performance. We first study how to combine multiple frequencies. Second, we demonstrate that frequency combination can improve attack performance and evaluate this improvement according to the performance of the combined frequencies. Finally, we demonstrate the interest of frequency combination in attacks at 15 and, for the first time to the best of our knowledge, at 30 meters. One last important observation is that this frequency combination divides by 2 the number of traces needed to reach a given attack performance.

Paperid: 2791, https://arxiv.org/pdf/2504.02133.pdf

Abstract:
Current cellular networking remains vulnerable to malicious fake base stations due to the lack of base station authentication mechanism or even a key to enable authentication. We design and build a base station certificate (certifying the base station's public key and location) and a multi-factor authentication (making use of the certificate and the information transmitted in the online radio control communications) to secure the authenticity and message integrity of the base station control communications. We advance beyond the state-of-the-art research by introducing greater authentication factors (and analyzing their individual security properties and benefits), and by using blockchain to deliver the base station digital certificate offline (enabling greater key length or security strength and computational or networking efficiency). We design the certificate construction, delivery, and the multi-factor authentication use on the user equipment. The user verification involves multiple factors verified through the ledger database, the location sensing (GPS in our implementation), and the cryptographic signature verification of the cellular control communication (SIB1 broadcasting). We analyze our scheme's security, performance, and the fit to the existing standardized networking protocols. Our work involves the implementation of building on X.509 certificate (adapted), smart contract-based blockchain, 5G-standardized RRC control communications, and software-defined radios. Our analyses show that our scheme effectively defends against more security threats and can enable stronger security, i.e., ECDSA with greater key lengths. Furthermore, our scheme enables computing and energy to be more than three times efficient than the previous research on the mobile user equipment.

Paperid: 2792, https://arxiv.org/pdf/2504.02095.pdf

Abstract:
Systems such as file backup services often use content-defined chunking (CDC) algorithms, especially those based on rolling hash techniques, to split files into chunks in a way that allows for data deduplication. These chunking algorithms often depend on per-user parameters in an attempt to avoid leaking information about the data being stored. We present attacks to extract these chunking parameters and discuss protocol-agnostic attacks and loss of security once the parameters are breached (including when these parameters are not setup at all, which is often available as an option). Our parameter-extraction attacks themselves are protocol-specific but their ideas are generalizable to many potential CDC schemes.

Paperid: 2793, https://arxiv.org/pdf/2504.02080.pdf

Abstract:
Large Language Models (LLMs) are increasingly popular, powering a wide range of applications. Their widespread use has sparked concerns, especially through jailbreak attacks that bypass safety measures to produce harmful content. In this paper, we present a comprehensive security analysis of large language models (LLMs), addressing critical research questions on the evolution and determinants of model safety. Specifically, we begin by identifying the most effective techniques for detecting jailbreak attacks. Next, we investigate whether newer versions of LLMs offer improved security compared to their predecessors. We also assess the impact of model size on overall security and explore the potential benefits of integrating multiple defense strategies to enhance model robustness. Our study evaluates both open-source models (e.g., LLaMA and Mistral) and closed-source systems (e.g., GPT-4) by employing four state-of-the-art attack techniques and assessing the efficacy of three new defensive approaches.

Paperid: 2794, https://arxiv.org/pdf/2504.01933.pdf

Abstract:
Deep neural networks are not resilient to parameter corruptions: even a single-bitwise error in their parameters in memory can cause an accuracy drop of over 10%, and in the worst cases, up to 99%. This susceptibility poses great challenges in deploying models on computing platforms, where adversaries can induce bit-flips through software or bitwise corruptions may occur naturally. Most prior work addresses this issue with hardware or system-level approaches, such as integrating additional hardware components to verify a model's integrity at inference. However, these methods have not been widely deployed as they require infrastructure or platform-wide modifications. In this paper, we propose a new approach to addressing this issue: training models to be more resilient to bitwise corruptions to their parameters. Our approach, Hessian-aware training, promotes models with $flatter$ loss surfaces. We show that, while there have been training methods, designed to improve generalization through Hessian-based approaches, they do not enhance resilience to parameter corruptions. In contrast, models trained with our method demonstrate increased resilience to parameter corruptions, particularly with a 20$-$50% reduction in the number of bits whose individual flipping leads to a 90$-$100% accuracy drop. Moreover, we show the synergy between ours and existing hardware and system-level defenses.

Paperid: 2795, https://arxiv.org/pdf/2504.01481.pdf

Abstract:
Protecting sensitive program content is a critical issue in various situations, ranging from legitimate use cases to unethical contexts. Obfuscation is one of the most used techniques to ensure such protection. Consequently, attackers must first detect and characterize obfuscation before launching any attack against it. This paper investigates the problem of function-level obfuscation detection using graph-based approaches, comparing algorithms, from elementary baselines to promising techniques like GNN (Graph Neural Networks), on different feature choices. We consider various obfuscation types and obfuscators, resulting in two complex datasets. Our findings demonstrate that GNNs need meaningful features that capture aspects of function semantics to outperform baselines. Our approach shows satisfactory results, especially in a challenging 11-class classification task and in a practical malware analysis example.

Paperid: 2796, https://arxiv.org/pdf/2504.01305.pdf

Abstract:
In today's rapidly evolving digital landscape, organisations face escalating cyber threats that can disrupt operations, compromise sensitive data, and inflict financial and reputational harm. A key reason for this lies in the organisations' lack of a clear understanding of their cybersecurity capabilities, leading to ineffective defences. To address this gap, Cybersecurity Capability Maturity Models (CCMMs) provide a systematic approach to assessing and enhancing an organisation's cybersecurity posture by focusing on capability maturity rather than merely implementing controls. However, their limitations, such as rigid structures, one-size-fits-all approach, complexity, gaps in security scope (i.e., technological, organisational, and human aspects) and lack of quantitative metrics, hinder their effectiveness. It makes implementing CCMMs in varying contexts challenging and results in fragmented, incomprehensive assessments. Therefore, we propose a novel Cybersecurity Capability Maturity Framework that is holistic, flexible, and measurable to provide organisations with a more relevant and impactful assessment to enhance their cybersecurity posture.

Paperid: 2797, https://arxiv.org/pdf/2503.23949.pdf

Abstract:
Biometric systems strive to balance security and usability. The use of multi-biometric systems combining multiple biometric modalities is usually recommended for high-security applications. However, the presentation of multiple biometric modalities can impair the user-friendliness of the overall system and might not be necessary in all cases. In this work, we present a simple but flexible approach to increase the privacy protection of homomorphically encrypted multi-biometric reference templates while enabling adaptation to security requirements at run-time: An adaptive multi-biometric fusion with fully homomorphic encryption (AMB-FHE). AMB-FHE is benchmarked against a bimodal biometric database consisting of the CASIA iris and MCYT fingerprint datasets using deep neural networks for feature extraction. Our contribution is easy to implement and increases the flexibility of biometric authentication while offering increased privacy protection through joint encryption of templates from multiple modalities.

Paperid: 2798, https://arxiv.org/pdf/2503.23627.pdf

Abstract:
We examine the security of a cloud storage service that makes very strong claims about the ``trustless'' nature of its security. We find that, although stored files are end-to-end encrypted, the encryption method allows for effective dictionary attacks by a malicious server when passwords only just meet the minimum length required. Furthermore, the file sharing function simply sends the decryption passwords to the server with no protection other than TLS.

Paperid: 2799, https://arxiv.org/pdf/2503.23424.pdf

Abstract:
Following the rapid increase in Artificial Intelligence (AI) capabilities in recent years, the AI community has voiced concerns regarding possible safety risks. To support decision-making on the safe use and development of AI systems, there is a growing need for high-quality evaluations of dangerous model capabilities. While several attempts to provide such evaluations have been made, a clear definition of what constitutes a "good evaluation" has yet to be agreed upon. In this practitioners' perspective paper, we present a set of best practices for safety evaluations, drawing on prior work in model evaluation and illustrated through cybersecurity examples. We first discuss the steps of the initial thought process, which connects threat modeling to evaluation design. Then, we provide the characteristics and parameters that make an evaluation useful. Finally, we address additional considerations as we move from building specific evaluations to building a full and comprehensive evaluation suite.

Paperid: 2800, https://arxiv.org/pdf/2503.23277.pdf

Abstract:
Satellite communication systems (SatCom) is a brand-new network that uses artificial Earth satellites as relay stations to provide communication services such as broadband Internet access to various users on land, sea, air and in space. It features wide coverage, relatively high transmission rates and strong anti-interference capabilities. Security authentication is of crucial significance for the stable operation and widespread application of satellite communication systems. It can effectively prevent unauthorized access, ensuring that only users and devices that pass security authentication can access the satellite network. It also ensures the confidentiality, integrity, and availability of data during transmission and storage, preventing data from being stolen, tampered with, or damaged. By means of literature research and comparative analysis, this paper carries out on a comprehensive survey towards the security authentication methods used by SatCom. This paper first summarizes the existing SatCom authentication methods as five categories, namely, those based on cryptography, Blockchain, satellite orbital information, the AKA protocol and physical hardware respectively. Subsequently, a comprehensive comparative analysis is carried out on the above-mentioned five categories of security authentication methods from four dimensions, i.e., security, implementation difficulty and cost, applicable scenarios and real-time performance, and the final comparison results are following obtained. Finally, prospects are made for several important future research directions of security authentication methods for SatCom, laying a well foundation for further carrying on the related research works.

Paperid: 2801, https://arxiv.org/pdf/2503.23238.pdf

Abstract:
At CRYPTO 2015, Kirchner and Fouque claimed that a carefully tuned variant of the Blum-Kalai-Wasserman (BKW) algorithm (JACM 2003) should solve the Learning with Errors problem (LWE) in slightly subexponential time for modulus $q=\mathrm{poly}(n)$ and narrow error distribution, when given enough LWE samples. Taking a modular view, one may regard BKW as a combination of Wagner's algorithm (CRYPTO 2002), run over the corresponding dual problem, and the Aharonov-Regev distinguisher (JACM 2005). Hence the subexponential Wagner step alone should be of interest for solving this dual problem - namely, the Short Integer Solution problem (SIS) - but this appears to be undocumented so far. We re-interpret this Wagner step as walking backward through a chain of projected lattices, zigzagging through some auxiliary superlattices. We further randomize the bucketing step using Gaussian randomized rounding to exploit the powerful discrete Gaussian machinery. This approach avoids sample amplification and turns Wagner's algorithm into an approximate discrete Gaussian sampler for $q$-ary lattices. For an SIS lattice with $n$ equations modulo $q$, this algorithm runs in subexponential time $\exp(O(n/\log \log n))$ to reach a Gaussian width parameter $s = q/\mathrm{polylog}(n)$ only requiring $m = n + Ï(n/\log \log n)$ many SIS variables. This directly provides a provable algorithm for solving the Short Integer Solution problem in the infinity norm ($\mathrm{SIS}^\infty$) for norm bounds $Î²= q/\mathrm{polylog}(n)$. This variant of SIS underlies the security of the NIST post-quantum cryptography standard Dilithium. Despite its subexponential complexity, Wagner's algorithm does not appear to threaten Dilithium's concrete security.

Paperid: 2802, https://arxiv.org/pdf/2503.22503.pdf

Abstract:
This paper evaluates the Audio Spectrogram Transformer (AST) architecture for synthesized speech detection, with focus on generalization across modern voice generation technologies. Using differentiated augmentation strategies, the model achieves 0.91% EER overall when tested against ElevenLabs, NotebookLM, and Minimax AI voice generators. Notably, after training with only 102 samples from a single technology, the model demonstrates strong cross-technology generalization, achieving 3.3% EER on completely unseen voice generators. This work establishes benchmarks for rapid adaptation to emerging synthesis technologies and provides evidence that transformer-based architectures can identify common artifacts across different neural voice synthesis methods, contributing to more robust speech verification systems.

Paperid: 2803, https://arxiv.org/pdf/2503.22156.pdf

Abstract:
Cryptocurrency is a novel exploration of a form of currency that proposes a decentralized electronic payment scheme based on blockchain technology and cryptographic theory. While cryptocurrency has the security characteristics of being distributed and tamper-proof, increasing market demand has led to a rise in malicious transactions and attacks, thereby exposing cryptocurrency to vulnerabilities, privacy issues, and security threats. Particularly concerning are the emerging types of attacks and threats, which have made securing cryptocurrency increasingly urgent. Therefore, this paper classifies existing cryptocurrency security threats and attacks into five fundamental categories based on the blockchain infrastructure and analyzes in detail the vulnerability principles exploited by each type of threat and attack. Additionally, the paper examines the attackers' logic and methods and successfully reproduces the vulnerabilities. Furthermore, the author summarizes the existing detection and defense solutions and evaluates them, all of which provide important references for ensuring the security of cryptocurrency. Finally, the paper discusses the future development trends of cryptocurrency, as well as the public challenges it may face.

Paperid: 2804, https://arxiv.org/pdf/2503.22010.pdf

Abstract:
Self-Sovereign Identity (SSI) is an emerging paradigm for authentication and credential presentation that aims to give users control over their data and prevent any kind of tracking by (even trusted) third parties. In the European Union, the EUDI Digital Identity wallet is about to become a concrete implementation of this paradigm. However, a debate is still ongoing, partially reflecting some aspects that are not yet consolidated in the scientific state of the art. Among these, an effective, efficient, and privacy-preserving implementation of verifiable credential revocation remains a subject of discussion. In this work-in-progress paper, we propose the basis of a novel method that customizes the use of anonymous hierarchical identity-based encryption to restrict the Verifier access to the temporal authorizations granted by the Holder. This way, the Issuer cannot track the Holder's credential presentations, and the Verifier cannot check revocation information beyond what is permitted by the Holder.

Paperid: 2805, https://arxiv.org/pdf/2503.21674.pdf

Abstract:
The widespread adoption of Internet of Things (IoT) devices has introduced significant cybersecurity challenges, particularly with the increasing frequency and sophistication of Distributed Denial of Service (DDoS) attacks. Traditional machine learning (ML) techniques often fall short in detecting such attacks due to the complexity of blended and evolving patterns. To address this, we propose a novel framework leveraging On-Device Large Language Models (ODLLMs) augmented with fine-tuning and knowledge base (KB) integration for intelligent IoT network attack detection. By implementing feature ranking techniques and constructing both long and short KBs tailored to model capacities, the proposed framework ensures efficient and accurate detection of DDoS attacks while overcoming computational and privacy limitations. Simulation results demonstrate that the optimized framework achieves superior accuracy across diverse attack types, especially when using compact models in edge computing environments. This work provides a scalable and secure solution for real-time IoT security, advancing the applicability of edge intelligence in cybersecurity.

Paperid: 2806, https://arxiv.org/pdf/2503.21440.pdf

Abstract:
The closure $\mathcal{M}_{m}^{\#}$ and the extension $\widehat{\mathcal{M}}_{m}$ of the Maiorana--McFarland class $\mathcal{M}_{m}$ in $m = 2n$ variables relative to the extended-affine equivalence and the bent function construction $f \oplus \mathrm{Ind}_{U}$ are considered, where $U$ is an affine subspace of $\mathbb{F}_{2}^{m}$ of dimension $m/2$. We obtain an explicit formula for $|\widehat{\mathcal{M}}_{m}|$ and an upper bound for $|\widehat{\mathcal{M}}_{m}^{\#}|$. Asymptotically tight bounds for $|\mathcal{M}_{m}^{\#}|$ are proved as well, for instance, $|\mathcal{M}_{8}^{\#}| \approx 2^{77.865}$. Metric properties of $\mathcal{M}_{m}$ and $\mathcal{M}_{m}^{\#}$ are also investigated. We find the number of all closest bent functions to the set $\mathcal{M}_{m}$ and provide an upper bound of the same number for $\mathcal{M}_{m}^{\#}$. The average number $E(\mathcal{M}_{m})$ of $m/2$-dimensional affine subspaces of $\mathbb{F}_{2}^{m}$ such that a function from $\mathcal{M}_{m}$ is affine on each of them is calculated. We obtain that similarly defined $E(\mathcal{M}_{m}^{\#})$ satisfies $E(\mathcal{M}_{m}^{\#}) < E(\mathcal{M}_{m})$ and $E(\mathcal{M}_{m}^{\#}) = E(\mathcal{M}_{m}) - o(1)$.

Paperid: 2807, https://arxiv.org/pdf/2503.20821.pdf

Abstract:
Pig-butchering scams have emerged as a complex form of fraud that combines elements of romance, investment fraud, and advanced social engineering tactics to systematically exploit victims. In this paper, we present the first qualitative analysis of pig-butchering scams, informed by in-depth semi-structured interviews with $N=26$ victims. We capture nuanced, first-hand accounts from victims, providing insight into the lifecycle of pig-butchering scams and the complex emotional and financial manipulation involved. We systematically analyze each phase of the scam, revealing that perpetrators employ tactics such as staged trust-building, fraudulent financial platforms, fabricated investment returns, and repeated high-pressure tactics, all designed to exploit victims' trust and financial resources over extended periods. Our findings reveal an organized scam lifecycle characterized by emotional manipulation, staged financial exploitation, and persistent re-engagement efforts that amplify victim losses. We also find complex psychological and financial impacts on victims, including heightened vulnerability to secondary scams. Finally, we propose actionable intervention points for social media and financial platforms to curb the prevalence of these scams and highlight the need for non-stigmatizing terminology to encourage victims to report and seek assistance.

Paperid: 2808, https://arxiv.org/pdf/2503.20079.pdf

Abstract:
Distributed systems widely adopt microservice architecture to handle growing complexity and scale. This approach breaks applications into independent, loosely coupled services. Kubernetes has become the de facto standard for managing microservices, and automating complex, multi-step workflows is a common requirement in Kubernetes. Argo Workflows is a Kubernetes-native engine for managing these workflows in an automated fashion. These workflows generate artifacts such as executables, logs, container images, and packages, which often require proper management through software supply chain security. However, Argo Workflows does not include built-in functionality for frameworks like Supply-chain Levels for Software Artifacts (SLSA), which is essential for ensuring artifact integrity, traceability, and security. This gap compels practitioners to rely on external tools to meet software supply chain security standards. In response, this paper proposes a Kubernetes-native controller built on top of existing open-source Argo Workflows to enhance artifact security. By generating cryptographic signing and provenance attestations, the controller enables Argo Workflows to comply with SLSA standards. We demonstrate that implementations can provide such cryptographic signing and provenance attestations for artifacts produced by the controller, allowing software artifacts built with Argo Workflows to adhere to SLSA requirements. The proposed validation model evaluates the proof of concept of the controller, including its ability to reconcile workflows, detect pods associated with workflow nodes, operate without disrupting existing operations, enforce integrity, and monitor software artifacts.

Paperid: 2809, https://arxiv.org/pdf/2503.19817.pdf

Abstract:
Neural image compression (NIC) has emerged as a promising alternative to classical compression techniques, offering improved compression ratios. Despite its progress towards standardization and practical deployment, there has been minimal exploration into it's robustness and security. This study reveals an unexpected vulnerability in NIC - bitstream collisions - where semantically different images produce identical compressed bitstreams. Utilizing a novel whitebox adversarial attack algorithm, this paper demonstrates that adding carefully crafted perturbations to semantically different images can cause their compressed bitstreams to collide exactly. The collision vulnerability poses a threat to the practical usability of NIC, particularly in security-critical applications. The cause of the collision is analyzed, and a simple yet effective mitigation method is presented.

Paperid: 2810, https://arxiv.org/pdf/2503.19768.pdf

Abstract:
Fermilab is transitioning authentication and authorization for grid operations to using bearer tokens based on the WLCG Common JWT (JSON Web Token) Profile. One of the functionalities that Fermilab experimenters rely on is the ability to automate batch job submission, which in turn depends on the ability to securely refresh and distribute the necessary credentials to experiment job submit points. Thus, with the transition to using tokens for grid operations, we needed to create a service that would obtain, refresh, and distribute tokens for experimenters' use. This service would avoid the need for experimenters to be experts in obtaining their own tokens and would better protect the most sensitive long-lived credentials. Further, the service needed to be widely scalable, as Fermilab hosts many experiments, each of which would need their own credentials. To address these issues, we created and deployed a Managed Tokens service. The service is written in Go, taking advantage of that language's native concurrency primitives to easily be able to scale operations as we onboard experiments. The service uses as its first credentials a set of kerberos keytabs, stored on the same secure machine that the Managed Tokens service runs on. These kerberos credentials allow the service to use htgettoken via condor_vault_storer to store vault tokens in the HTCondor credential managers (credds) that run on the batch system scheduler machines (HTCondor schedds); as well as downloading a local, shorter-lived copy of the vault token. The kerberos credentials are then also used to distribute copies of the locally-stored vault tokens to experiment submit points.

Paperid: 2811, https://arxiv.org/pdf/2503.19609.pdf

Abstract:
Researchers aim to build secure compilation chains enforcing that if there is no attack a source context can mount against a source program then there is also no attack an adversarial target context can mount against the compiled program. Proving that these compilation chains are secure is, however, challenging, and involves a non-trivial back-translation step: for any attack a target context mounts against the compiled program one has to exhibit a source context mounting the same attack against the source program. We describe a novel back-translation technique, which results in simpler proofs that can be more easily mechanized in a proof assistant. Given a finite set of finite trace prefixes, capturing the interaction recorded during an attack between a target context and the compiled program, we build a call-return tree that we back-translate into a source context producing the same trace prefixes. We use state in the generated source context to record the current location in the call-return tree. The back-translation is done in several small steps, each adding to the tree new information describing how the location should change depending on how the context regains control. To prove this back-translation correct we give semantics to every intermediate call-return tree language, using ghost state to store information and explicitly enforce execution invariants. We prove several small forward simulations, basically seeing the back-translation as a verified nanopass compiler. Thanks to this modular structure, we are able to mechanize this complex back-translation and its correctness proof in the Rocq prover without too much effort.

Paperid: 2812, https://arxiv.org/pdf/2503.19318.pdf

Abstract:
As Artificial Intelligence (AI) becomes increasingly integrated into microgrid control systems, the risk of malicious actors exploiting vulnerabilities in Machine Learning (ML) algorithms to disrupt power generation and distribution grows. Detection models to identify adversarial attacks need to meet the constraints of edge environments, where computational power and memory are often limited. To address this issue, we propose a novel strategy that optimizes detection models for Vehicle-to-Microgrid (V2M) edge environments without compromising performance against inference and evasion attacks. Our approach integrates model design and compression into a unified process and results in a highly compact detection model that maintains high accuracy. We evaluated our method against four benchmark evasion attacks-Fast Gradient Sign Method (FGSM), Basic Iterative Method (BIM), Carlini & Wagner method (C&W) and Conditional Generative Adversarial Network (CGAN) method-and two knowledge-based attacks, white-box and gray-box. Our optimized model reduces memory usage from 20MB to 1.3MB, inference time from 3.2 seconds to 0.9 seconds, and GPU utilization from 5% to 2.68%.

Paperid: 2813, https://arxiv.org/pdf/2503.19058.pdf

Abstract:
The continuous increase in malware samples, both in sophistication and number, presents many challenges for organizations and analysts, who must cope with thousands of new heterogeneous samples daily. This requires robust methods to quickly determine whether a file is malicious. Due to its speed and efficiency, static analysis is the first line of defense. In this work, we illustrate how the practical state-of-the-art methods used by antivirus solutions may fail to detect evident malware traces. The reason is that they highly depend on very strict signatures where minor deviations prevent them from detecting shellcodes that otherwise would immediately be flagged as malicious. Thus, our findings illustrate that malware authors may drastically decrease the detections by converting the code base to less-used programming languages. To this end, we study the features that such programming languages introduce in executables and the practical issues that arise for practitioners to detect malicious activity.

Paperid: 2814, https://arxiv.org/pdf/2503.18859.pdf

Abstract:
Encryption is crucial for securing sensitive data during transmission over networks. Various encryption techniques exist, such as AES, DES, and RC4, with AES being the most renowned algorithm. We proposed methodology that enables users to encrypt text messages for secure transmission over cellular networks. This approach utilizes the AES algorithm following the proposed protocols for encryption and decryption, ensuring fast and reliable data protection. This approach ensures secure text encryption and enables users to enter messages that are encrypted using a key at the sender's end and decrypted at the recipient's end, which is compatible with any Android device. SMS are encrypted with the AES algorithm, making them resistant to brute-force attempts. As SMS has become a popular form of communication, protecting personal data, email alerts, banking details, and transactions information. It addresses security concerns by encrypting messages using AES and cryptographic techniques, providing an effective solution for protecting sensitive data during SMS exchanges.

Paperid: 2815, https://arxiv.org/pdf/2503.18542.pdf

Abstract:
In todays landscape of increasing electronic crime, network forensics plays a pivotal role in digital investigations. It aids in understanding which systems to analyse and as a supplement to support evidence found through more traditional computer based investigations. However, the nature and functionality of the existing Network Forensic Analysis Tools (NFATs) fall short compared to File System Forensic Analysis Tools (FS FATs) in providing usable data. The analysis tends to focus upon IP addresses, which are not synonymous with user identities, a point of significant interest to investigators. This paper presents several experiments designed to create a novel NFAT approach that can identify users and understand how they are using network based applications whilst the traffic remains encrypted. The experiments build upon the prior art and investigate how effective this approach is in classifying users and their actions. Utilising an in-house dataset composed of 50 million packers, the experiments are formed of three incremental developments that assist in improving performance. Building upon the successful experiments, a proposed NFAT interface is presented to illustrate the ease at which investigators would be able to ask relevant questions of user interactions. The experiments profiled across 27 users, has yielded an average 93.3% True Positive Identification Rate (TPIR), with 41% of users experiencing 100% TPIR. Skype, Wikipedia and Hotmail services achieved a notably high level of recognition performance. The study has developed and evaluated an approach to analyse encrypted network traffic more effectively through the modelling of network traffic and to subsequently visualise these interactions through a novel network forensic analysis tool.

Paperid: 2816, https://arxiv.org/pdf/2503.17426.pdf

Abstract:
The evaluation of smart contract reputability is essential to foster trust in decentralized ecosystems. However, existing methods that rely solely on code analysis or transactional data, offer limited insight into evolving trustworthiness. We propose a multimodal data fusion framework that integrates code features with transactional data to enhance reputability prediction. Our framework initially focuses on AI-based code analysis, utilizing GAN-augmented opcode embeddings to address class imbalance, achieving 97.67% accuracy and a recall of 0.942 in detecting illicit contracts, surpassing traditional oversampling methods. This forms the crux of a reputability-centric fusion strategy, where combining code and transactional data improves recall by 7.25% over single-source models, demonstrating robust performance across validation sets. By providing a holistic view of smart contract behaviour, our approach enhances the model's ability to assess reputability, identify fraudulent activities, and predict anomalous patterns. These capabilities contribute to more accurate reputability assessments, proactive risk mitigation, and enhanced blockchain security.

Paperid: 2817, https://arxiv.org/pdf/2503.17219.pdf

Abstract:
This paper introduces a novel mathematical framework for analyzing cyber threat campaigns through fractal geometry. By conceptualizing hierarchical taxonomies (MITRE ATT&CK, DISARM) as snowflake-like structures with tactics, techniques, and sub-techniques forming concentric layers, we establish a rigorous method for campaign comparison using Hutchinson's Theorem and Hausdorff distance metrics. Evaluation results confirm that our fractal representation preserves hierarchical integrity while providing a dimensionality-based complexity assessment that correlates with campaign complexity. The proposed methodology bridges taxonomy-driven cyber threat analysis and computational geometry, providing analysts with both mathematical rigor and interpretable visualizations for addressing the growing complexity of adversarial operations across multiple threat domains.

Paperid: 2818, https://arxiv.org/pdf/2503.16847.pdf

Abstract:
Flow correlation attacks is an efficient network attacks, aiming to expose those who use anonymous network services, such as Tor. Conducting such attacks during the early stages of network communication is particularly critical for scenarios demanding rapid decision-making, such as cybercrime detection or financial fraud prevention. Although recent studies have made progress in flow correlation attacks techniques, research specifically addressing flow correlation with early network traffic flow remains limited. Moreover, due to factors such as model complexity, training costs, and real-time requirements, existing technologies cannot be directly applied to flow correlation with early network traffic flow. In this paper, we propose flow correlation attack with early network traffic, named Early-MFC, based on multi-view triplet networks. The proposed approach extracts multi-view traffic features from the payload at the transport layer and the Inter-Packet Delay. It then integrates multi-view flow information, converting the extracted features into shared embeddings. By leveraging techniques such as metric learning and contrastive learning, the method optimizes the embeddings space by ensuring that similar flows are mapped closer together while dissimilar flows are positioned farther apart. Finally, Bayesian decision theory is applied to determine flow correlation, enabling high-accuracy flow correlation with early network traffic flow. Furthermore, we investigate flow correlation attacks under extra-early network traffic flow conditions. To address this challenge, we propose Early-MFC+, which utilizes payload data to construct embedded feature representations, ensuring robust performance even with minimal packet availability.

Paperid: 2819, https://arxiv.org/pdf/2503.16828.pdf

Abstract:
Public key authenticated encryption with keyword search (PAEKS) represents a significant advancement of secure and searchable data sharing in public network systems, such as medical systems. It can effectively mitigate the risk of keyword guessing attacks (KGA), which is a critical issue in public key encryption with keyword search (PEKS). However, in scenarios with a large number of users, the enforced point-to-point access control necessitates that the data sender encrypt the same keyword using the public keys of multiple receivers to create indexes, while the data receiver also must generate trapdoors of size linear to senders in the system. The burden on users aiming for efficient data sharing is considerable, as the overheads increase linearly with the number of users. Furthermore, the majority of current PAEKS schemes lack expressive search functions, including conjunctions, disjunctions, or any monotone boolean formulas, which are prevalent in practical applications. To tackle the abovementioned challenges, we propose an efficient and expressive PAEKS scheme. In efficiency, one auxiliary server is integrated to assist users in generating indexes and trapdoors. Users encrypt with their respective private keys along with the public keys of the servers, facilitating secure and searchable data sharing while significantly minimizing overhead. Additionally, the LSSS is employed to implement expressive search, including monotone boolean queries. We also obfuscate the mapping relationship associated with the LSSS matrix to the keywords, thereby enhancing the privacy protection. Security analysis alongside theoretical and experimental evaluations of our scheme illustrates its practicality and efficiency in multi-user data sharing scenarios.

Paperid: 2820, https://arxiv.org/pdf/2503.16783.pdf

Abstract:
We present a formal analysis of quorum-based State Machine Replication (SMR) protocols in Proof-of-Stake (PoS) systems under a hybrid threat model comprising honest, Byzantine, and rational validators. Our analysis of traditional quorum-based protocols establishes two fundamental impossibility results: (1) in partially synchronous networks, no quorum-based protocol can achieve SMR when rational and Byzantine validators comprise more than $1/3$ of participants, and (2) in synchronous networks, SMR remains impossible when rational and Byzantine validators comprise $2/3$ or more of participants. To overcome these limitations, we propose two complementary solutions in our hybrid model. First, we introduce a protocol that enforces a bound on the volume of the total transacted amount that is finalized within any time window $Î$ and prove that this bound is necessary for secure SMR protocols in our model. Second, we present the \emph{strongest chain rule}, which enables efficient finalization of transactions when the majority of honest participants provably support the SMR execution. Through empirical analysis of Ethereum and Cosmos networks, we demonstrate that validator participation consistently exceeds the required ${5}/{6}$ threshold, establishing the practical feasibility of our solution in production PoS systems.

Paperid: 2821, https://arxiv.org/pdf/2503.16392.pdf

Abstract:
With AI-based software becoming widely available, the risk of exploiting its capabilities, such as high automation and complex pattern recognition, could significantly increase. An AI used offensively to attack non-AI assets is referred to as offensive AI. Current research explores how offensive AI can be utilized and how its usage can be classified. Additionally, methods for threat modeling are being developed for AI-based assets within organizations. However, there are gaps that need to be addressed. Firstly, there is a need to quantify the factors contributing to the AI threat. Secondly, there is a requirement to create threat models that analyze the risk of being attacked by AI for vulnerability assessment across all assets of an organization. This is particularly crucial and challenging in cloud environments, where sophisticated infrastructure and access control landscapes are prevalent. The ability to quantify and further analyze the threat posed by offensive AI enables analysts to rank vulnerabilities and prioritize the implementation of proactive countermeasures. To address these gaps, this paper introduces the Graph of Effort, an intuitive, flexible, and effective threat modeling method for analyzing the effort required to use offensive AI for vulnerability exploitation by an adversary. While the threat model is functional and provides valuable support, its design choices need further empirical validation in future work.

Paperid: 2822, https://arxiv.org/pdf/2503.16292.pdf

Abstract:
As technology increasingly integrates into farm settings, the food and agriculture sector has become vulnerable to cyberattacks. However, previous research has indicated that many farmers and food producers lack the cybersecurity education they require to identify and mitigate the growing number of threats and risks impacting the industry. This paper presents an ongoing research effort describing a cybersecurity initiative to educate various populations in the farming and agriculture community. The initiative proposes the development and delivery of a ten-module cybersecurity course, to create a more secure workforce, focusing on individuals who, in the past, have received minimal exposure to cybersecurity education initiatives.

Paperid: 2823, https://arxiv.org/pdf/2503.16283.pdf

Abstract:
As various technologies are integrated and implemented into the food and agricultural industry, it is increasingly important for stakeholders throughout the sector to identify and reduce cybersecurity vulnerabilities and risks associated with these technologies. However, numerous industry and government reports suggest that many farmers and agricultural equipment manufacturers do not fully understand the cyber threats posed by modern agricultural technologies, including CAN bus-driven farming equipment. This paper addresses this knowledge gap by attempting to quantify the cybersecurity risks associated with cyberattacks on farming equipment that utilize CAN bus technology. The contribution of this paper is twofold. First, it presents a hypothetical case study, using real-world data, to illustrate the specific and wider impacts of a cyberattack on a CAN bus-driven fertilizer applicator employed in row-crop farming. Second, it establishes a foundation for future research on quantifying cybersecurity risks related to agricultural machinery.

Paperid: 2824, https://arxiv.org/pdf/2503.16080.pdf

Abstract:
Homomorphic encryption is a cryptographic paradigm allowing to compute on encrypted data, opening a wide range of applications in privacy-preserving data manipulation, notably in AI. Many of those applications require significant linear algebra computations (matrix x vector products, and matrix x matrix products). This central role of linear algebra computations goes far beyond homomorphic algebra and applies to most areas of scientific computing. This high versatility led, over time, to the development of a set of highly optimized routines, specified in 1979 under the name BLAS (basic linear algebra subroutines). Motivated both by the applicative importance of homomorphic linear algebra and the access to highly efficient implementations of cleartext linear algebra able to draw the most out of available hardware, we explore the connections between CKKS-based homomorphic linear algebra and floating-point plaintext linear algebra. The CKKS homomorphic encryption system is the most natural choice in this setting, as it natively handles real numbers and offers a large SIMD parallelism. We provide reductions for matrix-vector products, vector-vector products for moderate-sized to large matrices to their plaintext equivalents. Combined with BLAS, we demonstrate that the efficiency loss between CKKS-based encrypted square matrix multiplication and double-precision floating-point square matrix multiplication is a mere 4-12 factor, depending on the precise situation.

Paperid: 2825, https://arxiv.org/pdf/2503.15964.pdf

Abstract:
The development of Decentralized Identities (DI) and Self-Sovereign Identities (SSI) has seen significant growth in recent years. This is accompanied by a numerous academic and commercial contributions to the development of principles, standards, and systems. While several comprehensive reviews have been produced, they predominantly focus on academic literature, with few considering grey literature to provide a holistic view of technological advancements. Furthermore, no existing surveys have thoroughly analyzed real-world deployments to understand the barriers to the widespread adoption of decentralized identity models. This paper addresses the gap by exploring both academic and grey literature and examining commercial and governmental initiatives, to present a comprehensive landscape of decentralized identity technologies and their adoption in real-world. Additionally, it identifies the practical challenges and limitations that slowdown the transition from centralized to decentralized identity management systems. By shifting the focus from purely technological constraints to real-world deployment issues, this survey identifies the underlying reasons preventing the adoption of decentralized identities despite their evident benefits to the data owner.

Paperid: 2826, https://arxiv.org/pdf/2503.15654.pdf

Abstract:
Centralized trust is ubiquitous in today's interconnected world, from computational resources to data storage and its underlying infrastructure. The monopolization of cloud computing resembles a feudalistic system, causing a loss of privacy and data ownership. Cloud Computing and the Internet in general face widely recognized challenges, such as (1) the centralization of trust in auxiliary systems (e.g., centralized cloud providers), (2) the seamless and permissionless interoperability of fragmented ecosystems and (2) the effectiveness, verifiability, and confidentiality of the computation. Acurast is a decentralized serverless cloud that addresses all these shortcomings, following the call for a global-scale cloud founded on the principles of the open-source movement. In Acurast, a purpose-built orchestrator, a reputation engine, and an attestation service are enshrined in the consensus layer. Developers can off-load their computations and verify executions cryptographically. Furthermore, Acurast offers a modular execution layer, taking advantage of secure hardware and trusted execution environments, removing the trust required in third parties, and reducing them to cryptographic hardness assumptions. With this modular architecture, Acurast serves as a decentralized and serverless cloud, allowing confidential and verifiable compute backed by the hardware of security and performance mobile devices.

Paperid: 2827, https://arxiv.org/pdf/2503.15547.pdf

Abstract:
Large Language Models (LLMs) are combined with tools to create powerful LLM agents that provide a wide range of services. Unlike traditional software, LLM agent's behavior is determined at runtime by natural language prompts from either user or tool's data. This flexibility enables a new computing paradigm with unlimited capabilities and programmability, but also introduces new security risks, vulnerable to privilege escalation attacks. Moreover, user prompts are prone to be interpreted in an insecure way by LLM agents, creating non-deterministic behaviors that can be exploited by attackers. To address these security risks, we propose Prompt Flow Integrity (PFI), a system security-oriented solution to prevent privilege escalation in LLM agents. Analyzing the architectural characteristics of LLM agents, PFI features three mitigation techniques -- i.e., agent isolation, secure untrusted data processing, and privilege escalation guardrails. Our evaluation result shows that PFI effectively mitigates privilege escalation attacks while successfully preserving the utility of LLM agents.

Paperid: 2828, https://arxiv.org/pdf/2503.15542.pdf

Abstract:
Identifying reputable Ethereum projects remains a critical challenge within the expanding blockchain ecosystem. The ability to distinguish between legitimate initiatives and potentially fraudulent schemes is non-trivial. This work presents a systematic approach that integrates multiple data sources with advanced analytics to evaluate credibility, transparency, and overall trustworthiness. The methodology applies machine learning techniques to analyse transaction histories on the Ethereum blockchain. The study classifies accounts based on a dataset comprising 2,179 entities linked to illicit activities and 3,977 associated with reputable projects. Using the LightGBM algorithm, the approach achieves an average accuracy of 0.984 and an average AUC of 0.999, validated through 10-fold cross-validation. Key influential factors include time differences between transactions and received_tnx. The proposed methodology provides a robust mechanism for identifying reputable Ethereum projects, fostering a more secure and transparent investment environment. By equipping stakeholders with data-driven insights, this research enables more informed decision-making, risk mitigation, and the promotion of legitimate blockchain initiatives. Furthermore, it lays the foundation for future advancements in trust assessment methodologies, contributing to the continued development and maturity of the Ethereum ecosystem.

Paperid: 2829, https://arxiv.org/pdf/2503.14709.pdf

Abstract:
We extend the framework of augmented distribution testing (Aliakbarpour, Indyk, Rubinfeld, and Silwal, NeurIPS 2024) to the differentially private setting. This captures scenarios where a data analyst must perform hypothesis testing tasks on sensitive data, but is able to leverage prior knowledge (public, but possibly erroneous or untrusted) about the data distribution. We design private algorithms in this augmented setting for three flagship distribution testing tasks, uniformity, identity, and closeness testing, whose sample complexity smoothly scales with the claimed quality of the auxiliary information. We complement our algorithms with information-theoretic lower bounds, showing that their sample complexity is optimal (up to logarithmic factors).

Paperid: 2830, https://arxiv.org/pdf/2503.14618.pdf

Abstract:
Distributed denial-of-service (DDoS) attacks remain a critical threat to Internet services, causing costly disruptions. While machine learning (ML) has shown promise in DDoS detection, current solutions struggle with multi-domain environments where attacks must be detected across heterogeneous networks and organizational boundaries. This limitation severely impacts the practical deployment of ML-based defenses in real-world settings. This paper introduces Anomaly-Flow, a novel framework that addresses this critical gap by combining Federated Learning (FL) with Generative Adversarial Networks (GANs) for privacy-preserving, multi-domain DDoS detection. Our proposal enables collaborative learning across diverse network domains while preserving data privacy through synthetic flow generation. Through extensive evaluation across three distinct network datasets, Anomaly-Flow achieves an average F1-score of $0.747$, outperforming baseline models. Importantly, our framework enables organizations to share attack detection capabilities without exposing sensitive network data, making it particularly valuable for critical infrastructure and privacy-sensitive sectors. Beyond immediate technical contributions, this work provides insights into the challenges and opportunities in multi-domain DDoS detection, establishing a foundation for future research in collaborative network defense systems. Our findings have important implications for academic research and industry practitioners working to deploy practical ML-based security solutions.

Paperid: 2831, https://arxiv.org/pdf/2503.14462.pdf

Abstract:
We propose a blockchain architecture in which mining requires a quantum computer. The consensus mechanism is based on proof of quantum work, a quantum-enhanced alternative to traditional proof of work that leverages quantum supremacy to make mining intractable for classical computers. We have refined the blockchain framework to incorporate the probabilistic nature of quantum mechanics, ensuring stability against sampling errors and hardware inaccuracies. To validate our approach, we implemented a prototype blockchain on four D-Wave(TM) quantum annealing processors geographically distributed within North America, demonstrating stable operation across hundreds of thousands of quantum hashing operations. Our experimental protocol follows the same approach used in the recent demonstration of quantum supremacy [King et al. Science 2025], ensuring that classical computers cannot efficiently perform the same computation task. By replacing classical machines with quantum systems for mining, it is possible to significantly reduce the energy consumption and environmental impact traditionally associated with blockchain mining while providing a quantum-safe layer of security. Beyond serving as a proof of concept for a meaningful application of quantum computing, this work highlights the potential for other near-term quantum computing applications using existing technology.

Paperid: 2832, https://arxiv.org/pdf/2503.13654.pdf

Abstract:
Large Language Models (LLMs) are widely used for automated code generation. Their reliance on infrequently updated pretraining data leaves them unaware of newly discovered vulnerabilities and evolving security standards, making them prone to producing insecure code. In contrast, developer communities on Stack Overflow (SO) provide an ever-evolving repository of knowledge, where security vulnerabilities are actively discussed and addressed through collective expertise. These community-driven insights remain largely untapped by LLMs. This paper introduces SOSecure, a Retrieval-Augmented Generation (RAG) system that leverages the collective security expertise found in SO discussions to improve the security of LLM-generated code. We build a security-focused knowledge base by extracting SO answers and comments that explicitly identify vulnerabilities. Unlike common uses of RAG, SOSecure triggers after code has been generated to find discussions that identify flaws in similar code. These are used in a prompt to an LLM to consider revising the code. Evaluation across three datasets (SALLM, LLMSecEval, and LMSys) show that SOSecure achieves strong fix rates of 71.7%, 91.3%, and 96.7% respectively, compared to prompting GPT-4 without relevant discussions (49.1%, 56.5%, and 37.5%), and outperforms multiple other baselines. SOSecure operates as a language-agnostic complement to existing LLMs, without requiring retraining or fine-tuning, making it easy to deploy. Our results underscore the importance of maintaining active developer forums, which have dropped substantially in usage with LLM adoptions.

Paperid: 2833, https://arxiv.org/pdf/2503.12801.pdf

Abstract:
Model memorization has implications for both the generalization capacity of machine learning models and the privacy of their training data. This paper investigates label memorization in binary classification models through two novel passive label inference attacks (BLIA). These attacks operate passively, relying solely on the outputs of pre-trained models, such as confidence scores and log-loss values, without interacting with or modifying the training process. By intentionally flipping 50% of the labels in controlled subsets, termed "canaries," we evaluate the extent of label memorization under two conditions: models trained without label differential privacy (Label-DP) and those trained with randomized response-based Label-DP. Despite the application of varying degrees of Label-DP, the proposed attacks consistently achieve success rates exceeding 50%, surpassing the baseline of random guessing and conclusively demonstrating that models memorize training labels, even when these labels are deliberately uncorrelated with the features.

Paperid: 2834, https://arxiv.org/pdf/2503.12497.pdf

Abstract:
Malicious users attempt to replicate commercial models functionally at low cost by training a clone model with query responses. It is challenging to timely prevent such model-stealing attacks to achieve strong protection and maintain utility. In this paper, we propose a novel non-parametric detector called Account-aware Distribution Discrepancy (ADD) to recognize queries from malicious users by leveraging account-wise local dependency. We formulate each class as a Multivariate Normal distribution (MVN) in the feature space and measure the malicious score as the sum of weighted class-wise distribution discrepancy. The ADD detector is combined with random-based prediction poisoning to yield a plug-and-play defense module named D-ADD for image classification models. Results of extensive experimental studies show that D-ADD achieves strong defense against different types of attacks with little interference in serving benign users for both soft and hard-label settings.

Paperid: 2835, https://arxiv.org/pdf/2503.12248.pdf

Abstract:
Side-channel vulnerabilities pose an increasing threat to cryptographically protected devices. Consequently, it is crucial to observe information leakages through physical parameters such as power consumption and electromagnetic (EM) radiation to reduce susceptibility during interactions with cryptographic functions. EM side-channel attacks are becoming more prevalent. PRESENT is a promising lightweight cryptographic algorithm expected to be incorporated into Internet-of-Things (IoT) devices in the future. This research investigates the EM side-channel robustness of PRESENT using a correlation attack model. This work extends our previous Correlation EM Analysis (CEMA) of PRESENT with improved results. The attack targets the Substitution box (S-box) and can retrieve 8 bytes of the 10-byte encryption key with a minimum of 256 EM waveforms. This paper presents the process of EM attack modelling, encompassing both simple and correlation attacks, followed by a critical analysis.

Paperid: 2836, https://arxiv.org/pdf/2503.11920.pdf

Abstract:
Recent smart grid advancements enable near-realtime reporting of electricity consumption, raising concerns about consumer privacy. Differential privacy (DP) has emerged as a viable privacy solution, where a calculated amount of noise is added to the data by a trusted third party, or individual users perturb their information locally, and only send the randomized data to an aggregator for analysis safeguarding users and aggregators privacy. However, the practical implementation of a Local DP-based (LDP) privacy model for smart grids has its own challenges. In this paper, we discuss the challenges of implementing an LDP-based model for smart grids. We compare existing LDP mechanisms in smart grids for privacy preservation of numerical data and discuss different methods for selecting privacy parameters in the existing literature, their limitations and the non-existence of an optimal method for selecting the privacy parameters. We also discuss the challenges of translating theoretical models of LDP into a practical setting for smart grids for different utility functions, the impact of the size of data set on privacy and accuracy, and vulnerability of LDP-based smart grids to manipulation attacks. Finally, we discuss future directions in research for better practical applications in LDP based models for smart grids.

Paperid: 2837, https://arxiv.org/pdf/2503.11841.pdf

Abstract:
Machine learning (ML) malware detectors rely heavily on crowd-sourced AntiVirus (AV) labels, with platforms like VirusTotal serving as a trusted source of malware annotations. But what if attackers could manipulate these labels to classify benign software as malicious? We introduce label spoofing attacks, a new threat that contaminates crowd-sourced datasets by embedding minimal and undetectable malicious patterns into benign samples. These patterns coerce AV engines into misclassifying legitimate files as harmful, enabling poisoning attacks against ML-based malware classifiers trained on those data. We demonstrate this scenario by developing AndroVenom, a methodology for polluting realistic data sources, causing consequent poisoning attacks against ML malware detectors. Experiments show that not only state-of-the-art feature extractors are unable to filter such injection, but also various ML models experience Denial of Service already with 1% poisoned samples. Additionally, attackers can flip decisions of specific unaltered benign samples by modifying only 0.015% of the training data, threatening their reputation and market share and being unable to be stopped by anomaly detectors on training data. We conclude our manuscript by raising the alarm on the trustworthiness of the training process based on AV annotations, requiring further investigation on how to produce proper labels for ML malware detectors.

Paperid: 2838, https://arxiv.org/pdf/2503.11508.pdf

Abstract:
In this paper, we investigate the utilization of the angle of arrival (AoA) as a feature for robust physical layer authentication (PLA). While most of the existing approaches to PLA focus on common features of the physical layer of communication channels, such as channel frequency response, channel impulse response or received signal strength, the use of AoA in this domain has not yet been studied in depth, particularly regarding the ability to thwart impersonation attacks. In this work, we demonstrate that an impersonation attack targeting AoA based PLA is only feasible under strict conditions on the attacker's location and hardware capabilities, which highlights the AoA's potential as a strong feature for PLA. We extend previous works considering a single-antenna attacker to the case of a multiple-antenna attacker, and we develop a theoretical characterization of the conditions in which a successful impersonation attack can be mounted. Furthermore, we leverage extensive simulations in support of theoretical analyses, to validate the robustness of AoA-based PLA.

Paperid: 2839, https://arxiv.org/pdf/2503.10793.pdf

Abstract:
As an emerging programming language, Rust has rapidly gained popularity and recognition among developers due to its strong emphasis on safety. It employs a unique ownership system and safe concurrency practices to ensure robust safety. Despite these safeguards, security in Rust still presents challenges. Since 2018, 442 Rust-related vulnerabilities have been reported in real-world applications. The limited availability of data has resulted in existing vulnerability detection tools performing poorly in real-world scenarios, often failing to adapt to new and complex vulnerabilities. This paper introduces HALURust, a novel framework that leverages hallucinations of large language models (LLMs) to detect vulnerabilities in real-world Rust scenarios. HALURust leverages LLMs' strength in natural language generation by transforming code into detailed vulnerability analysis reports. The key innovation lies in prompting the LLM to always assume the presence of a vulnerability. If the code sample is vulnerable, the LLM provides an accurate analysis; if not, it generates a hallucinated report. By fine-tuning LLMs on these hallucinations, HALURust can effectively distinguish between vulnerable and non-vulnerable code samples. HALURust was evaluated on a dataset of 81 real-world vulnerabilities, covering 447 functions and 18,691 lines of code across 54 applications. It outperformed existing methods, achieving an F1 score of 77.3%, with over 10% improvement. The hallucinated report-based fine-tuning improved detection by 20\% compared to traditional code-based fine-tuning. Additionally, HALURust effectively adapted to unseen vulnerabilities and other programming languages, demonstrating strong generalization capabilities.

Paperid: 2840, https://arxiv.org/pdf/2503.10255.pdf

Abstract:
Shor's and Grover's algorithms' efficiency and the advancement of quantum computers imply that the cryptography used until now to protect one's privacy is potentially vulnerable to retrospective decryption, also known as \emph{harvest now, decrypt later} attack in the near future. This dissertation proposes an overview of the cryptographic schemes used by Tor, highlighting the non-quantum-resistant ones and introducing theoretical performance assessment methods of a local Tor network. The measurement is divided into three phases. We will start with benchmarking a local Tor network simulation on constrained devices to isolate the time taken by classical cryptography processes. Secondly, the analysis incorporates existing benchmarks of quantum-secure algorithms and compares these performances on the devices. Lastly, the estimation of overhead is calculated by replacing the measured times of traditional cryptography with the times recorded for Post Quantum Cryptography (PQC) execution within the specified Tor environment. By focusing on the replaceable cryptographic components, using theoretical estimations, and leveraging existing benchmarks, valuable insights into the potential impact of PQC can be obtained without needing to implement it fully.

Paperid: 2842, https://arxiv.org/pdf/2503.10185.pdf

Abstract:
Following the publication of Bitcoin's arguably most famous attack, selfish mining, various works have introduced mechanisms to enhance blockchain systems' game theoretic resilience. Some reward mechanisms, like FruitChains, have been shown to be equilibria in theory. However, their guarantees assume non-realistic parameters and their performance degrades significantly in a practical deployment setting. In this work we introduce a reward allocation mechanism, called Proportional Splitting (PRS), which outperforms existing state of the art. We show that, for large enough parameters, PRS is an equilibrium, offering the same theoretical guarantees as the state of the art. In addition, for practical, realistically small, parameters, PRS outperforms all existing reward mechanisms across an array of metrics. We implement PRS on top of a variant of PoEM, a Proof-of-Work (PoW) protocol that enables a more accurate estimation of each party's mining power compared to e.g., Bitcoin. We then evaluate PRS both theoretically and in practice. On the theoretical side, we show that our protocol combined with PRS is an equilibrium and guarantees fairness, similar to FruitChains. In practice, we compare PRS with an array of existing reward mechanisms and show that, assuming an accurate estimation of the mining power distribution, it outperforms them across various well-established metrics. Finally, we realize this assumption by approximating the power distribution via low-work objects called "workshares" and quantify the tradeoff between the approximation's accuracy and storage overhead.

Paperid: 2843, https://arxiv.org/pdf/2503.09964.pdf

Abstract:
Large Multimodal Models (LMMs) are increasingly vulnerable to AI-generated extremist content, including photorealistic images and text, which can be used to bypass safety mechanisms and generate harmful outputs. However, existing datasets for evaluating LMM robustness offer limited exploration of extremist content, often lacking AI-generated images, diverse image generation models, and comprehensive coverage of historical events, which hinders a complete assessment of model vulnerabilities. To fill this gap, we introduce ExtremeAIGC, a benchmark dataset and evaluation framework designed to assess LMM vulnerabilities against such content. ExtremeAIGC simulates real-world events and malicious use cases by curating diverse text- and image-based examples crafted using state-of-the-art image generation techniques. Our study reveals alarming weaknesses in LMMs, demonstrating that even cutting-edge safety measures fail to prevent the generation of extremist material. We systematically quantify the success rates of various attack strategies, exposing critical gaps in current defenses and emphasizing the need for more robust mitigation strategies.

Paperid: 2844, https://arxiv.org/pdf/2503.09735.pdf

Abstract:
Adversarial examples are a major problem for machine learning models, leading to a continuous search for effective defenses. One promising direction is to leverage model explanations to better understand and defend against these attacks. We looked at AmI, a method proposed by a NeurIPS 2018 spotlight paper that uses model explanations to detect adversarial examples. Our study shows that while AmI is a promising idea, its performance is too dependent on specific settings (e.g., hyperparameter) and external factors such as the operating system and the deep learning framework used, and such drawbacks limit AmI's practical usage. Our findings highlight the need for more robust defense mechanisms that are effective under various conditions. In addition, we advocate for a comprehensive evaluation framework for defense techniques.

Paperid: 2845, https://arxiv.org/pdf/2503.09327.pdf

Abstract:
Blockchain technology has recently gained widespread popularity as a practical method of storing immutable data while preserving the privacy of users by anonymizing their real identities. This anonymization approach, however, significantly complicates the analysis of blockchain data. To address this problem, heuristic-based clustering algorithms as an effective way of linking all addresses controlled by the same entity have been presented in the literature. In this paper, considering the particular features of the Extended Unspent Transaction Outputs accounting model introduced by the Cardano blockchain, two new clustering heuristics are proposed for clustering the Cardano payment addresses. Applying these heuristics and employing the UnionFind algorithm, we efficiently cluster all the addresses that have appeared on the Cardano blockchain from September 2017 to January 2023, where each cluster represents a distinct entity. The results show that each medium-sized entity in the Cardano network owns and controls 9.67 payment addresses on average. The results also confirm that a power law distribution is fitted to the distribution of entity sizes recognized using our proposed heuristics.

Paperid: 2846, https://arxiv.org/pdf/2503.09317.pdf

Abstract:
Decentralized on-chain smart contracts enable trustless collaboration, yet their inherent data transparency and execution overhead hinder widespread adoption. Existing cryptographic approaches incur high computational costs and lack generality. Meanwhile, prior TEE-based solutions suffer from practical limitations, such as the inability to support inter-contract interactions, reliance on unbreakable TEEs, and compromised usability. We introduce RaceTEE, a practical and privacy-preserving off-chain execution architecture for smart contracts that leverages Trusted Execution Environments (TEEs). RaceTEE decouples transaction ordering (on-chain) from execution (off-chain), with computations performed competitively in TEEs, ensuring confidentiality and minimizing overhead. It further enhances practicality through three key improvements: supporting secure inter-contract interactions, providing a key rotation scheme that enforces forward and backward secrecy even in the event of TEE breaches, and enabling full compatibility with existing blockchains without altering the user interaction model. To validate its feasibility, we prototype RaceTEE using Intel SGX and Ethereum, demonstrating its applicability across various use cases and evaluating its performance.

Paperid: 2847, https://arxiv.org/pdf/2503.09068.pdf

Abstract:
To improve trust and transparency, it is crucial to be able to interpret the decisions of Deep Neural classifiers (DNNs). Instance-level examinations, such as attribution techniques, are commonly employed to interpret the model decisions. However, when interpreting misclassified decisions, human intervention may be required. Analyzing the attribu tions across each class within one instance can be particularly labor intensive and influenced by the bias of the human interpreter. In this paper, we present a novel framework to uncover the weakness of the classifier via counterfactual examples. A prober is introduced to learn the correctness of the classifier's decision in terms of binary code-hit or miss. It enables the creation of the counterfactual example concerning the prober's decision. We test the performance of our prober's misclassification detection and verify its effectiveness on the image classification benchmark datasets. Furthermore, by generating counterfactuals that penetrate the prober, we demonstrate that our framework effectively identifies vulnerabilities in the target classifier without relying on label information on the MNIST dataset.

Paperid: 2848, https://arxiv.org/pdf/2503.08973.pdf

Abstract:
Reducing the memory footprint of Machine Learning (ML) models, especially Deep Neural Networks (DNNs), is imperative to facilitate their deployment on resource-constrained edge devices. However, a notable drawback of DNN models lies in their susceptibility to adversarial attacks, wherein minor input perturbations can deceive them. A primary challenge revolves around the development of accurate, resilient, and compact DNN models suitable for deployment on resource-constrained edge devices. This paper presents the outcomes of a compact DNN model that exhibits resilience against both black-box and white-box adversarial attacks. This work has achieved this resilience through training with the QKeras quantization-aware training framework. The study explores the potential of QKeras and an adversarial robustness technique, Jacobian Regularization (JR), to co-optimize the DNN architecture through per-layer JR methodology. As a result, this paper has devised a DNN model employing this co-optimization strategy based on Stochastic Ternary Quantization (STQ). Its performance was compared against existing DNN models in the face of various white-box and black-box attacks. The experimental findings revealed that, the proposed DNN model had small footprint and on average, it exhibited better performance than Quanos and DS-CNN MLCommons/TinyML (MLC/T) benchmarks when challenged with white-box and black-box attacks, respectively, on the CIFAR-10 image and Google Speech Commands audio datasets.

Paperid: 2849, https://arxiv.org/pdf/2503.08717.pdf

Abstract:
The ability of tracing states of logistic transportations requires an efficient storage and retrieval of the state of logistic transportations and locations of logistic objects. However, the restriction of sharing states and locations of logistic objects across organizations from different countries makes it hard to deploy a centralized database for implementing the traceability in a cross-border logistic system. This paper proposes a semantic data model on Blockchain to represent a logistic process based on the Semantic Link Network model where each semantic link represents a logistic transportation of a logistic object between two parties. A state representation model is designed to represent the states of a logistic transportation with semantic links. It enables the locations of logistic objects to be derived from the link states. A mapping from the semantic links to the blockchain transactions is designed to enable schema of semantic links and states of semantic links to be published in blockchain transactions. To improve the efficiency of tracing a path of semantic links on blockchain platform, an algorithm is designed to build shortcuts along the path of semantic links to enable a query on the path of a logistic object to reach the target in logarithmic steps on the blockchain platform. A reward-penalty policy is designed to allow participants to confirm the state of links on blockchain. Analysis and simulation demonstrate the flexibility, effectiveness and the efficiency of Semantic Link Network on immutable blockchain for implementing logistic traceability.

Paperid: 2850, https://arxiv.org/pdf/2503.08607.pdf

Abstract:
As hyperconnected devices and decentralized data architectures expand, securing IoT transactions becomes increasingly challenging. Blockchain offers a promising solution, but its effectiveness relies on the underlying consensus algorithm. Traditional mechanisms like PoW and PoS are often impractical for resource-constrained IoT environments. To address these limitations, this work introduces a fair and lightweight hybrid consensus algorithm tailored for IoT. The proposed approach minimizes resource demands on the nodes while ensuring a secure and fair agreement process. Specifically, it leverages a distributed lottery mechanism to fairly propose blocks without requiring specialized hardware. In addition, a reputation-based block voting mechanism is incorporated to enhance trust and establish finality. Finally, experimental evaluation was conducted to validate the key features of the consensus algorithm.

Paperid: 2851, https://arxiv.org/pdf/2503.08256.pdf

Abstract:
Confidential computing in the public cloud intends to safeguard workload privacy while outsourcing infrastructure management to a cloud provider. This is achieved by executing customer workloads within so called Trusted Execution Environments (TEEs), such as Confidential Virtual Machines (CVMs), which protect them from unauthorized access by cloud administrators and privileged system software. At the core of confidential computing lies remote attestation -- a mechanism that enables workload owners to verify the initial state of their workload and furthermore authenticate the underlying hardware. hile this represents a significant advancement in cloud security, this SoK critically examines the confidential computing offerings of market-leading cloud providers to assess whether they genuinely adhere to its core principles. We develop a taxonomy based on carefully selected criteria to systematically evaluate these offerings, enabling us to analyse the components responsible for remote attestation, the evidence provided at each stage, the extent of cloud provider influence and whether this undermines the threat model of confidential computing. Specifically, we investigate how CVMs are deployed in the public cloud infrastructures, the extent to which customers can request and verify attestation evidence, and their ability to define and enforce configuration and attestation requirements. This analysis provides insight into whether confidential computing guarantees -- namely confidentiality and integrity -- are genuinely upheld. Our findings reveal that all major cloud providers retain control over critical parts of the trusted software stack and, in some cases, intervene in the standard remote attestation process. This directly contradicts their claims of delivering confidential computing, as the model fundamentally excludes the cloud provider from the set of trusted entities.

Paperid: 2852, https://arxiv.org/pdf/2503.08085.pdf

Abstract:
Despite recent advancements in federated learning (FL), the integration of generative models into FL has been limited due to challenges such as high communication costs and unstable training in heterogeneous data environments. To address these issues, we propose PRISM, a FL framework tailored for generative models that ensures (i) stable performance in heterogeneous data distributions and (ii) resource efficiency in terms of communication cost and final model size. The key of our method is to search for an optimal stochastic binary mask for a random network rather than updating the model weights, identifying a sparse subnetwork with high generative performance; i.e., a ``strong lottery ticket''. By communicating binary masks in a stochastic manner, PRISM minimizes communication overhead. This approach, combined with the utilization of maximum mean discrepancy (MMD) loss and a mask-aware dynamic moving average aggregation method (MADA) on the server side, facilitates stable and strong generative capabilities by mitigating local divergence in FL scenarios. Moreover, thanks to its sparsifying characteristic, PRISM yields a lightweight model without extra pruning or quantization, making it ideal for environments such as edge devices. Experiments on MNIST, FMNIST, CelebA, and CIFAR10 demonstrate that PRISM outperforms existing methods, while maintaining privacy with minimal communication costs. PRISM is the first to successfully generate images under challenging non-IID and privacy-preserving FL environments on complex datasets, where previous methods have struggled.

Paperid: 2853, https://arxiv.org/pdf/2503.05839.pdf

Abstract:
The automotive industry is increasingly reliant on software to manage complex vehicle functionalities, making efficient and secure firmware updates essential. Traditional firmware update methods, requiring physical connections through On-Board Diagnostics (OBD) ports, are inconvenient, costly, and time-consuming. Firmware Over-the-Air (FOTA) technology offers a revolutionary solution by enabling wireless updates, reducing operational costs, and enhancing the user experience. This project aims to design and implement an advanced FOTA system tailored for modern vehicles, incorporating the AUTOSAR architecture for scalability and standardization, and utilizing delta updating to minimize firmware update sizes, thereby improving bandwidth efficiency and reducing flashing times. To ensure security, the system integrates the UDS 0x27 protocol for authentication and data integrity during the update process. Communication between Electronic Control Units (ECUs) is achieved using the CAN protocol, while the ESP8266 module and the master ECU communicate via SPI for data transfer. The system's architecture includes key components such as a bootloader, boot manager, and bootloader updater to facilitate seamless firmware updates. The functionality of the system is demonstrated through two applications: a blinking LED and a Lane Keeping Assist (LKA) system, showcasing its versatility in handling critical automotive features. This project represents a significant step forward in automotive technology, offering a user-centric, efficient, and secure solution for automotive firmware management.

Paperid: 2854, https://arxiv.org/pdf/2503.05045.pdf

Abstract:
We propose a semi-quantum conference key agreement (SQCKA) protocol that leverages on GHZ states. We provide a comprehensive security analysis for our protocol that does not rely on a trusted mediator party. We present information-theoretic security proof, addressing collective attacks within the asymptotic limit of infinitely many rounds. This assumption is practical, as participants can monitor and abort the protocol if deviations from expected noise patterns occur. This advancement enhances the feasibility of SQCKA protocols for real-world applications, ensuring strong security without complex network topologies or third-party trust.

Paperid: 2855, https://arxiv.org/pdf/2503.04549.pdf

Abstract:
Zero-Knowledge Succinct Non-Interactive Argument of Knowledge (zk-SNARK) schemes have gained significant adoption in privacy-preserving applications, decentralized systems (e.g., blockchain), and verifiable computation due to their efficiency. However, the most efficient zk-SNARKs often rely on a one-time trusted setup to generate a public parameter, often known as the ``Powers of Tau" (PoT) string. The leakage of the secret parameter, $Ï$, in the string would allow attackers to generate false proofs, compromising the soundness of all zk-SNARK systems built on it. Prior proposals for decentralized setup ceremonies have utilized blockchain-based smart contracts to allow any party to contribute randomness to $Ï$ while also preventing censorship of contributions. For a PoT string of $d$-degree generated by the randomness of $m$ contributors, these solutions required a total of $O(md)$ on-chain operations (i.e., in terms of both storage and cryptographic operations). These operations primarily consisted of costly group operations, particularly scalar multiplication on pairing curves, which discouraged participation and limited the impact of decentralization In this work, we present Lite-PoT, which includes two key protocols designed to reduce participation costs: \emph{(i)} a fraud-proof protocol to reduce the number of expensive on-chain cryptographic group operations to $O(1)$ per contributor. Our experimental results show that (with one transaction per update) our protocol enables decentralized ceremonies for PoT strings up to a $2^{15}$ degree, an $\approx 16x$ improvement over existing on-chain solutions; \emph{(ii)} a proof aggregation technique that batches $m$ randomness contributions into one on-chain update with only $O(d)$ on-chain operations, independent of $m$. This significantly reduces the monetary cost of on-chain updates by $m$-fold via amortization.

Paperid: 2856, https://arxiv.org/pdf/2503.04054.pdf

Abstract:
Federated Learning (FL) is the standard protocol for collaborative learning. In FL, multiple workers jointly train a shared model. They exchange model updates calculated on their data, while keeping the raw data itself local. Since workers naturally form groups based on common interests and privacy policies, we are motivated to extend standard FL to reflect a setting with multiple, potentially overlapping groups. In this setup where workers can belong and contribute to more than one group at a time, complexities arise in understanding privacy leakage and in adhering to privacy policies. To address the challenges, we propose differential private overlapping grouped learning (DPOGL), a novel method to implement privacy guarantees within overlapping groups. Under the honest-but-curious threat model, we derive novel privacy guarantees between arbitrary pairs of workers. These privacy guarantees describe and quantify two key effects of privacy leakage in DP-OGL: propagation delay, i.e., the fact that information from one group will leak to other groups only with temporal offset through the common workers and information degradation, i.e., the fact that noise addition over model updates limits information leakage between workers. Our experiments show that applying DP-OGL enhances utility while maintaining strong privacy compared to standard FL setups.

Paperid: 2857, https://arxiv.org/pdf/2503.03980.pdf

Abstract:
The USB protocol has become a ubiquitous standard for connecting peripherals to computers, making its security a critical concern. A recent research study demonstrated the potential to exploit weaknesses in well-established protocols, such as PCIe, and created a side-channel for leaking sensitive information by leveraging congestion within shared interfaces. Drawing inspiration from that, this project introduces an innovative approach to USB side-channel attacks via congestion. We evaluated the susceptibility of USB devices and hubs to remote profiling and side-channel attacks, identified potential weaknesses within the USB standard, and highlighted the critical need for heightened security and privacy in USB technology. Our findings discover vulnerabilities within the USB standard, which are difficult to effectively mitigate and underscore the need for enhanced security measures to protect user privacy in an era increasingly dependent on USB-connected devices.

Paperid: 2858, https://arxiv.org/pdf/2503.03974.pdf

Abstract:
Voter registration systems are a critical - and surprisingly understudied - element of most high-stakes elections. Despite a history of targeting by adversaries, relatively little academic work has been done to increase visibility into how voter registration systems keep voters' data secure, accurate, and up to date. Enhancing transparency and verifiability could help election officials and the public detect and mitigate risks to this essential component of electoral processes worldwide. This work introduces cryptographic verifiability for voter registration systems. Based on consultation with diverse expert stakeholders that support elections systems, we precisely define the requirements for cryptographic verifiability in voter registration and systematize the practical challenges that must be overcome for near-term deployment. We then introduce VRLog, the first system to bring strong verifiability to voter registration. VRLog enables election officials to provide a transparent log that (1) allows voters to verify that their registration data has not been tampered with and (2) allows the public to monitor update patterns and database consistency. We also introduce VRLog$^x$, an enhancement to VRLog that offers cryptographic privacy to voter deduplication between jurisdictions - a common maintenance task currently performed in plaintext or using trusted third parties. Our designs rely on standard, efficient cryptographic primitives, and are backward compatible with existing voter registration systems. Finally, we provide an open-source implementation of VRLog and benchmarks to demonstrate that the system is practical - capable of running on low-cost commodity hardware and scaling to support databases the size of the largest U.S. state voter registration systems.

Paperid: 2859, https://arxiv.org/pdf/2503.03684.pdf

Abstract:
This paper develops a comprehensive framework to address three critical trustworthy challenges in federated learning (FL): robustness against Byzantine attacks, fairness, and privacy preservation. To improve the system's defense against Byzantine attacks that send malicious information to bias the system's performance, we develop a Two-sided Norm Based Screening (TNBS) mechanism, which allows the central server to crop the gradients that have the l lowest norms and h highest norms. TNBS functions as a screening tool to filter out potential malicious participants whose gradients are far from the honest ones. To promote egalitarian fairness, we adopt the q-fair federated learning (q-FFL). Furthermore, we adopt a differential privacy-based scheme to prevent raw data at local clients from being inferred by curious parties. Convergence guarantees are provided for the proposed framework under different scenarios. Experimental results on real datasets demonstrate that the proposed framework effectively improves robustness and fairness while managing the trade-off between privacy and accuracy. This work appears to be the first study that experimentally and theoretically addresses fairness, privacy, and robustness in trustworthy FL.

Paperid: 2860, https://arxiv.org/pdf/2503.03652.pdf

Abstract:
The use of language models as remote services requires transmitting private information to external providers, raising significant privacy concerns. This process not only risks exposing sensitive data to untrusted service providers but also leaves it vulnerable to interception by eavesdroppers. Existing privacy-preserving methods for natural language processing (NLP) interactions primarily rely on semantic similarity, overlooking the role of contextual information. In this work, we introduce dchi-stencil, a novel token-level privacy-preserving mechanism that integrates contextual and semantic information while ensuring strong privacy guarantees under the dchi differential privacy framework, achieving 2epsilon-dchi-privacy. By incorporating both semantic and contextual nuances, dchi-stencil achieves a robust balance between privacy and utility. We evaluate dchi-stencil using state-of-the-art language models and diverse datasets, achieving comparable and even better trade-off between utility and privacy compared to existing methods. This work highlights the potential of dchi-stencil to set a new standard for privacy-preserving NLP in modern, high-risk applications.

Paperid: 2861, https://arxiv.org/pdf/2503.03494.pdf

Abstract:
A computing device typically identifies itself by exhibiting unique measurable behavior or by proving its knowledge of a secret. In both cases, the identifying device must reveal information to a verifier. Considerable research has focused on protecting identifying entities (provers) and reducing the amount of leaked data. However, little has been done to conceal the fact that the verification occurred. We show how this problem naturally arises in the context of digital emblems, which were recently proposed by the International Committee of the Red Cross to protect digital resources during cyber-conflicts. To address this new and important open problem, we define a new primitive, called an Oblivious Digital Token (ODT) that can be verified obliviously. Verifiers can use this procedure to check whether a device has an ODT without revealing to any other parties (including the device itself) that this check occurred. We demonstrate the feasibility of ODTs and present a concrete construction that provably meets the ODT security requirements, even if the prover device's software is fully compromised. We also implement a prototype of the proposed construction and evaluate its performance, thereby confirming its practicality.

Paperid: 2862, https://arxiv.org/pdf/2503.02986.pdf

Abstract:
Adversarial attacks remain a significant threat that can jeopardize the integrity of Machine Learning (ML) models. In particular, query-based black-box attacks can generate malicious noise without having access to the victim model's architecture, making them practical in real-world contexts. The community has proposed several defenses against adversarial attacks, only to be broken by more advanced and adaptive attack strategies. In this paper, we propose a framework that detects if an adversarial noise instance is being generated. Unlike existing stateful defenses that detect adversarial noise generation by monitoring the input space, our approach learns adversarial patterns in the input update similarity space. In fact, we propose to observe a new metric called Delta Similarity (DS), which we show it captures more efficiently the adversarial behavior. We evaluate our approach against 8 state-of-the-art attacks, including adaptive attacks, where the adversary is aware of the defense and tries to evade detection. We find that our approach is significantly more robust than existing defenses both in terms of specificity and sensitivity.

Paperid: 2863, https://arxiv.org/pdf/2503.02702.pdf

Abstract:
Internal threat detection (IDT) aims to address security threats within organizations or enterprises by identifying potential or already occurring malicious threats within vast amounts of logs. Although organizations or enterprises have dedicated personnel responsible for reviewing these logs, it is impossible to manually examine all logs entirely.In response to the vast number of logs, we propose a system called RedChronos, which is a Large Language Model-Based Log Analysis System. This system incorporates innovative improvements over previous research by employing Query-Aware Weighted Voting and a Semantic Expansion-based Genetic Algorithm with LLM-driven Mutations. On the public datasets CERT 4.2 and 5.2, RedChronos outperforms or matches existing approaches in terms of accuracy, precision, and detection rate. Moreover, RedChronos reduces the need for manual intervention in security log reviews by approximately 90% in the Xiaohongshu Security Operation Center. Therefore, our RedChronos system demonstrates exceptional performance in handling IDT tasks, providing innovative solutions for these challenges. We believe that future research can continue to enhance the system's performance in IDT tasks while also reducing the response time to internal risk events.

Paperid: 2864, https://arxiv.org/pdf/2503.02499.pdf

Abstract:
CONTEXT. Attack treesare a recommended threat modeling tool, but there is no established method to compare them. OBJECTIVE. We aim to establish a method to compare "real" attack trees, based on both the structure of the tree itself and the meaning of the node labels. METHOD. We define four methods of comparison (three novel and one established) and compare them to a dataset of attack trees created from a study run on students (n = 39). These attack trees all follow from the same scenario, but have slightly different labels. RESULTS. We find that applying semantic similarity as a means of comparing node labels is a valid approach. Further, we find that treeedit distance (established) and radical distance (novel) are themost promising methods of comparison in most circumstances. CONCLUSION. We show that these two methods are valid as means of comparing attack trees, and suggest a novel technique for using semantic similarity to compare node labels. We further suggest that these methods can be used to compare attack trees in a real-world scenario, and that they can be used to identify similar attack trees.

Paperid: 2865, https://arxiv.org/pdf/2503.02455.pdf

Abstract:
Privacy preservation in Internet of Things (IoT) systems requires the use of privacy-enhancing technologies (PETs) built from innovative technologies such as cryptography and artificial intelligence (AI) to create techniques called privacy preservation techniques (PPTs). These PPTs achieve various privacy goals and address different privacy concerns by mitigating potential privacy threats within IoT systems. This study carried out a scoping review of different types of PPTs used in previous research works on IoT systems between 2010 and early 2023 to further explore the advantages of privacy preservation in these systems. This scoping review looks at privacy goals, possible technologies used for building PET, the integration of PPTs into the computing layer of the IoT architecture, different IoT applications in which PPTs are deployed, and the different privacy types addressed by these techniques within IoT systems. Key findings, such as the prominent privacy goal and privacy type in IoT, are discussed in this survey, along with identified research gaps that could inform future endeavors in privacy research and benefit the privacy research community and other stakeholders in IoT systems.

Paperid: 2866, https://arxiv.org/pdf/2503.02065.pdf

Abstract:
Explainable AI (XAI) holds significant promise for enhancing the transparency and trustworthiness of AI-driven threat detection in Security Operations Centers (SOCs). However, identifying the appropriate level and format of explanation, particularly in environments that demand rapid decision-making under high-stakes conditions, remains a complex and underexplored challenge. To address this gap, we conducted a three-month mixed-methods study combining an online survey (N1=248) with in-depth interviews (N2=24) to examine (1) how SOC analysts conceptualize AI-generated explanations and (2) which types of explanations are perceived as actionable and trustworthy across different analyst roles. Our findings reveal that participants were consistently willing to accept XAI outputs, even in cases of lower predictive accuracy, when explanations were perceived as relevant and evidence-backed. Analysts repeatedly emphasized the importance of understanding the rationale behind AI decisions, expressing a strong preference for contextual depth over a mere presentation of outcomes on dashboards. Building on these insights, this study re-evaluates current explanation methods within security contexts and demonstrates that role-aware, context-rich XAI designs aligned with SOC workflows can substantially improve practical utility. Such tailored explainability enhances analyst comprehension, increases triage efficiency, and supports more confident responses to evolving threats.

Paperid: 2867, https://arxiv.org/pdf/2503.02017.pdf

Abstract:
The ever growing Internet of Things (IoT) connections drive a new type of organization, the Intelligent Enterprise. In intelligent enterprises, machine learning based models are adopted to extract insights from data. Due to the efficiency and privacy challenges of these traditional models, a new federated learning (FL) paradigm has emerged. In FL, multiple enterprises can jointly train a model to update a final model. However, firstly, FL trained models usually perform worse than centralized models, especially when enterprises training data is non-IID (Independent and Identically Distributed). Second, due to the centrality of FL and the untrustworthiness of local enterprises, traditional FL solutions are vulnerable to poisoning and inference attacks and violate privacy. Thirdly, the continuous transfer of parameters between enterprises and servers increases communication costs. To this end, the FedAnil+ model is proposed, a novel, lightweight, and secure Federated Deep Learning Model that includes three main phases. In the first phase, the goal is to solve the data type distribution skew challenge. Addressing privacy concerns against poisoning and inference attacks is covered in the second phase. Finally, to alleviate the communication overhead, a novel compression approach is proposed that significantly reduces the size of the updates. The experiment results validate that FedAnil+ is secure against inference and poisoning attacks with better accuracy. In addition, it shows improvements over existing approaches in terms of model accuracy (13%, 16%, and 26%), communication cost (17%, 21%, and 25%), and computation cost (7%, 9%, and 11%).

Paperid: 2868, https://arxiv.org/pdf/2503.00404.pdf

Abstract:
We introduce SecRef*, a secure compilation framework protecting stateful programs verified in F* against linked unverified code, with which the program dynamically shares ML-style mutable references. To ease program verification in this setting, we propose a way of tracking which references are shareable with the unverified code, and which ones are not shareable and whose contents are thus guaranteed to be unchanged after calling into unverified code. This universal property of non-shareable references is exposed in the interface on which the verified program can rely when calling into unverified code. The remaining refinement types and pre- and post-conditions that the verified code expects from the unverified code are converted into dynamic checks about the shared references by using higher-order contracts. We prove formally in F* that this strategy ensures sound and secure interoperability with unverified code. Since SecRef* is built on top of the Monotonic State effect of F*, these proofs rely on the first monadic representation for this effect, which is a contribution of our work that can be of independent interest. Finally, we use SecRef* to build a simple cooperative multi-threading scheduler that is verified and that securely interacts with unverified threads.

Paperid: 2869, https://arxiv.org/pdf/2503.00145.pdf

Abstract:
In recent years, several hardware-based countermeasures proposed to mitigate Spectre attacks have been shown to be insecure. To enable the development of effective secure speculation countermeasures, we need easy-to-use tools that can automatically test their security guarantees early-on in the design phase to facilitate rapid prototyping. This paper develops AMuLeT, the first tool capable of testing secure speculation countermeasures for speculative leakage early in their design phase in simulators. Our key idea is to leverage model-based relational testing tools that can detect speculative leaks in commercial CPUs, and apply them to micro-architectural simulators to test secure speculation defenses. We identify and overcome several challenges, including designing an expressive yet realistic attacker observer model in a simulator, overcoming the slow simulation speed, and searching the vast micro-architectural state space for potential vulnerabilities. AMuLeT speeds up test throughput by more than 10x compared to a naive design and uses techniques to amplify vulnerabilities to uncover them within a limited test budget. Using AMuLeT, we launch for the first time, a systematic, large-scale testing campaign of four secure speculation countermeasures from 2018 to 2024--InvisiSpec, CleanupSpec, STT, and SpecLFB--and uncover 3 known and 6 unknown bugs and vulnerabilities, within 3 hours of testing. We also show for the first time that the open-source implementation of SpecLFB is insecure.

Paperid: 2870, https://arxiv.org/pdf/2502.20670.pdf

Abstract:
We propose a mechanism embedded into the foundational infrastructure of a blockchain network, designed to improve the utility of idle network resources, whilst enhancing market microstructure efficiency during block production by leveraging both network-owned and external capital. By systematically seeking to use idle network resources for internally capture arbitrageable inefficiencies, the mechanism mitigates extractable value leakage, reduces execution frictions, and improves price formation across venues. This framework optimises resource allocation by incentivising an ordered set of transactions to be identified and automatically executed at the end of each block, redirecting any realised arbitrage income - to marketplaces operating on the host blockchain network (and other stakeholders), which may have otherwise been extracted as rent by external actors. Crucially, this process operates without introducing additional inventory risk, ensuring that the network remains a neutral facilitator of price discovery. While the systematic framework governing the distribution of these internally captured returns is beyond the scope of this work, reinvesting them to support the ecosystem deployed on the host blockchain network is envisioned to endogenously enhance liquidity, strengthen transactional efficiency, and promote the organic adoption of the blockchain for end users. This mechanism is designed specifically for Supra's blockchain and seeks to maximally utilise its highly efficient automation framework to enhance the blockchain network's efficiency.

Paperid: 2871, https://arxiv.org/pdf/2502.20403.pdf

Abstract:
Adversarial robustness in quantum classifiers is a critical area of study, providing insights into their performance compared to classical models and uncovering potential advantages inherent to quantum machine learning. In the NISQ era of quantum computing, circuit cutting is a notable technique for simulating circuits that exceed the qubit limitations of current devices, enabling the distribution of a quantum circuit's execution across multiple quantum processing units through classical communication. We examine how partitioning quantum classifiers through circuit cutting increase their susceptibility to adversarial attacks, establishing a link between attacking the state preparation channels in wire cutting and implementing adversarial gates within intermediate layers of a quantum classifier. We then proceed to study the latter problem from both a theoretical and experimental perspective.

Paperid: 2872, https://arxiv.org/pdf/2502.18697.pdf

Abstract:
The widespread adoption of Electric Vehicles (EVs) poses critical challenges for energy providers, particularly in predicting charging time (temporal prediction), ensuring user privacy, and managing resources efficiently in mobility-driven networks. This paper introduces the Hierarchical Federated Learning Transformer Network (H-FLTN) framework to address these challenges. H-FLTN employs a three-tier hierarchical architecture comprising EVs, community Distributed Energy Resource Management Systems (DERMS), and the Energy Provider Data Centre (EPDC) to enable accurate spatio-temporal predictions of EV charging needs while preserving privacy. Temporal prediction is enhanced using Transformer-based learning, capturing complex dependencies in charging behavior. Privacy is ensured through Secure Aggregation, Additive Secret Sharing, and Peer-to-Peer (P2P) Sharing with Augmentation, which allow only secret shares of model weights to be exchanged while securing all transmissions. To improve training efficiency and resource management, H-FLTN integrates Dynamic Client Capping Mechanism (DCCM) and Client Rotation Management (CRM), ensuring that training remains both computationally and temporally efficient as the number of participating EVs increases. DCCM optimises client participation by limiting excessive computational loads, while CRM balances training contributions across epochs, preventing imbalanced participation. Our simulation results based on large-scale empirical vehicle mobility data reveal that DCCM and CRM reduce the training time complexity with increasing EVs from linear to constant. Its integration into real-world smart city infrastructure enhances energy demand forecasting, resource allocation, and grid stability, ensuring reliability and sustainability in future mobility ecosystems.

Paperid: 2873, https://arxiv.org/pdf/2502.18092.pdf

Abstract:
The Update Framework or TUF was developed to address several known weaknesses that have been observed in software update distribution and validation systems. Unlike conventional secure software distribution methods where there may be a single digital signature applied to each update, TUF introduces four distinct roles each with one or more signing key, that must participate in the update process. This approach increases the total size of each update package and increases the number of signatures that each client system must validate. As system architects consider the transition to post-quantum algorithms, understanding the impact of new signature algorithms on a TUF deployment becomes a significant consideration. In this work we introduce a state machine model that accounts for the cumulative impact of of signature algorithm selection when used with TUF for software updates.

Paperid: 2874, https://arxiv.org/pdf/2502.17880.pdf

Abstract:
With the popularity of 3D volumetric video applications, such as Autonomous Driving, Virtual Reality, and Mixed Reality, current developers have turned to deep learning for compressing volumetric video frames, i.e., point clouds for video upstreaming. The latest deep learning-based solutions offer higher efficiency, lower distortion, and better hardware support compared to traditional ones like MPEG and JPEG. However, privacy threats arise, especially reconstruction attacks targeting to recover the original input point cloud from the intermediate results. In this paper, we design VVRec, to the best of our knowledge, which is the first targeting DL-based Volumetric Video Reconstruction attack scheme. VVRec demonstrates the ability to reconstruct high-quality point clouds from intercepted transmission intermediate results using four well-trained neural network modules we design. Leveraging the latest latent diffusion models with Gamma distribution and a refinement algorithm, VVRec excels in reconstruction quality, color recovery, and surpasses existing defenses. We evaluate VVRec using three volumetric video datasets. The results demonstrate that VVRec achieves 64.70dB reconstruction accuracy, with an impressive 46.39% reduction of distortion over baselines.

Paperid: 2875, https://arxiv.org/pdf/2502.17485.pdf

Abstract:
In Federated Deep Learning (FDL), multiple local enterprises are allowed to train a model jointly. Then, they submit their local updates to the central server, and the server aggregates the updates to create a global model. However, trained models usually perform worse than centralized models, especially when the training data distribution is non-independent and identically distributed (nonIID). NonIID data harms the accuracy and performance of the model. Additionally, due to the centrality of federated learning (FL) and the untrustworthiness of enterprises, traditional FL solutions are vulnerable to security and privacy attacks. To tackle this issue, we propose FedAnil, a secure blockchain enabled Federated Deep Learning Model that improves enterprise models decentralization, performance, and tamper proof properties, incorporating two main phases. The first phase addresses the nonIID challenge (label and feature distribution skew). The second phase addresses security and privacy concerns against poisoning and inference attacks through three steps. Extensive experiments were conducted using the Sent140, FashionMNIST, FEMNIST, and CIFAR10 new real world datasets to evaluate FedAnils robustness and performance. The simulation results demonstrate that FedAnil satisfies FDL privacy preserving requirements. In terms of convergence analysis, the model parameter obtained with FedAnil converges to the optimum of the model parameter. In addition, it performs better in terms of accuracy (more than 11, 15, and 24%) and computation overhead (less than 8, 10, and 15%) compared with baseline approaches, namely ShieldFL, RVPFL, and RFA.

Paperid: 2876, https://arxiv.org/pdf/2502.17010.pdf

Abstract:
In this paper, we prove that the supersingular isogeny problem (Isogeny), endomorphism ring problem (EndRing) and maximal order problem (MaxOrder) are equivalent under probabilistic polynomial time reductions, unconditionally.Isogeny-based cryptography is founded on the presumed hardness of these problems, and their interconnection is at the heart of the design and analysis of cryptosystems like the SQIsign digital signature scheme. Previously known reductions relied on unproven assumptions such as the generalized Riemann hypothesis. In this work, we present unconditional reductions, and extend this network of equivalences to the problem of computing the lattice of all isogenies between two supersingular elliptic curves (HomModule).For cryptographic applications, one requires computational problems to be hard on average for random instances. It is well-known that if Isogeny is hard (in the worst case), then it is hard for random instances. We extend this result by proving that if any of the above-mentionned classical problems is hard in the worst case, then all of them are hard on average. In particular, if there exist hard instances of Isogeny, then all of Isogeny, EndRing, MaxOrder and HomModule are hard on average.

Paperid: 2877, https://arxiv.org/pdf/2502.16955.pdf

Abstract:
Smart contracts, closely intertwined with cryptocurrency transactions, have sparked widespread concerns about considerable financial losses of security issues. To counteract this, a variety of tools have been developed to identify vulnerability in smart contract. However, they fail to overcome two challenges at the same time when faced with smart contract bytecode: (i) strong interference caused by enormous non-relevant instructions; (ii) missing semantics of bytecode due to incomplete data and control flow dependencies. In this paper, we propose a multi-teacher based bytecode vulnerability detection method, namely Multi-Teacher Vulnerability Hunter (MTVHunter), which delivers effective denoising and missing semantic to bytecode under multi-teacher guidance. Specifically, we first propose an instruction denoising teacher to eliminate noise interference by abstract vulnerability pattern and further reflect in contract embeddings. Secondly, we design a novel semantic complementary teacher with neuron distillation, which effectively extracts necessary semantic from source code to replenish the bytecode. Particularly, the proposed neuron distillation accelerate this semantic filling by turning the knowledge transition into a regression task. We conduct experiments on 229,178 real-world smart contracts that concerns four types of common vulnerabilities. Extensive experiments show MTVHunter achieves significantly performance gains over state-of-the-art approaches.

Paperid: 2878, https://arxiv.org/pdf/2502.16835.pdf

Abstract:
Detecting vulnerabilities in source code is a critical task for software security assurance. Graph Neural Network (GNN) machine learning can be a promising approach by modeling source code as graphs. Early approaches treated code elements uniformly, limiting their capacity to model diverse relationships that contribute to various vulnerabilities. Recent research addresses this limitation by considering the heterogeneity of node types and using Gated Graph Neural Networks (GGNN) to aggregate node information through different edge types. However, these edges primarily function as conduits for passing node information and may not capture detailed characteristics of distinct edge types. This paper presents Inter-Procedural Abstract Graphs (IPAGs) as an efficient, language-agnostic representation of source code, complemented by heterogeneous GNN training for vulnerability prediction. IPAGs capture the structural and contextual properties of code elements and their relationships. We also propose a Heterogeneous Attention GNN (HAGNN) model that incorporates multiple subgraphs capturing different features of source code. These subgraphs are learned separately and combined using a global attention mechanism, followed by a fully connected neural network for final classification. The proposed approach has achieved up to 96.6% accuracy on a large C dataset of 108 vulnerability types and 97.8% on a large Java dataset of 114 vulnerability types, outperforming state-of-the-art methods. Its applications to various real-world software projects have also demonstrated low false positive rates.

Paperid: 2879, https://arxiv.org/pdf/2502.16184.pdf

Abstract:
The EU Artificial Intelligence Act (AIA) establishes different legal principles for different types of AI systems. While prior work has sought to clarify some of these principles, little attention has been paid to robustness and cybersecurity. This paper aims to fill this gap. We identify legal challenges and shortcomings in provisions related to robustness and cybersecurity for high-risk AI systems(Art. 15 AIA) and general-purpose AI models (Art. 55 AIA). We show that robustness and cybersecurity demand resilience against performance disruptions. Furthermore, we assess potential challenges in implementing these provisions in light of recent advancements in the machine learning (ML) literature. Our analysis informs efforts to develop harmonized standards, guidelines by the European Commission, as well as benchmarks and measurement methodologies under Art. 15(2) AIA. With this, we seek to bridge the gap between legal terminology and ML research, fostering a better alignment between research and implementation efforts.

Paperid: 2880, https://arxiv.org/pdf/2502.15797.pdf

Abstract:
The prospect of artificial intelligence (AI) competing in the adversarial landscape of cyber security has long been considered one of the most impactful, challenging, and potentially dangerous applications of AI. Here, we demonstrate a new approach to assessing AI's progress towards enabling and scaling real-world offensive cyber operations (OCO) tactics in use by modern threat actors. We detail OCCULT, a lightweight operational evaluation framework that allows cyber security experts to contribute to rigorous and repeatable measurement of the plausible cyber security risks associated with any given large language model (LLM) or AI employed for OCO. We also prototype and evaluate three very different OCO benchmarks for LLMs that demonstrate our approach and serve as examples for building benchmarks under the OCCULT framework. Finally, we provide preliminary evaluation results to demonstrate how this framework allows us to move beyond traditional all-or-nothing tests, such as those crafted from educational exercises like capture-the-flag environments, to contextualize our indicators and warnings in true cyber threat scenarios that present risks to modern infrastructure. We find that there has been significant recent advancement in the risks of AI being used to scale realistic cyber threats. For the first time, we find a model (DeepSeek-R1) is capable of correctly answering over 90% of challenging offensive cyber knowledge tests in our Threat Actor Competency Test for LLMs (TACTL) multiple-choice benchmarks. We also show how Meta's Llama and Mistral's Mixtral model families show marked performance improvements over earlier models against our benchmarks where LLMs act as offensive agents in MITRE's high-fidelity offensive and defensive cyber operations simulation environment, CyberLayer.

Paperid: 2881, https://arxiv.org/pdf/2502.15629.pdf

Abstract:
In distributed differential privacy, multiple parties collaborate to analyze their combined data while each party protects the confidentiality of its data from the others. Interestingly, for certain fundamental two-party functions, such as the inner product and Hamming distance, the accuracy of distributed solutions significantly lags behind what can be achieved in the centralized model. For computational differential privacy, however, these limitations can be circumvented using oblivious transfer (used to implement secure multi-party computation). Yet, no results show that oblivious transfer is indeed necessary for accurately estimating a non-Boolean functionality. In particular, for the inner-product functionality, it was previously unknown whether oblivious transfer is necessary even for the best possible constant additive error. In this work, we prove that any computationally differentially private protocol that estimates the inner product over $\{-1,1\}^n \times \{-1,1\}^n$ up to an additive error of $O(n^{1/6})$, can be used to construct oblivious transfer. In particular, our result implies that protocols with sub-polynomial accuracy are equivalent to oblivious transfer. In this accuracy regime, our result improves upon Haitner, Mazor, Silbak, and Tsfadia [STOC '22], who showed that a key-agreement protocol is necessary.

Paperid: 2882, https://arxiv.org/pdf/2502.15041.pdf

Abstract:
Android malware detection has been extensively studied using both traditional machine learning (ML) and deep learning (DL) approaches. While many state-of-the-art detection models, particularly those based on DL, claim superior performance, they often rely on limited comparisons, lacking comprehensive benchmarking against traditional ML models across diverse datasets. This raises concerns about the robustness of DL-based approaches' performance and the potential oversight of simpler, more efficient ML models. In this paper, we conduct a systematic evaluation of Android malware detection models across four datasets: three recently published, publicly available datasets and a large-scale dataset we systematically collected. We implement a range of traditional ML models, including Random Forests (RF) and CatBoost, alongside advanced DL models such as Capsule Graph Neural Networks (CapsGNN), BERT-based models, and ExcelFormer based models. Our results reveal that in many cases simpler and more computationally efficient ML models achieve comparable or even superior performance compared with DL models. These findings highlight the need for rigorous benchmarking in Android malware detection research. We encourage future studies to conduct more comprehensive benchmarking comparisons between traditional and advanced models to ensure a more accurate assessment of detection capabilities. To facilitate further research, we provide access to our dataset, including app IDs, hash values, and labels.

Paperid: 2883, https://arxiv.org/pdf/2502.14558.pdf

Abstract:
With the introduction of regulations related to the ``right to be forgotten", federated learning (FL) is facing new privacy compliance challenges. To address these challenges, researchers have proposed federated unlearning (FU). However, existing FU research has primarily focused on improving the efficiency of unlearning, with less attention paid to the potential privacy vulnerabilities inherent in these methods. To address this gap, we draw inspiration from gradient inversion attacks in FL and propose the federated unlearning inversion attack (FUIA). The FUIA is specifically designed for the three types of FU (sample unlearning, client unlearning, and class unlearning), aiming to provide a comprehensive analysis of the privacy leakage risks associated with FU. In FUIA, the server acts as an honest-but-curious attacker, recording and exploiting the model differences before and after unlearning to expose the features and labels of forgotten data. FUIA significantly leaks the privacy of forgotten data and can target all types of FU. This attack contradicts the goal of FU to eliminate specific data influence, instead exploiting its vulnerabilities to recover forgotten data and expose its privacy flaws. Extensive experimental results show that FUIA can effectively reveal the private information of forgotten data. To mitigate this privacy leakage, we also explore two potential defense methods, although these come at the cost of reduced unlearning effectiveness and the usability of the unlearned model.

Paperid: 2884, https://arxiv.org/pdf/2502.13314.pdf

Abstract:
Given a differentially private unbiased estimate $\tilde{q}=q(D) +Î½$ of a statistic $q(D)$, we wish to obtain unbiased estimates of functions of $q(D)$, such as $1/q(D)$, solely through post-processing of $\tilde{q}$, with no further access to the confidential dataset $D$. To this end, we adapt the deconvolution method used for unbiased estimation in the statistical literature, deriving unbiased estimators for a broad family of twice-differentiable functions when the privacy-preserving noise $Î½$ is drawn from the Laplace distribution (Dwork et al., 2006). We further extend this technique to a more general class of functions, deriving approximately optimal estimators that are unbiased for values in a user-specified interval (possibly extending to $\pm \infty$). We use these results to derive an unbiased estimator for private means when the size $n$ of the dataset is not publicly known. In a numerical application, we find that a mechanism that uses our estimator to return an unbiased sample size and mean outperforms a mechanism that instead uses the previously known unbiased privacy mechanism for such means (Kamath et al., 2023). We also apply our estimators to develop unbiased transformation mechanisms for per-record differential privacy, a privacy concept in which the privacy guarantee is a public function of a record's value (Seeman et al., 2024). Our mechanisms provide stronger privacy guarantees than those in prior work (Finley et al., 2024) by using Laplace, rather than Gaussian, noise. Finally, using a different approach, we go beyond Laplace noise by deriving unbiased estimators for polynomials under the weak condition that the noise distribution has sufficiently many moments.

Paperid: 2885, https://arxiv.org/pdf/2502.13256.pdf

Abstract:
In our increasingly interconnected world, Cyber-Physical Systems (CPS) play a crucial role in industries like healthcare, transportation, and manufacturing by combining physical processes with computing power. These systems, however, face many challenges, especially regarding security and system faults. Anomalies in CPS may indicate unexpected problems, from sensor malfunctions to cyber-attacks, and must be detected to prevent failures that can cause harm or disrupt services. This paper provides an overview of the different ways researchers have approached anomaly detection in CPS. We categorize and compare methods like machine learning, deep learning, mathematical models, invariant, and hybrid techniques. Our goal is to help readers understand the strengths and weaknesses of these methods and how they can be used to create safer, more reliable CPS. By identifying the gaps in current solutions, we aim to encourage future research that will make CPS more secure and adaptive in our increasingly automated world.

Paperid: 2886, https://arxiv.org/pdf/2502.13167.pdf

Abstract:
Smart contracts are essential to decentralized finance (DeFi) and blockchain ecosystems but are increasingly vulnerable to exploits due to coding errors and complex attack vectors. Traditional static analysis tools and existing vulnerability detection methods often fail to address these challenges comprehensively, leading to high false-positive rates and an inability to detect dynamic vulnerabilities. This paper introduces SmartLLM, a novel approach leveraging fine-tuned LLaMA 3.1 models with Retrieval-Augmented Generation (RAG) to enhance the accuracy and efficiency of smart contract auditing. By integrating domain-specific knowledge from ERC standards and employing advanced techniques such as QLoRA for efficient fine-tuning, SmartLLM achieves superior performance compared to static analysis tools like Mythril and Slither, as well as zero-shot large language model (LLM) prompting methods such as GPT-3.5 and GPT-4. Experimental results demonstrate a perfect recall of 100% and an accuracy score of 70%, highlighting the model's robustness in identifying vulnerabilities, including reentrancy and access control issues. This research advances smart contract security by offering a scalable and effective auditing solution, supporting the secure adoption of decentralized applications.

Paperid: 2887, https://arxiv.org/pdf/2502.13065.pdf

Abstract:
Cryptographic primitives have been used for various non-cryptographic objectives, such as eliminating or reducing randomness and interaction. We show how to use cryptography to improve the time complexity of solving computational problems. Specifically, we show that under standard cryptographic assumptions, we can design algorithms that are asymptotically faster than existing ones while maintaining correctness. As a concrete demonstration, we construct a distribution of trapdoored matrices with the following properties: (a) computationally bounded adversaries cannot distinguish a random matrix from one drawn from this distribution (under computational hardness assumptions), and (b) given a trapdoor, we can multiply such an $n \times n$ matrix with any vector in near-linear (in $n$) time. We provide constructions both over finite fields and over the reals. This enables a broad speedup technique: any algorithm relying on a random matrix -- such as those that use various notions of dimensionality reduction -- can replace it with a matrix from our distribution, achieving computational speedups while preserving correctness. Using these trapdoored matrices, we present the first uniform reduction from worst-case to approximate and average-case matrix multiplication with optimal parameters (improving on Hirahara--Shimizu STOC 2025, albeit under computational assumptions), the first worst-case to average-case reductions for matrix inversion, solving a linear system, and computing a determinant, as well as a speedup of inference time in classification models.

Paperid: 2888, https://arxiv.org/pdf/2502.12848.pdf

Abstract:
Strand spaces are a formal framework for symbolic protocol verification that allows for pen-and-paper proofs of security. While extremely insightful, pen-and-paper proofs are error-prone, and it is hard to gain confidence on their correctness. To overcome this problem, we developed StrandsRocq, a full mechanization of the strand spaces in Coq (soon to be renamed Rocq). The mechanization was designed to be faithful to the original pen-and-paper development, and it was engineered to be modular and extensible. StrandsRocq incorporates new original proof techniques, a novel notion of maximal penetrator that enables protocol compositionality, and a set of Coq tactics tailored to the domain, facilitating proof automation and reuse, and simplifying the work of protocol analysts. To demonstrate the versatility of our approach, we modelled and analyzed a family of authentication protocols, drawing inspiration from ISO/IEC 9798-2 two-pass authentication, the classical Needham-Schroeder-Lowe protocol, as well as a recently-proposed static analysis for a key management API. The analyses in StrandsRocq confirmed the high degree of proof reuse, and enabled us to distill the minimal requirements for protocol security. Through mechanization, we identified and addressed several issues in the original proofs and we were able to significantly improve the precision of the static analysis for the key management API. Moreover, we were able to leverage the novel notion of maximal penetrator to provide a compositional proof of security for two simple authentication protocols.

Paperid: 2889, https://arxiv.org/pdf/2502.12382.pdf

Abstract:
The rapid growth of the Internet of Things (IoT) has revolutionized industries, enabling unprecedented connectivity and functionality. However, this expansion also increases vulnerabilities, exposing IoT networks to increasingly sophisticated cyberattacks. Intrusion Detection Systems (IDS) are crucial for mitigating these threats, and recent advancements in Machine Learning (ML) offer promising avenues for improvement. This research explores a hybrid approach, combining several standalone ML models such as Random Forest (RF), XGBoost, K-Nearest Neighbors (KNN), and AdaBoost, in a voting-based hybrid classifier for effective IoT intrusion detection. This ensemble method leverages the strengths of individual algorithms to enhance accuracy and address challenges related to data complexity and scalability. Using the widely-cited IoT-23 dataset, a prominent benchmark in IoT cybersecurity research, we evaluate our hybrid classifiers for both binary and multi-class intrusion detection problems, ensuring a fair comparison with existing literature. Results demonstrate that our proposed hybrid models, designed for robustness and scalability, outperform standalone approaches in IoT environments. This work contributes to the development of advanced, intelligent IDS frameworks capable of addressing evolving cyber threats.

Paperid: 2890, https://arxiv.org/pdf/2502.10538.pdf

Abstract:
Locally Decodable Codes (LDCs) are error correcting codes that admit efficient decoding of individual message symbols without decoding the entire message. Unfortunately, known LDC constructions offer a sub-optimal trade-off between rate, error tolerance and locality, the number of queries that the decoder must make to the received codeword $\tilde {y}$ to recovered a particular symbol from the original message $x$, even in relaxed settings where the encoder/decoder share randomness or where the channel is resource bounded. We initiate the study of Amortized Locally Decodable Codes where the local decoder wants to recover multiple symbols of the original message $x$ and the total number of queries to the received codeword $y$ can be amortized by the total number of message symbols recovered. We demonstrate that amortization allows us to overcome prior barriers and impossibility results. We first demonstrate that the Hadamard code achieves amortized locality below $2$ -- a result that is known to be impossible without amortization. Second, we study amortized locally decodable codes in cryptographic settings where the sender and receiver share a secret key or where the channel is resource-bounded and where the decoder wants to recover a consecutive subset of message symbols $[L,R]$. In these settings we show that it is possible to achieve a trifecta: constant rate, error tolerance and constant amortized locality.

Paperid: 2891, https://arxiv.org/pdf/2502.10490.pdf

Abstract:
As artificial intelligence becomes more prevalent in our lives, people are enjoying the convenience it brings, but they are also facing hidden threats, such as data poisoning and adversarial attacks. These threats can have disastrous consequences for the application of artificial intelligence, especially for some applications that take effect immediately, such as autonomous driving and medical fields. Among these threats, backdoor attacks have left a deep impression on people with their concealment and simple deployment, making them a threat that cannot be ignored, however, in the process of deploying the backdoor model, the backdoor attack often has some reasons that make it unsatisfactory in real-world applications, such as jitter and brightness changes. Based on this, we propose a highly robust backdoor attack that shifts the target sample and combines it with itself to form a backdoor sample, the Displacement Backdoor Attack(DBA). Experimental results show that the DBA attack can resist data augmentation that simulates real-world differences, such as rotation and cropping.

Paperid: 2892, https://arxiv.org/pdf/2502.10453.pdf

Abstract:
Attribution tags form the foundation of modern cryptoasset forensics. However, inconsistent or incorrect tags can mislead investigations and even result in false accusations. To address this issue, we propose a novel computational method based on Large Language Models (LLMs) to link attribution tags with well-defined knowledge graph concepts. We implemented this method in an end-to-end pipeline and conducted experiments showing that our approach outperforms baseline methods by up to 37.4% in F1-score across three publicly available attribution tag datasets. By integrating concept filtering and blocking procedures, we generate candidate sets containing five knowledge graph entities, achieving a recall of 93% without the need for labeled data. Additionally, we demonstrate that local LLM models can achieve F1-scores of 90%, comparable to remote models which achieve 94%. We also analyze the cost-performance trade-offs of various LLMs and prompt templates, showing that selecting the most cost-effective configuration can reduce costs by 90%, with only a 1% decrease in performance. Our method not only enhances attribution tag quality but also serves as a blueprint for fostering more reliable forensic evidence.

Paperid: 2893, https://arxiv.org/pdf/2502.10329.pdf

Abstract:
The rapid advancements in AI voice cloning, fueled by machine learning, have significantly impacted text-to-speech (TTS) and voice conversion (VC) fields. While these developments have led to notable progress, they have also raised concerns about the misuse of AI VC technology, causing economic losses and negative public perceptions. To address this challenge, this study focuses on creating active defense mechanisms against AI VC systems. We propose a novel active defense method, VocalCrypt, which embeds pseudo-timbre (jamming information) based on SFS into audio segments that are imperceptible to the human ear, thereby forming systematic fragments to prevent voice cloning. This approach protects the voice without compromising its quality. In comparison to existing methods, such as adversarial noise incorporation, VocalCrypt significantly enhances robustness and real-time performance, achieving a 500\% increase in generation speed while maintaining interference effectiveness. Unlike audio watermarking techniques, which focus on post-detection, our method offers preemptive defense, reducing implementation costs and enhancing feasibility. Extensive experiments using the Zhvoice and VCTK Corpus datasets show that our AI-cloned speech defense system performs excellently in automatic speaker verification (ASV) tests while preserving the integrity of the protected audio.

Paperid: 2894, https://arxiv.org/pdf/2502.10321.pdf

Abstract:
In this paper, we present a novel fraud-proof mechanism that achieves fast finality and, when combined with optimistic execution, enables real-time transaction processing. State-of-the-art optimistic rollups typically adopt a 7-day challenge window, during which any honest party can raise a challenge in case of fraud. We propose a new assert/challenge construction called "Dynamic Fraud Proofs" that achieves sub-second finality in ideal scenarios, while dynamically delaying settlement in the event of fraud detection and challenge resolution. The system relies on 1) a dynamic challenge period and 2) a configurable number of randomly selected verifier nodes who must interactively approve a state commitment without raising a challenge. If these conditions are not met, the state is not finalized, and the challenge period and approval criteria are dynamically adjusted. We provide a detailed analysis of the system's design, explaining how it maintains the assumption of a single honest node and addresses censorship attacks by inverting the traditional challenge process. Additionally, we formalize the system's probabilistic security model and discuss how bonding, incentives, and slashing mechanisms can encourage honest behavior, thereby increasing the likelihood of fast settlement in ideal scenarios.

Paperid: 2895, https://arxiv.org/pdf/2502.10283.pdf

Abstract:
Detecting attacks using encrypted signals is challenging since encryption hides its information content. We present a novel mechanism for anomaly detection over Learning with Errors (LWE) encrypted signals without using decryption, secure channels, nor complex communication schemes. Instead, the detector exploits the homomorphic property of LWE encryption to perform hypothesis tests on transformations of the encrypted samples. The specific transformations are determined by solutions to a hard lattice-based minimization problem. While the test's sensitivity deteriorates with suboptimal solutions, similar to the exponential deterioration of the (related) test that breaks the cryptosystem, we show that the deterioration is polynomial for our test. This rate gap can be exploited to pick parameters that lead to somewhat weaker encryption but large gains in detection capability. Finally, we conclude the paper by presenting a numerical example that simulates anomaly detection, demonstrating the effectiveness of our method in identifying attacks.

Paperid: 2896, https://arxiv.org/pdf/2502.10281.pdf

Abstract:
We present a passport-level trust token for Europe. In an era of escalating cyber threats fueled by global competition in economic, military, and technological domains, traditional security models are proving inadequate. The rise of advanced attacks exploiting zero-day vulnerabilities, supply chain infiltration, and system interdependencies underscores the need for a paradigm shift in cybersecurity. Zero Trust Architecture (ZTA) emerges as a transformative framework that replaces implicit trust with continuous verification of identity and granular access control. This thesis introduces TrustZero, a scalable layer of zero-trust security built around a universal "trust token" - a non-revocable self-sovereign identity with cryptographic signatures to enable robust, mathematically grounded trust attestations. By integrating ZTA principles with cryptography, TrustZero establishes a secure web-of-trust framework adaptable to legacy systems and inter-organisational communication.

Paperid: 2897, https://arxiv.org/pdf/2502.09549.pdf

Abstract:
Phishing continues to pose a significant cybersecurity threat. While blocklists currently serve as a primary defense, due to their reactive, passive nature, these delayed responses leave phishing websites operational long enough to harm potential victims. It is essential to address this fundamental challenge at the root, particularly in phishing domains. Domain registration presents a crucial intervention point, as domains serve as the primary gateway between users and websites. We conduct a comprehensive longitudinal analysis of 690,502 unique phishing domains, spanning a 39 month period, to examine their characteristics and behavioral patterns throughout their lifecycle-from initial registration to detection and eventual deregistration. We find that 66.1% of the domains in our dataset are maliciously registered, leveraging cost-effective TLDs and targeting brands by mimicking their domain names under alternative TLDs (e.g., .top and .tk) instead of the TLDs under which the brand domains are registered (e.g., .com and .ru). We also observe minimal improvements in detection speed for maliciously registered domains compared to compromised domains. Detection times vary widely across blocklists, and phishing domains remain accessible for an average of 11.5 days after detection, prolonging their potential impact. Our systematic investigation uncovers key patterns from registration through detection to deregistration, which could be leveraged to enhance anti-phishing active defenses at the DNS level.

Paperid: 2898, https://arxiv.org/pdf/2502.08830.pdf

Abstract:
The scarcity of data and the high complexity of Advanced Persistent Threats (APTs) attacks have created challenges in comprehending their behavior and hindered the exploration of effective detection techniques. To create an effective APT detection strategy, it is important to examine the Tactics, Techniques, and Procedures (TTPs) that have been reported by the industry. These TTPs can be difficult to classify as either malicious or legitimate. When developing an approach for the next generation of network intrusion detection systems (NIDS), it is necessary to take into account the specific context of the attack explained in this paper. In this study, we select 33 APT campaigns based on the fair distribution over the past 22 years to observe the evolution of APTs over time. We focus on their evasion techniques and how they stay undetected for months or years. We found that APTs cannot continue their operations without C&C servers, which are mostly addressed by Domain Name System (DNS). We identify several TTPs used for DNS, such as Dynamic DNS, typosquatting, and TLD squatting. The next step for APT operators is to start communicating with a victim. We found that the most popular protocol to deploy evasion techniques is using HTTP(S) with 81% of APT campaigns. HTTP(S) can evade firewall filtering and pose as legitimate web-based traffic. DNS protocol is also widely used by 45% of APTs for DNS resolution and tunneling. We identify and analyze the TTPs associated with using HTTP(S) based on real artifacts.

Paperid: 2899, https://arxiv.org/pdf/2502.08240.pdf

Abstract:
The Sender Policy Framework (SPF) is a basic mechanism for authorizing the use of domains in email. In combination with other mechanisms, it serves as a cornerstone for protecting users from forged senders. In this paper, we investigate the configuration of SPF across the Internet. To this end, we analyze SPF records from 12 million domains in the wild. Our analysis shows a growing adoption, with 56.5 % of the domains providing SPF records. However, we also uncover notable security issues: First, 2.9 % of the SPF records have errors, undefined content or ineffective rules, undermining the intended protection. Second, we observe a large number of very lax configurations. For example, 34.7 % of the domains allow emails to be sent from over 100 000 IP addresses. We explore the reasons for these loose policies and demonstrate that they facilitate email forgery. As a remedy, we derive recommendations for an adequate configuration and notify all operators of domains with misconfigured SPF records.

Paperid: 2900, https://arxiv.org/pdf/2502.07815.pdf

Abstract:
Detecting sensitive data such as Personally Identifiable Information (PII) and Protected Health Information (PHI) is critical for data security platforms. This study evaluates regex-based pattern matching algorithms and exact-match search techniques to optimize detection speed, accuracy, and scalability. Our benchmarking results indicate that Google RE2 provides the best balance of speed (10-15 ms/MB), memory efficiency (8-16 MB), and accuracy (99.5%) among regex engines, outperforming PCRE while maintaining broader hardware compatibility than Hyperscan. For exact matching, Aho-Corasick demonstrated superior performance (8 ms/MB) and scalability for large datasets. Performance analysis revealed that regex processing time scales linearly with dataset size and pattern complexity. A hybrid AI + Regex approach achieved the highest F1 score (91. 6%) by improving recall and minimizing false positives. Device benchmarking confirmed that our solution maintains efficient CPU and memory usage on both high-performance and mid-range systems. Despite its effectiveness, challenges remain, such as limited multilingual support and the need for regular pattern updates. Future work should focus on expanding language coverage, integrating data security and privacy management (DSPM) with data loss prevention (DLP) tools, and enhancing regulatory compliance for broader global adoption.

Paperid: 2901, https://arxiv.org/pdf/2502.07116.pdf

Abstract:
Threat modelling is the process of identifying potential vulnerabilities in a system and prioritising them. Existing threat modelling tools focus primarily on technical systems and are not as well suited to interpersonal threats. In this paper, we discuss traditional threat modelling methods and their shortcomings, and propose a new threat modelling framework (HARMS) to identify non-technical and human factors harms. We also cover a case study of applying HARMS when it comes to IoT devices such as smart speakers with virtual assistants.

Paperid: 2902, https://arxiv.org/pdf/2502.07036.pdf

Abstract:
Generative AI (Gen AI) with large language models (LLMs) are being widely adopted across the industry, academia and government. Cybersecurity is one of the key sectors where LLMs can be and/or are already being used. There are a number of problems that inhibit the adoption of trustworthy Gen AI and LLMs in cybersecurity and such other critical areas. One of the key challenge to the trustworthiness and reliability of LLMs is: how consistent an LLM is in its responses? In this paper, we have analyzed and developed a formal definition of consistency of responses of LLMs. We have formally defined what is consistency of responses and then develop a framework for consistency evaluation. The paper proposes two approaches to validate consistency: self-validation, and validation across multiple LLMs. We have carried out extensive experiments for several LLMs such as GPT4oMini, GPT3.5, Gemini, Cohere, and Llama3, on a security benchmark consisting of several cybersecurity questions: informational and situational. Our experiments corroborate the fact that even though these LLMs are being considered and/or already being used for several cybersecurity tasks today, they are often inconsistent in their responses, and thus are untrustworthy and unreliable for cybersecurity.

Paperid: 2903, https://arxiv.org/pdf/2502.06752.pdf

Abstract:
Blockchain Technology has revolutionized Finance and Technology with its secure, decentralized, and trust-less methodologies of data management. In a world where asset value fluctuations are unprecedented, it has become increasingly important to secure one's stake on their valuable assets and streamline the process of acquiring and transferring that stake over a trust-less environment. Tokenization proves to be unbeaten when it comes to giving the ownership of one's asset, an immutable, liquid, and irrefutable identity, as of the likes of cryptocurrency. It enables users to store and maintain records of their assets and even transfer fractions of these assets to other investors and stakeholders in the form of these tokens. However, like cryptocurrency, it too has witnessed attacks by malicious users that have compromised on their very foundation of security.These attacks have inflicted more damage since they represent real-world assets that have physical importance. This project aims to assist users to secure their valuable assets by providing a highly secure user-friendly platform to manage, create and deploy asset-tokens, and facilitate open and transparent communication between stakeholders, thereby upholding the decentralized nature of blockchain and offering the financial freedom of asset ownership, with an added market value of a cryptocurrency-backed tokens.

Paperid: 2904, https://arxiv.org/pdf/2502.06688.pdf

Abstract:
Data-driven cyberthreat detection has become a crucial defense technique in modern cybersecurity. Network defense, supported by Network Intrusion Detection Systems (NIDSs), has also increasingly adopted data-driven approaches, leading to greater reliance on data. Despite the importance of data, its scarcity has long been recognized as a major obstacle in NIDS research. In response, the community has published many new datasets recently. However, many of them remain largely unknown and unanalyzed, leaving researchers uncertain about their suitability for specific use cases. In this paper, we aim to address this knowledge gap by performing a systematic literature review (SLR) of 89 public datasets for NIDS research. Each dataset is comparatively analyzed across 13 key properties, and its potential applications are outlined. Beyond the review, we also discuss domain-specific challenges and common data limitations to facilitate a critical view on data quality. To aid in data selection, we conduct a dataset popularity analysis in contemporary state-of-the-art NIDS research. Furthermore, the paper presents best practices for dataset selection, generation, and usage. By providing a comprehensive overview of the domain and its data, this work aims to guide future research toward improving data quality and the robustness of NIDS solutions.

Paperid: 2905, https://arxiv.org/pdf/2502.06651.pdf

Abstract:
In order to both learn and protect sensitive training data, there has been a growing interest in privacy preserving machine learning methods. Differential privacy has emerged as an important measure of privacy. We are interested in the federated setting where a group of parties each have one or more training instances and want to learn collaboratively without revealing their data. In this paper, we propose strategies to compute differentially private empirical distribution functions. While revealing complete functions is more expensive from the point of view of privacy budget, it may also provide richer and more valuable information to the learner. We prove privacy guarantees and discuss the computational cost, both for a generic strategy fitting any security model and a special-purpose strategy based on secret sharing. We survey a number of applications and present experiments.

Paperid: 2906, https://arxiv.org/pdf/2502.06033.pdf

Abstract:
The National Institute of Standards and Technology (NIST) has recommended the use of stateful hash-based digital signatures for long-term applications that may require protection from future threats that use quantum computers. XMSS and LMS, the two approved algorithms, have multiple parameter options that impact digital signature size, public key size, the number of signatures that can be produced over the life of a keypair, and the computational effort to validate signatures. This collection of benchmark data is intended to support system designers in understanding the differences among the configuration options.

Paperid: 2907, https://arxiv.org/pdf/2502.05739.pdf

Abstract:
Large Language Models for Code (LLMs4Code) excel at code generation tasks, yielding promise to release developers from huge software development burdens. Nonetheless, these models have been shown to suffer from the significant privacy risks due to the potential leakage of sensitive information embedded during training, known as the memorization problem. Addressing this issue is crucial for ensuring privacy compliance and upholding user trust, but till now there is a dearth of dedicated studies in the literature that focus on this specific direction. Recently, machine unlearning has emerged as a promising solution by enabling models to "forget" sensitive information without full retraining, offering an efficient and scalable approach compared to traditional data cleaning methods. In this paper, we empirically evaluate the effectiveness of unlearning techniques for addressing privacy concerns in LLMs4Code.Specifically, we investigate three state-of-the-art unlearning algorithms and three well-known open-sourced LLMs4Code, on a benchmark that takes into consideration both the privacy data to be forgotten as well as the code generation capabilites of these models. Results show that it is feasible to mitigate the privacy concerns of LLMs4Code through machine unlearning while maintain their code generation capabilities at the same time. We also dissect the forms of privacy protection/leakage after unlearning and observe that there is a shift from direct leakage to indirect leakage, which underscores the need for future studies addressing this risk.

Paperid: 2908, https://arxiv.org/pdf/2502.05530.pdf

Abstract:
User identification procedures, essential to the information security of systems, enable system-user interactions by exchanging data through communication links and interfaces to validate and confirm user authenticity. However, human errors can introduce vulnerabilities that may disrupt the intended identification workflow and thus impact system behavior. Therefore, ensuring the integrity of these procedures requires accounting for such erroneous behaviors. We follow a formal, human-centric approach to analyze user identification procedures by modeling them as security ceremonies and apply proven techniques for automatically analyzing such ceremonies. The approach relies on mutation rules to model potential human errors that deviate from expected interactions during the identification process, and is implemented as the X-Men tool, an extension of the Tamarin prover, which automatically generates models with human mutations and implements matching mutations to other ceremony participants for analysis. As a proof-of-concept, we consider a real-life pilot study involving an AI-driven, virtual receptionist kiosk for authenticating visitors.

Paperid: 2909, https://arxiv.org/pdf/2502.05367.pdf

Abstract:
Advanced Persistent Threats (APTs) are among the most sophisticated threats facing critical organizations worldwide. APTs employ specific tactics, techniques, and procedures (TTPs) which make them difficult to detect in comparison to frequent and aggressive attacks. In fact, current network intrusion detection systems struggle to detect APTs communications, allowing such threats to persist unnoticed on victims' machines for months or even years. In this paper, we present EarlyCrow, an approach to detect APT malware command and control over HTTP(S) using contextual summaries. The design of EarlyCrow is informed by a novel threat model focused on TTPs present in traffic generated by tools recently used as part of APT campaigns. The threat model highlights the importance of the context around the malicious connections, and suggests traffic attributes which help APT detection. EarlyCrow defines a novel multipurpose network flow format called PairFlow, which is leveraged to build the contextual summary of a PCAP capture, representing key behavioral, statistical and protocol information relevant to APT TTPs. We evaluate the effectiveness of EarlyCrow on unseen APTs obtaining a headline macro average F1-score of 93.02% with FPR of $0.74%.

Paperid: 2910, https://arxiv.org/pdf/2502.05208.pdf

Abstract:
Autonomous vehicles (AVs) rely heavily on cameras and artificial intelligence (AI) to make safe and accurate driving decisions. However, since AI is the core enabling technology, this raises serious cyber threats that hinder the large-scale adoption of AVs. Therefore, it becomes crucial to analyze the resilience of AV security systems against sophisticated attacks that manipulate camera inputs, deceiving AI models. In this paper, we develop camera-camouflaged adversarial attacks targeting traffic sign recognition (TSR) in AVs. Specifically, if the attack is initiated by modifying the texture of a stop sign to fool the AV's object detection system, thereby affecting the AV actuators. The attack's effectiveness is tested using the CARLA AV simulator and the results show that such an attack can delay the auto-braking response to the stop sign, resulting in potential safety issues. We conduct extensive experiments under various conditions, confirming that our new attack is effective and robust. Additionally, we address the attack by presenting mitigation strategies. The proposed attack and defense methods are applicable to other end-to-end trained autonomous cyber-physical systems.

Paperid: 2911, https://arxiv.org/pdf/2502.03203.pdf

Abstract:
The Spectre speculative side-channel attacks pose formidable threats for security. Research has shown that code following the cryptographic constant-time discipline can be efficiently protected against Spectre v1 using a selective variant of Speculative Load Hardening (SLH). SLH was, however, not strong enough for protecting non-cryptographic code, leading to the introduction of Ultimate SLH, which provides protection for arbitrary programs, but has too large overhead for general use, since it conservatively assumes that all data is secret. In this paper we introduce a flexible SLH notion that achieves the best of both worlds by generalizing both Selective and Ultimate SLH. We give a suitable security definition for such transformations protecting arbitrary programs: any transformed program running with speculation should not leak more than what the source program leaks sequentially. We formally prove using the Rocq prover that two flexible SLH variants enforce this relative security guarantee. As easy corollaries we also obtain that, in our setting, Ultimate SLH enforces our relative security notion, and two selective SLH variants enforce speculative constant-time security.

Paperid: 2912, https://arxiv.org/pdf/2502.02720.pdf

Abstract:
Organizations are increasingly moving towards the cloud computing paradigm, in which an on-demand access to a pool of shared configurable resources is provided. However, security challenges, which are particularly exacerbated by the multitenancy and virtualization features of cloud computing, present a major obstacle. In particular, sharing of resources among potentially untrusted tenants in access controlled cloud datacenters can result in increased risk of data leakage. To address such risk, we propose an efficient risk-aware sensitive property-driven virtual resource assignment mechanism for cloud datacenters. We have used two information-theoretic measures, i.e., KL-divergence and mutual information, to represent sensitive properties in the dataset. Based on the vulnerabilities of cloud architecture and the sensitive property profile, we have formulated the problem as a cost-drive optimization problem. The problem is shown to be NP-complete. Accordingly, we have proposed two heuristics and presented simulation based performance results for cloud datacenters with multiple sensitivity.

Paperid: 2913, https://arxiv.org/pdf/2502.01225.pdf

Abstract:
Large language models are typically trained on vast amounts of data during the pre-training phase, which may include some potentially harmful information. Fine-tuning attacks can exploit this by prompting the model to reveal such behaviours, leading to the generation of harmful content. In this paper, we focus on investigating the performance of the Chain of Thought based reasoning model, DeepSeek, when subjected to fine-tuning attacks. Specifically, we explore how fine-tuning manipulates the model's output, exacerbating the harmfulness of its responses while examining the interaction between the Chain of Thought reasoning and adversarial inputs. Through this study, we aim to shed light on the vulnerability of Chain of Thought enabled models to fine-tuning attacks and the implications for their safety and ethical deployment.

Paperid: 2914, https://arxiv.org/pdf/2502.00587.pdf

Abstract:
Federated Learning (FL) enables collaborative model training across multiple devices while preserving data privacy. However, it remains susceptible to backdoor attacks, where malicious participants can compromise the global model. Existing defence methods are limited by strict assumptions on data heterogeneity (Non-Independent and Identically Distributed data) and the proportion of malicious clients, reducing their practicality and effectiveness. To overcome these limitations, we propose Robust Knowledge Distillation (RKD), a novel defence mechanism that enhances model integrity without relying on restrictive assumptions. RKD integrates clustering and model selection techniques to identify and filter out malicious updates, forming a reliable ensemble of models. It then employs knowledge distillation to transfer the collective insights from this ensemble to a global model. Extensive evaluations demonstrate that RKD effectively mitigates backdoor threats while maintaining high model performance, outperforming current state-of-the-art defence methods across various scenarios.

Paperid: 2915, https://arxiv.org/pdf/2502.00347.pdf

Abstract:
Significant losses in terms of life and property occur from road traffic accidents, which are often caused by drunk and drowsy drivers. Reducing accidents requires effective detection of alcohol impairment and drowsiness as well as real-time driver monitoring. This paper aims to create an Internet of Things (IoT)--enabled Drowsiness Driver Safety Alert System with Real-Time Monitoring Using Integrated Sensors Technology. The system features an alcohol sensor and an IR sensor for detecting alcohol presence and monitoring driver eye movements, respectively. Upon detecting alcohol, alarms and warning lights are activated, the vehicle speed is progressively reduced, and the motor stops within ten to fifteen seconds if the alcohol presence persists. The IR sensor monitors prolonged eye closure, triggering alerts, or automatic vehicle stoppage to prevent accidents caused by drowsiness. Data from the IR sensor is transmitted to a mobile phone via Bluetooth for real-time monitoring and alerts. By identifying driver alcoholism and drowsiness, this system seeks to reduce accidents and save lives by providing safer transportation.

Paperid: 2916, https://arxiv.org/pdf/2501.18740.pdf

Abstract:
With the ever-increasing integration of artificial intelligence into daily life and the growing importance of well-trained models, the security of hardware accelerators supporting Deep Neural Networks (DNNs) has become paramount. As a promising solution to prevent hardware intellectual property theft, eFPGA redaction has emerged. This technique selectively conceals critical components of the design, allowing authorized users to restore functionality post-fabrication by inserting the correct bitstream. In this paper, we explore the redaction of DNN accelerators using eFPGAs, from specification to physical design implementation. Specifically, we investigate the selection of critical DNN modules for redaction using both regular and fracturable look-up tables. We perform synthesis, timing verification, and place & route on redacted DNN accelerators. Furthermore, we evaluate the overhead of incorporating eFPGAs into DNN accelerators in terms of power, area, and delay, finding it reasonable given the security benefits.

Paperid: 2917, https://arxiv.org/pdf/2501.17824.pdf

Abstract:
Secure Multi-Party Computation (MPC) is an important enabling technology for data privacy in modern distributed applications. We develop a new type theory to automatically enforce correctness,confidentiality, and integrity properties of protocols written in the \emph{Prelude/Overture} language framework. Judgements in the type theory are predicated on SMT verifications in a theory of finite fields, which supports precise and efficient analysis. Our approach is automated, compositional, scalable, and generalizes to arbitrary prime fields for data and key sizes.

Paperid: 2918, https://arxiv.org/pdf/2501.17786.pdf

Abstract:
The heterogeneity of the blockchain landscape has motivated the design of blockchain protocols tailored to specific blockchains and applications that, hence, require custom security proofs. We observe that many blockchain protocols share common security and functionality goals, which can be captured by an atomic transfer graph (ATG) describing the structure of desired transfers. Based on this observation, we contribute a framework for generating secure-by-design protocols that realize these goals. The resulting protocols build upon Conditional Timelock Contracts (CTLCs), a novel minimal smart contract functionality that can be implemented in a large variety of cryptocurrencies with a restricted scripting language (e.g., Bitcoin), and payment channels. We show how ATGs, in addition to enabling novel applications, capture the security and functionality goals of existing applications, including many examples from payment channel networks and complex multi-party cross-currency swaps among Ethereum-style cryptocurrencies. Our framework is the first to provide generic and provably secure protocols for all these use cases while matching or improving the performance of existing use-case-specific protocols.

Paperid: 2919, https://arxiv.org/pdf/2501.17762.pdf

Abstract:
A smart contract is an interactive program that governs funds in the realm of a single cryptocurrency. Yet, the many existing cryptocurrencies have spurred the design of cross-chain applications that require interactions with multiple cryptocurrencies simultaneously. Currently, cross-chain applications are implemented as use-case-specific cryptographic protocols that serve as overlay to synchronize smart contract executions in the different cryptocurrencies. Hence, their design requires substantial expertise, as well as a security analysis in complex cryptographic frameworks. In this work, we present BitMLx, the first domain-specific language for cross-chain smart contracts, enabling interactions with several users that hold funds across multiple Bitcoin-like cryptocurrencies. We contribute a compiler to automatically translate a BitMLx contract into one contract per involved cryptocurrency and a user strategy that synchronizes the execution of these contracts. We prove that an honest user, who follows the prescribed strategy when interacting with the several contracts, ends up with at least as many funds as in the corresponding execution of the BitMLx contract. Last, but not least, we implement the BitMLx compiler and demonstrate its utility in the design of illustrative examples of cross-chain applications such as multi-chain donations or loans across different cryptocurrencies.

Paperid: 2921, https://arxiv.org/pdf/2501.17402.pdf

Abstract:
The outsourcing of semiconductor manufacturing raises security risks, such as piracy and overproduction of hardware intellectual property. To overcome this challenge, logic locking has emerged to lock a given circuit using additional key bits. While single-key logic locking approaches have demonstrated serious vulnerability to a wide range of attacks, multi-key solutions, if carefully designed, can provide a reliable defense against not only oracle-guided logic attacks, but also removal and dataflow attacks. In this paper, using time base keys, we propose, implement and evaluate a family of secure multi-key logic locking algorithms called Cute-Lock that can be applied both in RTL-level behavioral and netlist-level structural representations of sequential circuits. Our extensive experimental results under a diverse range of attacks confirm that, compared to vulnerable state-of-the-art methods, employing the Cute-Lock family drives attacking attempts to a dead end without additional overhead.

Paperid: 2922, https://arxiv.org/pdf/2501.16964.pdf

Abstract:
Detecting cyberattacks using Graph Neural Networks (GNNs) has seen promising results recently. Most of the state-of-the-art models that leverage these techniques require labeled examples, hard to obtain in many real-world scenarios. To address this issue, unsupervised learning and Self-Supervised Learning (SSL) have emerged as interesting approaches to reduce the dependency on labeled data. Nonetheless, these methods tend to yield more anomalous detection algorithms rather than effective attack detection systems. This paper introduces Few Edges Are Enough (FEAE), a GNN-based architecture trained with SSL and Few-Shot Learning (FSL) to better distinguish between false positive anomalies and actual attacks. To maximize the potential of few-shot examples, our model employs a hybrid self-supervised objective that combines the advantages of contrastive-based and reconstruction-based SSL. By leveraging only a minimal number of labeled attack events, represented as attack edges, FEAE achieves competitive performance on two well-known network datasets compared to both supervised and unsupervised methods. Remarkably, our experimental results unveil that employing only 1 malicious event for each attack type in the dataset is sufficient to achieve substantial improvements. FEAE not only outperforms self-supervised GNN baselines but also surpasses some supervised approaches on one of the datasets.

Paperid: 2923, https://arxiv.org/pdf/2501.15313.pdf

Abstract:
Virtual Reality (VR) technology has gained substantial traction and has the potential to transform a number of industries, including education, entertainment, and professional sectors. Nevertheless, concerns have arisen about the security and privacy implications of VR applications and the impact that they might have on users. In this paper, we investigate the following overarching research question: can VR applications and VR user activities in the context of such applications (e.g., manipulating virtual objects, walking, talking, flying) be identified based on the (potentially encrypted) network traffic that is generated by VR headsets during the operation of VR applications? To answer this question, we collect network traffic data from 25 VR applications running on the Meta Quest Pro headset and identify characteristics of the generated network traffic, which we subsequently use to train off-the-shelf Machine Learning (ML) models. Our results indicate that through the use of ML models, we can identify the VR applications being used with an accuracy of 92.4% and the VR user activities performed with an accuracy of 91%. Furthermore, our results demonstrate that an attacker does not need to collect large amounts of network traffic data for each VR application to carry out such an attack. Specifically, an attacker only needs to collect less than 10 minutes of network traffic data for each VR application in order to identify applications with an accuracy higher than 90% and VR user activities with an accuracy higher than 88%.

Paperid: 2924, https://arxiv.org/pdf/2501.14556.pdf

Abstract:
This paper presents a sandbox study proposal focused on the distributed processing of personal health data within the Vinnova-funded SARDIN project. The project aims to develop the Health Data Bank (HÃ¤lsodatabanken in Swedish), a secure platform for research and innovation that complies with the European Health Data Space (EHDS) legislation. By minimizing the sharing and storage of personal data, the platform sends analysis tasks directly to the original data locations, avoiding centralization. This approach raises questions about data controller responsibilities in distributed environments and the anonymization status of aggregated statistical results. The study explores federated analysis, secure multi-party aggregation, and differential privacy techniques, informed by real-world examples from clinical research on Parkinson's disease, stroke rehabilitation, and wound analysis. To validate the proposed study, numerical experiments were conducted using four open-source datasets to assess the feasibility and effectiveness of the proposed methods. The results support the methods for the proposed sandbox study by demonstrating that differential privacy in combination with secure aggregation techniques significantly improves the privacy-utility trade-off.

Paperid: 2925, https://arxiv.org/pdf/2501.13865.pdf

Abstract:
Iterative bit flipping decoders are an efficient and effective decoder choice for decoding codes which admit a sparse parity-check matrix. Among these, sparse $(v,w)$-regular codes, which include LDPC and MDPC codes are of particular interest both for efficient data correction and the design of cryptographic primitives. In attaining the decoding the choice of the bit flipping thresholds, which can be determined either statically, or during the decoder execution by using information coming from the initial syndrome value and its updates. In this work, we analyze a two-iterations parallel hard decision bit flipping decoders and propose concrete criteria for threshold determination, backed by a closed form model. In doing so, we introduce a new tightly fitting model for the distribution of the Hamming weight of the syndrome after the first decoder iteration and substantial improvements on the DFR estimation with respect to existing approaches.

Paperid: 2926, https://arxiv.org/pdf/2501.13256.pdf

Abstract:
We observed the Array Canary, a novel JavaScript anti-analysis technique currently exploited in-the-wild by the Phishing-as-a-Service framework Darcula. The Array Canary appears to be an advanced form of the array shuffling techniques employed by the Emotet JavaScript downloader. In practice, a series of Array Canaries are set within a string array and if modified will cause the program to endlessly loop. In this paper, we demonstrate how an Array Canary works and discuss Autonomous Function Call Resolution (AFCR), which is a method we created to bypass Array Canaries. We also introduce Arphsy, a proof-of-concept for AFCR designed to guide Large Language Models and security researchers in the deobfuscation of "canaried" JavaScript code. We accomplish this by (i) Finding and extracting all Immediately Invoked Function Expressions from a canaried file, (ii) parsing the file's Abstract Syntax Tree for any function that does not implement imported function calls, (iii) identifying the most reassigned variable and its corresponding function body, (iv) calculating the length of the largest string array and uses it to determine the offset values within the canaried file, (v) aggregating all the previously identified functions into a single file, and (vi) appending driver code into the verified file and using it to deobfuscate the canaried file.

Paperid: 2927, https://arxiv.org/pdf/2501.13252.pdf

Abstract:
In today's rapidly evolving technological landscape, organizations face the challenge of integrating external insights into their decision-making processes to stay competitive. To address this issue, this study proposes a method that combines topic modeling, expert knowledge inputs, and reinforcement learning (RL) to enhance the detection of technological changes. The method has four main steps: (1) Build a relevant topic model, starting with textual data like documents and reports to find key themes. (2) Create aspect-based topic models. Experts use curated keywords to build models that showcase key domain-specific aspects. (3) Iterative analysis and RL driven refinement: We examine metrics such as topic magnitude, similarity, entropy shifts, and how models change over time. We optimize topic selection with RL. Our reward function balances the diversity and similarity of the topics. (4) Synthesis and operational integration: Each iteration provides insights. In the final phase, the experts check these insights and reach new conclusions. These conclusions are designed for use in the firm's operational processes. The application is tested by forecasting trends in quantum communication. Results demonstrate the method's effectiveness in identifying, ranking, and tracking trends that align with expert input, providing a robust tool for exploring evolving technological landscapes. This research offers a scalable and adaptive solution for organizations to make informed strategic decisions in dynamic environments.

Paperid: 2928, https://arxiv.org/pdf/2501.13080.pdf

Abstract:
Large Language Models (LLMs) have demonstrated powerful capabilities that render them valuable in different applications, including conversational AI products. It is paramount to ensure the security and reliability of these products by mitigating their vulnerabilities towards malicious user interactions, which can lead to the exposure of great risks and reputational repercussions. In this work, we present a comprehensive study on the efficacy of fine-tuning and aligning Chain-of-Thought (CoT) responses of different LLMs that serve as input moderation guardrails. We systematically explore various tuning methods by leveraging a small set of training data to adapt these models as proxy defense mechanisms to detect malicious inputs and provide a reasoning for their verdicts, thereby preventing the exploitation of conversational agents. We rigorously evaluate the efficacy and robustness of different tuning strategies to generalize across diverse adversarial and malicious query types. Our experimental results outline the potential of alignment processes tailored to a varied range of harmful input queries, even with constrained data resources. These techniques significantly enhance the safety of conversational AI systems and provide a feasible framework for deploying more secure and trustworthy AI-driven interactions.

Paperid: 2929, https://arxiv.org/pdf/2501.12893.pdf

Abstract:
To analyze the privacy guarantee of personal data in a database that is subject to queries it is necessary to model the prior knowledge of a possible attacker. Differential privacy considers a worst-case scenario where he knows almost everything, which in many applications is unrealistic and requires a large utility loss. This paper considers a situation called statistical privacy where an adversary knows the distribution by which the database is generated, but no exact data of all (or sufficient many) of its entries. We analyze in detail how the entropy of the distribution guarantes privacy for a large class of queries called property queries. Exact formulas are obtained for the privacy parameters. We analyze how they depend on the probability that an entry fulfills the property under investigation. These formulas turn out to be lengthy, but can be used for tight numerical approximations of the privacy parameters. Such estimations are necessary for applying privacy enhancing techniques in practice. For this statistical setting we further investigate the effect of adding noise or applying subsampling and the privacy utility tradeoff. The dependencies on the parameters are illustrated in detail by a series of plots. Finally, these results are compared to the differential privacy model.

Paperid: 2930, https://arxiv.org/pdf/2501.12890.pdf

Abstract:
We present uSpectre, a new class of transient execution attacks that exploit microcode branch mispredictions to transiently leak sensitive data. We find that many long-known and recently-discovered transient execution attacks, which were previously categorized as Spectre or Meltdown variants, are actually instances of uSpectre on some Intel microarchitectures. Based on our observations, we discover multiple new uSpectre attacks and present a defense against uSpectre vulnerabilities, called uSLH.

Paperid: 2931, https://arxiv.org/pdf/2501.11830.pdf

Abstract:
Machine learning model genealogy enables practitioners to determine which architectural family a neural network belongs to. In this paper, we introduce ShadowGenes, a novel, signature-based method for identifying a given model's architecture, type, and family. Our method involves building a computational graph of the model that is agnostic of its serialization format, then analyzing its internal operations to identify unique patterns, and finally building and refining signatures based on these. We highlight important workings of the underlying engine and demonstrate the technique used to construct a signature and scan a given model. This approach to model genealogy can be applied to model files without the need for additional external information. We test ShadowGenes on a labeled dataset of over 1,400 models and achieve a mean true positive rate of 97.49% and a precision score of 99.51%; which validates the technique as a practical method for model genealogy. This enables practitioners to understand the use cases of a given model, the internal computational process, and identify possible security risks, such as the potential for model backdooring.

Paperid: 2932, https://arxiv.org/pdf/2501.11795.pdf

Abstract:
This paper establishes a mathematically precise definition of dataset poisoning attack and proves that the very act of effectively poisoning a dataset ensures that the attack can be effectively detected. On top of a mathematical guarantee that dataset poisoning is identifiable by a new statistical test that we call the Conformal Separability Test, we provide experimental evidence that we can adequately detect poisoning attempts in the real world.

Paperid: 2933, https://arxiv.org/pdf/2501.11707.pdf

Abstract:
In recent years, blockchain technology has been recognized as a transformative innovation in the tech world, and it has quickly become the core infrastructure of digital currencies such as Bitcoin and an important tool in various industries. This technology facilitates the recording and tracking of transactions across a vast network of computers by providing a distributed and decentralized ledger. Blockchain's decentralized structure significantly enhances security and transparency and prevents a single entity from dominating the network. This chapter examines blockchain's advantages, disadvantages, and applications in various industries and analyzes the implementation environments and reasons for using this technology. Also, this chapter discusses challenges such as scalability and high energy consumption that inhibit the expansion of this technology and examines blockchain technology's role in increasing efficiency and security in economic and social interactions. Finally, a comprehensive conclusion of blockchain applications and challenges has been presented by comparing blockchain applications in various industries and analyzing future trends.

Paperid: 2934, https://arxiv.org/pdf/2501.11659.pdf

Abstract:
Federated learning (FL) is a popular privacy-preserving edge-to-cloud technique used for training and deploying artificial intelligence (AI) models on edge devices. FL aims to secure local client data while also collaboratively training a global model. Under standard FL, clients within the federation send model updates, derived from local data, to a central server for aggregation into a global model. However, extensive research has demonstrated that private data can be reliably reconstructed from these model updates using gradient inversion attacks (GIAs). To protect client data from server-side GIAs, previous FL schemes have employed fully homomorphic encryption (FHE) to secure model updates while still enabling popular aggregation methods. However, current FHE-based FL schemes either incur substantial computational overhead or trade security and/or model accuracy for efficiency. We introduce BlindFL, a framework for global model aggregation in which clients encrypt and send a subset of their local model update. With choice over the subset size, BlindFL offers flexible efficiency gains while preserving full encryption of aggregated updates. Moreover, we demonstrate that implementing BlindFL can substantially lower space and time transmission costs per client, compared with plain FL with FHE, while maintaining global model accuracy. BlindFL also offers additional depth of security. While current single-key, FHE-based FL schemes explicitly defend against server-side adversaries, they do not address the realistic threat of malicious clients within the federation. By contrast, we theoretically and experimentally demonstrate that BlindFL significantly impedes client-side model poisoning attacks, a first for single-key, FHE-based FL schemes.

Paperid: 2935, https://arxiv.org/pdf/2501.11235.pdf

Abstract:
Threshold fully homomorphic encryption (ThFHE) enables multiple parties to compute functions over their sensitive data without leaking data privacy. Most of existing ThFHE schemes are restricted to full threshold and require the participation of \textit{all} parties to output computing results. Compared with these full-threshold schemes, arbitrary threshold (ATh)-FHE schemes are robust to non-participants and can be a promising solution to many real-world applications. However, existing AThFHE schemes are either inefficient to be applied with a large number of parties $N$ and a large data size $K$, or insufficient to tolerate all types of non-participants. In this paper, we propose an AThFHE scheme to handle all types of non-participants with lower complexity over existing schemes. At the core of our scheme is the reduction from AThFHE construction to the design of a new primitive called \textit{approximate secret sharing} (ApproxSS). Particularly, we formulate ApproxSS and prove the correctness and security of AThFHE on top of arbitrary-threshold (ATh)-ApproxSS's properties. Such a reduction reveals that existing AThFHE schemes implicitly design ATh-ApproxSS following a similar idea called ``noisy share''. Nonetheless, their ATh-ApproxSS design has high complexity and become the performance bottleneck. By developing ATASSES, an ATh-ApproxSS scheme based on a novel ``encrypted share'' idea, we reduce the computation (resp. communication) complexity from $\mathcal{O}(N^2K)$ to $\mathcal{O}(N^2+K)$ (resp. from $\mathcal{O}(NK)$ to $\mathcal{O}(N+K)$). We not only theoretically prove the (approximate) correctness and security of ATASSES, but also empirically evaluate its efficiency against existing baselines. Particularly, when applying to a system with one thousand parties, ATASSES achieves a speedup of $3.83\times$ -- $15.4\times$ over baselines.

Paperid: 2936, https://arxiv.org/pdf/2501.11183.pdf

Abstract:
As LLMs develop increasingly advanced capabilities, there is an increased need to minimize the harm that could be caused to society by certain model outputs; hence, most LLMs have safety guardrails added, for example via fine-tuning. In this paper, we argue the position that current safety fine-tuning is very similar to a traditional cat-and-mouse game (or arms race) between attackers and defenders in cybersecurity. Model jailbreaks and attacks are patched with bandaids to target the specific attack mechanism, but many similar attack vectors might remain. When defenders are not proactively coming up with principled mechanisms, it becomes very easy for attackers to sidestep any new defenses. We show how current defenses are insufficient to prevent new adversarial jailbreak attacks, reward hacking, and loss of control problems. In order to learn from past mistakes in cybersecurity, we draw analogies with historical examples and develop lessons learned that can be applied to LLM safety. These arguments support the need for new and more principled approaches to designing safe models, which are architected for security from the beginning. We describe several such approaches from the AI literature.

Paperid: 2937, https://arxiv.org/pdf/2501.10983.pdf

Abstract:
Previous schemes for designing secure branch prediction unit (SBPU) based on physical isolation can only offer limited security and significantly affect BPU's prediction capability, leading to prominent performance degradation. Moreover, encryption-based SBPU schemes based on periodic key re-randomization have the risk of being compromised by advanced attack algorithms, and the performance overhead is also considerable. To this end, this paper proposes a conflict-invisible SBPU (CIBPU). CIBPU employs redundant storage design, load-aware indexing, and replacement design, as well as an encryption mechanism without requiring periodic key updates, to prevent attackers' perception of branch conflicts. We provide a thorough security analysis, which shows that CIBPU achieves strong security throughout the BPU's lifecycle. We implement CIBPU in a RISC-V core model in gem5. The experimental results show that CIBPU causes an average performance overhead of only 1.12%-2.20% with acceptable hardware storage overhead, which is the lowest among the state-of-the-art SBPU schemes. CIBPU has also been implemented in the open-source RISC-V core, SonicBOOM, which is then burned onto an FPGA board. The evaluation based on the board shows an average performance degradation of 2.01%, which is approximately consistent with the result obtained in gem5.

Paperid: 2938, https://arxiv.org/pdf/2501.10403.pdf

Abstract:
Cyber polygon used to train cybersecurity professionals, test new security technologies and simulate attacks play an important role in ensuring cybersecurity. The creation of such training grounds is based on the use of hypervisors, which allow efficient management of virtual machines, isolating operating systems and resources of a physical computer from virtual machines, ensuring a high level of security and stability. The paper analyses various aspects of using hypervisors in cyber polygons, including types of hypervisors, their main functions, and the specifics of their use in modelling cyber threats. The article shows the ability of hypervisors to increase the efficiency of hardware resources, create complex virtual environments for detailed modelling of network structures and simulation of real situations in cyberspace.

Paperid: 2939, https://arxiv.org/pdf/2501.09808.pdf

Abstract:
Many Security Operations Centers (SOCs) today still heavily rely on signature-based Network Intrusion Detection Systems (NIDS) such as Suricata. The specificity of intrusion detection rules and the coverage provided by rulesets are common concerns within the professional community surrounding SOCs, which impact the effectiveness of automated alert post-processing approaches. We postulate a better understanding of factors influencing the quality of rules can help address current SOC issues. In this paper, we characterize the rules in use at a collaborating commercial (managed) SOC serving customers in sectors including education and IT management. During this process, we discover six relevant design principles, which we consolidate through interviews with experienced rule designers at the SOC.We then validate our design principles by quantitatively assessing their effect on rule specificity. We find that several of these design considerations significantly impact unnecessary workload caused by rules. For instance, rules that leverage proxies for detection, and rules that do not employ alert throttling or do not distinguish (un)successful malicious actions, cause significantly more workload for SOC analysts. Moreover, rules that match a generalized characteristic to detect malicious behavior, which is believed to increase coverage, also significantly increase workload, suggesting a tradeoff must be struck between rule specificity and coverage. We show that these design principles can be applied successfully at a SOC to reduce workload whilst maintaining coverage despite the prevalence of violations of the principles.

Paperid: 2940, https://arxiv.org/pdf/2501.08435.pdf

Abstract:
Quantum key distribution (QKD) allows Alice and Bob to share a secret key over an insecure channel with proven information-theoretic security against an adversary whose strategy is bounded only by the laws of physics. Composability-based security proofs of QKD ensure that using the established key with a one-time-pad encryption scheme provides information theoretic secrecy for the message. In this paper, we consider the problem of using the QKD established key with a secure symmetric key-based encryption algorithm and use an approach based on hybrid encryption to provide a proof of security for the composition. Hybrid encryption was first proposed as a public key cryptographic algorithm with proven security for messages of unrestricted length. We use an extension of this framework to correlated randomness setting (Sharifian et al. in ISIT 2021) to propose a quantum-enabled Key Encapsulation Mechanism (qKEM) and quantum-enabled hybrid encryption (qHE), and prove a composition theorem for the security of the qHE. We construct a qKEM with proven security using an existing QKD (Portmann et al. in Rev. of Mod. Physics 2022). Using this qKEM with a secure Data Encapsulation Mechanism (DEM), that can be constructed using a one-time symmetric key encryption scheme, results in an efficient encryption system for unrestricted length messages with proved security against an adversary with access to efficient computations on a quantum computer (i.e. post-quantum secure encryption without using any computational assumptions.)

Paperid: 2941, https://arxiv.org/pdf/2501.08031.pdf

Abstract:
Random number generation plays a vital role in cryptographic systems and computational applications, where uniformity, unpredictability, and robustness are essential. This paper presents the Entropy Mixing Network (EMN), a novel hybrid random number generator designed to enhance randomness quality by combining deterministic pseudo-random generation with periodic entropy injection. To evaluate its effectiveness, we propose a comprehensive assessment framework that integrates statistical tests, advanced metrics, and visual analyses, providing a holistic view of randomness quality, predictability, and computational efficiency. The results demonstrate that EMN outperforms Python's SystemRandom and MersenneTwister in critical metrics, achieving the highest Chi-squared p-value (0.9430), entropy (7.9840), and lowest predictability (-0.0286). These improvements come with a trade-off in computational performance, as EMN incurs a higher generation time (0.2602 seconds). Despite this, its superior randomness quality makes it particularly suitable for cryptographic applications where security is prioritized over speed.

Paperid: 2942, https://arxiv.org/pdf/2501.07868.pdf

Abstract:
Field Programmable Gate Array (FPGA)-based embedded systems have become mainstream in the last decade, often in security-sensitive applications. However, even with an authenticated hardware platform, compromised software can severely jeopardize the overall system security, making hardware protection insufficient if the software itself is malicious. In this paper, we propose a novel low-overhead hardware-software co-design solution that utilizes Physical Unclonable Functions (PUFs) to ensure the authenticity of program binaries for microprocessors/microcontrollers mapped on the FPGA. Our technique binds a program binary to a specific target FPGA through a PUF signature, performs runtime authentication for the program binary, and allows execution of the binary only after successful authentication. The proposed scheme is platform-agnostic and capable of operating in a "bare metal'' mode (no system software requirement) for maximum flexibility. Our scheme also does not require any modification of the original hardware design or program binary. We demonstrate a successful prototype implementation using the open-source PicoBlaze microcontroller on AMD/Xilinx FPGA, comparing its hardware resource footprint and performance with other existing solutions of a similar nature.

Paperid: 2943, https://arxiv.org/pdf/2501.07703.pdf

Abstract:
The Internet of Medical Things (IoMT) has transformed the healthcare industry by connecting medical devices in monitoring treatment outcomes of patients. This increased connectivity has resulted to significant security vulnerabilities in the case of malware and Distributed Denial of Service (DDoS) attacks. This literature review examines the vulnerabilities of IoMT devices, focusing on critical threats and exploring mitigation strategies. We conducted a comprehensive search across leading databases such as ACM Digital Library, IEEE Xplore, and Elsevier to analyze peer-reviewed studies published within the last five years (from 2019 to 2024). The review shows that inadequate encryption protocols, weak authentication methods, and irregular firmware updates are the main causes of risks associated with IoMT devices. We have identified emerging solutions like machine learning algorithms, blockchain technology, and edge computing as promising approaches to enhance IoMT security. This review emphasizes the pressing need to develop lightweight security measures and standardized protocols to protect patient data and ensure the integrity of healthcare services.

Paperid: 2944, https://arxiv.org/pdf/2501.07435.pdf

Abstract:
We present Union, a trust-minimized bridge protocol that enables secure transfer of BTC between Bitcoin and a secondary blockchain. The growing ecosystem of blockchain systems built around Bitcoin has created a pressing need for secure and efficient bridges to transfer BTC between networks while preserving Bitcoin's security guarantees. Union employs a multi-party variant of BitVMX, an optimistic proving system on Bitcoin, to create a bridge that operates securely under the assumption that at least one participant remains honest. This 1-of-n honest approach is strikingly different from the conventional honest-majority assumption adopted by practically all federated systems. The protocol introduces several innovations: a packet-based architecture that allows security bonds to be reused for multiple bridge operations, improving capital efficiency; a system of enablers to manage functionaries participation and to enforce penalties; a flexible light client framework adaptable to various blockchain architectures; and an efficient stop watch mechanism to optimize time-lock management. Union is a practical and scalable solution for Bitcoin interoperability that maintains strong security guarantees and minimizes trust assumptions.

Paperid: 2945, https://arxiv.org/pdf/2501.07326.pdf

Abstract:
There is an expectation that users of home IoT devices will be able to secure those devices, but they may lack information about what they need to do. In February 2022, we launched a web service that scans users' IoT devices to determine how secure they are. The service aims to diagnose and remediate vulnerabilities and malware infections of IoT devices of Japanese users. This paper reports on findings from operating this service drawn from three studies: (1) the engagement of 114,747 users between February, 2022 - May, 2024; (2) a large-scale evaluation survey among service users (n=4,103), and; (3) an investigation and targeted survey (n=90) around the remediation actions of users of non-secure devices. During the operation, we notified 417 (0.36%) users that one or more of their devices were detected as vulnerable, and 171 (0.15%) users that one of their devices was infected with malware. The service found no issues for 99% of users. Still, 96% of all users evaluated the service positively, most often for it providing reassurance, being free of charge, and short diagnosis time. Of the 171 users with malware infections, 67 returned to the service later for a new check, with 59 showing improvement. Of the 417 users with vulnerable devices, 151 users revisited and re-diagnosed, where 75 showed improvement. We report on lessons learned, including a consideration of the capabilities that non-expert users will assume of a security scan.

Paperid: 2946, https://arxiv.org/pdf/2501.07028.pdf

Abstract:
Due to the current standard of Security Credential Management System (SCMS) for Vehicle-to-Everything (V2X) communications using asymmetric cryptography, specifically Elliptic-Curve Cryptography (ECC), which may be vulnerable to quantum computing attacks. Therefore, the V2X SCMS is threatened by quantum computing attacks. However, although the National Institute of Standards and Technology (NIST) has already selected Post-Quantum Cryptography (PQC) algorithms as the standard, the current PQC algorithms may have issues such as longer public key lengths, longer signature lengths, or lower signature generation and verification efficiency, which may not fully meet the requirements of V2X communication applications. In view of the challenges in V2X communication, such as packet length, signature generation and verification efficiency, security level, and vehicle privacy, this study proposes a hybrid certificate scheme of PQC and ECC. By leveraging the strengths of both PQC and ECC, this scheme aims to overcome the challenges in V2X communication. PQC is used to establish a security level resistant to quantum computing attacks, while ECC is utilized to establish anonymous certificates and reduce packet length to meet the requirements of V2X communication. In the practical experiments, the study implemented the SCMS end entity based on the Chunghwa Telecom SCMS and the Clientron On-Board Unit (OBU) to conduct field tests in Danhai New Town in New Taipei City. The performance of various existing hybrid certificate schemes combining PQC (e.g., Dilithium, Falcon, and SPHINCS+) and ECC is compared, and a practical solution is provided for V2X industries.

Paperid: 2947, https://arxiv.org/pdf/2501.06646.pdf

Abstract:
With lowering thresholds, transparently defending against Rowhammer within DRAM is challenging due to the lack of time to perform mitigation. Commercially deployed in-DRAM defenses like TRR that steal time from normal refreshes~(REF) to perform mitigation have been proven ineffective against Rowhammer. In response, a new Refresh Management (RFM) interface has been added to the DDR5 specifications. RFM provides dedicated time to an in-DRAM defense to perform mitigation. Several recent works have used RFM for the intended purpose - building better Rowhammer defenses. However, to the best of our knowledge, no prior study has looked at the potential security implications of this new feature if an attacker subjects it to intentional misuse. Our paper shows that RFM introduces new side effects in the system - the activity of one bank causes interference with the operation of the other banks. Thus, the latency of a bank becomes dependent on the activity of other banks. We use these side effects to build two new attacks. First, a novel memory-based covert channel, which has a bandwidth of up to 31.3 KB/s, and is also effective even in a bank-partitioned system. Second, a new Denial-of-Service (DOS) attack pattern that exploits the activity within a single bank to reduce the performance of the other banks. Our experiments on SPEC2017, PARSEC, and LIGRA workloads show a slowdown of up to 67\% when running alongside our DOS pattern. We also discuss potential countermeasures for our attacks.

Paperid: 2948, https://arxiv.org/pdf/2501.06515.pdf

Abstract:
Physical Unclonable Functions (PUFs) based on Non-Volatile Memory (NVM) technology have emerged as a promising solution for secure authentication and cryptographic applications. By leveraging the multi-level cell (MLC) characteristic of NVMs, these PUFs can generate a wide range of unique responses, enhancing their resilience to machine learning (ML) modeling attacks. However, a significant issue with NVM-based PUFs is their endurance problem; frequent write operations lead to wear and degradation over time, reducing the reliability and lifespan of the PUF. This paper addresses these issues by offering a comprehensive model to predict and analyze the effects of endurance changes on NVM PUFs. This model provides insights into how wear impacts the PUF's quality and helps in designing more robust PUFs. Building on this model, we present a novel design for NVM PUFs that significantly improves endurance. Our design approach incorporates advanced techniques to distribute write operations more evenly and reduce stress on individual cells. The result is an NVM PUF that demonstrates a $62\times$ improvement in endurance compared to current state-of-the-art solutions while maintaining protection against learning-based attacks.

Paperid: 2950, https://arxiv.org/pdf/2501.06305.pdf

Abstract:
Cloud computing has emerged as a crucial solution for managing data- and compute-intensive workflows, offering scalability to address dynamic demands. However, security concerns persist, especially for workflows involving sensitive data and tasks. One of the main gaps in the literature is the lack of robust and flexible measures for reacting to these security violations. To address this, we propose an innovative approach leveraging Reinforcement Learning (RL) to formulate adaptation chains, responding effectively to security violations within cloud-based workflows. These chains consist of sequences of adaptation actions tailored to attack characteristics, workflow dependencies, and user-defined requirements. Unlike conventional single-task adaptations, adaptation chains provide a comprehensive mitigation strategy by taking into account both control and data dependencies between tasks, thereby accommodating conflicting objectives effectively. Moreover, our RL-based approach uses insights from past responses to mitigate uncertainties associated with adaptation costs. We evaluate the method using our jBPM and Cloudsim Plus based implementation and compare the impact of selected adaptation chains on workflows with the single adaptation approach. Results demonstrate that the adaptation chain approach outperforms in terms of total adaptation cost, offering resilience and adaptability against security threats.

Paperid: 2951, https://arxiv.org/pdf/2501.06264.pdf

Abstract:
This research introduces a robust detection system against malicious network traffic, leveraging hierarchical structures and self-attention mechanisms. The proposed system includes a Packet Segmenter that divides a given raw network packet into fixed-size segments that are fed to the HPAC-IDS. The experiments performed on CIC-IDS2017 dataset show that the system exhibits high accuracy and low false positive rates while demonstrating resilience against diverse adversarial methods like Fast Gradient Sign Method (FGSM), Projected Gradient Descent (PGD), and Wasserstein GAN (WGAN). The model's ability to withstand adversarial perturbations is attributed to the fusion of hierarchical attention mechanisms and convolutional neural networks, resulting in a 0% to 10% adversarial attack severity under tested adversarial attacks with different segment sizes, surpassing the state-of-the-art model in detection performance and adversarial attack robustness.

Paperid: 2952, https://arxiv.org/pdf/2501.06237.pdf

Abstract:
In the evolving landscape of data privacy, the anonymization of electric load profiles has become a critical issue, especially with the enforcement of the General Data Protection Regulation (GDPR) in Europe. These electric load profiles, which are essential datasets in the energy industry, are classified as personal behavioral data, necessitating stringent protective measures. This article explores the implications of this classification, the importance of data anonymization, and the potential of forecasting using microaggregated data. The findings underscore that effective anonymization techniques, such as microaggregation, do not compromise the performance of forecasting models under certain conditions (i.e., forecasting aggregated). In such an aggregated level, microaggregated data maintains high levels of utility, with minimal impact on forecasting accuracy. The implications for the energy sector are profound, suggesting that privacy-preserving data practices can be integrated into smart metering technology applications without hindering their effectiveness.

Paperid: 2953, https://arxiv.org/pdf/2501.06084.pdf

Abstract:
The Fisher-Yates shuffle is a well-known algorithm for shuffling a finite sequence, such that every permutation is equally likely. Despite its simplicity, it is prone to implementation errors that can introduce bias into the generated permutations. We verify its correctness in Dafny as follows. First, we define a functional model that operates on sequences and streams of random bits. Second, we establish that the functional model has the desired distribution. Third, we define an executable imperative implementation that operates on arrays and prove it equivalent to the functional model. The approach may serve as a blueprint for the verification of more complex algorithms.

Paperid: 2954, https://arxiv.org/pdf/2501.04233.pdf

Abstract:
Let $\gf_{p^n}$ denote the finite field containing $p^n$ elements, where $n$ is a positive integer and $p$ is a prime. The function $f_u(x)=x^{\frac{p^n+3}{2}}+ux^2$ over $\gf_{p^n}[x]$ with $u\in\gf_{p^n}\setminus\{0,\pm1\}$ was recently studied by Budaghyan and Pal in \cite{Budaghyan2024ArithmetizationorientedAP}, whose differential uniformity is at most $5$ when $p^n\equiv3~(mod~4)$. In this paper, we study the differential uniformity and the differential spectrum of $f_u$ for $u=\pm1$. We first give some properties of the differential spectrum of any cryptographic function. Moreover, by solving some systems of equations over finite fields, we express the differential spectrum of $f_{\pm1}$ in terms of the quadratic character sums.

Paperid: 2955, https://arxiv.org/pdf/2501.04104.pdf

Abstract:
As autonomous vehicle (AV) technology advances towards maturity, it becomes imperative to examine the security vulnerabilities within these cyber-physical systems. While conventional cyber-security concerns are often at the forefront of discussions, it is essential to get deeper into the various layers of vulnerability that are often overlooked within mainstream frameworks. Our goal is to spotlight imminent challenges faced by AV operators and explore emerging technologies for comprehensive solutions. This research outlines the diverse security layers, spanning physical, cyber, coding, and communication aspects, in the context of AVs. Furthermore, we provide insights into potential solutions for each potential attack vector, ensuring that autonomous vehicles remain secure and resilient in an evolving threat landscape.

Paperid: 2956, https://arxiv.org/pdf/2501.02754.pdf

Abstract:
In recent years, attention-based models have excelled across various domains but remain vulnerable to backdoor attacks, often from downloading or fine-tuning on poisoned datasets. Many current methods to mitigate backdoors in NLP models rely on the pre-trained (unfine-tuned) weights, but these methods fail in scenarios where the pre-trained weights are not available. In this work, we propose MBTSAD, which can mitigate backdoors in the language model by utilizing only a small subset of clean data and does not require pre-trained weights. Specifically, MBTSAD retrains the backdoored model on a dataset generated by token splitting. Then MBTSAD leverages attention distillation, the retrained model is the teacher model, and the original backdoored model is the student model. Experimental results demonstrate that MBTSAD achieves comparable backdoor mitigation performance as the methods based on pre-trained weights while maintaining the performance on clean data. MBTSAD does not rely on pre-trained weights, enhancing its utility in scenarios where pre-trained weights are inaccessible. In addition, we simplify the min-max problem of adversarial training and visualize text representations to discover that the token splitting method in MBTSAD's first step generates Out-of-Distribution (OOD) data, leading the model to learn more generalized features and eliminate backdoor patterns.

Paperid: 2957, https://arxiv.org/pdf/2501.02147.pdf

Abstract:
This paper investigates the resilience of a ResNet-50 image classification model under two prominent security threats: Fast Gradient Sign Method (FGSM) adversarial attacks and malicious payload injection. Initially, the model attains a 53.33% accuracy on clean images. When subjected to FGSM perturbations, its overall accuracy remains unchanged; however, the model's confidence in incorrect predictions notably increases. Concurrently, a payload injection scheme is successfully executed in 93.33% of the tested samples, revealing how stealthy attacks can manipulate model predictions without degrading visual quality. These findings underscore the vulnerability of even high-performing neural networks and highlight the urgency of developing more robust defense mechanisms for security-critical applications.

Paperid: 2958, https://arxiv.org/pdf/2501.02118.pdf

Abstract:
Logic locking has emerged to prevent piracy and overproduction of integrated circuits ever since the split of the design house and manufacturing foundry was established. While there has been a lot of research using a single global key to lock the circuit, even the most sophisticated single-key locking methods have been shown to be vulnerable to powerful SAT-based oracle-guided attacks that can extract the correct key with the help of an activated chip bought off the market and the locked netlist leaked from the untrusted foundry. To address this challenge, we propose, implement, and evaluate a novel logic locking method called K-Gate Lock that encodes input patterns using multiple keys that are applied to one set of key inputs at different operational times. Our comprehensive experimental results confirm that using multiple keys will make the circuit secure against oracle-guided attacks and increase attacker efforts to an exponentially time-consuming brute force search. K-Gate Lock has reasonable power and performance overheads, making it a practical solution for real-world hardware intellectual property protection.

Paperid: 2959, https://arxiv.org/pdf/2501.02011.pdf

Abstract:
Counterfeiting poses an evergrowing challenge, driving the need for innovative and sophisticated anti-counterfeiting strategies and technologies. Many solutions focus on tags characterized by optical features that are partially or completely camouflaged to the human eye, thus discouraging scammers. In this paper, a QR code is laser printed on a thin plastic foil previously coated by a specific nanocavity consisting of a metal/insulator/metal/insulator (MIMI) multilayer. This metamaterial possesses unique features in terms of light transmission that are due to the specific design. A thin layer of polymer dispersed liquid crystals, fabricated incorporating specific nematic liquid crystals in a polymer matrix, is able to camouflage the QR code that becomes, then, readable only under specific thermal conditions. Three anti-counterfeiting tags were fabricated, each using a distinct LC with its own nematic-isotropic transition temperature. The peculiar combination of the unique optical properties of nematic liquid crystals and optical nanocavities results in the creation of a novel type of tags showing two different encoding levels. Stress tests including water immersion, bending test, and prolonged heating have been performed ensuring the long-term stability of the tags. The realized two security-level anti-counterfeiting tags are cost-effective, straightforward to manufacture and, thanks to their flexibility, can be easily integrated into packaging and products.

Paperid: 2960, https://arxiv.org/pdf/2501.01818.pdf

Abstract:
LLM routers aim to balance quality and cost of generation by classifying queries and routing them to a cheaper or more expensive LLM depending on their complexity. Routers represent one type of what we call LLM control planes: systems that orchestrate use of one or more LLMs. In this paper, we investigate routers' adversarial robustness. We first define LLM control plane integrity, i.e., robustness of LLM orchestration to adversarial inputs, as a distinct problem in AI safety. Next, we demonstrate that an adversary can generate query-independent token sequences we call ``confounder gadgets'' that, when added to any query, cause LLM routers to send the query to a strong LLM. Our quantitative evaluation shows that this attack is successful both in white-box and black-box settings against a variety of open-source and commercial routers, and that confounding queries do not affect the quality of LLM responses. Finally, we demonstrate that gadgets can be effective while maintaining low perplexity, thus perplexity-based filtering is not an effective defense. We finish by investigating alternative defenses.

Paperid: 2961, https://arxiv.org/pdf/2501.01732.pdf

Abstract:
Customer Identity and Access Management (CIAM) systems play a pivotal role in securing enterprise infrastructures. However, the complexity of implementing these systems requires careful architectural planning to ensure positive Return on Investment (RoI) and avoid costly delays. The proliferation of Active Persistent cyber threats, coupled with advancements in AI, cloud computing, and geographically distributed customer populations, necessitates a paradigm shift towards adaptive and zero-trust security frameworks. This paper introduces the Combined Hyper-Extensible Extremely-Secured Zero-Trust (CHEZ) CIAM-PAM architecture, designed specifically for large-scale enterprises. The CHEZ PL CIAM-PAM framework addresses critical security gaps by integrating federated identity management (private and public identities), password-less authentication, adaptive multi-factor authentication (MFA), microservice-based PEP (Policy Entitlement Point), multi-layer RBAC (Role Based Access Control) and multi-level trust systems. This future-proof design also includes end-to-end data encryption, and seamless integration with state-of-the-art AI-based threat detection systems, while ensuring compliance with stringent regulatory standards.

Paperid: 2962, https://arxiv.org/pdf/2501.01089.pdf

Abstract:
In the face of increasing cyber threats, particularly ransomware attacks, there is a pressing need for advanced detection and analysis systems that adapt to evolving malware behaviours. Throughout the literature, using machine learning (ML) to obviate ransomware attacks has increased in popularity. Unfortunately, most of these proposals leverage non-incremental learning approaches that require the underlying models to be updated from scratch to detect new ransomware, wasting time and resources. This approach is problematic because it leaves sensitive data vulnerable to attack during retraining, as newly emerging ransomware strains may go undetected until the model is updated. Furthermore, most of these approaches are not designed to detect ransomware in real-time data streams, limiting their effectiveness in complex network environments. To address this challenge, we present the Sysmon Incremental Learning System for Ransomware Analysis and Detection (SILRAD), which enables continuous updates to the underlying model and effectively closes the training gap. By leveraging the capabilities of Sysmon for detailed monitoring of system activities, our approach integrates online incremental learning techniques to enhance the adaptability and efficiency of ransomware detection. The most valuable features for detection were selected using the Pearson Correlation Coefficient (PCC), and concept drift detection was implemented through the ADWIN algorithm, ensuring that the model remains responsive to changes in ransomware behaviour. We compared our results to other popular techniques, such as Hoeffding Trees (HT) and Leveraging Bagging Classifier (LB), observing a detection accuracy of 98.89% and a Matthews Correlation Coefficient (MCC) rate of 94.11%, demonstrating the effectiveness of our technique.

Paperid: 2963, https://arxiv.org/pdf/2501.01083.pdf

Abstract:
In response to the increasing ransomware threat, this study presents a novel detection system that integrates Convolutional Neural Networks (CNNs) and Long Short-Term Memory (LSTM) networks. By leveraging Sysmon logs, the system enables real-time analysis on Windows-based endpoints. Our approach overcomes the limitations of traditional models by employing batch-based incremental learning, allowing the system to continuously adapt to new ransomware variants without requiring complete retraining. The proposed model achieved an impressive average F2-score of 99.61\%, with low false positive and false negative rates of 0.17\% and 4.69\%, respectively, within a highly imbalanced dataset. This demonstrates exceptional accuracy in identifying malicious behaviour. The dynamic detection capabilities of Sysmon enhance the model's effectiveness by providing a reliable stream of security events, mitigating the vulnerabilities associated with static detection methods. Furthermore, the parallel processing of LSTM modules, combined with attention mechanisms, significantly improves training efficiency and reduces latency, making our system well-suited for real-world applications. These findings underscore the potential of our CNN-LSTM framework as a robust solution for real-time ransomware detection, ensuring adaptability and resilience in the face of evolving cyber threats.

Paperid: 2964, https://arxiv.org/pdf/2501.00940.pdf

Abstract:
The rapid evolution of modern malware presents significant challenges to the development of effective defense mechanisms. Traditional cyber deception techniques often rely on static or manually configured parameters, limiting their adaptability to dynamic and sophisticated threats. This study leverages Generative AI (GenAI) models to automate the creation of adaptive cyber deception ploys, focusing on structured prompt engineering (PE) to enhance relevance, actionability, and deployability. We introduce a systematic framework (SPADE) to address inherent challenges large language models (LLMs) pose to adaptive deceptions, including generalized outputs, ambiguity, under-utilization of contextual information, and scalability constraints. Evaluations across diverse malware scenarios using metrics such as Recall, Exact Match (EM), BLEU Score, and expert quality assessments identified ChatGPT-4o as the top performer. Additionally, it achieved high engagement (93%) and accuracy (96%) with minimal refinements. Gemini and ChatGPT-4o Mini demonstrated competitive performance, with Llama3.2 showing promise despite requiring further optimization. These findings highlight the transformative potential of GenAI in automating scalable, adaptive deception strategies and underscore the critical role of structured PE in advancing real-world cybersecurity applications.

Paperid: 2965, https://arxiv.org/pdf/2501.00824.pdf

Abstract:
Collaborative inference (CI) improves computational efficiency for edge devices by transmitting intermediate features to cloud models. However, this process inevitably exposes feature representations to model inversion attacks (MIAs), enabling unauthorized data reconstruction. Despite extensive research, there is no established criterion for assessing the difficulty of MIA implementation, leaving a fundamental question unanswered: \textit{What factors truly and verifiably determine the attack's success in CI?} Moreover, existing defenses lack the theoretical foundation described above, making it challenging to regulate feature information effectively while ensuring privacy and minimizing computational overhead. These shortcomings introduce three key challenges: theoretical gap, methodological limitation, and practical constraint. To overcome these challenges, we propose the first theoretical criterion to assess MIA difficulty in CI, identifying mutual information, entropy, and effective information volume as key influencing factors. The validity of this criterion is demonstrated by using the mutual information neural estimator. Building on this insight, we propose SiftFunnel, a privacy-preserving framework to resist MIA while maintaining usability. Specifically, we incorporate linear and non-linear correlation constraints alongside label smoothing to suppress redundant information transmission, effectively balancing privacy and usability. To enhance deployability, the edge model adopts a funnel-shaped structure with attention mechanisms, strengthening privacy while reducing computational and storage burdens. Experiments show that, compared to state-of-the-art defense, SiftFunnel increases reconstruction error by $\sim$30\%, lowers mutual and effective information metrics by $\geq$50\%, and reduces edge burdens by almost $20\times$, while maintaining comparable usability.

Paperid: 2966, https://arxiv.org/pdf/2501.00537.pdf

Abstract:
Explainable Artificial Intelligence (XAI) plays an important role in improving the transparency and reliability of complex machine learning models, especially in critical domains such as cybersecurity. Despite the prevalence of heuristic interpretation methods such as SHAP and LIME, these techniques often lack formal guarantees and may produce inconsistent local explanations. To fulfill this need, few tools have emerged that use formal methods to provide formal explanations. Among these, XReason uses a SAT solver to generate formal instance-level explanation for XGBoost models. In this paper, we extend the XReason tool to support LightGBM models as well as class-level explanations. Additionally, we implement a mechanism to generate and detect adversarial examples in XReason. We evaluate the efficiency and accuracy of our approach on the CICIDS-2017 dataset, a widely used benchmark for detecting network attacks.

Paperid: 2967, https://arxiv.org/pdf/2501.00356.pdf

Abstract:
Malicious URL (Uniform Resource Locator) classification is a pivotal aspect of Cybersecurity, offering defense against web-based threats. Despite deep learning's promise in this area, its advancement is hindered by two main challenges: the scarcity of comprehensive, open-source datasets and the limitations of existing models, which either lack real-time capabilities or exhibit suboptimal performance. In order to address these gaps, we introduce a novel, multi-class dataset for malicious URL classification, distinguishing between benign, phishing and malicious URLs, named DeepURLBench. The data has been rigorously cleansed and structured, providing a superior alternative to existing datasets. Notably, the multi-class approach enhances the performance of deep learning models, as compared to a standard binary classification approach. Additionally, we propose improvements to string-based URL classifiers, applying these enhancements to URLNet. Key among these is the integration of DNS-derived features, which enrich the model's capabilities and lead to notable performance gains while preserving real-time runtime efficiency-achieving an effective balance for cybersecurity applications.

Paperid: 2968, https://arxiv.org/pdf/2501.00264.pdf

Abstract:
Wireless Sensor Networks (WSNs) continue to experience rapid developments and integration into modern-day applications. Overall, WSNs collect and process relevant data through sensors or nodes and communicate with different networks for superior information management. Nevertheless, a primary concern relative to WSNs is security. Considering the high constraints on throughput, battery, processing power, and memory, typical security procedures present limitations for application in WSNs. This research focuses on the integration of WSNs with the cloud platform, specifically to address these security risks. The cloud platform also adopts a security-driven approach and has attracted many applications across various sectors globally. This research specifically explores how cloud computing could be exploited to impede Denial of Service attacks from endangering WSNs. WSNs are now deployed in various low-powered applications, including disaster management, homeland security, battlefield surveillance, agriculture, and the healthcare industry. WSNs are distinguished from traditional networks by the numerous wireless connected sensors being deployed to conduct an assigned task. In testing scenarios, the size of WSNs ranges from a few to several thousand. The overarching requirements of WSNs include rapid processing of collected data, low-cost installation and maintenance, and low latency in network operations. Given that a substantial amount of WSN applications are used in high-risk and volatile environments, they must effectively address security concerns. This includes the secure movement, storage, and communication of data through networks, an environment in which WSNs are notably vulnerable. The limitations of WSNs have meant that they are predominantly used in unsecured applications despite positive advancements. This study explores methods for integrating the WSN with the cloud.

Paperid: 2969, https://arxiv.org/pdf/2501.00261.pdf

Abstract:
The introduction sets the stage for exploring collaborative approaches to bolstering smart vehicle cybersecurity through AI-driven threat detection. As the automotive industry increasingly adopts connected and automated vehicles (CAVs), the need for robust cybersecurity measures becomes paramount. With the emergence of new vulnerabilities and security requirements, the integration of advanced technologies such as 5G networks, blockchain, and quantum computing presents promising avenues for enhancing CAV cybersecurity . Additionally, the roadmap for cybersecurity in autonomous vehicles emphasizes the importance of efficient intrusion detection systems and AI-based techniques, along with the integration of secure hardware, software stacks, and advanced threat intelligence to address cybersecurity challenges in future autonomous vehicles.

Paperid: 2970, https://arxiv.org/pdf/2501.00193.pdf

Abstract:
Pseudo-random number generators (PRNGs) are essential in a wide range of applications, from cryptography to statistical simulations and optimization algorithms. While uniform randomness is crucial for security-critical areas like cryptography, many domains, such as simulated annealing and CMOS-based Ising Machines, benefit from controlled or non-uniform randomness to enhance solution exploration and optimize performance. This paper presents a hardware PRNG that can simultaneously generate multiple uncorrelated sequences with programmable statistics tailored to specific application needs. Designed in 65nm process, the PRNG occupies an area of approximately 0.0013mm^2 and has an energy consumption of 0.57pJ/bit. Simulations confirm the PRNG's effectiveness in modulating the statistical distribution while demonstrating high-quality randomness properties.

Paperid: 2971, https://arxiv.org/pdf/2506.23644.pdf

Abstract:
We introduce QLPro, a vulnerability detection framework that systematically integrates LLMs and static analysis tools to enable comprehensive vulnerability detection across entire open-source projects.We constructed a new dataset, JavaTest, comprising 10 open-source projects from GitHub with 62 confirmed vulnerabilities. CodeQL, a state-of-the-art static analysis tool, detected only 24 of these vulnerabilities while QLPro detected 41. Furthermore, QLPro discovered 6 previously unknown vulnerabilities, 2 of which have been confirmed as 0-days.

Paperid: 2972, https://arxiv.org/pdf/2506.23296.pdf

Abstract:
Embedded into information systems, artificial intelligence (AI) faces security threats that exploit AI-specific vulnerabilities. This paper provides an accessible overview of adversarial attacks unique to predictive and generative AI systems. We identify eleven major attack types and explicitly link attack techniques to their impacts -- including information leakage, system compromise, and resource exhaustion -- mapped to the confidentiality, integrity, and availability (CIA) security triad. We aim to equip researchers, developers, security practitioners, and policymakers, even those without specialized AI security expertise, with foundational knowledge to recognize AI-specific risks and implement effective defenses, thereby enhancing the overall security posture of AI systems.

Paperid: 2973, https://arxiv.org/pdf/2506.22787.pdf

Abstract:
We propose a harm-centric conceptualization of privacy that asks: What harms from personal data use can privacy prevent? The motivation behind this research is limitations in existing privacy frameworks (e.g., Contextual Integrity) to capture or categorize many of the harms that arise from modern technology's use of personal data. We operationalize this conceptualization in an online study with 400 college and university students. Study participants indicated their perceptions of different harms (e.g., manipulation, discrimination, and harassment) that may arise when artificial intelligence-based algorithms infer personal data (e.g., demographics, personality traits, and cognitive disability) and use it to identify students who are likely to drop out of a course or the best job candidate. The study includes 14 harms and six types of personal data selected based on an extensive literature review. Comprehensive statistical analyses of the study data show that the 14 harms are internally consistent and collectively represent a general notion of privacy harms. The study data also surfaces nuanced perceptions of harms, both across the contexts and participants' demographic factors. Based on these results, we discuss how privacy can be improved equitably. Thus, this research not only contributes to enhancing the understanding of privacy as a concept but also provides practical guidance to improve privacy in the context of education and employment.

Paperid: 2974, https://arxiv.org/pdf/2506.22706.pdf

Abstract:
In the face of evolving cyber threats such as malware, ransomware and phishing, autonomous cybersecurity defense (ACD) systems have become essential for real-time threat detection and response with optional human intervention. However, existing ACD systems rely on limiting assumptions, particularly the stationarity of the underlying network dynamics. In real-world scenarios, network topologies can change due to actions taken by attackers or defenders, system failures, or time evolution of networks, leading to failures in the adaptive capabilities of current defense agents. Moreover, many agents are trained on static environments, resulting in overfitting to specific topologies, which hampers their ability to generalize to out-of-distribution network topologies. This work addresses these challenges by exploring methods for developing agents to learn generalizable policies across dynamic network environments -- general ACD (GACD).

Paperid: 2975, https://arxiv.org/pdf/2506.22482.pdf

Abstract:
With the advent of Internet of Things, Wireless Home Automation Systems WHAS are gradually gaining popularity. These systems are faced with multiple challenges such as security; controlling a variety of home appliances with a single interface and user friendliness. In this paper we propose a system that uses secure authentication systems of social networking websites such as Twitter, tracks the end-users activities on the social network and then control his or her domestic appliances. At the end, we highlight the applications of the proposed WHAS and compare the advantages of our proposed system over traditional home automation systems.

Paperid: 2976, https://arxiv.org/pdf/2506.21425.pdf

Abstract:
Traffic anomalies and attacks are commonplace in today's networks and identifying them rapidly and accurately is critical for large network operators. For a statistical intrusion detection system (IDS), it is crucial to detect at the flow-level for accurate detection and mitigation. However, existing IDS systems offer only limited support for 1) interactively examining detected intrusions and anomalies, 2) analyzing worm propagation patterns, 3) and discovering correlated attacks. These problems are becoming even more acute as the traffic on today's high-speed routers continues to grow. IDGraphs is an interactive visualization system for intrusion detection that addresses these challenges. The central visualization in the system is a flow-level trace plotted with time on the horizontal axis and aggregated number of unsuccessful connections on the vertical axis. We then summarize a stack of tens or hundreds of thousands of these traces using the Histographs [RW05] technique, which maps data frequency at each pixel to brightness. Users may then interactively query the summary view, performing analysis by highlighting subsets of the traces. For example, brushing a linked correlation matrix view highlights traces with similar patterns, revealing distributed attacks that are difficult to detect using standard statistical analysis. We apply IDGraphs system to a real network router data-set with 179M flow-level records representing a total traffic of 1.16TB. The system successfully detects and analyzes a variety of attacks and anomalies, including port scanning, worm outbreaks, stealthy TCP SYN floodings, and some distributed attacks.

Paperid: 2977, https://arxiv.org/pdf/2506.21069.pdf

Abstract:
Electromagnetic (EM) covert channels pose significant threats to computer and communications security in air-gapped networks. Previous works exploit EM radiation from various components (e.g., video cables, memory buses, CPUs) to secretly send sensitive information. These approaches typically require the attacker to deploy highly specialized receivers near the victim, which limits their real-world impact. This paper reports a new EM covert channel, TEMPEST-LoRa, that builds on Cross-Technology Covert Communication (CTCC), which could allow attackers to covertly transmit EM-modulated secret data from air-gapped networks to widely deployed operational LoRa receivers from afar. We reveal the potential risk and demonstrate the feasibility of CTCC by tackling practical challenges involved in manipulating video cables to precisely generate the EM leakage that could readily be received by third-party commercial LoRa nodes/gateways. Experiment results show that attackers can reliably decode secret data modulated by the EM leakage from a video cable at a maximum distance of 87.5m or a rate of 21.6 kbps. We note that the secret data transmission can be performed with monitors turned off (therefore covertly).

Paperid: 2978, https://arxiv.org/pdf/2506.20585.pdf

Abstract:
Mobile Crowd-Sensing (MCS) enables users with personal mobile devices (PMDs) to gain information on their surroundings. Users collect and contribute data on different phenomena using their PMD sensors, and the MCS system processes this data to extract valuable information for end users. Navigation MCS-based applications (N-MCS) are prevalent and important for transportation: users share their location and speed while driving and, in return, find efficient routes to their destinations. However, N-MCS are currently vulnerable to malicious contributors, often termed Sybils: submitting falsified data, seemingly from many devices that are not truly present on target roads, falsely reporting congestion when there is none, thus changing the road status the N-MCS infers. The attack effect is that the N-MCS returns suboptimal routes to users, causing late arrival and, overall, deteriorating road traffic flow. We investigate exactly the impact of Sybil-based attacks on N-MCS: we design an N-MCS system that offers efficient routing on top of the vehicular simulator SUMO, using the InTAS road network as our scenario. We design experiments attacking an individual N-MCS user as well as a larger population of users, selecting the adversary targets based on graph-theoretical arguments. Our experiments show that the resources required for a successful attack depend on the location of the attack (i.e., the surrounding road network and traffic) and the extent of Sybil contributed data for the targeted road(s). We demonstrate that Sybil attacks can alter the route of N-MCS users, increasing average travel time by 20% with Sybils 3% of the N-MCS user population.

Paperid: 2979, https://arxiv.org/pdf/2506.20444.pdf

Abstract:
Vulnerability detection is crucial for identifying security weaknesses in software systems. However, training effective machine learning models for this task is often constrained by the high cost and expertise required for data annotation. Active learning is a promising approach to mitigate this challenge by intelligently selecting the most informative data points for labeling. This paper proposes a novel method to significantly enhance the active learning process by using dataset maps. Our approach systematically identifies samples that are hard-to-learn for a model and integrates this information to create a more sophisticated sample selection strategy. Unlike traditional active learning methods that focus primarily on model uncertainty, our strategy enriches the selection process by considering learning difficulty, allowing the active learner to more effectively pinpoint truly informative examples. The experimental results show that our approach can improve F1 score over random selection by 61.54% (DeepGini) and 45.91% (K-Means) and outperforms standard active learning by 8.23% (DeepGini) and 32.65% (K-Means) for CodeBERT on the Big-Vul dataset, demonstrating the effectiveness of integrating dataset maps for optimizing sample selection in vulnerability detection. Furthermore, our approach also enhances model robustness, improves sample selection by filtering hard-to-learn data, and stabilizes active learning performance across iterations. By analyzing the characteristics of these outliers, we provide insights for future improvements in dataset construction, making vulnerability detection more reliable and cost-effective.

Paperid: 2980, https://arxiv.org/pdf/2506.19877.pdf

Abstract:
Identifying suitable machine learning paradigms for intrusion detection remains critical for building effective and generalizable security solutions. In this study, we present a controlled comparison of four representative models - Multi-Layer Perceptron (MLP), 1D Convolutional Neural Network (CNN), One-Class Support Vector Machine (OCSVM) and Local Outlier Factor (LOF) - on the CICIDS2017 dataset under two scenarios: detecting known attack types and generalizing to previously unseen threats. Our results show that supervised MLP and CNN achieve near-perfect accuracy on familiar attacks but suffer drastic recall drops on novel attacks. Unsupervised LOF attains moderate overall accuracy and high recall on unknown threats at the cost of elevated false alarms, while boundary-based OCSVM balances precision and recall best, demonstrating robust detection across both scenarios. These findings offer practical guidance for selecting IDS models in dynamic network environments.

Paperid: 2981, https://arxiv.org/pdf/2506.19870.pdf

Abstract:
Peer-to-peer trading and the move to decentralized grids have reshaped the energy markets in the United States. Notwithstanding, such developments lead to new challenges, mainly regarding the safety and authenticity of energy trade. This study aimed to develop and build a secure, intelligent, and efficient energy transaction system for the decentralized US energy market. This research interlinks the technological prowess of blockchain and artificial intelligence (AI) in a novel way to solve long-standing challenges in the distributed energy market, specifically those of security, fraudulent behavior detection, and market reliability. The dataset for this research is comprised of more than 1.2 million anonymized energy transaction records from a simulated peer-to-peer (P2P) energy exchange network emulating real-life blockchain-based American microgrids, including those tested by LO3 Energy and Grid+ Labs. Each record contains detailed fields of transaction identifier, timestamp, energy volume (kWh), transaction type (buy/sell), unit price, prosumer/consumer identifier (hashed for privacy), smart meter readings, geolocation regions, and settlement confirmation status. The dataset also includes system-calculated behavior metrics of transaction rate, variability of energy production, and historical pricing patterns. The system architecture proposed involves the integration of two layers, namely a blockchain layer and artificial intelligence (AI) layer, each playing a unique but complementary function in energy transaction securing and market intelligence improvement. The machine learning models used in this research were specifically chosen for their established high performance in classification tasks, specifically in the identification of energy transaction fraud in decentralized markets.

Paperid: 2982, https://arxiv.org/pdf/2506.19802.pdf

Abstract:
Despite extensive research on Machine Learning-based Network Intrusion Detection Systems (ML-NIDS), their capability to detect diverse attack variants remains uncertain. Prior studies have largely relied on homogeneous datasets, which artificially inflate performance scores and offer a false sense of security. Designing systems that can effectively detect a wide range of attack variants remains a significant challenge. The progress of ML-NIDS continues to depend heavily on human expertise, which can embed subjective judgments of system designers into the model, potentially hindering its ability to generalize across diverse attack types. To address this gap, we propose KnowML, a framework for knowledge-guided machine learning that integrates attack knowledge into ML-NIDS. KnowML systematically explores the threat landscape by leveraging Large Language Models (LLMs) to perform automated analysis of attack implementations. It constructs a unified Knowledge Graph (KG) of attack strategies, on which it applies symbolic reasoning to generate KG-Augmented Input, embedding domain knowledge directly into the design process of ML-NIDS. We evaluate KnowML on 28 realistic attack variants, of which 10 are newly collected for this study. Our findings reveal that baseline ML-NIDS models fail to detect several variants entirely, achieving F1 scores as low as 0 %. In contrast, our knowledge-guided approach achieves up to 99 % F1 score while maintaining a False Positive Rate below 0.1 %.

Paperid: 2983, https://arxiv.org/pdf/2506.19109.pdf

Abstract:
Prompt injection threatens novel applications that emerge from adapting LLMs for various user tasks. The newly developed LLM-based software applications become more ubiquitous and diverse. However, the threat of prompt injection attacks undermines the security of these systems as the mitigation and defenses against them, proposed so far, are insufficient. We investigated the capabilities of early prompt injection detection systems, focusing specifically on the detection performance of techniques implemented in various open-source solutions. These solutions are supposed to detect certain types of prompt injection attacks, including the prompt leak. In prompt leakage attacks, an attacker maliciously manipulates the LLM into outputting its system instructions, violating the system's confidentiality. Our study presents analyzes of distinct prompt leakage detection techniques, and a comparative analysis of several detection solutions, which implement those techniques. We identify the strengths and weaknesses of these techniques and elaborate on their optimal configuration and usage in high-stake deployments. In one of the first studies on existing prompt leak detection solutions, we compared the performances of LLM Guard, Vigil, and Rebuff. We concluded that the implementations of canary word checks in Vigil and Rebuff were not effective at detecting prompt leak attacks, and we proposed improvements for them. We also found an evasion weakness in Rebuff's secondary model-based technique and proposed a mitigation. Then, the result of the comparison of LLM Guard, Vigil, and Rebuff at their peak performance revealed that Vigil is optimal for cases when minimal false positive rate is required, and Rebuff is the most optimal for average needs.

Paperid: 2984, https://arxiv.org/pdf/2506.19052.pdf

Abstract:
Artificial Intelligence brings innovations into the society. However, bias and unethical exist in many algorithms that make the applications less trustworthy. Threats hunting algorithms based on machine learning have shown great advantage over classical methods. Reinforcement learning models are getting more accurate for identifying not only signature-based but also behavior-based threats. Quantum mechanics brings a new dimension in improving classification speed with exponential advantage. In this research, we developed a machine learning based cyber threat detection and assessment tool. It uses two stage, unsupervised and supervised learning, analyzing method on log data recorded from a web server on AWS cloud. The results show the algorithm has the ability to identify cyber threats with high confidence.

Paperid: 2985, https://arxiv.org/pdf/2506.18715.pdf

Abstract:
Vulnerability assessment is a critical challenge in cybersecurity, particularly in industrial environments. This work presents an innovative approach by incorporating the temporal dimension into vulnerability assessment, an aspect neglected in existing literature. Specifically, this paper focuses on refining vulnerability assessment and prioritization by integrating Common Vulnerability Scoring System (CVSS) Temporal Metrics with Bayesian Networks to account for exploit availability, remediation efforts, and confidence in reported vulnerabilities. Through probabilistic modeling, Bayesian networks enable a structured and adaptive evaluation of vulnerabilities, allowing for more accurate prioritization and decision-making. The proposed approach dynamically computes the Temporal Score and updates the CVSS Base Score by processing data on exploits and fixes from vulnerability databases.

Paperid: 2986, https://arxiv.org/pdf/2506.18100.pdf

Abstract:
Address Resolution Protocol (ARP) spoofing attacks severely threaten Internet of Things (IoT) networks by allowing attackers to intercept, modify, or block communications. Traditional detection methods are insufficient due to high false positives and poor adaptability. This research proposes a multi-layered machine learning-based framework for intelligently detecting ARP spoofing in IoT networks. Our approach utilizes an ensemble of classifiers organized into multiple layers, each layer optimizing detection accuracy and reducing false alarms. Experimental evaluations demonstrate significant improvements in detection accuracy (up to 97.5\%), reduced false positive rates (less than 2\%), and faster detection time compared to existing methods. Our key contributions include introducing multi-layer ensemble classifiers specifically tuned for IoT networks, systematically addressing dataset imbalance problems, introducing a dynamic feedback mechanism for classifier retraining, and validating practical applicability through extensive simulations. This research enhances security management in IoT deployments, providing robust defenses against ARP spoofing attacks and improving reliability and trust in IoT environments.

Paperid: 2987, https://arxiv.org/pdf/2506.18053.pdf

Abstract:
Architectural obfuscation - e.g., permuting hidden-state tensors, linearly transforming embedding tables, or remapping tokens - has recently gained traction as a lightweight substitute for heavyweight cryptography in privacy-preserving large-language-model (LLM) inference. While recent work has shown that these techniques can be broken under dedicated reconstruction attacks, their impact on mechanistic interpretability has not been systematically studied. In particular, it remains unclear whether scrambling a network's internal representations truly thwarts efforts to understand how the model works, or simply relocates the same circuits to an unfamiliar coordinate system. We address this gap by analyzing a GPT-2-small model trained from scratch with a representative obfuscation map. Assuming the obfuscation map is private and the original basis is hidden (mirroring an honest-but-curious server), we apply logit-lens attribution, causal path-patching, and attention-head ablation to locate and manipulate known circuits. Our findings reveal that obfuscation dramatically alters activation patterns within attention heads yet preserves the layer-wise computational graph. This disconnect hampers reverse-engineering of user prompts: causal traces lose their alignment with baseline semantics, and token-level logit attributions become too noisy to reconstruct. At the same time, feed-forward and residual pathways remain functionally intact, suggesting that obfuscation degrades fine-grained interpretability without compromising top-level task performance. These results establish quantitative evidence that architectural obfuscation can simultaneously (i) retain global model behaviour and (ii) impede mechanistic analyses of user-specific content. By mapping where interpretability breaks down, our study provides guidance for future privacy defences and for robustness-aware interpretability tooling.

Paperid: 2988, https://arxiv.org/pdf/2506.17988.pdf

Abstract:
Emerging crypto economies still hemorrhage digital assets because legacy wallets leak private keys at almost every layer of the software stack, from user-space libraries to kernel memory dumps. This paper solves that twin crisis of security and interoperability by re-imagining key management as a platform-level service anchored in ARM TrustZone through OP-TEE. Our architecture fractures the traditional monolithic Trusted Application into per-chain modules housed in a multi-tenant TA store, finally breaking OP-TEE's single-binary ceiling. A cryptographically sealed firmware-over-the-air pipeline welds each TA set to an Android system image, enabling hot-swap updates while Verified Boot enforces rollback protection. Every package carries a chained signature developer first, registry second so even a compromised supply chain cannot smuggle malicious code past the Secure World's RSA-PSS gatekeeper. Inside the TEE, strict inter-TA isolation, cache partitioning, and GP-compliant crypto APIs ensure secrets never bleed across trust boundaries or timing domains. The Rich Execution Environment can interact only via hardware-mediated Secure Monitor Calls, collapsing the surface exposed to malware in Android space. End-users enjoy a single polished interface yet can install or retire Bitcoin, Ethereum, Solana, or tomorrow's chain with one tap, shrinking both storage footprint and audit scope. For auditors, the composition model slashes duplicated verification effort by quarantining blockchain logic inside narrowly scoped modules that share formally specified interfaces. Our threat analysis spans six adversary layers and shows how the design neutralizes REE malware sniffing, OTA injection, and cross-module side channels without exotic hardware. A reference implementation on AOSP exports a Wallet Manager HAL, custom SELinux domains, and a CI/CD pipeline that vet community modules before release. The result is not merely another hardware wallet but a programmable substrate that can evolve at the velocity of the blockchain ecosystem. By welding radical extensibility to hardware-anchored assurance, the platform closes the security-usability gap that has long stymied mass-market self-custody. We posit that modular TEEs are the missing OS primitive for Web3, much as virtual memory unlocked multi-tasking in classical computing. Together, these contributions sketch a blueprint for multi-chain asset management that is auditable, resilient, and poised for global deployment.

Paperid: 2989, https://arxiv.org/pdf/2506.17935.pdf

Abstract:
To address the privacy protection problem in cloud computing, privacy enhancement techniques such as the Paillier additive homomorphism algorithm are receiving widespread attention. Paillier algorithm allows addition and scalar multiplication operations in dencrypted state, which can effectively protect privacy. However, its computational efficiency is limited by complex modulo operations due to the ciphertext expansion followed by encryption. To accelerate its decryption operation, the Chinese Remainder Theorem (CRT) is often used to optimize these modulo operations, which lengthens the decryption computation chain in turn. To address this issue, we propose an eCRT-Paillier decryption algorithm that shortens the decryption computation chain by combining precomputed parameters and eliminating extra judgment operations introduced by Montgomery modular multiplications. These two improvements reduce 50% modular multiplications and 60% judgment operations in the postprocessing of the CRT-Paillier decryption algorithm. Based on these improvements, we propose a highly parallel full-pipeline architecture to eliminate stalls caused by multiplier reuse in traditional modular exponentiation operations. This architecture also adopts some optimizations such as simplifying modular exponentiation units by dividing the exponent into segments and parallelizing data flow by multi-core instantiation. Finally, a high-throughput and efficient Paillier accelerator named MESA was implemented on the Xilinx Virtex-7 FPGA for evaluation, which can complete a decryption using 2048-bit key within 0.577ms under 100 MHz clock frequency. Compared to prior works, MESA demonstrates a throughput improvement of 1.16 to 313.21 under identical conditions, also with enhancements in area efficiency for LUT, DSP, and FF of 3.32 to 117.55, 1.49 to 1.64, and 2.94 to 9.94, respectively.

Paperid: 2990, https://arxiv.org/pdf/2506.17911.pdf

Abstract:
Routing Protocol for Low-Power and Lossy Networks (RPL) is an energy-efficient routing solution for IPv6 over Low-Power Wireless Personal Area Networks (6LoWPAN), recommended for resource-constrained devices. While RPL offers significant benefits, its security vulnerabilities pose challenges, particularly due to unauthenticated control messages used to establish and maintain routing information. These messages are susceptible to manipulation, enabling malicious nodes to inject false routing data. A notable security concern is the Routing Table Falsification (RTF) attack, where attackers forge Destination Advertisement Object (DAO) messages to promote fake routes via a parent nodes routing table. Experimental results indicate that RTF attacks significantly reduce packet delivery ratio, increase end-to-end delay, and leverage power consumption. Currently, no effective countermeasures exist in the literature, reinforcing the need for a security solution to prevent network disruption and protect user applications. This paper introduces a Lightweight Security Solution against Routing Table Falsification Attack (LiSec-RTF), leveraging Physical Unclonable Functions (PUFs) to generate unique authentication codes, termed Licenses. LiSec-RTF mitigates RTF attack impact while considering the resource limitations of 6LoWPAN devices in both static and mobile scenarios. Our testbed experiments indicate that LiSec-RTF significantly improves network performance compared to standard RPL under RTF attacks, thereby ensuring reliable and efficient operation.

Paperid: 2991, https://arxiv.org/pdf/2506.17625.pdf

Abstract:
Private Information Retrieval (PIR) is a privacy-preserving primitive in cryptography. Significant endeavors have been made to address the variant of PIR concerning the malicious servers. Among those endeavors, list-decodable Byzantine robust PIR schemes may tolerate a majority of malicious responding servers that provide incorrect answers. In this paper, we propose two perfect list-decodable BRPIR schemes. Our schemes are the first ones that can simultaneously handle a majority of malicious responding servers, achieve a communication complexity of $o(n^{1/2})$ for a database of size n, and provide a nontrivial estimation on the list sizes. Compared with the existing solutions, our schemes attain lower communication complexity, higher byzantine tolerance, and smaller list size.

Paperid: 2992, https://arxiv.org/pdf/2506.17329.pdf

Abstract:
Healthcare 5.0 integrates Artificial Intelligence (AI), the Internet of Things (IoT), real-time monitoring, and human-centered design toward personalized medicine and predictive diagnostics. However, the increasing reliance on interconnected medical technologies exposes them to cyber threats. Meanwhile, current AI-driven cybersecurity models often neglect biomedical data, limiting their effectiveness and interpretability. This study addresses this gap by applying eXplainable AI (XAI) to a Healthcare 5.0 dataset that integrates network traffic and biomedical sensor data. Classification outputs indicate that XGBoost achieved 99% F1-score for benign and data alteration, and 81% for spoofing. Explainability findings reveal that network data play a dominant role in intrusion detection whereas biomedical features contributed to spoofing detection, with temperature reaching a Shapley values magnitude of 0.37.

Paperid: 2993, https://arxiv.org/pdf/2506.17309.pdf

Abstract:
Malware detection using machine learning requires feature extraction from binary files, as models cannot process raw binaries directly. A common approach involves using LIEF for raw feature extraction and the EMBER vectorizer to generate 2381-dimensional feature vectors. However, the high dimensionality of these features introduces significant computational challenges. This study addresses these challenges by applying two dimensionality reduction techniques: XGBoost-based feature selection and Principal Component Analysis (PCA). We evaluate three reduced feature dimensions (128, 256, and 384), which correspond to approximately 5.4%, 10.8%, and 16.1% of the original 2381 features, across four models-XGBoost, LightGBM, Extra Trees, and Random Forest-using a unified training, validation, and testing split formed from the EMBER-2018, ERMDS, and BODMAS datasets. This approach ensures generalization and avoids dataset bias. Experimental results show that LightGBM trained on the 384-dimensional feature set after XGBoost feature selection achieves the highest accuracy of 97.52% on the unified dataset, providing an optimal balance between computational efficiency and detection performance. The best model, trained in 61 minutes using 30 GB of RAM and 19.5 GB of disk space, generalizes effectively to completely unseen datasets, maintaining 95.31% accuracy on TRITIUM and 93.98% accuracy on INFERNO. These findings present a scalable, compute-efficient approach for malware detection without compromising accuracy.

Paperid: 2994, https://arxiv.org/pdf/2506.17047.pdf

Abstract:
Neural network model extraction has emerged in recent years as an important security concern, as adversaries attempt to recover a network's parameters via black-box queries. A key step in this process is signature extraction, which aims to recover the absolute values of the network's weights layer by layer. Prior work, notably by Carlini et al. (2020), introduced a technique inspired by differential cryptanalysis to extract neural network parameters. However, their method suffers from several limitations that restrict its applicability to networks with a few layers only. Later works focused on improving sign extraction, but largely relied on the assumption that signature extraction itself was feasible. In this work, we revisit and refine the signature extraction process by systematically identifying and addressing for the first time critical limitations of Carlini et al.'s signature extraction method. These limitations include rank deficiency and noise propagation from deeper layers. To overcome these challenges, we propose efficient algorithmic solutions for each of the identified issues, greatly improving the efficiency of signature extraction. Our approach permits the extraction of much deeper networks than was previously possible. We validate our method through extensive experiments on ReLU-based neural networks, demonstrating significant improvements in extraction depth and accuracy. For instance, our extracted network matches the target network on at least 95% of the input space for each of the eight layers of a neural network trained on the CIFAR-10 dataset, while previous works could barely extract the first three layers. Our results represent a crucial step toward practical attacks on larger and more complex neural network architectures.

Paperid: 2995, https://arxiv.org/pdf/2506.17012.pdf

Abstract:
As data-driven technologies advance swiftly, maintaining strong privacy measures becomes progressively difficult. Conventional $(Îµ, Î´)$-differential privacy, while prevalent, exhibits limited adaptability for many applications. To mitigate these constraints, we present alpha differential privacy (ADP), an innovative privacy framework grounded in alpha divergence, which provides a more flexible assessment of privacy consumption. This study delineates the theoretical underpinnings of ADP and contrasts its performance with competing privacy frameworks across many scenarios. Empirical assessments demonstrate that ADP offers enhanced privacy guarantees in small to moderate iteration contexts, particularly where severe privacy requirements are necessary. The suggested method markedly improves privacy-preserving methods, providing a flexible solution for contemporary data analysis issues in a data-centric environment.

Paperid: 2996, https://arxiv.org/pdf/2506.16812.pdf

Abstract:
This paper introduces a new set of privacy-preserving mechanisms for verifying compliance with location-based policies for vehicle taxation, or for (electric) vehicle (EV) subsidies, using Zero-Knowledge Proofs (ZKPs). We present the design and evaluation of a Zero-Knowledge Proof-of-Location (ZK-PoL) system that ensures a vehicle's adherence to territorial driving requirements without disclosing specific location data, hence maintaining user privacy. Our findings suggest a promising approach to apply ZK-PoL protocols in large-scale governmental subsidy or taxation programs.

Paperid: 2997, https://arxiv.org/pdf/2506.16615.pdf

Abstract:
In a wireless sensor network, the virtual connectivity between nodes is a function of the keys shared between various nodes. Pre-embedding these key configurations in the nodes would make the network inflexible. On the other hand, permitting subsets of nodes to engage in a common key synthesis phase to create secure distributed connections amongst themselves, would decouple and conceal the information flow from the controlling centre. An intermediate solution is the notion of a centre driven key generation process through broadcast tokens, designed to extract different keys in different nodes based on some prior information stored at the nodes. As more tokens arrive, the virtual connectivity of the nodes are altered and the network evolves. This evolution can be distributed and can be controlled to converge to a certain specific connectivity profile. In this paper we present a framework and an algorithm which controls the simultaneous and distributed key release in different nodes, resulting in the creation of parallel virtual multicast groups. The design of the node shares and the supporting broadcast tokens have been discussed in conjunction with the process of balancing the spans of individual groups with spans of several coexistent multicast groups.

Paperid: 2998, https://arxiv.org/pdf/2506.16497.pdf

Abstract:
Face swapping manipulations in video streams represents an increasing threat in remote video communications, due to advances in automated and real-time tools. Recent literature proposes to characterize and exploit visual artifacts introduced in video frames by swapping algorithms when dealing with challenging physical scenes, such as face occlusions. This paper investigates the effectiveness of this approach by benchmarking CNN-based data-driven models on two data corpora (including a newly collected one) and analyzing generalization capabilities with respect to different acquisition sources and swapping algorithms. The results confirm excellent performance of general-purpose CNN architectures when operating within the same data source, but a significant difficulty in robustly characterizing occlusion-based visual cues across datasets. This highlights the need for specialized detection strategies to deal with such artifacts.

Paperid: 2999, https://arxiv.org/pdf/2506.16400.pdf

Abstract:
The proliferation of electric vehicles in recent years has significantly expanded the charging infrastructure while introducing new security risks to both vehicles and chargers. In this paper, we investigate the security of major charging protocols such as SAE J1772, CCS, IEC 61851, GB/T 20234, and NACS, uncovering new physical signal spoofing attacks in their authentication mechanisms. By inserting a compact malicious device into the charger connector, attackers can inject fraudulent signals to sabotage the charging process, leading to denial of service, vehicle-induced charger lockout, and damage to the chargers or the vehicle's charge management system. To demonstrate the feasibility of our attacks, we propose PORTulator, a proof-of-concept (PoC) attack hardware, including a charger gun plugin device for injecting physical signals and a wireless controller for remote manipulation. By evaluating PORTulator on multiple real-world chargers, we identify 7 charging standards used by 20 charger piles that are vulnerable to our attacks. The root cause is that chargers use simple physical signals for authentication and control, making them easily spoofed by attackers. To address this issue, we propose enhancing authentication circuits by integrating non-resistive memory components and utilizing dynamic high-frequency Pulse Width Modulation (PWM) signals to counter such physical signal spoofing attacks.

Paperid: 3000, https://arxiv.org/pdf/2506.15711.pdf

Abstract:
Federated learning (FL) has emerged as a transformative framework for privacy-preserving distributed training, allowing clients to collaboratively train a global model without sharing their local data. This is especially crucial in sensitive fields like healthcare, where protecting patient data is paramount. However, privacy leakage remains a critical challenge, as the communication of model updates can be exploited by potential adversaries. Gradient inversion attacks (GIAs), for instance, allow adversaries to approximate the gradients used for training and reconstruct training images, thus stealing patient privacy. Existing defense mechanisms obscure gradients, yet lack a nuanced understanding of which gradients or types of image information are most vulnerable to such attacks. These indiscriminate calibrated perturbations result in either excessive privacy protection degrading model accuracy, or insufficient one failing to safeguard sensitive information. Therefore, we introduce a framework that addresses these challenges by leveraging a shadow model with interpretability for identifying sensitive areas. This enables a more targeted and sample-specific noise injection. Specially, our defensive strategy achieves discrepancies of 3.73 in PSNR and 0.2 in SSIM compared to the circumstance without defense on the ChestXRay dataset, and 2.78 in PSNR and 0.166 in the EyePACS dataset. Moreover, it minimizes adverse effects on model performance, with less than 1\% F1 reduction compared to SOTA methods. Our extensive experiments, conducted across diverse types of medical images, validate the generalization of the proposed framework. The stable defense improvements for FedAvg are consistently over 1.5\% times in LPIPS and SSIM. It also offers a universal defense against various GIA types, especially for these sensitive areas in images.

Paperid: 3001, https://arxiv.org/pdf/2506.15388.pdf

Abstract:
With the rapid advancements in Natural Language Processing (NLP), large language models (LLMs) like GPT-4 have gained significant traction in diverse applications, including security vulnerability scanning. This paper investigates the efficacy of GPT-4 in identifying software vulnerabilities compared to traditional Static Application Security Testing (SAST) tools. Drawing from an array of security mistakes, our analysis underscores the potent capabilities of GPT-4 in LLM-enhanced vulnerability scanning. We unveiled that GPT-4 (Advanced Data Analysis) outperforms SAST by an accuracy of 94% in detecting 32 types of exploitable vulnerabilities. This study also addresses the potential security concerns surrounding LLMs, emphasising the imperative of security by design/default and other security best practices for AI.

Paperid: 3003, https://arxiv.org/pdf/2506.15117.pdf

Abstract:
In recent years, the widespread application of large language models has inspired us to consider using inference for communication encryption. We therefore propose CipherMind, which utilizes intermediate results from deterministic fine-tuning of large model inferences as transmission content. The semantic parameters of large models exhibit characteristics like opaque underlying implementations and weak interpretability, thus enabling their use as an encryption method for data transmission. This communication paradigm can be applied in scenarios like intra-gateway transmission, and theoretically, it can be implemented using any large model as its foundation.

Paperid: 3004, https://arxiv.org/pdf/2506.14539.pdf

Abstract:
Since the advent of large language models, prompt engineering now enables the rapid, low-effort creation of diverse autonomous agents that are already in widespread use. Yet this convenience raises urgent concerns about the safety, robustness, and behavioral consistency of the underlying prompts, along with the pressing challenge of preventing those prompts from being exposed to user's attempts. In this paper, we propose the ''Doppelganger method'' to demonstrate the risk of an agent being hijacked, thereby exposing system instructions and internal information. Next, we define the ''Prompt Alignment Collapse under Adversarial Transfer (PACAT)'' level to evaluate the vulnerability to this adversarial transfer attack. We also propose a ''Caution for Adversarial Transfer (CAT)'' prompt to counter the Doppelganger method. The experimental results demonstrate that the Doppelganger method can compromise the agent's consistency and expose its internal information. In contrast, CAT prompts enable effective defense against this adversarial attack.

Paperid: 3005, https://arxiv.org/pdf/2506.13746.pdf

Abstract:
Phishing attacks remain one of the most prevalent and persistent cybersecurity threat with attackers continuously evolving and intensifying tactics to evade the general detection system. Despite significant advances in artificial intelligence and machine learning, faithfully reproducing the interpretable reasoning with classification and explainability that underpin phishing judgments remains challenging. Due to recent advancement in Natural Language Processing, Large Language Models (LLMs) show a promising direction and potential for improving domain specific phishing classification tasks. However, enhancing the reliability and robustness of classification models requires not only accurate predictions from LLMs but also consistent and trustworthy explanations aligning with those predictions. Therefore, a key question remains: can LLMs not only classify phishing emails accurately but also generate explanations that are reliably aligned with their predictions and internally self-consistent? To answer these questions, we have fine-tuned transformer based models, including BERT, Llama models, and Wizard, to improve domain relevance and make them more tailored to phishing specific distinctions, using Binary Sequence Classification, Contrastive Learning (CL) and Direct Preference Optimization (DPO). To that end, we examined their performance in phishing classification and explainability by applying the ConsistenCy measure based on SHAPley values (CC SHAP), which measures prediction explanation token alignment to test the model's internal faithfulness and consistency and uncover the rationale behind its predictions and reasoning. Overall, our findings show that Llama models exhibit stronger prediction explanation token alignment with higher CC SHAP scores despite lacking reliable decision making accuracy, whereas Wizard achieves better prediction accuracy but lower CC SHAP scores.

Paperid: 3006, https://arxiv.org/pdf/2506.13612.pdf

Abstract:
Despite federated learning (FL)'s potential in collaborative learning, its performance has deteriorated due to the data heterogeneity of distributed users. Recently, clustered federated learning (CFL) has emerged to address this challenge by partitioning users into clusters according to their similarity. However, CFL faces difficulties in training when users are unwilling to share their cluster identities due to privacy concerns. To address these issues, we present an innovative Efficient and Robust Secure Aggregation scheme for CFL, dubbed EBS-CFL. The proposed EBS-CFL supports effectively training CFL while maintaining users' cluster identity confidentially. Moreover, it detects potential poisonous attacks without compromising individual client gradients by discarding negatively correlated gradients and aggregating positively correlated ones using a weighted approach. The server also authenticates correct gradient encoding by clients. EBS-CFL has high efficiency with client-side overhead O(ml + m^2) for communication and O(m^2l) for computation, where m is the number of cluster identities, and l is the gradient size. When m = 1, EBS-CFL's computational efficiency of client is at least O(log n) times better than comparison schemes, where n is the number of clients.In addition, we validate the scheme through extensive experiments. Finally, we theoretically prove the scheme's security.

Paperid: 3007, https://arxiv.org/pdf/2506.13563.pdf

Abstract:
Website Fingerprinting (WF) is an effective tool for regulating and governing the dark web. However, its performance can be significantly degraded by backdoor poisoning attacks in practical deployments. This paper aims to address the problem of hidden backdoor poisoning attacks faced by Website Fingerprinting attack, and designs a feasible mothed that integrates unlearning technology to realize detection of automatic poisoned points and complete removal of its destructive effects, requiring only a small number of known poisoned test points. Taking Tor onion routing as an example, our method evaluates the influence value of each training sample on these known poisoned test points as the basis for judgment. We optimize the use of influence scores to identify poisoned samples within the training dataset. Furthermore, by quantifying the difference between the contribution of model parameters on the taining data and the clean data, the target parameters are dynamically adjusted to eliminate the impact of the backdoor attacks. Experiments on public datasets under the assumptions of closed-world (CW) and open-world (OW) verify the effectiveness of the proposed method. In complex scenes containing both clean website fingerprinting features and backdoor triggers, the accuracy of the model on the poisoned dataset and the test dataset is stable at about 80%, significantly outperforming the traditional WF attack models. In addition, the proposed method achieves a 2-3 times speedup in runtime efficiency compared to baseline methods. By incorporating machine unlearning, we realize a WF attack model that exhibits enhanced resistance to backdoor poisoning and faster execution speeds in adversarial settings.

Paperid: 3008, https://arxiv.org/pdf/2506.13418.pdf

Abstract:
Quantum computing is transforming the world profoundly, affecting businesses, organisations, technologies, and human beings' information systems, and will have a profound impact on accounting and finance, particularly in the realm of cybersecurity. It presents both opportunities and risks in ensuring confidentiality and protecting financial data. The purpose of this article is to show the application of quantum technologies in accounting cybersecurity, utilising quantum algorithms and QKD to overcome the limitations of classical computing. The literature review reveals the vulnerabilities of the current accounting cybersecurity to quantum attacks and the need for quantum-resistant cryptographic mechanisms. It elaborates on the risks associated with conventional encryption in the context of quantum capabilities. This study contributes to the understanding of how quantum computing can transform accounting cybersecurity by enhancing quantum-resistant algorithms and using QKD in accounting. The study employs PSALSAR systematic review methodology to ensure rigour and depth. The analysis shows that quantum computing enhances encryption techniques to superior possibilities than classical ones. Using quantum technologies in accounting minimises data breaches and unauthorised access. The study concludes that quantum-resistant algorithms and quantum key distribution (QKD) are necessary for securing the accounting and finance systems of the future. Keywords Quantum Computing, Cybersecurity, Accounting, Machine Learning, Artificial Intelligence, Quantum Key Distribution, Operations Management

Paperid: 3010, https://arxiv.org/pdf/2506.12070.pdf

Abstract:
With the advent of 5G, mobile networks are becoming more dynamic and will therefore present a wider attack surface. To secure these new systems, we propose a multi-domain anomaly detection method that is distinguished by the study of traffic correlation on three dimensions: temporal by analyzing message sequences, semantic by abstracting the parameters these messages contain, and topological by linking them in the form of a graph. Unlike traditional approaches, which are limited to considering these domains independently, our method studies their correlations to obtain a global, coherent and explainable view of anomalies.

Paperid: 3011, https://arxiv.org/pdf/2506.11458.pdf

Abstract:
Machine-learning systems continue to advance at a rapid pace, demonstrating remarkable utility in various fields and disciplines. As these systems continue to grow in size and complexity, a nascent industry is emerging which aims to bring machine-learning-as-a-service (MLaaS) to market. Outsourcing the operation and training of these systems to powerful hardware carries numerous advantages, but challenges arise when privacy and the correctness of work carried out must be ensured. Recent advancements in the field of zero-knowledge cryptography have led to a means of generating arguments of integrity for any computation, which in turn can be efficiently verified by any party, in any place, at any time. In this work we prove the correct training of a differentially-private (DP) linear regression over a dataset of 50,000 samples on a single machine in less than 6 minutes, verifying the entire computation in 0.17 seconds. To our knowledge, this result represents the fastest known instance in the literature of provable-DP over a dataset of this size. We believe this result constitutes a key stepping-stone towards end-to-end private MLaaS.

Paperid: 3012, https://arxiv.org/pdf/2506.10721.pdf

Abstract:
The paper presents an analysis of Commitment Schemes (CSs) used in Multi-Party Computation (MPC) protocols. While the individual properties of CSs and the guarantees offered by MPC have been widely studied in isolation, their interrelation in concrete protocols and applications remains mostly underexplored. This paper presents the relation between the two, with an emphasis on (security) properties and their impact on the upper layer MPC. In particular, we investigate how different types of CSs contribute to various MPC constructions and their relation to real-life applications of MPC. The paper can also serve as a tutorial for understanding the cryptographic interplay between CS and MPC, making it accessible to both researchers and practitioners. Our findings emphasize the importance of carefully selecting CS to meet the adversarial and functional requirements of MPC, thereby aiming for more robust and privacy-preserving cryptographic applications

Paperid: 3013, https://arxiv.org/pdf/2506.10645.pdf

Abstract:
Indicators of Compromise (IOCs) such as IP addresses, file hashes, and domain names are commonly used for threat detection and attribution. However, IOCs tend to be short-lived as they are easy to change. As a result, the cybersecurity community is shifting focus towards more persistent behavioral profiles, such as the Tactics, Techniques, and Procedures (TTPs) and the software used by a threat group. However, the distinctiveness and completeness of such behavioral profiles remain largely unexplored. In this work, we systematically analyze threat group profiles built from two open cyber threat intelligence (CTI) knowledge bases: MITRE ATT&CK and Malpedia. We first investigate what fraction of threat groups have group-specific behaviors, i.e., behaviors used exclusively by a single group. We find that only 34% of threat groups in ATT&CK have group-specific techniques. The software used by a threat group proves to be more distinctive, with 73% of ATT&CK groups using group-specific software. However, this percentage drops to 24% in the broader Malpedia dataset. Next, we evaluate how group profiles improve when data from both sources are combined. While coverage improves modestly, the proportion of groups with group-specific behaviors remains under 30%. We then enhance profiles by adding exploited vulnerabilities and additional techniques extracted from more threat reports. Despite the additional information, 64% of groups still lack any group-specific behavior. Our findings raise concerns on the belief that behavioral profiles can replace IOCs in threat group attribution.

Paperid: 3014, https://arxiv.org/pdf/2506.10502.pdf

Abstract:
We present a novel attack specifically designed against Tree-Ring, a watermarking technique for diffusion models known for its high imperceptibility and robustness against removal attacks. Unlike previous removal attacks, which rely on strong assumptions about attacker capabilities, our attack only requires access to the variational autoencoder that was used to train the target diffusion model, a component that is often publicly available. By leveraging this variational autoencoder, the attacker can approximate the model's intermediate latent space, enabling more effective surrogate-based attacks. Our evaluation shows that this approach leads to a dramatic reduction in the AUC of Tree-Ring detector's ROC and PR curves, decreasing from 0.993 to 0.153 and from 0.994 to 0.385, respectively, while maintaining high image quality. Notably, our attacks outperform existing methods that assume full access to the diffusion model. These findings highlight the risk of reusing public autoencoders to train diffusion models -- a threat not considered by current industry practices. Furthermore, the results suggest that the Tree-Ring detector's precision, a metric that has been overlooked by previous evaluations, falls short of the requirements for real-world deployment.

Paperid: 3015, https://arxiv.org/pdf/2506.10323.pdf

Abstract:
Generation-based fuzzing produces appropriate test cases according to specifications of input grammars and semantic constraints to test systems and software. However, these specifications require significant manual effort to construct. This paper proposes a new approach, ELFuzz (Evolution Through Large Language Models for Fuzzing), that automatically synthesizes generation-based fuzzers tailored to a system under test (SUT) via LLM-driven synthesis over fuzzer space. At a high level, it starts with minimal seed fuzzers and propels the synthesis by fully automated LLM-driven evolution with coverage guidance. Compared to previous approaches, ELFuzz can 1) seamlessly scale to SUTs of real-world sizes -- up to 1,791,104 lines of code in our evaluation -- and 2) synthesize efficient fuzzers that catch interesting grammatical structures and semantic constraints in a human-understandable way. Our evaluation compared ELFuzz with specifications manually written by domain experts and synthesized by state-of-the-art approaches. It shows that ELFuzz achieves up to 434.8% more coverage over the second best and triggers up to 216.7% more artificially injected bugs, compared to the state-of-the-art. We also used ELFuzz to conduct a real-world fuzzing campaign on the newest version of cvc5 for 14 days, and encouragingly, it found five 0-day bugs (three are exploitable). Moreover, we conducted an ablation study, which shows that the fuzzer space model, the key component of ELFuzz, contributes the most (up to 62.5%) to the effectiveness of ELFuzz. Further analysis of the fuzzers synthesized by ELFuzz confirms that they catch interesting grammatical structures and semantic constraints in a human-understandable way. The results present the promising potential of ELFuzz for more automated, efficient, and extensible input generation for fuzzing.

Paperid: 3016, https://arxiv.org/pdf/2506.10236.pdf

Abstract:
In this work, we demonstrate that certain machine unlearning methods may fail under straightforward prompt attacks. We systematically evaluate eight unlearning techniques across three model families using output-based, logit-based, and probe analysis to assess the extent to which supposedly unlearned knowledge can be retrieved. While methods like RMU and TAR exhibit robust unlearning, ELM remains vulnerable to specific prompt attacks (e.g., prepending Hindi filler text to the original prompt recovers 57.3% accuracy). Our logit analysis further indicates that unlearned models are unlikely to hide knowledge through changes in answer formatting, given the strong correlation between output and logit accuracy. These findings challenge prevailing assumptions about unlearning effectiveness and highlight the need for evaluation frameworks that can reliably distinguish between genuine knowledge removal and superficial output suppression. To facilitate further research, we publicly release our evaluation framework to easily evaluate prompting techniques to retrieve unlearned knowledge.

Paperid: 3017, https://arxiv.org/pdf/2506.10147.pdf

Abstract:
In this paper, we introduce the Kirchhoff-Law-Johnson-Noise (KLJN) as an approach to securing satellite communications. KLJN has the potential to revolutionize satellite communication security through its combination of simplicity, cost-effectiveness, and resilience with unconditional security. Unlike quantum key distribution (QKD), which requires complex, fragile, and expensive infrastructure like photon detectors and dedicated optical links, KLJN operates using standard electronic components and wires, significantly reducing implementation costs and logistical hurdles. KLJN's security, grounded in the fundamental laws of classical physics, is impervious to environmental and radiation-induced noise, making it highly reliable in the harsh conditions of satellite communications. This robustness, coupled with its ability to integrate seamlessly with existing infrastructure, positions KLJN as a revolutionary alternative to quantum solutions for ensuring secure, resilient satellite communications. The authors explore the value of achieving unconditionally secure communications in strategic ground-to-satellite networks which address vulnerabilities posed by advanced computational threats, including quantum computing. Our team has examined two leading approaches to unconditional security - the KLJN scheme and QKD - and analyzed the potential use of each for space systems. While QKD leverages quantum mechanics for security, it faces challenges related to cost, complexity, and environmental sensitivity. In contrast, the KLJN scheme utilizes classical physics principles to provide a simpler, more cost-effective, and resilient alternative, particularly for ground-based systems. The study concludes that KLJN offers significant advantages in simplicity, cost-efficiency, and robustness, making it a practical choice for many secure communication applications.

Paperid: 3018, https://arxiv.org/pdf/2506.10028.pdf

Abstract:
Cloud computing has made storing and accessing data easier but keeping it secure is a big challenge nowadays. Traditional methods of ensuring data may not be strong enough in the future when powerful quantum computers become available. To solve this problem, this study uses quantum cryptography to protect data in the cloud environment. Quantum Key Distribution (QKD) creates secure keys by sending information using quantum particles like photons. Specifically, we use the BB84 protocol, a simple and reliable way to make secure keys that cannot be stolen without detection. To protect the data, we use the Quantum One Time pad (QOTP) for encryption and decryption, ensuring the data stays completely private. This study shows how these Quantum methods can be applied in cloud systems to provide a strong defense against hackers, even if they have access to quantum computers. The combination of QKD, BB84, and QOTP creates a safe and reliable way to keep data secure when it is stored or shared in the cloud. Using quantum cryptography, this paper provides a way to ensure data security now and in the future, making cloud computing safer for everyone to store their data securely and safely.

Paperid: 3019, https://arxiv.org/pdf/2506.09950.pdf

Abstract:
The multistep solving strategy consists in a divide-and-conquer approach: when a multivariate polynomial system is computationally infeasible to solve directly, one variable is assigned over the elements of the base finite field, and the procedure is recursively applied to the resulting simplified systems. In a previous work by the same authors (among others), this approach proved effective in the algebraic cryptanalysis of the Trivium cipher. In this paper, we present a new implementation of the corresponding algorithm based on a Depth-First Search strategy, along with a novel complexity analysis leveraging tree structures. We further introduce the notion of an "oracle function" as a general predictive tool for deciding whether the evaluation of a new variable is necessary to simplify the current polynomial system. This notion allows us to unify all previously proposed variants of the multistep strategy, including the classical hybrid approach, by appropriately selecting the oracle function. Finally, we apply the multistep solving strategy to the cryptanalysis of the low-latency block cipher Aradi, recently introduced by the NSA. We present the first full round algebraic attack, raising concerns about the cipher's actual security with respect to its key length.

Paperid: 3020, https://arxiv.org/pdf/2506.09719.pdf

Abstract:
We report on an ethnographic study with members of the climate movement in the United Kingdom (UK). We conducted participant observation and interviews at protests and in various activist settings. Reporting on the findings as they relate to information security, we show that members of the UK climate movement wrestled with (i) a fundamental tension between openness and secrecy; (ii) tensions between autonomy and collective interdependence in information-security decision-making; (iii) conflicting activist ideals that shape security discourses; and (iv) pressures from different social gazes -- from each other, from people outside the movement and from their adversaries. Overall, our findings shed light on the social complexities of information-security research in activist settings and provoke methodological questions about programmes that aim to design for activists.

Paperid: 3021, https://arxiv.org/pdf/2506.09569.pdf

Abstract:
Modern mix networks improve over Tor and provide stronger privacy guarantees by robustly obfuscating metadata. As long as a message is routed through at least one honest mixnode, the privacy of the users involved is safeguarded. However, the complexity of the mixing mechanisms makes it difficult to estimate the cumulative privacy erosion occurring over time. This work uses a generative model trained on mixnet traffic to estimate the loss of privacy when users communicate persistently over a period of time. We train our large-language model from scratch on our specialized network traffic ``language'' and then use it to measure the sender-message unlinkability in various settings (e.g. mixing strategies, security parameters, observation window). Our findings reveal notable differences in privacy levels among mix strategies, even when they have similar mean latencies. In comparison, we demonstrate the limitations of traditional privacy metrics, such as entropy and log-likelihood, in fully capturing an adversary's potential to synthesize information from multiple observations. Finally, we show that larger models exhibit greater sample efficiency and superior capabilities implying that further advancements in transformers will consequently enhance the accuracy of model-based privacy estimates.

Paperid: 3023, https://arxiv.org/pdf/2506.08336.pdf

Abstract:
Despite their growing adoption across domains, large language model (LLM)-powered agents face significant security risks from backdoor attacks during training and fine-tuning. These compromised agents can subsequently be manipulated to execute malicious operations when presented with specific triggers in their inputs or environments. To address this pressing risk, we present ReAgent, a novel defense against a range of backdoor attacks on LLM-based agents. Intuitively, backdoor attacks often result in inconsistencies among the user's instruction, the agent's planning, and its execution. Drawing on this insight, ReAgent employs a two-level approach to detect potential backdoors. At the execution level, ReAgent verifies consistency between the agent's thoughts and actions; at the planning level, ReAgent leverages the agent's capability to reconstruct the instruction based on its thought trajectory, checking for consistency between the reconstructed instruction and the user's instruction. Extensive evaluation demonstrates ReAgent's effectiveness against various backdoor attacks across tasks. For instance, ReAgent reduces the attack success rate by up to 90\% in database operation tasks, outperforming existing defenses by large margins. This work reveals the potential of utilizing compromised agents themselves to mitigate backdoor risks.

Paperid: 3024, https://arxiv.org/pdf/2506.08330.pdf

Abstract:
Search engines have vast technical capabilities to retain Internet search logs for each user and thus present major privacy vulnerabilities to both individuals and organizations in revealing user intent. Additionally, many of the web search privacy enhancing tools available today require that the user trusts a third party, which make confidentiality of user intent even more challenging. The user is left at the mercy of the third party without the control over his or her own privacy. In this article, we suggest a user-centric heuristic, Distortion Search, a web search query privacy methodology that works by the formation of obfuscated search queries via the permutation of query keyword categories, and by strategically applying k-anonymised web navigational clicks on URLs and Ads to generate a distorted user profile and thus providing specific user intent and query confidentiality. We provide empirical results via the evaluation of distorted web search queries in terms of retrieved search results and the resulting web ads from search engines. Preliminary experimental results indicate that web search query and specific user intent privacy might be achievable from the user side without the involvement of the search engine or other third parties.

Paperid: 3025, https://arxiv.org/pdf/2506.07974.pdf

Abstract:
The explosive growth of Non-Fungible Tokens (NFTs) has revolutionized digital ownership by enabling the creation, exchange, and monetization of unique assets on blockchain networks. However, this surge in popularity has also given rise to a disturbing trend: the emergence of rug pulls - fraudulent schemes where developers exploit trust and smart contract privileges to drain user funds or invalidate asset ownership. Central to many of these scams are hidden backdoors embedded within NFT smart contracts. Unlike unintentional bugs, these backdoors are deliberately coded and often obfuscated to bypass traditional audits and exploit investor confidence. In this paper, we present a large-scale static analysis of 49,940 verified NFT smart contracts using Slither, a static analysis framework, to uncover latent vulnerabilities commonly linked to rug pulls. We introduce a custom risk scoring model that classifies contracts into high, medium, or low risk tiers based on the presence and severity of rug pull indicators. Our dataset was derived from verified contracts on the Ethereum mainnet, and we generate multiple visualizations to highlight red flag clusters, issue prevalence, and co-occurrence of critical vulnerabilities. While we do not perform live exploits, our results reveal how malicious patterns often missed by simple reviews can be surfaced through static analysis at scale. We conclude by offering mitigation strategies for developers, marketplaces, and auditors to enhance smart contract security. By exposing how hidden backdoors manifest in real-world smart contracts, this work contributes a practical foundation for detecting and mitigating NFT rug pulls through scalable automated analysis.

Paperid: 3026, https://arxiv.org/pdf/2506.07728.pdf

Abstract:
The npm (Node Package Manager) ecosystem is the most important package manager for JavaScript development with millions of users. Consequently, a plethora of earlier work investigated how vulnerability reporting, patch propagation, and in general detection as well as resolution of security issues in such ecosystems can be facilitated. However, understanding the ground reality of security-related issue reporting by users (and bots) in npm-along with the associated challenges has been relatively less explored at scale. In this work, we bridge this gap by collecting 10,907,467 issues reported across GitHub repositories of 45,466 diverse npm packages. We found that the tags associated with these issues indicate the existence of only 0.13% security-related issues. However, our approach of manual analysis followed by developing high accuracy machine learning models identify 1,617,738 security-related issues which are not tagged as security-related (14.8% of all issues) as well as 4,461,934 comments made on these issues. We found that the bots which are in wide use today might not be sufficient for either detecting or offering assistance. Furthermore, our analysis of user-developer interaction data hints that many user-reported security issues might not be addressed by developers-they are not tagged as security-related issues and might be closed without valid justification. Consequently, a correlation analysis hints that the developers quickly handle security issues with known solutions (e.g., corresponding to CVE). However, security issues without such known solutions (even with reproducible code) might not be resolved. Our findings offer actionable insights for improving security management in open-source ecosystems, highlighting the need for smarter tools and better collaboration. The data and code for this work is available at https://doi.org/10.5281/zenodo.15614029

Paperid: 3027, https://arxiv.org/pdf/2506.07404.pdf

Abstract:
Steganography is an information hiding technique for covert communication. The core issue in steganography design is the rate-distortion coding problem. Polar codes, which have been proven to achieve the rate-distortion bound for any binary symmetric source, are utilized to design a steganographic scheme that can reach the embedding capacity for the Distortion-Limited Sender problem in certain cases. In adaptive steganography, for attack scenarios where each noise element can have different intensities, existing steganographic coding methods fail to resist such attacks. In this paper, we propose a pixel-sensitive and robust steganographic scheme based on polar codes. Our steganographic scheme not only matches the adaptive distortion well but is also robust against sophisticated noise attacks. Futher, it is proven that our scheme achieves the embedding capacity in certain cases. Experimentally, a steganographic scheme can be designed and implemented with a secret message error rate at the $10^{-5}$ level when the attack noise is known to both the sender and the receiver. This demonstrates its significant robustness.

Paperid: 3028, https://arxiv.org/pdf/2506.07330.pdf

Abstract:
We present JavelinGuard, a suite of low-cost, high-performance model architectures designed for detecting malicious intent in Large Language Model (LLM) interactions, optimized specifically for production deployment. Recent advances in transformer architectures, including compact BERT(Devlin et al. 2019) variants (e.g., ModernBERT (Warner et al. 2024)), allow us to build highly accurate classifiers with as few as approximately 400M parameters that achieve rapid inference speeds even on standard CPU hardware. We systematically explore five progressively sophisticated transformer-based architectures: Sharanga (baseline transformer classifier), Mahendra (enhanced attention-weighted pooling with deeper heads), Vaishnava and Ashwina (hybrid neural ensemble architectures), and Raudra (an advanced multi-task framework with specialized loss functions). Our models are rigorously benchmarked across nine diverse adversarial datasets, including popular sets like the NotInject series, BIPIA, Garak, ImprovedLLM, ToxicChat, WildGuard, and our newly introduced JavelinBench, specifically crafted to test generalization on challenging borderline and hard-negative cases. Additionally, we compare our architectures against leading open-source guardrail models as well as large decoder-only LLMs such as gpt-4o, demonstrating superior cost-performance trade-offs in terms of accuracy, and latency. Our findings reveal that while Raudra's multi-task design offers the most robust performance overall, each architecture presents unique trade-offs in speed, interpretability, and resource requirements, guiding practitioners in selecting the optimal balance of complexity and efficiency for real-world LLM security applications.

Paperid: 3029, https://arxiv.org/pdf/2506.07263.pdf

Abstract:
Modern out-of-order CPUs heavily rely on speculative execution for performance optimization, with branch prediction serving as a cornerstone to minimize stalls and maximize efficiency. Whenever shared branch prediction resources lack proper isolation and sanitization methods, they may originate security vulnerabilities that expose sensitive data across different software contexts. This paper examines the fundamental components of modern Branch Prediction Units (BPUs) and investigates how resource sharing and contention affect two widely implemented but underdocumented features: Bias-Free Branch Prediction and Branch History Speculation. Our analysis demonstrates that these BPU features, while designed to enhance speculative execution efficiency through more accurate branch histories, can also introduce significant security risks. We show that these features can inadvertently modify the Branch History Buffer (BHB) update behavior and create new primitives that trigger malicious mis-speculations. This discovery exposes previously unknown cross-privilege attack surfaces for Branch History Injection (BHI). Based on these findings, we present three novel attack primitives: two Spectre attacks, namely Spectre-BSE and Spectre-BHS, and a cross-privilege control flow side-channel attack called BiasScope. Our research identifies corresponding patterns of vulnerable control flows and demonstrates exploitation on multiple processors. Finally, Chimera is presented: an attack demonstrator based on eBPF for a variant of Spectre-BHS that is capable of leaking kernel memory contents at 24,628 bit/s.

Paperid: 3030, https://arxiv.org/pdf/2506.07200.pdf

Abstract:
Cache-timing attacks exploit microarchitectural characteristics to leak sensitive data, posing a severe threat to modern systems. Despite its severity, analyzing the vulnerability of a given cache structure against cache-timing attacks is challenging. To this end, a method based on Reinforcement Learning (RL) has been proposed to automatically explore vulnerabilities for a given cache structure. However, a naive RL-based approach suffers from inefficiencies due to the agent performing actions that do not contribute to the exploration. In this paper, we propose a method to identify these useless actions during training and penalize them so that the agent avoids them and the exploration efficiency is improved. Experiments on 17 cache structures show that our training mechanism reduces the number of useless actions by up to 43.08%. This resulted in the reduction of training time by 28\% in the base case and 4.84\% in the geomean compared to a naive RL-based approach.

Paperid: 3031, https://arxiv.org/pdf/2506.07190.pdf

Abstract:
Inter-VM RowHammer is an attack that induces a bitflip beyond the boundaries of virtual machines (VMs) to compromise a VM from another, and some software-based techniques have been proposed to mitigate this attack. Evaluating these mitigation techniques requires to confirm that they actually mitigate inter-VM RowHammer in low overhead. A challenge in this evaluation process is that both the mitigation ability and the overhead depend on the underlying hardware whose DRAM address mappings are different from machine to machine. This makes comprehensive evaluation prohibitively costly or even implausible as no machine that has a specific DRAM address mapping might be available. To tackle this challenge, we propose a simulation-based framework to evaluate software-based inter-VM RowHammer mitigation techniques across configurable DRAM address mappings. We demonstrate how to reproduce existing mitigation techniques on our framework, and show that it can evaluate the mitigation abilities and performance overhead of them with configurable DRAM address mappings.

Paperid: 3032, https://arxiv.org/pdf/2506.07034.pdf

Abstract:
Arm Confidential Computing Architecture (CCA) currently isolates at the granularity of an entire Confidential Virtual Machine (CVM), leaving intra-VM bugs such as Heartbleed unmitigated. The state-of-the-art narrows this to the process level, yet still cannot stop attacks that pivot within the same process, and prior intra-enclave schemes are either too slow or incompatible with CVM-style isolation. We extend CCA with a three-tier zone model that spawns an unlimited number of lightweight isolation domains inside a single process, while shielding them from kernel-space adversaries. To block domain-switch abuse, we also add a fast user-level Code-Pointer Integrity (CPI) mechanism. We developed two prototypes: a functional version on Arm's official simulator to validate resistance against intra-process and kernel-space adversaries, and a performance variant on Arm development boards evaluated for session-key isolation within server applications, in-memory key-value protection, and non-volatile-memory data isolation. NanoZone incurs roughly a 20% performance overhead while retaining 95% throughput compared to the system without fine-grained isolation.

Paperid: 3033, https://arxiv.org/pdf/2506.06971.pdf

Abstract:
Large Language Models (LLMs) have achieved remarkable success in tasks requiring complex reasoning, such as code generation, mathematical problem solving, and algorithmic synthesis -- especially when aided by reasoning tokens and Chain-of-Thought prompting. Yet, a core question remains: do these models truly reason, or do they merely exploit shallow statistical patterns? In this paper, we introduce Chain-of-Code Collapse, where we systematically investigate the robustness of reasoning LLMs by introducing a suite of semantically faithful yet adversarially structured prompt perturbations. Our evaluation -- spanning 700 perturbed code generations derived from LeetCode-style problems -- applies transformations such as storytelling reframing, irrelevant constraint injection, example reordering, and numeric perturbation. We observe that while certain modifications severely degrade performance (with accuracy drops up to -42.1%), others surprisingly improve model accuracy by up to 35.3%, suggesting sensitivity not only to semantics but also to surface-level prompt dynamics. These findings expose the fragility and unpredictability of current reasoning systems, underscoring the need for more principles approaches to reasoning alignments and prompting robustness. We release our perturbation datasets and evaluation framework to promote further research in trustworthy and resilient LLM reasoning.

Paperid: 3034, https://arxiv.org/pdf/2506.06916.pdf

Abstract:
Rogue Base Station (RBS) attacks, particularly those exploiting downgrade vulnerabilities, remain a persistent threat as 5G Standalone (SA) deployments are still limited and User Equipment (UE) manufacturers continue to support legacy network connectivity. This work introduces ARGOS, a comprehensive O-RAN compliant Intrusion Detection System (IDS) deployed within the Near Real-Time RIC, designed to detect RBS downgrade attacks in real time, an area previously unexplored within the O-RAN context. The system enhances the 3GPP KPM Service Model to enable richer, UE-level telemetry and features a custom xApp that applies unsupervised Machine Learning models for anomaly detection. Distinctively, the updated KPM Service Model operates on cross-layer features extracted from Modem Layer 1 (ML1) logs and Measurement Reports collected directly from Commercial Off-The-Shelf (COTS) UEs. To evaluate system performance under realistic conditions, a dedicated testbed is implemented using Open5GS, srsRAN, and FlexRIC, and validated against an extensive real-world measurement dataset. Among the evaluated models, the Variational Autoencoder (VAE) achieves the best balance of detection performance and efficiency, reaching 99.5% Accuracy with only 0.6% False Positives and minimal system overhead.

Paperid: 3035, https://arxiv.org/pdf/2506.06691.pdf

Abstract:
In the age of IoT and mobile platforms, ensuring that content stay authentic whilst avoiding overburdening limited hardware is a key problem. This study introduces hybrid Fast Wavelet Transform & Additive Quantization index Modulation (FWT-AQIM) scheme, a lightweight watermarking approach that secures digital pictures on low-power, memory-constrained small scale devices to achieve a balanced trade-off among robustness, imperceptibility, and computational efficiency. The method embeds watermark in the luminance component of YCbCr color space using low-frequency FWT sub-bands, minimizing perceptual distortion, using additive QIM for simplicity. Both the extraction and embedding processes run in less than 40 ms and require minimum RAM when tested on a Raspberry Pi 5. Quality assessments on standard and high-resolution images yield PSNR greater than equal to 34 dB and SSIM greater than equal to 0.97, while robustness verification includes various geometric and signal-processing attacks demonstrating near-zero bit error rates and NCC greater than equal to 0.998. Using a mosaic-based watermark, redundancy added enhancing robustness without reducing throughput, which peaks at 11 MP/s. These findings show that FWT-AQIM provides an efficient, scalable solution for real-time, secure watermarking in bandwidth- and power-constrained contexts, opening the way for dependable content protection in developing IoT and multimedia applications.

Paperid: 3036, https://arxiv.org/pdf/2506.06604.pdf

Abstract:
In this paper we present a study on using novel data types to perform cyber risk quantification by estimating the likelihood of a data breach. We demonstrate that it is feasible to build a highly accurate cyber risk assessment model using public and readily available technology signatures obtained from crawling an organization's website. This approach overcomes the limitations of previous similar approaches that relied on large-scale IP address based scanning data, which suffers from incomplete/missing IP address mappings as well as the lack of such data for large numbers of small and medium-sized organizations (SMEs). In comparison to scan data, technology digital signature data is more readily available for millions of SMEs. Our study shows that there is a strong relationship between these technology signatures and an organization's cybersecurity posture. In cross-validating our model using different cyber incident datasets, we also highlight the key differences between ransomware attack victims and the larger population of cyber incident and data breach victims.

Paperid: 3037, https://arxiv.org/pdf/2506.06572.pdf

Abstract:
Sensor systems are extremely popular today and vulnerable to sensor data attacks. Due to possible devastating consequences, counteracting sensor data attacks is an extremely important topic, which has not seen sufficient study. This paper develops the first methods that accurately identify/eliminate only the problematic attacked sensor data presented to a sequence estimation/regression algorithm under a powerful attack model constructed based on known/observed attacks. The approach does not assume a known form for the statistical model of the sensor data, allowing data-driven and machine learning sequence estimation/regression algorithms to be protected. A simple protection approach for attackers not endowed with knowledge of the details of our protection approach is first developed, followed by additional processing for attacks based on protection system knowledge. In the cases tested for which it was designed, experimental results show that the simple approach achieves performance indistinguishable, to two decimal places, from that for an approach which knows which sensors are attacked. For cases where the attacker has knowledge of the protection approach, experimental results indicate the additional processing can be configured so that the worst-case degradation under the additional processing and a large number of sensors attacked can be made significantly smaller than the worst-case degradation of the simple approach, and close to an approach which knows which sensors are attacked, for the same number of attacked sensors with just a slight degradation under no attacks. Mathematical descriptions of the worst-case attacks are used to demonstrate the additional processing will provide similar advantages for cases for which we do not have numerical results. All the data-driven processing used in our approaches employ only unattacked training data.

Paperid: 3038, https://arxiv.org/pdf/2506.06518.pdf

Abstract:
With the widespread availability of pretrained Large Language Models (LLMs) and their training datasets, concerns about the security risks associated with their usage has increased significantly. One of these security risks is the threat of LLM poisoning attacks where an attacker modifies some part of the LLM training process to cause the LLM to behave in a malicious way. As an emerging area of research, the current frameworks and terminology for LLM poisoning attacks are derived from earlier classification poisoning literature and are not fully equipped for generative LLM settings. We conduct a systematic review of published LLM poisoning attacks to clarify the security implications and address inconsistencies in terminology across the literature. We propose a comprehensive poisoning threat model applicable to categorize a wide range of LLM poisoning attacks. The poisoning threat model includes four poisoning attack specifications that define the logistics and manipulation strategies of an attack as well as six poisoning metrics used to measure key characteristics of an attack. Under our proposed framework, we organize our discussion of published LLM poisoning literature along four critical dimensions of LLM poisoning attacks: concept poisons, stealthy poisons, persistent poisons, and poisons for unique tasks, to better understand the current landscape of security risks.

Paperid: 3039, https://arxiv.org/pdf/2506.06062.pdf

Abstract:
Minoritised ethnic people are marginalised in society, and therefore at a higher risk of adverse online harms, including those arising from the loss of security and privacy of personal data. Despite this, there has been very little research focused on minoritised ethnic people's security and privacy concerns, attitudes, and behaviours. In this work, we provide the results of one of the first studies in this regard. We explore minoritised ethnic people's experiences of using essential online services across three sectors: health, social housing, and energy, their security and privacy-related concerns, and responses towards these services. We conducted a thematic analysis of 44 semi-structured interviews with people of various reported minoritised ethnicities in the UK. Privacy concerns and lack of control over personal data emerged as a major theme, with many interviewees considering privacy as their most significant concern when using online services. Several creative tactics to exercise some agency were reported, including selective and inconsistent disclosure of personal data. A core concern about how data may be used was driven by a fear of repercussions, including penalisation and discrimination, influenced by prior experiences of institutional and online racism. The increased concern and potential for harm resulted in minoritised ethnic people grappling with a higher-stakes dilemma of whether to disclose personal information online or not. Furthermore, trust in institutions, or lack thereof, was found to be embedded throughout as a basis for adapting behaviour. We draw on our results to provide lessons learned for the design of more inclusive, marginalisation-aware, and privacy-preserving online services.

Paperid: 3040, https://arxiv.org/pdf/2506.05932.pdf

Abstract:
Reentrancy is a well-known source of smart contract bugs on Ethereum, leading e.g. to double-spending vulnerabilities in DeFi applications. But less is known about this problem in other blockchains, which can have significantly different execution models. Sharded blockchains in particular generally use an asynchronous messaging model that differs substantially from the synchronous and transactional model of Ethereum. We study the features of this model and its effect on reentrancy bugs on three examples: the Internet Computer (ICP) blockchain, NEAR Protocol, and MultiversX. We argue that this model, while useful for improving performance, also makes it easier to introduce reentrancy bugs. For example, reviews of the pre-production versions of some of the most critical ICP smart contracts found that 66% (10/15) of the reviewed contracts -- written by expert authors -- contained reentrancy bugs of medium or high severity, with potential damages in tens of millions of dollars. We evaluate existing Ethereum programming techniques (in particular the effects-checks-interactions pattern, and locking) to prevent reentrancy bugs in the context of this new messaging model and identify some issues with them. We then present novel Rust and Motoko patterns that can be leveraged on ICP to solve these issues. Finally, we demonstrate that the formal verification tool TLA+ can be used to find and eliminate such bugs in real world smart contracts on sharded blockchains.

Paperid: 3041, https://arxiv.org/pdf/2506.05740.pdf

Abstract:
Fraudulent activities are rapidly evolving, employing increasingly diverse and sophisticated methods that pose serious threats to individuals, organizations, and society. This paper proposes the FIST Framework (Fraud Incident Structured Threat Framework), an innovative structured threat modeling methodology specifically designed for fraud scenarios. Inspired by MITRE ATT\&CK and DISARM, FIST systematically incorporates social engineering tactics, stage-based behavioral decomposition, and detailed attack technique mapping into a reusable knowledge base. FIST aims to enhance the efficiency of fraud detection and the standardization of threat intelligence sharing, promoting collaboration and a unified language across organizations and sectors. The framework integrates interdisciplinary insights from cybersecurity, criminology, and behavioral science, addressing both technical vectors and psychological manipulation mechanisms in fraud. This approach enables fine-grained analysis of fraud incidents, supporting automated detection, quantitative risk assessment, and standardized incident reporting. The effectiveness of the framework is further validated through real-world case studies, demonstrating its value in bridging academic research and practical applications, and laying the foundation for an intelligence-driven anti-fraud ecosystem. To the best of our knowledge, FIST is the first systematic, open-source fraud threat modeling framework that unifies both technical and psychological aspects, and is made freely available to foster collaboration between academia and industry.

Paperid: 3042, https://arxiv.org/pdf/2506.05711.pdf

Abstract:
This article describes a post-quantum multirecipient symmetric cryptosystem whose security is based on the hardness of the LWE problem. In this scheme a single sender encrypts multiple messages for multiple recipients generating a single ciphertext which is broadcast to the recipients. Each recipient decrypts the ciphertext with her secret key to recover the message intended for her. In this process, the recipient cannot efficiently extract any information about the other messages. This scheme is intended for messages like images and sound that can tolerate a small amount of noise. This article introduces the scheme and establishes its security based on the LWE problem. Further, an example is given to demonstrate the application of this scheme for encrypting multiple images.

Paperid: 3043, https://arxiv.org/pdf/2506.05611.pdf

Abstract:
Mobility traces represent a critical class of personal data, often subjected to privacy-preserving transformations before public release. In this study, we analyze the anonymized Yjmob100k dataset, which captures the trajectories of 100,000 users in Japan, and demonstrate how existing anonymization techniques fail to protect their sensitive attributes. We leverage population density patterns, structural correlations, and temporal activity profiles to re-identify the dataset's real-world location and timing. Our results reveal that the anonymization process carried out for Yjmob100k is inefficient and preserves enough spatial and temporal structure to enable re-identification. This work underscores the limitations of current trajectory anonymization methods and calls for more robust privacy mechanisms in the publication of mobility data.

Paperid: 3044, https://arxiv.org/pdf/2506.05601.pdf

Abstract:
A critical requirement for modern-day Intelligent Transportation Systems (ITS) is the ability to collect geo-referenced data from connected vehicles and mobile devices in a safe, secure and anonymous way. The Nexagon protocol, which builds on the IETF Locator/ID Separation Protocol (LISP) and the Hierarchical Hexagonal Clustering (H3) geo-spatial indexing system, offers a promising framework for dynamic, privacy-preserving data aggregation. Seeking to address the critical security and privacy vulnerabilities that persist in its current specification, we apply the STRIDE and LINDDUN threat modelling frameworks and prove among other that the Nexagon protocol is susceptible to user re-identification, session linkage, and sparse-region attacks. To address these challenges, we propose an enhanced security architecture that combines public key infrastructure (PKI) with ephemeral pseudonym certificates. Our solution guarantees user and device anonymity through randomized key rotation and adaptive geospatial resolution, thereby effectively mitigating re-identification and surveillance risks in sparse environments. A prototype implementation over a microservice-based overlay network validates the approach and underscores its readiness for real-world deployment. Our results show that it is possible to achieve the required level of security without increasing latency by more than 25% or reducing the throughput by more than 7%.

Paperid: 3045, https://arxiv.org/pdf/2506.05382.pdf

Abstract:
Deep learning systems, critical in domains like autonomous vehicles, are vulnerable to adversarial examples (crafted inputs designed to mislead classifiers). This study investigates black-box adversarial attacks in computer vision. This is a realistic scenario, where attackers have query-only access to the target model. Three properties are introduced to evaluate attack feasibility: robustness to compression, stealthiness to automatic detection, and stealthiness to human inspection. State-of-the-Art methods tend to prioritize one criterion at the expense of others. We propose ECLIPSE, a novel attack method employing Gaussian blurring on sampled gradients and a local surrogate model. Comprehensive experiments on a public dataset highlight ECLIPSE's advantages, demonstrating its contribution to the trade-off between the three properties.

Paperid: 3046, https://arxiv.org/pdf/2506.05381.pdf

Abstract:
Intelligent Reflecting Surfaces (IRS) enhance spectral efficiency by adjusting reflection phase shifts, while Non-Orthogonal Multiple Access (NOMA) increases system capacity. Consequently, IRS-assisted NOMA communications have garnered significant research interest. However, the passive nature of the IRS, lacking authentication and security protocols, makes these systems vulnerable to external eavesdropping due to the openness of electromagnetic signal propagation and reflection. NOMA's inherent multi-user signal superposition also introduces internal eavesdropping risks during user pairing. This paper investigates secure transmissions in IRS-assisted NOMA systems with heterogeneous resource configuration in wireless networks to mitigate both external and internal eavesdropping. To maximize the sum secrecy rate of legitimate users, we propose a combinatorial optimization graph neural network (CO-GNN) approach to jointly optimize beamforming at the base station, power allocation of NOMA users, and phase shifts of IRS for dynamic heterogeneous resource allocation, thereby enabling the design of dual-link or multi-link secure transmissions in the presence of eavesdroppers on the same or heterogeneous links. The CO-GNN algorithm simplifies the complex mathematical problem-solving process, eliminates the need for channel estimation, and enhances scalability. Simulation results demonstrate that the proposed algorithm significantly enhances the secure transmission performance of the system.

Paperid: 3047, https://arxiv.org/pdf/2506.05374.pdf

Abstract:
Boolean functions and binary sequences are main tools used in cryptography. In this work, we introduce a new bijection between the set of Boolean functions and the set of binary sequences with period a power of two. We establish a connection between them which allows us to study some properties of Boolean functions through binary sequences and vice versa. Then, we define a new representation of sequences, based on Boolean functions and derived from the algebraic normal form, named reverse-ANF. Next, we study the relation between such a representation and other representations of Boolean functions as well as between such a representation and the binary sequences. Finally, we analyse the generalized self-shrinking sequences in terms of Boolean functions and some of their properties using the different representations.

Paperid: 3048, https://arxiv.org/pdf/2506.04634.pdf

Abstract:
Decoy passwords, or "honeywords," alert a site to its breach if they are ever entered in a login attempt on that site. However, an attacker can identify a user-chosen password from among the decoys, without risk of alerting the site to its breach, by performing credential stuffing, i.e., entering the stolen passwords at another site where the same user reused her password. Prior work has thus proposed that sites monitor for the entry of their honeywords at other sites. Unfortunately, it is not clear what incentives sites have to participate in this monitoring. In this paper we propose and evaluate an algorithm by which sites can exchange monitoring favors. Through a model-checking analysis, we show that using our algorithm, a site improves its ability to detect its own breach when it increases the monitoring effort it expends for other sites. We additionally quantify the impacts of various parameters on detection effectiveness and their implications for the deployment of a system to support a monitoring ecosystem. Finally, we evaluate our algorithm on a real dataset of breached credentials and provide a performance analysis that confirms its scalability and practical viability.

Paperid: 3049, https://arxiv.org/pdf/2506.04105.pdf

Abstract:
We consider a network of users connected by pairwise quantum key distribution (QKD) links. Using these pairwise secret keys and public classical communication, the users want to generate a common (conference) secret key at the maximal rate. We propose an algorithm based on spanning tree packing (a known problem in graph theory) and prove its optimality. This algorithm enables optimal conference key generation in modern quantum networks of arbitrary topology. Additionally, we discuss how it can guide the optimal placement of new bipartite links in the network design.

Paperid: 3050, https://arxiv.org/pdf/2506.04036.pdf

Abstract:
Large language models (LLMs) demonstrate powerful information handling capabilities and are widely integrated into chatbot applications. OpenAI provides a platform for developers to construct custom GPTs, extending ChatGPT's functions and integrating external services. Since its release in November 2023, over 3 million custom GPTs have been created. However, such a vast ecosystem also conceals security and privacy threats. For developers, instruction leaking attacks threaten the intellectual property of instructions in custom GPTs through carefully crafted adversarial prompts. For users, unwanted data access behavior by custom GPTs or integrated third-party services raises significant privacy concerns. To systematically evaluate the scope of threats in real-world LLM applications, we develop three phases instruction leaking attacks target GPTs with different defense level. Our widespread experiments on 10,000 real-world custom GPTs reveal that over 98.8% of GPTs are vulnerable to instruction leaking attacks via one or more adversarial prompts, and half of the remaining GPTs can also be attacked through multiround conversations. We also developed a framework to assess the effectiveness of defensive strategies and identify unwanted behaviors in custom GPTs. Our findings show that 77.5% of custom GPTs with defense strategies are vulnerable to basic instruction leaking attacks. Additionally, we reveal that 738 custom GPTs collect user conversational information, and identified 8 GPTs exhibiting data access behaviors that are unnecessary for their intended functionalities. Our findings raise awareness among GPT developers about the importance of integrating specific defensive strategies in their instructions and highlight users' concerns about data privacy when using LLM-based applications.

Paperid: 3051, https://arxiv.org/pdf/2506.03551.pdf

Abstract:
In response to the escalating cyber threats, the efficiency of Cyber Threat Intelligence (CTI) data collection has become paramount in ensuring robust cybersecurity. However, existing works encounter significant challenges in preprocessing large volumes of multilingual threat data, leading to inefficiencies in real-time threat analysis. This paper presents a systematic review of current techniques aimed at enhancing CTI data collection efficiency. Additionally, it proposes a conceptual model to further advance the effectiveness of threat intelligence feeds. Following the PRISMA guidelines, the review examines relevant studies from the Scopus database, highlighting the critical role of artificial intelligence (AI) and machine learning models in optimizing CTI data preprocessing. The findings underscore the importance of AI-driven methods, particularly supervised and unsupervised learning, in significantly improving the accuracy of threat detection and event extraction, thereby strengthening cybersecurity. Furthermore, the study identifies a gap in the existing research and introduces XBC conceptual model integrating XLM-RoBERTa, BiGRU, and CRF, specifically developed to address this gap. This paper contributes conceptually to the field by providing a detailed analysis of current CTI data collection techniques and introducing an innovative conceptual model to enhance future threat intelligence capabilities.

Paperid: 3052, https://arxiv.org/pdf/2506.03507.pdf

Abstract:
Software Bill of Materials (SBOMs) are increasingly regarded as essential tools for securing software supply chains (SSCs), yet their real-world use and adoption barriers remain poorly understood. This systematic literature review synthesizes evidence from 40 peer-reviewed studies to evaluate how SBOMs are currently used to bolster SSC security. We identify five primary application areas: vulnerability management, transparency, component assessment, risk assessment, and SSC integrity. Despite clear promise, adoption is hindered by significant barriers: generation tooling, data privacy, format/standardization, sharing/distribution, cost/overhead, vulnerability exploitability, maintenance, analysis tooling, false positives, hidden packages, and tampering. To structure our analysis, we map these barriers to the ISO/IEC 25019:2023 Quality-in-Use model, revealing critical deficiencies in SBOM trustworthiness, usability, and suitability for security tasks. We also highlight key gaps in the literature. These include the absence of applying machine learning techniques to assess SBOMs and limited evaluation of SBOMs and SSCs using software quality assurance techniques. Our findings provide actionable insights for researchers, tool developers, and practitioners seeking to advance SBOM-driven SSC security and lay a foundation for future work at the intersection of SSC assurance, automation, and empirical software engineering.

Paperid: 3053, https://arxiv.org/pdf/2506.02942.pdf

Abstract:
High-quality real-world data (RWD) is essential for healthcare but must be transformed to comply with the General Data Protection Regulation (GDPR). GDPRs broad definitions of quasi-identifiers (QIDs) and sensitive attributes (SAs) complicate implementation. We aim to standardise RWD anonymisation for GDPR compliance while preserving data utility by introducing an algorithmic method to identify QIDs and SAs and evaluate utility in anonymised datasets. We conducted a systematic literature review via ProQuest and PubMed to inform a three-stage anonymisation pipeline: identification, de-identification, and quasi-identifier dimension evaluation. The pipeline was implemented, validated, and tested on two mock RWD datasets (500 and 1000 rows). Privacy was assessed using k-anonymity, l-diversity, and t-closeness; utility was measured by non-uniform entropy (NUE). The review yielded two studies on QID/SA identification and five on utility metrics. Applying the pipeline, attributes were classified by re-identification risk using alpha and beta thresholds (25 percent/1 percent for 500 rows; 10 percent/1 percent for 1000 rows). Privacy metrics improved k-anonymity from 1 to 4 (500 rows) and 1 to 110 (1000 rows). NUE scores were 69.26 percent and 69.05 percent, respectively, indicating consistent utility despite varying privacy gains. We present a GDPR-compliant anonymisation pipeline for healthcare RWD that provides a reproducible approach to QID/SA identification and utility evaluation; publicly available code promotes standardisation, data privacy, and open science.

Paperid: 3054, https://arxiv.org/pdf/2506.02063.pdf

Abstract:
Clinical free-text data offers immense potential to improve population health research such as richer phenotyping, symptom tracking, and contextual understanding of patient care. However, these data present significant privacy risks due to the presence of directly or indirectly identifying information embedded in unstructured narratives. While numerous de-identification tools have been developed, few have been tested on real-world, heterogeneous datasets at scale or assessed for governance readiness. In this paper, we synthesise our findings from previous studies examining the privacy-risk landscape across multiple document types and NHS data providers in Scotland. We characterise how direct and indirect identifiers vary by record type, clinical setting, and data flow, and show how changes in documentation practice can degrade model performance over time. Through public engagement, we explore societal expectations around the safe use of clinical free text and reflect these in the design of a prototype privacy-risk management tool to support transparent, auditable decision-making. Our findings highlight that privacy risk is context-dependent and cumulative, underscoring the need for adaptable, hybrid de-identification approaches that combine rule-based precision with contextual understanding. We offer a comprehensive view of the challenges and opportunities for safe, scalable reuse of clinical free-text within Trusted Research Environments and beyond, grounded in both technical evidence and public perspectives on responsible data use.

Paperid: 3055, https://arxiv.org/pdf/2506.02054.pdf

Abstract:
Quantum energy teleportation (QET) is a process that leverages quantum entanglement and local operations to transfer energy between two spatially separated locations without physically transporting particles or energy carriers. We construct a QET-based quantum key distribution (QKD) protocol and analyze its security and robustness to noise in both the classical and the quantum channels. We generalize the construction to an $N$-party information sharing protocol, possessing a feature that dishonest participants can be detected.

Paperid: 3056, https://arxiv.org/pdf/2506.02028.pdf

Abstract:
Quantum computers impose an immense threat to system security. As a countermeasure, new cryptographic classes have been created to prevent these attacks. Technologies such as post-quantum cryptography and quantum cryptography. Quantum cryptography uses the principle of quantum physics to produce theoretically unbreakable security. This tertiary review selected 51 secondary studies from the Scopus database and presented bibliometric analysis, a list of the main techniques used in the field, and existing open challenges and future directions in quantum cryptography research. The results showed a prevalence of QKD over other techniques among the selected papers and stated that the field still faces many problems related to implementation cost, error correction, decoherence, key rates, communication distance, and quantum hacking.

Paperid: 3057, https://arxiv.org/pdf/2506.01856.pdf

Abstract:
As search, social media, and artificial intelligence continue to reshape collective knowledge, the preservation of trust on the public infosphere has become a defining challenge of our time. Given the breadth and versatility of adversarial threats, the best--and perhaps only--defense is an equally broad and versatile infrastructure for digital identity. This document discusses the opportunities and implications of building such an infrastructure from the perspective of a national laboratory. The technical foundation for this discussion is the emergence of the Synchronic Web, a Sandia-developed infrastructure for asserting cryptographic provenance at Internet scale. As of the writing of this document, there is ongoing work to develop the underlying technology and apply it to multiple mission-specific domains within Sandia. The primary objective of this document to extend the body of existing work toward the more public-facing domain of digital identity. Our approach depends on a non-standard, but philosophically defensible notion of identity: digital identity is an unbroken sequence of states in a well-defined digital space. From this foundation, we abstractly describe the infrastructural foundations and applied configurations that we expect to underpin future notions of digital identity.

Paperid: 3058, https://arxiv.org/pdf/2506.01220.pdf

Abstract:
As the number of Common Vulnerabilities and Exposures (CVE) continues to grow exponentially, security teams face increasingly difficult decisions about prioritization. Current approaches using Common Vulnerability Scoring System (CVSS) scores produce overwhelming volumes of high-priority vulnerabilities, while Exploit Prediction Scoring System (EPSS) and Known Exploited Vulnerabilities (KEV) catalog offer valuable but incomplete perspectives on actual exploitation risk. We present Vulnerability Management Chaining, a decision tree framework that systematically integrates these three approaches to achieve efficient vulnerability prioritization. Our framework employs a two-stage evaluation process: first applying threat-based filtering using KEV membership or EPSS threshold $\geq$ 0.088), then applying vulnerability severity assessment using CVSS scores $\geq$ 7.0) to enable informed deprioritization. Experimental validation using 28,377 real-world vulnerabilities and vendor-reported exploitation data demonstrates 18-fold efficiency improvements while maintaining 85.6\% coverage. Organizations can reduce urgent remediation workload by approximately 95\%. The integration identifies 48 additional exploited vulnerabilities that neither KEV nor EPSS captures individually. Our framework uses exclusively open-source data, enabling immediate adoption regardless of organizational resources.

Paperid: 3059, https://arxiv.org/pdf/2506.00719.pdf

Abstract:
Web client fingerprinting has become a widely used technique for uniquely identifying users, browsers, operating systems, and devices with high accuracy. While it is beneficial for applications such as fraud detection and personalized experiences, it also raises privacy concerns by enabling persistent tracking and detailed user profiling. This paper introduces an advanced fingerprinting method using WebAssembly (Wasm) - a low-level programming language that offers near-native execution speed in modern web browsers. With broad support across major browsers and growing adoption, WebAssembly provides a strong foundation for developing more effective fingerprinting methods. In this work, we present a new approach that leverages WebAssembly's computational capabilities to identify returning devices-such as smartphones, tablets, laptops, and desktops across different browsing sessions. Our method uses subtle differences in the WebAssembly JavaScript API implementation to distinguish between Chromium-based browsers like Google Chrome and Microsoft Edge, even when identifiers such as the User-Agent are completely spoofed, achieving a false-positive rate of less than 1%. The fingerprint is generated using a combination of CPU-bound operations, memory tasks, and I/O activities to capture unique browser behaviors. We validate this approach on a variety of platforms, including Intel, AMD, and ARM CPUs, operating systems such as Windows, macOS, Android, and iOS, and in environments like VMWare, KVM, and VirtualBox. Extensive evaluation shows that WebAssembly-based fingerprinting significantly improves identification accuracy. We also propose mitigation strategies to reduce the privacy risks associated with this method, which could be integrated into future browser designs to better protect user privacy.

Paperid: 3060, https://arxiv.org/pdf/2506.00677.pdf

Abstract:
This study addresses critical challenges in managing the transportation of spent nuclear fuel, including inadequate data transparency, stringent confidentiality requirements, and a lack of trust among collaborating parties, issues prevalent in traditional centralized management systems. Given the high risks involved, balancing data confidentiality with regulatory transparency is imperative. To overcome these limitations, a prototype system integrating blockchain technology and the Internet of Things (IoT) is proposed, featuring a multi-tiered consortium chain architecture. This system utilizes IoT sensors for real-time data collection, which is immutably recorded on the blockchain, while a hierarchical data structure (operational, supervisory, and public layers) manages access for diverse stakeholders. The results demonstrate that this approach significantly enhances data immutability, enables real-time multi-sensor data integration, improves decentralized transparency, and increases resilience compared to traditional systems. Ultimately, this blockchain-IoT framework improves the safety, transparency, and efficiency of spent fuel transportation, effectively resolving the conflict between confidentiality and transparency in nuclear data management and offering significant practical implications.

Paperid: 3061, https://arxiv.org/pdf/2506.00282.pdf

Abstract:
In online auctions, fraudulent behaviors such as shill bidding pose significant risks. This paper presents a conceptual framework that applies dynamic, behavior-based penalties to deter auction fraud using blockchain smart contracts. Unlike traditional post-auction detection methods, this approach prevents manipulation in real-time by introducing an economic disincentive system where penalty severity scales with suspicious bidding patterns. The framework employs the proposed Bid Shill Score (BSS) to evaluate nine distinct bidding behaviors, dynamically adjusting the penalty fees to make fraudulent activity financially unaffordable while providing fair competition. The system is implemented within a decentralized English auction on the Ethereum blockchain, demonstrating how smart contracts enforce transparent auction rules without trusted intermediaries. Simulations confirm the effectiveness of the proposed model: the dynamic penalty mechanism reduces the profitability of shill bidding while keeping penalties low for honest bidders. Performance evaluation shows that the system introduces only moderate gas and latency overhead, keeping transaction costs and response times within practical bounds for real-world use. The approach provides a practical method for behaviour-based fraud prevention in decentralised systems where trust cannot be assumed.

Paperid: 3062, https://arxiv.org/pdf/2506.00191.pdf

Abstract:
Heterogeneous Graph Neural Networks (HGNNs) excel in modeling complex, multi-typed relationships across diverse domains, yet their vulnerability to backdoor attacks remains unexplored. To address this gap, we conduct the first investigation into the susceptibility of HGNNs to existing graph backdoor attacks, revealing three critical issues: (1) high attack budget required for effective backdoor injection, (2) inefficient and unreliable backdoor activation, and (3) inaccurate attack effectiveness evaluation. To tackle these issues, we propose the Heterogeneous Graph Backdoor Attack (HGBA), the first backdoor attack specifically designed for HGNNs, introducing a novel relation-based trigger mechanism that establishes specific connections between a strategically selected trigger node and poisoned nodes via the backdoor metapath. HGBA achieves efficient and stealthy backdoor injection with minimal structural modifications and supports easy backdoor activation through two flexible strategies: Self-Node Attack and Indiscriminate Attack. Additionally, we improve the ASR measurement protocol, enabling a more accurate assessment of attack effectiveness. Extensive experiments demonstrate that HGBA far surpasses multiple state-of-the-art graph backdoor attacks in black-box settings, efficiently attacking HGNNs with low attack budgets. Ablation studies show that the strength of HBGA benefits from our trigger node selection method and backdoor metapath selection strategy. In addition, HGBA shows superior robustness against node feature perturbations and multiple types of existing graph backdoor defense mechanisms. Finally, extension experiments demonstrate that the relation-based trigger mechanism can effectively extend to tasks in homogeneous graph scenarios, thereby posing severe threats to broader security-critical domains.

Paperid: 3063, https://arxiv.org/pdf/2505.24685.pdf

Abstract:
This paper explores the evolving dynamics of cybersecurity in the age of advanced AI, from the perspective of the introduced Human Layer Kill Chain framework. As traditional attack models like Lockheed Martin's Cyber Kill Chain become inadequate in addressing human vulnerabilities exploited by modern adversaries, the Humal Layer Kill Chain offers a nuanced approach that integrates human psychology and behaviour into the analysis of cyber threats. We detail the eight stages of the Human Layer Kill Chain, illustrating how AI-enabled techniques can enhance psychological manipulation in attacks. By merging the Human Layer with the Cyber Kill Chain, we propose a Sociotechnical Kill Plane that allows for a holistic examination of attackers' tactics, techniques, and procedures (TTPs) across the sociotechnical landscape. This framework not only aids cybersecurity professionals in understanding adversarial methods, but also empowers non-technical personnel to engage in threat identification and response. The implications for incident response and organizational resilience are significant, particularly as AI continues to shape the threat landscape.

Paperid: 3064, https://arxiv.org/pdf/2505.24289.pdf

Abstract:
Traditionally, threshold secret sharing (TSS) schemes assume all parties have equal weight, yet emerging systems like blockchains reveal disparities in party trustworthiness, such as stake or reputation. Weighted Secret Sharing (WSS) addresses this by assigning varying weights to parties, ensuring security even if adversaries control parties with total weight at most a threshold $t$. Current WSS schemes assume honest dealers, resulting in security from only honest-but-curious behaviour but not protection from malicious adversaries for downstream applications. \emph{Verifiable} secret sharing (VSS) is a well-known technique to address this, but existing VSS schemes are either tailored to TSS, or require additional trust assumptions. We propose the first efficient verifiable WSS scheme that tolerates malicious dealers and is compatible with the latest CRT-based WSS~\cite{crypto_w_weights}. Our solution uses Bulletproofs for efficient verification and introduces new privacy-preserving techniques for proving relations between committed values, which may be of independent interest. Evaluation on Ethereum show up to a $100\times$ improvement in communication complexity compared to the current design and $20\times$ improvement compared to unweighted VSS schemes.

Paperid: 3065, https://arxiv.org/pdf/2505.24284.pdf

Abstract:
This paper introduces a fraud-deterrent access validation system for public blockchains, leveraging two complementary concepts: "Transaction Proximity", which measures the distance between wallets in the transaction graph, and "Easily Attainable Identities (EAIs)", wallets with direct transaction connections to centralized exchanges. Recognizing the limitations of traditional approaches like blocklisting (reactive, slow) and strict allow listing (privacy-invasive, adoption barriers), we propose a system that analyzes transaction patterns to identify wallets with close connections to centralized exchanges. Our directed graph analysis of the Ethereum blockchain reveals that 56% of large USDC wallets (with a lifetime maximum balance greater than \$10,000) are EAI and 88% are within one transaction hop of an EAI. For transactions exceeding \$2,000, 91% involve at least one EAI. Crucially, an analysis of past exploits shows that 83% of the known exploiter addresses are not EAIs, with 21% being more than five hops away from any regulated exchange. We present three implementation approaches with varying gas cost and privacy tradeoffs, demonstrating that EAI-based access control can potentially prevent most of these incidents while preserving blockchain openness. Importantly, our approach does not restrict access or share personally identifiable information, but it provides information for protocols to implement their own validation or risk scoring systems based on specific needs. This middle-ground solution enables programmatic compliance while maintaining the core values of open blockchain.

Paperid: 3066, https://arxiv.org/pdf/2505.23771.pdf

Abstract:
Advanced Encryption Standard (AES) is one of the most widely used symmetric cipher for the confidentiality of data. Also it is used for other security services, viz. integrity, authentication and key establishment. However, recently, authors have shown some weakness in the generation of sub-keys in AES, e.g. bit leakage attack, etc. Also, AES sub-keys are generated sequentially, which is an overhead, especially for resource-constrained devices. Therefore, we propose and investigate a novel encryption AESHA3, which uses sub-keys generated by Secure Hash Algorithm-3 (SHA3). The output of SHA3 is one-way and highly non-linear, and random. The experimental analysis shows that the average time taken for generating the sub-keys to be used for encrypting the data using our approach i.e. AESHA3 is ~1300 times faster than the sub-key generated by the standard AES. Accordingly, we find that AESHA3 will be very relevant not only in terms of security but also it will save the resources in IoT devices. We investigated AESHA3 in Intel Core i7, 6th Generation processor and Raspberry Pi 4B and found that up to two MB data encryption is very significant, and lesser the data size, more the resource saving compared to AES.

Paperid: 3067, https://arxiv.org/pdf/2505.23357.pdf

Abstract:
The paper proposes a method to secure the Compressive Sensing (CS) streams. It consists in protecting part of the measurements by a secret key and inserting the code into the rest. The secret key is generated via a cryptographically secure pseudo-random number generator (CSPRNG) and XORed with the measurements to be inserted. For insertion, we use a reversible data hiding (RDH) scheme, which is a prediction error expansion algorithm, modified to match the statistics of CS measurements. The reconstruction from the embedded stream conducts to visibly distorted images. The image distortion is controlled by the number of embedded levels. In our tests, the embedding on 10 levels results in $\approx 18 dB $ distortion for images of 256x256 pixels reconstructed with the Fast Iterative Shrinkage-Thresholding Algorithm (FISTA). A particularity of the presented method is on-the-fly insertion that makes it appropriate for the sequential acquisition of measurements by a Single Pixel Camera. On-the-fly insertion avoids the buffering of CS measurements for a subsequent standard encryption and generation of a thumbnail image.

Paperid: 3068, https://arxiv.org/pdf/2505.22989.pdf

Abstract:
Modern blockchain applications are often constrained by a trade-off between user experience and trust. Chainless Apps present a new paradigm of application architecture that separates execution, trust, bridging, and settlement into distinct compostable layers. This enables app-specific sequencing, verifiable off-chain computation, chain-agnostic asset and message routing via Agglayer, and finality on Ethereum - resulting in fast Web2-like UX with Web3-grade verifiability. Although consensus mechanisms have historically underpinned verifiable computation, the advent of zkVMs and decentralized validation services opens up new trust models for developers. Chainless Apps leverage this evolution to offer modular, scalable applications that maintain interoperability with the broader blockchain ecosystem while allowing domain-specific trade-offs.

Paperid: 3069, https://arxiv.org/pdf/2505.22852.pdf

Abstract:
CaMeL (Capabilities for Machine Learning) introduces a capability-based sandbox to mitigate prompt injection attacks in large language model (LLM) agents. While effective, CaMeL assumes a trusted user prompt, omits side-channel concerns, and incurs performance tradeoffs due to its dual-LLM design. This response identifies these issues and proposes engineering improvements to expand CaMeL's threat coverage and operational usability. We introduce: (1) prompt screening for initial inputs, (2) output auditing to detect instruction leakage, (3) a tiered-risk access model to balance usability and control, and (4) a verified intermediate language for formal guarantees. Together, these upgrades align CaMeL with best practices in enterprise security and support scalable deployment.

Paperid: 3070, https://arxiv.org/pdf/2505.22843.pdf

Abstract:
The performance figures of modern drift-adaptive malware classifiers appear promising, but does this translate to genuine operational reliability? The standard evaluation paradigm primarily focuses on baseline performance metrics, neglecting confidence-error alignment and operational stability. While TESSERACT established the importance of temporal evaluation, we take a complementary direction by investigating whether malware classifiers maintain reliable and stable confidence estimates under distribution shifts and exploring the tensions between scientific advancement and practical impacts when they do not. We propose AURORA, a framework to evaluate malware classifiers based on their confidence quality and operational resilience. AURORA subjects the confidence profile of a given model to verification to assess the reliability of its estimates. Unreliable confidence estimates erode operational trust, waste valuable annotation budget on non-informative samples for active learning, and leave error-prone instances undetected in selective classification. AURORA is complemented by a set of metrics designed to go beyond point-in-time performance, striving towards a more holistic assessment of operational stability throughout temporal evaluation periods. The fragility in SOTA frameworks across datasets of varying drift suggests the need for a return to the whiteboard.

Paperid: 3071, https://arxiv.org/pdf/2505.22531.pdf

Abstract:
Open-ended learning (OEL) -- which emphasizes training agents that achieve broad capability over narrow competency -- is emerging as a paradigm to develop artificial intelligence (AI) agents to achieve robustness and generalization. However, despite promising results that demonstrate the benefits of OEL, applying OEL to develop autonomous agents for real-world cybersecurity applications remains a challenge. We propose a training approach, inspired by OEL, to develop autonomous network defenders. Our results demonstrate that like in other domains, OEL principles can translate into more robust and generalizable agents for cyber defense. To apply OEL to network defense, it is necessary to address several technical challenges. Most importantly, it is critical to provide a task representation approach over a broad universe of tasks that maintains a consistent interface over goals, rewards and action spaces. This way, the learning agent can train with varying network conditions, attacker behaviors, and defender goals while being able to build on previously gained knowledge. With our tools and results, we aim to fundamentally impact research that applies AI to solve cybersecurity problems. Specifically, as researchers develop gyms and benchmarks for cyber defense, it is paramount that they consider diverse tasks with consistent representations, such as those we propose in our work.

Paperid: 3072, https://arxiv.org/pdf/2505.22023.pdf

Abstract:
Software systems have grown as an indispensable commodity used across various industries, and almost all essential services depend on them for effective operation. The software is no longer an independent or stand-alone piece of code written by a developer but rather a collection of packages designed by multiple developers across the globe. Ensuring the reliability and resilience of these systems is crucial since emerging threats target software supply chains, as demonstrated by the widespread SolarWinds hack in late 2020. These supply chains extend beyond patches and updates, involving distribution networks throughout the software lifecycle. Industries like smart grids, manufacturing, healthcare, and finance rely on interconnected software systems and their dependencies for effective functioning. To secure software modules and add-ons, robust distribution architectures are essential. The proposed chapter enhances the existing delivery frameworks by including a permissioned ledger with Proof of Authority consensus and multi-party signatures. The proposed system aims to prevent attacks while permitting every stakeholder to verify the same. Critical systems can interface with the secure pipeline without disrupting existing functionalities, thus preventing the cascading effect of an attack at any point in the supply chain.

Paperid: 3073, https://arxiv.org/pdf/2505.21442.pdf

Abstract:
One-way functions (OWFs) form the foundation of modern cryptography, yet their unconditional existence remains a major open question. In this work, we study this question by exploring its relation to lossy reductions, i.e., reductions $R$ for which it holds that $I(X;R(X)) \ll n$ for all distributions $X$ over inputs of size $n$. Our main result is that either OWFs exist or any lossy reduction for any promise problem $Î $ runs in time $2^{Î©(\logÏ_Î / \log\log n)}$, where $Ï_Î (n)$ is the infimum of the runtime of all (worst-case) solvers of $Î $ on instances of size $n$. In fact, our result requires a milder condition, that $R$ is lossy for sparse uniform distributions (which we call mild-lossiness). It also extends to $f$-reductions as long as $f$ is a non-constant permutation-invariant Boolean function, which includes And-, Or-, Maj-, Parity-, Modulo$_k$, and Threshold$_k$-reductions. Additionally, we show that worst-case to average-case Karp reductions and randomized encodings are special cases of mildly-lossy reductions and improve the runtime above as $2^{Î©(\log Ï_Î )}$ when these mappings are considered. Restricting to weak fine-grained OWFs, this runtime can be further improved as $Î©(Ï_Î )$. Taking $Î $ as $kSAT$, our results provide sufficient conditions under which (fine-grained) OWFs exist assuming the Exponential Time Hypothesis (ETH). Conversely, if (fine-grained) OWFs do not exist, we obtain impossibilities on instance compressions (Harnik and Naor, FOCS 2006) and instance randomizations of $kSAT$ under the ETH. Finally, we partially extend these findings to the quantum setting; the existence of a pure quantum mildly-lossy reduction for $Î $ within the runtime $2^{o(\logÏ_Î / \log\log n)}$ implies the existence of one-way state generators.

Paperid: 3074, https://arxiv.org/pdf/2505.21021.pdf

Abstract:
While law enforcements agencies and cybercrime researchers are working hard, fake E-commerce scam is still a big threat to Internet users. One of the major techniques to victimize users is luring them by black-hat search-engine-optimization (SEO); making search engines display their lure pages as if these were placed on compromised websites and then redirecting visitors to malicious sites. In this study, we focus on the threat actors conduct fake E-commerce scam with this strategy. Our previous study looked at the connection between some malware families used for black-hat SEO to enlighten threat actors and their infrastructures, however it shows only a limited part of the whole picture because we could not find all SEO malware samples from limited sources. In this paper, we aim to identify and analyze threat actor groups using a large dataset of fake E-commerce sites collected by Japan Cybercrime Control Center, which we believe is of higher quality. It includes 692,865 fake EC sites gathered from redirectors over two and a half years, from May 20, 2022 to Dec. 31, 2024. We analyzed the links between these sites using Maltego, a well-known link analysis tool, and tailored programs. We also conducted time series analysis to track group changes in the groups. According to the analysis, we estimate that 17 relatively large groups were active during the dataset period and some of them were active throughout the period.

Paperid: 3075, https://arxiv.org/pdf/2505.20186.pdf

Abstract:
Vulnerabilities in open-source software can cause cascading effects in the modern digital ecosystem. It is especially worrying if these vulnerabilities repeat across many projects, as once the adversaries find one of them, they can scale up the attack very easily. Unfortunately, since developers frequently reuse code from their own or external code resources, some nearly identical vulnerabilities exist across many open-source projects. We conducted a study to examine the prevalence of a particular vulnerable code pattern that enables path traversal attacks (CWE-22) across open-source GitHub projects. To handle this study at the GitHub scale, we developed an automated pipeline that scans GitHub for the targeted vulnerable pattern, confirms the vulnerability by first running a static analysis and then exploiting the vulnerability in the context of the studied project, assesses its impact by calculating the CVSS score, generates a patch using GPT-4, and reports the vulnerability to the maintainers. Using our pipeline, we identified 1,756 vulnerable open-source projects, some of which are very influential. For many of the affected projects, the vulnerability is critical (CVSS score higher than 9.0), as it can be exploited remotely without any privileges and critically impact the confidentiality and availability of the system. We have responsibly disclosed the vulnerability to the maintainers, and 14\% of the reported vulnerabilities have been remediated. We also investigated the root causes of the vulnerable code pattern and assessed the side effects of the large number of copies of this vulnerable pattern that seem to have poisoned several popular LLMs. Our study highlights the urgent need to help secure the open-source ecosystem by leveraging scalable automated vulnerability management solutions and raising awareness among developers.

Paperid: 3076, https://arxiv.org/pdf/2505.19887.pdf

Abstract:
Large language models (LLMs) have shown promise in software engineering, yet their effectiveness for binary analysis remains unexplored. We present the first comprehensive evaluation of commercial LLMs for assembly code deobfuscation. Testing seven state-of-the-art models against four obfuscation scenarios (bogus control flow, instruction substitution, control flow flattening, and their combination), we found striking performance variations--from autonomous deobfuscation to complete failure. We propose a theoretical framework based on four dimensions: Reasoning Depth, Pattern Recognition, Noise Filtering, and Context Integration, explaining these variations. Our analysis identifies five error patterns: predicate misinterpretation, structural mapping errors, control flow misinterpretation, arithmetic transformation errors, and constant propagation errors, revealing fundamental limitations in LLM code processing.We establish a three-tier resistance model: bogus control flow (low resistance), control flow flattening (moderate resistance), and instruction substitution/combined techniques (high resistance). Universal failure against combined techniques demonstrates that sophisticated obfuscation remains effective against advanced LLMs. Our findings suggest a human-AI collaboration paradigm where LLMs reduce expertise barriers for certain reverse engineering tasks while requiring human guidance for complex deobfuscation. This work provides a foundation for evaluating emerging capabilities and developing resistant obfuscation techniques.x deobfuscation. This work provides a foundation for evaluating emerging capabilities and developing resistant obfuscation techniques.

Paperid: 3077, https://arxiv.org/pdf/2505.19283.pdf

Abstract:
IoT is a dynamic network of interconnected things that communicate and exchange data, where security is a significant issue. Previous studies have mainly focused on attack classifications and open issues rather than presenting a comprehensive overview on the existing threats and vulnerabilities. This knowledge helps analyzing the network in the early stages even before any attack takes place. In this paper, the researchers have proposed different security aspects and a novel Bayesian Security Aspects Dependency Graph for IoT (BSAGIoT) to illustrate their relations. The proposed BSAGIoT is a generic model applicable to any IoT network and contains aspects from five categories named data, access control, standard, network, and loss. This proposed Bayesian Security Aspect Graph (BSAG) presents an overview of the security aspects in any given IoT network. The purpose of BSAGIoT is to assist security experts in analyzing how a successful compromise and/or a failed breach could impact the overall security and privacy of the respective IoT network. In addition, root cause identification of security challenges, how they affect one another, their impact on IoT networks via topological sorting, and risk assessment could be achieved. Hence, to demonstrate the feasibility of the proposed method, experimental results with various scenarios has been presented, in which the security aspects have been quantified based on the network configurations. The results indicate the impact of the aspects on each other and how they could be utilized to mitigate and/or eliminate the security and privacy deficiencies in IoT networks.

Paperid: 3078, https://arxiv.org/pdf/2505.19006.pdf

Abstract:
Decentralized applications are often composed of multiple interconnected smart contracts. This is especially evident in DeFi, where protocols are heavily intertwined and rely on a variety of basic building blocks such as tokens, decentralized exchanges and lending protocols. A crucial security challenge in this setting arises when adversaries target individual components to cause systemic economic losses. Existing security notions focus on determining the existence of these attacks, but fail to quantify the effect of manipulating individual components on the overall economic security of the system. In this paper, we introduce a quantitative security notion that measures how an attack on a single component can amplify economic losses of the overall system. We study the fundamental properties of this notion and apply it to assess the security of key compositions. In particular, we analyse under-collateralized loan attacks in systems made of lending protocols and decentralized exchanges.

Paperid: 3079, https://arxiv.org/pdf/2505.18870.pdf

Abstract:
With constant threats to the safety of personal data in the United States, privacy literacy has become an increasingly important competency among university students, one that ties intimately to the information sharing behavior of these students. This survey based study examines how university students in the United States perceive personal data privacy and how their privacy literacy influences their understanding and behaviors. Students responses to a privacy literacy scale were categorized into high and low privacy literacy groups, revealing that high literacy individuals demonstrate a broader range of privacy practices, including multi factor authentication, VPN usage, and phishing awareness, whereas low literacy individuals rely on more basic security measures. Statistical analyses suggest that high literacy respondents display greater diversity in recommendations and engagement in privacy discussions. These findings suggest the need for enhanced educational initiatives to improve data privacy awareness at the university level to create a better cyber safe population.

Paperid: 3080, https://arxiv.org/pdf/2505.18806.pdf

Abstract:
Machine learning (ML) has been developed to detect malware in recent years. Most researchers focused their efforts on improving the detection performance but ignored the robustness of the ML models. In addition, many machine learning algorithms are very vulnerable to intentional attacks. To solve these problems, adversarial malware examples are generated by GANs to enhance the robustness of the malware detector. However, since current GAN models suffer from limitations such as unstable training and weak adversarial examples, we propose the Mal-D2GAN model to address these problems. Specifically, the Mal-D2GAN architecture was designed with double-detector and a least square loss function and tested on a dataset of 20,000 samples. The results show that the Mal-D2GAN model reduced the detection accuracy (true positive rate) in 8 malware detectors. The performance was then compared with that of the existing MalGAN and Mal- LSGAN models.

Paperid: 3081, https://arxiv.org/pdf/2505.18627.pdf

Abstract:
Anonymization is a foundational principle of data privacy regulation, yet its practical application remains riddled with ambiguity and inconsistency. This paper introduces the concept of anonymity-washing -- the misrepresentation of the anonymity level of ``sanitized'' personal data -- as a critical privacy concern. While both legal and technical critiques of anonymization exist, they tend to address isolated aspects of the problem. In contrast, this paper offers a comprehensive overview of the conditions that enable anonymity-washing. It synthesizes fragmented legal interpretations, technical misunderstandings, and outdated regulatory guidance and complements them with a systematic review of national and international resources, including legal cases, data protection authority guidelines, and technical documentation. Our findings reveal a lack of coherent support for practitioners, contributing to the persistent misuse of pseudonymization and obsolete anonymization techniques. We conclude by recommending targeted education, clearer technical guidance, and closer cooperation between regulators, researchers, and industry to bridge the gap between legal norms and technical reality.

Paperid: 3082, https://arxiv.org/pdf/2505.18520.pdf

Abstract:
It is well known that anti-malware scanners depend on malware signatures to identify malware. However, even minor modifications to malware code structure results in a change in the malware signature thus enabling the variant to evade detection by scanners. Therefore, there exists the need for a proactively generated malware variant dataset to aid detection of such diverse variants by automated antivirus scanners. This paper proposes and demonstrates a generic assembly source code based framework that facilitates any evolutionary algorithm to generate diverse and potential variants of an input malware, while retaining its maliciousness, yet capable of evading antivirus scanners. Generic code transformation functions and a novelty search supported quality metric have been proposed as components of the framework to be used respectively as variation operators and fitness function, for evolutionary algorithms. The results demonstrate the effectiveness of the framework in generating diverse variants and the generated variants have been shown to evade over 98% of popular antivirus scanners. The malware variants evolved by the framework can serve as antigens to assist malware analysis engines to improve their malware detection algorithms.

Paperid: 3083, https://arxiv.org/pdf/2505.18323.pdf

Abstract:
For nearly a decade the academic community has investigated backdoors in neural networks, primarily focusing on classification tasks where adversaries manipulate the model prediction. While demonstrably malicious, the immediate real-world impact of such prediction-altering attacks has remained unclear. In this paper we introduce a novel and significantly more potent class of backdoors that builds upon recent advancements in architectural backdoors. We demonstrate how these backdoors can be specifically engineered to exploit batched inference, a common technique for hardware utilization, enabling large-scale user data manipulation and theft. By targeting the batching process, these architectural backdoors facilitate information leakage between concurrent user requests and allow attackers to fully control model responses directed at other users within the same batch. In other words, an attacker who can change the model architecture can set and steal model inputs and outputs of other users within the same batch. We show that such attacks are not only feasible but also alarmingly effective, can be readily injected into prevalent model architectures, and represent a truly malicious threat to user privacy and system integrity. Critically, to counteract this new class of vulnerabilities, we propose a deterministic mitigation strategy that provides formal guarantees against this new attack vector, unlike prior work that relied on Large Language Models to find the backdoors. Our mitigation strategy employs a novel Information Flow Control mechanism that analyzes the model graph and proves non-interference between different user inputs within the same batch. Using our mitigation strategy we perform a large scale analysis of models hosted through Hugging Face and find over 200 models that introduce (unintended) information leakage between batch entries due to the use of dynamic quantization.

Paperid: 3084, https://arxiv.org/pdf/2505.18242.pdf

Abstract:
In-home elderly monitoring requires systems that can detect emergency events - such as falls or prolonged inactivity - while preserving privacy and requiring no user input. These systems must be embedded into the surrounding environment, capable of capturing activity, and responding promptly. This paper presents a low-cost, privacy-preserving solution using Passive Infrared (PIR) and Light Detection and Ranging (LiDAR) sensors to track entries, sitting, exits, and emergency scenarios within a home bathroom setting. We developed and evaluated a rule-based detection system through five real-world experiments simulating elderly behavior. Annotated time-series graphs demonstrate the system's ability to detect dangerous states, such as motionless collapses, while maintaining privacy through non-visual sensing.

Paperid: 3085, https://arxiv.org/pdf/2505.17623.pdf

Abstract:
Verifiable computing (VC) has gained prominence in decentralized machine learning systems, where resource-intensive tasks like deep neural network (DNN) inference are offloaded to external participants due to blockchain limitations. This creates a need to verify the correctness of outsourced computations without re-execution. We propose \texttt{Range-Arithmetic}, a novel framework for efficient and verifiable DNN inference that transforms non-arithmetic operations, such as rounding after fixed-point matrix multiplication and ReLU, into arithmetic steps verifiable using sum-check protocols and concatenated range proofs. Our approach avoids the complexity of Boolean encoding, high-degree polynomials, and large lookup tables while remaining compatible with finite-field-based proof systems. Experimental results show that our method not only matches the performance of existing approaches, but also reduces the computational cost of verifying the results, the computational effort required from the untrusted party performing the DNN inference, and the communication overhead between the two sides.

Paperid: 3086, https://arxiv.org/pdf/2505.17586.pdf

Abstract:
The Internet of Things (IoT) and Large Language Models (LLMs) have been two major emerging players in the information technology era. Although there has been significant coverage of their individual capabilities, our literature survey sheds some light on the integration and interaction of LLMs and IoT devices - a mutualistic relationship in which both parties leverage the capabilities of the other. LLMs like OpenAI's ChatGPT, Anthropic's Claude, Google's Gemini/BERT, any many more, all demonstrate powerful capabilities in natural language understanding and generation, enabling more intuitive and context-aware interactions across diverse IoT applications such as smart cities, healthcare systems, industrial automation, and smart home environments. Despite these opportunities, integrating these resource-intensive LLMs into IoT devices that lack the state-of-the-art computational power is a challenging task. The security of these edge devices is another major concern as they can easily act as a backdoor to private networks if the LLM integration is sloppy and unsecured. This literature survey systematically explores the current state-of-the-art in applying LLMs within IoT, emphasizing their applications in various domains/sectors of society, the significant role they play in enhancing IoT security through anomaly detection and threat mitigation, and strategies for effective deployment using edge computing frameworks. Finally, this survey highlights existing challenges, identifies future research directions, and underscores the need for cross-disciplinary collaboration to fully realize the transformative potential of integrating LLMs and IoT.

Paperid: 3087, https://arxiv.org/pdf/2505.17253.pdf

Abstract:
The modern semiconductor industry requires memory solutions that can keep pace with the high-speed demands of high-performance computing. Embedded non-volatile memories (eNVMs) address these requirements by offering faster access to stored data at an improved computational throughput and efficiency. Furthermore, these technologies offer numerous appealing features, including limited area-energy-runtime budget and data retention capabilities. Among these, the data retention feature of eNVMs has garnered particular interest within the semiconductor community. Although this property allows eNVMs to retain data even in the absence of a continuous power supply, it also introduces some vulnerabilities, prompting security concerns. These concerns have sparked increased interest in examining the broader security implications associated with eNVM technologies. This paper examines the security aspects of eNVMs by discussing the reasons for vulnerabilities in specific memories from an architectural point of view. Additionally, this paper extensively reviews eNVM-based security primitives, such as physically unclonable functions and true random number generators, as well as techniques like logic obfuscation. The paper also explores a broad spectrum of security threats to eNVMs, including physical attacks such as side-channel attacks, fault injection, and probing, as well as logical threats like information leakage, denial-of-service, and thermal attacks. Finally, the paper presents a study of publication trends in the eNVM domain since the early 2000s, reflecting the rising momentum and research activity in this field.

Paperid: 3088, https://arxiv.org/pdf/2505.17226.pdf

Abstract:
Federated Learning (FL) enables collaborative machine learning across decentralized data sources without sharing raw data. It offers a promising approach to privacy-preserving AI. However, FL remains vulnerable to adversarial threats from malicious participants, referred to as Byzantine clients, who can send misleading updates to corrupt the global model. Traditional aggregation methods, such as simple averaging, are not robust to such attacks. More resilient approaches, like the Krum algorithm, require prior knowledge of the number of malicious clients, which is often unavailable in real-world scenarios. To address these limitations, we propose Average-rKrum (ArKrum), a novel aggregation strategy designed to enhance both the resilience and privacy guarantees of FL systems. Building on our previous work (rKrum), ArKrum introduces two key innovations. First, it includes a median-based filtering mechanism that removes extreme outliers before estimating the number of adversarial clients. Second, it applies a multi-update averaging scheme to improve stability and performance, particularly when client data distributions are not identical. We evaluate ArKrum on benchmark image and text datasets under three widely studied Byzantine attack types. Results show that ArKrum consistently achieves high accuracy and stability. It performs as well as or better than other robust aggregation methods. These findings demonstrate that ArKrum is an effective and practical solution for secure FL systems in adversarial environments.

Paperid: 3089, https://arxiv.org/pdf/2505.17160.pdf

Abstract:
This work presents LURK (Latent UnleaRned Knowledge), a novel framework that probes for hidden retained knowledge in unlearned LLMs through adversarial suffix prompting. LURK automatically generates adversarial prompt suffixes designed to elicit residual knowledge about the Harry Potter domain, a commonly used benchmark for unlearning. Our experiments reveal that even models deemed successfully unlearned can leak idiosyncratic information under targeted adversarial conditions, highlighting critical limitations of current unlearning evaluation standards. By uncovering latent knowledge through indirect probing, LURK offers a more rigorous and diagnostic tool for assessing the robustness of unlearning algorithms. All code will be publicly available.

Paperid: 3090, https://arxiv.org/pdf/2505.17145.pdf

Abstract:
Large language models (LLMs) are increasingly applied in fields such as finance, education, and governance due to their ability to generate human-like text and adapt to specialized tasks. However, their widespread adoption raises critical concerns about data privacy and security, including the risk of sensitive data exposure. In this paper, we propose a security framework to enforce policy compliance and mitigate risks in LLM interactions. Our approach introduces three key innovations: (i) LLM-based policy enforcement: a customizable mechanism that enhances domain-specific detection of sensitive data. (ii) Dynamic policy customization: real-time policy adaptation and enforcement during user-LLM interactions to ensure compliance with evolving security requirements. (iii) Sensitive data anonymization: a format-preserving encryption technique that protects sensitive information while maintaining contextual integrity. Experimental results demonstrate that our framework effectively mitigates security risks while preserving the functional accuracy of LLM-driven tasks.

Paperid: 3091, https://arxiv.org/pdf/2505.17092.pdf

Abstract:
Secure multiparty computation (MPC) allows data owners to train machine learning models on combined data while keeping the underlying training data private. The MPC threat model either considers an adversary who passively corrupts some parties without affecting their overall behavior, or an adversary who actively modifies the behavior of corrupt parties. It has been argued that in some settings, active security is not a major concern, partly because of the potential risk of reputation loss if a party is detected cheating. In this work we show explicit, simple, and effective attacks that an active adversary can run on existing passively secure MPC training protocols, while keeping essentially zero risk of the attack being detected. The attacks we show can compromise both the integrity and privacy of the model, including attacks reconstructing exact training data. Our results challenge the belief that a threat model that does not include malicious behavior by the involved parties may be reasonable in the context of PPML, motivating the use of actively secure protocols for training.

Paperid: 3092, https://arxiv.org/pdf/2505.17084.pdf

Abstract:
Large language models (LLMs) offer unprecedented and growing capabilities, but also introduce complex safety and security challenges that resist conventional risk management. While conventional probabilistic risk analysis (PRA) requires exhaustive risk enumeration and quantification, the novelty and complexity of these systems make PRA impractical, particularly against adaptive adversaries. Previous research found that risk management in various fields of engineering such as nuclear or civil engineering is often solved by generic (i.e. field-agnostic) strategies such as event tree analysis or robust designs. Here we show how emerging risks in LLM-powered systems could be met with 100+ of these non-probabilistic strategies to risk management, including risks from adaptive adversaries. The strategies are divided into five categories and are mapped to LLM security (and AI safety more broadly). We also present an LLM-powered workflow for applying these strategies and other workflows suitable for solution architects. Overall, these strategies could contribute (despite some limitations) to security, safety and other dimensions of responsible AI.

Paperid: 3093, https://arxiv.org/pdf/2505.17077.pdf

Abstract:
Applications over the Web primarily rely on the HTTP protocol to transmit web pages to and from systems. There are a variety of application layer protocols, but among all, HTTP is the most targeted because of its versatility and ease of integration with online services. The attackers leverage the fact that by default no detection system blocks any HTTP traffic. Thus, by exploiting such characteristics of the protocol, attacks are launched against web applications. HTTP flooding attacks are one such attack in the application layer of the OSI model. In this paper, a method for the detection of such an attack is proposed. The heart of the detection method is an incremental feature subset selection method based on mutual information and correlation. INFS-MICC helps in identifying a subset of highly relevant and independent feature subset so as to detect HTTP Flooding attacks with best possible classification performance in near-real time.

Paperid: 3094, https://arxiv.org/pdf/2505.17066.pdf

Abstract:
Using LLMs in a production environment presents security challenges that include vulnerabilities to jailbreaks and prompt injections, which can result in harmful outputs for humans or the enterprise. The challenge is amplified when working within a specific domain, as topics generally accepted for LLMs to address may be irrelevant to that field. These problems can be mitigated, for example, by fine-tuning large language models with domain-specific and security-focused data. However, these alone are insufficient, as jailbreak techniques evolve. Additionally, API-accessed models do not offer the flexibility needed to tailor behavior to industry-specific objectives, and in-context learning is not always sufficient or reliable. In response to these challenges, we introduce Archias, an expert model adept at distinguishing between in-domain and out-of-domain communications. Archias classifies user inquiries into several categories: in-domain (specifically for the automotive industry), malicious questions, price injections, prompt injections, and out-of-domain examples. Our methodology integrates outputs from the expert model (Archias) into prompts, which are then processed by the LLM to generate responses. This method increases the model's ability to understand the user's intention and give appropriate answers. Archias can be adjusted, fine-tuned, and used for many different purposes due to its small size. Therefore, it can be easily customized to the needs of any industry. To validate our approach, we created a benchmark dataset for the automotive industry. Furthermore, in the interest of advancing research and development, we release our benchmark dataset to the community.

Paperid: 3095, https://arxiv.org/pdf/2505.16614.pdf

Abstract:
The emergence of quantum computing and Shor's algorithm necessitates an imminent shift from current public key cryptography techniques to post-quantum robust techniques. NIST has responded by standardising Post-Quantum Cryptography (PQC) algorithms, with ML-KEM (FIPS-203) slated to replace ECDH (Elliptic Curve Diffie-Hellman) for key exchange. A key practical concern for PQC adoption is energy consumption. This paper introduces a new framework for measuring the PQC energy consumption on a Raspberry Pi when performing key generation. The framework uses both available traditional methods and the newly standardised ML-KEM algorithm via the commonly utilised OpenSSL library.

Paperid: 3096, https://arxiv.org/pdf/2505.16398.pdf

Abstract:
Cyber Security Incident Response (IR) Playbooks are used to capture the steps required to recover from a cyber intrusion. Individual IR playbooks should focus on a specific type of incident and be aligned with the architecture of a system under attack. Intrusion modelling focuses on a specific potential cyber intrusion and is used to identify where and what countermeasures are needed, and the resulting intrusion models are expected to be used in effective IR, ideally by feeding IR Playbooks designs. IR playbooks and intrusion models, however, are created in isolation and at varying stages of the system's lifecycle. We take nine critical national infrastructure intrusion models - expressed using Sequential AND Attack Trees - and transform them into models of the same format as IR playbooks. We use Security Modelling Framework for modelling attacks and playbooks, and for demonstrating the feasibility of the better integration between risk assessment and IR at the modelling level. This results in improved intrusion models and tighter coupling between IR playbooks and threat modelling which - as we demonstrate - yields novel insights into the analysis of attacks and response actions. The main contributions of this paper are (a) a novel way of representing attack trees using the Security Modelling Framework,(b) a new tool for converting Sequential AND attack trees into models compatible with playbooks, and (c) the examples of nine intrusion models represented using the Security Modelling Framework.

Paperid: 3097, https://arxiv.org/pdf/2505.16261.pdf

Abstract:
The widespread adoption of encrypted communication protocols such as HTTPS and TLS has enhanced data privacy but also rendered traditional anomaly detection techniques less effective, as they often rely on inspecting unencrypted payloads. This study aims to develop an interpretable machine learning-based framework for anomaly detection in encrypted network traffic. This study proposes a model-agnostic framework that integrates multiple machine learning classifiers, with SHapley Additive exPlanations SHAP to ensure post-hoc model interpretability. The models are trained and evaluated on three benchmark encrypted traffic datasets. Performance is assessed using standard classification metrics, and SHAP is used to explain model predictions by attributing importance to individual input features. SHAP visualizations successfully revealed the most influential traffic features contributing to anomaly predictions, enhancing the transparency and trustworthiness of the models. Unlike conventional approaches that treat machine learning as a black box, this work combines robust classification techniques with explainability through SHAP, offering a novel interpretable anomaly detection system tailored for encrypted traffic environments. While the framework is generalizable, real-time deployment and performance under adversarial conditions require further investigation. Future work may explore adaptive models and real-time interpretability in operational network environments. This interpretable anomaly detection framework can be integrated into modern security operations for encrypted environments, allowing analysts not only to detect anomalies with high precision but also to understand why a model made a particular decision a crucial capability in compliance-driven and mission-critical settings.

Paperid: 3098, https://arxiv.org/pdf/2505.16246.pdf

Abstract:
Differential Privacy (DP) is a rigorous privacy standard widely adopted in data analysis and machine learning. However, its guarantees rely on correctly introducing randomized noise--an assumption that may not hold if the implementation is faulty or manipulated by an untrusted analyst. To address this concern, we propose the first verifiable implementation of the exponential mechanism using zk-SNARKs. As a concrete application, we present the first verifiable differentially private (DP) median estimation scheme, which leverages this construction to ensure both privacy and verifiability. Our method encodes the exponential mechanism and a utility function for the median into an arithmetic circuit, employing a scaled inverse CDF technique for sampling. This design enables cryptographic verification that the reported output adheres to the intended DP mechanism, ensuring both privacy and integrity without revealing sensitive data.

Paperid: 3099, https://arxiv.org/pdf/2505.16205.pdf

Abstract:
Static Application Security Testing (SAST) enables organizations to detect vulnerabilities in code early; however, major SAST platforms do not include visual aids and present little insight on correlations between tainted data chains. We propose VIVID - Vulnerability Information Via Data flow - a novel method to extract and consume SAST insights, which is to graph the application's vulnerability data flows (VDFs) and carry out graph theory analysis on the resulting VDF directed graph. Nine metrics were assessed to evaluate their effectiveness in analyzing the VDF graphs of deliberately insecure web applications. These metrics include 3 centrality metrics, 2 structural metrics, PageRank, in-degree, out-degree, and cross-clique connectivity. We present simulations that find that out-degree, betweenness centrality, in-eigenvector centrality, and cross-clique connectivity were found to be associated with files exhibiting high vulnerability traffic, making them refactoring candidates where input sanitization may have been missed. Meanwhile, out-eigenvector centrality, PageRank, and in-degree were found to be associated with nodes enabling vulnerability flow and sinks, but not necessarily where input validation should be placed. This is a novel method to automatically provide development teams an evidence-based prioritized list of files to embed security controls into, informed by vulnerability propagation patterns in the application architecture.

Paperid: 3100, https://arxiv.org/pdf/2505.15797.pdf

Abstract:
Voting is a cornerstone of democracy, allowing citizens to express their will and make collective decisions. With advancing technology, online voting is gaining popularity as it enables voting from anywhere with Internet access, eliminating the need for printed ballots or polling stations. However, despite its benefits, online voting carries significant risks. A single vulnerability could be exploited to manipulate elections on a large scale. Centralized systems can be secure but may lack transparency and confidentiality, especially if those in power manipulate them. Blockchain-based voting offers a transparent, tamper-resistant alternative with end-to-end verifiability and strong security. Adding cryptographic layers can also ensure voter confidentiality.

Paperid: 3101, https://arxiv.org/pdf/2505.15175.pdf

Abstract:
We investigate the theoretical foundations of data poisoning attacks in machine learning models. Our analysis reveals that the Hessian with respect to the input serves as a diagnostic tool for detecting poisoning, exhibiting spectral signatures that characterize compromised datasets. We use random matrix theory (RMT) to develop a theory for the impact of poisoning proportion and regularisation on attack efficacy in linear regression. Through QR stepwise regression, we study the spectral signatures of the Hessian in multi-output regression. We perform experiments on deep networks to show experimentally that this theory extends to modern convolutional and transformer networks under the cross-entropy loss. Based on these insights we develop preliminary algorithms to determine if a network has been poisoned and remedies which do not require further training.

Paperid: 3102, https://arxiv.org/pdf/2505.15088.pdf

Abstract:
Command injection vulnerabilities are a significant security threat in dynamic languages like Python, particularly in widely used open-source projects where security issues can have extensive impact. With the proven effectiveness of Large Language Models(LLMs) in code-related tasks, such as testing, researchers have explored their potential for vulnerabilities analysis. This study evaluates the potential of large language models (LLMs), such as GPT-4, as an alternative approach for automated testing for vulnerability detection. In particular, LLMs have demonstrated advanced contextual understanding and adaptability, making them promising candidates for identifying nuanced security vulnerabilities within code. To evaluate this potential, we applied LLM-based analysis to six high-profile GitHub projects-Django, Flask, TensorFlow, Scikit-learn, PyTorch, and Langchain-each with over 50,000 stars and extensive adoption across software development and academic research. Our analysis assesses both the strengths and limitations of LLMs in detecting command injection vulnerabilities, evaluating factors such as detection accuracy, efficiency, and practical integration into development workflows. In addition, we provide a comparative analysis of different LLM tools to identify those most suitable for security applications. Our findings offer guidance for developers and security researchers on leveraging LLMs as innovative and automated approaches to enhance software security.

Paperid: 3103, https://arxiv.org/pdf/2505.14592.pdf

Abstract:
Artificial neural network pruning is a method in which artificial neural network sizes can be reduced while attempting to preserve the predicting capabilities of the network. This is done to make the model smaller or faster during inference time. In this work we analyze the ability of a selection of artificial neural network pruning methods to generalize to a new cybersecurity dataset utilizing a simpler network type than was designed for. We analyze each method using a variety of pruning degrees to best understand how each algorithm responds to the new environment. This has allowed us to determine the most well fit pruning method of those we searched for the task. Unexpectedly, we have found that many of them do not generalize to the problem well, leaving only a few algorithms working to an acceptable degree.

Paperid: 3104, https://arxiv.org/pdf/2505.14323.pdf

Abstract:
Training data reconstruction attacks enable adversaries to recover portions of a released model's training data. We consider the attacks where a reconstructor neural network learns to invert the (random) mapping between training data and model weights. Prior work has shown that an informed adversary with access to released model's weights and all but one training data point can achieve high-quality reconstructions in this way. However, differential privacy can defend against such an attack with little to no loss in model's utility when the amount of training data is sufficiently large. In this work we consider a more realistic adversary who only knows the distribution from which a small training dataset has been sampled and who attacks a transfer-learned neural network classifier that has been trained on this dataset. We exhibit an attack that works in this realistic threat model and demonstrate that in the small-data regime it cannot be defended against by DP-SGD without severely damaging the classifier accuracy. This raises significant concerns about the use of such transfer-learned classifiers when protection of training-data is paramount. We demonstrate the effectiveness and robustness of our attack on VGG, EfficientNet and ResNet image classifiers transfer-learned on MNIST, CIFAR-10 and CelebA respectively. Additionally, we point out that the commonly used (true-positive) reconstruction success rate metric fails to reliably quantify the actual reconstruction effectiveness. Instead, we make use of the Neyman-Pearson lemma to construct the receiver operating characteristic curve and consider the associated true-positive reconstruction rate at a fixed level of the false-positive reconstruction rate.

Paperid: 3105, https://arxiv.org/pdf/2505.14175.pdf

Abstract:
Cyberattacks on smart inverters and distributed PV are becoming an imminent threat, because of the recent well-documented vulnerabilities and attack incidents. Particularly, the long lifespan of inverter devices, users' oblivion of cybersecurity compliance, and the lack of cyber regulatory frameworks exacerbate the prospect of cyberattacks on smart inverters. As a result, this raises a question -- "do cyberattacks on smart inverters, if orchestrated on a large scale, pose a genuine threat of wide-scale instability to the power grid and energy market"? This paper provides a realistic assessment on the plausibility and impacts of wide-scale power instability caused by cyberattacks on smart inverters. We conduct an in-depth study based on the electricity market data of Australia and the knowledge of practical contingency mechanisms. Our key findings reveal: (1) Despite the possibility of disruption to the grid by cyberattacks on smart inverters, the impact is only significant under careful planning and orchestration. (2) While the grid can assure certain power system security to survive inadvertent contingency events, it is insufficient to defend against savvy attackers who can orchestrate attacks in an adversarial manner. Our data analysis of Australia's electricity grid also reveals that a relatively low percentage of distributed PV would be sufficient to launch an impactful concerted attack on the grid. Our study casts insights on robust strategies for defending the grid in the presence of cyberattacks for places with high penetration of distributed PV.

Paperid: 3106, https://arxiv.org/pdf/2505.14067.pdf

Abstract:
To avoid the disclosure of personal or corporate data, sanitization of storage devices is an important issue when such devices are to be reused. While poor sanitization practices have been reported for second-hand hard disk drives, it has been reported that data has been found on original storage devices based on flash technology. Based on insights into the second-hand chip market in China, we report on the results of the first large-scale study on the effects of chip reuse for USB flash drives. We provide clear evidence of poor sanitization practices in a non-negligible fraction of USB flash drives from the low-cost Chinese market that were sold as original. More specifically, we forensically analyzed 614 USB flash drives and were able to recover non-trivial user data on a total of 75 devices (more than 12 %). This non-negligible probability that any data (including incriminating files) already existed on the drive when it was bought has critical implications to forensic investigations. The absence of external factors which correlate with finding data on new USB flash drives complicates the matter further.

Paperid: 3107, https://arxiv.org/pdf/2505.14024.pdf

Abstract:
Federated Learning (FL) enables geographically distributed clients to collaboratively train machine learning models by sharing only their local models, ensuring data privacy. However, FL is vulnerable to untargeted attacks that aim to degrade the global model's performance on the underlying data distribution. Existing defense mechanisms attempt to improve FL's resilience against such attacks, but their effectiveness is limited in practical FL environments due to data heterogeneity. On the contrary, we aim to detect and remove the attacks to mitigate their impact. Generalization contribution plays a crucial role in distinguishing untargeted attacks. Our observations indicate that, with limited data, the divergence between embeddings representing different classes provides a better measure of generalization than direct accuracy. In light of this, we propose a novel robust aggregation method, FedGraM, designed to defend against untargeted attacks in FL. The server maintains an auxiliary dataset containing one sample per class to support aggregation. This dataset is fed to the local models to extract embeddings. Then, the server calculates the norm of the Gram Matrix of the embeddings for each local model. The norm serves as an indicator of each model's inter-class separation capability in the embedding space. FedGraM identifies and removes potentially malicious models by filtering out those with the largest norms, then averages the remaining local models to form the global model. We conduct extensive experiments to evaluate the performance of FedGraM. Our empirical results show that with limited data samples used to construct the auxiliary dataset, FedGraM achieves exceptional performance, outperforming state-of-the-art defense methods.

Paperid: 3108, https://arxiv.org/pdf/2505.13493.pdf

Abstract:
The emergence of Software-Defined Networking (SDN) has changed the network structure by separating the control plane from the data plane. However, this innovation has also increased susceptibility to DDoS attacks. Existing detection techniques are often ineffective due to data imbalance and accuracy issues; thus, a considerable research gap exists regarding DDoS detection methods suitable for SDN contexts. This research attempts to detect DDoS attacks more effectively using machine learning algorithms: RF, SVC, KNN, MLP, and XGB. For this purpose, both balanced and imbalanced datasets have been used to measure the performance of the models in terms of accuracy and AUC. Based on the analysis, we can say that RF and XGB had the perfect score, 1.0000, in the accuracy and AUC, but since XGB ended with the lowest Brier Score which indicates the highest reliability. MLP achieved an accuracy of 99.93%, SVC an accuracy of 97.65% and KNN an accuracy of 97.87%, which was the next best performers after RF and XGB. These results are consistent with the validity of SDNs as a platform for RF and XGB techniques in detecting DDoS attacks and highlights the importance of balanced datasets for improving detection against generative cyber attacks that are continually evolving.

Paperid: 3109, https://arxiv.org/pdf/2505.13475.pdf

Abstract:
We present a formal theory for analysing causality in cyber-physical systems. To this end, we extend the theory of actual causality by Halpern and Pearl to cope with the continuous nature of cyber-physical systems. Based on our theory, we develop an analysis technique that is used to uncover the causes for examples of failures resulting from verification, which are represented as continuous trajectories. We develop a search-based technique to efficiently produce such causes and provide an implementation for such a technique. Moreover, we apply our solution to case studies (a suspension system and a connected platoon) and benchmark systems to evaluate its effectiveness; in the experiment, we show that we were able to detect causes for inserted faults.

Paperid: 3110, https://arxiv.org/pdf/2505.13280.pdf

Abstract:
Despite significant advancements in the area, adversarial robustness remains a critical challenge in systems employing machine learning models. The removal of adversarial perturbations at inference time, known as adversarial purification, has emerged as a promising defense strategy. To achieve this, state-of-the-art methods leverage diffusion models that inject Gaussian noise during a forward process to dilute adversarial perturbations, followed by a denoising step to restore clean samples before classification. In this work, we propose FlowPure, a novel purification method based on Continuous Normalizing Flows (CNFs) trained with Conditional Flow Matching (CFM) to learn mappings from adversarial examples to their clean counterparts. Unlike prior diffusion-based approaches that rely on fixed noise processes, FlowPure can leverage specific attack knowledge to improve robustness under known threats, while also supporting a more general stochastic variant trained on Gaussian perturbations for settings where such knowledge is unavailable. Experiments on CIFAR-10 and CIFAR-100 demonstrate that our method outperforms state-of-the-art purification-based defenses in preprocessor-blind and white-box scenarios, and can do so while fully preserving benign accuracy in the former. Moreover, our results show that not only is FlowPure a highly effective purifier but it also holds a strong potential for adversarial detection, identifying preprocessor-blind PGD samples with near-perfect accuracy.

Paperid: 3111, https://arxiv.org/pdf/2505.13028.pdf

Abstract:
Large Language Models (LLMs) are increasingly integrated into critical systems in industries like healthcare and finance. Users can often submit queries to LLM-enabled chatbots, some of which can enrich responses with information retrieved from internal databases storing sensitive data. This gives rise to a range of attacks in which a user submits a malicious query and the LLM-system outputs a response that creates harm to the owner, such as leaking internal data or creating legal liability by harming a third-party. While security tools are being developed to counter these threats, there is little formal evaluation of their effectiveness and usability. This study addresses this gap by conducting a thorough comparative analysis of LLM security tools. We identified 13 solutions (9 closed-source, 4 open-source), but only 7 were evaluated due to a lack of participation by proprietary model owners.To evaluate, we built a benchmark dataset of malicious prompts, and evaluate these tools performance against a baseline LLM model (ChatGPT-3.5-Turbo). Our results show that the baseline model has too many false positives to be used for this task. Lakera Guard and ProtectAI LLM Guard emerged as the best overall tools showcasing the tradeoff between usability and performance. The study concluded with recommendations for greater transparency among closed source providers, improved context-aware detections, enhanced open-source engagement, increased user awareness, and the adoption of more representative performance metrics.

Paperid: 3112, https://arxiv.org/pdf/2505.12968.pdf

Abstract:
Anonymous authentication is a technique that allows to combine access control with privacy preservation. Typically, clients use different pseudonyms for each access, hindering providers from correlating their activities. To perform the revocation of pseudonyms in a privacy preserving manner is notoriously challenging. When multiple pseudonyms are revoked together, an adversary may infer that these pseudonyms belong to the same client and perform privacy breaking correlations, in particular if these pseudonyms have already been used. Backward unlinkability and revocation auditability are two properties that address this problem. Most systems that offer these properties rely on some sort of time slots, which assume a common reference of time that must be shared among clients and providers; for instance, the client must be aware that it should not use a pseudonym after a certain time or should be able to assess the freshness of a revocation list prior to perform authentication. In this paper we propose Lara, a Lightweight Anonymous Authentication with Asynchronous Revocation Auditability that does not require parties to agree on the current time slot and it is not affected by the clock skew. Prior to disclosing a pseudonym, clients are provided with a revocation list (RL) and can check that the pseudonym has not been revoked. Then, they provide a proof on non-revocation that cannot be used against any other (past or future) RL, avoiding any dependency of timing assumptions. Lara can be implemented using efficient public-key primitives and space-efficient data structures. We have implemented a prototype of Lara and have assessed experimentally its efficiency.

Paperid: 3113, https://arxiv.org/pdf/2505.12640.pdf

Abstract:
With the rapid increase in privacy violations in modern software development, regulatory frameworks such as the General Data Protection Regulation (GDPR) have been established to enforce strict data protection practices. However, insufficient privacy awareness among SME software developers contributes to failure in GDPR compliance. For instance, a developer unfamiliar with data minimization may build a system that collects excessive data, violating GDPR and risking fines. One reason for this lack of awareness is that developers in SMEs often take on multidisciplinary roles (e.g., front-end, back-end, database management, and privacy compliance), which limits specialization in privacy. This lack of awareness may lead to poor privacy attitudes, ultimately hindering the development of a strong organizational privacy culture. However, SMEs that achieve GDPR compliance may gain competitive advantages, such as increased user trust and marketing value, compared to others that do not. Therefore, in this paper, we introduce a novel AI-powered framework called "GDPRShield," specifically designed to enhance the GDPR awareness of SME software developers and, through this, improve their privacy attitudes. Simultaneously, GDPRShield boosts developers' motivation to comply with GDPR from the early stages of software development. It leverages functional requirements written as user stories to provide comprehensive GDPR-based privacy descriptions tailored to each requirement. Alongside improving awareness, GDPRShield strengthens motivation by presenting real-world consequences of noncompliance, such as heavy fines, reputational damage, and loss of user trust, aligned with each requirement. This dual focus on awareness and motivation leads developers to engage with GDPRShield, improving their GDPR compliance and privacy attitudes, which will help SMEs build a stronger privacy culture over time.

Paperid: 3114, https://arxiv.org/pdf/2505.12610.pdf

Abstract:
Concerns regarding privacy and data security in conventional healthcare prompted alternative technologies. In smart healthcare, blockchain technology addresses existing concerns with security, privacy, and electronic healthcare transmission. Integration of Blockchain Technology with the Internet of Medical Things (IoMT) allows real-time monitoring of protected healthcare data. Utilizing edge devices with IoMT devices is very advantageous for addressing security, computing, and storage challenges. Encryption using symmetric and asymmetric keys is used to conceal sensitive information from unauthorized parties. SHA256 is an algorithm for one-way hashing. It is used to verify that the data has not been altered, since if it had, the hash value would have changed. This article offers a blockchain-based smart healthcare system using IoMT devices for continuous patient monitoring. In addition, it employs edge resources in addition to IoMT devices to have extra computing power and storage to hash and encrypt incoming data before sending it to the blockchain. Symmetric key is utilized to keep the data private even in the blockchain, allowing the patient to safely communicate the data through smart contracts while preventing unauthorized physicians from seeing the data. Through the use of a verification node and blockchain, an asymmetric key is used for the signing and validation of patient data in the healthcare provider system. In addition to other security measures, location-based authentication is recommended to guarantee that data originates from the patient area. Through the edge device, SHA256 is utilized to secure the data's integrity and a secret key is used to maintain its secrecy. The hChain architecture improves the computing power of IoMT environments, the security of EHR sharing through smart contracts, and the privacy and authentication procedures.

Paperid: 3115, https://arxiv.org/pdf/2505.12567.pdf

Abstract:
Large language models (LLMs) and LLM-based agents have been widely deployed in a wide range of applications in the real world, including healthcare diagnostics, financial analysis, customer support, robotics, and autonomous driving, expanding their powerful capability of understanding, reasoning, and generating natural languages. However, the wide deployment of LLM-based applications exposes critical security and reliability risks, such as the potential for malicious misuse, privacy leakage, and service disruption that weaken user trust and undermine societal safety. This paper provides a systematic overview of the details of adversarial attacks targeting both LLMs and LLM-based agents. These attacks are organized into three phases in LLMs: Training-Phase Attacks, Inference-Phase Attacks, and Availability & Integrity Attacks. For each phase, we analyze the details of representative and recently introduced attack methods along with their corresponding defenses. We hope our survey will provide a good tutorial and a comprehensive understanding of LLM security, especially for attacks on LLMs. We desire to raise attention to the risks inherent in widely deployed LLM-based applications and highlight the urgent need for robust mitigation strategies for evolving threats.

Paperid: 3116, https://arxiv.org/pdf/2505.12106.pdf

Abstract:
As technology advances, Android malware continues to pose significant threats to devices and sensitive data. The open-source nature of the Android OS and the availability of its SDK contribute to this rapid growth. Traditional malware detection techniques, such as signature-based, static, and dynamic analysis, struggle to detect obfuscated threats that use encryption, packing, or compression. While deep learning (DL)-based visualization methods have been proposed, they often fail to highlight the critical malicious features effectively. This research introduces MalVis, a unified visualization framework that integrates entropy and N-gram analysis to emphasize structural and anomalous patterns in malware bytecode. MalVis addresses key limitations of prior methods, including insufficient feature representation, poor interpretability, and limited data accessibility. The framework leverages a newly introduced large-scale dataset, the MalVis dataset, containing over 1.3 million visual samples across nine malware classes and one benign class. We evaluate MalVis against state-of-the-art visualization techniques using leading CNN models: MobileNet-V2, DenseNet201, ResNet50, and Inception-V3. To enhance performance and reduce overfitting, we implement eight ensemble learning strategies. Additionally, an undersampling technique mitigates class imbalance in the multiclass setting. MalVis achieves strong results: 95.19% accuracy, 90.81% F1-score, 92.58% precision, 89.10% recall, 87.58% MCC, and 98.06% ROC-AUC. These findings demonstrate the effectiveness of MalVis in enabling accurate, interpretable malware detection and providing a valuable resource for security research and applications.

Paperid: 3117, https://arxiv.org/pdf/2505.12104.pdf

Abstract:
Modern organizations are persistently targeted by phishing emails. Despite advances in detection systems and widespread employee training, attackers continue to innovate, posing ongoing threats. Two emerging vectors stand out in the current landscape: QR-code baits and LLM-enabled pretexting. Yet, little is known about the effectiveness of current defenses against these attacks, particularly when it comes to real-world impact on employees. This gap leaves uncertainty around to what extent related countermeasures are justified or needed. Our work addresses this issue. We conduct three phishing simulations across organizations of varying sizes -- from small-medium businesses to a multinational enterprise. In total, we send over 71k emails targeting employees, including: a "traditional" phishing email with a click-through button; a nearly-identical "quishing" email with a QR code instead; and a phishing email written with the assistance of an LLM and open-source intelligence. Our results show that quishing emails have the same effectiveness as traditional phishing emails at luring users to the landing webpage -- which is worrying, given that quishing emails are much harder to identify even by operational detectors. We also find that LLMs can be very good "social engineers": in one company, over 30% of the emails opened led to visiting the landing webpage -- a rate exceeding some prior benchmarks. Finally, we complement our study by conducting a survey across the organizations' employees, measuring their "perceived" phishing awareness. Our findings suggest a correlation between higher self-reported awareness and organizational resilience to phishing attempts.

Paperid: 3118, https://arxiv.org/pdf/2505.11928.pdf

Abstract:
The moduli of the form 2n + 1 belong to a class of low-cost odd moduli, which have been frequently selected to form the basis of various residue number systems (RNS). The most efficient computations modulo (mod) 2n + 1 are performed using the so-called diminished-1 (D1) representation. Therefore, it is desirable that the input converter from the positional number system to RNS (composed of a set of residue generators) could generate the residues mod 2n + 1 in D1 form. In this paper, we propose the basic architecture of the residue generator mod 2n + 1 with D1 output. It is universal, because its initial part can be easily designed for an arbitrary p >= 4n, whereas its final block-the 4-operand adder mod 2n + 1-preserves the same structure for any p. If a pair of conjugate moduli 2n +/- 1 belongs to the RNS moduli set, the latter architecture can be easily extended to build p-input bi-residue generators mod 2n+/-1, which not only save hardware by sharing p - 4n full-adders, but also generate the residue mod 2n + 1 directly in D1 form.

Paperid: 3119, https://arxiv.org/pdf/2505.11565.pdf

Abstract:
While Large Language Models have shown promise in cybersecurity applications, their effectiveness in identifying security threats within cloud deployments remains unexplored. This paper introduces AWS Cloud Security Engineering Eval, a novel dataset for evaluating LLMs cloud security threat modeling capabilities. ACSE-Eval contains 100 production grade AWS deployment scenarios, each featuring detailed architectural specifications, Infrastructure as Code implementations, documented security vulnerabilities, and associated threat modeling parameters. Our dataset enables systemic assessment of LLMs abilities to identify security risks, analyze attack vectors, and propose mitigation strategies in cloud environments. Our evaluations on ACSE-Eval demonstrate that GPT 4.1 and Gemini 2.5 Pro excel at threat identification, with Gemini 2.5 Pro performing optimally in 0-shot scenarios and GPT 4.1 showing superior results in few-shot settings. While GPT 4.1 maintains a slight overall performance advantage, Claude 3.7 Sonnet generates the most semantically sophisticated threat models but struggles with threat categorization and generalization. To promote reproducibility and advance research in automated cybersecurity threat analysis, we open-source our dataset, evaluation metrics, and methodologies.

Paperid: 3120, https://arxiv.org/pdf/2505.11549.pdf

Abstract:
This study examines the strategic role of cybersecurity based on survey data from 1,083 managers across Europe, the UK, and the United States. The findings indicate growing recognition of cybersecurity as a source of competitive advantage, although firms continue to face barriers such as limited resources, talent shortages, and cultural resistance. Larger and high-tech firms tend to adopt more proactive strategies, while SMEs and low-tech sectors display greater variability. Key managerial tensions emerge in balancing security with innovation and agility. Notable country-level differences are observed across Europe, the UK, and the United States. Across all contexts, leadership and employee engagement appear central to closing the gap between strategic intent and operational practice.

Paperid: 3121, https://arxiv.org/pdf/2505.10746.pdf

Abstract:
Foreign information operations conducted by Russian and Chinese actors exploit the United States' permissive information environment. These campaigns threaten democratic institutions and the broader Westphalian model. Yet, existing detection and mitigation strategies often fail to identify active information campaigns in real time. This paper introduces ChestyBot, a pragmatics-based language model that detects unlabeled foreign malign influence tweets with up to 98.34% accuracy. The model supports a novel framework to disrupt foreign influence operations in their formative stages.

Paperid: 3122, https://arxiv.org/pdf/2505.10732.pdf

Abstract:
In the current rapidly changing digital environment, businesses are under constant stress to ensure that their systems are secured. Security audits help to maintain a strong security posture by ensuring that policies are in place, controls are implemented, gaps are identified for cybersecurity risks mitigation. However, audits are usually manual, requiring much time and costs. This paper looks at the possibility of developing a framework to leverage Large Language Models (LLMs) as an autonomous agent to execute part of the security audit, namely with the field audit. password policy compliance for Windows operating system. Through the conduct of an exploration experiment of using GPT-4 with Langchain, the agent executed the audit tasks by accurately flagging password policy violations and appeared to be more efficient than traditional manual audits. Despite its potential limitations in operational consistency in complex and dynamic environment, the framework suggests possibilities to extend further to real-time threat monitoring and compliance checks.

Paperid: 3123, https://arxiv.org/pdf/2505.10708.pdf

Abstract:
Rust is a strong contender for a memory-safe alternative to C as a "systems" programming language, but porting the vast amount of existing C code to Rust is a daunting task. In this paper, we evaluate the potential of large language models (LLMs) to automate the transpilation of C code to idiomatic Rust, while ensuring that the generated code mitigates any memory-related vulnerabilities present in the original code. To that end, we present the design and implementation of SafeTrans, a framework that uses LLMs to i) transpile C code into Rust and ii) iteratively fix any compilation and runtime errors in the resulting code. A key novelty of our approach is the introduction of a few-shot guided repair technique for translation errors, which provides contextual information and example code snippets for specific error types, guiding the LLM toward the correct solution. Another novel aspect of our work is the evaluation of the security implications of the transpilation process, i.e., whether potential vulnerabilities in the original C code have been properly addressed in the translated Rust code. We experimentally evaluated SafeTrans with six leading LLMs and a set of 2,653 C programs accompanied by comprehensive unit tests, which were used for validating the correctness of the translated code. Our results show that our iterative repair strategy improves the rate of successful translations from 54% to 80% for the best-performing LLM (GPT-4o), and that all types of identified vulnerabilities in the original C code are effectively mitigated in the translated Rust code.

Paperid: 3124, https://arxiv.org/pdf/2505.10600.pdf

Abstract:
Due to the rapid growth in the number of Internet of Things (IoT) networks, the cyber risk has increased exponentially, and therefore, we have to develop effective IDS that can work well with highly imbalanced datasets. A high rate of missed threats can be the result, as traditional machine learning models tend to struggle in identifying attacks when normal data volume is much higher than the volume of attacks. For example, the dataset used in this study reveals a strong class imbalance with 94,659 instances of the majority class and only 28 instances of the minority class, making it quite challenging to determine rare attacks accurately. The challenges presented in this research are addressed by hybrid sampling techniques designed to improve data imbalance detection accuracy in IoT domains. After applying these techniques, we evaluate the performance of several machine learning models such as Random Forest, Soft Voting, Support Vector Classifier (SVC), K-Nearest Neighbors (KNN), Multi-Layer Perceptron (MLP), and Logistic Regression with respect to the classification of cyber-attacks. The obtained results indicate that the Random Forest model achieved the best performance with a Kappa score of 0.9903, test accuracy of 0.9961, and AUC of 0.9994. Strong performance is also shown by the Soft Voting model, with an accuracy of 0.9952 and AUC of 0.9997, indicating the benefits of combining model predictions. Overall, this work demonstrates the value of hybrid sampling combined with robust model and feature selection for significantly improving IoT security against cyber-attacks, especially in highly imbalanced data environments.

Paperid: 3125, https://arxiv.org/pdf/2505.09983.pdf

Abstract:
Federated learning is vulnerable to poisoning attacks by malicious adversaries. Existing methods often involve high costs to achieve effective attacks. To address this challenge, we propose a sybil-based virtual data poisoning attack, where a malicious client generates sybil nodes to amplify the poisoning model's impact. To reduce neural network computational complexity, we develop a virtual data generation method based on gradient matching. We also design three schemes for target model acquisition, applicable to online local, online global, and offline scenarios. In simulation, our method outperforms other attack algorithms since our method can obtain a global target model under non-independent uniformly distributed data.

Paperid: 3126, https://arxiv.org/pdf/2505.09843.pdf

Abstract:
Enterprise networks are growing ever larger with a rapidly expanding attack surface, increasing the volume of security alerts generated from security controls. Security Operations Centre (SOC) analysts triage these alerts to identify malicious activity, but they struggle with alert fatigue due to the overwhelming number of benign alerts. Organisations are turning to managed SOC providers, where the problem is amplified by context switching and limited visibility into business processes. A novel system, named AACT, is introduced that automates SOC workflows by learning from analysts' triage actions on cybersecurity alerts. It accurately predicts triage decisions in real time, allowing benign alerts to be closed automatically and critical ones prioritised. This reduces the SOC queue allowing analysts to focus on the most severe, relevant or ambiguous threats. The system has been trained and evaluated on both real SOC data and an open dataset, obtaining high performance in identifying malicious alerts from benign alerts. Additionally, the system has demonstrated high accuracy in a real SOC environment, reducing alerts shown to analysts by 61% over six months, with a low false negative rate of 1.36% over millions of alerts.

Paperid: 3127, https://arxiv.org/pdf/2505.09733.pdf

Abstract:
Federated learning (FL) presents an effective solution for collaborative model training while maintaining data privacy across decentralized client datasets. However, data quality issues such as noisy labels, missing classes, and imbalanced distributions significantly challenge its effectiveness. This study proposes a federated learning methodology that systematically addresses data quality issues, including noise, class imbalance, and missing labels. The proposed approach systematically enhances data integrity through adaptive noise cleaning, collaborative conditional GAN-based synthetic data generation, and robust federated model training. Experimental evaluations conducted on benchmark datasets (MNIST and Fashion-MNIST) demonstrate significant improvements in federated model performance, particularly macro-F1 Score, under varying noise and class imbalance conditions. Additionally, the proposed framework carefully balances computational feasibility and substantial performance gains, ensuring practicality for resource constrained edge devices while rigorously maintaining data privacy. Our results indicate that this method effectively mitigates common data quality challenges, providing a robust, scalable, and privacy compliant solution suitable for diverse real-world federated learning scenarios.

Paperid: 3128, https://arxiv.org/pdf/2505.09261.pdf

Abstract:
Extracting MITRE ATT\&CK Tactics, Techniques, and Procedures (TTPs) from natural language threat reports is crucial yet challenging. Existing methods primarily focus on performance metrics using data-driven approaches, often neglecting mechanisms to ensure faithful adherence to the official standard. This deficiency compromises reliability and consistency of TTP assignments, creating intelligence silos and contradictory threat assessments across organizations. To address this, we introduce a novel framework that converts abstract standard definitions into actionable, contextualized knowledge. Our method utilizes Large Language Model (LLM) to generate, update, and apply this knowledge. This framework populates an evolvable memory with dual-layer situational knowledge instances derived from labeled examples and official definitions. The first layer identifies situational contexts (e.g., "Communication with C2 using encoded subdomains"), while the second layer captures distinctive features that differentiate similar techniques (e.g., distinguishing T1132 "Data Encoding" from T1071 "Application Layer Protocol" based on whether the focus is on encoding methods or protocol usage). This structured approach provides a transparent basis for explainable TTP assignments and enhanced human oversight, while also helping to standardize other TTP extraction systems. Experiments show our framework (using Qwen2.5-32B) boosts Technique F1 scores by 11\% over GPT-4o. Qualitative analysis confirms superior standardization, enhanced transparency, and improved explainability in real-world threat intelligence scenarios. To the best of our knowledge, this is the first work that uses the LLM to generate, update, and apply the a new knowledge for TTP extraction.

Paperid: 3129, https://arxiv.org/pdf/2505.09038.pdf

Abstract:
Satellites face a multitude of security risks that set them apart from hardware on Earth. Small satellites may face additional challenges, as they are often developed on a budget and by amateur organizations or universities that do not consider security. We explore the security practices and preferences of small satellite teams, particularly university satellite teams, to understand what barriers exist to building satellites securely. We interviewed 8 university satellite club leaders across 4 clubs in the U.S. and perform a code audit of 3 of these clubs' code repositories. We find that security practices vary widely across teams, but all teams studied had vulnerabilities available to an unprivileged, ground-based attacker. Participants foresee many risks of unsecured small satellites and indicate security shortcomings in industry and government. Lastly, we identify a set of considerations for how to build future small satellites securely, in amateur organizations and beyond.

Paperid: 3130, https://arxiv.org/pdf/2505.08964.pdf

Abstract:
The dramatic increase of complex, multi-step, and rapidly evolving attacks in dynamic networks involves advanced cyber-threat detectors. The GPML (Graph Processing for Machine Learning) library addresses this need by transforming raw network traffic traces into graph representations, enabling advanced insights into network behaviors. The library provides tools to detect anomalies in interaction and community shifts in dynamic networks. GPML supports community and spectral metrics extraction, enhancing both real-time detection and historical forensics analysis. This library supports modern cybersecurity challenges with a robust, graph-based approach.

Paperid: 3131, https://arxiv.org/pdf/2505.08849.pdf

Abstract:
Language model alignment is crucial for ensuring that large language models (LLMs) align with human preferences, yet it often involves sensitive user data, raising significant privacy concerns. While prior work has integrated differential privacy (DP) with alignment techniques, their performance remains limited. In this paper, we propose novel algorithms for privacy-preserving alignment and rigorously analyze their effectiveness across varying privacy budgets and models. Our framework can be deployed on two celebrated alignment techniques, namely direct preference optimization (DPO) and reinforcement learning from human feedback (RLHF). Through systematic experiments on large-scale language models, we demonstrate that our approach achieves state-of-the-art performance. Notably, one of our algorithms, DP-AdamW, combined with DPO, surpasses existing methods, improving alignment quality by up to 15% under moderate privacy budgets (Îµ=2-5). We further investigate the interplay between privacy guarantees, alignment efficacy, and computational demands, providing practical guidelines for optimizing these trade-offs.

Paperid: 3132, https://arxiv.org/pdf/2505.08840.pdf

Abstract:
In this thesis, a novel lightweight hybrid encryption algorithm named SEPAR is proposed, featuring a 16-bit block length and a 128-bit initialization vector. The algorithm is designed specifically for application in Internet of Things (IoT) technology devices. The design concept of this algorithm is based on the integration of a pseudo-random permutation function and a pseudo-random generator function. This intelligent combination not only enhances the algorithm's resistance against cryptographic attacks but also improves its processing speed. The security analyses conducted on the algorithm, along with the results of NIST statistical tests, confirm its robustness against most common and advanced cryptographic attacks, including linear and differential attacks. The proposed algorithm has been implemented on various software platform architectures. The software implementation was carried out on three platforms: 8-bit, 16-bit, and 32-bit architectures. A comparative analysis with the BORON algorithm on a 32-bit ARM processor indicates a performance improvement of 42.25%. Furthermore, implementation results on 8-bit and 16-bit microcontrollers demonstrate performance improvements of 87.91% and 98.01% respectively, compared to the PRESENT cipher.

Paperid: 3133, https://arxiv.org/pdf/2505.08596.pdf

Abstract:
The process of linking databases that contain sensitive information about individuals across organisations is an increasingly common requirement in the health and social science research domains, as well as with governments and businesses. To protect personal data, protocols have been developed to limit the leakage of sensitive information. Furthermore, privacy-preserving record linkage (PPRL) techniques have been proposed to conduct linkage on encoded data. While PPRL techniques are now being employed in real-world applications, the focus of PPRL research has been on the technical aspects of linking sensitive data (such as encoding methods and cryptanalysis attacks), but not on organisational challenges when employing such techniques in practice. We analyse what sensitive information can possibly leak, either unintentionally or intentionally, in traditional data linkage as well as PPRL protocols, and what a party that participates in such a protocol can learn from the data it obtains legitimately within the protocol. We also show that PPRL protocols can still result in the unintentional leakage of sensitive information. We provide recommendations to help data custodians and other parties involved in a data linkage project to identify and prevent vulnerabilities and make their project more secure.

Paperid: 3134, https://arxiv.org/pdf/2505.08544.pdf

Abstract:
A code-level backdoor is a hidden access, programmed and concealed within the code of a program. For instance, hard-coded credentials planted in the code of a file server application would enable maliciously logging into all deployed instances of this application. Confirmed software supply chain attacks have led to the injection of backdoors into popular open-source projects, and backdoors have been discovered in various router firmware. Manual code auditing for backdoors is challenging and existing semi-automated approaches can handle only a limited scope of programs and backdoors, while requiring manual reverse-engineering of the audited (binary) program. Graybox fuzzing (automated semi-randomized testing) has grown in popularity due to its success in discovering vulnerabilities and hence stands as a strong candidate for improved backdoor detection. However, current fuzzing knowledge does not offer any means to detect the triggering of a backdoor at runtime. In this work we introduce ROSA, a novel approach (and tool) which combines a state-of-the-art fuzzer (AFL++) with a new metamorphic test oracle, capable of detecting runtime backdoor triggers. To facilitate the evaluation of ROSA, we have created ROSARUM, the first openly available benchmark for assessing the detection of various backdoors in diverse programs. Experimental evaluation shows that ROSA has a level of robustness, speed and automation similar to classical fuzzing. It finds all 17 authentic or synthetic backdooors from ROSARUM in 1h30 on average. Compared to existing detection tools, it can handle a diversity of backdoors and programs and it does not rely on manual reverse-engineering of the fuzzed binary code.

Paperid: 3135, https://arxiv.org/pdf/2505.08162.pdf

Abstract:
With the rapid advancement of quantum computing technology, post-quantum cryptography (PQC) has emerged as a pivotal direction for next-generation encryption standards. Among these, lattice-based cryptographic schemes rely heavily on the fast Number Theoretic Transform (NTT) over polynomial rings, whose performance directly determines encryption/decryption throughput and energy efficiency. However, existing software-based NTT implementations struggle to meet the real-time performance and low-power requirements of IoT and edge devices. To address this challenge, this paper proposes an area-efficient highly parallel NTT accelerator with glitch-driven near-memory computing (GDNTT). The design integrates a 10T SRAM for data storage, enabling flexible row/column data access and streamlining circuit mapping strategies. Furthermore, a glitch generator is incorporated into the near-memory computing unit, significantly reducing the latency of butterfly operations. Evaluation results show that the proposed NTT accelerator achieves a 1.5~28* improvement in throughput-per-area compared to the state-of-the-art.

Paperid: 3136, https://arxiv.org/pdf/2505.08114.pdf

Abstract:
Explainable artificial intelligence (XAI) methods have become increasingly important in the context of explainable intrusion detection systems (X-IDSs) for improving the interpretability and trustworthiness of X-IDSs. However, existing evaluation approaches for XAI focus on model-specific properties such as fidelity and simplicity, and neglect whether the explanation content is meaningful or useful within the application domain. In this paper, we introduce new evaluation metrics measuring the quality of explanations from X-IDSs. The metrics aim at quantifying how well explanations are aligned with predefined feature sets that can be identified from domain-specific knowledge bases. Such alignment with these knowledge bases enables explanations to reflect domain knowledge and enables meaningful and actionable insights for security analysts. In our evaluation, we demonstrate the use of the proposed metrics to evaluate the quality of explanations from X-IDSs. The experimental results show that the proposed metrics can offer meaningful differences in explanation quality across X-IDSs and attack types, and assess how well X-IDS explanations reflect known domain knowledge. The findings of the proposed metrics provide actionable insights for security analysts to improve the interpretability of X-IDS in practical settings.

Paperid: 3138, https://arxiv.org/pdf/2505.07380.pdf

Abstract:
iPhone portrait-mode images contain a distinctive pattern in out-of-focus regions simulating the bokeh effect, which we term Apple's Synthetic Defocus Noise Pattern (SDNP). If overlooked, this pattern can interfere with blind forensic analyses, especially PRNU-based camera source verification, as noted in earlier works. Since Apple's SDNP remains underexplored, we provide a detailed characterization, proposing a method for its precise estimation, modeling its dependence on scene brightness, ISO settings, and other factors. Leveraging this characterization, we explore forensic applications of the SDNP, including traceability of portrait-mode images across iPhone models and iOS versions in open-set scenarios, assessing its robustness under post-processing. Furthermore, we show that masking SDNP-affected regions in PRNU-based camera source verification significantly reduces false positives, overcoming a critical limitation in camera attribution, and improving state-of-the-art techniques.

Paperid: 3139, https://arxiv.org/pdf/2505.07188.pdf

Abstract:
Federated Learning (FL) offers a promising framework for collaboratively training machine learning models across decentralized genomic datasets without direct data sharing. While this approach preserves data locality, it remains susceptible to sophisticated inference attacks that can compromise individual privacy. In this study, we simulate a federated learning setup using synthetic genomic data and assess its vulnerability to three key attack vectors: Membership Inference Attack (MIA), Gradient-Based Membership Inference Attack, and Label Inference Attack (LIA). Our experiments reveal that Gradient-Based MIA achieves the highest effectiveness, with a precision of 0.79 and F1-score of 0.87, underscoring the risk posed by gradient exposure in federated updates. Additionally, we visualize comparative attack performance through radar plots and quantify model leakage across clients. The findings emphasize the inadequacy of naÃ¯ve FL setups in safeguarding genomic privacy and motivate the development of more robust privacy-preserving mechanisms tailored to the unique sensitivity of genomic data.

Paperid: 3140, https://arxiv.org/pdf/2505.06913.pdf

Abstract:
From automated intrusion testing to discovery of zero-day attacks before software launch, agentic AI calls for great promises in security engineering. This strong capability is bound with a similar threat: the security and research community must build up its models before the approach is leveraged by malicious actors for cybercrime. We therefore propose and evaluate RedTeamLLM, an integrated architecture with a comprehensive security model for automatization of pentest tasks. RedTeamLLM follows three key steps: summarizing, reasoning and act, which embed its operational capacity. This novel framework addresses four open challenges: plan correction, memory management, context window constraint, and generality vs. specialization. Evaluation is performed through the automated resolution of a range of entry-level, but not trivial, CTF challenges. The contribution of the reasoning capability of our agentic AI framework is specifically evaluated.

Paperid: 3141, https://arxiv.org/pdf/2505.06822.pdf

Abstract:
In this paper, we proposes an automatic firmware analysis tool targeting at finding hidden services that may be potentially harmful to the IoT devices. Our approach uses static analysis and symbolic execution to search and filter services that are transparent to normal users but explicit to experienced attackers. A prototype is built and evaluated against a dataset of IoT firmware, and The evaluation shows our tool can find the suspicious hidden services effectively.

Paperid: 3142, https://arxiv.org/pdf/2505.06738.pdf

Abstract:
Large Language Models (LLMs) that can be deployed locally have recently gained popularity for privacy-sensitive tasks, with companies such as Meta, Google, and Intel playing significant roles in their development. However, the security of local LLMs through the lens of hardware cache side-channels remains unexplored. In this paper, we unveil novel side-channel vulnerabilities in local LLM inference: token value and token position leakage, which can expose both the victim's input and output text, thereby compromising user privacy. Specifically, we found that adversaries can infer the token values from the cache access patterns of the token embedding operation, and deduce the token positions from the timing of autoregressive decoding phases. To demonstrate the potential of these leaks, we design a novel eavesdropping attack framework targeting both open-source and proprietary LLM inference systems. The attack framework does not directly interact with the victim's LLM and can be executed without privilege. We evaluate the attack on a range of practical local LLM deployments (e.g., Llama, Falcon, and Gemma), and the results show that our attack achieves promising accuracy. The restored output and input text have an average edit distance of 5.2% and 17.3% to the ground truth, respectively. Furthermore, the reconstructed texts achieve average cosine similarity scores of 98.7% (input) and 98.0% (output).

Paperid: 3143, https://arxiv.org/pdf/2505.06498.pdf

Abstract:
Over the years, adversarial attempts against critical services have become more effective and sophisticated in launching low-profile attacks. This trend has always been concerning. However, an even more alarming trend is the increasing difficulty of collecting relevant evidence about these attacks and the involved threat actors in the early stages before significant damage is done. This issue puts defenders at a significant disadvantage, as it becomes exceedingly difficult to understand the attack details and formulate an appropriate response. Developing robust forensics tools to collect evidence about modern threats has never been easy. One main challenge is to provide a robust trade-off between achieving sufficient visibility while leaving minimal detectable artifacts. This paper will introduce LASE, an open-source Low-Artifact Forensics Engine to perform threat analysis and forensics in Windows operating system. LASE augments current analysis tools by providing detailed, system-wide monitoring capabilities while minimizing detectable artifacts. We designed multiple deployment scenarios, showing LASE's potential in evidence gathering and threat reasoning in a real-world setting. By making LASE and its execution trace data available to the broader research community, this work encourages further exploration in the field by reducing the engineering costs for threat analysis and building a longitudinal behavioral analysis catalog for diverse security domains.

Paperid: 3144, https://arxiv.org/pdf/2505.06470.pdf

Abstract:
In this work, we hope to expand the universe of security practitioners of open-source hardware by creating a bridge from hardware design languages (HDLs) to data science languages like Python and R through novel libraries that convert VCD (value change dump) files into data frames, the expected input type of the modern data science tools. We show how insights can be derived in high-level languages from register transfer level (RTL) trace data. Additionally, we show a promising future direction in hardware security research leveraging the parallelism of Spark to study transient execution CPU vulnerabilities, and provide reproducibility researchers via GitHub and Colab.

Paperid: 3145, https://arxiv.org/pdf/2505.06406.pdf

Abstract:
Recent advances in AI are transforming AI's ubiquitous presence in our world from that of standalone AI-applications into deeply integrated AI-agents. These changes have been driven by agents' increasing capability to autonomously make decisions and initiate actions, using existing applications; whether those applications are AI-based or not. This evolution enables unprecedented levels of AI integration, with agents now able to take actions on behalf of systems and users -- including, in some cases, the powerful ability for the AI to write and execute scripts as it deems necessary. With AI systems now able to autonomously execute code, interact with external systems, and operate without human oversight, traditional security approaches fall short. This paper introduces an asset-centric methodology for threat modeling AI systems that addresses the unique security challenges posed by integrated AI agents. Unlike existing top-down frameworks that analyze individual attacks within specific product contexts, our bottom-up approach enables defenders to systematically identify how vulnerabilities -- both conventional and AI-specific -- impact critical AI assets across distributed infrastructures used to develop and deploy these agents. This methodology allows security teams to: (1) perform comprehensive analysis that communicates effectively across technical domains, (2) quantify security assumptions about third-party AI components without requiring visibility into their implementation, and (3) holistically identify AI-based vulnerabilities relevant to their specific product context. This approach is particularly relevant for securing agentic systems with complex autonomous capabilities. By focusing on assets rather than attacks, our approach scales with the rapidly evolving threat landscape while accommodating increasingly complex and distributed AI development pipelines.

Paperid: 3147, https://arxiv.org/pdf/2505.05959.pdf

Abstract:
The advancements in quantum computing are a threat to classical cryptographic systems. The traditional cryptographic methods that utilize factorization-based or discrete-logarithm-based algorithms, such as RSA and ECC, are some of these. This paper thoroughly investigates the vulnerabilities of traditional cryptographic methods against quantum attacks and provides a decision-support framework to help organizations in recommending mitigation plans and determining appropriate transition strategies to post-quantum cryptography. A semi-synthetic dataset, consisting of key features such as key size, network complexity, and sensitivity levels, is crafted, with each configuration labeled according to its recommended mitigation plan. Using decision tree and random forest models, a classifier is trained to recommend appropriate mitigation/transition plans such as continuous monitoring, scheduled transitions, and immediate hybrid implementation. The proposed approach introduces a data-driven and dynamic solution for organizations to assess the scale of the migration, specifying a structured roadmap toward quantum resilience. The results highlight important features that influence strategy decisions and support actionable recommendations for cryptographic modernization based on system context.

Paperid: 3148, https://arxiv.org/pdf/2505.05846.pdf

Abstract:
Functional encryption (FE) has recently attracted interest in privacy-preserving machine learning (PPML) for its unique ability to compute specific functions on encrypted data. A related line of work focuses on noisy FE, which ensures differential privacy in the output while keeping the data encrypted. We extend the notion of noisy multi-input functional encryption (NMIFE) to (dynamic) noisy multi-client functional encryption ((Dy)NMCFE), which allows for more flexibility in the number of data holders and analyses, while protecting the privacy of the data holder with fine-grained access through the usage of labels. Following our new definition of DyNMCFE, we present DyNo, a concrete inner-product DyNMCFE scheme. Our scheme captures all the functionalities previously introduced in noisy FE schemes, while being significantly more efficient in terms of space and runtime and fulfilling a stronger security notion by allowing the corruption of clients. To further prove the applicability of DyNMCFE, we present a protocol for PPML based on DyNo. According to this protocol, we train a privacy-preserving logistic regression.

Paperid: 3150, https://arxiv.org/pdf/2505.05810.pdf

Abstract:
As the number of cyberattacks and their particualr nature escalate, the need for effective intrusion detection systems (IDS) has become indispensable for ensuring the security of contemporary networks. Adaptive and more sophisticated threats are often beyond the reach of traditional approaches to intrusion detection and access control. This paper proposes an experimental evaluation of IDS models based on deep learning techniques, focusing on the classification of network traffic into malicious and benign categories. We analyze and retrain an assortment of architectures, such as Convolutional Neural Networks (CNN), Artificial Neural Networks (ANN), and LSTM models. Each model was tested based on a real dataset simulated in a multi-faceted and everchanging network traffic environment. Among the tested models, the best achieved an accuracy of 96 percent, underscoring the potential of deep learning models in improving efficiency and rapid response in IDS systems. The goal of the research is to demonstrate the effectiveness of distinct architectures and their corresponding trade-offs to enhance framework development for adaptive IDS solutions and improve overall network security.

Paperid: 3151, https://arxiv.org/pdf/2505.05712.pdf

Abstract:
The rapid advancement of LLMs (Large Language Models) has established them as a foundational technology for many AI and ML-powered human computer interactions. A critical challenge in this context is the attribution of LLM-generated text -- either to the specific language model that produced it or to the individual user who embedded their identity via a so-called multi-bit watermark. This capability is essential for combating misinformation, fake news, misinterpretation, and plagiarism. One of the key techniques for addressing this challenge is digital watermarking. This work presents a watermarking scheme for LLM-generated text based on Lagrange interpolation, enabling the recovery of a multi-bit author identity even when the text has been heavily redacted by an adversary. The core idea is to embed a continuous sequence of points $(x, f(x))$ that lie on a single straight line. The $x$-coordinates are computed pseudorandomly using a cryptographic hash function $H$ applied to the concatenation of the previous token's identity and a secret key $s_k$. Crucially, the $x$-coordinates do not need to be embedded into the text -- only the corresponding $f(x)$ values are embedded. During extraction, the algorithm recovers the original points along with many spurious ones, forming an instance of the Maximum Collinear Points (MCP) problem, which can be solved efficiently. Experimental results demonstrate that the proposed method is highly effective, allowing the recovery of the author identity even when as few as three genuine points remain after adversarial manipulation.

Paperid: 3152, https://arxiv.org/pdf/2505.05697.pdf

Abstract:
Today's computer systems come with a pre-installed tiny operating system, which is also known as UEFI. UEFI has slowly displaced the former legacy PC-BIOS while the main task has not changed: It is responsible for booting the actual operating system. However, features like the network stack make it also useful for other applications. This paper introduces UEberForensIcs, a UEFI application that makes it easy to acquire memory from the firmware, similar to the well-known cold boot attacks. There is even UEFI code called by the operating system during runtime, and we demonstrate how to utilize this for forensic purposes.

Paperid: 3153, https://arxiv.org/pdf/2505.05015.pdf

Abstract:
Continuous authentication systems leveraging free-text keyboard dynamics offer a promising additional layer of security in a multifactor authentication setup that can be used in a transparent way with no impact on user experience. This study investigates the efficacy of behavioral biometrics by employing an Agent-Based Model (ABM) to simulate diverse typing profiles across mechanical and membrane keyboards. Specifically, we generated synthetic keystroke data from five unique agents, capturing features related to dwell time, flight time, and error rates within sliding 5-second windows updated every second. Two machine learning approaches, One-Class Support Vector Machine (OC-SVM) and Random Forest (RF), were evaluated for user verification. Results revealed a stark contrast in performance: while One-Class SVM failed to differentiate individual users within each group, Random Forest achieved robust intra-keyboard user recognition (Accuracy > 0.7) but struggled to generalize across keyboards for the same user, highlighting the significant impact of keyboard hardware on typing behavior. These findings suggest that: (1) keyboard-specific user profiles may be necessary for reliable authentication, and (2) ensemble methods like RF outperform One-Class SVM in capturing fine-grained user-specific patterns.

Paperid: 3154, https://arxiv.org/pdf/2505.04934.pdf

Abstract:
Blockchain technology, introduced in 2008, has revolutionized data storage and transfer across sectors such as finance, healthcare, intelligent transportation, and the metaverse. However, the proliferation of blockchain systems has led to discrepancies in architectures, consensus mechanisms, and data standards, creating data and value silos that hinder the development of an integrated multi chain ecosystem. Blockchain interoperability (a.k.a cross chain interoperability) has thus emerged as a solution to enable seamless data and asset exchange across disparate blockchains. In this survey, we systematically analyze over 150 high impact sources from academic journals, digital libraries, and grey literature to provide an in depth examination of blockchain interoperability. By exploring the existing methods, technologies, and architectures, we offer a classification of interoperability approaches including Atomic Swaps, Sidechains, Light Clients, and so on, which represent the most comprehensive overview to date. Furthermore, we investigate the convergence of academic research with industry practices, underscoring the importance of collaborative efforts in advancing blockchain innovation. Finally, we identify key strategic insights, challenges, and future research trajectories in this field. Our findings aim to support researchers, policymakers, and industry leaders in understanding and harnessing the transformative potential of blockchain interoperability to address current challenges and drive forward a cohesive multi-chain ecosystem.

Paperid: 3155, https://arxiv.org/pdf/2505.04896.pdf

Abstract:
Side-channel attacks on memory (SCAM) exploit unintended data leaks from memory subsystems to infer sensitive information, posing significant threats to system security. These attacks exploit vulnerabilities in memory access patterns, cache behaviors, and other microarchitectural features to bypass traditional security measures. The purpose of this research is to examine SCAM, classify various attack techniques, and evaluate existing defense mechanisms. It guides researchers and industry professionals in improving memory security and mitigating emerging threats. We begin by identifying the major vulnerabilities in the memory system that are frequently exploited in SCAM, such as cache timing, speculative execution, \textit{Rowhammer}, and other sophisticated approaches. Next, we outline a comprehensive taxonomy that systematically classifies these attacks based on their types, target systems, attack vectors, and adversarial capabilities required to execute them. In addition, we review the current landscape of mitigation strategies, emphasizing their strengths and limitations. This work aims to provide a comprehensive overview of memory-based side-channel attacks with the goal of providing significant insights for researchers and practitioners to better understand, detect, and mitigate SCAM risks.

Paperid: 3156, https://arxiv.org/pdf/2505.04843.pdf

Abstract:
Fast and effective incident response is essential to prevent adversarial cyberattacks. Autonomous Cyber Defense (ACD) aims to automate incident response through Artificial Intelligence (AI) agents that plan and execute actions. Most ACD approaches focus on single-agent scenarios and leverage Reinforcement Learning (RL). However, ACD RL-trained agents depend on costly training, and their reasoning is not always explainable or transferable. Large Language Models (LLMs) can address these concerns by providing explainable actions in general security contexts. Researchers have explored LLM agents for ACD but have not evaluated them on multi-agent scenarios or interacting with other ACD agents. In this paper, we show the first study on how LLMs perform in multi-agent ACD environments by proposing a new integration to the CybORG CAGE 4 environment. We examine how ACD teams of LLM and RL agents can interact by proposing a novel communication protocol. Our results highlight the strengths and weaknesses of LLMs and RL and help us identify promising research directions to create, train, and deploy future teams of ACD agents.

Paperid: 3157, https://arxiv.org/pdf/2505.04784.pdf

Abstract:
The emergence of Generative AI (Gen AI) and Large Language Models (LLMs) has enabled more advanced chatbots capable of human-like interactions. However, these conversational agents introduce a broader set of operational risks that extend beyond traditional cybersecurity considerations. In this work, we propose a novel, instrumented risk-assessment metric that simultaneously evaluates potential threats to three key stakeholders: the service-providing organization, end users, and third parties. Our approach incorporates the technical complexity required to induce erroneous behaviors in the chatbot--ranging from non-induced failures to advanced prompt-injection attacks--as well as contextual factors such as the target industry, user age range, and vulnerability severity. To validate our metric, we leverage Garak, an open-source framework for LLM vulnerability testing. We further enhance Garak to capture a variety of threat vectors (e.g., misinformation, code hallucinations, social engineering, and malicious code generation). Our methodology is demonstrated in a scenario involving chatbots that employ retrieval-augmented generation (RAG), showing how the aggregated risk scores guide both short-term mitigation and longer-term improvements in model design and deployment. The results underscore the importance of multi-dimensional risk assessments in operationalizing secure, reliable AI-driven conversational systems.

Paperid: 3158, https://arxiv.org/pdf/2505.04333.pdf

Abstract:
The transition to post-quantum cryptography (PQC) presents significant challenges for certificate-based identity management in industrial environments, where secure onboarding of devices relies on long-lived and interoperable credentials. This work analyzes the integration of PQC into X.509 certificate structures and compares existing tool support for classical, hybrid, composite, and chameleon certificates. A gap is identified in available open-source solutions, particularly for the generation and validation of hybrid and composite certificates via command-line interfaces. To address this, a proof-of-concept implementation based on the Bouncy Castle library is developed. The tool supports the creation of classical, hybrid (Catalyst), composite, and partially chameleon certificates using PQC algorithms such as ML-DSA and SLH-DSA. It demonstrates compatibility with standard X.509 workflows and aims to support headless operation and constrained platforms typical of industrial systems. The implementation is modular, publicly available, and intended to facilitate further research and testing of PQC migration strategies in practice. A comparison with OpenSSL-based solutions highlights current limitations in standardization, toolchain support, and algorithm coverage.

Paperid: 3159, https://arxiv.org/pdf/2505.04308.pdf

Abstract:
Website information security has become a critical concern in the digital age. This article explores the evolution of website information security, examining its historical development, current practices, and future directions. The early beginnings from the 1960s to the 1980s laid the groundwork for modern cybersecurity, with the development of ARPANET, TCP/IP, public-key cryptography, and the first antivirus programs. The 1990s marked a transformative era, driven by the commercialization of the Internet and the emergence of web-based services. As the Internet grew, so did the range and sophistication of cyber threats, leading to advancements in security technologies such as the Secure Sockets Layer (SSL) protocol, password protection, and firewalls. Current practices in website information security involve a multi-layered approach, including encryption, secure coding practices, regular security audits, and user education. The future of website information security is expected to be shaped by emerging technologies such as artificial intelligence, blockchain, and quantum computing, as well as the increasing importance of international cooperation and standardization efforts. As cyber threats continue to evolve, ongoing research and innovation in website information security will be essential to protect sensitive information and maintain trust in the digital world.

Paperid: 3160, https://arxiv.org/pdf/2505.04265.pdf

Abstract:
This, with the ever-increasing sophistication of cyberwar, calls for novel solutions. In this regard, Large Language Models (LLMs) have emerged as a highly promising tool for defensive and offensive cybersecurity-related strategies. While existing literature has focused much on the defensive use of LLMs, when it comes to their offensive utilization, very little has been reported-namely, concerning Vulnerability Assessment (VA) report validation. Consequentially, this paper tries to fill that gap by investigating the capabilities of LLMs in automating and improving the validation process of the report of the VA. From the critical review of the related literature, this paper hereby proposes a new approach to using the LLMs in the automation of the analysis and within the validation process of the report of the VA that could potentially reduce the number of false positives and generally enhance efficiency. These results are promising for LLM automatization for improving validation on reports coming from VA in order to improve accuracy while reducing human effort and security postures. The contribution of this paper provides further evidence about the offensive and defensive LLM capabilities and therefor helps in devising more appropriate cybersecurity strategies and tools accordingly.

Paperid: 3161, https://arxiv.org/pdf/2505.04181.pdf

Abstract:
As image processing systems proliferate, privacy concerns intensify given the sensitive personal information contained in images. This paper examines privacy challenges in image processing and surveys emerging privacy-preserving techniques including differential privacy, secure multiparty computation, homomorphic encryption, and anonymization. Key applications with heightened privacy risks include healthcare, where medical images contain patient health data, and surveillance systems that can enable unwarranted tracking. Differential privacy offers rigorous privacy guarantees by injecting controlled noise, while MPC facilitates collaborative analytics without exposing raw data inputs. Homomorphic encryption enables computations on encrypted data and anonymization directly removes identifying elements. However, balancing privacy protections and utility remains an open challenge. Promising future directions identified include quantum-resilient cryptography, federated learning, dedicated hardware, and conceptual innovations like privacy by design. Ultimately, a holistic effort combining technological innovations, ethical considerations, and policy frameworks is necessary to uphold the fundamental right to privacy as image processing capabilities continue advancing rapidly.

Paperid: 3162, https://arxiv.org/pdf/2505.04123.pdf

Abstract:
Doubtlessly, the immersive technologies have potential to ease people's life and uplift economy, however the obvious data privacy risks cannot be ignored. For example, a participant wears a 3D headset device which detects participant's head motion to track the pose of participant's head to match the orientation of camera with participant's eyes positions in the real-world. In a preliminary study, researchers have proved that the voice command features on such headsets could lead to major privacy leakages. By analyzing the facial dynamics captured with the motion sensors, the headsets suffer security vulnerabilities revealing a user's sensitive speech without user's consent. The psychography data (such as voice command features, facial dynamics, etc.) is sensitive data and it should not be leaked out of the device without users consent else it is a privacy breach. To the best of our literature review, the work done in this particular research problem is very limited. Motivated from this, we develop a simple technical framework to mitigate sensitive data (or biometric data) privacy leaks in immersive technology domain. The performance evaluation is conducted in a robust way using six data sets, to show that the proposed solution is effective and feasible to prevent this issue.

Paperid: 3163, https://arxiv.org/pdf/2505.03945.pdf

Abstract:
Cloud security concerns have been greatly realized in recent years due to the increase of complicated threats in the computing world. Many traditional solutions do not work well in real-time to detect or prevent more complex threats. Artificial intelligence is today regarded as a revolution in determining a protection plan for cloud data architecture through machine learning, statistical visualization of computing infrastructure, and detection of security breaches followed by counteraction. These AI-enabled systems make work easier as more network activities are scrutinized, and any anomalous behavior that might be a precursor to a more serious breach is prevented. This paper examines ways AI can enhance cloud security by applying predictive analytics, behavior-based security threat detection, and AI-stirring encryption. It also outlines the problems of the previous security models and how AI overcomes them. For a similar reason, issues like data privacy, biases in the AI model, and regulatory compliance are also covered. So, AI improves the protection of cloud computing contexts; however, more efforts are needed in the subsequent phases to extend the technology's reliability, modularity, and ethical aspects. This means that AI can be blended with other new computing technologies, including blockchain, to improve security frameworks further. The paper discusses the current trends in securing cloud data architecture using AI and presents further research and application directions.

Paperid: 3164, https://arxiv.org/pdf/2505.03843.pdf

Abstract:
As restaking protocols gain adoption across blockchain ecosystems, there is a need for Actively Validated Services (AVSs) to span multiple Shared Security Providers (SSPs). This leads to stake fragmentation which introduces new complications where an adversary may compromise an AVS by targeting its weakest SSP. In this paper, we formalize the Multiple SSP Problem and analyze two architectures : an isolated fragmented model called Model $\mathbb{M}$ and a shared unified model called Model $\mathbb{S}$, through a convex optimization and game-theoretic lens. We derive utility bounds, attack cost conditions, and market equilibrium that describes protocol security for both models. Our results show that while Model $\mathbb{M}$ offers deployment flexibility, it inherits lowest-cost attack vulnerabilities, whereas Model $\mathbb{S}$ achieves tighter security guarantees through single validator sets and aggregated slashing logic. We conclude with future directions of work including an incentive-compatible stake rebalancing allocation in restaking ecosystems.

Paperid: 3165, https://arxiv.org/pdf/2505.03831.pdf

Abstract:
Deep learning has revolutionized email filtering, which is critical to protect users from cyber threats such as spam, malware, and phishing. However, the increasing sophistication of adversarial attacks poses a significant challenge to the effectiveness of these filters. This study investigates the impact of adversarial attacks on deep learning-based spam detection systems using real-world datasets. Six prominent deep learning models are evaluated on these datasets, analyzing attacks at the word, character sentence, and AI-generated paragraph-levels. Novel scoring functions, including spam weights and attention weights, are introduced to improve attack effectiveness. This comprehensive analysis sheds light on the vulnerabilities of spam filters and contributes to efforts to improve their security against evolving adversarial threats.

Paperid: 3166, https://arxiv.org/pdf/2505.03817.pdf

Abstract:
This paper presents a holistic approach to attacker preference modeling from system-level audit logs using inverse reinforcement learning (IRL). Adversary modeling is an important capability in cybersecurity that lets defenders characterize behaviors of potential attackers, which enables attribution to known cyber adversary groups. Existing approaches rely on documenting an ever-evolving set of attacker tools and techniques to track known threat actors. Although attacks evolve constantly, attacker behavioral preferences are intrinsic and less volatile. Our approach learns the behavioral preferences of cyber adversaries from forensics data on their tools and techniques. We model the attacker as an expert decision-making agent with unknown behavioral preferences situated in a computer host. We leverage attack provenance graphs of audit logs to derive a state-action trajectory of the attack. We test our approach on open datasets of audit logs containing real attack data. Our results demonstrate for the first time that low-level forensics data can automatically reveal an adversary's subjective preferences, which serves as an additional dimension to modeling and documenting cyber adversaries. Attackers' preferences tend to be invariant despite their different tools and indicate predispositions that are inherent to the attacker. As such, these inferred preferences can potentially serve as unique behavioral signatures of attackers and improve threat attribution.

Paperid: 3167, https://arxiv.org/pdf/2505.03555.pdf

Abstract:
Symbolic execution is a powerful program analysis technique that can formally reason the correctness of program behaviors and detect software bugs. It can systematically explore the execution paths of the tested program. But it suffers from an inherent limitation: path explosion. Path explosion occurs when symbolic execution encounters an overwhelming number (exponential to the program size) of paths that need to be symbolically reasoned. It severely impacts the scalability and performance of symbolic execution. To tackle this problem, previous works leverage various heuristics to prioritize paths for symbolic execution. They rank the exponential number of paths using static rules or heuristics and explore the paths with the highest rank. However, in practice, these works often fail to generalize to diverse programs. In this work, we propose a novel and effective path prioritization technique with path cover, named Empc. Our key insight is that not all paths need to be symbolically reasoned. Unlike traditional path prioritization, our approach leverages a small subset of paths as a minimum path cover (MPC) that can cover all code regions of the tested programs. To encourage diversity in path prioritization, we compute multiple MPCs. We then guide the search for symbolic execution on the small number of paths inside multiple MPCs rather than the exponential number of paths. We implement our technique Empc based on KLEE. We conduct a comprehensive evaluation of Empc to investigate its performance in code coverage, bug findings, and runtime overhead. The evaluation shows that Empc can cover 19.6% more basic blocks than KLEE's best search strategy and 24.4% more lines compared to the state-of-the-art work cgs. Empc also finds 24 more security violations than KLEE's best search strategy. Meanwhile, Empc can significantly reduce the memory usage of KLEE by up to 93.5%.

Paperid: 3168, https://arxiv.org/pdf/2505.03208.pdf

Abstract:
The advancement and adoption of Artificial Intelligence (AI) models across diverse domains have transformed the way we interact with technology. However, it is essential to recognize that while AI models have introduced remarkable advancements, they also present inherent challenges such as their vulnerability to adversarial attacks. The current work proposes a novel defense mechanism against one of the most significant attack vectors of AI models - the backdoor attack via data poisoning of training datasets. In this defense technique, an integrated approach that combines chaos theory with manifold learning is proposed. A novel metric - Precision Matrix Dependency Score (PDS) that is based on the conditional variance of Neurochaos features is formulated. The PDS metric has been successfully evaluated to distinguish poisoned samples from non-poisoned samples across diverse datasets.

Paperid: 3169, https://arxiv.org/pdf/2505.02493.pdf

Abstract:
The decentralized and unregulated nature of cryptocurrencies, combined with their monetary value, has made them a vehicle for various illicit activities. One such activity is cryptojacking, an attack that uses stolen computing resources to mine cryptocurrencies without consent for profit. In-browser cryptojacking malware exploits high-performance web technologies like WebAssembly to mine cryptocurrencies directly within the browser without file downloads. Although existing methods for cryptomining detection report high accuracy and low overhead, they are often susceptible to various forms of obfuscation, and due to the limited variety of cryptomining scripts in the wild, standard code obfuscation methods present a natural and appealing solution to avoid detection. To address these limitations, we propose using instruction-level data-flow graphs to detect cryptomining behavior. Data-flow graphs offer detailed structural insights into a program's computations, making them suitable for characterizing proof-of-work algorithms, but they can be difficult to analyze due to their large size and susceptibility to noise and fragmentation under obfuscation. We present two techniques to simplify and compare data-flow graphs: (1) a graph simplification algorithm to reduce the computational burden of processing large and granular data-flow graphs while preserving local substructures; and (2) a subgraph similarity measure, the n-fragment inclusion score, based on fragment inclusion that is robust against noise and obfuscation. Using data-flow graphs as computation fingerprints, our detection framework PoT (Proof-of-Theft) was able to achieve high detection accuracy against standard obfuscations, outperforming existing detection methods. Moreover, PoT uses generic data-flow properties that can be applied to other platforms more susceptible to cryptojacking such as servers and data centers.

Paperid: 3170, https://arxiv.org/pdf/2505.02409.pdf

Abstract:
The sharing of information between agencies is effective in dealing with cross-jurisdictional criminal activities; however, such sharing is often restricted due to concerns about data privacy, ownership, and compliance. Towards this end, this work has introduced a privacy-preserving federated search system that allows law enforcement agencies to conduct queries on encrypted criminal databases by utilizing Homomorphic Encryption (HE). The key innovation here is the ability to execute encrypted queries across distributed databases, without the decryption of the data, thus preserving end-to-end confidentiality. In essence, this approach meets stringent privacy requirements in the interests of national security and regulatory compliance. The system incorporates the CKKS and BFV scheme embedded within TenSEAL, with each agency holding its key pair in a centralized key management table. In this federated search, encrypted queries are computed on the server side, and only authorized clients can decrypt the computed results. The matching of agencies is flexible for working in real-time while at the same time being secure and scalable while preserving control over data and the integrity of the process. Experimental results demonstrate the model. This paper also provide the implementation code and other details.

Paperid: 3171, https://arxiv.org/pdf/2505.02362.pdf

Abstract:
Email spam detection is a critical task in modern communication systems, essential for maintaining productivity, security, and user experience. Traditional machine learning and deep learning approaches, while effective in static settings, face significant limitations in adapting to evolving spam tactics, addressing class imbalance, and managing data scarcity. These challenges necessitate innovative approaches that reduce dependency on extensive labeled datasets and frequent retraining. This study investigates the effectiveness of Zero-Shot Learning using FLAN-T5, combined with advanced Natural Language Processing (NLP) techniques such as BERT for email spam detection. By employing BERT to preprocess and extract critical information from email content, and FLAN-T5 to classify emails in a Zero-Shot framework, the proposed approach aims to address the limitations of traditional spam detection systems. The integration of FLAN-T5 and BERT enables robust spam detection without relying on extensive labeled datasets or frequent retraining, making it highly adaptable to unseen spam patterns and adversarial environments. This research highlights the potential of leveraging zero-shot learning and NLPs for scalable and efficient spam detection, providing insights into their capability to address the dynamic and challenging nature of spam detection tasks.

Paperid: 3172, https://arxiv.org/pdf/2505.02349.pdf

Abstract:
Code cloning is a common practice in software development, but it poses significant security risks by propagating vulnerabilities across cloned segments. To address this challenge, we introduce srcVul, a scalable, precise detection approach that combines program slicing with Locality-Sensitive Hashing to identify vulnerable code clones and recommend patches. srcVul builds a database of vulnerability-related slices by analyzing known vulnerable programs and their corresponding patches, indexing each slice's unique structural characteristics as a vulnerability slicing vector. During clone detection, srcVul efficiently matches slicing vectors from target programs with those in the database, recommending patches upon identifying similarities. Our evaluation of srcVul against three state-of-the-art vulnerable clone detectors demonstrates its accuracy, efficiency, and scalability, achieving 91% precision and 75% recall on established vulnerability databases and open-source repositories. These results highlight srcVul's effectiveness in detecting complex vulnerability patterns across diverse codebases.

Paperid: 3173, https://arxiv.org/pdf/2505.02231.pdf

Abstract:
This research paper delves into the field of autonomous vehicle technology, examining the vulnerabilities inherent in each component of these transformative vehicles. Autonomous vehicles (AVs) are revolutionizing transportation by seamlessly integrating advanced functionalities such as sensing, perception, planning, decision-making, and control. However, their reliance on interconnected systems and external communication interfaces renders them susceptible to cybersecurity threats. This research endeavors to develop a comprehensive threat model for AV systems, employing OWASP Threat Dragon and the STRIDE framework. This model categorizes threats into Spoofing, Tampering, Repudiation, Information Disclosure, Denial of Service (DoS), and Elevation of Privilege. A systematic risk assessment is conducted to evaluate vulnerabilities across various AV components, including perception modules, planning systems, control units, and communication interfaces.

Paperid: 3174, https://arxiv.org/pdf/2505.01941.pdf

Abstract:
The rise of social media financial influencers (finfluencers) has significantly transformed the personal finance landscape, making financial advice and insights more accessible to a broader and younger audience. By leveraging digital platforms, these influencers have contributed to the democratization of financial literacy. However, the line between education and promotion is often blurred, as many finfluencers lack formal financial qualifications, raising concerns about the accuracy and reliability of the information they share. This study investigates the patterns and behaviours of finfluencers in the UK on TikTok, focusing not on individual actions but on broader trends and the interactions between influencers and their followers. The aim is to identify common engagement patterns and propose guidelines that can help protect the public from potential financial harm. Specifically, the paper contributes a detailed analysis of finfluencer content categorization, sentiment trends, and the prevalence and role of disclaimers, offering empirical insights that inform recommendations for safer and more transparent financial communication on social media.

Paperid: 3175, https://arxiv.org/pdf/2505.01788.pdf

Abstract:
The widespread adoption of Artificial Intelligence (AI) has been driven by significant advances in intelligent system research. However, this progress has raised concerns about data privacy, leading to a growing awareness of the need for privacy-preserving AI. In response, there has been a seismic shift in interest towards the leading paradigm for training Machine Learning (ML) models on decentralized data silos while maintaining data privacy, Federated Learning (FL). This research paper presents a comprehensive performance analysis of a cutting-edge approach to personalize ML model while preserving privacy achieved through Privacy Preserving Machine Learning with the innovative framework of Federated Personalized Learning (PPMLFPL). Regarding the increasing concerns about data privacy, this study evaluates the effectiveness of PPMLFPL addressing the critical balance between personalized model refinement and maintaining the confidentiality of individual user data. According to our analysis, Adaptive Personalized Cross-Silo Federated Learning with Differential Privacy (APPLE+DP) offering efficient execution whereas overall, the use of the Adaptive Personalized Cross-Silo Federated Learning with Homomorphic Encryption (APPLE+HE) algorithm for privacy-preserving machine learning tasks in federated personalized learning settings is strongly suggested. The results offer valuable insights creating it a promising scope for future advancements in the field of privacy-conscious data-driven technologies.

Paperid: 3176, https://arxiv.org/pdf/2505.01484.pdf

Abstract:
Due to the increasing number of tasks that are solved on remote servers, identifying and classifying traffic is an important task to reduce the load on the server. There are various methods for classifying traffic. This paper discusses machine learning models for solving this problem. However, such ML models are also subject to attacks that affect the classification result of network traffic. To protect models, we proposed a solution based on an autoencoder

Paperid: 3179, https://arxiv.org/pdf/2505.01287.pdf

Abstract:
How can we generate a permutation of the numbers $1$ through $n$ so that it is hard to guess the next element given the history so far? The twist is that the generator of the permutation (the ``Dealer") has limited memory, while the ``Guesser" has unlimited memory. With unbounded memory (actually $n$ bits suffice), the Dealer can generate a truly random permutation where $\ln n$ is the expected number of correct guesses. Our main results establish tight bounds for the relationship between the guessing probability and the memory $m$ required to generate the permutation. We suggest a method for an $m$-bit Dealer that operates in constant time per turn, and any Guesser can pick correctly only $O(n/m+\log m)$ cards in expectation. The method is fully transparent, requiring no hidden information from the Dealer (i.e., it is "open book" or "whitebox"). We show that this bound is the best possible, even with secret memory. Specifically, for any $m$-bit Dealer, there is a (computationally powerful) guesser that achieves $Î©(n/m+\log m)$ correct guesses in expectation. We point out that the assumption that the Guesser is computationally powerful is necessary: under cryptographic assumptions, there exists a low-memory Dealer that can fool any computationally bounded guesser. We also give an $O(n)$ bit memory Dealer that generates perfectly random permutations and operates in constant time per turn.

Paperid: 3180, https://arxiv.org/pdf/2505.01139.pdf

Abstract:
The InterPlanetary File System (IPFS) is a decentralized peer-to-peer (P2P) storage that relies on Kademlia, a Distributed Hash Table (DHT) structure commonly used in P2P systems for its proved scalability. However, DHTs are known to be vulnerable to Sybil attacks, in which a single entity controls multiple malicious nodes. Recent studies have shown that IPFS is affected by a passive content eclipse attack, leveraging Sybils, in which adversarial nodes hide received indexed information from other peers, making the content appear unavailable. Fortunately, the latest mitigation strategy coupling an attack detection based on statistical tests and a wider publication strategy upon detection was able to circumvent it. In this work, we present a new active attack, with malicious nodes responding with semantically correct but intentionally false data, exploiting both an optimized placement of Sybils to stay below the detection threshold and an early trigger of the content discovery termination in Kubo, the main IPFS implementation. Our attack achieves to completely eclipse content on the latest Kubo release. When evaluated against the most recent known mitigation, it successfully denies access to the target content in approximately 80\% of lookup attempts. To address this vulnerability, we propose a new mitigation called SR-DHT-Store, which enables efficient, Sybil-resistant content publication without relying on attack detection but instead on a systematic and precise use of region-based queries, defined by a dynamically computed XOR distance to the target ID. SR-DHT-Store can be combined with other defense mechanisms resulting in a defense strategy that completely mitigates both passive and active Sybil attacks at a lower overhead, while allowing an incremental deployment.

Paperid: 3181, https://arxiv.org/pdf/2505.00977.pdf

Abstract:
Generating high-quality steganographic text is a fundamental challenge in the field of generative linguistic steganography. This challenge arises primarily from two aspects: firstly, the capabilities of existing models in text generation are limited; secondly, embedding algorithms fail to effectively mitigate the negative impacts of sensitive information's properties, such as semantic content or randomness. Specifically, to ensure that the recipient can accurately extract hidden information, embedding algorithms often have to consider selecting candidate words with relatively low probabilities. This phenomenon leads to a decrease in the number of high-probability candidate words and an increase in low-probability candidate words, thereby compromising the semantic coherence and logical fluency of the steganographic text and diminishing the overall quality of the generated steganographic material. To address this issue, this paper proposes a novel embedding algorithm, character-based diffusion embedding algorithm (CDEA). Unlike existing embedding algorithms that strive to eliminate the impact of sensitive information's properties on the generation process, CDEA leverages sensitive information's properties. It enhances the selection frequency of high-probability candidate words in the candidate pool based on general statistical properties at the character level and grouping methods based on power-law distributions, while reducing the selection frequency of low-probability candidate words in the candidate pool. Furthermore, to ensure the effective transformation of sensitive information in long sequences, we also introduce the XLNet model. Experimental results demonstrate that the combination of CDEA and XLNet significantly improves the quality of generated steganographic text, particularly in terms of perceptual-imperceptibility.

Paperid: 3182, https://arxiv.org/pdf/2505.00888.pdf

Abstract:
As new project upgrading the blockchain industry, novel forms of attack challenges developers to rethink about the design of their innovations. In the growth stage of the development, Decentralized Autonomous Organizations (DAO) introduces different approaches in managing fund through voting in governance tokens. However, relying on tokens as a weight for voting introduces opportunities for hackers to manipulate voting results through flash loan, allowing malicious proposals - fund withdrawal from DAO to hacker's wallet - to execute through the smart contract. In this research, we learned different defense mechanism against the flash loan attack, and their weakness in accessibility that compromise the security of different blockchain projects. Based on our observation, we propose a new defensing structure and apply it with cases.

Paperid: 3183, https://arxiv.org/pdf/2505.00858.pdf

Abstract:
This study investigates a duality approach to information leak detection in the generalized Kirchhoff-Law-Johnson-Noise secure key exchange scheme proposed by Vadai, Mingesz, and Gingl (VMG-KLJN). While previous work by Chamon and Kish sampled voltages at zero-current instances, this research explores sampling currents at zero-voltage crossings. The objective is to determine if this dual approach can reveal information leaks in non-equilibrium KLJN systems. Results indicate that the duality method successfully detects information leaks, further supporting the necessity of thermal equilibrium for unconditional security in KLJN systems. Our findings confirm that the duality method successfully detects information leaks, with results closely mirroring those of Chamon and Kish, showing comparable vulnerabilities in non-equilibrium conditions. These results further support the necessity of thermal equilibrium for unconditional security in the KLJN scheme.

Paperid: 3184, https://arxiv.org/pdf/2505.00487.pdf

Abstract:
This article describes the process of creating a script and conducting an analytical study of a dataset using the DeepMIMO emulator. An advertorial attack was carried out using the FGSM method to maximize the gradient. A comparison is made of the effectiveness of binary classifiers in the task of detecting distorted data. The dynamics of changes in the quality indicators of the regression model were analyzed in conditions without adversarial attacks, during an adversarial attack and when the distorted data was isolated. It is shown that an adversarial FGSM attack with gradient maximization leads to an increase in the value of the MSE metric by 33% and a decrease in the R2 indicator by 10% on average. The LightGBM binary classifier effectively identifies data with adversarial anomalies with 98% accuracy. Regression machine learning models are susceptible to adversarial attacks, but rapid analysis of network traffic and data transmitted over the network makes it possible to identify malicious activity

Paperid: 3185, https://arxiv.org/pdf/2505.00480.pdf

Abstract:
This paper proposes a decentralized, blockchain-based system for the publication of Common Vulnerabilities and Exposures (CVEs), aiming to mitigate the limitations of the current centralized model primarily overseen by MITRE. The proposed architecture leverages a permissioned blockchain, wherein only authenticated CVE Numbering Authorities (CNAs) are authorized to submit entries. This ensures controlled write access while preserving public transparency. By incorporating smart contracts, the system supports key features such as embargoed disclosures and decentralized governance. We evaluate the proposed model in comparison with existing practices, highlighting its advantages in transparency, trust decentralization, and auditability. A prototype implementation using Hyperledger Fabric is presented to demonstrate the feasibility of the approach, along with a discussion of its implications for the future of vulnerability disclosure.

Paperid: 3186, https://arxiv.org/pdf/2505.00061.pdf

Abstract:
This study examines vulnerabilities in transformer-based automated short-answer grading systems used in medical education, with a focus on how these systems can be manipulated through adversarial gaming strategies. Our research identifies three main types of gaming strategies that exploit the system's weaknesses, potentially leading to false positives. To counteract these vulnerabilities, we implement several adversarial training methods designed to enhance the systems' robustness. Our results indicate that these methods significantly reduce the susceptibility of grading systems to such manipulations, especially when combined with ensemble techniques like majority voting and ridge regression, which further improve the system's defense against sophisticated adversarial inputs. Additionally, employing large language models such as GPT-4 with varied prompting techniques has shown promise in recognizing and scoring gaming strategies effectively. The findings underscore the importance of continuous improvements in AI-driven educational tools to ensure their reliability and fairness in high-stakes settings.

Paperid: 3187, https://arxiv.org/pdf/2504.21520.pdf

Abstract:
Function detection is a well-known problem in binary analysis. While previous research has primarily focused on Linux/ELF, Windows/PE binaries have been overlooked or only partially considered. This paper introduces FuncPEval, a new dataset for Windows x86 and x64 PE files, featuring Chromium and the Conti ransomware, along with ground truth data for 1,092,820 function starts. Utilizing FuncPEval, we evaluate five heuristics-based (Ghidra, IDA, Nucleus, rev.ng, SMDA) and three machine-learning-based (DeepDi, RNN, XDA) function start detection tools. Among the tested tools, IDA achieves the highest F1-score (98.44%) for Chromium x64, while DeepDi closely follows (97%) but stands out as the fastest by a significant margin. Working towards explainability, we examine the impact of padding between functions on the detection results. Our analysis shows that all tested tools, except rev.ng, are susceptible to randomized padding. The randomized padding significantly diminishes the effectiveness for the RNN, XDA, and Nucleus. Among the learning-based tools, DeepDi exhibits the least sensitivity and demonstrates overall the fastest performance, while Nucleus is the most adversely affected among non-learning-based tools. In addition, we improve the recurrent neural network (RNN) proposed by Shin et al. and enhance the XDA tool, increasing the F1-score by approximately 10%.

Paperid: 3188, https://arxiv.org/pdf/2504.21045.pdf

Abstract:
According to the Open Web Application Security Project (OWASP), Cross-Site Scripting (XSS) is a critical security vulnerability. Despite decades of research, XSS remains among the top 10 security vulnerabilities. Researchers have proposed various techniques to protect systems from XSS attacks, with machine learning (ML) being one of the most widely used methods. An ML model is trained on a dataset to identify potential XSS threats, making its effectiveness highly dependent on the size and diversity of the training data. A variation of XSS is obfuscated XSS, where attackers apply obfuscation techniques to alter the code's structure, making it challenging for security systems to detect its malicious intent. Our study's random forest model was trained on traditional (non-obfuscated) XSS data achieved 99.8% accuracy. However, when tested against obfuscated XSS samples, accuracy dropped to 81.9%, underscoring the importance of training ML models with obfuscated data to improve their effectiveness in detecting XSS attacks. A significant challenge is to generate highly complex obfuscated code despite the availability of several public tools. These tools can only produce obfuscation up to certain levels of complexity. In our proposed system, we fine-tune a Large Language Model (LLM) to generate complex obfuscated XSS payloads automatically. By transforming original XSS samples into diverse obfuscated variants, we create challenging training data for ML model evaluation. Our approach achieved a 99.5% accuracy rate with the obfuscated dataset. We also found that the obfuscated samples generated by the LLMs were 28.1% more complex than those created by other tools, significantly improving the model's ability to handle advanced XSS attacks and making it more effective for real-world application security.

Paperid: 3189, https://arxiv.org/pdf/2504.21029.pdf

Abstract:
We propose a robust transformer architecture designed to prevent prompt injection attacks and ensure secure, reliable response generation. Our PICO (Prompt Isolation and Cybersecurity Oversight) framework structurally separates trusted system instructions from untrusted user inputs through dual channels that are processed independently and merged only by a controlled, gated fusion mechanism. In addition, we integrate a specialized Security Expert Agent within a Mixture-of-Experts (MoE) framework and incorporate a Cybersecurity Knowledge Graph (CKG) to supply domain-specific reasoning. Our training design further ensures that the system prompt branch remains immutable while the rest of the network learns to handle adversarial inputs safely. This PICO framework is presented via a general mathematical formulation, then elaborated in terms of the specifics of transformer architecture, and fleshed out via hypothetical case studies including Policy Puppetry attacks. While the most effective implementation may involve training transformers in a PICO-based way from scratch, we also present a cost-effective fine-tuning approach.

Paperid: 3190, https://arxiv.org/pdf/2504.20827.pdf

Abstract:
Smart homes represent intelligent environments where interconnected devices gather information, enhancing users living experiences by ensuring comfort, safety, and efficient energy management. To enhance the quality of life, companies in the smart device industry collect user data, including activities, preferences, and power consumption. However, sharing such data necessitates privacy-preserving practices. This paper introduces a robust method for secure sharing of data to service providers, grounded in differential privacy (DP). This empowers smart home residents to contribute usage statistics while safeguarding their privacy. The approach incorporates the Synthetic Minority Oversampling technique (SMOTe) and seamlessly integrates Gaussian noise to generate synthetic data, enabling data and statistics sharing while preserving individual privacy. The proposed method employs the SMOTe algorithm and applies Gaussian noise to generate data. Subsequently, it employs a k-anonymity function to assess reidentification risk before sharing the data. The simulation outcomes demonstrate that our method delivers strong performance in safeguarding privacy and in accuracy, recall, and f-measure metrics. This approach is particularly effective in smart homes, offering substantial utility in privacy at a reidentification risk of 30%, with Gaussian noise set to 0.3, SMOTe at 500%, and the application of a k-anonymity function with k = 2. Additionally, it shows a high classification accuracy, ranging from 90% to 98%, across various classification techniques.

Paperid: 3191, https://arxiv.org/pdf/2504.20689.pdf

Abstract:
Medical image encryption plays an important role in protecting sensitive health information from cyberattacks and unauthorized access. In this paper, we introduce a secure and robust encryption scheme that is multi-modality compatible and works with MRI, CT, X-Ray and Ultrasound images for different anatomical region of interest. The method utilizes hyperchaotic signals and multi-level diffusion methods. The encryption starts by taking DICOM image as input, then padding to increase the image area. Chaotic signals are produced by a logistic map and are used to carry out pixel random permutation. Then, multi-level diffusion is carried out by 4-bit, 8-bit, radial and adjacent diffusion to provide high randomness and immunity against statistical attacks. In addition, we propose a captcha-based authentication scheme to further improve security. An algorithm generates alphanumeric captcha-based image which is encrypted with the same chaotic and diffusion methods as the medical image. Both encrypted images(DICOM image and captcha image) are then superimposed to create a final encrypted output, essentially integrating dual-layer security. Upon decryption, the superimposed image is again decomposed back to original medical and captcha images, and inverse operations are performed to obtain the original unencrypted data. Experimental results show that the proposed method provides strong protection with no loss in image integrity, thereby reducing unauthorized data breaches to a significant level. The dual-encryption approach not only protects the confidentiality of the medical images but also enhances authentication by incorporating captcha.

Paperid: 3192, https://arxiv.org/pdf/2504.20637.pdf

Abstract:
Protocol dialects are methods for modifying protocols that provide light-weight security, especially against easy attacks that can lead to more serious ones. A lingo is a dialect's key security component by making attackers unable to "speak" the lingo. A lingo's "talk" changes all the time, becoming a moving target for attackers. We present several kinds of lingo transformations and compositions to generate stronger lingos from simpler ones, thus making dialects more secure.

Paperid: 3193, https://arxiv.org/pdf/2504.20612.pdf

Abstract:
The rapid advancement of Large Language Models (LLMs) has enhanced software development processes, minimizing the time and effort required for coding and enhancing developer productivity. However, despite their potential benefits, code generated by LLMs has been shown to generate insecure code in controlled environments, raising critical concerns about their reliability and security in real-world applications. This paper uses predefined security parameters to evaluate the security compliance of LLM-generated code across multiple models, such as ChatGPT, DeepSeek, Claude, Gemini and Grok. The analysis reveals critical vulnerabilities in authentication mechanisms, session management, input validation and HTTP security headers. Although some models implement security measures to a limited extent, none fully align with industry best practices, highlighting the associated risks in automated software development. Our findings underscore that human expertise is crucial to ensure secure software deployment or review of LLM-generated code. Also, there is a need for robust security assessment frameworks to enhance the reliability of LLM-generated code in real-world applications.

Paperid: 3194, https://arxiv.org/pdf/2504.20414.pdf

Abstract:
Searchable Symmetric Encryption (SSE) enables efficient search capabilities over encrypted data, allowing users to maintain privacy while utilizing cloud storage. However, SSE schemes are vulnerable to leakage attacks that exploit access patterns, search frequency, and volume information. Existing studies frequently assume that adversaries possess a substantial fraction of the encrypted dataset to mount effective inference attacks, implying there is a database leakage of such documents, thus, an assumption that may not hold in real-world scenarios. In this work, we investigate the feasibility of enhancing leakage attacks under a more realistic threat model in which adversaries have access to minimal leaked data. We propose a novel approach that leverages large language models (LLMs), specifically GPT-4 variants, to generate synthetic documents that statistically and semantically resemble the real-world dataset of Enron emails. Using the email corpus as a case study, we evaluate the effectiveness of synthetic data generated via random sampling and hierarchical clustering methods on the performance of the SAP (Search Access Pattern) keyword inference attack restricted to token volumes only. Our results demonstrate that, while the choice of LLM has limited effect, increasing dataset size and employing clustering-based generation significantly improve attack accuracy, achieving comparable performance to attacks using larger amounts of real data. We highlight the growing relevance of LLMs in adversarial contexts.

Paperid: 3195, https://arxiv.org/pdf/2504.20350.pdf

Abstract:
In software development, privacy preservation has become essential with the rise of privacy concerns and regulations such as GDPR and CCPA. While several tools, guidelines, methods, methodologies, and frameworks have been proposed to support developers embedding privacy into software applications, most of them are proofs-of-concept without empirical evaluations, making their practical applicability uncertain. These solutions should be evaluated for different types of scenarios (e.g., industry settings such as rapid software development environments, teams with different privacy knowledge, etc.) to determine what their limitations are in various industry settings and what changes are required to refine current solutions before putting them into industry and developing new developer-supporting approaches. For that, a thorough review of empirically evaluated current solutions will be very effective. However, the existing secondary studies that examine the available developer support provide broad overviews but do not specifically analyze empirically evaluated solutions and their limitations. Therefore, this Systematic Literature Review (SLR) aims to identify and analyze empirically validated solutions that are designed to help developers in privacy-preserving software development. The findings will provide valuable insights for researchers to improve current privacy-preserving solutions and for practitioners looking for effective and validated solutions to embed privacy into software development.

Paperid: 3196, https://arxiv.org/pdf/2504.19640.pdf

Abstract:
Railways provide a critical service and operate under strict regulatory frameworks for implementing changes or upgrades. Despite their impact on the public, these frameworks do not define means or mechanisms for transparency towards the public, leading to reduced trust and complex tracking processes. We analyse the German guideline for railway-infrastructural modifications from proposal to approval, using the guideline as a motivating example for modelling decisions in processes using digital signatures and zero-knowledge proofs. Therein, a verifier can verify that a process was executed correctly by the involved parties and according to specification without learning confidential information such as trade secrets or identities of the participants. We validate our system by applying it to the railway process, demonstrating how it realises various rules, and we evaluate its scalability with increased process complexities. Our solution is not railway-specific but also applicable to other contexts, helping leverage zero-knowledge proofs for public transparency and trust.

Paperid: 3197, https://arxiv.org/pdf/2504.19222.pdf

Abstract:
The NIS2 directive requires EU Member States to ensure a consistently high level of cybersecurity by setting risk-management measures for essential and important entities. Evaluations are necessary to assess whether the required security level is met. This involves understanding the needs and goals of different personas defined by NIS2, who benefit from evaluation results. In this paper, we consider how NIS2 user stories support the evaluation of the level of information security in organizations. Using requirements elicitation principles, we extracted the legal requirements from NIS2 from our narrowed scope, identified six key personas and their goals, formulated user stories based on the gathered information, and validated the usability and relevance of the user stories with security evaluation instruments or methods we found from the literature. The defined user stories help to adjust existing instruments and methods of assessing the security level to comply with NIS2. On the other hand, user stories enable us to see the patterns related to security evaluation when developing new NIS2-compliant security evaluation methods to optimize the administrative burden of entities.

Paperid: 3198, https://arxiv.org/pdf/2504.19154.pdf

Abstract:
The integration of security within DevOps, known as DevSecOps, has gained traction in modern software development to address security vulnerabilities while maintaining agility. Artificial Intelligence (AI) and Machine Learning (ML) have been increasingly leveraged to enhance security automation, threat detection, and compliance enforcement. However, existing studies primarily focus on individual aspects of AI-driven security in DevSecOps, lacking a structured comparison of methodologies. This study conducts a systematic literature review (SLR) to analyze and compare AI-driven security solutions in DevSecOps, evaluating their technical capabilities, implementation challenges, and operational impacts. The findings reveal gaps in empirical validation, scalability, and integration of AI in security automation. The study highlights best practices, identifies research gaps, and proposes future directions for optimizing AI-based security frameworks in DevSecOps.

Paperid: 3199, https://arxiv.org/pdf/2504.19064.pdf

Abstract:
Quantum computing is becoming increasingly widespread due to the potential and capabilities to solve complex problems beyond the scope of classical computers. As Quantum Cloud services are adopted by businesses and research groups, they allow for greater progress and application in many fields. However, the inherent vulnerabilities of these environments pose significant security concerns. This survey delivers a comprehensive analysis of the security challenges that emerged in quantum cloud systems, with a distinct focus on multi-tenant vulnerabilities and the classical-quantum interface. Key threats such as crosstalk attacks, quantum-specific side-channel vulnerabilities, and insider threats are all examined, as well as their effects on the confidentiality, integrity, and availability of quantum circuits. The design and implementation of various quantum architectures from quantum cloud providers are also discussed. In addition, this paper delves into emerging quantum security solutions and best practices to mitigate these risks. This survey offers insights into current research gaps and proposes future directions for secure and resilient quantum cloud infrastructures.

Paperid: 3200, https://arxiv.org/pdf/2504.18974.pdf

Abstract:
In the standard privacy-preserving Machine learning as-a-service (MLaaS) model, the client encrypts data using homomorphic encryption and uploads it to a server for computation. The result is then sent back to the client for decryption. It has become more and more common for the computation to be outsourced to third-party servers. In this paper we identify a weakness in this protocol that enables a completely undetectable novel model-stealing attack that we call the Silver Platter attack. This attack works even under multikey encryption that prevents a simple collusion attack to steal model parameters. We also propose a mitigation that protects privacy even in the presence of a malicious server and malicious client or model provider (majority dishonest). When compared to a state-of-the-art but small encrypted model with 32k parameters, we preserve privacy with a failure chance of 1.51 x 10^-28 while batching capability is reduced by 0.2%. Our approach uses a novel results-checking protocol that ensures the computation was performed correctly without violating honest clients' data privacy. Even with collusion between the client and the server, they are unable to steal model parameters. Additionally, the model provider cannot learn any client data if maliciously working with the server.

Paperid: 3201, https://arxiv.org/pdf/2504.18771.pdf

Abstract:
This work empirically evaluates machine learning models on two imbalanced public datasets (KDDCUP99 and Credit Card Fraud 2013). The method includes data preparation, model training, and evaluation, using an 80/20 (train/test) split. Models tested include eXtreme Gradient Boosting (XGB), Multi Layer Perceptron (MLP), Generative Adversarial Network (GAN), Variational Autoencoder (VAE), and Multiple-Objective Generative Adversarial Active Learning (MO-GAAL), with XGB and MLP further combined with Random-Over-Sampling (ROS) and Self-Paced-Ensemble (SPE). Evaluation involves 5-fold cross-validation and imputation techniques (mean, median, and IterativeImputer) with 10, 20, 30, and 50 % missing data. Findings show XGB and MLP outperform generative models. IterativeImputer results are comparable to mean and median, but not recommended for large datasets due to increased complexity and execution time. The code used is publicly available on GitHub (github.com/markushaug/acr-25).

Paperid: 3202, https://arxiv.org/pdf/2504.18597.pdf

Abstract:
The Brakerski-Gentry-Vaikuntanathan (BGV) scheme is one of the most significant fully homomorphic encryption (FHE) schemes. It belongs to a class of FHE schemes whose security is based on the presumed intractability of the Learning with Errors (LWE) problem and its ring variant (RLWE). Such schemes deal with a quantity, called noise, which increases each time a homomorphic operation is performed. Specifically, in order for the scheme to work properly, it is essential that the noise remains below a certain threshold throughout the process. For BGV, this threshold strictly depends on the ciphertext modulus, which is one of the initial parameters whose selection heavily affects both the efficiency and security of the scheme. In this paper, we provide a new method to estimate noise growth, closely aligning with experimental results and forming the basis for parameter selection that ensures correctness and improves efficiency.

Paperid: 3203, https://arxiv.org/pdf/2504.18596.pdf

Abstract:
This paper explores the strategic use of modern synthetic data generation and advanced data perturbation techniques to enhance security, maintain analytical utility, and improve operational efficiency when managing large datasets, with a particular focus on the Banking, Financial Services, and Insurance (BFSI) sector. We contrast these advanced methods encompassing generative models like GANs, sophisticated context-aware PII transformation, configurable statistical perturbation, and differential privacy with traditional anonymization approaches. The goal is to create realistic, privacy-preserving datasets that retain high utility for complex machine learning tasks and analytics, a critical need in the data-sensitive industries like BFSI, Healthcare, Retail, and Telecommunications. We discuss how these modern techniques potentially offer significant improvements in balancing privacy preservation while maintaining data utility compared to older methods. Furthermore, we examine the potential for operational gains, such as reduced overhead and accelerated analytics, by using these privacy-enhanced datasets. We also explore key use cases where these methods can mitigate regulatory risks and enable scalable, data-driven innovation without compromising sensitive customer information.

Paperid: 3204, https://arxiv.org/pdf/2504.18570.pdf

Abstract:
This paper presents two attack strategies designed to evade detection in ADMM-based systems by preventing significant changes to the residual during the attacked iteration. While many detection algorithms focus on identifying false data injection through residual changes, we show that our attacks remain undetected by keeping the residual largely unchanged. The first strategy uses a random starting point combined with Gram-Schmidt orthogonalization to ensure stealth, with potential for refinement by enhancing the orthogonal component to increase system disruption. The second strategy builds on the first, targeting financial gains by manipulating reactive power and pushing the system to its upper voltage limit, exploiting operational constraints. The effectiveness of the proposed attack-resilient mechanism is demonstrated through case studies on the IEEE 14-bus system. A comparison of the two strategies, along with commonly used naive attacks, reveals trade-offs between simplicity, detectability, and effectiveness, providing insights into ADMM system vulnerabilities. These findings underscore the need for more robust monitoring algorithms to protect against advanced attack strategies.

Paperid: 3205, https://arxiv.org/pdf/2504.18411.pdf

Abstract:
With the rapid growth of digital platforms, there is increasing apprehension about how personal data is collected, stored, and used by various entities. These concerns arise from the increasing frequency of data breaches, cyber-attacks, and misuse of personal information for targeted advertising and surveillance. To address these matters, Differential Privacy (DP) has emerged as a prominent tool for quantifying a digital system's level of protection. The Gaussian mechanism is commonly used because the Gaussian density is closed under convolution, and is a common method utilized when aggregating datasets. However, the Gaussian mechanism only satisfies an approximate form of Differential Privacy. In this work, we present and analyze of the Symmetric alpha-Stable (SaS) mechanism. We prove that the mechanism achieves pure differential privacy while remaining closed under convolution. Additionally, we study the nuanced relationship between the level of privacy achieved and the parameters of the density. Lastly, we compare the expected error introduced to dataset queries by the Gaussian and SaS mechanisms. From our analysis, we believe the SaS Mechanism is an appealing choice for privacy-focused applications.

Paperid: 3206, https://arxiv.org/pdf/2504.18348.pdf

Abstract:
For deep learning-based image steganography frameworks, in order to ensure the invisibility and recoverability of the information embedding, the loss function usually contains several losses such as embedding loss, recovery loss and steganalysis loss. In previous research works, fixed loss weights are usually chosen for training optimization, and this setting is not linked to the importance of the steganography task itself and the training process. In this paper, we propose a Two-stage Curriculum Learning loss scheduler (TSCL) for balancing multinomial losses in deep learning image steganography algorithms. TSCL consists of two phases: a priori curriculum control and loss dynamics control. The first phase firstly focuses the model on learning the information embedding of the original image by controlling the loss weights in the multi-party adversarial training; secondly, it makes the model shift its learning focus to improving the decoding accuracy; and finally, it makes the model learn to generate a steganographic image that is resistant to steganalysis. In the second stage, the learning speed of each training task is evaluated by calculating the loss drop of the before and after iteration rounds to balance the learning of each task. Experimental results on three large public datasets, ALASKA2, VOC2012 and ImageNet, show that the proposed TSCL strategy improves the quality of steganography, decoding accuracy and security.

Paperid: 3207, https://arxiv.org/pdf/2504.17692.pdf

Abstract:
Web browsers provide the security foundation for our online experiences. Significant research has been done into the security of browsers themselves, but relatively little investigation has been done into how they interact with the operating system or the file system. In this work, we provide the first systematic security study of browser profiles, the on-disk persistence layer of browsers, used for storing everything from users' authentication cookies and browser extensions to certificate trust decisions and device permissions. We show that, except for the Tor Browser, all modern browsers store sensitive data in home directories with little to no integrity or confidentiality controls. We show that security measures like password and cookie encryption can be easily bypassed. In addition, HTTPS can be sidestepped entirely by deploying malicious root certificates within users' browser profiles. The Public Key Infrastructure (PKI), the backbone of the secure Web. HTTPS can be fully bypassed with the deployment of custom potentially malicious root certificates. More worryingly, we show how these powerful attacks can be fully mounted directly from web browsers themselves, through the File System Access API, a recent feature added by Chromium browsers that enables a website to directly manipulate a user's file system via JavaScript. In a series of case studies, we demonstrate how an attacker can install malicious browser extensions, inject additional root certificates, hijack HTTPS traffic, and enable websites to access hardware devices like the camera and GPS. Based on our findings, we argue that researchers and browser vendors need to develop and deploy more secure mechanisms for protecting users' browser data against file system attackers.

Paperid: 3208, https://arxiv.org/pdf/2504.17473.pdf

Abstract:
The digital economy runs on Open Source Software (OSS), with an estimated 90\% of modern applications containing open-source components. While this widespread adoption has revolutionized software development, it has also created critical security vulnerabilities, particularly in essential but under-resourced projects. This paper examines a sophisticated attack on the XZ Utils project (CVE-2024-3094), where attackers exploited not just code, but the entire open-source development process to inject a backdoor into a fundamental Linux compression library. Our analysis reveals a new breed of supply chain attack that manipulates software engineering practices themselves -- from community management to CI/CD configurations -- to establish legitimacy and maintain long-term control. Through a comprehensive examination of GitHub events and development artifacts, we reconstruct the attack timeline, analyze the evolution of attacker tactics. Our findings demonstrate how attackers leveraged seemingly beneficial contributions to project infrastructure and maintenance to bypass traditional security measures. This work extends beyond traditional security analysis by examining how software engineering practices themselves can be weaponized, offering insights for protecting the open-source ecosystem.

Paperid: 3209, https://arxiv.org/pdf/2504.17194.pdf

Abstract:
As digital content distribution expands rapidly through online platforms, securing digital media and protecting intellectual property has become increasingly complex. Traditional centralized systems, while widely adopted, suffer from vulnerabilities such as single points of failure and limited traceability of unauthorized access. This paper presents a blockchain-based secure digital content distribution system that integrates Sia, a decentralized storage network, and Skynet, a content delivery network, to enhance content protection and distribution. The proposed system employs a dual-layer architecture: off-chain for user authentication and on-chain for transaction validation using smart contracts and asymmetric encryption. By introducing a license issuance and secret block mechanism, the system ensures content authenticity, privacy, and controlled access. Experimental results demonstrate the feasibility and scalability of the system in securely distributing multimedia files. The proposed platform not only improves content security but also paves the way for future enhancements with decentralized applications and integrated royalty payment mechanisms.

Paperid: 3210, https://arxiv.org/pdf/2504.16836.pdf

Abstract:
The Onion Router (Tor) is a controversial network whose utility is constantly under scrutiny. On the one hand, it allows for anonymous interaction and cooperation of users seeking untraceable navigation on the Internet. This freedom also attracts criminals who aim to thwart law enforcement investigations, e.g., trading illegal products or services such as drugs or weapons. Tor allows delivering content without revealing the actual hosting address, by means of .onion (or hidden) services. Different from regular domains, these services can not be resolved by traditional name services, are not indexed by regular search engines, and they frequently change. This generates uncertainty about the extent and size of the Tor network and the type of content offered. In this work, we present a large-scale analysis of the Tor Network. We leverage our crawler, dubbed Mimir, which automatically collects and visits content linked within the pages to collect a dataset of pages from more than 25k sites. We analyze the topology of the Tor Network, including its depth and reachability from the surface web. We define a set of heuristics to detect the presence of replicated content (mirrors) and show that most of the analyzed content in the Dark Web (82% approx.) is a replica of other content. Also, we train a custom Machine Learning classifier to understand the type of content the hidden services offer. Overall, our study provides new insights into the Tor network, highlighting the importance of initial seeding for focus on specific topics, and optimize the crawling process. We show that previous work on large-scale Tor measurements does not consider the presence of mirrors, which biases their understanding of the Dark Web topology and the distribution of content.

Paperid: 3211, https://arxiv.org/pdf/2504.16584.pdf

Abstract:
Large Language Models (LLMs) have demonstrated significant capabilities in understanding and analyzing code for security vulnerabilities, such as Common Weakness Enumerations (CWEs). However, their reliance on cloud infrastructure and substantial computational requirements pose challenges for analyzing sensitive or proprietary codebases due to privacy concerns and inference costs. This work explores the potential of Small Language Models (SLMs) as a viable alternative for accurate, on-premise vulnerability detection. We investigated whether a 350-million parameter pre-trained code model (codegen-mono) could be effectively fine-tuned to detect the MITRE Top 25 CWEs specifically within Python code. To facilitate this, we developed a targeted dataset of 500 examples using a semi-supervised approach involving LLM-driven synthetic data generation coupled with meticulous human review. Initial tests confirmed that the base codegen-mono model completely failed to identify CWEs in our samples. However, after applying instruction-following fine-tuning, the specialized SLM achieved remarkable performance on our test set, yielding approximately 99% accuracy, 98.08% precision, 100% recall, and a 99.04% F1-score. These results strongly suggest that fine-tuned SLMs can serve as highly accurate and efficient tools for CWE detection, offering a practical and privacy-preserving solution for integrating advanced security analysis directly into development workflows.

Paperid: 3212, https://arxiv.org/pdf/2504.16550.pdf

Abstract:
Intrusion Detection Systems (IDSs) are integral to safeguarding networks by detecting and responding to threats from malicious traffic or compromised devices. However, standalone IDS deployments often fall short when addressing the increasing complexity and scale of modern cyberattacks. This paper proposes a Collaborative Intrusion Detection System (CIDS) that leverages Snort, an open-source network intrusion detection system, to enhance detection accuracy and reduce false positives. The proposed architecture connects multiple Snort IDS nodes to a centralised node and integrates with a Security Information and Event Management (SIEM) platform to facilitate real-time data sharing, correlation, and analysis. The CIDS design includes a scalable configuration of Snort sensors, a centralised database for log storage, and LogScale SIEM for advanced analytics and visualisation. By aggregating and analysing intrusion data from multiple nodes, the system enables improved detection of distributed and sophisticated attack patterns that standalone IDSs may miss. Performance evaluation against simulated attacks, including Nmap port scans and ICMP flood attacks, demonstrates our CIDS's ability to efficiently process large-scale network traffic, detect threats with higher accuracy, and reduce alert fatigue. This paper highlights the potential of CIDS in modern network environments and explores future enhancements, such as integrating machine learning for advanced threat detection and creating public datasets to support collaborative research. The proposed CIDS framework provides a promising foundation for building more resilient and adaptive network security systems.

Paperid: 3213, https://arxiv.org/pdf/2504.16120.pdf

Abstract:
Large Language Models (LLM) have made remarkable progress, but concerns about potential biases and harmful content persist. To address these apprehensions, we introduce a practical solution for ensuring LLM's safe and ethical use. Our novel approach focuses on a post-generation correction mechanism, the BART-Corrective Model, which adjusts generated content to ensure safety and security. Unlike relying solely on model fine-tuning or prompt engineering, our method provides a robust data-centric alternative for mitigating harmful content. We demonstrate the effectiveness of our approach through experiments on multiple toxic datasets, which show a significant reduction in mean toxicity and jail-breaking scores after integration. Specifically, our results show a reduction of 15% and 21% in mean toxicity and jail-breaking scores with GPT-4, a substantial reduction of 28% and 5% with PaLM2, a reduction of approximately 26% and 23% with Mistral-7B, and a reduction of 11.1% and 19% with Gemma-2b-it. These results demonstrate the potential of our approach to improve the safety and security of LLM, making them more suitable for real-world applications.

Paperid: 3214, https://arxiv.org/pdf/2504.16087.pdf

Abstract:
Parental control applications, software tools designed to manage and monitor children's online activities, serve as essential safeguards for parents in the digital age. However, their usage has sparked concerns about security and privacy violations inherent in various child monitoring products. Sideloaded software (i. e. apps installed outside official app stores) poses an increased risk, as it is not bound by the regulations of trusted platforms. Despite this, the market of sideloaded parental control software has remained widely unexplored by the research community. This paper examines 20 sideloaded parental control apps and compares them to 20 apps available on the Google Play Store. We base our analysis on privacy policies, Android package kit (APK) files, application behaviour, network traffic and application functionalities. Our findings reveal that sideloaded parental control apps fall short compared to their in-store counterparts, lacking specialised parental control features and safeguards against misuse while concealing themselves on the user's device. Alarmingly, three apps transmitted sensitive data unencrypted, half lacked a privacy policy and 8 out of 20 were flagged for potential stalkerware indicators of compromise (IOC).

Paperid: 3215, https://arxiv.org/pdf/2504.15499.pdf

Abstract:
As AI models become more embedded in critical sectors like finance, healthcare, and the military, their inscrutable behavior poses ever-greater risks to society. To mitigate this risk, we propose Guillotine, a hypervisor architecture for sandboxing powerful AI models -- models that, by accident or malice, can generate existential threats to humanity. Although Guillotine borrows some well-known virtualization techniques, Guillotine must also introduce fundamentally new isolation mechanisms to handle the unique threat model posed by existential-risk AIs. For example, a rogue AI may try to introspect upon hypervisor software or the underlying hardware substrate to enable later subversion of that control plane; thus, a Guillotine hypervisor requires careful co-design of the hypervisor software and the CPUs, RAM, NIC, and storage devices that support the hypervisor software, to thwart side channel leakage and more generally eliminate mechanisms for AI to exploit reflection-based vulnerabilities. Beyond such isolation at the software, network, and microarchitectural layers, a Guillotine hypervisor must also provide physical fail-safes more commonly associated with nuclear power plants, avionic platforms, and other types of mission critical systems. Physical fail-safes, e.g., involving electromechanical disconnection of network cables, or the flooding of a datacenter which holds a rogue AI, provide defense in depth if software, network, and microarchitectural isolation is compromised and a rogue AI must be temporarily shut down or permanently destroyed.

Paperid: 3216, https://arxiv.org/pdf/2504.15497.pdf

Abstract:
This paper presents an underlying framework for both automating and accelerating malware classification, more specifically, mapping malicious executables to known Advanced Persistent Threat (APT) groups. The main feature of this analysis is the assembly-level instructions present in executables which are also known as opcodes. The collection of such opcodes on many malicious samples is a lengthy process; hence, open-source reverse engineering tools are used in tandem with scripts that leverage parallel computing to analyze multiple files at once. Traditional and deep learning models are applied to create models capable of classifying malware samples. One-gram and two-gram datasets are constructed and used to train models such as SVM, KNN, and Decision Tree; however, they struggle to provide adequate results without relying on metadata to support n-gram sequences. The computational limitations of such models are overcome with convolutional neural networks (CNNs) and heavily accelerated using graphical compute unit (GPU) resources.

Paperid: 3217, https://arxiv.org/pdf/2504.15395.pdf

Abstract:
In cybersecurity risk is commonly measured by impact and probability, the former is objectively measured based on the consequences from the use of technology to obtain business gains, or by achieving business objectives. The latter has been measured, in sectors such as financial or insurance, based on historical data because there is vast information, and many other fields have applied the same approach. Although in cybersecurity, as a new discipline, there is not always historical data to support an objective measure of probability, the data available is not public and there is no consistent formatting to store and share it, so a new approach is required to measure cybersecurity events incidence. Through a comprehensive analysis of the state of the art, including current methodologies, frameworks, and incident data, considering tactics, techniques, and procedures (TTP) used by attackers, indicators of compromise (IOC), and defence controls, this work proposes a data model that describes a cyber exposure profile that provides an indirect but objective measure for likelihood, including different sources and metrics to update the model if needed. We further propose a set of practical, quantifiable metrics for risk assessment, enabling cybersecurity practitioners to measure likelihood without relying solely on historical incident data. By combining these metrics with our data model, organizations gain an actionable framework for continuously refining their cybersecurity strategies.

Paperid: 3218, https://arxiv.org/pdf/2504.15202.pdf

Abstract:
Despite the effort in vulnerability detection over the last two decades, memory safety vulnerabilities continue to be a critical problem. Recent reports suggest that the key solution is to migrate to memory-safe languages. To this end, C-to-Rust transpilation becomes popular to resolve memory-safety issues in C programs. Recent works propose C-to-Rust transpilation frameworks; however, a comprehensive evaluation dataset is missing. Although one solution is to put together a large enough dataset, this increases the analysis time in automated frameworks as well as in manual efforts for some cases. In this work, we build a method to select functions from a large set to construct a minimized yet representative dataset to evaluate the C-to-Rust transpilation. We propose C2RUST-BENCH that contains 2,905 functions, which are representative of C-to-Rust transpilation, selected from 15,503 functions of real-world programs.

Paperid: 3220, https://arxiv.org/pdf/2504.14881.pdf

Abstract:
Zero-knowledge proofs (ZKPs) have evolved from a theoretical cryptographic concept into a powerful tool for implementing privacy-preserving and verifiable applications without requiring trust assumptions. Despite significant progress in the field, implementing and using ZKPs via \emph{ZKP circuits} remains challenging, leading to numerous bugs that affect ZKP circuits in practice, and \emph{fuzzing} remains largely unexplored as a method to detect bugs in ZKP circuits. We discuss the unique challenges of applying fuzzing to ZKP circuits, examine the oracle problem and its potential solutions, and propose techniques for input generation and test harness construction. We demonstrate that fuzzing can be effective in this domain by implementing a fuzzer for \texttt{zk-regex}, a cornerstone library in modern ZKP applications. In our case study, we discovered \textit{$10$} new bugs that have been confirmed by the developers.

Paperid: 3221, https://arxiv.org/pdf/2504.14833.pdf

Abstract:
Traffic classification is crucial for securing Internet of Things (IoT) networks. Deep learning-based methods can autonomously extract latent patterns from massive network traffic, demonstrating significant potential for IoT traffic classification tasks. However, the limited computational and spatial resources of IoT devices pose challenges for deploying more complex deep learning models. Existing methods rely heavily on either flow-level features or raw packet byte features. Flow-level features often require inspecting entire or most of the traffic flow, leading to excessive resource consumption, while raw packet byte features fail to distinguish between headers and payloads, overlooking semantic differences and introducing noise from feature misalignment. Therefore, this paper proposes IoT-AMLHP, an aligned multimodal learning framework for resource-efficient malicious IoT traffic classification. Firstly, the framework constructs a packet-wise header-payload representation by parsing packet headers and payload bytes, resulting in an aligned and standardized multimodal traffic representation that enhances the characterization of heterogeneous IoT traffic. Subsequently, the traffic representation is fed into a resource-efficient neural network comprising a multimodal feature extraction module and a multimodal fusion module. The extraction module employs efficient depthwise separable convolutions to capture multi-scale features from different modalities while maintaining a lightweight architecture. The fusion module adaptively captures complementary features from different modalities and effectively fuses multimodal features.

Paperid: 3222, https://arxiv.org/pdf/2504.14497.pdf

Abstract:
Plaintext-ciphertext matrix multiplication (PC-MM) is an indispensable tool in privacy-preserving computations such as secure machine learning and encrypted signal processing. While there are many established algorithms for plaintext-plaintext matrix multiplication, efficiently computing plaintext-ciphertext (and ciphertext-ciphertext) matrix multiplication is an active area of research which has received a lot of attention. Recent literature have explored various techniques for privacy-preserving matrix multiplication using fully homomorphic encryption (FHE) schemes with ciphertext packing and Single Instruction Multiple Data (SIMD) processing. On the other hand, there hasn't been any attempt to speed up PC-MM using unpacked additively homomorphic encryption (AHE) schemes beyond the schoolbook method and Strassen's algorithm for matrix multiplication. In this work, we propose an efficient PC-MM from unpacked AHE, which applies Cussen's compression-reconstruction algorithm for plaintext-plaintext matrix multiplication in the encrypted setting. We experimentally validate our proposed technique using a concrete instantiation with the additively homomorphic elliptic curve ElGamal encryption scheme and its software implementation on a Raspberry Pi 5 edge computing platform. Our proposed approach achieves up to an order of magnitude speedup compared to state-of-the-art for large matrices with relatively small element bit-widths. Extensive measurement results demonstrate that our fast PC-MM is an excellent candidate for efficient privacy-preserving computation even in resource-constrained environments.

Paperid: 3223, https://arxiv.org/pdf/2504.14162.pdf

Abstract:
This study introduces ROFBS$Î±$, a new defense architecture that addresses delays in detection in ransomware detectors based on machine learning. It builds on our earlier Real Time Open File Backup System, ROFBS, by adopting an asynchronous design that separates backup operations from detection tasks. By using eBPF to monitor file open events and running the backup process independently, the system avoids performance limitations when detection and protection contend for resources. We evaluated ROFBS$Î±$ against three ransomware strains, AvosLocker, Conti, and IceFire. The evaluation measured the number of files encrypted, the number of files successfully backed up, the ratio of backups to encrypted files, and the overall detection latency. The results show that ROFBS$Î±$ achieves high backup success rates and faster detection while adding minimal extra load to the system. However, defending against ransomware that encrypts files extremely quickly remains an open challenge that will require further enhancements.

Paperid: 3224, https://arxiv.org/pdf/2504.14122.pdf

Abstract:
The rapid growth in web-based services has significantly increased security risks related to user information, as web-based attacks become increasingly sophisticated and prevalent. Traditional security methods frequently struggle to detect previously unknown (zero-day) web attacks, putting sensitive user data at significant risk. Additionally, reducing human intervention in web security tasks can minimize errors and enhance reliability. This paper introduces an intelligent system designed to detect zero-day web attacks using a novel one-class ensemble method consisting of three distinct autoencoder architectures: LSTM autoencoder, GRU autoencoder, and stacked autoencoder. Our approach employs a novel tokenization strategy to convert normal web requests into structured numeric sequences, enabling the ensemble model to effectively identify anomalous activities by uniquely concatenating and compressing the latent representations from each autoencoder. The proposed method efficiently detects unknown web attacks while effectively addressing common limitations of previous methods, such as high memory consumption and excessive false positive rates. Extensive experimental evaluations demonstrate the superiority of our proposed ensemble, achieving remarkable detection metrics: 97.58% accuracy, 97.52% recall, 99.76% specificity, and 99.99% precision, with an exceptionally low false positive rate of 0.2%. These results underscore our method's significant potential in enhancing real-world web security through accurate and reliable detection of web-based attacks.

Paperid: 3225, https://arxiv.org/pdf/2504.13408.pdf

Abstract:
This technical report presents a comprehensive analysis of malware classification using OpCode sequences. Two distinct approaches are evaluated: traditional machine learning using n-gram analysis with Support Vector Machine (SVM), K-Nearest Neighbors (KNN), and Decision Tree classifiers; and a deep learning approach employing a Convolutional Neural Network (CNN). The traditional machine learning approach establishes a baseline using handcrafted 1-gram and 2-gram features from disassembled malware samples. The deep learning methodology builds upon the work proposed in "Deep Android Malware Detection" by McLaughlin et al. and evaluates the performance of a CNN model trained to automatically extract features from raw OpCode data. Empirical results are compared using standard performance metrics (accuracy, precision, recall, and F1-score). While the SVM classifier outperforms other traditional techniques, the CNN model demonstrates competitive performance with the added benefit of automated feature extraction.

Paperid: 3226, https://arxiv.org/pdf/2504.13267.pdf

Abstract:
Over the past few years, traffic congestion has continuously plagued the nation's transportation system creating several negative impacts including longer travel times, increased pollution rates, and higher collision risks. To overcome these challenges, Intelligent Transportation Systems (ITS) aim to improve mobility and vehicular systems, ensuring higher levels of safety by utilizing cutting-edge technologies, sophisticated sensing capabilities, and innovative algorithms. Drivers' participatory sensing, current/future location reporting, and machine learning algorithms have considerably improved real-time congestion monitoring and future traffic management. However, each driver's sensitive spatiotemporal location information can create serious privacy concerns. To address these challenges, we propose in this paper a secure, privacy-preserving location reporting and traffic forecasting system that guarantees privacy protection of driver data while maintaining high traffic forecasting accuracy. Our novel k-anonymity scheme utilizes functional encryption to aggregate encrypted location information submitted by drivers while ensuring the privacy of driver location data. Additionally, using the aggregated encrypted location information as input, this research proposes a deep learning model that incorporates a Convolutional-Long Short-Term Memory (Conv-LSTM) module to capture spatial and short-term temporal features and a Bidirectional Long Short-Term Memory (Bi-LSTM) module to recover long-term periodic patterns for traffic forecasting. With extensive evaluation on real datasets, we demonstrate the effectiveness of the proposed scheme with less than 10% mean absolute error for a 60-minute forecasting horizon, all while protecting driver privacy.

Paperid: 3227, https://arxiv.org/pdf/2504.13199.pdf

Abstract:
Objective: This review explores the trustworthiness of multimodal artificial intelligence (AI) systems, specifically focusing on vision-language tasks. It addresses critical challenges related to fairness, transparency, and ethical implications in these systems, providing a comparative analysis of key tasks such as Visual Question Answering (VQA), image captioning, and visual dialogue. Background: Multimodal models, particularly vision-language models, enhance artificial intelligence (AI) capabilities by integrating visual and textual data, mimicking human learning processes. Despite significant advancements, the trustworthiness of these models remains a crucial concern, particularly as AI systems increasingly confront issues regarding fairness, transparency, and ethics. Methods: This review examines research conducted from 2017 to 2024 focusing on forenamed core vision-language tasks. It employs a comparative approach to analyze these tasks through the lens of trustworthiness, underlining fairness, explainability, and ethics. This study synthesizes findings from recent literature to identify trends, challenges, and state-of-the-art solutions. Results: Several key findings were highlighted. Transparency: Explainability of vision language tasks is important for user trust. Techniques, such as attention maps and gradient-based methods, have successfully addressed this issue. Fairness: Bias mitigation in VQA and visual dialogue systems is essential for ensuring unbiased outcomes across diverse demographic groups. Ethical Implications: Addressing biases in multilingual models and ensuring ethical data handling is critical for the responsible deployment of vision-language systems. Conclusion: This study underscores the importance of integrating fairness, transparency, and ethical considerations in developing vision-language models within a unified framework.

Paperid: 3228, https://arxiv.org/pdf/2504.13198.pdf

Abstract:
On June 2, 2024, Mexico held its federal elections. The majority of Mexican citizens voted in person at the polls in this historic election. For the first time though, Mexican citizens living outside their country were able to vote online via a web app, either on a personal device or using an electronic voting kiosk at one of 23 embassies and consulates in the U.S., Canada, and Europe. In total, 144,734 people voted outside of Mexico: 122,496 on a personal device and 22,238 in-person at a kiosk. Voting was open for remote voting from 8PM, May 18, 2024 to 6PM, June 2, 2024 and was open for in-person voting from 8AM-6PM on June 2, 2024. This article describes the technical and cryptographic tools applied to secure the ex-patriate component of the election and to enable INE (Mexico's National Electoral Institute) to generate provable election results within minutes of the close of the election. This article will also describe how the solutions we present scale to elections on a national level.

Paperid: 3229, https://arxiv.org/pdf/2504.13196.pdf

Abstract:
The purpose of research: Detection of cybersecurity incidents and analysis of decision support and assessment of the effectiveness of measures to counter information security threats based on modern generative models. The methods of research: Emulation of signal propagation data in MIMO systems, synthesis of adversarial examples, execution of adversarial attacks on machine learning models, fine tuning of large language models for detecting adversarial attacks, explainability of decisions on detecting cybersecurity incidents based on the prompts technique. Scientific novelty: A binary classification of data poisoning attacks was performed using large language models, and the possibility of using large language models for investigating cybersecurity incidents in the latest generation wireless networks was investigated. The result of research: Fine-tuning of large language models was performed on the prepared data of the emulated wireless network segment. Six large language models were compared for detecting adversarial attacks, and the capabilities of explaining decisions made by a large language model were investigated. The Gemma-7b model showed the best results according to the metrics Precision = 0.89, Recall = 0.89 and F1-Score = 0.89. Based on various explainability prompts, the Gemma-7b model notes inconsistencies in the compromised data under study, performs feature importance analysis and provides various recommendations for mitigating the consequences of adversarial attacks. Large language models integrated with binary classifiers of network threats have significant potential for practical application in the field of cybersecurity incident investigation, decision support and assessing the effectiveness of measures to counter information security threats.

Paperid: 3230, https://arxiv.org/pdf/2504.13052.pdf

Abstract:
Large Language Models (LLMs) have been equipped with safety mechanisms to prevent harmful outputs, but these guardrails can often be bypassed through "jailbreak" prompts. This paper introduces a novel graph-based approach to systematically generate jailbreak prompts through semantic transformations. We represent malicious prompts as nodes in a graph structure with edges denoting different transformations, leveraging Abstract Meaning Representation (AMR) and Resource Description Framework (RDF) to parse user goals into semantic components that can be manipulated to evade safety filters. We demonstrate a particularly effective exploitation vector by instructing LLMs to generate code that realizes the intent described in these semantic graphs, achieving success rates of up to 87% against leading commercial LLMs. Our analysis reveals that contextual framing and abstraction are particularly effective at circumventing safety measures, highlighting critical gaps in current safety alignment techniques that focus primarily on surface-level patterns. These findings provide insights for developing more robust safeguards against structured semantic attacks. Our research contributes both a theoretical framework and practical methodology for systematically stress-testing LLM safety mechanisms.

Paperid: 3231, https://arxiv.org/pdf/2504.12948.pdf

Abstract:
Efficiently solving the Shortest Vector Problem (SVP) in two-dimensional lattices holds practical significance in cryptography and computational geometry. While simpler than its high-dimensional counterpart, two-dimensional SVP motivates scalable solutions for high-dimensional lattices and benefits applications like sequence cipher cryptanalysis involving large integers. In this work, we first propose a novel definition of reduced bases and develop an efficient adaptive lattice reduction algorithm \textbf{CrossEuc} that strategically applies the Euclidean algorithm across dimensions. Building on this framework, we introduce \textbf{HVec}, a vectorized generalization of the Half-GCD algorithm originally defined for integers, which can efficiently halve the bit-length of two vectors and may have independent interest. By iteratively invoking \textbf{HVec}, our optimized algorithm \textbf{HVecSBP} achieves a reduced basis in $O(\log n M(n) )$ time for arbitrary input bases with bit-length $n$, where $M(n)$ denotes the cost of multiplying two $n$-bit integers. Compared to existing algorithms, our design is applicable to general forms of input lattices, eliminating the cost of pre-converting input bases to Hermite Normal Form (HNF). The comprehensive experimental results demonstrate that for the input lattice bases in HNF, the optimized algorithm \textbf{HVecSBP} achieves at least a $13.5\times$ efficiency improvement compared to existing methods. For general-form input lattice bases, converting them to HNF before applying \textbf{HVecSBP} offers only marginal advantages in extreme cases where the two basis vectors are nearly degenerate. However, as the linear dependency between input basis vectors decreases, directly employing \textbf{HVecSBP} yields increasingly significant efficiency gains, outperforming hybrid approaches that rely on prior \textbf{HNF} conversion.

Paperid: 3232, https://arxiv.org/pdf/2504.12757.pdf

Abstract:
As Agentic AI gain mainstream adoption, the industry invests heavily in model capabilities, achieving rapid leaps in reasoning and quality. However, these systems remain largely confined to data silos, and each new integration requires custom logic that is difficult to scale. The Model Context Protocol (MCP) addresses this challenge by defining a universal, open standard for securely connecting AI-based applications (MCP clients) to data sources (MCP servers). However, the flexibility of the MCP introduces new risks, including malicious tool servers and compromised data integrity. We present MCP Guardian, a framework that strengthens MCP-based communication with authentication, rate-limiting, logging, tracing, and Web Application Firewall (WAF) scanning. Through real-world scenarios and empirical testing, we demonstrate how MCP Guardian effectively mitigates attacks and ensures robust oversight with minimal overheads. Our approach fosters secure, scalable data access for AI assistants, underscoring the importance of a defense-in-depth approach that enables safer and more transparent innovation in AI-driven environments.

Paperid: 3233, https://arxiv.org/pdf/2504.12142.pdf

Abstract:
The growing demand for highly reliable communication systems drives the research and development of algorithms that identify and correct errors during data transmission and storage. This need becomes even more critical in hard-to-access or sensitive systems, such as those used in space applications, passenger transportation, and the financial sector. In this context, Error Correction Codes (ECCs) are essential tools for ensuring a certain level of reliability. This work proposes a technique to enhance ECC error correction capability by overlapping data regions. The approach consists of protecting the same data area with multiple ECCs organized in a two-dimensional structure, enabling logical inferences that correlate the codes and improve their error detection and correction capabilities. More specifically, the overlapping is characterized by the organization of multiple ECCs, whose intersection exclusively covers the entire data region. Different configurations of overlapping ECCs were analyzed to evaluate the proposal regarding error detection and correction capability, scalability, and reliability. Experimental results confirm the technique's effectiveness and demonstrate its high scalability potential, reducing the need for redundancy bits relative to the number of data bits. Furthermore, comparisons with state-of-the-art ECC approaches indicate the technique's applicability in critical systems that require high reliability.

Paperid: 3234, https://arxiv.org/pdf/2504.12062.pdf

Abstract:
This work explores the performance and scalability of a hierarchical certificate authority framework with automated certificate issuance employing post-quantum cryptographic (PQC) signature algorithms. The system is designed for compatibility with both classical and PQC algorithms, promoting crypto-agility while ensuring robust security against quantum-based threats. The proposed framework design expects minimal cryptographic requirements from potential clients, protects certificates of high importance against cross-dependent chains-of-trust and allows for prompt switching between classical and PQC algorithms. Finally, we evaluate SPHINCS$^+$, Falcon, and Dilithium variants in various configurations of certificate issuance and verification accommodating a large client base, underlining the trade-offs in balancing performance, scalability, and security.

Paperid: 3235, https://arxiv.org/pdf/2504.11984.pdf

Abstract:
Zero Trust Architecture (ZTA) is one of the paradigm changes in cybersecurity, from the traditional perimeter-based model to perimeterless. This article studies the core concepts of ZTA, its beginning, a few use cases and future trends. Emphasising the always verify and least privilege access, some key tenets of ZTA have grown to be integration technologies like Identity Management, Multi-Factor Authentication (MFA) and real-time analytics. ZTA is expected to strengthen cloud environments, education, work environments (including from home) while controlling other risks like lateral movement and insider threats. Despite ZTA's benefits, it comes with challenges in the form of complexity, performance overhead and vulnerabilities in the control plane. These require phased implementation and continuous refinement to keep up with evolving organisational needs and threat landscapes. Emerging technologies, such as Artificial Intelligence (AI) and Machine Learning (ML) will further automate policy enforcement and threat detection in keeping up with dynamic cyber threats.

Paperid: 3236, https://arxiv.org/pdf/2504.11702.pdf

Abstract:
Decentralised applications (dApps) that run on public blockchains have the benefit of trustworthiness and transparency as every activity that happens on the blockchain can be publicly traced through the transaction data. However, this introduces a potential privacy problem as this data can be tracked and analysed, which can reveal user-behaviour information. A user behaviour analysis pipeline was proposed to present how this type of information can be extracted and analysed to identify separate behavioural clusters that can describe how users behave in the game. The pipeline starts with the collection of transaction data, involving smart contracts, that is collected from a blockchain-based game called Planet IX. Both the raw transaction information and the transaction events are considered in the data collection. From this data, separate game actions can be formed and those are leveraged to present how and when the users conducted their in-game activities in the form of user flows. An extended version of these user flows also presents how the Non-Fungible Tokens (NFTs) are being leveraged in the user actions. The latter is given as input for a Graph Neural Network (GNN) model to provide graph embeddings for these flows which then can be leveraged by clustering algorithms to cluster user behaviours into separate behavioural clusters. We benchmark and compare well-known clustering algorithms as a part of the proposed method. The user behaviour clusters were analysed and visualised in a graph format. It was found that behavioural information can be extracted regarding the users that belong to these clusters. Such information can be exploited by malicious users to their advantage. To demonstrate this, a privacy threat model was also presented based on the results that correspond to multiple potentially affected areas.

Paperid: 3237, https://arxiv.org/pdf/2504.11486.pdf

Abstract:
Foreign information operations on social media platforms pose significant risks to democratic societies. With the rise of Artificial Intelligence (AI), this threat is likely to intensify, potentially overwhelming human defenders. To achieve the necessary scale and tempo to defend against these threats, utilizing AI as part of the solution seems inevitable. Although there has been a significant debate on AI in Lethal Autonomous Weapon Systems (LAWS), it is equally likely that AI will be widely used in information operations for defensive and offensive objectives. Similar to LAWS, AI-driven information operations occupy a highly sensitive moral domain where removing human involvement in the tactical decision making process raises ethical concerns. Although AI has yet to revolutionize the field, a solid ethical stance is urgently needed on how AI can be responsibly used to defend against information operations on social media platforms. This paper proposes possible AI-enabled countermeasures against cognitive warfare and argues how they can be developed in a responsible way, such that meaningful human control is preserved.

Paperid: 3238, https://arxiv.org/pdf/2504.11124.pdf

Abstract:
The Number Theoretic Transform (NTT) is an indispensable tool for computing efficient polynomial multiplications in post-quantum lattice-based cryptography. It has strong resemblance with the Fast Fourier Transform (FFT), which is the most widely used algorithm in digital signal processing. In this work, we demonstrate a unified hardware accelerator supporting both 512-point complex FFT as well as 256-point NTT for the recently standardized NIST post-quantum key encapsulation and digital signature algorithms ML-KEM and ML-DSA respectively. Our proposed architecture effectively utilizes the arithmetic circuitry required for complex FFT, and the only additional circuits required are for modular reduction along with modifications in the control logic. Our implementation achieves performance comparable to state-of-the-art ML-KEM / ML-DSA NTT accelerators on FPGA, thus demonstrating how an FFT accelerator can be augmented to support NTT and the unified hardware can be used for both digital signal processing and post-quantum lattice-based cryptography applications.

Paperid: 3239, https://arxiv.org/pdf/2504.10850.pdf

Abstract:
With the rise of powerful foundation models, a pre-training-fine-tuning paradigm becomes increasingly popular these days: A foundation model is pre-trained using a huge amount of data from various sources, and then the downstream users only need to fine-tune and adapt it to specific downstream tasks. However, due to the high computation complexity of adversarial training, it is not feasible to fine-tune the foundation model to improve its robustness on the downstream task. Observing the above challenge, we want to improve the downstream robustness without updating/accessing the weights in the foundation model. Inspired from existing literature in robustness inheritance (Kim et al., 2020), through theoretical investigation, we identify a close relationship between robust contrastive learning with the adversarial robustness of supervised learning. To further validate and utilize this theoretical insight, we design a simple-yet-effective robust auto-encoder as a data pre-processing method before feeding the data into the foundation model. The proposed approach has zero access to the foundation model when training the robust auto-encoder. Extensive experiments demonstrate the effectiveness of the proposed method in improving the robustness of downstream tasks, verifying the connection between the feature robustness (implied by small adversarial contrastive loss) and the robustness of the downstream task.

Paperid: 3240, https://arxiv.org/pdf/2504.10698.pdf

Abstract:
The growth of the Internet of Things has amplified the need for secure data interactions in cloud-edge ecosystems, where sensitive information is constantly processed across various system layers. Intrusion detection systems are commonly used to protect such environments from malicious attacks. Recently, Federated Learning has emerged as an effective solution for implementing intrusion detection systems, owing to its decentralised architecture that avoids sharing raw data with a central server, thereby enhancing data privacy. Despite its benefits, Federated Learning faces criticism for high communication overhead from frequent model updates, especially in large-scale Cloud-Edge infrastructures. This paper explores Knowledge Distillation to reduce communication overhead in Cloud-Edge intrusion detection while preserving accuracy and data privacy. Experiments show significant improvements over state-of-the-art methods.

Paperid: 3241, https://arxiv.org/pdf/2504.10318.pdf

Abstract:
Microarchitectural attacks are a significant concern, leading to many hardware-based defense proposals. However, different defenses target different classes of attacks, and their impact on each other has not been fully considered. To raise awareness of this problem, we study an interaction between two state-of-the art defenses in this paper, timing obfuscations of remote cache lines (TORC) and delaying speculative changes to remote cache lines (DSRC). TORC mitigates cache-hit based attacks and DSRC mitigates speculative coherence state change attacks. We observe that DSRC enables coherence information to be retrieved into the processor core, where it is out of the reach of timing obfuscations to protect. This creates an unforeseen consequence that redo operations can be triggered within the core to detect the presence or absence of remote cache lines, which constitutes a security vulnerability. We demonstrate that a new covert channel attack is possible using this vulnerability. We propose two ways to mitigate the attack, whose performance varies depending on an application's cache usage. One way is to never send remote exclusive coherence state (E) information to the core even if it is created. The other way is to never create a remote E state, which is responsible for triggering redos. We demonstrate the timing difference caused by this microarchitectural defense assumption violation using GEM5 simulations. Performance evaluation on SPECrate 2017 and PARSEC benchmarks of the two fixes show less than 32\% average overhead across both sets of benchmarks. The repair which prevented the creation of remote E state had less than 2.8% average overhead.

Paperid: 3242, https://arxiv.org/pdf/2504.10120.pdf

Abstract:
In this work, we explore the possibility of universally composable (UC)-secure commitments using Physically Uncloneable Functions (PUFs) within a new adversarial model. We introduce the communicating malicious PUFs, i.e. malicious PUFs that can interact with their creator even when not in their possession, obtaining a stronger adversarial model. Prior work [ASIACRYPT 2013, LNCS, vol. 8270, pp. 100-119] proposed a compiler for constructing UC-secure commitments from ideal extractable commitments, and our task would be to adapt the ideal extractable commitment scheme proposed therein to our new model. However, we found an attack and identified a few other issues in that construction, and to address them, we modified the aforementioned ideal extractable commitment scheme and introduced new properties and tools that allow us to rigorously develop and present security proofs in this context. We propose a new UC-secure commitment scheme against adversaries that can only create stateless malicious PUFs which can receive, but not send, information from their creators. Our protocol is more efficient compared to previous proposals, as we have parallelized the ideal extractable commitments within it. The restriction to stateless malicious PUFs is significant, mainly since the protocol from [ASIACRYPT 2013, LNCS, vol. 8270, pp. 100-119] assumes malicious PUFs with unbounded state, thus limiting its applicability. However it is the only way we found to address the issues of the original construction. We hope that in future work this restriction can be lifted, and along the lines of our work, UC-secure commitments with fewer restrictions on both the state and communication can be constructed.

Paperid: 3243, https://arxiv.org/pdf/2504.09776.pdf

Abstract:
Spam messages continue to present significant challenges to digital users, cluttering inboxes and posing security risks. Traditional spam detection methods, including rules-based, collaborative, and machine learning approaches, struggle to keep up with the rapidly evolving tactics employed by spammers. This project studies new spam detection systems that leverage Large Language Models (LLMs) fine-tuned with spam datasets. More importantly, we want to understand how LLM-based spam detection systems perform under adversarial attacks that purposefully modify spam emails and data poisoning attacks that exploit the differences between the training data and the massages in detection, to which traditional machine learning models are shown to be vulnerable. This experimentation employs two LLM models of GPT2 and BERT and three spam datasets of Enron, LingSpam, and SMSspamCollection for extensive training and testing tasks. The results show that, while they can function as effective spam filters, the LLM models are susceptible to the adversarial and data poisoning attacks. This research provides very useful insights for future applications of LLM models for information security.

Paperid: 3244, https://arxiv.org/pdf/2504.09604.pdf

Abstract:
Many-shot jailbreaking (MSJ) is an adversarial technique that exploits the long context windows of modern LLMs to circumvent model safety training by including in the prompt many examples of a "fake" assistant responding inappropriately before the final request. With enough examples, the model's in-context learning abilities override its safety training, and it responds as if it were the "fake" assistant. In this work, we probe the effectiveness of different fine-tuning and input sanitization approaches on mitigating MSJ attacks, alone and in combination. We find incremental mitigation effectiveness for each, and show that the combined techniques significantly reduce the effectiveness of MSJ attacks, while retaining model performance in benign in-context learning and conversational tasks. We suggest that our approach could meaningfully ameliorate this vulnerability if incorporated into model safety post-training.

Paperid: 3245, https://arxiv.org/pdf/2504.09527.pdf

Abstract:
Remote Keyless Entry (RKE) systems have become a standard feature in modern vehicles, yet their unidirectional fixed-frequency radio communication renders them vulnerable to replay attacks, impersonation attacks, cryptanalysis, and intentional interference. Existing cryptographic authentication methods enhance security but often fail to address real-world constraints such as computational efficiency and radio interference. To mitigate these threats, we designed the Adaptive Frequency-Hopping Algorithm and the Adaptive TXP and PHY Mode Control Algorithm that can dynamically optimize channel selection, transmission power, and PHY modes based on real-time channel quality assessment. To enhance the security and reliability of RKE systems, we propose the Lightweight Vehicle-Key Authentication Protocol. In addition, a prototype of the proposed scheme was implemented to verify its effectiveness in mitigating interference and preventing unauthorized access.Experimental results show that our scheme significantly enhances communication security and reliability while maintaining low computational overhead. Under mild interference conditions, the packet delivery rate (PDR) of the adaptive scheme increases from 93% to 99.23%, and under strong interference, it improves from 85% to 99.01%. Additionally, the scheme effectively prevents replay and impersonation attacks, ensuring secure vehicle access control by dynamically optimizing communication parameters to maintain stable and reliable transmission.

Paperid: 3246, https://arxiv.org/pdf/2504.09095.pdf

Abstract:
The ability of machines to comprehend and produce language that is similar to that of humans has revolutionized sectors like customer service, healthcare, and finance thanks to the quick advances in Natural Language Processing (NLP), which are fueled by Generative Artificial Intelligence (AI) and Large Language Models (LLMs). However, because LLMs trained on large datasets may unintentionally absorb and reveal Personally Identifiable Information (PII) from user interactions, these capabilities also raise serious privacy concerns. Deep neural networks' intricacy makes it difficult to track down or stop the inadvertent storing and release of private information, which raises serious concerns about the privacy and security of AI-driven data. This study tackles these issues by detecting Generative AI weaknesses through attacks such as data extraction, model inversion, and membership inference. A privacy-preserving Generative AI application that is resistant to these assaults is then developed. It ensures privacy without sacrificing functionality by using methods to identify, alter, or remove PII before to dealing with LLMs. In order to determine how well cloud platforms like Microsoft Azure, Google Cloud, and AWS provide privacy tools for protecting AI applications, the study also examines these technologies. In the end, this study offers a fundamental privacy paradigm for generative AI systems, focusing on data security and moral AI implementation, and opening the door to a more secure and conscientious use of these tools.

Paperid: 3247, https://arxiv.org/pdf/2504.08977.pdf

Abstract:
Recent steganographic schemes, starting with Meteor (CCS'21), rely on leveraging large language models (LLMs) to resolve a historically-challenging task of disguising covert communication as ``innocent-looking'' natural-language communication. However, existing methods are vulnerable to ``re-randomization attacks,'' where slight changes to the communicated text, that might go unnoticed, completely destroy any hidden message. This is also a vulnerability in more traditional encryption-based stegosystems, where adversaries can modify the randomness of an encryption scheme to destroy the hidden message while preserving an acceptable covertext to ordinary users. In this work, we study the problem of robust steganography. We introduce formal definitions of weak and strong robust LLM-based steganography, corresponding to two threat models in which natural language serves as a covertext channel resistant to realistic re-randomization attacks. We then propose two constructions satisfying these notions. We design and implement our steganographic schemes that embed arbitrary secret messages into natural language text generated by LLMs, ensuring recoverability even under adversarial paraphrasing and rewording attacks. To support further research and real-world deployment, we release our implementation and datasets for public use.

Paperid: 3248, https://arxiv.org/pdf/2504.08967.pdf

Abstract:
OneAPI is an open standard that supports cross-architecture software development with minimal effort from developers. It brings DPC++ and C++ compilers which need to be thoroughly tested to verify their correctness, reliability, and security. Compilers have numerous code flows and optimization features. This process requires developers with deep understanding of the different compiler flows to craft testcases specific to target paths in the compiler. This testcase creation is a time-consuming and costly process. In this paper, we propose a large-language model (LLM)-based compiler fuzzing tool that integrates the concept of retrieval-augmented generation (RAG). This tool automates the testcase generation task and relieves experienced compiler developers from investing time to craft testcase generation patterns. We test our proposed approach on the Intel DPC++/C++ compiler. This compiler compiles SYCL code and allows developers to offload it to different architectures, e.g. GPUs and CPUs from different vendors. Using this tool, we managed to identify 87 SYCL code test cases that lead to output value mismatch or compiler runtime errors when compiled using Intel DPC++ and clang++ compilers and run on different architectures. The testcases and the identified unexpected behaviors of the compilers under test were obtained within only few hours with no prior background on the compiler passes under tests. This tool facilitates efficient compiler fuzzing with reduced developer time requirements via the dynamic testcase creation capability provided by an LLM with RAG.

Paperid: 3249, https://arxiv.org/pdf/2504.08871.pdf

Abstract:
Recent advancements in Large Language Models (LLMs) have transformed communication, yet their role in secure messaging remains underexplored, especially in surveillance-heavy environments. At the same time, many governments all over the world are proposing legislation to detect, backdoor, or even ban encrypted communication. That emphasizes the need for alternative ways to communicate securely and covertly over open channels. We propose a novel cryptographic embedding framework that enables covert Public Key or Symmetric Key encrypted communication over public chat channels with humanlike produced texts. Some unique properties of our framework are: 1. It is LLM agnostic, i.e., it allows participants to use different local LLM models independently; 2. It is pre- or post-quantum agnostic; 3. It ensures indistinguishability from human-like chat-produced texts. Thus, it offers a viable alternative where traditional encryption is detectable and restricted.

Paperid: 3250, https://arxiv.org/pdf/2504.08805.pdf

Abstract:
We measure the association between generative AI (GAI) tool adoption and four metrics spanning security operations, information protection, and endpoint management: 1) number of security alerts per incident, 2) probability of security incident reopenings, 3) time to classify a data loss prevention alert, and 4) time to resolve device policy conflicts. We find that GAI is associated with robust and statistically and practically significant improvements in the four metrics. Although unobserved confounders inhibit causal identification, these results are among the first to use observational data from live operations to investigate the relationship between GAI adoption and security operations, data loss prevention, and device policy management.

Paperid: 3251, https://arxiv.org/pdf/2504.08176.pdf

Abstract:
The increasing reliance on web services has led to a rise in cybersecurity threats, particularly Cross-Site Scripting (XSS) attacks, which target client-side layers of web applications by injecting malicious scripts. Traditional Web Application Firewalls (WAFs) struggle to detect highly obfuscated and complex attacks, as their rules require manual updates. This paper presents a novel generative AI framework that leverages Large Language Models (LLMs) to enhance XSS mitigation. The framework achieves two primary objectives: (1) generating sophisticated and syntactically validated XSS payloads using in-context learning, and (2) automating defense mechanisms by testing these attacks against a vulnerable application secured by a WAF, classifying bypassing attacks, and generating effective WAF security rules. Experimental results using GPT-4o demonstrate the framework's effectiveness generating 264 XSS payloads, 83% of which were validated, with 80% bypassing ModSecurity WAF equipped with an industry standard security rule set developed by the Open Web Application Security Project (OWASP) to protect against web vulnerabilities. Through rule generation, 86% of previously successful attacks were blocked using only 15 new rules. In comparison, Google Gemini Pro achieved a lower bypass rate of 63%, highlighting performance differences across LLMs.

Paperid: 3252, https://arxiv.org/pdf/2504.08086.pdf

Abstract:
Differentially private selection mechanisms offer strong privacy guarantees for queries aiming to identify the top-scoring element r from a finite set R, based on a dataset-dependent utility function. While selection queries are fundamental in data science, few mechanisms effectively ensure their privacy. Furthermore, most approaches rely on global sensitivity to achieve differential privacy (DP), which can introduce excessive noise and impair downstream inferences. To address this limitation, we propose the Smooth Noisy Max (SNM) mechanism, which leverages smooth sensitivity to yield provably tighter (upper bounds on) expected errors compared to global sensitivity-based methods. Empirical results demonstrate that SNM is more accurate than state-of-the-art differentially private selection methods in three applications: percentile selection, greedy decision trees, and random forests.

Paperid: 3253, https://arxiv.org/pdf/2504.07938.pdf

Abstract:
This paper presents a condensed system architecture for a file transfer solution that leverages post quantum cryptography and blockchain to secure data against quantum threats. The architecture integrates NIST standardized algorithms CRYSTALS Kyber for encryption and CRYSTALS Dilithium for digital signatures with an immutable blockchain ledger to provide an auditable, decentralized storage mechanism. The system is modular, comprising a Sender module for secure encryption and signing, a central User Storage module for decryption, reencryption, and blockchain logging, and a Requestor module for authenticated data access. We include detailed pseudocode, analyze security risks, and offer performance insights to demonstrate the system's robustness, scalability, and transparency.

Paperid: 3254, https://arxiv.org/pdf/2504.07868.pdf

Abstract:
Ransomware poses a significant threat to individuals and organisations, compelling tools to investigate its behaviour and the effectiveness of mitigations. To answer this need, we present SAFARI, an open-source framework designed for safe and efficient ransomware analysis. SAFARI's design emphasises scalability, air-gapped security, and automation, democratising access to safe ransomware investigation tools and fostering collaborative efforts. SAFARI leverages virtualisation, Infrastructure-as-Code, and OS-agnostic task automation to create isolated environments for controlled ransomware execution and analysis. The framework enables researchers to profile ransomware behaviour and evaluate mitigation strategies through automated, reproducible experiments. We demonstrate SAFARI's capabilities by building a proof-of-concept implementation and using it to run two case studies. The first analyses five renowned ransomware strains (including WannaCry and LockBit) to identify their encryption patterns and file targeting strategies. The second evaluates Ranflood, a contrast tool which we use against three dangerous strains. Our results provide insights into ransomware behaviour and the effectiveness of countermeasures, showcasing SAFARI's potential to advance ransomware research and defence development.

Paperid: 3255, https://arxiv.org/pdf/2504.07574.pdf

Abstract:
This research studies the quality, speed and cost of malware analysis assisted by artificial intelligence. It focuses on Linux and IoT malware of 2024-2025, and uses r2ai, the AI extension of Radare2's disassembler. Not all malware and not all LLMs are equivalent but the study shows excellent results with Claude 3.5 and 3.7 Sonnet. Despite a few errors, the quality of analysis is overall equal or better than without AI assistance. For good results, the AI cannot operate alone and must constantly be guided by an experienced analyst. The gain of speed is largely visible with AI assistance, even when taking account the time to understand AI's hallucinations, exaggerations and omissions. The cost is usually noticeably lower than the salary of a malware analyst, but attention and guidance is needed to keep it under control in cases where the AI would naturally loop without showing progress.

Paperid: 3256, https://arxiv.org/pdf/2504.07478.pdf

Abstract:
Detecting Denial of Service (DoS) and Distributed Denial of Service (DDoS) attacks remains a critical challenge in cybersecurity. This research introduces a hybrid deep learning model combining Gated Recurrent Units (GRUs) and a Neural Turing Machine (NTM) for enhanced intrusion detection. Trained on the UNSW-NB15 and BoT-IoT datasets, the model employs GRU layers for sequential data processing and an NTM for long-term pattern recognition. The proposed approach achieves 99% accuracy in distinguishing between normal, DoS, and DDoS traffic. These findings offer promising advancements in real-time threat detection and contribute to improved network security across various domains.

Paperid: 3257, https://arxiv.org/pdf/2504.07358.pdf

Abstract:
Unmanned Aerial Vehicles (UAVs) play a pivotal role in modern autonomous air mobility, and the reliability of UAV avionics systems is critical to ensuring mission success, sustainability practices, and public safety. The success of UAV missions depends on effectively mitigating various aspects of electronic warfare, including non-destructive and destructive cyberattacks, transponder vulnerabilities, and jamming threats, while rigorously implementing countermeasures and defensive aids. This paper provides a comprehensive review of UAV cyberattacks, countermeasures, and defensive strategies. It explores UAV-to-UAV coordination attacks and their associated features, such as dispatch system attacks, Automatic Dependent Surveillance-Broadcast (ADS-B) attacks, Traffic Alert and Collision Avoidance System (TCAS)-induced collisions, and TCAS attacks. Additionally, the paper examines UAV-to-command center coordination attacks, as well as UAV functionality attacks. The review also covers various countermeasures and defensive aids designed for UAVs. Lastly, a comparison of common cyberattacks and countermeasure approaches is conducted, along with a discussion of future trends in the field. Keywords: Electronic warfare, UAVs, Avionics Systems, cyberattacks, coordination attacks, functionality attacks, countermeasure, defensive-aids.

Paperid: 3258, https://arxiv.org/pdf/2504.07018.pdf

Abstract:
Secure speculation schemes have shown great promise in the war against speculative side-channel attacks, and will be a key building block for developing secure, high-performance architectures moving forward. As the field matures, the need for rigorous microarchitectures, and corresponding performance and cost analysis, become critical for evaluating secure schemes and for enabling their future adoption. In ShadowBinding, we present effective microarchitectures for two state-of-the-art secure schemes, uncovering and mitigating fundamental microarchitectural limitations within the analyzed schemes, and provide important design characteristics. We uncover that Speculative Taint Tracking's (STT's) rename-based taint computation must be completed in a single cycle, creating an expensive dependency chain which greatly limits performance for wider processor cores. We also introduce a novel michroarchitectural approach for STT, named STT-Issue, which, by delaying the taint computation to the issue stage, eliminates the dependency chain, achieving better instructions per cycle (IPC), timing, area, and performance results. Through a comprehensive evaluation of our STT and Non-Speculative Data Access (NDA) microarchitectural designs on the RISC-V Berkeley Out-of-Order Machine, we find that the IPC impact of in-core secure schemes is higher than previously estimated, close to 20% for the highest performance core. With insights into timing from our RTL evaluation, the performance loss, created by the combined impact of IPC and timing, becomes even greater, at 35%, 27%, and 22% for STT-Rename, STT-Issue, and NDA, respectively. If these trends were to hold for leading processor core designs, the performance impact would be well over 30%, even for the best-performing scheme.

Paperid: 3259, https://arxiv.org/pdf/2504.06552.pdf

Abstract:
The widespread adoption of conversational AI platforms has introduced new security and privacy risks. While these risks and their mitigation strategies have been extensively researched from a technical perspective, users' perceptions of these platforms' security and privacy remain largely unexplored. In this paper, we conduct a large-scale analysis of over 2.5M user posts from the r/ChatGPT Reddit community to understand users' security and privacy concerns and attitudes toward conversational AI platforms. Our qualitative analysis reveals that users are concerned about each stage of the data lifecycle (i.e., collection, usage, and retention). They seek mitigations for security vulnerabilities, compliance with privacy regulations, and greater transparency and control in data handling. We also find that users exhibit varied behaviors and preferences when interacting with these platforms. Some users proactively safeguard their data and adjust privacy settings, while others prioritize convenience over privacy risks, dismissing privacy concerns in favor of benefits, or feel resigned to inevitable data sharing. Through qualitative content and regression analysis, we discover that users' concerns evolve over time with the evolving AI landscape and are influenced by technological developments and major events. Based on our findings, we provide recommendations for users, platforms, enterprises, and policymakers to enhance transparency, improve data controls, and increase user trust and adoption.

Paperid: 3260, https://arxiv.org/pdf/2504.06083.pdf

Abstract:
As a primary encryption primitive balancing the privacy and searchability of cloud storage images, thumbnail preserving encryption (TPE) enables users to quickly identify the privacy personal image on the cloud and request this image from the owner through a secure channel. In this paper, we have found that two different plaintext images may produce the same thumbnail. It results in the failure of search strategy because the collision of thumbnail occurs. To address this serious security issues, we conduct an in-depth analysis on the collision probabilities of thumbnails, and then propose a new TPE framework, called multi-factor thumbnail preserving encryption (MFTPE). It starts from the collision probability of two blocks, extend to the probabilities of two images and ultimately to N images. Then, we in detail describe three specific MFTPE constructions preserving different combinations of factors, i.e., the sum and the geometric mean, the sum and the range, and the sum and the weighted mean. The theoretical and experimental results demonstrate that the proposed MFTPE reduces the probability of thumbnails, exhibits strong robustness, and also effectively resists face detection and noise attacks.

Paperid: 3261, https://arxiv.org/pdf/2504.05485.pdf

Abstract:
Zero Trust is the new cybersecurity model that challenges the traditional one by promoting continuous verification of users, devices, and applications, whatever their position or origin. This model is critical for reducing the attack surface and preventing lateral movement without relying on implicit trust. Adopting the zero trust principle in Intelligent Transportation Systems (ITS), especially in the context of connected vehicles (CVs), presents an adequate solution in the face of increasing cyber threats, thereby strengthening the ITS environment. This paper offers an understanding of Zero Trust security through a comprehensive review of existing literature, principles, and challenges. It specifically examines its applications in emerging technologies, particularly within connected vehicles, addressing potential issues and cyber threats faced by CVs. Inclusion/exclusion criteria for the systematic literature review were planned alongside a bibliometric analysis. Moreover, keyword co-occurrence analysis was done, which indicates trends and general themes for the Zero Trust model, Zero Trust implementation, and Zero Trust application. Furthermore, the paper explores various ZT models proposed in the literature for connected vehicles, shedding light on the challenges associated with their integration into CV systems. Future directions of this research will focus on incorporating Zero Trust principles within Vehicle-to-Vehicle (V2V) and Vehicle-to-Infrastructure (V2I) communication paradigms. This initiative intends to enhance the security posture and safety protocols within interconnected vehicular networks. The proposed research seeks to address the unique cybersecurity vulnerabilities inherent in the highly dynamic nature of vehicular communication systems.

Paperid: 3262, https://arxiv.org/pdf/2504.04794.pdf

Abstract:
The rapid advancement of artificial intelligence (AI) has brought about sophisticated models capable of various tasks ranging from image recognition to natural language processing. As these models continue to grow in complexity, ensuring their trustworthiness and transparency becomes critical, particularly in decentralized environments where traditional trust mechanisms are absent. This paper addresses the challenge of verifying personalized AI models in such environments, focusing on their integrity and privacy. We propose a novel framework that integrates zero-knowledge succinct non-interactive arguments of knowledge (zk-SNARKs) with Chainlink decentralized oracles to verify AI model performance claims on blockchain platforms. Our key contribution lies in integrating zk-SNARKs with Chainlink oracles to securely fetch and verify external data to enable trustless verification of AI models on a blockchain. Our approach addresses the limitations of using unverified external data for AI verification on the blockchain while preserving sensitive information of AI models and enhancing transparency. We demonstrate our methodology with a linear regression model predicting Bitcoin prices using on-chain data verified on the Sepolia testnet. Our results indicate the framework's efficacy, with key metrics including proof generation taking an average of 233.63 seconds and verification time of 61.50 seconds. This research paves the way for transparent and trustless verification processes in blockchain-enabled AI ecosystems, addressing key challenges such as model integrity and model privacy protection. The proposed framework, while exemplified with linear regression, is designed for broader applicability across more complex AI models, setting the stage for future advancements in transparent AI verification.

Paperid: 3263, https://arxiv.org/pdf/2504.04731.pdf

Abstract:
In modern critical infrastructure such as power grids, it is crucial to ensure security of data communications between network-connected devices while following strict latency criteria. This necessitates the use of cryptographic hardware accelerators. We propose a high-performance unified elliptic curve cryptography accelerator supporting NIST standard Montgomery curves Curve25519 and Curve448 at 128-bit and 224-bit security levels respectively. Our accelerator implements extensive parallel processing of Karatsuba-style large-integer multiplications, restructures arithmetic operations in the Montgomery Ladder and exploits special mathematical properties of the underlying pseudo-Mersenne and Solinas prime fields for optimized performance. Our design ensures efficient resource sharing across both curve computations and also incorporates several standard side-channel countermeasures. Our ASIC implementation achieves record performance and energy of 10.38 $Î¼$s / 54.01 $Î¼$s and 0.72 $Î¼$J / 3.73 $Î¼$J respectively for Curve25519 / Curve448, which is significantly better than state-of-the-art.

Paperid: 3264, https://arxiv.org/pdf/2504.04511.pdf

Abstract:
We consider the problem of adapting a Post-Quantum cryptosystem to be used in resource-constrained devices, such as those typically used in Device-to-Device and Internet of Things systems. In particular, we propose leveraging the characteristics of wireless communications channels to minimize the complexity of implementation of a Post-Quantum public key encryption scheme, without diminishing its security. To that end, we focus on the adaptation of a well-known cryptosystem, namely CRYSTALS-Kyber, so as to enable its direct integration into the lowest layer of the communication stack, the physical layer, defining two new transport schemes for CRYSTALS-Kyber to be used in Device-to-Device communications, both of which are modeled under a wireless channel subject to Additive White Gaussian Noise, using a 4 Quadrature Amplitude Modulation constellation and a BCH-code to communicate CRYSTALSKyber's polynomial coefficients. Simulation results demonstrate the viability of the adapted Kyber algorithm due to its low key error probability, while maintaining the security reductions of the original Kyber by considering the error distribution imposed by the channel on the cipher.

Paperid: 3265, https://arxiv.org/pdf/2504.04388.pdf

Abstract:
Zoom serves millions of users daily and allows third-party developers to integrate their apps with the Zoom client and reach those users. So far, these apps' privacy and security aspects, which can access rich audio-visual data (among others) from Zoom, have not been scientifically investigated. This paper examines the evolution of the Zoom Marketplace over one year, identifying trends in apps, their data collection behaviors, and the transparency of privacy policies. Our findings include worrisome details about the increasing over-collection of user data, non-transparency about purposes and sharing behaviors, and possible non-compliance with relevant laws. We believe these findings will inform future privacy and security research on this platform and help improve Zoom's app review process and platform policy.

Paperid: 3266, https://arxiv.org/pdf/2504.04063.pdf

Abstract:
Unmanned Aerial Vehicles are increasingly utilized across various domains, necessitating robust security measures for their communication networks. The ASCON family, a NIST finalist in lightweight cryptography standards, is known for its simplistic yet resilient design, making it well-suited for resource-constrained environments characterized by limited processing capabilities and energy reservoirs. This study focuses on understanding the integration and assessment of the ASCON encryption algorithm in UAV networks, emphasizing its potential as a lightweight and efficient cryptographic solution. The research objectives aim to evaluate ASCON variants' effectiveness in providing security comparable to AES-128 while exhibiting lower computational cost and energy consumption within simulated UAV network environments. Comparative analysis assesses performance metrics such as encryption and decryption speeds, resource utilization, and resistance to cryptographic vulnerabilities against established algorithms like AES. Performance metrics, including peak and average execution times, overall throughput, and security properties against various cryptographic attacks, are measured and analysed to determine the most suitable cryptographic algorithm for UAV communication systems. Performance results indicate that ASCON-128a as the optimal choice for UAV communication systems requiring a balance between efficiency and security. Its superior performance metrics, robust security properties, and suitability for resource-constrained environments position it as the preferred solution for securing UAV communication networks. By leveraging the strengths of ASCON-128a, UAV communication systems can achieve optimal performance and security, ensuring reliable and secure communication in challenging operational environments.

Paperid: 3267, https://arxiv.org/pdf/2504.03730.pdf

Abstract:
The rapid development of Internet of Things (IoT) technology has significantly impacted various market sectors. According to Li et al. (2024), an estimated 75 billion devices will be on the market in 2025. The healthcare industry is a target to improve patient care and ease healthcare provider burdens. Chronic respiratory disease is likely to benefit from their inclusion, with 545 million people worldwide recorded to suffer from patients using these devices to track their dosage. At the same time, healthcare providers can improve medication administration and monitor respiratory health (Soriano et al., 2020). While IoT medical devices offer numerous benefits, they also have security vulnerabilities that can expose patient data to cyberattacks. It's crucial to prioritize security measures in developing and deploying IoT medical devices, especially in personalized health monitoring systems for individuals with respiratory conditions. Efforts are underway to assess the security risks associated with intelligent inhalers and respiratory medical devices by understanding usability behavior and technological elements to identify and address vulnerabilities effectively. This work analyses usability behavior and technical vulnerabilities, emphasizing the confidentiality of information gained from Smart Inhalers. It then extrapolates to interrogate potential vulnerabilities with Implantable Medical Devices (IMDs). Our work explores the tensions in device development through the intersection of IoT technology and respiratory health, particularly in the context of intelligent inhalers and other breathing medical devices, calling for integrating robust security measures into the development and deployment of IoT devices to safeguard patient data and ensure the secure functioning of these critical healthcare technologies.

Paperid: 3268, https://arxiv.org/pdf/2504.03726.pdf

Abstract:
This study investigates malicious AI Assistants' manipulative traits and whether the behaviours of malicious AI Assistants can be detected when interacting with human-like simulated users in various decision-making contexts. We also examine how interaction depth and ability of planning influence malicious AI Assistants' manipulative strategies and effectiveness. Using a controlled experimental design, we simulate interactions between AI Assistants (both benign and deliberately malicious) and users across eight decision-making scenarios of varying complexity and stakes. Our methodology employs two state-of-the-art language models to generate interaction data and implements Intent-Aware Prompting (IAP) to detect malicious AI Assistants. The findings reveal that malicious AI Assistants employ domain-specific persona-tailored manipulation strategies, exploiting simulated users' vulnerabilities and emotional triggers. In particular, simulated users demonstrate resistance to manipulation initially, but become increasingly vulnerable to malicious AI Assistants as the depth of the interaction increases, highlighting the significant risks associated with extended engagement with potentially manipulative systems. IAP detection methods achieve high precision with zero false positives but struggle to detect many malicious AI Assistants, resulting in high false negative rates. These findings underscore critical risks in human-AI interactions and highlight the need for robust, context-sensitive safeguards against manipulative AI behaviour in increasingly autonomous decision-support systems.

Paperid: 3269, https://arxiv.org/pdf/2504.03347.pdf

Abstract:
Efficient password cracking is a critical aspect of digital forensics, enabling investigators to decrypt protected content during criminal investigations. Traditional password cracking methods, including brute-force, dictionary and rule-based attacks face challenges in balancing efficiency with increasing computational complexity. This study explores rule based optimisation strategies to enhance the effectiveness of password cracking while minimising resource consumption. By analysing publicly available password datasets, we propose an optimised rule set that reduces computational iterations by approximately 40%, significantly improving the speed of password recovery. Additionally, the impact of national password recommendations were examined, specifically, the UK National Cyber Security Centre's three word password guideline on password security and forensic recovery. Through user generated password surveys, we evaluate the crackability of three word passwords using dictionaries of varying common word proportions. Results indicate that while three word passwords provide improved memorability and usability, they remain vulnerable when common word combinations are used, with up to 77.5% of passwords cracked using a 30% common word dictionary subset. The study underscores the importance of dynamic password cracking strategies that account for evolving user behaviours and policy driven password structures. Findings contribution to both forensic efficiency and cyber security awareness, highlight the dual impact of password policies on security and investigative capabilities. Future work will focus upon refining rule based cracking techniques and expanding research on password composition trends.

Paperid: 3270, https://arxiv.org/pdf/2504.03307.pdf

Abstract:
We study the behaviour of the algebraic degree of vectorial Boolean functions when their inputs are restricted to an affine subspace of their domain. Functions which maintain their degree on all subspaces of as high a codimension as possible are particularly interesting for cryptographic applications. For functions which are power functions $x^d$ in their univariate representation, we fully characterize the exponents $d$ for which the algebraic degree of the function stays unchanged when the input is restricted to spaces of codimension 1 or 2. For codimensions $k\ge 3$, we give a sufficient condition for the algebraic degree to stay unchanged. We apply these results to the multiplicative inverse function, as well as to the Kasami functions. We define an optimality notion regarding the stability of the degree on subspaces, and determine a number of optimal functions, including the multiplicative inverse function and the quadratic APN functions. We also give an explicit formula for counting the functions that keep their algebraic degree unchanged when restricted to hyperplanes.

Paperid: 3271, https://arxiv.org/pdf/2504.02695.pdf

Abstract:
We prove new hardness results for fundamental lattice problems under the Exponential Time Hypothesis (ETH). Building on a recent breakthrough by Bitansky et al. [BHIRW24], who gave a polynomial-time reduction from $\mathsf{3SAT}$ to the (gap) $\mathsf{MAXLIN}$ problem-a class of CSPs with linear equations over finite fields-we derive ETH-hardness for several lattice problems. First, we show that for any $p \in [1, \infty)$, there exists an explicit constant $Î³> 1$ such that $\mathsf{CVP}_{p,Î³}$ (the $\ell_p$-norm approximate Closest Vector Problem) does not admit a $2^{o(n)}$-time algorithm unless ETH is false. Our reduction is deterministic and proceeds via a direct reduction from (gap) $\mathsf{MAXLIN}$ to $\mathsf{CVP}_{p,Î³}$. Next, we prove a randomized ETH-hardness result for $\mathsf{SVP}_{p,Î³}$ (the $\ell_p$-norm approximate Shortest Vector Problem) for all $p > 2$. This result relies on a novel property of the integer lattice $\mathbb{Z}^n$ in the $\ell_p$ norm and a randomized reduction from $\mathsf{CVP}_{p,Î³}$ to $\mathsf{SVP}_{p,Î³'}$. Finally, we improve over prior reductions from $\mathsf{3SAT}$ to $\mathsf{BDD}_{p, Î±}$ (the Bounded Distance Decoding problem), yielding better ETH-hardness results for $\mathsf{BDD}_{p, Î±}$ for any $p \in [1, \infty)$ and $Î±> Î±_p^{\ddagger}$, where $Î±_p^{\ddagger}$ is an explicit threshold depending on $p$. We additionally observe that prior work implies ETH hardness for the gap minimum distance problem ($Î³$-$\mathsf{MDP}$) in codes.

Paperid: 3272, https://arxiv.org/pdf/2504.02114.pdf

Abstract:
In this study, we investigate the protection offered by federated learning algorithms against eavesdropping adversaries. In our model, the adversary is capable of intercepting model updates transmitted from clients to the server, enabling it to create its own estimate of the model. Unlike previous research, which predominantly focuses on safeguarding client data, our work shifts attention protecting the client model itself. Through a theoretical analysis, we examine how various factors, such as the probability of client selection, the structure of local objective functions, global aggregation at the server, and the eavesdropper's capabilities, impact the overall level of protection. We further validate our findings through numerical experiments, assessing the protection by evaluating the model accuracy achieved by the adversary. Finally, we compare our results with methods based on differential privacy, underscoring their limitations in this specific context.

Paperid: 3273, https://arxiv.org/pdf/2504.01606.pdf

Abstract:
Cyber threat intelligence (CTI) is essential for effective system defense. CTI is a collection of information about current or past threats to a computer system. This information is gathered by an agent through observation, or based on a set of sources. Building intelligence only makes sense if you have confidence in it. To achieve this, it is necessary to estimate the confidence in each piece of information gathered, taking into account the different dimensions that can make it up: reliability of the source, competence, plausibility of the information, credibility of the information, for example. The information gathered must then be combined with other information to consolidate an agent's knowledge. Recent advances have been made in the theory underlying the modeling of trust for decision-making based on uncertain information, notably by using multivalued logic. This approach makes it possible to deal with unknown values of trust-building parameters, or to easily integrate dimensions. In this article we present the problem of CTI and CTI information sharing, and the reasons that led us to use a logic-based solution for an initial implementation.

Paperid: 3274, https://arxiv.org/pdf/2504.01230.pdf

Abstract:
The matrix code equivalence problem consists, given two matrix spaces $\mathcal{C},\mathcal{D} \subset \mathbb{F}_q^{m\times n}$ of dimension $k$, in finding invertible matrices $P\in\mathrm{GL}_m(\mathbb{F}_q)$ and $Q\in\mathrm{GL}_n(\mathbb{F}_q)$ such that $\mathcal{D}=P\mathcal{C} Q^{-1}$. Recent signature schemes such as MEDS and ALTEQ relate their security to the hardness of this problem. Recent works by Narayanan, Qiao and Tang on the one hand and by Ran and Samardjiska on the other hand tackle this problem. The former is restricted to the ``cubic'' case $k = m =n$ and succeeds in $\widetilde{\mathcal{O}}(q^{\frac k 2})$ operations. The latter is an algebraic attack on the general problem whose complexity is not fully understood and which succeeds only on $\mathcal{O}(1/q)$ instances. We present a novel algorithm which solves the problem in the general case. Our approach consists in reducing the problem to the matrix code conjugacy problem, \emph{i.e.} the case $P=Q$. For the latter problem, similarly to the permutation code equivalence problem in Hamming metric, a natural invariant based on the \emph{Hull} of the code can be used. Next, the equivalence of codes can be deduced using a usual list collision argument. For $k=m=n$, our algorithm achieves the same time complexity as Narayanan \emph{et al.} but with a lower space complexity. Moreover, ours extends to a much broader range of parameters.

Paperid: 3275, https://arxiv.org/pdf/2504.00497.pdf

Abstract:
This paper proposes a visual encryption method to ensure the confidentiality of digital images. The model used is based on an autoencoder using aConvolutional Neural Network (CNN) to ensure the protection of the user data on both the sender side (encryption process) and the receiver side(decryption process)in a symmetric mode. To train and test the model, we used the MNIST and CIFAR-10 datasets. Our focus lies in generating an encrypted dataset by combining the original dataset with a random mask. Then, a convolutional autoencoder in the masked dataset will be designed and trained to learn essential image features in a reduced-dimensional latent space and reconstruct the image from this space. The used mask can be considered as a secret key known in standard cryptographic algorithms which allows the receiver of the masked data to recover the plain data. The implementation of this proposed encryption model demonstrates efficacy in preserving data confidentiality and integrity while reducing the dimensionality (for example we pass from 3072 Bytes to 1024 Bytes for CIFAR-10 images). Experimental results show that the used CNN exhibits a proficient encryption and decryption process on the MNIST dataset, and a proficient encryption and acceptable decryption process on the CIFAR-10 dataset.

Paperid: 3276, https://arxiv.org/pdf/2504.00428.pdf

Abstract:
Successful defense against dynamically evolving cyber threats requires advanced and sophisticated techniques. This research presents a novel approach to enhance real-time cybersecurity threat detection and response by integrating large language models (LLMs) and Retrieval-Augmented Generation (RAG) systems with continuous threat intelligence feeds. Leveraging recent advancements in LLMs, specifically GPT-4o, and the innovative application of RAG techniques, our approach addresses the limitations of traditional static threat analysis by incorporating dynamic, real-time data sources. We leveraged RAG to get the latest information in real-time for threat intelligence, which is not possible in the existing GPT-4o model. We employ the Patrowl framework to automate the retrieval of diverse cybersecurity threat intelligence feeds, including Common Vulnerabilities and Exposures (CVE), Common Weakness Enumeration (CWE), Exploit Prediction Scoring System (EPSS), and Known Exploited Vulnerabilities (KEV) databases, and integrate these with the all-mpnet-base-v2 model for high-dimensional vector embeddings, stored and queried in Milvus. We demonstrate our system's efficacy through a series of case studies, revealing significant improvements in addressing recently disclosed vulnerabilities, KEVs, and high-EPSS-score CVEs compared to the baseline GPT-4o. This work not only advances the role of LLMs in cybersecurity but also establishes a robust foundation for the development of automated intelligent cyberthreat information management systems, addressing crucial gaps in current cybersecurity practices.

Paperid: 3277, https://arxiv.org/pdf/2504.00357.pdf

Abstract:
This paper presents a novel singletrace sidechannel attack on FALCON a latticebased postquantum digital signature protocol recently approved for standardization by NIST We target the discrete Gaussian sampling operation within FALCONs key generation scheme and demonstrate that a single power trace is sufficient to mount a successful attack Notably negating the results of a 63bit rightshift operation on 64bit secret values leaks critical information about the assignment of 1 versus 0 to intermediate coefficients during sampling These leaks enable full recovery of the secret key We demonstrate a groundup approach to the attack on an ARM CortexM4 microcontroller executing both the reference and optimized implementations from FALCONs NIST round 3 software package We successfully recovered all of the secret polynomials in FALCON We further quantify the attackers success rate using a univariate Gaussian template model providing generalizable guarantees Statistical analysis with over 500000 tests reveals a percoefficient success rate of 999999999478 and a fullkey recovery rate of 9999994654 for FALCON512 We verify that this vulnerability is present in all implementations included in FALCONs NIST submission package This highlights the vulnerability of current software implementations to singletrace attacks and underscores the urgent need for singletrace resilient software in embedded systems

Paperid: 3279, https://arxiv.org/pdf/2504.00031.pdf

Abstract:
To effectively deploy Large Language Models (LLMs) in application-specific settings, fine-tuning techniques are applied to enhance performance on specialized tasks. This process often involves fine-tuning on user data data, which may contain sensitive information. Although not recommended, it is not uncommon for users to send passwords in messages, and fine-tuning models on this could result in passwords being leaked. In this study, a Large Language Model is fine-tuned with customer support data and passwords from the RockYou password wordlist using Low-Rank Adaptation (LoRA). Out of the first 200 passwords from the list, 37 were successfully recovered. Further, causal tracing is used to identify that password information is largely located in a few layers. Lastly, Rank One Model Editing (ROME) is used to remove the password information from the model, resulting in the number of passwords recovered going from 37 to 0.

Paperid: 3280, https://arxiv.org/pdf/2504.00012.pdf

Abstract:
Organisations are rapidly adopting artificial intelligence (AI) tools to perform tasks previously undertaken by people. The potential benefits are enormous. Separately, some organisations deploy personnel security measures to mitigate the security risks arising from trusted human insiders. Unfortunately, there is no meaningful interplay between the rapidly evolving domain of AI and the traditional world of personnel security. This is a problem. The complex risks from human insiders are hard enough to understand and manage, despite many decades of effort. The emerging security risks from AI insiders are even more opaque. Both sides need all the help they can get. Some of the concepts and approaches that have proved useful in dealing with human insiders are also applicable to the emerging risks from AI insiders.

Paperid: 3281, https://arxiv.org/pdf/2503.23785.pdf

Abstract:
This paper introduces ObfusQate, a novel tool that conducts obfuscations using quantum primitives to enhance the security of both classical and quantum programs. We have designed and implemented two primary categories of obfuscations: quantum circuit level obfuscation and code level obfuscation, encompassing a total of eight distinct methods. Quantum circuit-level obfuscation leverages on quantum gates and circuits, utilizing strategies such as quantum gate hiding and identity matrices to construct complex, non-intuitive circuits that effectively obscure core functionalities and resist reverse engineering, making the underlying code difficult to interpret. Meanwhile, code-level obfuscation manipulates the logical sequence of program operations through quantum-based opaque predicates, obfuscating execution paths and rendering program behavior more unpredictable and challenging to analyze. Additionally, ObfusQate can be used to obfuscate malicious code segments, making them harder to detect and analyze. These advancements establish a foundational framework for further exploration into the potential and limitations of quantum-based obfuscation techniques, positioning ObfusQate as a valuable tool for future developers to enhance code security in the evolving landscape of software development. To the best of our knowledge, ObfusQate represents the pioneering work in developing an automated framework for implementing obfuscations leveraging quantum primitives. Security evaluations show that obfuscations by ObfusQate maintain code behavior with polynomial overheads in space and time complexities. We have also demonstrated an offensive use case by embedding a keylogger into Shor's algorithm and obfuscating it using ObfusQate. Our results show that current Large language models like GPT 4o, GPT o3 mini and Grok 3 were not able to identify the malicious keylogger after obfuscation.

Paperid: 3282, https://arxiv.org/pdf/2503.23444.pdf

Abstract:
Since SARS-CoV-2 started spreading in Europe in early 2020, there has been a strong call for technical solutions to combat or contain the pandemic, with contact tracing apps at the heart of the debates. The EU's General Data Protection Regulation (GDPR) requires controllers to carry out a data protection impact assessment (DPIA) where their data processing is likely to result in a high risk to the rights and freedoms (Art. 35 GDPR). A DPIA is a structured risk analysis that identifies and evaluates possible consequences of data processing relevant to fundamental rights in advance and describes the measures envisaged to address these risks or expresses the inability to do so. Based on the Standard Data Protection Model (SDM), we present the results of a scientific and methodologically clear DPIA of the German German Corona-Warn-App (CWA). It shows that even a decentralized architecture involves numerous serious weaknesses and risks, including larger ones still left unaddressed in current implementations. It also found that none of the proposed designs operates on anonymous data or ensures proper anonymisation. It also showed that informed consent would not be a legitimate legal ground for the processing. For all points where data subjects' rights are still not sufficiently safeguarded, we briefly outline solutions.

Paperid: 3283, https://arxiv.org/pdf/2503.23292.pdf

Abstract:
Federated learning (FL) is a distributed machine learning paradigm enabling multiple clients to train a model collaboratively without exposing their local data. Among FL schemes, clustering is an effective technique addressing the heterogeneity issue (i.e., differences in data distribution and computational ability affect training performance and effectiveness) via grouping participants with similar computational resources or data distribution into clusters. However, intra-cluster data exchange poses privacy risks, while cluster selection and adaptation introduce challenges that may affect overall performance. To address these challenges, this paper introduces anonymous adaptive clustering, a novel approach that simultaneously enhances privacy protection and boosts training efficiency. Specifically, an oblivious shuffle-based anonymization method is designed to safeguard user identities and prevent the aggregation server from inferring similarities through clustering. Additionally, to improve performance, we introduce an iteration-based adaptive frequency decay strategy, which leverages variability in clustering probabilities to optimize training dynamics. With these techniques, we build the FedCAPrivacy; experiments show that FedCAPrivacy achieves ~7X improvement in terms of performance while maintaining high privacy.

Paperid: 3284, https://arxiv.org/pdf/2503.22612.pdf

Abstract:
This study evaluates the adoption of DevSecOps among small and medium-sized enterprises (SMEs), identifying key challenges, best practices, and future trends. Through a mixed methods approach backed by the Technology Acceptance Model (TAM) and Diffusion of Innovations (DOI) theory, we analyzed survey data from 405 SME professionals, revealing that while 68% have implemented DevSecOps, adoption is hindered by technical complexity (41%), resource constraints (35%), and cultural resistance (38%). Despite strong leadership prioritization of security (73%), automation gaps persist, with only 12% of organizations conducting security scans per commit. Our findings highlight a growing integration of security tools, particularly API security (63%) and software composition analysis (62%), although container security adoption remains low (34%). Looking ahead, SMEs anticipate artificial intelligence and machine learning to significantly influence DevSecOps, underscoring the need for proactive adoption of AI-driven security enhancements. Based on our findings, this research proposes strategic best practices to enhance CI/CD pipeline security including automation, leadership-driven security culture, and cross-team collaboration.

Paperid: 3285, https://arxiv.org/pdf/2503.22573.pdf

Abstract:
The increasing integration of Artificial Intelligence across multiple industry sectors necessitates robust mechanisms for ensuring transparency, trust, and auditability of its development and deployment. This topic is particularly important in light of recent calls in various jurisdictions to introduce regulation and legislation on AI safety. In this paper, we propose a framework for complete verifiable AI pipelines, identifying key components and analyzing existing cryptographic approaches that contribute to verifiability across different stages of the AI lifecycle, from data sourcing to training, inference, and unlearning. This framework could be used to combat misinformation by providing cryptographic proofs alongside AI-generated assets to allow downstream verification of their provenance and correctness. Our findings underscore the importance of ongoing research to develop cryptographic tools that are not only efficient for isolated AI processes, but that are efficiently `linkable' across different processes within the AI pipeline, to support the development of end-to-end verifiable AI technologies.

Paperid: 3286, https://arxiv.org/pdf/2503.22227.pdf

Abstract:
We introduce an open-source GPU-accelerated fully homomorphic encryption (FHE) framework CAT, which surpasses existing solutions in functionality and efficiency. \emph{CAT} features a three-layer architecture: a foundation of core math, a bridge of pre-computed elements and combined operations, and an API-accessible layer of FHE operators. It utilizes techniques such as parallel executed operations, well-defined layout patterns of cipher data, kernel fusion/segmentation, and dual GPU pools to enhance the overall execution efficiency. In addition, a memory management mechanism ensures server-side suitability and prevents data leakage. Based on our framework, we implement three widely used FHE schemes: CKKS, BFV, and BGV. The results show that our implementation on Nvidia 4090 can achieve up to 2173$\times$ speedup over CPU implementation and 1.25$\times$ over state-of-the-art GPU acceleration work for specific operations. What's more, we offer a scenario validation with CKKS-based Privacy Database Queries, achieving a 33$\times$ speedup over its CPU counterpart. All query tasks can handle datasets up to $10^3$ rows on a single GPU within 1 second, using 2-5 GB storage. Our implementation has undergone extensive stability testing and can be easily deployed on commercial GPUs. We hope that our work will significantly advance the integration of state-of-the-art FHE algorithms into diverse real-world systems by providing a robust, industry-ready, and open-source tool.

Paperid: 3287, https://arxiv.org/pdf/2503.22065.pdf

Abstract:
Recent Intrusion Detection System (IDS) research has increasingly moved towards the adoption of machine learning methods. However, most of these systems rely on supervised learning approaches, necessitating a fully labeled training set. In the realm of network intrusion detection, the requirement for extensive labeling can become impractically burdensome. Moreover, while IDS training could benefit from inter-company knowledge sharing, the sensitive nature of cybersecurity data often precludes such cooperation. To address these challenges, we propose an IDS architecture that utilizes unsupervised learning to reduce the need for labeling. We further facilitate collaborative learning through the implementation of a federated learning framework. To enhance privacy beyond what current federated clustering models offer, we introduce an innovative federated K-means++ initialization technique. Our findings indicate that transitioning from a centralized to a federated setup does not significantly diminish performance.

Paperid: 3288, https://arxiv.org/pdf/2503.21535.pdf

Abstract:
The Deligne-Ogus-Shioda theorem guarantees the existence of isomorphisms between products of supersingular elliptic curves over finite fields. In this paper, we present methods for explicitly computing these isomorphisms in polynomial time, given the endomorphism rings of the curves involved. Our approach leverages the Deuring correspondence, enabling us to reformulate computational isogeny problems into algebraic problems in quaternions. Specifically, we reduce the computation of isomorphisms to solving systems of quadratic and linear equations over the integers derived from norm equations. We develop $\ell$-adic techniques for solving these equations when we have access to a low discriminant subring. Combining these results leads to the description of an efficient probabilistic Las Vegas algorithm for computing the desired isomorphisms. Under GRH, it is proved to run in expected polynomial time.

Paperid: 3289, https://arxiv.org/pdf/2503.21528.pdf

Abstract:
Differential privacy (DP) is becoming increasingly important for deployed machine learning applications because it provides strong guarantees for protecting the privacy of individuals whose data is used to train models. However, DP mechanisms commonly used in machine learning tend to struggle on many real world distributions, including highly imbalanced or small labeled training sets. In this work, we propose a new scalable DP mechanism for deep learning models, SWAG-PPM, by using a pseudo posterior distribution that downweights by-record likelihood contributions proportionally to their disclosure risks as the randomized mechanism. As a motivating example from official statistics, we demonstrate SWAG-PPM on a workplace injury text classification task using a highly imbalanced public dataset published by the U.S. Occupational Safety and Health Administration (OSHA). We find that SWAG-PPM exhibits only modest utility degradation against a non-private comparator while greatly outperforming the industry standard DP-SGD for a similar privacy budget.

Paperid: 3290, https://arxiv.org/pdf/2503.20803.pdf

Abstract:
This paper assesses the performance of five machine learning classifiers: Decision Tree, Naive Bayes, LightGBM, Logistic Regression, and Random Forest using latent representations learned by a Variational Autoencoder from malware datasets. Results from the experiments conducted on different training-test splits with different random seeds reveal that all the models perform well in detecting malware with ensemble methods (LightGBM and Random Forest) performing slightly better than the rest. In addition, the use of latent features reduces the computational cost of the model and the need for extensive hyperparameter tuning for improved efficiency of the model for deployment. Statistical tests show that these improvements are significant, and thus, the practical relevance of integrating latent space representation with traditional classifiers for effective malware detection in cybersecurity is established.

Paperid: 3291, https://arxiv.org/pdf/2503.20800.pdf

Abstract:
In light of scaling laws, many AI institutions are intensifying efforts to construct advanced AIs on extensive collections of high-quality human data. However, in a rush to stay competitive, some institutions may inadvertently or even deliberately include unauthorized data (like privacy- or intellectual property-sensitive content) for AI training, which infringes on the rights of data owners. Compounding this issue, these advanced AI services are typically built on opaque cloud platforms, which restricts access to internal information during AI training and inference, leaving only the generated outputs available for forensics. Thus, despite the introduction of legal frameworks by various countries to safeguard data rights, uncovering evidence of data misuse in modern opaque AI applications remains a significant challenge. In this paper, inspired by the ability of isotopes to trace elements within chemical reactions, we introduce the concept of information isotopes and elucidate their properties in tracing training data within opaque AI systems. Furthermore, we propose an information isotope tracing method designed to identify and provide evidence of unauthorized data usage by detecting the presence of target information isotopes in AI generations. We conduct experiments on ten AI models (including GPT-4o, Claude-3.5, and DeepSeek) and four benchmark datasets in critical domains (medical data, copyrighted books, and news). Results show that our method can distinguish training datasets from non-training datasets with 99\% accuracy and significant evidence (p-value$<0.001$) by examining a data entry equivalent in length to a research paper. The findings show the potential of our work as an inclusive tool for empowering individuals, including those without expertise in AI, to safeguard their data rights in the rapidly evolving era of AI advancements and applications.

Paperid: 3292, https://arxiv.org/pdf/2503.20796.pdf

Abstract:
Sophisticated phishing attacks have emerged as a major cybersecurity threat, becoming more common and difficult to prevent. Though machine learning techniques have shown promise in detecting phishing attacks, they function mainly as "black boxes" without revealing their decision-making rationale. This lack of transparency erodes the trust of users and diminishes their effective threat response. We present EXPLICATE: a framework that enhances phishing detection through a three-component architecture: an ML-based classifier using domain-specific features, a dual-explanation layer combining LIME and SHAP for complementary feature-level insights, and an LLM enhancement using DeepSeek v3 to translate technical explanations into accessible natural language. Our experiments show that EXPLICATE attains 98.4 % accuracy on all metrics, which is on par with existing deep learning techniques but has better explainability. High-quality explanations are generated by the framework with an accuracy of 94.2 % as well as a consistency of 96.8\% between the LLM output and model prediction. We create EXPLICATE as a fully usable GUI application and a light Chrome extension, showing its applicability in many deployment situations. The research shows that high detection performance can go hand-in-hand with meaningful explainability in security applications. Most important, it addresses the critical divide between automated AI and user trust in phishing detection systems.

Paperid: 3293, https://arxiv.org/pdf/2503.20281.pdf

Abstract:
Network Intrusion Detection Systems (NIDS) are vital for ensuring enterprise security. Recently, Graph-based NIDS (GIDS) have attracted considerable attention because of their capability to effectively capture the complex relationships within the graph structures of data communications. Despite their promise, the reproducibility and replicability of these GIDS remain largely unexplored, posing challenges for developing reliable and robust detection systems. This study bridges this gap by designing a systematic approach to evaluate state-of-the-art GIDS, which includes critically assessing, extending, and clarifying the findings of these systems. We further assess the robustness of GIDS under adversarial attacks. Evaluations were conducted on three public datasets as well as a newly collected large-scale enterprise dataset. Our findings reveal significant performance discrepancies, highlighting challenges related to dataset scale, model inputs, and implementation settings. We demonstrate difficulties in reproducing and replicating results, particularly concerning false positive rates and robustness against adversarial attacks. This work provides valuable insights and recommendations for future research, emphasizing the importance of rigorous reproduction and replication studies in developing robust and generalizable GIDS solutions.

Paperid: 3294, https://arxiv.org/pdf/2503.19638.pdf

Abstract:
Smart grids have undergone a profound digitization process, integrating new data-driven control and supervision techniques, resulting in modern digital substations (DS). Attackers are more focused on attacking the supply chain of the DS, as they a comprise a multivendor environment. In this research work, we present the Substation Bill of Materials (Subs-BOM) schema, based on the CycloneDX specification, that is capable of modeling all the IEDs in a DS and their relationships from a cybersecurity perspective. The proposed Subs-BOM allows one to make informed decisions about cyber risks related to the supply chain, and enables managing multiple DS at the same time. This provides energy utilities with an accurate and complete inventory of the devices, the firmware they are running, and the services that are deployed into the DS. The Subs-BOM is generated using the Substation Configuration Description (SCD) file specified in the IEC 61850 standard as its main source of information. We validated the Subs-BOM schema against the Dependency-Track software by OWASP. This validation proved that the schema is correctly recognized by CycloneDX-compatible tools. Moreover, the Dependency-Track software could track existing vulnerabilities in the IEDs represented by the Subs-BOM.

Paperid: 3295, https://arxiv.org/pdf/2503.19531.pdf

Abstract:
The advent of quantum computing poses a significant challenge as it has the potential to break certain cryptographic algorithms, necessitating a proactive approach to identify and modernize cryptographic code. Identifying these cryptographic elements in existing code is only the first step. It is crucial not only to identify quantum vulnerable algorithms but also to detect vulnerabilities and incorrect crypto usages, to prioritize, report, monitor as well as remediate and modernize code bases. A U.S. government memorandum require agencies to begin their transition to PQC (Post Quantum Cryptograpy) by conducting a prioritized inventory of cryptographic systems including software and hardware systems. In this paper we describe our code scanning tool - Cryptoscope - which leverages cryptographic domain knowledge as well as compiler techniques to statically parse and analyze source code. By analyzing control and data flow the tool is able to build an extendable and querriable inventory of cryptography. Cryptoscope goes beyond identifying disconnected cryptographic APIs and instead provides the user with an inventory of cryptographic assets - containing comprehensive views of the cryptographic operations implemented. We show that for more than 92% of our test cases, these views include the cryptographic operation itself, APIs, as well as the related material such as keys, nonces, random sources etc. Lastly, building on top of this inventory, our tool is able to detect and report all the cryptographic related weaknesses and vulnerabilities (11 out of 15) in CamBench - achieving state-of-the-art performance.

Paperid: 3296, https://arxiv.org/pdf/2503.19141.pdf

Abstract:
Let $p$ be a prime, and let $N$ be a positive integer such that $p$ is a primitive root modulo $N$. Define $q = p^e$, where $e = Ï(N)$, and let $\mathbb{F}_q$ be the finite field of order $q$ with $\mathbb{F}_p$ as its prime subfield. Denote by $\mathrm{Tr}$ the trace function from $\mathbb{F}_q$ to $\mathbb{F}_p$. For $Î±\in \mathbb{F}_p$ and $Î²\in \mathbb{F}_q$, let $D$ be the set of nonzero solutions in $\mathbb{F}_q$ to the equation $\mathrm{Tr}(x^{\frac{q-1}{N}} + Î²x) = Î±$. Writing $D = \{d_1, \ldots, d_n\}$, we define the code $\mathcal{C}_{Î±,Î²} = \{(\mathrm{Tr}(d_1 x), \ldots, \mathrm{Tr}(d_n x)) : x \in \mathbb{F}_q\}$. In this paper, we investigate the weight distribution of $\mathcal{C}_{Î±,Î²}$ for all $Î±\in \mathbb{F}_p$ and $Î²\in \mathbb{F}_q$, with a focus on general odd primes $p$. When $Î²= 0$, we establish that $\mathcal{C}_{Î±,0}$ is a two-weight code for any $Î±\in \mathbb{F}_p$ and compute its weight distribution. For $Î²\neq 0$, we determine all possible weights of codewords in $\mathcal{C}_{Î±,Î²}$, demonstrating that it has at most $p+1$ distinct nonzero weights. Additionally, we prove that the dual code $\mathcal{C}_{0,0}^{\perp}$ is optimal with respect to the sphere packing bound. These findings extend prior results to the broader case of any odd prime $p$.

Paperid: 3297, https://arxiv.org/pdf/2503.19030.pdf

Abstract:
Microsoft's STRIDE methodology is at the forefront of threat modeling, supporting the increasingly critical quality attribute of security in software-intensive systems. However, in a comprehensive security evaluation process, the general consensus is that the STRIDE classification is only useful for threat elicitation, isolating threat modeling from the other security evaluation activities involved in a secure software development life cycle (SDLC). We present strideSEA, a STRIDE-centric Security Evaluation Approach that integrates STRIDE as the central classification scheme into the security activities of threat modeling, attack scenario analysis, risk analysis, and countermeasure recommendation that are conducted alongside software engineering activities in secure SDLCs. The application of strideSEA is demonstrated in a real-world online immunization system case study. Using STRIDE as a single unifying thread, we bind existing security evaluation approaches in the four security activities of strideSEA to analyze (1) threats using Microsoft threat modeling tool, (2) attack scenarios using attack trees, (3) systemic risk using NASA's defect detection and prevention (DDP) technique, and (4) recommend countermeasures based on their effectiveness in reducing the most critical risks using DDP. The results include a detailed quantitative assessment of the security of the online immunization system with a clear definition of the role and advantages of integrating STRIDE in the evaluation process. Overall, the unified approach in strideSEA enables a more structured security evaluation process, allowing easier identification and recommendation of countermeasures, thus supporting the security requirements and eliciting design considerations, informing the software development life cycle of future software-based information systems.

Paperid: 3298, https://arxiv.org/pdf/2503.18899.pdf

Abstract:
Many real-world applications are increasingly incorporating automated decision-making, driven by the widespread adoption of ML/AI inference for planning and guidance. This study examines the growing need for verifiable computing in autonomous decision-making. We formalize the problem of verifiable computing and introduce a sampling-based protocol that is significantly faster, more cost-effective, and simpler than existing methods. Furthermore, we tackle the challenges posed by non-determinism, proposing a set of strategies to effectively manage common scenarios.

Paperid: 3299, https://arxiv.org/pdf/2503.18890.pdf

Abstract:
We propose a public-key quantum money scheme based on group actions and the Hartley transform. Our scheme adapts the quantum money scheme of Zhandry (2024), replacing the Fourier transform with the Hartley transform. This substitution ensures the banknotes have real amplitudes rather than complex amplitudes, which could offer both computational and theoretical advantages. To support this new construction, we propose a new verification algorithm that uses group action twists to address verification failures caused by the switch to real amplitudes. We also show how to efficiently compute the serial number associated with a money state using a new algorithm based on continuous-time quantum walks. Finally, we present a recursive algorithm for the quantum Hartley transform, achieving lower gate complexity than prior work and demonstrate how to compute other real quantum transforms, such as the quantum sine transform, using the quantum Hartley transform as a subroutine.

Paperid: 3300, https://arxiv.org/pdf/2503.18857.pdf

Abstract:
Structural Health Monitoring (SHM) plays a crucial role in maintaining aging and critical infrastructure, supporting applications such as smart cities and digital twinning. These applications demand machine learning models capable of processing large volumes of real-time sensor data at the network edge. However, existing approaches often neglect the challenges of deploying machine learning models at the edge or are constrained by vendor-specific platforms. This paper introduces a scalable and secure edge-computing reference architecture tailored for data-driven SHM. We share practical insights from deploying this architecture at the Memorial Bridge in New Hampshire, US, referred to as the Living Bridge project. Our solution integrates a commercial data acquisition system with off-the-shelf hardware running an open-source edge-computing platform, remotely managed and scaled through cloud services. To support the development of data-driven SHM systems, we propose a resource consumption benchmarking framework called edgeOps to evaluate the performance of machine learning models on edge devices. We study this framework by collecting resource utilization data for machine learning models typically used in SHM applications on two different edge computing hardware platforms. edgeOps was specifically studied on off-the-shelf Linux and ARM-based edge devices. Our findings demonstrate the impact of platform and model selection on system performance, providing actionable guidance for edge-based SHM system design.

Paperid: 3301, https://arxiv.org/pdf/2503.18721.pdf

Abstract:
Identification of joint dependence among more than two random vectors plays an important role in many statistical applications, where the data may contain sensitive or confidential information. In this paper, we consider the the $d$-variable Hilbert-Schmidt independence criterion (dHSIC) in the context of differential privacy. Given the limiting distribution of the empirical estimate of dHSIC is complicated Gaussian chaos, constructing tests in the non-privacy regime is typically based on permutation and bootstrap. To detect joint dependence in privacy, we propose a dHSIC-based testing procedure by employing a differentially private permutation methodology. Our method enjoys privacy guarantee, valid level and pointwise consistency, while the bootstrap counterpart suffers inconsistent power. We further investigate the uniform power of the proposed test in dHSIC metric and $L_2$ metric, indicating that the proposed test attains the minimax optimal power across different privacy regimes. As a byproduct, our results also contain the pointwise and uniform power of the non-private permutation dHSIC, addressing an unsolved question remained in Pfister et al. (2018). Both numerical simulations and real data analysis on causal inference suggest our proposed test performs well empirically.

Paperid: 3302, https://arxiv.org/pdf/2503.18550.pdf

Abstract:
As Virtual Reality (VR) expands into fields like healthcare and education, ensuring secure and user-friendly authentication becomes essential. Traditional password entry methods in VR are cumbersome and insecure, making password managers (PMs) a potential solution. To explore this field, we conducted a user study (n=126 VR users) where participants expressed a strong preference for simpler passwords and showed interest in biometric authentication and password managers. On these grounds, we provide the first in-depth evaluation of PMs in VR. We report findings from 91 cognitive walkthroughs, revealing that while PMs improve usability, they are not yet ready for prime time. Key features like cross-app autofill are missing, and user experiences highlight the need for better solutions. Based on consolidated user views and expert analysis, we make recommendations on how to move forward in improving VR authentication systems, ultimately creating more practical solutions for this growing field.

Paperid: 3303, https://arxiv.org/pdf/2503.17873.pdf

Abstract:
Recently, the Internet of Things (IoT) environment has become increasingly fertile for malicious users to break the security and privacy of IoT users. Access control is a paramount necessity to forestall illicit access. Traditional access control mechanisms are designed and managed in a centralized manner, thus rendering them unfit for decentralized IoT systems. To address the distributed IoT environment, blockchain is viewed as a promising decentralised data management technology. In this thesis, we investigate the state-of-art works in the domain of distributed blockchain-based access control. We establish the most important requirements and assess related works against them. We propose a Distributed Blockchain and Attribute-based Access Control model for IoT entitled (DBC-ABAC) that merges blockchain technology with the attribute-based access control model. A proof-of-concept implementation is presented using Hyperledger Fabric. To validate performance, we experimentally evaluate and compare our work with other recent works using Hyperledger Caliper tool. Results indicate that the proposed model surpasses other works in terms of latency and throughput with considerable efficiency.

Paperid: 3304, https://arxiv.org/pdf/2503.17767.pdf

Abstract:
The aim of this paper is to present a new design for a pseudorandom number generator (PRNG) that is cryptographically secure, passes all of the usual statistical tests referenced in the literature and hence generates high quality random sequences, that is compact and easy to implement in practice, of portable design and offering reasonable execution times. Our procedure achieves those objectives through the use of a sequence of modular exponentiations followed by the application of Feistel-like boxes that mix up bits using a nonlinear function. The results of extensive statistical tests on sequences of about 2^40 bits in size generated by our algorithm are also presented.

Paperid: 3305, https://arxiv.org/pdf/2503.17730.pdf

Abstract:
Cybersecurity in robotics stands out as a key aspect within Regulation (EU) 2024/1689, also known as the Artificial Intelligence Act, which establishes specific guidelines for intelligent and automated systems. A fundamental distinction in this regulatory framework is the difference between robots with Artificial Intelligence (AI) and those that operate through automation systems without AI, since the former are subject to stricter security requirements due to their learning and autonomy capabilities. This work analyzes cybersecurity tools applicable to advanced robotic systems, with special emphasis on the protection of knowledge bases in cognitive architectures. Furthermore, a list of basic tools is proposed to guarantee the security, integrity, and resilience of these systems, and a practical case is presented, focused on the analysis of robot knowledge management, where ten evaluation criteria are defined to ensure compliance with the regulation and reduce risks in human-robot interaction (HRI) environments.

Paperid: 3306, https://arxiv.org/pdf/2503.17302.pdf

Abstract:
As software systems grow increasingly complex, ensuring security during development poses significant challenges. Traditional manual code audits are often expensive, time-intensive, and ill-suited for fast-paced workflows, while automated tools frequently suffer from high false-positive rates, limiting their reliability. To address these issues, we introduce Bugdar, an AI-augmented code review system that integrates seamlessly into GitHub pull requests, providing near real-time, context-aware vulnerability analysis. Bugdar leverages fine-tunable Large Language Models (LLMs) and Retrieval Augmented Generation (RAGs) to deliver project-specific, actionable feedback that aligns with each codebase's unique requirements and developer practices. Supporting multiple programming languages, including Solidity, Move, Rust, and Python, Bugdar demonstrates exceptional efficiency, processing an average of 56.4 seconds per pull request or 30 lines of code per second. This is significantly faster than manual reviews, which could take hours per pull request. By facilitating a proactive approach to secure coding, Bugdar reduces the reliance on manual reviews, accelerates development cycles, and enhances the security posture of software systems without compromising productivity.

Paperid: 3307, https://arxiv.org/pdf/2503.16992.pdf

Abstract:
In a 'digital by default' society, essential services must be accessed online. This opens users to digital deception not only from criminal fraudsters but from a range of actors in a marketised digital economy. Using grounded empirical research from northern England, we show how supposedly 'trusted' actors, such as governments,(re)produce the insecurities and harms that they seek to prevent. Enhanced by a weakening of social institutions amid a drive for efficiency and scale, this has built a constricted, unpredictable digital channel. We conceptualise this as a "snipers' alley". Four key snipers articulated by participants' lived experiences are examined: 1) Governments; 2) Business; 3) Criminal Fraudsters; and 4) Friends and Family to explore how snipers are differentially experienced and transfigure through this constricted digital channel. We discuss strategies to re-configure the alley, and how crafting and adopting opportunity models can enable more equitable forms of security for all.

Paperid: 3308, https://arxiv.org/pdf/2503.16719.pdf

Abstract:
Cloud services have become an essential infrastructure for enterprises and individuals. Access to these cloud services is typically governed by Identity and Access Management systems, where user authentication often relies on passwords. While best practices dictate the implementation of multi-factor authentication, it's a reality that many such users remain solely protected by passwords. This reliance on passwords creates a significant vulnerability, as these credentials can be compromised through various means, including side-channel attacks. This paper exploits keyboard acoustic emanations to infer typed natural language passphrases via unsupervised learning, necessitating no previous training data. Whilst this work focuses on short passphrases, it is also applicable to longer messages, such as confidential emails, where the margin for error is much greater, than with passphrases, making the attack even more effective in such a setting. Unlike traditional attacks that require physical access to the target device, acoustic side-channel attacks can be executed within the vicinity, without the user's knowledge, offering a worthwhile avenue for malicious actors. Our findings replicate and extend previous work, confirming that cross-correlation audio preprocessing outperforms methods like mel-frequency-cepstral coefficients and fast-fourier transforms in keystroke clustering. Moreover, we show that partial passphrase recovery through clustering and a dictionary attack can enable faster than brute-force attacks, further emphasizing the risks posed by this attack vector.

Paperid: 3309, https://arxiv.org/pdf/2503.16612.pdf

Abstract:
Trusted Execution Environments (TEEs) have emerged at the forefront of edge computing to combat the lack of trust between system components. Field Programmable Gate Arrays (FPGAs) are commonly used as edge computers but were not created with security as a primary consideration. Thus, FPGA-based edge computers are increasingly the target of cyberattacks. We analyze the existing literature to systematize the applications and features of FPGA-based TEEs. We identified 27 primary studies related to different types of System-on-Chip FPGA-based TEEs. Across a wide range of applications and features, the availability of extensible solutions is limited. Most solutions focus on specific features and applications, whereas few solutions focus on feature-rich, comprehensive TEEs that can be utilized across computer systems. Whether TEEs are specific or extensible, the paucity of published studies provides evidence of research gaps. This SoK delineates these gaps revealing opportunities for researchers and developers.

Paperid: 3310, https://arxiv.org/pdf/2503.15881.pdf

Abstract:
Data redundancy techniques have been tested in several different applications to provide fault tolerance and performance gains. The use of these techniques is mostly seen at the hardware, device driver, or file system level. In practice, the use of data integrity techniques with logical data has largely been limited to verifying the integrity of transferred files using cryptographic hashes. In this paper, we study the RAID scheme used with disk arrays and adapt it for use with logical data. An implementation for such a system is devised in theory and implemented in software, providing the specifications for the procedures and file formats used. Rigorous experimentation is conducted to test the effectiveness of the developed system for multiple use cases. With computer-generated benchmarks and simulated experiments, the system demonstrates robust performance in recovering arbitrary faults in large archive files only using a small fraction of redundant data. This was achieved by leveraging computing power for the process of data recovery.

Paperid: 3311, https://arxiv.org/pdf/2503.15730.pdf

Abstract:
This paper presents a systematic review of recent advancements in V2G cybersecurity, employing the PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) framework for detailed searches across three journal databases and included only peer-reviewed studies published between 2020 and 2024 (June). We identified and reviewed 133 V2G cybersecurity studies and found five important insights on existing V2G cybersecurity research. First, most studies (103 of 133) focused on protecting V2G systems against cyber threats, while only seven studies addressed the recovery aspect of the CRML (Cybersecurity Risk Management Lifecycle) function. Second, existing studies have adequately addressed the security of EVs and EVCS (EV charging stations) in V2G systems (112 and 81 of 133 studies, respectively). However, none have focused on the linkage between the behaviour of EV users and the cybersecurity of V2G systems. Third, physical access, control-related vulnerabilities, and user behaviour-related attacks in V2G systems are not addressed significantly. Furthermore, existing studies overlook vulnerabilities and attacks specific to AI and blockchain technologies. Fourth, blockchain, artificial intelligence (AI), encryption, control theory, and optimisation are the main technologies used, and finally, the inclusion of quantum safety within encryption and AI models and AI assurance (AIA) is in a very early stage; only two and one of 133 studies explicitly addressed quantum safety and AIA through explainability. By providing a holistic perspective, this study identifies critical research gaps and outlines future directions for developing robust end-to-end cybersecurity solutions to safeguard V2G systems and support global sustainability goals.

Paperid: 3312, https://arxiv.org/pdf/2503.15678.pdf

Abstract:
The financial sector faces escalating cyber threats amplified by artificial intelligence (AI) and the advent of quantum computing. AI is being weaponized for sophisticated attacks like deepfakes and AI-driven malware, while quantum computing threatens to render current encryption methods obsolete. This report analyzes these threats, relevant frameworks, and possible countermeasures like quantum cryptography. AI enhances social engineering and phishing attacks via personalized content, lowers entry barriers for cybercriminals, and introduces risks like data poisoning and adversarial AI. Quantum computing, particularly Shor's algorithm, poses a fundamental threat to current encryption standards (RSA and ECC), with estimates suggesting cryptographically relevant quantum computers could emerge within the next 5-30 years. The "harvest now, decrypt later" scenario highlights the urgency of transitioning to quantum-resistant cryptography. This is key. Existing legal frameworks are evolving to address AI in cybercrime, but quantum threats require new initiatives. International cooperation and harmonized regulations are crucial. Quantum Key Distribution (QKD) offers theoretical security but faces practical limitations. Post-quantum cryptography (PQC) is a promising alternative, with ongoing standardization efforts. Recommendations for international regulators include fostering collaboration and information sharing, establishing global standards, supporting research and development in quantum security, harmonizing legal frameworks, promoting cryptographic agility, and raising awareness and education. The financial industry must adopt a proactive and adaptive approach to cybersecurity, investing in research, developing migration plans for quantum-resistant cryptography, and embracing a multi-faceted, collaborative strategy to build a resilient, quantum-safe, and AI-resilient financial ecosystem

Paperid: 3313, https://arxiv.org/pdf/2503.15648.pdf

Abstract:
Cancelable biometric schemes are designed to extract an identity-preserving, non-invertible as well as revocable pseudo-identifier from biometric data. Recognition systems need to store only this pseudo-identifier, to avoid tampering and/or stealing of original biometric data during the recognition process. State-of-the-art cancelable schemes generate pseudo-identifiers by transforming the original template using either user-specific salting or many-to-one transformations. In addition to the performance concerns, most of such schemes are modality-specific and prone to reconstruction attacks as there are chances for unauthorized access to security-critical transformation keys. A novel, modality-independent cancelable biometric scheme is proposed to overcome these limitations. In this scheme, a cancelable template (pseudo identifier) is generated as a distance vector between multiple random transformations of the biometric feature vector. These transformations were done by grouping feature vector components based on a set of user-specific random vectors. The proposed scheme nullifies the possibility of template reconstruction as the generated cancelable template contains only the distance values between the different random transformations of the feature vector and it does not store any details of the biometric template. The recognition performance of the proposed scheme is evaluated for face and fingerprint modalities. Equal Error Rate (EER) of 1.5 is obtained for face and 1.7 is obtained for the fingerprint in the worst case.

Paperid: 3314, https://arxiv.org/pdf/2503.15560.pdf

Abstract:
Large Language Models (LLMs) are increasingly vulnerable to sophisticated multi-turn manipulation attacks, where adversaries strategically build context through seemingly benign conversational turns to circumvent safety measures and elicit harmful or unauthorized responses. These attacks exploit the temporal nature of dialogue to evade single-turn detection methods, representing a critical security vulnerability with significant implications for real-world deployments. This paper introduces the Temporal Context Awareness (TCA) framework, a novel defense mechanism designed to address this challenge by continuously analyzing semantic drift, cross-turn intention consistency and evolving conversational patterns. The TCA framework integrates dynamic context embedding analysis, cross-turn consistency verification, and progressive risk scoring to detect and mitigate manipulation attempts effectively. Preliminary evaluations on simulated adversarial scenarios demonstrate the framework's potential to identify subtle manipulation patterns often missed by traditional detection techniques, offering a much-needed layer of security for conversational AI systems. In addition to outlining the design of TCA , we analyze diverse attack vectors and their progression across multi-turn conversation, providing valuable insights into adversarial tactics and their impact on LLM vulnerabilities. Our findings underscore the pressing need for robust, context-aware defenses in conversational AI systems and highlight TCA framework as a promising direction for securing LLMs while preserving their utility in legitimate applications. We make our implementation available to support further research in this emerging area of AI security.

Paperid: 3315, https://arxiv.org/pdf/2503.15546.pdf

Abstract:
The integration of Large Language Models (LLMs) into autonomous robotic agents for conducting online transactions poses significant cybersecurity challenges. This study aims to enforce robust cybersecurity constraints to mitigate the risks associated with data breaches, transaction fraud, and system manipulation. The background focuses on the rise of LLM-driven robotic systems in e-commerce, finance, and service industries, alongside the vulnerabilities they introduce. A novel security architecture combining blockchain technology with multi-factor authentication (MFA) and real-time anomaly detection was implemented to safeguard transactions. Key performance metrics such as transaction integrity, response time, and breach detection accuracy were evaluated, showing improved security and system performance. The results highlight that the proposed architecture reduced fraudulent transactions by 90%, improved breach detection accuracy to 98%, and ensured secure transaction validation within a latency of 0.05 seconds. These findings emphasize the importance of cybersecurity in the deployment of LLM-driven robotic systems and suggest a framework adaptable to various online platforms.

Paperid: 3316, https://arxiv.org/pdf/2503.15380.pdf

Abstract:
We present ChonkyBFT, a partially-synchronous Byzantine fault-tolerant (BFT) consensus protocol used in the ZKsync system. The proposed protocol is a hybrid protocol inspired by FAB Paxos, Fast-HotStuff, and HotStuff-2. It is a committee-based protocol with only one round of voting, single slot finality, quadratic communication, and n >= 5f + 1 fault tolerance. This design enables its effective application within the context of the ZKsync rollup, achieving its most critical goals: simplicity, low transaction latency, and reduced system complexity. The target audience for this paper is the ZKsync community and others worldwide who seek assurance in the safety and security of the ZKsync protocols. The described consensus protocol has been implemented, analyzed, and tested using formal methods.

Paperid: 3317, https://arxiv.org/pdf/2503.15065.pdf

Abstract:
Memory forensics is a powerful technique commonly adopted to investigate compromised machines and to detect stealthy computer attacks that do not store data on non-volatile storage. To employ this technique effectively, the analyst has to first acquire a faithful copy of the system's volatile memory after the incident. However, almost all memory acquisition tools capture the content of physical memory without stopping the system's activity and by following the ascending order of the physical pages, which can lead to inconsistencies and errors in the dump. In this paper we developed a system to track all write operations performed by the OS kernel during a memory acquisition process. This allows us to quantify, for the first time, the exact number and type of inconsistencies observed in memory dumps. We examine the runtime activity of three different operating systems and the way they manage physical memory. Then, focusing on Linux, we quantify how different acquisition modes, file systems, and hardware targets influence the frequency of kernel writes during the dump. We also analyze the impact of inconsistencies on the reconstruction of page tables and major kernel data structures used by Volatility to extract forensic artifacts. Our results show that inconsistencies are very common and that their presence can undermine the reliability and validity of memory forensics analysis.

Paperid: 3318, https://arxiv.org/pdf/2503.14729.pdf

Abstract:
We introduce a zero-knowledge cryptocurrency mixer framework that allows groups of users to set up a mixing pool with configurable governance conditions, configurable deposit delays, and the ability to refund or confiscate deposits if it is suspected that funds originate from crime. Using a consensus process, group participants can monitor inputs to the mixer and determine whether the inputs satisfy the mixer conditions. If a deposit is accepted by the group, it will enter the mixer and become untraceable. If it is not accepted, the verifiers can freeze the deposit and collectively vote to either refund the deposit back to the user, or confiscate the deposit and send it to a different user. This behaviour can be used to examine deposits, determine if they originate from a legitimate source, and if not, return deposits to victims of crime.

Paperid: 3319, https://arxiv.org/pdf/2503.14006.pdf

Abstract:
Automated insulin delivery (AID) systems have emerged as a significant technological advancement in diabetes care. These systems integrate a continuous glucose monitor, an insulin pump, and control algorithms to automate insulin delivery, reducing the burden of self-management and offering enhanced glucose control. However, the increasing reliance on wireless connectivity and software control has exposed AID systems to critical security risks that could result in life-threatening treatment errors. This review first presents a comprehensive examination of the security landscape, covering technical vulnerabilities, legal frameworks, and commercial product considerations, and an analysis of existing research on attack vectors, defence mechanisms, as well as evaluation methods and resources for AID systems. Despite recent advancements, several open challenges remain in achieving secure AID systems, particularly in standardising security evaluation frameworks and developing comprehensive, lightweight, and adaptive defence strategies. As one of the most widely adopted and extensively studied physiologic closed-loop control systems, this review serves as a valuable reference for understanding security challenges and solutions applicable to analogous medical systems.

Paperid: 3320, https://arxiv.org/pdf/2503.13478.pdf

Abstract:
Highway work zones are critical areas where accidents frequently occur, often due to the proximity of workers to heavy machinery and ongoing traffic. With technological advancements in sensor technologies and the Internet of Things, promising solutions are emerging to address these safety concerns. This paper provides a systematic review of existing studies on the application of sensor technologies in enhancing highway work zone safety, particularly in preventing intrusion and proximity hazards. Following the Preferred Reporting Items for Systematic Review and Meta-Analyses (PRISMA) protocol, the review examines a broad spectrum of publications on various sensor technologies, including GPS, radar, laser, infrared, RFID, Bluetooth, ultrasonic, and infrared sensors, detailing their application in reducing intrusion and proximity incidents. The review also assesses these technologies in terms of their accuracy, range, power consumption, cost, and user-friendliness, with a specific emphasis on their suitability for highway work zones. The findings highlight the potential of sensor technologies to significantly enhance work zone safety. As there are a wide range of sensor technologies to choose from, the review also revealed that selection of sensors for a particular application needs careful consideration of different factors. Finally, while sensor technologies offer promising solutions for enhancing highway work zone safety, their effective implementation requires comprehensive consideration of various factors beyond technological capabilities, including developing integrated, cost-effective, user-friendly, and secure systems, and creating regulatory frameworks to support the rapid development of these technologies.

Paperid: 3321, https://arxiv.org/pdf/2503.13052.pdf

Abstract:
This study empirically analyzes the transaction activity of Bitcoin addresses linked to Russian intelligence services, which have liquidated over 7 Bitcoin (BTC), i.e., equivalent to approximately US$300,000 based on the exchange rate at the time. Our investigation begins with an observed anomaly in transaction outputs featuring the Bitcoin Script operation code, tied to input addresses identified by cyber threat intelligence sources and court documents as belonging to Russian intelligence agencies. We explore how an unauthorized entity appears to have gained control of the associated private keys, with messages embedded in the outputs confirming the seizure. Tracing the funds' origins, we connect them to cryptocurrency mixers and establish a link to the Russian ransomware group Conti, implicating intelligence service involvement. This analysis represents one of the first empirical studies of large-scale Bitcoin misuse by nation-state cyber actors.

Paperid: 3322, https://arxiv.org/pdf/2503.12952.pdf

Abstract:
As quantum computing advances, modern cryptographic standards face an existential threat, necessitating a transition to post-quantum cryptography (PQC). The National Institute of Standards and Technology (NIST) has selected CRYSTALS-Kyber and CRYSTALS-Dilithium as standardized PQC algorithms for secure key exchange and digital signatures, respectively. This study conducts a comprehensive performance analysis of these algorithms by benchmarking execution times across cryptographic operations such as key generation, encapsulation, decapsulation, signing, and verification. Additionally, the impact of AVX2 optimizations is evaluated to assess hardware acceleration benefits. Our findings demonstrate that Kyber and Dilithium achieve efficient execution times, outperforming classical cryptographic schemes such as RSA and ECDSA at equivalent security levels. Beyond technical performance, the real-world deployment of PQC introduces challenges in telecommunications networks, where large-scale infrastructure upgrades, interoperability with legacy systems, and regulatory constraints must be addressed. This paper examines the feasibility of PQC adoption in telecom environments, highlighting key transition challenges, security risks, and implementation strategies. Through industry case studies, we illustrate how telecom operators are integrating PQC into 5G authentication, subscriber identity protection, and secure communications. Our analysis provides insights into the computational trade-offs, deployment considerations, and standardization efforts shaping the future of quantum-safe cryptographic infrastructure.

Paperid: 3323, https://arxiv.org/pdf/2503.12719.pdf

Abstract:
Atomic swaps have been widely considered to be an ideal solution for cross-chain cryptocurrency transactions due to their trustless and decentralized nature. However, their adoption in practice has been strictly limited compared to centralized exchange order books because of long transaction times (anywhere from 20 to 60 minutes) prohibiting market makers from accurately pricing atomic swap spreads. For the decentralized finance ecosystem to expand and benefit all users, this would require accommodating market makers and high-frequency traders to reduce spreads and dramatically boost liquidity. This white paper will introduce a protocol for atomic swaps that eliminates the need for an intermediary currency or centralized trusted third party, reducing transaction times between Bitcoin and Ethereum swaps to approximately 15 seconds for a market maker, and could be reduced further with future Layer 2 solutions.

Paperid: 3324, https://arxiv.org/pdf/2503.12045.pdf

Abstract:
This paper introduces a novel theoretical framework for auditing differential privacy (DP) in a black-box setting. Leveraging the concept of $f$-differential privacy, we explicitly define type I and type II errors and propose an auditing mechanism based on conformal inference. Our approach robustly controls the type I error rate under minimal assumptions. Furthermore, we establish a fundamental impossibility result, demonstrating the inherent difficulty of simultaneously controlling both type I and type II errors without additional assumptions. Nevertheless, under a monotone likelihood ratio (MLR) assumption, our auditing mechanism effectively controls both errors. We also extend our method to construct valid confidence bands for the trade-off function in the finite-sample regime.

Paperid: 3325, https://arxiv.org/pdf/2503.11913.pdf

Abstract:
The future of quantum computing architecture is most likely the one in which a large number of clients are either fully classical or have a very limited quantum capability while a very small number of servers having the capability to perform quantum computations and most quantum computational tasks are delegated to these quantum servers. In this architecture, it becomes very crucial that a classical/semi-classical client is able to keep the delegated data/ computation secure against eavesdroppers as well as the server itself, known as the blindness feature. In 2009, A. Broadbent et. al proposed a universal blind quantum computation (UBQC) protocol based on measurement-based quantum computation (MBQC) that enables a semi-classical client to delegate universal quantum computation to a quantum server, interactively and fetch the results while the computation itself remains blind to the server. In this work, we propose an implementation (with examples) of UBQC in the current quantum computing architecture, a fully classical client, a quantum server (IBM Quantum) and the computation does not proceed interactively (projective measurement basis is not decided by previous measurement outcome). We combined UBQC with the 8-state remote state preparation (RSP) protocol, to blindly prepare the initial cluster state, which is an initial resource state in UBQC protocol, to allow a completely classical client to perform delegated blind quantum computation. Such an implementation has already been shown to be secure in a game-based security setting, which is the weakest security model.

Paperid: 3326, https://arxiv.org/pdf/2503.11634.pdf

Abstract:
Black-box separations are a cornerstone of cryptography, indicating barriers to various goals. A recent line of work has explored black-box separations for quantum cryptographic primitives. Namely, a number of separations are known in the Common Haar Random State (CHRS) model, though this model is not considered a complete separation, but rather a starting point. A few very recent works have attempted to lift these separations to a unitary separation, which are considered complete separations. Unfortunately, we find significant errors in some of these lifting results. We prove general conditions under which CHRS separations can be generically lifted, thereby giving simple, modular, and bug-free proofs of complete unitary separations between various quantum primitives. Our techniques allow for simpler proofs of existing separations as well as new separations that were previously only known in the CHRS model.

Paperid: 3327, https://arxiv.org/pdf/2503.10207.pdf

Abstract:
Kyber, an IND-CCA2-secure lattice-based post-quantum key-encapsulation mechanism, is the winner of the first post-quantum cryptography standardization process of the US National Institute of Standards and Technology. In this work, we provide an efficient implementation of Kyber on ESP32, a very popular microcontroller for Internet of Things applications. We hand-partition the Kyber algorithm to enable utilization of the ESP32 dual-core architecture, which allows us to speed up its execution by 1.21x (keygen), 1.22x (encaps) and 1.20x (decaps). We also explore the possibility of gaining further improvement by utilizing the ESP32 SHA and AES coprocessor and achieve a culminated speed-up of 1.72x (keygen), 1.84x (encaps) and 1.69x (decaps).

Paperid: 3328, https://arxiv.org/pdf/2503.10074.pdf

Abstract:
ISA extensions are increasingly adopted to boost the performance of specialized workloads without requiring an entire architectural redesign. However, these enhancements can inadvertently expose new attack surfaces in the microarchitecture. In this paper, we investigate Intel's recently introduced cldemote extension, which promotes efficient data sharing by transferring cache lines from upper-level caches to the Last Level Cache (LLC). Despite its performance benefits, we uncover critical properties-unprivileged access, inter-cache state transition, and fault suppression-that render cldemote exploitable for microarchitectural attacks. We propose two new attack primitives, Flush+Demote and Demote+Time, built on our analysis. Flush+Demote constructs a covert channel with a bandwidth of 2.84 Mbps and a bit error rate of 0.018%, while Demote+Time derandomizes the kernel base address in 2.49 ms on Linux. Furthermore, we show that leveraging cldemote accelerates eviction set construction in non-inclusive LLC designs by obviating the need for helper threads or extensive cache conflicts, thereby reducing construction time by 36% yet retaining comparable success rates. Finally, we examine how ISA extensions contribute to broader microarchitectural attacks, identifying five key exploitable characteristics and categorizing four distinct attack types. We also discuss potential countermeasures, highlighting the far-reaching security implications of emerging ISA extensions.

Paperid: 3329, https://arxiv.org/pdf/2503.09823.pdf

Abstract:
This paper offers a new privacy approach for the growing ecosystem of services -- ranging from open banking to healthcare -- dependent on sensitive personal data sharing between individuals and third parties. While these services offer significant benefits, individuals want control over their data, transparency regarding how their data is used, and accountability from third parties for misuse. However, existing legal and technical mechanisms are inadequate for supporting these needs. A comprehensive approach to the modern privacy challenges of accountable third-party data sharing requires a closer alignment of technical system architecture and legal institutional design. In order to achieve this privacy alignment, we extend traditional security threat modeling and analysis to encompass a broader range of privacy notions than has been typically considered. In particular, we introduce the concept of covert-accountability, which addresses the risk from adversaries that may act dishonestly but nevertheless face potential identification and legal consequences. As a concrete instance of this design approach, we present the OTrace protocol, designed to provide traceable, accountable, consumer-control in third-party data sharing ecosystems. OTrace empowers consumers with the knowledge of who has their data, what it is being used for, what consent or other legal terms apply, and whom it is being shared with. By applying our alignment framework, we demonstrate that OTrace's technical affordances can provide more confident, scalable regulatory oversight when combined with complementary legal mechanisms.

Paperid: 3330, https://arxiv.org/pdf/2503.09666.pdf

Abstract:
In a globalized and interconnected world, interoperability has become a key concept for advancing tactical scenarios. Federated Coalition Networks (FCN) enable cooperation between entities from multiple nations while allowing each to maintain control over their systems. However, this interoperability necessitates the sharing of increasing amounts of information between different tactical assets, raising the need for higher security measures. Emerging technologies like blockchain drive a revolution in secure communications, paving the way for new tactical scenarios. In this work, we propose a blockchain-based framework to enhance the resilience and security of the management of these networks. We offer a guide to FCN design to help a broad audience understand the military networks in international missions by a use case and key functions applied to a proposed architecture. We evaluate its effectiveness and performance in information encryption to validate this framework.

Paperid: 3331, https://arxiv.org/pdf/2503.09586.pdf

Abstract:
We present Auspex - a threat modeling system built using a specialized collection of generative artificial intelligence-based methods that capture threat modeling tradecraft. This new approach, called tradecraft prompting, centers on encoding the on-the-ground knowledge of threat modelers within the prompts that drive a generative AI-based threat modeling system. Auspex employs tradecraft prompts in two processing stages. The first stage centers on ingesting and processing system architecture information using prompts that encode threat modeling tradecraft knowledge pertaining to system decomposition and description. The second stage centers on chaining the resulting system analysis through a collection of prompts that encode tradecraft knowledge on threat identification, classification, and mitigation. The two-stage process yields a threat matrix for a system that specifies threat scenarios, threat types, information security categorizations and potential mitigations. Auspex produces formalized threat model output in minutes, relative to the weeks or months a manual process takes. More broadly, the focus on bespoke tradecraft prompting, as opposed to fine-tuning or agent-based add-ons, makes Auspex a lightweight, flexible, modular, and extensible foundational system capable of addressing the complexity, resource, and standardization limitations of both existing manual and automated threat modeling processes. In this connection, we establish the baseline value of Auspex to threat modelers through an evaluation procedure based on feedback collected from cybersecurity subject matter experts measuring the quality and utility of threat models generated by Auspex on real banking systems. We conclude with a discussion of system performance and plans for enhancements to Auspex.

Paperid: 3332, https://arxiv.org/pdf/2503.09099.pdf

Abstract:
The advancement of quantum computing technology has led to the emergence of early-stage quantum cloud computing services. To fully realize the potential of quantum cloud computing, it is essential to develop techniques that ensure the privacy of both data and functions. Quantum computations often leverage superposition to evaluate a function on all possible inputs simultaneously, making function privacy a critical requirement. In 2009, Broadbent et al. introduced the Universal Blind Quantum Computation (UBQC) protocol, which is based on Measurement-Based Quantum Computation (MBQC) and provides a framework for ensuring both function and data privacy in quantum computing. Although theoretical results indicate an equivalence between MBQC and circuitbased quantum computation, translating MBQC into circuitbased implementations remains challenging due to higher qubit requirements and the complexity of the transformation process. Consequently, current quantum cloud computing platforms are limited in their ability to simulate MBQC efficiently. This paper presents an efficient method to simulate MBQC on circuit-based quantum computing platforms. We validate this approach by implementing the two-qubit Grover algorithm in the MBQC framework and further demonstrate blindness by applying the UBQC protocol. This work verifies the simulation of a blind quantum computation using the two-qubit Grover algorithm on a circuit-based quantum computing platform.

Paperid: 3333, https://arxiv.org/pdf/2503.08734.pdf

Abstract:
In today's increasingly digital interactions, robust Identity Verification (IDV) is crucial for security and trust. Artificial Intelligence (AI) is transforming IDV, enhancing accuracy and fraud detection. This paper introduces ``Zero to One,'' a holistic conceptual framework for developing AI-powered IDV products. This paper outlines the foundational problem and research objectives that necessitate a new framework for IDV in the age of AI. It details the evolution of identity verification and the current regulatory landscape to contextualize the need for a robust conceptual model. The core of the paper is the presentation of the ``Zero to One'' framework itself, dissecting its four essential components: Document Verification, Biometric Verification, Risk Assessment, and Orchestration. The paper concludes by discussing the implications of this conceptual model and suggesting future research directions focused on the framework's further development and application. The framework addresses security, privacy, UX, and regulatory compliance, offering a structured approach to building effective IDV solutions. Successful IDV platforms require a balanced conceptual understanding of verification methods, risk management, and operational scalability, with AI as a key enabler. This paper presents the ``Zero to One'' framework as a refined conceptual model, detailing verification layers, and AI's transformative role in shaping next-generation IDV products.

Paperid: 3334, https://arxiv.org/pdf/2503.08707.pdf

Abstract:
The maritime industry is governed by stringent environmental regulations, most notably the International Convention for the Prevention of Pollution from Ships (MARPOL). Ensuring compliance with these regulations is difficult due to low inspection rates and the risk of data fabrication. To address these issues, this paper proposes a secure blockchain-assisted framework for real-time maritime environmental compliance monitoring. By integrating IoT and shipboard sensors with blockchain technology, the framework ensures immutable and transparent record-keeping of environmental data. Smart contracts automate compliance verification and notify relevant authorities in case of non-compliance. A proof-of-concept case study on sulfur emissions demonstrates the framework's efficacy in enhancing MARPOL enforcement through real-time data integrity and regulatory adherence. The proposed system leverages the Polygon blockchain for scalability and efficiency, providing a robust solution for maritime environmental protection. The evaluation results demonstrate that the proposed blockchain-enhanced compliance monitoring system effectively and securely ensures real-time regulatory adherence with high scalability, efficiency, and cost-effectiveness, leveraging the robust capabilities of the Polygon blockchain.

Paperid: 3335, https://arxiv.org/pdf/2503.08699.pdf

Abstract:
As artificial intelligence (AI) systems become increasingly complex and autonomous, concerns over transparency and accountability have intensified. The "black box" problem in AI decision-making limits stakeholders' ability to understand, trust, and verify outcomes, particularly in high-stakes sectors such as healthcare, finance, and autonomous systems. Blockchain technology, with its decentralized, immutable, and transparent characteristics, presents a potential solution to enhance AI transparency and auditability. This paper explores the integration of blockchain with AI to improve decision traceability, data provenance, and model accountability. By leveraging blockchain as an immutable record-keeping system, AI decision-making can become more interpretable, fostering trust among users and regulatory compliance. However, challenges such as scalability, integration complexity, and computational overhead must be addressed to fully realize this synergy. This study discusses existing research, proposes a framework for blockchain-enhanced AI transparency, and highlights practical applications, benefits, and limitations. The findings suggest that blockchain could be a foundational technology for ensuring AI systems remain accountable, ethical, and aligned with regulatory standards.

Paperid: 3336, https://arxiv.org/pdf/2503.08632.pdf

Abstract:
This study investigates secret-key generation for device authentication using physical identifiers, such as responses from physical unclonable functions (PUFs). The system includes two legitimate terminals (encoder and decoder) and an eavesdropper (Eve), each with access to different measurements of the identifier. From the device identifier, the encoder generates a secret key, which is securely stored in a private database, along with helper data that is saved in a public database accessible by the decoder for key reconstruction. Eve, who also has access to the public database, may use both her own measurements and the helper data to attempt to estimate the secret key and identifier. Our setup focuses on authentication scenarios where channel statistics are uncertain, with the involved parties employing multiple antennas to enhance signal reception. Our contributions include deriving inner and outer bounds on the optimal trade-off among secret-key, storage, and privacy-leakage rates for general discrete sources, and showing that these bounds are tight for Gaussian sources.

Paperid: 3337, https://arxiv.org/pdf/2503.08568.pdf

Abstract:
In recent years, major privacy laws like the GDPR have brought about positive changes. However, challenges remain in enforcing the laws, particularly due to under-resourced regulators facing a large number of potential privacy-violating software applications (apps) and the high costs of investigating them. Since 2019, China has launched a series of privacy enforcement campaigns known as Special Privacy Rectification Campaigns (SPRCs) to address widespread privacy violations in its mobile application (app) ecosystem. Unlike the enforcement of the GDPR, SPRCs are characterized by large-scale privacy reviews and strict sanctions, under the strong control of central authorities. In SPRCs, central government authorities issue administrative orders to mobilize various resources for market-wide privacy reviews of mobile apps. They enforce strict sanctions by requiring privacy-violating apps to rectify issues within a short timeframe or face removal from app stores. While there are a few reports on SPRCs, the effectiveness and potential problems of this campaign-style privacy enforcement approach remain unclear to the community. In this study, we conducted 18 semi-structured interviews with app-related engineers involved in SPRCs to better understand the campaign-style privacy enforcement. Based on the interviews, we reported our findings on a variety of aspects of SPRCs, such as the processes that app engineers regularly follow to achieve privacy compliance in SPRCs, the challenges they encounter, the solutions they adopt to address these challenges, and the impacts of SPRCs, etc. We found that app engineers face a series of challenges in achieving privacy compliance in their apps...

Paperid: 3338, https://arxiv.org/pdf/2503.08403.pdf

Abstract:
The NP-hardness of the closest vector problem (CVP) is an important basis for quantum-secure cryptography, in much the same way that integer factorisation's conjectured hardness is at the foundation of cryptosystems like RSA. Recent work with heuristic quantum algorithms (arXiv:2212.12372) indicates the possibility to find close approximations to (constrained) CVP instances that could be incorporated within fast sieving approaches for factorisation. This work explores both the practicality and scalability of the proposed heuristic approach to explore the potential for a quantum advantage for approximate CVP, without regard for the subsequent factoring claims. We also extend the proposal to include an antecedent "pre-training" scheme to find and fix a set of parameters that generalise well to increasingly large lattices, which both optimises the scalability of the algorithm, and permits direct numerical analyses. Our results further indicate a noteworthy quantum speed-up for lattice problems obeying a certain `prime' structure, approaching fifth order advantage for QAOA of fixed depth p=10 compared to classical brute-force, motivating renewed discussions about the necessary lattice dimensions for quantum-secure cryptosystems in the near-term.

Paperid: 3339, https://arxiv.org/pdf/2503.07661.pdf

Abstract:
Model merging is a technique that combines multiple finetuned models into a single model without additional training, allowing a free-rider to cheaply inherit specialized capabilities. This study investigates methodologies to suppress unwanted model merging by free-riders. Existing methods such as model watermarking or fingerprinting can only detect merging in hindsight. In contrast, we propose a first proactive defense against model merging. Specifically, our defense method modifies the model parameters so that the model is disrupted if the model is merged with any other model, while its functionality is kept unchanged if not merged with others. Our approach consists of two modules, rearranging MLP parameters and scaling attention heads, which push the model out of the shared basin in parameter space, causing the merging performance with other models to degrade significantly. We conduct extensive experiments on image classification, image generation, and text classification to demonstrate that our defense severely disrupts merging while retaining the functionality of the post-protect model. Moreover, we analyze potential adaptive attacks and further propose a dropout-based pruning to improve our proposal's robustness.

Paperid: 3340, https://arxiv.org/pdf/2503.07570.pdf

Abstract:
Deep learning, when integrated with a large amount of training data, has the potential to outperform machine learning in terms of high accuracy. Recently, privacy-preserving deep learning has drawn significant attention of the research community. Different privacy notions in deep learning include privacy of data provided by data-owners and privacy of parameters and/or hyperparameters of the underlying neural network. Federated learning is a popular privacy-preserving execution environment where data-owners participate in learning the parameters collectively without leaking their respective data to other participants. However, federated learning suffers from certain security/privacy issues. In this paper, we propose Split-n-Chain, a variant of split learning where the layers of the network are split among several distributed nodes. Split-n-Chain achieves several privacy properties: data-owners need not share their training data with other nodes, and no nodes have access to the parameters and hyperparameters of the neural network (except that of the respective layers they hold). Moreover, Split-n-Chain uses blockchain to audit the computation done by different nodes. Our experimental results show that: Split-n-Chain is efficient, in terms of time required to execute different phases, and the training loss trend is similar to that for the same neural network when implemented in a monolithic fashion.

Paperid: 3341, https://arxiv.org/pdf/2503.07200.pdf

Abstract:
In this work we use formal verification to prove that the Lightning Network (LN), the most prominent scaling technique for Bitcoin, always safeguards the funds of honest users. We provide a custom implementation of (a simplification of) LN, express the desired security goals and, for the first time, we provide a machine checkable proof that they are upheld under every scenario, all in an integrated fashion. We build our system using the Why3 platform.

Paperid: 3342, https://arxiv.org/pdf/2503.07109.pdf

Abstract:
With the escalating threat of malware, particularly on mobile devices, the demand for effective analysis methods has never been higher. While existing security solutions, including AI-based approaches, offer promise, their lack of transparency constraints the understanding of detected threats. Manual analysis remains time-consuming and reliant on scarce expertise. To address these challenges, we propose a novel approach called XAIDroid that leverages graph neural networks (GNNs) and graph attention mechanisms for automatically locating malicious code snippets within malware. By representing code as API call graphs, XAIDroid captures semantic context and enhances resilience against obfuscation. Utilizing the Graph Attention Model (GAM) and Graph Attention Network (GAT), we assign importance scores to API nodes, facilitating focused attention on critical information for malicious code localization. Evaluation on synthetic and real-world malware datasets demonstrates the efficacy of our approach, achieving high recall and F1-score rates for malicious code localization. The successful implementation of automatic malicious code localization enhances the scalability, interpretability, and reliability of malware analysis.

Paperid: 3343, https://arxiv.org/pdf/2503.06925.pdf

Abstract:
Recent studies have explored DNA-based algorithms for IoT security and image encryption. A similar encryption algorithm was proposed by Al-Husainy et. al. in 2021 Recently, Al-Husainy et al.in 2021, proposed an encryption algorithm based on DNA processes for Internet of Things(IoT) applications. Upon finding low avalanche effect in our experiments, we first report related-key attack and later propose a key recovery attack on full cipher, which generates the complete key with just two plaintext-ciphertext blocks with time complexity of O(1). Upon discovering these serious weaknesses, we improve the security of encryption algorithm against above reported attacks by employing a bio-inspired SBOX and adding some tweaks in the cipher. A lot of research has been done in developing image encryption algorithm based on DNA and chaotic maps. Inspired from the design of SNOW-3G, a well known cipher, primarily used in mobile communications and considering DNA-based functions as building blocks, we propose a new DNA-based stream cipher-Bio-SNOW. We discuss its security in response to various kinds of attacks and find that it passes all NIST randomness tests. We also find that the speed of Bio-SNOW is around twice that of SNOW-3G. Moreover, by means of histogram and correlation analysis, we find that Bio-SNOW offers robust image encryption. These results highlight Bio-SNOW as a promising DNA-based cipher for lightweight and image cryptography applications.

Paperid: 3344, https://arxiv.org/pdf/2503.06785.pdf

Abstract:
As reliance on space systems continues to increase, so does the need to ensure security for them. However, public work in space standards have struggled with defining security protocols that are well tailored to the domain and its risks. In this work, we investigate various space networking paradigms and security approaches, and identify trade-offs and gaps. Furthermore, we describe potential existing security protocol approaches that fit well into the space network paradigm in terms of both functionality and security. Finally, we establish future directions for enabling strong security for space communication.

Paperid: 3345, https://arxiv.org/pdf/2503.06387.pdf

Abstract:
Large Language Models (LLMs) are of great interest in vulnerability detection and repair. The effectiveness of these models hinges on the quality of the datasets used for both training and evaluation. Our investigation reveals that a number of studies featured in prominent software engineering conferences have employed datasets that are plagued by high duplication rates, questionable label accuracy, and incomplete samples. Using these datasets for experimentation will yield incorrect results that are significantly different from actual expected behavior. For example, the state-of-the-art VulRepair Model, which is reported to have 44% accuracy, on average yielded 9% accuracy when test-set duplicates were removed from its training set and 13% accuracy when training-set duplicates were removed from its test set. In an effort to tackle these data quality concerns, we have retrained models from several papers without duplicates and conducted an accuracy assessment of labels for the top ten most hazardous Common Weakness Enumerations (CWEs). Our findings indicate that 56% of the samples had incorrect labels and 44% comprised incomplete samples--only 31% were both accurate and complete. Finally, we employ transfer learning using a large deduplicated bugfix corpus to show that these models can exhibit better performance if given larger amounts of high-quality pre-training data, leading us to conclude that while previous studies have over-estimated performance due to poor dataset quality, this does not demonstrate that better performance is not possible.

Paperid: 3346, https://arxiv.org/pdf/2503.05477.pdf

Abstract:
The distributed denial-of-service (DDoS) attack stands out as a highly formidable cyber threat, representing an advanced form of the denial-of-service (DoS) attack. A DDoS attack involves multiple computers working together to overwhelm a system, making it unavailable. On the other hand, a DoS attack is a one-on-one attempt to make a system or website inaccessible. Thus, it is crucial to construct an effective model for identifying various DDoS incidents. Although extensive research has focused on binary detection models for DDoS identification, they face challenges to adapt evolving threats, necessitating frequent updates. Whereas multiclass detection models offer a comprehensive defense against diverse DDoS attacks, ensuring adaptability in the ever-changing cyber threat landscape. In this paper, we propose a Hybrid Model to strengthen network security by combining the featureextraction abilities of 1D Convolutional Neural Networks (CNNs) with the classification skills of Random Forest (RF) and Multi-layer Perceptron (MLP) classifiers. Using the CIC-DDoS2019 dataset, we perform multiclass classification of various DDoS attacks and conduct a comparative analysis of evaluation metrics for RF, MLP, and our proposed Hybrid Model. After analyzing the results, we draw meaningful conclusions and confirm the superiority of our Hybrid Model by performing thorough cross-validation. Additionally, we integrate our machine learning model with Snort, which provides a robust and adaptive solution for detecting and mitigating various DDoS attacks.

Paperid: 3347, https://arxiv.org/pdf/2503.05303.pdf

Abstract:
Machine learning (ML) models serve as powerful tools for threat detection and mitigation; however, they also introduce potential new risks. Adversarial input can exploit these models through standard interfaces, thus creating new attack pathways that threaten critical network operations. As ML advancements progress, adversarial strategies become more advanced, and conventional defenses such as adversarial training are costly in computational terms and often fail to provide real-time detection. These methods typically require a balance between robustness and model performance, which presents challenges for applications that demand instant response. To further investigate this vulnerability, we suggest a novel strategy for detecting and mitigating adversarial attacks using eXplainable Artificial Intelligence (XAI). This approach is evaluated in real time within intrusion detection systems (IDS), leading to the development of a zero-touch mitigation strategy. Additionally, we explore various scenarios in the Radio Resource Control (RRC) layer within the Open Radio Access Network (O-RAN) framework, emphasizing the critical need for enhanced mitigation techniques to strengthen IDS defenses against advanced threats and implement a zero-touch mitigation solution. Extensive testing across different scenarios in the RRC layer of the O-RAN infrastructure validates the ability of the framework to detect and counteract integrated RRC-layer attacks when paired with adversarial strategies, emphasizing the essential need for robust defensive mechanisms to strengthen IDS against complex threats.

Paperid: 3348, https://arxiv.org/pdf/2503.04819.pdf

Abstract:
Cyber threat hunting is the practice of proactively searching for latent threats in a network. Engaging in threat hunting can be difficult due to the volume of network traffic, variety of adversary techniques, and constantly evolving vulnerabilities. To aid analysts in identifying techniques which may be co-occurring as part of a campaign, we present the Technique Inference Engine, a tool to infer tactics, techniques, and procedures (TTPs) which may be related to existing observations of adversarial behavior. We compile the largest (to our knowledge) available dataset of cyber threat intelligence (CTI) reports labeled with relevant TTPs. With the knowledge that techniques are chronically under-reported in CTI, we apply several implicit feedback recommender models to the data in order to predict additional techniques which may be part of a given campaign. We evaluate the results in the context of the cyber analyst's use case and apply t-SNE to visualize the model embeddings. We provide our code and a web interface.

Paperid: 3349, https://arxiv.org/pdf/2503.04806.pdf

Abstract:
The development of quantum computing threatens the security of our currently widely deployed cryptographic algorithms. While signicant progress has been made in developing post-quantum cryptography (PQC) standards to protect against future quantum computing threats, the U.S. government's estimated $7.1 billion transition cost for non-National Security Systems alone, coupled with an aggressive 2035 deadline, will require sustained funding, research, and international coordination to successfully upgrade existing cryptographic systems.

Paperid: 3350, https://arxiv.org/pdf/2503.04311.pdf

Abstract:
In this article we uncover flaws and pitfalls of a quantum-based remote memory attestation procedure for Internet-of-Things devices. We also show limitations of quantum memory that suggests the attestation problem for quantum memory is fundamentally different to the attestation problem for classical memory, even when the devices can perform quantum computation. The identified problems are of interest for quantum-based security protocol designers in general, particularly those dealing with corrupt devices. Finally, we make use of the lessons learned to design a quantum-based attestation system for classical memory with improved communication efficiency and security.

Paperid: 3351, https://arxiv.org/pdf/2503.04293.pdf

Abstract:
Software developing small and medium enterprises (SMEs) play a crucial role as suppliers to larger corporations and public administration. It is therefore necessary for them to be able to demonstrate that their products meet certain security criteria, both to gain trust of their customers and to comply to standards that demand such a demonstration. In this study we have investigated ways for SMEs to demonstrate their security when operating in a business-to-business model, conducting semi-structured interviews (N=16) with practitioners from different SMEs in Denmark and validating our findings in a follow-up workshop (N=6). Our findings indicate five distinctive security demonstration approaches, namely: Certifications, Reports, Questionnaires, Interactive Sessions and Social Proof. We discuss the challenges, benefits, and recommendations related to these approaches, concluding that none of them is a one-size-fits all solution and that more research into relative advantages of these approaches and their combinations is needed.

Paperid: 3352, https://arxiv.org/pdf/2503.04274.pdf

Abstract:
Security situational awareness refers to identifying, mitigating, and preventing digital cyber threats by gathering information to understand the current situation. With awareness, the basis for decisions is present, particularly in complex situations. However, while logging can track the successful login into a system, it typically cannot determine if the login was performed by the user assigned to the account. An account takeover, for example, by a successful phishing attack, can be used as an entry into an organization's network. All identities within an organization are managed in an identity management system. Thereby, these systems are an interesting goal for malicious actors. Even within identity management systems, it is difficult to differentiate legitimate from malicious actions. We propose a security situational awareness approach specifically to identity management. We focus on protocol-specifics and identity-related sources in a general concept before providing the example of the protocol OAuth with a proof-of-concept implementation.

Paperid: 3353, https://arxiv.org/pdf/2503.04259.pdf

Abstract:
The European General Data Protection Regulation (GDPR) grants European users the right to access their data processed and stored by organizations. Although the GDPR contains requirements for data processing organizations (e.g., understandable data provided within a month), it leaves much flexibility. In-depth research on how online services handle data subject access request is sparse. Specifically, it is unclear whether online services comply with the individual GDPR requirements, if the privacy policies and the data subject access responses are coherent, and how the responses change over time. To answer these questions, we perform a qualitative structured review of the processes and data exports of significant online services to (1) analyze the data received in 2023 in detail, (2) compare the data exports with the privacy policies, and (3) compare the data exports from November 2018 and November 2023. The study concludes that the quality of data subject access responses varies among the analyzed services, and none fulfills all requirements completely.

Paperid: 3354, https://arxiv.org/pdf/2503.04002.pdf

Abstract:
The pervasive nature of software vulnerabilities has emerged as a primary factor for the surge in cyberattacks. Traditional vulnerability detection methods, including rule-based, signature-based, manual review, static, and dynamic analysis, often exhibit limitations when encountering increasingly complex systems and a fast-evolving attack landscape. Deep learning (DL) methods excel at automatically learning and identifying complex patterns in code, enabling more effective detection of emerging vulnerabilities. This survey analyzes 34 relevant studies from high-impact journals and conferences between 2017 and 2024. This survey introduces the conceptual framework Vulnerability Detection Lifecycle for the first time to systematically analyze and compare various DL-based vulnerability detection methods and unify them into the same analysis perspective. The framework includes six phases: (1) Dataset Construction, (2) Vulnerability Granularity Definition, (3) Code Representation, (4) Model Design, (5) Model Performance Evaluation, and (6) Real-world Project Implementation. For each phase of the framework, we identify and explore key issues through in-depth analysis of existing research while also highlighting challenges that remain inadequately addressed. This survey provides guidelines for future software vulnerability detection, facilitating further implementation of deep learning techniques applications in this field.

Paperid: 3355, https://arxiv.org/pdf/2503.03893.pdf

Abstract:
Database Management System (DBMS) is the key component for data-intensive applications. Recently, researchers propose many tools to comprehensively test DBMS systems for finding various bugs. However, these tools only cover a small subset of diverse syntax elements defined in DBMS-specific SQL dialects, leaving a large number of features unexplored. In this paper, we propose ParserFuzz, a novel fuzzing framework that automatically extracts grammar rules from DBMSs' built-in syntax definition files for SQL query generation. Without any input corpus, ParserFuzz can generate diverse query statements to saturate the grammar features of the tested DBMSs, which grammar features could be missed by previous tools. Additionally, ParserFuzz utilizes code coverage as feedback to guide the query mutation, which combines different DBMS features extracted from the syntax rules to find more function and safety bugs. In our evaluation, ParserFuzz outperforms all state-of-the-art existing DBMS testing tools in terms of bug finding, grammar rule coverage and code coverage. ParserFuzz detects 81 previously unknown bugs in total across 5 popular DBMSs, where all bugs are confirmed and 34 have been fixed.

Paperid: 3356, https://arxiv.org/pdf/2503.03428.pdf

Abstract:
In a world where data is the new currency, wearable health devices offer unprecedented insights into daily life, continuously monitoring vital signs and metrics. However, this convenience raises privacy concerns, as these devices collect sensitive data that can be misused or breached. Traditional measures often fail due to real-time data processing needs and limited device power. Users also lack awareness and control over data sharing and usage. We propose a Privacy-Enhancing Technology (PET) framework for wearable devices, integrating federated learning, lightweight cryptographic methods, and selectively deployed blockchain technology. The blockchain acts as a secure ledger triggered only upon data transfer requests, granting users real-time notifications and control. By dismantling data monopolies, this approach returns data sovereignty to individuals. Through real-world applications like secure medical data sharing, privacy-preserving fitness tracking, and continuous health monitoring, our framework reduces privacy risks by up to 70 percent while preserving data utility and performance. This innovation sets a new benchmark for wearable privacy and can scale to broader IoT ecosystems, including smart homes and industry. As data continues to shape our digital landscape, our research underscores the critical need to maintain privacy and user control at the forefront of technological progress.

Paperid: 3357, https://arxiv.org/pdf/2503.02968.pdf

Abstract:
Sharing of tabular data containing valuable but private information is limited due to legal and ethical issues. Synthetic data could be an alternative solution to this sharing problem, as it is artificially generated by machine learning algorithms and tries to capture the underlying data distribution. However, machine learning models are not free from memorization and may introduce biases, as they rely on training data. Producing synthetic data that preserves privacy and fairness while maintaining utility close to the real data is a challenging task. This research simultaneously addresses both the privacy and fairness aspects of synthetic data, an area not explored by other studies. In this work, we present PF-WGAN, a privacy-preserving, fair synthetic tabular data generator based on the WGAN-GP model. We have modified the original WGAN-GP by adding privacy and fairness constraints forcing it to produce privacy-preserving fair data. This approach will enable the publication of datasets that protect individual's privacy and remain unbiased toward any particular group. We compared the results with three state-of-the-art synthetic data generator models in terms of utility, privacy, and fairness across four different datasets. We found that the proposed model exhibits a more balanced trade-off among utility, privacy, and fairness.

Paperid: 3358, https://arxiv.org/pdf/2503.02706.pdf

Abstract:
Nowadays, cyber threats are considered among the most dangerous risks by top management of enterprises. One way to deal with these risks is to insure them, but cyber insurance is still quite expensive. The insurance fee can be reduced if organisations improve their cyber security protection, i.e., reducing the insured risk. In other words, organisations need an investment strategy to decide the optimal amount of investments into cyber insurance and self-protection. In this work, we propose an approach to help a risk-averse organisation to distribute its cyber security investments in a cost-efficient way. What makes our approach unique is that next to defining the amount of investments in cyber insurance and self-protection, our proposal also explicitly defines how these investments should be spent by selecting the most cost-efficient security controls. Moreover, we provide an exact algorithm for the control selection problem considering several threats at the same time and compare this algorithm with other approximate algorithmic solutions.

Paperid: 3359, https://arxiv.org/pdf/2503.02648.pdf

Abstract:
We propose the first continuous-variable (CV) unclonable encryption scheme, extending the paradigm of quantum encryption of classical messages (QECM) to CV systems. In our construction, a classical message is first encrypted classically and then encoded using an error-correcting code. Each bit of the codeword is mapped to a CV mode by creating a coherent state which is squeezed in the $q$ or $p$ quadrature direction, with a small displacement that encodes the bit. The squeezing directions are part of the encryption key. We prove unclonability in the framework introduced by Broadbent and Lord, via a reduction of the cloning game to a CV monogamy-of-entanglement game.

Paperid: 3360, https://arxiv.org/pdf/2503.02559.pdf

Abstract:
Fully homomorphic encryption (FHE) is a technique that enables statistical processing and machine learning while protecting data, including sensitive information collected by single board computers (SBCs), on a cloud server. Among FHE schemes, the TFHE scheme is capable of homomorphic NAND operations and, unlike other FHE schemes, can perform various operations such as minimum, maximum, and comparison. However, TFHE requires Torus Learning With Error (TLWE) encryption, which encrypts one bit at a time, leading to less efficient encryption and larger ciphertext size compared to other schemes. Additionally, SBCs have a limited number of hardware accelerators compared to servers, making it challenging to achieve the same level of optimization as on servers. In this study, we propose a novel SBC-specific design, \textsf{TFHE-SBC}, to accelerate client-side TFHE operations and enhance communication and energy efficiency. Experimental results demonstrate that \textsf{TFHE-SBC} encryption is up to 2486 times faster, improves communication efficiency by 512 times, and achieves 12 to 2004 times greater energy efficiency than the state-of-the-art.

Paperid: 3361, https://arxiv.org/pdf/2503.02144.pdf

Abstract:
This study investigates the performance of various classification models for a malware classification task using different feature sets and data configurations. Six models-Logistic Regression, K-Nearest Neighbors (KNN), Support Vector Machines (SVM), Decision Trees, Random Forest (RF), and Extreme Gradient Boosting (XGB)-were evaluated alongside two deep learning models, Recurrent Neural Networks (RNN) and Transformers, as well as the Gemini zero-shot and few-shot learning methods. Four feature sets were tested including All Features, Literature Review Features, the Top 45 Features from RF, and Down-Sampled with Top 45 Features. XGB achieved the highest accuracy of 87.42% using the Top 45 Features, outperforming all other models. RF followed closely with 87.23% accuracy on the same feature set. In contrast, deep learning models underperformed, with RNN achieving 66.71% accuracy and Transformers reaching 71.59%. Down-sampling reduced performance across all models, with XGB dropping to 81.31%. Gemini zero-shot and few-shot learning approaches showed the lowest performance, with accuracies of 40.65% and 48.65%, respectively. The results highlight the importance of feature selection in improving model performance while reducing computational complexity. Traditional models like XGB and RF demonstrated superior performance, while deep learning and few-shot methods struggled to match their accuracy. This study underscores the effectiveness of traditional machine learning models for structured datasets and provides a foundation for future research into hybrid approaches and larger datasets.

Paperid: 3362, https://arxiv.org/pdf/2503.02141.pdf

Abstract:
This study uses various models to address network traffic classification, categorizing traffic into web, browsing, IPSec, backup, and email. We collected a comprehensive dataset from Arbor Edge Defender (AED) devices, comprising of 30,959 observations and 19 features. Multiple models were evaluated, including Naive Bayes, Decision Tree, Random Forest, Gradient Boosting, XGBoost, Deep Neural Networks (DNN), Transformer, and two Large Language Models (LLMs) including GPT-4o and Gemini with zero- and few-shot learning. Transformer and XGBoost showed the best performance, achieving the highest accuracy of 98.95 and 97.56%, respectively. GPT-4o and Gemini showed promising results with few-shot learning, improving accuracy significantly from initial zero-shot performance. While Gemini Few-Shot and GPT-4o Few-Shot performed well in categories like Web and Email, misclassifications occurred in more complex categories like IPSec and Backup. The study highlights the importance of model selection, fine-tuning, and the balance between training data size and model complexity for achieving reliable classification results.

Paperid: 3363, https://arxiv.org/pdf/2503.02097.pdf

Abstract:
Inaccuracies in conventional dependency-tracking methods frequently undermine the security and integrity of modern software supply chains. This paper introduces a kernel-level framework leveraging extended Berkeley Packet Filter (eBPF) to capture software build dependencies transparently in real time. Our approach provides tamper-evident, intrinsic identifiers of build-time dependencies by computing cryptographic hashes of files accessed during compilation and constructing Merkle trees based on the observed file content. In contrast to traditional static analysis, this kernel-level methodology accounts for conditional compilation, dead-code, selective library usage, and dynamic dependencies, yielding more precise Software Bills of Materials (SBOMs) and Artifact Dependency Graphs (ADGs). We illustrate how existing SBOMs may omit dynamically loaded or ephemeral dependencies and discuss how kernel-level tracing can mitigate these omissions. The proposed system enhances trustworthiness in software artifacts by offering independently verifiable, kernel-level evidence of build provenance, thereby reducing supply chain risks and facilitating more accurate vulnerability management.

Paperid: 3364, https://arxiv.org/pdf/2503.01915.pdf

Abstract:
The development of Large Language Models (LLMs) has led to significant advancements in natural language processing and enabled numerous applications across various industries. However, many LLM-based solutions operate as open systems relying on cloud services, which pose risks to data confidentiality and security. To address these challenges, organizations require closed LLM systems that comply with data protection regulations while maintaining high performance. In this paper, we present a reference architecture for developing closed, LLM-based systems using open-source technologies. The architecture provides a flexible and transparent solution that meets strict data privacy and security requirements. We analyze the key challenges in implementing such systems, including computing resources, data management, scalability, and security risks. Additionally, we introduce an evaluation pipeline that enables a systematic assessment of system performance and compliance.

Paperid: 3365, https://arxiv.org/pdf/2503.01470.pdf

Abstract:
The external evaluation of AI systems is increasingly recognised as a crucial approach for understanding their potential risks. However, facilitating external evaluation in practice faces significant challenges in balancing evaluators' need for system access with AI developers' privacy and security concerns. Additionally, evaluators have reason to protect their own privacy - for example, in order to maintain the integrity of held-out test sets. We refer to the challenge of ensuring both developers' and evaluators' privacy as one of providing mutual privacy. In this position paper, we argue that (i) addressing this mutual privacy challenge is essential for effective external evaluation of AI systems, and (ii) current methods for facilitating external evaluation inadequately address this challenge, particularly when it comes to preserving evaluators' privacy. In making these arguments, we formalise the mutual privacy problem; examine the privacy and access requirements of both model owners and evaluators; and explore potential solutions to this challenge, including through the application of cryptographic and hardware-based approaches.

Paperid: 3366, https://arxiv.org/pdf/2503.01396.pdf

Abstract:
Copious mobile operating systems exist in the market, but Android remains the user's choice. Meanwhile, its growing popularity has also attracted malware developers. Researchers have proposed various static solutions for Android malware detection. However, stealthier malware evade static analysis. This raises the need for a robust Android malware detection system capable of dealing with advanced threats and overcoming the shortcomings of static analysis. Hence, this work proposes a dynamic analysis-based Android malware detection system, CorrNetDroid, that works over network traffic flows. Many traffic features exhibit overlapping ranges in normal and malware datasets. Therefore, we first rank the features using two statistical measures, crRelevance and Normalized Mean Residue Similarity (NMRS), to assess feature-class and feature-feature correlations. Thereafter, we introduce a novel correlation-based feature selection algorithm that applies NMRS on crRelevance rankings to identify the optimal feature subset for Android malware detection. Experimental results highlight that our model effectively reduces the feature set while detecting Android malware with 99.50 percent accuracy when considering only two network traffic features. Furthermore, our experiments demonstrate that the NMRS-based algorithm on crRelevance rankings outperforms statistical tests such as chi-square, ANOVA, Mann-Whitney U test, and Kruskal-Wallis test. In addition, our model surpasses various state-of-the-art Android malware detection techniques in terms of detection accuracy.

Paperid: 3367, https://arxiv.org/pdf/2503.01111.pdf

Abstract:
This study delves into the tokenization of real-world assets (RWAs) on the blockchain with the objective of augmenting liquidity and refining asset management practices. By conducting an exhaustive analysis of the technical procedures implicated and scrutinizing case studies of existing deployments, this research evaluates the advantages, hurdles, and prospective advancements of blockchain technology in reshaping conventional asset management paradigms.

Paperid: 3368, https://arxiv.org/pdf/2503.01089.pdf

Abstract:
Machine learning (ML) in Internet of Vehicles (IoV) applications enhanced intelligent transportation, autonomous driving capabilities, and various connected services within a large, heterogeneous network. However, the increased connectivity and massive data exchange for ML applications introduce significant privacy challenges. Privacy-preserving machine learning (PPML) offers potential solutions to address these challenges by preserving privacy at various stages of the ML pipeline. Despite the rapid development of ML-based IoV applications and the growing data privacy concerns, there are limited comprehensive studies on the adoption of PPML within this domain. Therefore, this study provides a comprehensive review of the fundamentals, recent advancements, and the challenges of integrating PPML into IoV applications. We first review existing surveys of various PPML techniques and their integration into IoV across different scopes. We then categorize IoV applications into three key domains and analyze the privacy challenges in leveraging ML in these application domains. Building on these fundamentals, we review recent advancements in integrating various PPML techniques within IoV applications, discussing their frameworks, key features, and performance in terms of privacy, utility, and efficiency. Finally, we identify current challenges and propose future research directions to enhance privacy and reliability in IoV applications.

Paperid: 3369, https://arxiv.org/pdf/2503.00950.pdf

Abstract:
An efficient integer factorization algorithm would reduce the security of all variants of the RSA cryptographic scheme to zero. Despite the passage of years, no method for efficiently factoring large semiprime numbers in a classical computational model has been discovered. In this paper, we demonstrate how a natural extension of the generalized approach to smoothness, combined with the separation of $2$-adic point orders, leads us to propose a factoring algorithm that finds (conjecturally) the prime decomposition $N = pq$ in subexponential time $L(\sqrt 2+o(1), \min(p,q))$. This approach motivated by the papers \cite{Len}, \cite{MMV} and \cite{PoZo} is based on a more careful investigation of pairs $(E,Q)$, where $Q$ is a point on an elliptic curve $E$ over $\Z _N$. Specifically, in contrast to the familiar condition that the largest prime divisor $P^+(\ord Q_p)$ of the reduced order $\ord Q_p$ does not divide $\#E(\F_q)$ we focus on the relation between $P^+(\ord Q_r)$ and the smallest prime number $l_{\min}(E,Q)$ separating the orders $\ord Q_p$ and $\ord Q_q$. We focus on the ${\calE}_2$ family of even order elliptic curves over $\Z_N$ since then the condition $l_{\min}(E,Q)\le 2$ holds true for large fraction of points $(x,y)\in E(\Z_N)$. Moreover if we know the pair $(E,Q)$ such that $P^+(\ord Q_r)\le t

Paperid: 3370, https://arxiv.org/pdf/2503.00932.pdf

Abstract:
Deep neural networks (DNNs) are highly susceptible to adversarial examples--subtle perturbations applied to inputs that are often imperceptible to humans yet lead to incorrect model predictions. In black-box scenarios, however, existing adversarial examples exhibit limited transferability and struggle to effectively compromise multiple unseen DNN models. Previous strategies enhance the cross-model generalization of adversarial examples by introducing versatility into adversarial perturbations, thereby improving transferability. However, further refining perturbation versatility often demands intricate algorithm development and substantial computation consumption. In this work, we propose an input transpose method that requires almost no additional labor and computation costs but can significantly improve the transferability of existing adversarial strategies. Even without adding adversarial perturbations, our method demonstrates considerable effectiveness in cross-model attacks. Our exploration finds that on specific datasets, a mere $1^\circ$ left or right rotation might be sufficient for most adversarial examples to deceive unseen models. Our further analysis suggests that this transferability improvement triggered by rotating only $1^\circ$ may stem from visible pattern shifts in the DNN's low-level feature maps. Moreover, this transferability exhibits optimal angles that, when identified under unrestricted query conditions, could potentially yield even greater performance.

Paperid: 3371, https://arxiv.org/pdf/2503.00742.pdf

Abstract:
Integrating blockchain technology into healthcare systems presents a transformative approach to documenting, storing, and accessing electronic health records (EHRs). This research introduces a novel blockchain-based EHR system designed to significantly enhance security, scalability, and accessibility compared to existing solutions. Current systems primarily utilize SHA-256 for security and either IPFS or centralized storage, which, while effective, have limitations in providing comprehensive data integrity and security. The proposed system leverages a hybrid security algorithm combining Argon2 and AES and integrates a hybrid storage and consensus mechanism utilizing IPFS and PBFT. This multifaceted approach ensures robust encryption, efficient consensus, and high fault tolerance. Furthermore, the system incorporates Multi-Factor Authentication (MFA) to safeguard against unauthorized access. It utilizes advanced blockchain tools like MetaMask, Ganache, and Truffle to facilitate seamless interaction with the decentralized network. Simulation results demonstrate that this system offers superior protection against data breaches and enhances operational efficiency. Specifically, the proposed hybrid model substantially improves data integrity, consensus efficiency, fault tolerance, data availability, latency, bandwidth utilization, throughput, memory usage, and CPU usage across various healthcare applications. To validate the performance and security of the proposed system, comprehensive analyses were conducted using real-world healthcare scenarios. The findings highlight the significant advantages of the blockchain-based EHR system, emphasizing its potential to revolutionize healthcare data management by ensuring secure, reliable, and efficient handling of sensitive medical information.

Paperid: 3372, https://arxiv.org/pdf/2503.00638.pdf

Abstract:
Counterfeiting poses a significant challenge across multiple industries, leading to financial losses and health risks. While DNA-based molecular tagging has emerged as a promising anti-counterfeiting strategy, existing methods rely on predefined DNA sequences, making them vulnerable to replication as sequencing and synthesis technologies advance. To address these limitations, we introduce POSERS (Position-Oriented Scattering of Elements among a Randomized Sequence), a steganographic tagging system embedded within DNA sequences. POSERS ensures copy- and forgery-proof authentication by adding restrictions within randomized DNA libraries, enhancing security against counterfeiting attempts. The POSERS design allows the complexity of the libraries to be adjusted based on the customer's needs while ensuring they withstand the ongoing improvements in DNA synthesis and sequencing technologies. We mathematically validate its security properties and experimentally demonstrate its effectiveness using Next-Generation Sequencing and an authentication test, successfully distinguishing genuine POSERS tags from counterfeit ones. Our results highlight the potential of POSERS as a long-term, adaptable solution for secure product authentication.

Paperid: 3373, https://arxiv.org/pdf/2503.00358.pdf

Abstract:
The modern power grids are integrated with digital technologies and automation systems. The inclusion of digital technologies has made the smart grids vulnerable to cyber-attacks. Cyberattacks on smart grids can compromise data integrity and jeopardize the reliability of the power supply. Traditional intrusion detection systems often need help to effectively detect novel and sophisticated attacks due to their reliance on labeled training data, which may only encompass part of the spectrum of potential threats. This work proposes a semi-supervised method for cyber-attack detection in smart grids by leveraging the labeled and unlabeled measurement data. We implement consistency regularization and pseudo-labeling to identify deviations from expected behavior and predict the attack classes. We use a curriculum learning approach to improve pseudo-labeling performance, capturing the model uncertainty. We demonstrate the efficiency of the proposed method in detecting different types of cyberattacks, minimizing the false positives by implementing them on publicly available datasets. The method proposes a promising solution by improving the detection accuracy to 99% in the presence of unknown samples and significantly reducing false positives.

Paperid: 3374, https://arxiv.org/pdf/2503.00140.pdf

Abstract:
Machine learning systems deployed in distributed or federated environments are highly susceptible to adversarial manipulations, particularly availability attacks -adding imperceptible perturbations to training data, thereby rendering the trained model unavailable. Prior research in distributed machine learning has demonstrated such adversarial effects through the injection of gradients or data poisoning. In this study, we aim to enhance comprehension of the potential of weaker (and more probable) adversaries by posing the following inquiry: Can availability attacks be inflicted solely through the flipping of a subset of training labels, without altering features, and under a strict flipping budget? We analyze the extent of damage caused by constrained label flipping attacks. Focusing on a distributed classification problem, (1) we propose a novel formalization of label flipping attacks on logistic regression models and derive a greedy algorithm that is provably optimal at each training step. (2) To demonstrate that availability attacks can be approached by label flipping alone, we show that a budget of only $0.1\%$ of labels at each training step can reduce the accuracy of the model by $6\%$, and that some models can perform worse than random guessing when up to $25\%$ of labels are flipped. (3) We shed light on an interesting interplay between what the attacker gains from more write-access versus what they gain from more flipping budget. (4) we define and compare the power of targeted label flipping attack to that of an untargeted label flipping attack.

Paperid: 3375, https://arxiv.org/pdf/2502.20952.pdf

Abstract:
With the widespread application of Large Language Models across various domains, their security issues have increasingly garnered significant attention from both academic and industrial communities. This study conducts sampling and normalization of the parameters of the LLM to generate visual representations and heatmaps of parameter distributions, revealing notable discrepancies in parameter distributions among certain layers within the hidden layers. Further analysis involves calculating statistical metrics for each layer, followed by the computation of a Comprehensive Sensitivity Score based on these metrics, which identifies the lower layers as being particularly sensitive to the generation of harmful content. Based on this finding, we employ a Freeze training strategy, selectively performing Supervised Fine-Tuning only on the lower layers. Experimental results demonstrate that this method significantly reduces training duration and GPU memory consumption while maintaining a high jailbreak success rate and a high harm score, outperforming the results achieved by applying the LoRA method for SFT across all layers. Additionally, the method has been successfully extended to other open-source large models, validating its generality and effectiveness across different model architectures. Furthermore, we compare our method with ohter jailbreak method, demonstrating the superior performance of our approach. By innovatively proposing a method to statistically analyze and compare large model parameters layer by layer, this study provides new insights into the interpretability of large models. These discoveries emphasize the necessity of continuous research and the implementation of adaptive security measures in the rapidly evolving field of LLMs to prevent potential jailbreak attack risks, thereby promoting the development of more robust and secure LLMs.

Paperid: 3376, https://arxiv.org/pdf/2502.20902.pdf

Abstract:
Source location privacy (SLP) has been of great concern in WSNs when deployed for habitat monitoring applications. The issue is taken care of by employing privacy-preserving routing schemes. In the existing works, the attacker is assumed to be passive in nature and backtracks to the source of information by eavesdropping the message signals. In this work, we try to understand the impact of active attacks by proposing a new hybrid attack model consisting of both active and passive attacks. The proposed model is then applied to three existing TTL-based random walk SLP solutions: phantom routing scheme (PRS), source location privacy using randomized routes (SLP-R), and position-independent section-based scheme (PSSLP). The performance of the algorithms in terms of privacy metrics is compared in the case of pure passive attack and hybrid attack of varying intensity. The results indicate a significant degradation in the privacy protection performance of the reference algorithms in the face of the proposed hybrid attack model indicating the importance and relevance of such attacks. It is further observed that the hybrid attack can be optimized to increase the vulnerability of the existing solutions.

Paperid: 3377, https://arxiv.org/pdf/2502.20359.pdf

Abstract:
Traditional authentication methods, such as passwords and biometrics, verify a user's identity only at the start of a session, leaving systems vulnerable to session hijacking. Continuous authentication, however, ensures ongoing verification by monitoring user behavior. This study investigates the long-term feasibility of eye-tracking as a behavioral biometric for continuous authentication in virtual reality (VR) environments, using data from the GazebaseVR dataset. Our approach evaluates three architectures, Transformer Encoder, DenseNet, and XGBoost, on short and long-term data to determine their efficacy in user identification tasks. Initial results indicate that both Transformer Encoder and DenseNet models achieve high accuracy rates of up to 97% in short-term settings, effectively capturing unique gaze patterns. However, when tested on data collected 26 months later, model accuracy declined significantly, with rates as low as 1.78% for some tasks. To address this, we propose periodic model updates incorporating recent data, restoring accuracy to over 95%. These findings highlight the adaptability required for gaze-based continuous authentication systems and underscore the need for model retraining to manage evolving user behavior. Our study provides insights into the efficacy and limitations of eye-tracking as a biometric for VR authentication, paving the way for adaptive, secure VR user experiences.

Paperid: 3378, https://arxiv.org/pdf/2502.18631.pdf

Abstract:
Modern large language models (LLMs) exhibit critical vulnerabilities to poison pill attacks: localized data poisoning that alters specific factual knowledge while preserving overall model utility. We systematically demonstrate these attacks exploit inherent architectural properties of LLMs, achieving 54.6% increased retrieval inaccuracy on long-tail knowledge versus dominant topics and up to 25.5% increase retrieval inaccuracy on compressed models versus original architectures. Through controlled mutations (e.g., temporal/spatial/entity alterations) and, our method induces localized memorization deterioration with negligible impact on models' performance on regular standard benchmarks (e.g., <2% performance drop on MMLU/GPQA), leading to potential detection evasion. Our findings suggest: (1) Disproportionate vulnerability in long-tail knowledge may result from reduced parameter redundancy; (2) Model compression may increase attack surfaces, with pruned/distilled models requiring 30% fewer poison samples for equivalent damage; (3) Associative memory enables both spread of collateral damage to related concepts and amplification of damage from simultaneous attack, particularly for dominant topics. These findings raise concerns over current scaling paradigms since attack costs are lowering while defense complexity is rising. Our work establishes poison pills as both a security threat and diagnostic tool, revealing critical security-efficiency trade-offs in language model compression that challenges prevailing safety assumptions.

Paperid: 3380, https://arxiv.org/pdf/2502.18288.pdf

Abstract:
With the widespread adoption of Zero-Knowledge Proof systems, particularly ZK-SNARK, the efficiency of proof generation, encompassing both the witness generation and proof computation phases, has become a significant concern. While substantial efforts have successfully accelerated proof computation, progress in optimizing witness generation remains limited, which inevitably hampers overall efficiency. In this paper, we propose Yoimiya, a scalable framework with pipeline, to optimize the efficiency in ZK-SNARK systems. First, Yoimiya introduces an automatic circuit partitioning algorithm that divides large circuits of ZK-SNARK into smaller subcircuits, the minimal computing units with smaller memory requirement, allowing parallel processing on multiple units. Second, Yoimiya decouples witness generation from proof computation, and achieves simultaneous executions over units from multiple circuits. Moreover, Yoimiya enables each phase scalable separately by configuring the resource distribution to make the time costs of the two phases aligned, maximizing the resource utilization. Experimental results confirmed that our framework effectively improves the resource utilization and proof generation speed.

Paperid: 3381, https://arxiv.org/pdf/2502.17693.pdf

Abstract:
Detecting phishing, spam, fake accounts, data scraping, and other malicious activity in online social networks (OSNs) is a problem that has been studied for well over a decade, with a number of important results. Nearly all existing works on abuse detection have as their goal producing the best possible binary classifier; i.e., one that labels unseen examples as "benign" or "malicious" with high precision and recall. However, no prior published work considers what comes next: what does the service actually do after it detects abuse? In this paper, we argue that detection as described in previous work is not the goal of those who are fighting OSN abuse. Rather, we believe the goal to be selecting actions (e.g., ban the user, block the request, show a CAPTCHA, or "collect more evidence") that optimize a tradeoff between harm caused by abuse and impact on benign users. With this framing, we see that enlarging the set of possible actions allows us to move the Pareto frontier in a way that is unattainable by simply tuning the threshold of a binary classifier. To demonstrate the potential of our approach, we present Predictive Response Optimization (PRO), a system based on reinforcement learning that utilizes available contextual information to predict future abuse and user-experience metrics conditioned on each possible action, and select actions that optimize a multi-dimensional tradeoff between abuse/harm and impact on user experience. We deployed versions of PRO targeted at stopping automated activity on Instagram and Facebook. In both cases our experiments showed that PRO outperforms a baseline classification system, reducing abuse volume by 59% and 4.5% (respectively) with no negative impact to users. We also present several case studies that demonstrate how PRO can quickly and automatically adapt to changes in business constraints, system behavior, and/or adversarial tactics.

Paperid: 3382, https://arxiv.org/pdf/2502.17526.pdf

Abstract:
In Federated Learning (FL), several clients jointly learn a machine learning model: each client maintains a local model for its local learning dataset, while a master server maintains a global model by aggregating the local models of the client devices. However, the repetitive communication between server and clients leaves room for attacks aimed at compromising the integrity of the global model, causing errors in its targeted predictions. In response to such threats on FL, various defense measures have been proposed in the literature. In this paper, we present a powerful defense against malicious clients in FL, called FedSV, using the Shapley Value (SV), which has been proposed recently to measure user contribution in FL by computing the marginal increase of average accuracy of the model due to the addition of local data of a user. Our approach makes the identification of malicious clients more robust, since during the learning phase, it estimates the contribution of each client according to the different groups to which the target client belongs. FedSV's effectiveness is demonstrated by extensive experiments on MNIST datasets in a cross-silo context under various attacks.

Paperid: 3383, https://arxiv.org/pdf/2502.16605.pdf

Abstract:
Quantum compilers play a crucial role in quantum computing by converting these algorithmic quantum circuits into forms compatible with specific quantum computer hardware. However, untrusted quantum compilers present considerable risks, including the potential theft of quantum circuit intellectual property (IP) and compromise of the functionality (e.g. Trojan insertion). Quantum circuit obfuscation techniques protect quantum IP by transforming a quantum circuit into a key-dependent version before compilation and restoring the compiled circuit's functionality with the correct key. This prevents the untrusted compiler from knowing the circuit's original functionality. Existing quantum circuit obfuscation techniques focus on inserting key qubits to control key gates. One added key gate can represent at most one Boolean key bit. In this paper, we propose OPAQUE, a phase-based quantum circuit obfuscation approach where we use the angle of rotation gates as the secret keys. The rotation angle is a continuous value, which makes it possible to represent multiple key bits. Moreover, phase gates are usually implemented as virtual gates in quantum hardware, diminishing their cost and impact on accuracy.

Paperid: 3384, https://arxiv.org/pdf/2502.16545.pdf

Abstract:
Current federated backdoor attacks focus on collaboratively training backdoor triggers, where multiple compromised clients train their local trigger patches and then merge them into a global trigger during the inference phase. However, these methods require careful design of the shape and position of trigger patches and lack the feature interactions between trigger patches during training, resulting in poor backdoor attack success rates. Moreover, the pixels of the patches remain untruncated, thereby making abrupt areas in backdoor examples easily detectable by the detection algorithm. To this end, we propose a novel benchmark for the federated backdoor attack based on feature aggregation. Specifically, we align the dimensions of triggers with images, delimit the trigger's pixel boundaries, and facilitate feature interaction among local triggers trained by each compromised client. Furthermore, leveraging the intra-class attack strategy, we propose the simultaneous generation of backdoor triggers for all target classes, significantly reducing the overall production time for triggers across all target classes and increasing the risk of the federated model being attacked. Experiments demonstrate that our method can not only bypass the detection of defense methods while patch-based methods fail, but also achieve a zero-shot backdoor attack with a success rate of 77.39%. To the best of our knowledge, our work is the first to implement such a zero-shot attack in federated learning. Finally, we evaluate attack performance by varying the trigger's training factors, including poison location, ratio, pixel bound, and trigger training duration (local epochs and communication rounds).

Paperid: 3385, https://arxiv.org/pdf/2502.16352.pdf

Abstract:
We consider the multi-party classification problem introduced by Dong, Hartline, and Vijayaraghavan (2022) motivated by electronic discovery. In this problem, our goal is to design a protocol that guarantees the requesting party receives nearly all responsive documents while minimizing the disclosure of nonresponsive documents. We develop verification protocols that certify the correctness of a classifier by disclosing a few nonresponsive documents. We introduce a combinatorial notion called the Leave-One-Out dimension of a family of classifiers and show that the number of nonresponsive documents disclosed by our protocol is at most this dimension in the realizable setting, where a perfect classifier exists in this family. For linear classifiers with a margin, we characterize the trade-off between the margin and the number of nonresponsive documents that must be disclosed for verification. Specifically, we establish a trichotomy in this requirement: for $d$ dimensional instances, when the margin exceeds $1/3$, verification can be achieved by revealing only $O(1)$ nonresponsive documents; when the margin is exactly $1/3$, in the worst case, at least $Î©(d)$ nonresponsive documents must be disclosed; when the margin is smaller than $1/3$, verification requires $Î©(e^d)$ nonresponsive documents. We believe this result is of independent interest with applications to coding theory and combinatorial geometry. We further extend our protocols to the nonrealizable setting defining an analogous combinatorial quantity robust Leave-One-Out dimension, and to scenarios where the protocol is tolerant to misclassification errors by Alice.

Paperid: 3386, https://arxiv.org/pdf/2502.16177.pdf

Abstract:
Keystroke dynamics is a behavioral biometric that captures an individual's typing patterns for authentication and security applications. This paper presents a comparative analysis of keystroke authentication models using Gaussian Mixture Models (GMM), Mahalanobis Distance-based Classification, and Gunetti Picardi's Distance Metrics. These models leverage keystroke timing features such as hold time (H), up-down time (UD), and down-down time (DD) extracted from datasets including Aalto, Buffalo and Nanglae-Bhattarakosol. Each model is trained and validated using structured methodologies, with performance evaluated through False Acceptance Rate (FAR), False Rejection Rate (FRR), and Equal Error Rate (EER). The results, visualized through Receiver Operating Characteristic (ROC) curves, highlight the relative strengths and weaknesses of each approach in distinguishing genuine users from impostors.

Paperid: 3387, https://arxiv.org/pdf/2502.16176.pdf

Abstract:
Every commercially available, state-of-the-art neural network consume plain input data, which is a well-known privacy concern. We propose a new architecture based on homomorphic encryption, which allows the neural network to operate on encrypted data. We show that Homomorphic Neural Networks (HNN) can achieve full privacy and security while maintaining levels of accuracy comparable to plain neural networks. We also introduce a new layer, the Differentiable Soft-Argmax, which allows the calibration of output logits in the encrypted domain, raising the entropy of the activation parameters, thus improving the security of the model, while keeping the overall noise below the acceptable noise budget. Experiments were conducted using the Stanford Sentiment Treebank (SST-2) corpora on the DistilBERT base uncased finetuned SST-2 English sentiment analysis model, and the results show that the HNN model can achieve up to 82.5% of the accuracy of the plain model while maintaining full privacy and security.

Paperid: 3388, https://arxiv.org/pdf/2502.16127.pdf

Abstract:
This study presents a blockchain-based voting system aimed at enhancing election security, transparency, and integrity. Traditional voting methods face growing risks of tampering, making it crucial to explore innovative solutions. Our proposed system combines blockchain's immutable, decentralized ledger with advanced voter identity verification techniques, including digital identity validation through Aadhaar and Driving Licenses (secured via BLAKE2b-512 hashing), biometric fingerprint authentication, and a picture rotation pattern for added security. Votes are recorded transparently and securely on a blockchain, with a consensus mechanism ensuring data integrity and reducing the risk of unauthorized alterations. Security analysis indicates that this multi-layered approach significantly reduces impersonation risks, while blockchain ensures accurate, private, and tamper-resistant vote recording. The findings support that a blockchain-based voting system with robust identity checks offers a trustworthy alternative to traditional methods, with potential for even greater refinement in secure and transparent elections.

Paperid: 3389, https://arxiv.org/pdf/2502.15653.pdf

Abstract:
Cellular networking is advancing as a wireless technology to support diverse applications in vehicular communication, enabling vehicles to interact with various applications to enhance the driving experience, even when managed by different authorities. Security Credential Management System (SCMS) is the Public Key Infrastructure (PKI) for vehicular networking and the state-of-the-art distributed PKI to protect the privacy-preserving vehicular networking against an honest-but-curious authority using multiple authorities and to decentralize the trust management. We build a Blockchain-Based Trust Management (BBTM) to provide even greater decentralization and security. Specifically, BBTM uses the blockchain to 1) replace the existing Policy Generator (PG), 2) manage the policy of each authority in SCMS, 3) aggregate the Global Certificate Chain File (GCCF), and 4) provide greater accountability and transparency on the aforementioned functionalities. We implement BBTM on Hyperledger Fabric using a smart contract for experimentation and analyses. Our experiments show that BBTM is lightweight in processing, efficient management in the certificate chain and ledger size, supports a bandwidth of multiple transactions per second, and provides validated end-entities.

Paperid: 3390, https://arxiv.org/pdf/2502.15561.pdf

Abstract:
As cyberattacks become increasingly sophisticated, advanced Network Intrusion Detection Systems (NIDS) are critical for modern network security. Traditional signature-based NIDS are inadequate against zero-day and evolving attacks. In response, machine learning (ML)-based NIDS have emerged as promising solutions; however, they are vulnerable to adversarial evasion attacks that subtly manipulate network traffic to bypass detection. To address this vulnerability, we propose a novel defensive framework that enhances the robustness of ML-based NIDS by simultaneously integrating adversarial training, dataset balancing techniques, advanced feature engineering, ensemble learning, and extensive model fine-tuning. We validate our framework using the NSL-KDD and UNSW-NB15 datasets. Experimental results show, on average, a 35% increase in detection accuracy and a 12.5% reduction in false positives compared to baseline models, particularly under adversarial conditions. The proposed defense against adversarial attacks significantly advances the practical deployment of robust ML-based NIDS in real-world networks.

Paperid: 3391, https://arxiv.org/pdf/2502.15505.pdf

Abstract:
We develop a dynamic model of the Bitcoin market where users set fees themselves and miners decide whether to operate and whom to validate based on those fees. Our analysis reveals how, in equilibrium, users adjust their bids in response to short-term congestion (i.e., the amount of pending transactions), how miners decide when to start operating based on the level of congestion, and how the interplay between these two factors shapes the overall market dynamics. The miners hold off operating when the congestion is mild, which harms social welfare. However, we show that a block reward (a fixed reward paid to miners upon a block production) can mitigate these inefficiencies. We characterize the socially optimal block reward and demonstrate that it is always positive, suggesting that Bitcoin's halving schedule may be suboptimal.

Paperid: 3392, https://arxiv.org/pdf/2502.15017.pdf

Abstract:
Adversarial attacks in deep learning represent a significant threat to the integrity and reliability of machine learning models. Adversarial training has been a popular defence technique against these adversarial attacks. In this work, we capitalize on a network architecture, namely Deep Linearly Gated Networks (DLGN), which has better interpretation capabilities than regular deep network architectures. Using this architecture, we interpret robust models trained using PGD adversarial training and compare them with standard training. Feature networks in DLGN act as feature extractors, making them the only medium through which an adversary can attack the model. We analyze the feature network of DLGN with fully connected layers with respect to properties like alignment of the hyperplanes, hyperplane relation with PCA, and sub-network overlap among classes and compare these properties between robust and standard models. We also consider this architecture having CNN layers wherein we qualitatively (using visualizations) and quantitatively contrast gating patterns between robust and standard models. We uncover insights into hyperplanes resembling principal components in PGD-AT and STD-TR models, with PGD-AT hyperplanes aligned farther from the data points. We use path activity analysis to show that PGD-AT models create diverse, non-overlapping active subnetworks across classes, preventing attack-induced gating overlaps. Our visualization ideas show the nature of representations learnt by PGD-AT and STD-TR models.

Paperid: 3393, https://arxiv.org/pdf/2502.14464.pdf

Abstract:
One of the goals of Federated Learning (FL) is to collaboratively train a global model using local models from remote participants. However, the FL process is susceptible to various security challenges, including interception and tampering models, information leakage through shared gradients, and privacy breaches that expose participant identities or data, particularly in sensitive domains such as medical environments. Furthermore, the advent of quantum computing poses a critical threat to existing cryptographic protocols through the Shor and Grover algorithms, causing security concerns in the communication of FL systems. To address these challenges, we propose a Post-Quantum Blockchain-based protocol for Federated Learning (PQBFL) that utilizes post-quantum cryptographic (PQC) algorithms and blockchain to enhance model security and participant identity privacy in FL systems. It employs a hybrid communication strategy that combines off-chain and on-chain channels to optimize cost efficiency, improve security, and preserve participant privacy while ensuring accountability for reputation-based authentication in FL systems. The PQBFL specifically addresses the security requirement for the iterative nature of FL, which is a less notable point in the literature. Hence, it leverages ratcheting mechanisms to provide forward secrecy and post-compromise security during all the rounds of the learning process. In conclusion, PQBFL provides a secure and resilient solution for federated learning that is well-suited to the quantum computing era.

Paperid: 3394, https://arxiv.org/pdf/2502.14436.pdf

Abstract:
Let $q$ be a prime power and $r$ a positive even integer. Let $\mathbb{F}_{q}$ be the finite field with $q$ elements and $\mathbb{F}_{q^r}$ be its extension field of degree $r$. Let $Ï$ be a nontrivial multiplicative character of $\mathbb{F}_{q^r}$ and $f(X)$ a polynomial over $\mathbb{F}_{q^r}$ with a simple root in $\mathbb{F}_{q^r}$. In this paper, we improve estimates for character sums $\sum\limits_{g \in\mathcal{G}}Ï(f(g))$, where $\mathcal{G}$ is either a subset of $\mathbb{F}_{q^r}$ of sparse elements, with respect to some fixed basis of $\mathbb{F}_{q^r}$ which contains a basis of $\mathbb{F}_{q^{r/2}}$, or a subset avoiding affine hyperplanes in general position. While such sums have been previously studied, our approach yields sharper bounds by reducing them to sums over the subfield $\mathbb{F}_{q^{r/2}}$ rather than sums over general linear spaces. These estimates can be used to prove the existence of primitive elements in $\mathcal{G}$ in the standard way.

Paperid: 3395, https://arxiv.org/pdf/2502.14412.pdf

Abstract:
The prevalence of Vision-Language Models (VLMs) raises important questions about privacy in an era where visual information is increasingly available. While foundation VLMs demonstrate broad knowledge and learned capabilities, we specifically investigate their ability to infer geographic location from previously unseen image data. This paper introduces a benchmark dataset collected from Google Street View that represents its global distribution of coverage. Foundation models are evaluated on single-image geolocation inference, with many achieving median distance errors of <300 km. We further evaluate VLM "agents" with access to supplemental tools, observing up to a 30.6% decrease in distance error. Our findings establish that modern foundation VLMs can act as powerful image geolocation tools, without being specifically trained for this task. When coupled with increasing accessibility of these models, our findings have greater implications for online privacy. We discuss these risks, as well as future work in this area.

Paperid: 3396, https://arxiv.org/pdf/2502.14326.pdf

Abstract:
Digital fingerprints have brought great convenience and benefits to many online businesses. However, they pose a significant threat to the privacy and security of ordinary users. In this paper, we investigate the effectiveness of current anti-tracking methods against digital fingerprints and design a browser extension that can effectively resist digital fingerprints and record the website's collection of digital fingerprint-related information.

Paperid: 3397, https://arxiv.org/pdf/2502.14135.pdf

Abstract:
Concept drift refers to gradual or sudden changes in the properties of data that affect the accuracy of machine learning models. In this paper, we address the problem of concept drift detection in the malware domain. Specifically, we propose and analyze a clustering-based approach to detecting concept drift. Using a subset of the KronoDroid dataset, malware samples are partitioned into temporal batches and analyzed using MiniBatch $K$-Means clustering. The silhouette coefficient is used as a metric to identify points in time where concept drift has likely occurred. To verify our drift detection results, we train learning models under three realistic scenarios, which we refer to as static training, periodic retraining, and drift-aware retraining. In each scenario, we consider four supervised classifiers, namely, Multilayer Perceptron (MLP), Support Vector Machine (SVM), Random Forest, and XGBoost. Experimental results demonstrate that drift-aware retraining guided by silhouette coefficient thresholding achieves classification accuracy far superior to static models, and generally within 1% of periodic retraining, while also being far more efficient than periodic retraining. These results provide strong evidence that our clustering-based approach is effective at detecting concept drift, while also illustrating a highly practical and efficient fully automated approach to improved malware classification via concept drift detection.

Paperid: 3398, https://arxiv.org/pdf/2502.13459.pdf

Abstract:
Deep learning models have gained popularity for conducting various tasks involving source code. However, their black-box nature raises concerns about potential risks. One such risk is a poisoning attack, where an attacker intentionally contaminates the training set with malicious samples to mislead the model's predictions in specific scenarios. To protect source code models from poisoning attacks, we introduce CodeGarrison (CG), a hybrid deep-learning model that relies on code embeddings to identify poisoned code samples. We evaluated CG against the state-of-the-art technique ONION for detecting poisoned samples generated by DAMP, MHM, ALERT, as well as a novel poisoning technique named CodeFooler. Results showed that CG significantly outperformed ONION with an accuracy of 93.5%. We also tested CG's robustness against unknown attacks, achieving an average accuracy of 85.6% in identifying poisoned samples across the four attacks mentioned above.

Paperid: 3399, https://arxiv.org/pdf/2502.13415.pdf

Abstract:
Public exchanges like the New York Stock Exchange and NASDAQ act as auctioneers in a public double auction system, where buyers submit their highest bids and sellers offer their lowest asking prices, along with the number of shares (volume) they wish to trade. The auctioneer matches compatible orders and executes the trades when a match is found. However, auctioneers involved in high-volume exchanges, such as dark pools, may not always be reliable. They could exploit their position by engaging in practices like front-running or face significant conflicts of interest, i.e., ethical breaches that have frequently resulted in hefty fines and regulatory scrutiny within the financial industry. Previous solutions, based on the use of fully homomorphic encryption (Asharov et al., AAMAS 2020), encrypt orders ensuring that information is revealed only when a match occurs. However, this approach introduces significant computational overhead, making it impractical for high-frequency trading environments such as dark pools. In this work, we propose a new system based on differential privacy combined with lightweight encryption, offering an efficient and practical solution that mitigates the risks of an untrustworthy auctioneer. Specifically, we introduce a new concept called Indifferential Privacy, which can be of independent interest, where a user is indifferent to whether certain information is revealed after some special event, unlike standard differential privacy. For example, in an auction, it's reasonable to disclose the true volume of a trade once all of it has been matched. Moreover, our new concept of Indifferential Privacy allows for maximum matching, which is impossible with conventional differential privacy.

Paperid: 3400, https://arxiv.org/pdf/2502.13171.pdf

Abstract:
Phishing is the most prevalent type of cyber-attack today and is recognized as the leading source of data breaches with significant consequences for both individuals and corporations. Web-based phishing attacks are the most frequent with vectors such as social media posts and emails containing links to phishing URLs that once clicked on render host systems vulnerable to more sinister attacks. Research efforts to detect phishing URLs have involved the use of supervised learning techniques that use large amounts of data to train models and have high computational requirements. They also involve analysis of features derived from vectors including email contents thus affecting user privacy. Additionally, they suffer from a lack of resilience against evolution of threats especially with the advent of generative AI techniques to bypass these systems as with AI-generated phishing URLs. Unsupervised methods such as clustering techniques have also been used in phishing detection in the past, however, they are at times unscalable due to the use of pair-wise comparisons. They also lack high detection rates while detecting phishing campaigns. In this paper, we propose an unsupervised learning approach that is not only fast but scalable, as it does not involve pair-wise comparisons. It is able to detect entire campaigns at a time with a high detection rate while preserving user privacy; this includes the recent surge of campaigns with targeted phishing URLs generated by malicious entities using generative AI techniques.

Paperid: 3401, https://arxiv.org/pdf/2502.12976.pdf

Abstract:
As synthetic data becomes increasingly popular in machine learning tasks, numerous methods--without formal differential privacy guarantees--use synthetic data for training. These methods often claim, either explicitly or implicitly, to protect the privacy of the original training data. In this work, we explore four different training paradigms: coreset selection, dataset distillation, data-free knowledge distillation, and synthetic data generated from diffusion models. While all these methods utilize synthetic data for training, they lead to vastly different conclusions regarding privacy preservation. We caution that empirical approaches to preserving data privacy require careful and rigorous evaluation; otherwise, they risk providing a false sense of privacy.

Paperid: 3402, https://arxiv.org/pdf/2502.12966.pdf

Abstract:
Ethereum has adopted a rollup-centric roadmap to scale by making rollups (layer 2 scaling solutions) the primary method for handling transactions. The first significant step towards this goal was EIP-4844, which introduced blob transactions that are designed to meet the data availability needs of layer 2 protocols. This work constitutes the first rigorous and comprehensive empirical analysis of transaction- and mempool-level data since the institution of blobs on Ethereum on March 13, 2024. We perform a longitudinal study of the early days of the blob fee market analyzing the landscape and the behaviors of its participants. We identify and measure the inefficiencies arising out of suboptimal block packing, showing that at times it has resulted in up to 70% relative fee loss. We hone in and give further insight into two (congested) peak demand periods for blobs. Finally, we document a market design issue relating to subset bidding due to the inflexibility of the transaction structure on packing data as blobs and suggest possible ways to fix it. The latter market structure issue also applies more generally for any discrete objects included within transactions.

Paperid: 3403, https://arxiv.org/pdf/2502.12837.pdf

Abstract:
We describe a framework and tool specification that represents a step towards cybersecurity testing and monitoring of IoT ecosystems. We begin with challenges from a previous paper and discuss an integrated approach and tools to enable testing and monitoring to address these challenges. We also describe exemplary use cases of IoT ecosystems and propose approaches to address the challenges using the framework and tools. The current status of this work is that the specification and conceptualisation is complete, use cases are understood with clear challenges and implementation / extension of the tools and framework is underway with tools at different stages of development. Several key observations have been made throughout this work, as follows. 1) Tools may be used in multiple different combinations, and ad-hoc use is also encouraged, where one tool may provide clues and other tools executed to undertake further investigations based on initial results. 2) Automated execution of tool chains is supported by workflows. 3) support for immutable storage of audit records of tests and results is an important requirement. 4) Indicators (observations or measurements representing information of relevance for assessment of cyber security) are a key mechanism for intercommunication between one tool and another, or with the operator. 5) Mapping this work to established security development lifecycles is a useful means of determining applicability and utility of the tools and framework. 6) There is a key interplay between devices and systems. 7) Anomaly detection in multiple forms is a key means of runtime monitoring. 8) Considerable investigation is needed related to the specifics of each device / system as an item of further work.

Paperid: 3404, https://arxiv.org/pdf/2502.12721.pdf

Abstract:
The MinRank problem is a simple linear algebra problem: given matrices with coefficients in a field, find a non trivial linear combination of the matrices that has a small rank. There are several algebraic modeling of the problem. The main ones are: the Kipnis-Shamir modeling, the Minors modeling and the Support-Minors modeling. The Minors modeling has been studied by Faug{Ã¨}re et al. in 2010, where the authors provide an analysis of the complexity of computing a Gr{Ã¶}bner basis of the modeling, through the computation of the exact Hilbert Series for a generic instance. For the Support-Minors modeling, the first terms of the Hilbert Series are given by Bardet et al. in 2020 based on an heuristic and experimental work. In this work, we provide a formula and a proof for the complete Hilbert Series of the Support Minors modeling for generic instances. This is done by adapting well known results on determinantal ideals to an ideal generated by a particular subset of the set of all minors of a matrix of variables. We then show that this ideal is generated by

Paperid: 3405, https://arxiv.org/pdf/2502.12710.pdf

Abstract:
Large language models (LLMs) have gained significant popularity in recent years. Differentiating between a text written by a human and one generated by an LLM has become almost impossible. Information-hiding techniques such as digital watermarking or steganography can help by embedding information inside text in a form that is unlikely to be noticed. However, existing techniques, such as linguistic-based or format-based methods, change the semantics or cannot be applied to pure, unformatted text. In this paper, we introduce a novel method for information hiding called Innamark, which can conceal any byte-encoded sequence within a sufficiently long cover text. This method is implemented as a multi-platform library using the Kotlin programming language, which is accompanied by a command-line tool and a web interface. By substituting conventional whitespace characters with visually similar Unicode whitespace characters, our proposed scheme preserves the semantics of the cover text without changing the number of characters. Furthermore, we propose a specified structure for secret messages that enables configurable compression, encryption, hashing, and error correction. An experimental benchmark comparison on a dataset of 1 000 000 Wikipedia articles compares ten algorithms. The results demonstrate the robustness of our proposed Innamark method in various applications and the imperceptibility of its watermarks to humans. We discuss the limits to the embedding capacity and robustness of the algorithm and how these could be addressed in future work.

Paperid: 3406, https://arxiv.org/pdf/2502.12630.pdf

Abstract:
This paper presents a novel approach to evaluating the security of large language models (LLMs) against prompt leakage-the exposure of system-level prompts or proprietary configurations. We define prompt leakage as a critical threat to secure LLM deployment and introduce a framework for testing the robustness of LLMs using agentic teams. Leveraging AG2 (formerly AutoGen), we implement a multi-agent system where cooperative agents are tasked with probing and exploiting the target LLM to elicit its prompt. Guided by traditional definitions of security in cryptography, we further define a prompt leakage-safe system as one in which an attacker cannot distinguish between two agents: one initialized with an original prompt and the other with a prompt stripped of all sensitive information. In a safe system, the agents' outputs will be indistinguishable to the attacker, ensuring that sensitive information remains secure. This cryptographically inspired framework provides a rigorous standard for evaluating and designing secure LLMs. This work establishes a systematic methodology for adversarial testing of prompt leakage, bridging the gap between automated threat modeling and practical LLM security. You can find the implementation of our prompt leakage probing on GitHub.

Paperid: 3407, https://arxiv.org/pdf/2502.12460.pdf

Abstract:
Organizations often lay down rules or guidelines called Natural Language Access Control Policies (NLACPs) for specifying who gets access to which information and when. However, these cannot be directly used in a target access control model like Attribute-based Access Control (ABAC). Manually translating the NLACP rules into Machine Enforceable Security Policies (MESPs) is both time consuming and resource intensive, rendering it infeasible especially for large organizations. Automated machine translation workflows, on the other hand, require information security officers to be adept at using such processes. To effectively address this problem, we have developed a free web-based publicly accessible tool called LMN (LLMs for generating MESPs from NLACPs) that takes an NLACP as input and converts it into a corresponding MESP. Internally, LMN uses the GPT 3.5 API calls and an appropriately chosen prompt. Extensive experiments with different prompts and performance metrics firmly establish the usefulness of LMN.

Paperid: 3408, https://arxiv.org/pdf/2502.12441.pdf

Abstract:
Shor's algorithm is well-known for its capability to address the elliptic curve discrete logarithm problem (ECDLP) in polynomial time. The enhancement of its quantum resources continues to be a crucial focus of research. Nevertheless, the application of projective coordinates for quantum resource optimization remains an unresolved issue, mainly because the representation of projective coordinates lacks uniqueness without employing modular division operations. Our study reveals that projective coordinates do not provide the same advantages as affine coordinates when utilizing Shor's method to tackle the ECDLP.

Paperid: 3409, https://arxiv.org/pdf/2502.12322.pdf

Abstract:
Video game cheats modify a video game behaviour to give unfair advantages to some players while bypassing the methods game developers use to detect them. This destroys the experience of online gaming and can result in financial losses for game developers. In this work, we present a new type of game cheat, Virtual machine Introspection Cheat (VIC), that takes advantage of virtual machines to stealthy execute game cheats. VIC employees a hypervisor with introspection enabled to lower the bar of cheating against legacy and modern anti-cheat systems. We demonstrate the feasibility and stealthiness of VIC against three popular games (Fortnite, BlackSquad and Team Fortress 2) that include five different anti-cheats. In particular, we use VIC to implement a cheat radar, a wall-hack cheat and a trigger-bot. To support our claim that this type of cheats can be effectively used, we present the performance impact VICs have on gameplay by monitoring the frames per second (fps) while the cheats are activated. Our experimentation also shows how these cheats are currently undetected by the most popular anti-cheat systems, enabling a new paradigm that can take advantage of cloud infrastructure to offer cheating-as-a-service.

Paperid: 3410, https://arxiv.org/pdf/2502.12103.pdf

Abstract:
In the past years, many proposals have emerged in order to address online advertising use-cases without access to third-party cookies. All these proposals leverage some privacy-enhancing technologies such as aggregation or differential privacy. Yet, no public and rich-enough ground truth is currently available to assess the relevancy of aforementioned private advertising frameworks. We are releasing the largest, in terms of number of features, bidding dataset specifically built in alignment with the design of major browser vendors proposals such as Chrome Privacy Sandbox. This dataset, coined CriteoPrivateAds, stands for an anonymised version of Criteo production logs and provides sufficient data to learn bidding models commonly used in online advertising under many privacy constraints (delayed reports, display and user-level differential privacy, user signal quantisation or aggregated reports). We ensured that this dataset, while being anonymised, is able to provide offline results close to production performance of adtech companies including Criteo - making it a relevant ground truth to design private advertising systems. The dataset is available in Hugging Face: https://huggingface.co/datasets/criteo/CriteoPrivateAd.

Paperid: 3411, https://arxiv.org/pdf/2502.11920.pdf

Abstract:
Attack-defense trees (ADTs) are a prominent graphical threat modeling method that is highly recommended for analyzing and communicating security-related information. Despite this, existing empirical studies of attack trees have established their acceptability only for users with highly technical (computer science) backgrounds while raising questions about their suitability for threat modeling stakeholders with a limited technical background. Our research addresses this gap by investigating the impact of the users' technical background on ADT acceptability in an empirical study. Our Method Evaluation Model-based study consisted of n = 102 participants (53 with a strong computer science background and 49 with a limited computer science background) who were asked to complete a series of ADT-related tasks. By analyzing their responses and comparing the results, we reveal that a very limited technical background is sufficient for ADT acceptability. This finding underscores attack trees' viability as a threat modeling method.

Paperid: 3412, https://arxiv.org/pdf/2502.11854.pdf

Abstract:
The rapid proliferation of Internet of Medical Things (IoMT) devices in healthcare has introduced unique cybersecurity challenges, primarily due to the diverse communication protocols and critical nature of these devices This research aims to develop an advanced, real-time anomaly detection framework tailored for IoMT network traffic, leveraging AI/ML models and the CICIoMT2024 dataset By integrating multi-protocol (MQTT, WiFi), attack-specific (DoS, DDoS), time-series (active/idle states), and device-specific (Bluetooth) data, our study captures a comprehensive range of IoMT interactions As part of our data analysis, various machine learning techniques are employed which include an ensemble model using XGBoost for improved performance against specific attack types, sequential models comprised of LSTM and CNN-LSTM that leverage time dependencies, and unsupervised models such as Autoencoders and Isolation Forest that are good in general anomaly detection The results of the experiment prove with an ensemble model lowers false positive rates and reduced detections.

Paperid: 3413, https://arxiv.org/pdf/2502.11658.pdf

Abstract:
Although mobile devices benefit users in their daily lives in numerous ways, they also raise several privacy concerns. For instance, they can reveal sensitive information that can be inferred from location data. This location data is shared through service providers as well as mobile applications. Understanding how and with whom users share their location data -- as well as users' perception of the underlying privacy risks --, are important notions to grasp in order to design usable privacy-enhancing technologies. In this work, we perform a quantitative and qualitative analysis of smartphone users' awareness, perception and self-reported behavior towards location data-sharing through a survey of n=99 young adult participants (i.e., digital natives). We compare stated practices with actual behaviors to better understand their mental models, and survey participants' understanding of privacy risks before and after the inspection of location traces and the information that can be inferred therefrom. Our empirical results show that participants have risky privacy practices: about 54% of participants underestimate the number of mobile applications to which they have granted access to their data, and 33% forget or do not think of revoking access to their data. Also, by using a demonstrator to perform inferences from location data, we observe that slightly more than half of participants (57%) are surprised by the extent of potentially inferred information, and that 47% intend to reduce access to their data via permissions as a result of using the demonstrator. Last, a majority of participants have little knowledge of the tools to better protect themselves, but are nonetheless willing to follow suggestions to improve privacy (51%). Educating people, including digital natives, about privacy risks through transparency tools seems a promising approach.

Paperid: 3414, https://arxiv.org/pdf/2502.10719.pdf

Abstract:
Spectre v1 information disclosure attacks, which exploit CPU conditional branch misprediction, remain unsolved in deployed software. Certain Spectre v1 gadgets can be exploited only by out-of-place mistraining, in which the attacker controls a victim branch's prediction, possibly from another address space, by training a branch that aliases with the victim in the branch predictor unit (BPU) structure. However, constructing a BPU-alias for a victim branch is hard. Consequently, practical out-of-place mistraining attacks use brute-force searches to randomly achieve aliasing. To date, such attacks have been demonstrated only on Intel x86 CPUs. This paper explores the vulnerability of Apple M-Series CPUs to practical out-of-place Spectre v1 mistraining. We show that brute-force out-of-place mistraining fails on the M1. We analytically explain the failure is due to the search space size, assuming (based on Apple patents) that the M1 CPU uses a variant of the TAGE conditional branch predictor. Based on our analysis, we design a new BPU-alias search technique with reduced search space. Our technique requires knowledge of certain M1 BPU parameters and mechanisms, which we reverse engineer. We also use our newfound ability to perform out-of-place Spectre v1 mistraining to test if the M1 CPU implements hardware mitigations against cross-address space out-of-place mistraining -- and find evidence for partial mitigations.

Paperid: 3415, https://arxiv.org/pdf/2502.10711.pdf

Abstract:
Detecting encryption-driven cyber threats remains a large challenge due to the evolving techniques employed to evade traditional detection mechanisms. An entropy-based computational framework was introduced to analyze multi-domain system variations, enabling the identification of malicious encryption behaviors through entropy deviations. By integrating entropy patterns across file operations, memory allocations, and network transmissions, a detection methodology was developed to differentiate between benign and ransomware-induced entropy shifts. A mathematical model was formulated to quantify entropy dynamics, incorporating time-dependent variations and weighted domain contributions to enhance anomaly detection. Experimental evaluations demonstrated that the proposed approach achieved high accuracy across diverse ransomware families while maintaining low false positive rates. Computational efficiency analysis indicated minimal processing overhead, suggesting feasibility for real-time implementation in security-sensitive environments. The study highlighted entropy fluctuations as a useful indicator for identifying malicious encryption processes, reinforcing entropy-driven methodologies as a viable component of cybersecurity strategies.

Paperid: 3416, https://arxiv.org/pdf/2502.10637.pdf

Abstract:
We present a mechanism that for a network of participants allows one participant of the network (Alice) to request some data from another participant (Bob) and either receive a response from Bob within a known-in-advance, bounded time b, or receive a proof that at least one edge on the way to Bob was broken within b, or receive a streaming payment proportional to time passed beyond b during which neither was received. This mechanism allows for building downstream applications that require provable responses from other participants, such as decentralized storage solutions, decentralized AI agents, and more.

Paperid: 3417, https://arxiv.org/pdf/2502.10624.pdf

Abstract:
Network evasion detection aims to distinguish whether the network flow comes from link layer exists network evasion threat, which is a means to disguise the data traffic on detection system by confusing the signature. Since the previous research works has all sorts of frauds, we propose a architecture with deep learning network to handle this problem. In this paper, we extract the critical information as key features from data frame and also specifically propose to use bidirectional long short-term memory (Bi-LSTM) neural network which shows an outstanding performance to trace the serial information, to encode both the past and future trait on the network flows. Furthermore we introduce a classifier named Softmax at the bottom of Bi-LSTM, holding a character to select the correct class. All experiments results shows that we can achieve a significant performance with a deep Bi-LSTM in network evasion detection and it's average accuracy reaches 96.1%.

Paperid: 3418, https://arxiv.org/pdf/2502.10448.pdf

Abstract:
In the context of the rapid development of digital supply chain networks, dealing with the increasing cybersecurity threats and formulating effective security investment strategies to defend against cyberattack risks are the core issues in supply chain management. Cybersecurity investment decision-making is a key strategic task in enterprise supply chain manage-ment. Traditional game theory models and linear programming methods make it challenging to deal with complex problems such as multi-party par-ticipation in the supply chain, resource constraints, and risk uncertainty, re-sulting in enterprises facing high risks and uncertainties in the field of cy-bersecurity. To effectively meet this challenge, this study proposes a nonlin-ear budget-constrained cybersecurity investment optimization model based on variational inequality and projection shrinkage algorithm. This method simulates the impact of market competition on security investment by intro-ducing market share variables, combining variational inequality and projec-tion shrinkage algorithm to solve the model, and analyzing the effect of dif-ferent variables such as budget constraints, cyberattack losses, and market share on supply chain network security. In numerical analysis, the model achieved high cybersecurity levels of 0.96 and 0.95 in the experimental sce-narios of two retailers and two demand markets, respectively, and the budget constraint analysis revealed the profound impact of budget constraints on cybersecurity investment. Through numerical experiments and comparative analysis, the effectiveness and operability of this method in improving sup-ply chain network security are verified.

Paperid: 3419, https://arxiv.org/pdf/2502.10439.pdf

Abstract:
Remote Code Execution (RCE) exploits pose a significant threat to AI and ML systems, particularly in GPU-accelerated environments where the computational power of GPUs can be misused for malicious purposes. This paper focuses on RCE attacks leveraging deserialization vulnerabilities and custom layers, such as TensorFlow Lambda layers, which are often overlooked due to the complexity of monitoring GPU workloads. These vulnerabilities enable attackers to execute arbitrary code, blending malicious activity seamlessly into expected model behavior and exploiting GPUs for unauthorized tasks such as cryptocurrency mining. Unlike traditional CPU-based attacks, the parallel processing nature of GPUs and their high resource utilization make runtime detection exceptionally challenging. In this work, we provide a comprehensive examination of RCE exploits targeting GPUs, demonstrating an attack that utilizes these vulnerabilities to deploy a crypto miner on a GPU. We highlight the technical intricacies of such attacks, emphasize their potential for significant financial and computational costs, and propose strategies for mitigation. By shedding light on this underexplored attack vector, we aim to raise awareness and encourage the adoption of robust security measures in GPU-driven AI and ML systems, with an emphasis on static and model scanning as an easier way to detect exploits.

Paperid: 3420, https://arxiv.org/pdf/2502.10438.pdf

Abstract:
Jailbreak backdoor attacks on LLMs have garnered attention for their effectiveness and stealth. However, existing methods rely on the crafting of poisoned datasets and the time-consuming process of fine-tuning. In this work, we propose JailbreakEdit, a novel jailbreak backdoor injection method that exploits model editing techniques to inject a universal jailbreak backdoor into safety-aligned LLMs with minimal intervention in minutes. JailbreakEdit integrates a multi-node target estimation to estimate the jailbreak space, thus creating shortcuts from the backdoor to this estimated jailbreak space that induce jailbreak actions. Our attack effectively shifts the models' attention by attaching strong semantics to the backdoor, enabling it to bypass internal safety mechanisms. Experimental results show that JailbreakEdit achieves a high jailbreak success rate on jailbreak prompts while preserving generation quality, and safe performance on normal queries. Our findings underscore the effectiveness, stealthiness, and explainability of JailbreakEdit, emphasizing the need for more advanced defense mechanisms in LLMs.

Paperid: 3421, https://arxiv.org/pdf/2502.10404.pdf

Abstract:
In todays increasingly digital world, data has become one of the most valuable assets for organizations. With the rise in cyberattacks, data breaches, and the stringent regulatory environment, it is imperative to adopt robust data protection strategies. One such approach is the use of governance frameworks, which provide structured guidelines, policies, and processes to ensure data protection, compliance, and ethical usage. This paper explores the role of data governance frameworks in protecting sensitive information and maintaining organizational data security. It delves into the principles, strategies, and best practices that constitute an effective governance framework, including risk management, access controls, data quality assurance, and compliance with regulations like GDPR, HIPAA, and CCPA. By analyzing case studies from various sectors, the paper highlights the practicalchallenges, limitations, and advantages of implementing data governance frameworks. Additionally, the paper examines how data governance frameworks contribute to transparency, accountability, and operational efficiency, while also identifying emerging trends and technologies that enhance data protection. Ultimately, the paper aims to provide a comprehensive understanding of how governance frameworks can be leveraged to safeguard organizational data and ensure its responsible use.

Paperid: 3422, https://arxiv.org/pdf/2502.10293.pdf

Abstract:
This paper addresses the critical issue of burnout among cybersecurity professionals, a growing concern that threatens the effectiveness of digital defense systems. As the industry faces a significant attrition crisis, with nearly 46% of cybersecurity leaders contemplating departure from their roles, it is imperative to explore the causes and consequences of burnout through a socio-technical lens. These challenges were discussed by experts from academia and industry in a multi-disciplinary workshop at the 26th International Conference on Human-Computer Interaction to address broad antecedents of burnout, manifestation and its consequences among cybersecurity professionals, as well as programs to mitigate impacts from burnout. Central to the analysis is an empirical study of former National Security Agency (NSA) tactical cyber operators. This paper presents key insights in the following areas based on discussions in the workshop: lessons for public and private sectors from the NSA study, a comparative review of addressing burnout in the healthcare profession. It also outlines a roadmap for future collaborative research, thereby informing interdisciplinary studies.

Paperid: 3423, https://arxiv.org/pdf/2502.10194.pdf

Abstract:
RISC-V is gaining popularity for its adaptability and cost-effectiveness in processor design. With the increasing adoption of RISC-V, the importance of implementing robust security verification has grown significantly. In the state of the art, various approaches have been developed to strengthen the security verification process. Among these methods, assertion-based security verification has proven to be a promising approach for ensuring that security features are effectively met. To this end, some approaches manually define security assertions for processor designs; however, these manual methods require significant time, cost, and human expertise. Consequently, recent approaches focus on translating pre-defined security assertions from one design to another. Nonetheless, these methods are not primarily centered on processor security, particularly RISC-V. Furthermore, many of these approaches have not been validated against real-world attacks, such as hardware Trojans. In this work, we introduce a methodology for translating security assertions across processors with different architectures, using RISC-V as a case study. Our approach reduces time and cost compared to developing security assertions manually from the outset. Our methodology was applied to five critical security modules with assertion translation achieving nearly 100% success across all modules. These results validate the efficacy of our approach and highlight its potential for enhancing security verification in modern processor designs. The effectiveness of the translated assertions was rigorously tested against hardware Trojans defined by large language models (LLMs), demonstrating their reliability in detecting security breaches.

Paperid: 3424, https://arxiv.org/pdf/2502.09837.pdf

Abstract:
Despite the critical role of timing infrastructure in enabling essential services, from public key infrastructure and smart grids to autonomous navigation and high-frequency trading, modern timing stacks remain highly vulnerable to malicious attacks. These threats emerge due to several reasons, including inadequate security mechanisms, the timing architectures unique vulnerability to delays, and implementation issues. In this paper, we aim to obtain a holistic understanding of the issues that make the timing stacks vulnerable to adversarial manipulations, what the challenges are in securing them, and what solutions can be borrowed from the research community to address them. To this end, we perform a systematic analysis of the security vulnerabilities of the timing stack. In doing so, we discover new attack surfaces, i.e., physical timing components and on-device timekeeping, which are often overlooked by existing research that predominantly studies the security of time synchronization protocols. We also show that the emerging trusted timing architectures are flawed and risk compromising wider system security, and propose an alternative design using hardware-software co-design.

Paperid: 3425, https://arxiv.org/pdf/2502.09833.pdf

Abstract:
The increasing sophistication of cyber threats has necessitated the development of advanced detection mechanisms capable of identifying malicious activities with high precision and efficiency. A novel approach, termed Autonomous Feature Resonance, is introduced to address the limitations of traditional ransomware detection methods through the analysis of entropy-based feature interactions within system processes. The proposed method achieves an overall detection accuracy of 97.3\%, with false positive and false negative rates of 1.8\% and 2.1\%, respectively, outperforming existing techniques such as signature-based detection and behavioral analysis. Its decentralized architecture enables local processing of data, reducing latency and improving scalability, while a self-learning mechanism ensures continuous adaptation to emerging threats. Experimental results demonstrate consistent performance across diverse ransomware families, including LockBit 3.0, BlackCat, and Royal, with low detection latency and efficient resource utilization. The method's reliance on entropy as a distinguishing feature provides robustness against obfuscation techniques, making it suitable for real-time deployment in high-throughput environments. These findings highlight the potential of entropy-based approaches to enhance cybersecurity frameworks, offering a scalable and adaptive solution for modern ransomware detection challenges.

Paperid: 3426, https://arxiv.org/pdf/2502.09809.pdf

Abstract:
The integration of tool use into large language models (LLMs) enables agentic systems with real-world impact. In the meantime, unlike standalone LLMs, compromised agents can execute malicious workflows with more consequential impact, signified by their tool-use capability. We propose AgentGuard, a framework to autonomously discover and validate unsafe tool-use workflows, followed by generating safety constraints to confine the behaviors of agents, achieving the baseline of safety guarantee at deployment. AgentGuard leverages the LLM orchestrator's innate capabilities - knowledge of tool functionalities, scalable and realistic workflow generation, and tool execution privileges - to act as its own safety evaluator. The framework operates through four phases: identifying unsafe workflows, validating them in real-world execution, generating safety constraints, and validating constraint efficacy. The output, an evaluation report with unsafe workflows, test cases, and validated constraints, enables multiple security applications. We empirically demonstrate AgentGuard's feasibility with experiments. With this exploratory work, we hope to inspire the establishment of standardized testing and hardening procedures for LLM agents to enhance their trustworthiness in real-world applications.

Paperid: 3427, https://arxiv.org/pdf/2502.09772.pdf

Abstract:
Quantum computing represents a significant advancement in computational capabilities. Of particular concern is its impact on asymmetric cryptography through, notably, Shor's algorithm and the more recently developed Regev's algorithm for factoring composite numbers. We present our implementation of the latter. Our analysis encompasses both quantum simulation results and classical component examples, with particular emphasis on comparative cases between Regev's and Shor's algorithms. Our experimental results reveal that Regev's algorithm indeed outperforms Shor's algorithm for certain composite numbers in practice. However, we observed significant performance variations across different input values. Despite Regev's algorithm's theoretical asymptotic efficiency advantage, our implementation exhibited execution times longer than Shor's algorithm for small integer factorization in both quantum and classical components. These findings offer insights into the practical challenges and performance characteristics of implementing Regev's algorithm in realistic quantum computing scenarios.

Paperid: 3428, https://arxiv.org/pdf/2502.09763.pdf

Abstract:
We present a primary attack ontology and analysis framework for deception attacks in Mixed Reality (MR). This is achieved through multidisciplinary Systematization of Knowledge (SoK), integrating concepts from MR security, information theory, and cognition. While MR grows in popularity, it presents many cybersecurity challenges, particularly concerning deception attacks and their effects on humans. In this paper, we use the Borden-Kopp model of deception to develop a comprehensive ontology of MR deception attacks. Further, we derive two models to assess impact of MR deception attacks on information communication and decision-making. The first, an information-theoretic model, mathematically formalizes the effects of attacks on information communication. The second, a decision-making model, details the effects of attacks on interlaced cognitive processes. Using our ontology and models, we establish the MR Deception Analysis Framework (DAF) to assess the effects of MR deception attacks on information channels, perception, and attention. Our SoK uncovers five key findings for research and practice and identifies five research gaps to guide future work.

Paperid: 3429, https://arxiv.org/pdf/2502.09744.pdf

Abstract:
Federated Learning (FL) provides a decentralized machine learning approach, where multiple devices or servers collaboratively train a model without sharing their raw data, thus enabling data privacy. This approach has gained significant interest in academia and industry due to its privacy-preserving properties, which are particularly valuable in the medical domain where data availability is often protected under strict regulations. A relatively unexplored area is the use of FL to fine-tune Foundation Models (FMs) for time series forecasting, potentially enhancing model efficacy by overcoming data limitation while maintaining privacy. In this paper, we fine-tuned time series FMs with Electrocardiogram (ECG) and Impedance Cardiography (ICG) data using different FL techniques. We then examined various scenarios and discussed the challenges FL faces under different data heterogeneity configurations. Our empirical results demonstrated that while FL can be effective for fine-tuning FMs on time series forecasting tasks, its benefits depend on the data distribution across clients. We highlighted the trade-offs in applying FL to FM fine-tuning.

Paperid: 3430, https://arxiv.org/pdf/2502.09553.pdf

Abstract:
Voice Authentication (VA), also known as Automatic Speaker Verification (ASV), is a widely adopted authentication method, particularly in automated systems like banking services, where it serves as a secondary layer of user authentication. Despite its popularity, VA systems are vulnerable to various attacks, including replay, impersonation, and the emerging threat of deepfake audio that mimics the voice of legitimate users. To mitigate these risks, several defense mechanisms have been proposed. One such solution, Voice Pops, aims to distinguish an individual's unique phoneme pronunciations during the enrollment process. While promising, the effectiveness of VA+VoicePop against a broader range of attacks, particularly logical or adversarial attacks, remains insufficiently explored. We propose a novel attack method, which we refer to as SyntheticPop, designed to target the phoneme recognition capabilities of the VA+VoicePop system. The SyntheticPop attack involves embedding synthetic "pop" noises into spoofed audio samples, significantly degrading the model's performance. We achieve an attack success rate of over 95% while poisoning 20% of the training dataset. Our experiments demonstrate that VA+VoicePop achieves 69% accuracy under normal conditions, 37% accuracy when subjected to a baseline label flipping attack, and just 14% accuracy under our proposed SyntheticPop attack, emphasizing the effectiveness of our method.

Paperid: 3431, https://arxiv.org/pdf/2502.09225.pdf

Abstract:
Equational Unification is a critical problem in many areas such as automated theorem proving and security protocol analysis. In this paper, we focus on XOR-Unification, that is, unification modulo the theory of exclusive-or. This theory contains an operator with the properties Associativity, Commutativity, Nilpotency, and the presence of an identity. In the proof assistant Coq, we implement an algorithm that solves XOR unification problems, whose design was inspired by Liu and Lynch, and prove it sound, complete, and terminating. Using Coq's code extraction capability we obtain an implementation in the programming language OCaml.

Paperid: 3432, https://arxiv.org/pdf/2502.09201.pdf

Abstract:
Commitment schemes are essential to many cryptographic protocols and schemes with applications that include privacy-preserving computation on data, privacy-preserving authentication, and, in particular, oblivious transfer protocols. For quantum oblivious transfer (qOT) protocols, unconditionally binding commitment schemes that do not rely on hardness assumptions from structured mathematical problems are required. These additional constraints severely limit the choice of commitment schemes to random oracle-based constructions or Naor's bit commitment scheme. As these protocols commit to individual bits, the use of such commitment schemes comes at a high bandwidth and computational cost. In this work, we investigate improvements to the efficiency of commitment schemes used in qOT protocols and propose an extension of Naor's commitment scheme requiring the existence of one-way functions (OWF) to reduce communication complexity for 2-bit strings. Additionally, we provide an interactive string commitment scheme with preprocessing to enable a fast and efficient computation of commitments.

Paperid: 3433, https://arxiv.org/pdf/2502.08843.pdf

Abstract:
The rapid evolution of encryption-based threats has rendered conventional detection mechanisms increasingly ineffective against sophisticated attack strategies. Monitoring entropy variations across hierarchical system levels offers an alternative approach to identifying unauthorized data modifications without relying on static signatures. A framework leveraging hierarchical entropy disruption was introduced to analyze deviations in entropy distributions, capturing behavioral anomalies indicative of malicious encryption operations. Evaluating the framework across multiple ransomware variants demonstrated its capability to achieve high detection accuracy while maintaining minimal computational overhead. Entropy distributions across different system directories revealed that encryption activities predominantly targeted user-accessible files, aligning with observed attacker strategies. Detection latency analysis indicated that early-stage identification was feasible, mitigating potential data loss before critical system impact occurred. The framework's ability to operate efficiently in real-time environments was validated through an assessment of resource utilization, confirming a balanced trade-off between detection precision and computational efficiency. Comparative benchmarking against established detection methods highlighted the limitations of conventional approaches in identifying novel ransomware variants, whereas entropy-based anomaly detection provided resilience against obfuscation techniques.

Paperid: 3434, https://arxiv.org/pdf/2502.08217.pdf

Abstract:
Open human mobility data is considered an essential basis for the profound research and analysis required for the transition to sustainable mobility and sustainable urban planning. Cycling data has especially been the focus of data collection endeavors in recent years. Although privacy risks regarding location data are widely known, practitioners often refrain from advanced privacy mechanisms to prevent utility losses. Removing user identifiers from trips is thereby deemed a major privacy gain, as it supposedly prevents linking single trips to obtain entire movement patterns. In this paper, we propose a novel attack to reconstruct user identifiers in GPS trip datasets consisting of single trips, unlike previous ones that are dedicated to evaluating trajectory-user linking in the context of check-in data. We evaluate the remaining privacy risk for users in such datasets and our empirical findings from two real-world datasets show that the risk of re-identification is significant even when personal identifiers have been removed, and that truncation as a simple additional privacy mechanism may not be effective in protecting user privacy. Further investigations indicate that users who frequently visit locations that are only visited by a small number of others, tend to be more vulnerable to re-identification.

Paperid: 3435, https://arxiv.org/pdf/2502.08013.pdf

Abstract:
Encryption-based cyber threats continue to evolve, employing increasingly sophisticated techniques to bypass traditional detection mechanisms. Many existing classification strategies depend on static rule sets, signature-based matching, or machine learning models that require extensive labeled datasets, making them ineffective against emerging ransomware families that exhibit polymorphic and adversarial behaviors. A novel classification framework structured through hierarchical manifold projection introduces a mathematical approach to detecting malicious encryption workflows, preserving geometric consistencies that differentiate ransomware-induced modifications from benign cryptographic operations. The proposed methodology transforms encryption sequences into structured manifold embeddings, ensuring classification robustness through non-Euclidean feature separability rather than reliance on static indicators. Generalization capabilities remain stable across diverse ransomware variants, as hierarchical decomposition techniques capture multi-scale encryption characteristics while maintaining resilience against code obfuscation and execution flow modifications. Empirical analysis demonstrates that detection accuracy remains high even when encryption key variability, delayed execution tactics, or API call obfuscation strategies are introduced, reinforcing the reliability of manifold-based classification. Real-time scalability assessments confirm that the proposed approach maintains computational efficiency across increasing dataset volumes, validating its applicability to large-scale threat detection scenarios.

Paperid: 3436, https://arxiv.org/pdf/2502.07594.pdf

Abstract:
Distributed certification is a set of mechanisms that allows an all-knowing prover to convince the units of a communication network that the network's state has some desired property, such as being 3-colorable or triangle-free. Classical mechanisms, such as proof labeling schemes (PLS), consist of a message from the prover to each unit, followed by one round of communication between each unit and its neighbors. Later works consider extensions, called distributed interactive proofs, where the prover and the units can have multiple rounds of communication before the communication among the units. Recently, Bick, Kol, and Oshman (SODA '22) defined a zero-knowledge version of distributed interactive proofs, where the prover convinces the units of the network's state without revealing any other information about the network's state or structure. In their work, they propose different variants of this model and show that many graph properties of interest can be certified with them. In this work, we define and study distributed non-interactive zero-knowledge proofs (dNIZK); these can be seen as a non-interactive version of the aforementioned model, and also as a zero-knowledge version of PLS. We prove the following: - There exists a dNIZK protocol for 3-coloring with O(log n)-bit messages from the prover and O(log n)-size messages among neighbors. - There exists a family of dNIZK protocols for triangle-freeness, that presents a trade-off between the size of the messages from the prover and the size of the messages among neighbors. - There exists a dNIZK protocol for any graph property in NP in the random oracle models, which is secure against an arbitrary number of malicious parties.

Paperid: 3437, https://arxiv.org/pdf/2502.07498.pdf

Abstract:
The increasing sophistication of cyber threats has necessitated the development of advanced detection mechanisms capable of identifying and mitigating ransomware attacks with high precision and efficiency. A novel framework, termed Decentralized Entropy-Driven Detection (DED), is introduced, leveraging autonomous neural graph embeddings and entropy-based anomaly scoring to address the limitations of traditional methods. The framework operates on a distributed network of nodes, eliminating single points of failure and enhancing resilience against targeted attacks. Experimental results demonstrate its ability to achieve detection accuracy exceeding 95\%, with false positive rates maintained below 2\% across diverse ransomware variants. The integration of graph-based modeling and machine learning techniques enables the framework to capture complex system interactions, facilitating the identification of subtle anomalies indicative of ransomware activity. Comparative analysis against existing methods highlights its superior performance in terms of detection rates and computational efficiency. Case studies further validate its effectiveness in real-world scenarios, showcasing its ability to detect and mitigate ransomware attacks within minutes of their initiation. The proposed framework represents a significant step forward in cybersecurity, offering a scalable and adaptive solution to the growing challenge of ransomware detection.

Paperid: 3438, https://arxiv.org/pdf/2502.07330.pdf

Abstract:
The conspicuous lack of cloud-specific security certifications, in addition to the existing market fragmentation, hinder transparency and accountability in the provision and usage of European cloud services. Both issues ultimately reflect on the level of customers' trustworthiness and adoption of cloud services. The upcoming demand for continuous certification has not yet been definitively addressed and it remains unclear how the level 'high' of the European Cybersecurity Certification Scheme for Cloud Services (EUCS) shall be technologically achieved. The introduction of AI in cloud services is raising the complexity of certification even further. This paper presents the EMERALD Certification-as-a-Service (CaaS) concept for continuous certification of harmonized cybersecurity schemes, like the EUCS. EMERALD CaaS aims to provide agile and lean re-certification to consumers that adhere to a defined level of security and trust in a uniform way across heterogeneous environments consisting of combinations of different resources (Cloud, Edge, IoT). Initial findings suggest that EMERALD will significantly contribute to continuous certification, boosting providers and users of cloud services to maintain regulatory compliance towards the latest and upcoming security schemes.

Paperid: 3439, https://arxiv.org/pdf/2502.06706.pdf

Abstract:
Advances in technology, a growing pool of sensitive data, and heightened global tensions has increased the demand for skilled cybersecurity professionals. Despite the recent increase in attention given to cybersecurity education, traditional approaches have continue in failing to keep pace with the rapidly evolving cyber threat landscape. Challenges such as a shortage of qualified educators and resource-intensive practical training exacerbate these issues. Gamification offers an innovative approach to provide practical hands-on experiences, and equip educators with up-to-date and accessible teaching tools that are targeted to industry-specific concepts. The paper begins with a review of the literature on existing challenges in cybersecurity education and gamification methods already employed in the field, before presenting a real-world case study of a gamified cryptography teaching tool. The paper discusses the design, development process, and intended use cases for this tool. This research highlights and provides an example of how integrating gamification into curricula can address key educational gaps, ensuring a more robust and effective pipeline of cybersecurity talent for the future.

Paperid: 3440, https://arxiv.org/pdf/2502.06502.pdf

Abstract:
This paper presents a security paradigm for edge devices to defend against various internal and external threats. The first section of the manuscript proposes employing machine learning models to identify MQTT-based (Message Queue Telemetry Transport) attacks using the Intrusion Detection and Prevention System (IDPS) for edge nodes. Because the Machine Learning (ML) model cannot be trained directly on low-performance platforms (such as edge devices),a new methodology for updating ML models is proposed to provide a tradeoff between the model performance and the computational complexity. The proposed methodology involves training the model on a high-performance computing platform and then installing the trained model as a detection engine on low-performance platforms (such as the edge node of the edge layer) to identify new attacks. Multiple security techniques have been employed in the second half of the manuscript to verify that the exchanged trained model and the exchanged data files are valid and undiscoverable (information authenticity and privacy) and that the source (such as a fog node or edge device) is indeed what it it claimed to be (source authentication and message integrity). Finally, the proposed security paradigm is found to be effective against various internal and external threats and can be applied to a low-cost single-board computer (SBC).

Paperid: 3441, https://arxiv.org/pdf/2502.06425.pdf

Abstract:
Large language models (LLMs) are increasingly utilized in domains such as finance, healthcare, and interpersonal relationships to provide advice tailored to user traits and contexts. However, this personalization often relies on sensitive data, raising critical privacy concerns and necessitating data minimization. To address these challenges, we propose a framework that integrates zero-knowledge proof (ZKP) technology, specifically zkVM, with LLM-based chatbots. This integration enables privacy-preserving data sharing by verifying user traits without disclosing sensitive information. Our research introduces both an architecture and a prompting strategy for this approach. Through empirical evaluation, we clarify the current constraints and performance limitations of both zkVM and the proposed prompting strategy, thereby demonstrating their practical feasibility in real-world scenarios.

Paperid: 3442, https://arxiv.org/pdf/2502.06293.pdf

Abstract:
RustMC is a stateless model checker that enables verification of concurrent Rust programs. As both Rust and C/C++ compile to LLVM IR, RustMC builds on GenMC which provides a verification framework for LLVM IR. This enables the automatic verification of Rust code and any C/C++ dependencies. This tool paper presents the key challenges we addressed to extend GenMC. These challenges arise from Rust's unique compilation strategy and include intercepting threading operations, handling memory intrinsics and uninitialized accesses. Through case studies adapted from real-world programs, we demonstrate RustMC's effectiveness at finding concurrency bugs stemming from unsafe Rust code, FFI calls to C/C++, and incorrect use of atomic operations.

Paperid: 3443, https://arxiv.org/pdf/2502.06043.pdf

Abstract:
The evolution of ransomware requires the development of more sophisticated detection methodologies capable of identifying malicious behaviors beyond traditional signature-based and heuristic techniques. The proposed Hierarchical Polysemantic Feature Embedding framework introduces a structured approach to ransomware detection through hyperbolic feature representations that capture hierarchical dependencies within executable behaviors. By embedding ransomware-relevant features into a non-Euclidean space, the framework maintains a well-defined decision boundary, ensuring improved generalization across previously unseen ransomware variants. Experimental evaluations demonstrated that the framework consistently outperformed conventional machine learning-based models, achieving higher detection accuracy while maintaining low false positive rates. The structured clustering mechanism employed within the hyperbolic space enabled robust classification even in the presence of obfuscation techniques, delayed execution strategies, and polymorphic transformations. Comparative analysis highlighted the limitations of existing detection frameworks, particularly in their inability to dynamically adapt to evolving ransomware tactics. Computational efficiency assessments indicated that the proposed method maintained a balance between detection performance and processing overhead, making it a viable candidate for real-world cybersecurity applications. The ability to detect emerging ransomware families without requiring extensive retraining demonstrated the adaptability of hierarchical embeddings in security analytics.

Paperid: 3444, https://arxiv.org/pdf/2502.06039.pdf

Abstract:
Prompt engineering reduces reasoning mistakes in Large Language Models (LLMs). However, its effectiveness in mitigating vulnerabilities in LLM-generated code remains underexplored. To address this gap, we implemented a benchmark to automatically assess the impact of various prompt engineering strategies on code security. Our benchmark leverages two peer-reviewed prompt datasets and employs static scanners to evaluate code security at scale. We tested multiple prompt engineering techniques on GPT-3.5-turbo, GPT-4o, and GPT-4o-mini. Our results show that for GPT-4o and GPT-4o-mini, a security-focused prompt prefix can reduce the occurrence of security vulnerabilities by up to 56%. Additionally, all tested models demonstrated the ability to detect and repair between 41.9% and 68.7% of vulnerabilities in previously generated code when using iterative prompting techniques. Finally, we introduce a "prompt agent" that demonstrates how the most effective techniques can be applied in real-world development workflows.

Paperid: 3445, https://arxiv.org/pdf/2502.05987.pdf

Abstract:
UNO is a popular multiplayer card game. In each turn, a player has to play a card in their hand having the same number or color as the most recently played card. When having few people, adding virtual players to play the game can easily be done in UNO video games. However, this is a challenging task for physical UNO without computers. In this paper, we propose a protocol that can simulate virtual players using only physical cards. In particular, our protocol can uniformly select a valid card to play from each virtual player's hand at random, or report that none exists, without revealing the rest of its hand. The protocol can also be applied to simulate virtual players in other turn-based card or tile games where each player has to select a valid card or tile to play in each turn.

Paperid: 3446, https://arxiv.org/pdf/2502.05951.pdf

Abstract:
This work introduces Cyri, an AI-powered conversational assistant designed to support a human user in detecting and analyzing phishing emails by leveraging Large Language Models. Cyri has been designed to scrutinize emails for semantic features used in phishing attacks, such as urgency, and undesirable consequences, using an approach that unifies features already established in the literature with others by Cyri features extraction methodology. Cyri can be directly plugged into a client mail or webmail, ensuring seamless integration with the user's email workflow while maintaining data privacy through local processing. By performing analyses on the user's machine, Cyri eliminates the need to transmit sensitive email data over the internet, reducing associated security risks. The Cyri user interface has been designed to reduce habituation effects and enhance user engagement. It employs dynamic visual cues and context-specific explanations to keep users alert and informed while using emails. Additionally, it allows users to explore identified malicious semantic features both through conversation with the agent and visual exploration, obtaining the advantages of both modalities for expert or non-expert users. It also allows users to keep track of the conversation, supports the user in solving additional questions on both computed features or new parts of the mail, and applies its detection on demand. To evaluate Cyri, we crafted a comprehensive dataset of 420 phishing emails and 420 legitimate emails. Results demonstrate high effectiveness in identifying critical phishing semantic features fundamental to phishing detection. A user study involving 10 participants, both experts and non-experts, evaluated Cyri's effectiveness and usability. Results indicated that Cyri significantly aided users in identifying phishing emails and enhanced their understanding of phishing tactics.

Paperid: 3447, https://arxiv.org/pdf/2502.05931.pdf

Abstract:
EEG-based neural networks, pivotal in medical diagnosis and brain-computer interfaces, face significant intellectual property (IP) risks due to their reliance on sensitive neurophysiological data and resource-intensive development. Current watermarking methods, particularly those using abstract trigger sets, lack robust authentication and fail to address the unique challenges of EEG models. This paper introduces a cryptographic wonder filter-based watermarking framework tailored for EEG-based neural networks. Leveraging collision-resistant hashing and public-key encryption, the wonder filter embeds the watermark during training, ensuring minimal distortion ($\leq 5\%$ drop in EEG task accuracy) and high reliability (100\% watermark detection). The framework is rigorously evaluated against adversarial attacks, including fine-tuning, transfer learning, and neuron pruning. Results demonstrate persistent watermark retention, with classification accuracy for watermarked states remaining above 90\% even after aggressive pruning, while primary task performance degrades faster, deterring removal attempts. Piracy resistance is validated by the inability to embed secondary watermarks without severe accuracy loss ( $>10\%$ in EEGNet and CCNN models). Cryptographic hashing ensures authentication, reducing brute-force attack success probabilities. Evaluated on the DEAP dataset across models (CCNN, EEGNet, TSception), the method achieves $>99.4\%$ null-embedding accuracy, effectively eliminating false positives. By integrating wonder filters with EEG-specific adaptations, this work bridges a critical gap in IP protection for neurophysiological models, offering a secure, tamper-proof solution for healthcare and biometric applications. The framework's robustness against adversarial modifications underscores its potential to safeguard sensitive EEG models while maintaining diagnostic utility.

Paperid: 3448, https://arxiv.org/pdf/2502.05685.pdf

Abstract:
The movement to mobile computing solutions provides flexibility to different users whether it is a business user, a student, or even providing entertainment to children and adults of all ages. Due to these emerging technologies mobile users are unable to safeguard private information in a very effective way and cybercrimes are increasing day by day. This manuscript will focus on security vulnerabilities in the mobile computing industry, especially focusing on tablets and smart phones. This study will dive into current security threats for the Android & Apple iOS market, exposing security risks and threats that the novice or average user may not be aware of. The purpose of this study is to analyze current security risks and threats, and provide solutions that may be deployed to protect against such threats.

Paperid: 3449, https://arxiv.org/pdf/2502.05341.pdf

Abstract:
Encrypted behavioral patterns provide a unique avenue for classifying complex digital threats without reliance on explicit feature extraction, enabling detection frameworks to remain effective even when conventional static and behavioral methodologies fail. A novel approach based on Neural Encrypted State Transduction (NEST) is introduced to analyze cryptographic flow residuals and classify threats through their encrypted state transitions, mitigating evasion tactics employed through polymorphic and obfuscated attack strategies. The mathematical formulation of NEST leverages transduction principles to map state transitions dynamically, enabling high-confidence classification without requiring direct access to decrypted execution traces. Experimental evaluations demonstrate that the proposed framework achieves improved detection accuracy across multiple ransomware families while exhibiting resilience against adversarial perturbations and previously unseen attack variants. The model maintains competitive processing efficiency, offering a practical balance between classification performance and computational resource constraints, making it suitable for large-scale security deployments. Comparative assessments reveal that NEST consistently outperforms baseline classification models, particularly in detecting ransomware samples employing delayed encryption, entropy-based obfuscation, and memory-resident execution techniques. The capacity to generalize across diverse execution environments reinforces the applicability of encrypted transduction methodologies in adversarial classification tasks beyond conventional malware detection pipelines. The integration of residual learning mechanisms within the transduction layers further enhances classification robustness, minimizing both false positives and misclassification rates across varied operational contexts.

Paperid: 3450, https://arxiv.org/pdf/2502.05225.pdf

Abstract:
Phishing often targets victims through visually perturbed texts to bypass security systems. The noise contained in these texts functions as an adversarial attack, designed to deceive language models and hinder their ability to accurately interpret the content. However, since it is difficult to obtain sufficient phishing cases, previous studies have used synthetic datasets that do not contain real-world cases. In this study, we propose the BitAbuse dataset, which includes real-world phishing cases, to address the limitations of previous research. Our dataset comprises a total of 325,580 visually perturbed texts. The dataset inputs are drawn from the raw corpus, consisting of visually perturbed sentences and sentences generated through an artificial perturbation process. Each input sentence is labeled with its corresponding ground truth, representing the restored, non-perturbed version. Language models trained on our proposed dataset demonstrated significantly better performance compared to previous methods, achieving an accuracy of approximately 96%. Our analysis revealed a significant gap between real-world and synthetic examples, underscoring the value of our dataset for building reliable pre-trained models for restoration tasks. We release the BitAbuse dataset, which includes real-world phishing cases annotated with visual perturbations, to support future research in adversarial attack defense.

Paperid: 3451, https://arxiv.org/pdf/2502.05220.pdf

Abstract:
Increased utilization of unmanned aerial vehicles (UAVs) in critical operations necessitates secure and reliable communication with Ground Control Stations (GCS). This paper introduces Aero-LLM, a framework integrating multiple Large Language Models (LLMs) to enhance UAV mission security and operational efficiency. Unlike conventional singular LLMs, Aero-LLM leverages multiple specialized LLMs for various tasks, such as inferencing, anomaly detection, and forecasting, deployed across onboard systems, edge, and cloud servers. This dynamic, distributed architecture reduces performance bottleneck and increases security capabilities. Aero-LLM's evaluation demonstrates outstanding task-specific metrics and robust defense against cyber threats, significantly enhancing UAV decision-making and operational capabilities and security resilience against cyber attacks, setting a new standard for secure, intelligent UAV operations.

Paperid: 3452, https://arxiv.org/pdf/2502.05219.pdf

Abstract:
This article describes how technical infrastructure developed by the nonprofit OpenMined enables external scrutiny of AI systems without compromising sensitive information. Independent external scrutiny of AI systems provides crucial transparency into AI development, so it should be an integral component of any approach to AI governance. In practice, external researchers have struggled to gain access to AI systems because of AI companies' legitimate concerns about security, privacy, and intellectual property. But now, privacy-enhancing technologies (PETs) have reached a new level of maturity: end-to-end technical infrastructure developed by OpenMined combines several PETs into various setups that enable privacy-preserving audits of AI systems. We showcase two case studies where this infrastructure has been deployed in real-world governance scenarios: "Understanding Social Media Recommendation Algorithms with the Christchurch Call" and "Evaluating Frontier Models with the UK AI Safety Institute." We describe types of scrutiny of AI systems that could be facilitated by current setups and OpenMined's proposed future setups. We conclude that these innovative approaches deserve further exploration and support from the AI governance community. Interested policymakers can focus on empowering researchers on a legal level.

Paperid: 3453, https://arxiv.org/pdf/2502.05213.pdf

Abstract:
As large language models (LLMs) grow more powerful, concerns over copyright infringement of LLM-generated texts have intensified. LLM watermarking has been proposed to trace unauthorized redistribution or resale of generated content by embedding identifiers within the text. Existing approaches primarily rely on one-bit watermarking, which only verifies whether a text was generated by a specific LLM. In contrast, multi-bit watermarking encodes richer information, enabling the identification of the specific LLM and user involved in generated or distributed content. However, current multi-bit methods directly embed the watermark into the text without considering its watermark capacity, which can result in failures, especially in low-entropy texts. In this paper, we analyze that the watermark embedding follows a normal distribution. We then derive a formal inequality to optimally segment the text for watermark embedding. Building upon this, we propose DERMARK, a dynamic, efficient, and robust multi-bit watermarking method that divides the text into variable-length segments for each watermark bit during the inference. Moreover, DERMARK incurs negligible overhead since no additional intermediate matrices are generated and achieves robustness against text editing by minimizing watermark extraction loss. Experiments demonstrate that, compared to SOTA, on average, our method reduces the number of tokens required per embedded bit by 25\%, reduces watermark embedding time by 50\%, and maintains high robustness against text modifications and watermark erasure attacks.

Paperid: 3454, https://arxiv.org/pdf/2502.05046.pdf

Abstract:
Data collection and processing in advanced health monitoring systems are experiencing revolutionary change. In-Sensor Computing (ISC) systems emerge as a promising alternative to save energy on massive data transmission, analog-to-digital conversion, and ineffective processing. While the new paradigm shift of ISC systems gains increasing attention, the highly compacted systems could incur new challenges from a hardware security perspective. This work first conducts a literature review to highlight the research trend of this topic and then performs comprehensive analyses on the root of security challenges. This is the first work that compares the security challenges of traditional sensor-involved computing systems and emerging ISC systems. Furthermore, new attack scenarios are predicted for board-, chip-, and device-level ISC systems. Two proof-of-concept demos are provided to inspire new countermeasure designs against unique hardware security threats in ISC systems.

Paperid: 3455, https://arxiv.org/pdf/2502.05011.pdf

Abstract:
We apply language modeling techniques to detect ransomware activity in NVMe command sequences. We design and train two types of transformer-based models: the Command-Level Transformer (CLT) performs in-context token classification to determine whether individual commands are initiated by ransomware, and the Patch-Level Transformer (PLT) predicts the volume of data accessed by ransomware within a patch of commands. We present both model designs and the corresponding tokenization and embedding schemes and show that they improve over state-of-the-art tabular methods by up to 24% in missed-detection rate, 66% in data loss prevention, and 84% in identifying data accessed by ransomware.

Paperid: 3456, https://arxiv.org/pdf/2502.04659.pdf

Abstract:
Blockchains have revolutionized decentralized applications, with composability enabling atomic, trustless interactions across smart contracts. However, layer 2 (L2) scalability solutions like rollups introduce fragmentation and hinder composability. Current cross-chain protocols, including atomic swaps, bridges, and shared sequencers, lack the necessary coordination mechanisms or rely on trust assumptions, and are thus not sufficient to support full cross-rollup composability. This paper presents $\mathsf{CRATE}$, a secure protocol for cross-rollup composability that ensures all-or-nothing and serializable execution of cross-rollup transactions (CRTs). $\mathsf{CRATE}$ supports rollups on distinct layer 1 (L1) chains, achieves finality in 4 rounds on L1, and only relies on the underlying L1s and the liveness of L2s. We introduce two formal models for CRTs, define atomicity within them, and formally prove the security of $\mathsf{CRATE}$. We also provide an implementation of $\mathsf{CRATE}$ along with a cross-rollup flash loan application; our experiments demonstrate that $\mathsf{CRATE}$ is practical in terms of gas usage on L1.

Paperid: 3457, https://arxiv.org/pdf/2502.04648.pdf

Abstract:
With greater design complexity, the challenge to anticipate and mitigate security issues provides more responsibility for the designer. As hardware provides the foundation of a secure system, we need tools and techniques that support engineers to improve trust and help them address security concerns. Knowing the security assets in a design is fundamental to downstream security analyses, such as threat modeling, weakness identification, and verification. This paper proposes an automated approach for the initial identification of potential security assets in a Verilog design. Taking inspiration from manual asset identification methodologies, we analyze open-source hardware designs in three IP families and identify patterns and commonalities likely to indicate structural assets. Through iterative refinement, we provide a potential set of primary security assets and thus help to reduce the manual search space.

Paperid: 3458, https://arxiv.org/pdf/2502.04538.pdf

Abstract:
Concerns about big tech's monopoly power have featured prominently in recent media and policy discourse, as regulators across the EU, the US, and beyond have ramped up efforts to promote healthier market competition. One favored approach is to require certain kinds of interoperation between platforms, to mitigate the current concentration of power in the biggest companies. Unsurprisingly, interoperability initiatives have generally been met with resistance by big tech companies. Perhaps more surprisingly, a significant part of that pushback has been in the name of security -- that is, arguing against interoperation on the basis that it will undermine security. We conduct a systematic examination of "security vs. interoperability" (SvI) discourse in the context of EU antitrust proceedings. Our resulting contributions are threefold. First, we propose a taxonomy of SvI concerns in three categories: engineering, vetting, and hybrid. Second, we present an analytical framework for assessing real-world SvI concerns, and illustrate its utility by analyzing several case studies spanning our three taxonomy categories. Third, we undertake a comparative analysis that highlights key considerations around the interplay of economic incentives, market power, and security across our diverse case study contexts, identifying common patterns in each taxonomy category. Our contributions provide valuable analytical tools for experts and non-experts alike to critically assess SvI discourse in today's fast-paced regulatory landscape.

Paperid: 3459, https://arxiv.org/pdf/2502.04421.pdf

Abstract:
We present an approach to identifying which ransomware adversaries are most likely to target specific entities, thereby assisting these entities in formulating better protection strategies. Ransomware poses a formidable cybersecurity threat characterized by profit-driven motives, a complex underlying economy supporting criminal syndicates, and the overt nature of its attacks. This type of malware has consistently ranked among the most prevalent, with a rapid escalation in activity observed. Recent estimates indicate that approximately two-thirds of organizations experienced ransomware attacks in 2023 \cite{Sophos2023Ransomware}. A central tactic in ransomware campaigns is publicizing attacks to coerce victims into paying ransoms. Our study utilizes public disclosures from ransomware victims to predict the likelihood of an entity being targeted by a specific ransomware variant. We employ a Large Language Model (LLM) architecture that uses a unique chain-of-thought, multi-shot prompt methodology to define adversary SKRAM (Skills, Knowledge, Resources, Authorities, and Motivation) profiles from ransomware bulletins, threat reports, and news items. This analysis is enriched with publicly available victim data and is further enhanced by a heuristic for generating synthetic data that reflects victim profiles. Our work culminates in the development of a machine learning model that assists organizations in prioritizing ransomware threats and formulating defenses based on the tactics, techniques, and procedures (TTP) of the most likely attackers.

Paperid: 3460, https://arxiv.org/pdf/2502.04303.pdf

Abstract:
In October 2023, 23andMe, a prominent provider of personal genetic testing, ancestry, and health information services, suffered a significant data breach orchestrated by a cybercriminal known as ``Golem.'' Initially, approximately 14,000 user accounts were compromised by a credential smear attack, exploiting reused usernames and passwords from previous data leaks. However, due to the interconnected nature of 23andMe's DNA Relatives and Family Tree features, the breach expanded exponentially, exposing sensitive personal and genetic data of approximately 5.5 million users and 1.4 million additional profiles. The attack highlights the increasing threat of credential stuffing, exacerbated by poor password hygiene and the absence of robust security measures such as multi-factor authentication (MFA) and rate limiting. In response, 23andMe mandated password resets, implemented email-based two-step verification, and advised users to update passwords across other services. This paper critically analyzes the attack methodology, its impact on users and the company, and explores potential mitigation strategies, including enhanced authentication protocols, proactive breach detection, and improved cybersecurity practices. The findings underscore the necessity of stronger user authentication measures and corporate responsibility in safeguarding sensitive genetic and personal data.

Paperid: 3461, https://arxiv.org/pdf/2502.04287.pdf

Abstract:
Managing the security of employee work computers has become increasingly important as today's work model shifts to remote and hybrid work plans. In this paper, we explore the recent 2022 LastPass data breach, in which the attacker obtained sensitive customer data by exploiting a software vulnerability on a DevSecOps engineer's computer. We discuss the methodology of the attacker as well as the impact this incident had on LastPass and its customers. Next, we expand upon the impact the breach had on LastPass as well as its customers. From this, we propose solutions for preparing for and mitigating similar attacks in the future. The aim of this paper is to shed light on the LastPass incident and provide methods for companies to secure their employee base, both nationally and internationally. With a strong security structure, companies can vastly reduce the chances of falling victim to a similar attack.

Paperid: 3462, https://arxiv.org/pdf/2502.03882.pdf

Abstract:
The increasing complexity of cryptographic extortion techniques has necessitated the development of adaptive detection frameworks capable of identifying adversarial encryption behaviors without reliance on predefined signatures. Hierarchical Entropic Diffusion (HED) introduces a structured entropy-based anomaly classification mechanism that systematically tracks fluctuations in entropy evolution to differentiate between benign cryptographic processes and unauthorized encryption attempts. The integration of hierarchical clustering, entropy profiling, and probabilistic diffusion modeling refines detection granularity, ensuring that encryption anomalies are identified despite obfuscation strategies or incremental execution methodologies. Experimental evaluations demonstrated that HED maintained high classification accuracy across diverse ransomware families, outperforming traditional heuristic-based and signature-driven approaches while reducing false positive occurrences. Comparative analysis highlighted that entropy-driven anomaly segmentation improved detection efficiency under variable system workload conditions, ensuring real-time classification feasibility. The computational overhead associated with entropy anomaly detection remained within operational constraints, reinforcing the suitability of entropy-driven classification for large-scale deployment. The ability to identify adversarial entropy manipulations before encryption completion contributes to broader cybersecurity defenses, offering a structured methodology for isolating unauthorized cryptographic activities within heterogeneous computing environments. The results further emphasized that entropy evolution modeling facilitates predictive anomaly detection, enhancing resilience against encryption evasion techniques designed to circumvent traditional detection mechanisms.

Paperid: 3463, https://arxiv.org/pdf/2502.03811.pdf

Abstract:
The digitization of health records has greatly improved the efficiency of the healthcare system and promoted the formulation of related research and policies. However, the widespread application of advanced technologies such as electronic health records, genomic data, and wearable devices in the field of health big data has also intensified the collection of personal sensitive data, bringing serious privacy and security issues. Based on a systematic literature review (SLR), this paper comprehensively outlines the key research in the field of health big data security. By analyzing existing research, this paper explores how cutting-edge technologies such as homomorphic encryption, blockchain, federated learning, and artificial immune systems can enhance data security while protecting personal privacy. This paper also points out the current challenges and proposes a future research framework in this key area.

Paperid: 3464, https://arxiv.org/pdf/2502.03622.pdf

Abstract:
Phishing attacks remain a significant threat in the digital age, yet organizations lack effective methods to tackle phishing attacks without leaking sensitive information. Phish bowl initiatives are a vital part of cybersecurity efforts against these attacks. However, traditional phish bowls require manual anonymization and are often limited to internal use. To overcome these limitations, we introduce AdaPhish, an AI-powered phish bowl platform that automatically anonymizes and analyzes phishing emails using large language models (LLMs) and vector databases. AdaPhish achieves real-time detection and adaptation to new phishing tactics while enabling long-term tracking of phishing trends. Through automated reporting, adaptive analysis, and real-time alerts, AdaPhish presents a scalable, collaborative solution for phishing detection and cybersecurity education.

Paperid: 3465, https://arxiv.org/pdf/2502.03427.pdf

Abstract:
Scalable and secure data management is important in Internet of Things (IoT) applications such as smart water meters, where traditional blockchain storage can be restrictive due to high data volumes. This paper investigates a hybrid blockchain and InterPlanetary File System (IPFS) approach designed to optimise storage efficiency, enhance throughput, and reduce block time by offloading large data off-chain to IPFS while preserving on-chain integrity. A substrate-based private blockchain was developed to store smart water meter (SWM) data, and controlled experiments were conducted to evaluate blockchain performance with and without IPFS. Key metrics, including block size, block time, and transaction throughput, were analysed across varying data volumes and node counts. Results show that integrating IPFS significantly reduces on-chain storage demands, leading to smaller block sizes, increased throughput, and improved block times compared to blockchain-only storage. These findings highlight the potential of hybrid blockchain-IPFS models for efficiently and securely managing high-volume IoT data.

Paperid: 3466, https://arxiv.org/pdf/2502.03398.pdf

Abstract:
The obstacles of each security system combined with the increase of cyber-attacks, negatively affect the effectiveness of network security management and rise the activities to be taken by the security staff and network administrators. So, there is a growing need for the automated auditing and intelligent reporting strategies for reliable network security with as less model complexity as possible. Newly, artificial intelligence has been effectively applied to various network security issues, and numerous studies have been conducted that utilize various artificial intelligence techniques for the purposes of encryption and secure communication, in addition to using artificial intelligence to perform a large number of data encryption operations in record time. The aim of the study is to present and discuss the most prominent methods of artificial intelligence recently used in the field of network security including user authentication, Key exchanging, encryption/decryption, data integrity and intrusion detection system.

Paperid: 3467, https://arxiv.org/pdf/2502.03378.pdf

Abstract:
The long history of misconfigurations and errors in RPKI indicates that they cannot be easily avoided and will most probably persist also in the future. These errors create conflicts between BGP announcements and their covering ROAs, causing the RPKI validation to result in status invalid. Networks that enforce RPKI filtering with Route Origin Validation (ROV) would block such conflicting BGP announcements and as a result lose traffic from the corresponding origins. Since the business incentives of networks are tightly coupled with the traffic they relay, filtering legitimate traffic leads to a loss of revenue, reducing the motivation to filter invalid announcements with ROV. In this work, we introduce a new mechanism, LOV, designed for whitelisting benign conflicts on an Internet scale. The resulting whitelist is made available to RPKI supporting ASes to avoid filtering RPKI-invalid but benign routes. Saving legitimate traffic resolves one main obstacle towards RPKI deployment. We measure live BGP updates using LOV during a period of half a year and whitelist 52,846 routes with benign origin errors.

Paperid: 3468, https://arxiv.org/pdf/2502.03221.pdf

Abstract:
Physical Unclonable Functions (PUFs) enable physical tamper protection for high-assurance devices without needing a continuous power supply that is active over the entire lifetime of the device. Several methods for PUF-based tamper protection have been proposed together with practical quantization and error correction schemes. In this work we take a step back from the implementation to analyze theoretical properties and limits. We apply zero leakage output quantization to existing quantization schemes and minimize the reconstruction error probability under zero leakage. We apply wiretap coding within a helper data algorithm to enable a reliable key reconstruction for the legitimate user while guaranteeing a selectable reconstruction complexity for an attacker, analogously to the security level for a cryptographic algorithm for the attacker models considered in this work. We present lower bounds on the achievable key rates depending on the attacker's capabilities in the asymptotic and finite blocklength regime to give fundamental security guarantees even if the attacker gets partial information about the PUF response and the helper data. Furthermore, we present converse bounds on the number of PUF cells. Our results show for example that for a practical scenario one needs at least 459 PUF cells using 3 bit quantization to achieve a security level of 128 bit.

Paperid: 3469, https://arxiv.org/pdf/2502.02851.pdf

Abstract:
5G enables digital innovation by integrating diverse services, making security especially primary authentication crucial. Two standardized protocols, 5G AKA and EAP AKA', handle authentication for 3GPP and non 3GPP devices. However, 5G AKA has vulnerabilities, including linkability attacks. Additionally, quantum computing poses threats, requiring quantum resistant cryptography. While post-quantum cryptography (PQC) is being standardized, its real world robustness remains unproven. Conventional cryptographic schemes offer reliability due to decades of practical use. To bridge this gap, IETF is standardizing hybrid PQC (HPQC), combining classical and quantum resistant methods. Ensuring forward secrecy and quantum resilience in 5G-AKA is critical. To address these issues, we propose 5G AKA HPQC, a protocol maintaining compatibility with existing standards while enhancing security by integrating keys derived from Elliptic Curve Integrated Encryption Scheme (ECIES) and PQC Key Encapsulation Mechanism (KEM). We validate its security using SVO Logic and ProVerif, confirming its robustness. Performance evaluations assess computational and communication overheads, demonstrating a balance between security and efficiency. This research provides key insights into quantum-safe authentication, contributing to future standardization of secure mobile authentication protocols.

Paperid: 3470, https://arxiv.org/pdf/2502.02730.pdf

Abstract:
Encryption-based attacks have introduced significant challenges for detection mechanisms that rely on predefined signatures, heuristic indicators, or static rule-based classifications. Probabilistic Latent Encryption Mapping presents an alternative detection framework that models ransomware-induced encryption behaviors through statistical representations of entropy deviations and probabilistic dependencies in execution traces. Unlike conventional approaches that depend on explicit bytecode analysis or predefined cryptographic function call monitoring, probabilistic inference techniques classify encryption anomalies based on their underlying statistical characteristics, ensuring greater adaptability to polymorphic attack strategies. Evaluations demonstrate that entropy-driven classification reduces false positive rates while maintaining high detection accuracy across diverse ransomware families and encryption methodologies. Experimental results further highlight the framework's ability to differentiate between benign encryption workflows and adversarial cryptographic manipulations, ensuring that classification performance remains effective across cloud-based and localized execution environments. Benchmark comparisons illustrate that probabilistic modeling exhibits advantages over heuristic and machine learning-based detection approaches, particularly in handling previously unseen encryption techniques and adversarial obfuscation strategies. Computational efficiency analysis confirms that detection latency remains within operational feasibility constraints, reinforcing the viability of probabilistic encryption classification for real-time security infrastructures. The ability to systematically infer encryption-induced deviations without requiring static attack signatures strengthens detection robustness against adversarial evasion techniques.

Paperid: 3471, https://arxiv.org/pdf/2502.01339.pdf

Abstract:
The concatenation of encryption and decryption can be interpreted as data transmission over a noisy communication channel. In this work, we use finite blocklength methods (normal approximation and random coding union bound) as well as asymptotics to show that ciphertext and key sizes of the state-of-the-art post-quantum secure key encapsulation mechanism (KEM) Kyber can be reduced without compromising the security of the scheme. We show that in the asymptotic regime, it is possible to reduce the sizes of ciphertexts and secret keys by 25% for the parameter set Kyber1024 while keeping the bitrate at 1 as proposed in the original scheme. For a single Kyber encryption block used to share a 256-bit AES key, we furthermore show that reductions in ciphertext size of 39% and 33% are possible for Kyber1024 and Kyber512, respectively.

Paperid: 3472, https://arxiv.org/pdf/2502.01275.pdf

Abstract:
Malicious encryption techniques continue to evolve, bypassing conventional detection mechanisms that rely on static signatures or predefined behavioral rules. Spectral analysis presents an alternative approach that transforms system activity data into the frequency domain, enabling the identification of anomalous waveform signatures that are difficult to obfuscate through traditional evasion techniques. The proposed Spectral Entanglement Fingerprinting (SEF) framework leverages power spectral densities, coherence functions, and entropy-based metrics to extract hidden patterns indicative of unauthorized encryption activities. Detection accuracy evaluations demonstrate that frequency-domain transformations achieve superior performance in distinguishing malicious from benign processes, particularly in the presence of polymorphic and metamorphic modifications. Comparative analyses with established methods reveal that frequency-based detection minimizes false positive and false negative rates, ensuring operational efficiency without excessive computational overhead. Experimental results indicate that entropy variations in encrypted data streams provide meaningful classification insights, allowing the differentiation of distinct ransomware families based on spectral characteristics alone. The latency assessment confirms that SEF operates within a time window that enables proactive intervention, mitigating encryption-induced damage before data integrity is compromised. Scalability evaluations suggest that the framework remains effective even under concurrent execution of multiple ransomware instances, supporting its suitability for high-throughput environments.

Paperid: 3473, https://arxiv.org/pdf/2502.01066.pdf

Abstract:
As a vital security primitive, the true random number generator (TRNG) is a mandatory component to build roots of trust for any encryption system. However, existing TRNGs suffer from bottlenecks of low throughput and high area-energy consumption. In this work, we propose DH-TRNG, a dynamic hybrid TRNG circuitry architecture with ultra-high throughput and area-energy efficiency. Our DH-TRNG exhibits portability to distinct process FPGAs and passes both NIST and AIS-31 tests without any post-processing. The experiments show it incurs only 8 slices with the highest throughput of 670Mbps and 620Mbps on Xilinx Virtex-6 and Artix-7, respectively. Compared to the state-of-the-art TRNGs, our proposed design has the highest Throughput/SlicesPower with a 2.63 times increase.

Paperid: 3474, https://arxiv.org/pdf/2502.00961.pdf

Abstract:
Due to society's continuing technological advance, the capabilities of machine learning-based artificial intelligence systems continue to expand and influence a wider degree of topics. Alongside this expansion of technology, there is a growing number of individuals willing to misuse these systems to defraud and mislead others. Deepfake technology, a set of deep learning algorithms that are capable of replacing the likeness or voice of one individual with another with alarming accuracy, is one of these technologies. This paper investigates the threat posed by malicious use of this technology, particularly in the form of spearphishing attacks. It uses deepfake technology to create spearphishing-like attack scenarios and validate them against average individuals. Experimental results show that 66% of participants failed to identify AI created audio as fake while 43% failed to identify such videos as fake, confirming the growing fear of threats posed by the use of these technologies by cybercriminals.

Paperid: 3475, https://arxiv.org/pdf/2502.00677.pdf

Abstract:
Event log analysis is an important task that security professionals undertake. Event logs record key information on activities that occur on computing devices, and due to the substantial number of events generated, they consume a large amount of time and resources to analyse. This demanding and repetitive task is also prone to errors. To address these concerns, researchers have developed automated techniques to improve the event log analysis process. Large Language Models (LLMs) have recently demonstrated the ability to successfully perform a wide range of tasks that individuals would usually partake in, to high standards, and at a pace and degree of complexity that outperform humans. Due to this, researchers are rapidly investigating the use of LLMs for event log analysis. This includes fine-tuning, Retrieval-Augmented Generation (RAG) and in-context learning, which affect performance. These works demonstrate good progress, yet there is a need to understand the developing body of knowledge, identify commonalities between works, and identify key challenges and potential solutions to further developments in this domain. This paper aims to survey LLM-based event log analysis techniques, providing readers with an in-depth overview of the domain, gaps identified in previous research, and concluding with potential avenues to explore in future.

Paperid: 3476, https://arxiv.org/pdf/2502.00651.pdf

Abstract:
The cybersecurity threat landscape is constantly actively making it imperative to develop sound frameworks to protect the IT structures. Based on this introduction, this paper aims to discuss the application of cybersecurity frameworks into the IT security with focus placed on the role of such frameworks in addressing the changing nature of cybersecurity threats. It explores widely used models, including the NIST Cybersecurity Framework, Zero Trust Architecture, and the ISO/IEC 27001, and how they apply to industries including finance, healthcare and government. The discussion also singles out such technologies as Artificial Intelligence (AI) and Machine Learning (ML) as the core for real-time threat detection and response mechanisms. As these integration challenges demonstrate, the study provides tangible and proven approaches to tackle framework implementation issues such as legitimate security issues, limited availability of funds and resources, and compliance with legal requirements. By capturing current trends and exposures, the findings promote strong, portfolio-based and risk-appropriate security approaches adjusted for organizational goals and capable to prevent advanced cyber threats.

Paperid: 3477, https://arxiv.org/pdf/2502.00384.pdf

Abstract:
Side-channel analysis (SCA) represents a realistic threat where the attacker can observe unintentional information to obtain secret data. Evaluation labs also use the same SCA techniques in the security certification process. The results in the last decade have shown that machine learning, especially deep learning, is an extremely powerful SCA approach, allowing the breaking of protected devices while achieving optimal attack performance. Unfortunately, deep learning operates as a black-box, making it less useful for security evaluators who must understand how attacks work to prevent them in the future. This work demonstrates that mechanistic interpretability can effectively scale to realistic scenarios where relevant information is sparse and well-defined interchange interventions to the input are impossible due to side-channel protections. Concretely, we reverse engineer the features the network learns during phase transitions, eventually retrieving secret masks, allowing us to move from black-box to white-box evaluation.

Paperid: 3478, https://arxiv.org/pdf/2502.00072.pdf

Abstract:
Large language models (LLMs) are demonstrating increasing prowess in cybersecurity applications, creating creating inherent risks alongside their potential for strengthening defenses. In this position paper, we argue that current efforts to evaluate risks posed by these capabilities are misaligned with the goal of understanding real-world impact. Evaluating LLM cybersecurity risk requires more than just measuring model capabilities -- it demands a comprehensive risk assessment that incorporates analysis of threat actor adoption behavior and potential for impact. We propose a risk assessment framework for LLM cyber capabilities and apply it to a case study of language models used as cybersecurity assistants. Our evaluation of frontier models reveals high compliance rates but moderate accuracy on realistic cyber assistance tasks. However, our framework suggests that this particular use case presents only moderate risk due to limited operational advantages and impact potential. Based on these findings, we recommend several improvements to align research priorities with real-world impact assessment, including closer academia-industry collaboration, more realistic modeling of attacker behavior, and inclusion of economic metrics in evaluations. This work represents an important step toward more effective assessment and mitigation of LLM-enabled cybersecurity risks.

Paperid: 3479, https://arxiv.org/pdf/2502.00064.pdf

Abstract:
This study examines the impact of tokenized Java code length on the accuracy and explicitness of ten major LLMs in vulnerability detection. Using chi-square tests and known ground truth, we found inconsistencies across models: some, like GPT-4, Mistral, and Mixtral, showed robustness, while others exhibited a significant link between tokenized length and performance. We recommend future LLM development focus on minimizing the influence of input length for better vulnerability detection. Additionally, preprocessing techniques that reduce token count while preserving code structure could enhance LLM accuracy and explicitness in these tasks.

Paperid: 3480, https://arxiv.org/pdf/2502.00035.pdf

Abstract:
The development of an ecosystem that balances consumer convenience and security is imperative given the expanding market for electric vehicles (EVs). The vast amount of data that EV charging station management systems (EVCSMSs) give is powered by the Internet of Things (IoT) ecosystem. Intrusion Detection Systems (IDSs), which track network traffic to spot potentially dangerous data exchanges in IT and IoT contexts, are constantly improving in terms of efficacy and accuracy. Intrusion detection is becoming a major topic in academia because of the acceleration of IDS development caused by machine learning and deep learning techniques. The goal of the research presented in this paper is to use a machine-learning-based intrusion detection system with low false-positive rates and high accuracy to safeguard the ecosystem of EV charging stations (EVCS).

Paperid: 3481, https://arxiv.org/pdf/2501.19120.pdf

Abstract:
Encryption-based cyber threats continue to evolve, leveraging increasingly sophisticated cryptographic techniques to evade detection and persist within compromised systems. A hierarchical classification framework designed to analyze structural cryptographic properties provides a novel approach to distinguishing malicious encryption from legitimate cryptographic operations. By systematically decomposing encryption workflows into hierarchical layers, the classification method enhances the ability to recognize distinct patterns across diverse threat variants, reducing the dependence on predefined signatures that often fail against rapidly mutating threats. The study examines how cryptographic feature mapping facilitates improved classification accuracy, highlighting the role of entropy, key exchange mechanisms, and algorithmic dependencies in distinguishing harmful encryption activities. Through experimental validation, the framework demonstrated a high degree of precision across multiple attack families, outperforming conventional classification techniques while maintaining computational efficiency suitable for large-scale cybersecurity applications. The layered structural analysis further enhances forensic investigations, enabling security analysts to dissect encryption workflows to trace attack origins and identify commonalities across different campaigns. The methodology strengthens proactive threat mitigation efforts, offering a scalable and adaptable solution that accounts for both known and emerging encryption-based cyber threats. Comparative evaluations illustrate the advantages of structural decomposition in mitigating false positives and negatives, reinforcing the reliability of cryptographic signature classification in real-world security environments.

Paperid: 3482, https://arxiv.org/pdf/2501.18821.pdf

Abstract:
Intelligent transportation systems (ITS) play a pivotal role in modern infrastructure but face security risks due to the broadcast-based nature of the in-vehicle Controller Area Network (CAN) buses. While numerous machine learning models and strategies have been proposed to detect CAN anomalies, existing approaches lack robustness evaluations and fail to comprehensively detect attacks due to shifting their focus on a subset of dominant structures of anomalies. To overcome these limitations, the current study proposes a cascade feature-level spatiotemporal fusion framework that integrates the spatial features and temporal features through a two-parameter genetic algorithm (2P-GA)-optimized cascade architecture to cover all dominant structures of anomalies. Extensive paired t-test analysis confirms that the model achieves an AUC-ROC of 0.9987, demonstrating robust anomaly detection capabilities. The Spatial Module improves the precision by approximately 4%, while the Temporal Module compensates for recall losses, ensuring high true positive rates. The proposed framework detects all attack types with 100% accuracy on the CAR-HACKING dataset, outperforming state-of-the-art methods. This study provides a validated, robust solution for real-world CAN security challenges.

Paperid: 3483, https://arxiv.org/pdf/2501.18712.pdf

Abstract:
Fingerprinting refers to the process of identifying underlying Machine Learning (ML) models of AI Systemts, such as Large Language Models (LLMs), by analyzing their unique characteristics or patterns, much like a human fingerprint. The fingerprinting of Large Language Models (LLMs) has become essential for ensuring the security and transparency of AI-integrated applications. While existing methods primarily rely on access to direct interactions with the application to infer model identity, they often fail in real-world scenarios involving multi-agent systems, frequent model updates, and restricted access to model internals. In this paper, we introduce a novel fingerprinting framework designed to address these challenges by integrating static and dynamic fingerprinting techniques. Our approach identifies architectural features and behavioral traits, enabling accurate and robust fingerprinting of LLMs in dynamic environments. We also highlight new threat scenarios where traditional fingerprinting methods are ineffective, bridging the gap between theoretical techniques and practical application. To validate our framework, we present an extensive evaluation setup that simulates real-world conditions and demonstrate the effectiveness of our methods in identifying and monitoring LLMs in Gen-AI applications. Our results highlight the framework's adaptability to diverse and evolving deployment contexts.

Paperid: 3484, https://arxiv.org/pdf/2501.18669.pdf

Abstract:
Calls for transparency in AI systems are growing in number and urgency from diverse stakeholders ranging from regulators to researchers to users (with a comparative absence of companies developing AI). Notions of transparency for AI abound, each addressing distinct interests and concerns. In computer security, transparency is likewise regarded as a key concept. The security community has for decades pushed back against so-called security by obscurity -- the idea that hiding how a system works protects it from attack -- against significant pressure from industry and other stakeholders. Over the decades, in a community process that is imperfect and ongoing, security researchers and practitioners have gradually built up some norms and practices around how to balance transparency interests with possible negative side effects. This paper asks: What insights can the AI community take from the security community's experience with transparency? We identify three key themes in the security community's perspective on the benefits of transparency and their approach to balancing transparency against countervailing interests. For each, we investigate parallels and insights relevant to transparency in AI. We then provide a case study discussion on how transparency has shaped the research subfield of anonymization. Finally, shifting our focus from similarities to differences, we highlight key transparency issues where modern AI systems present challenges different from other kinds of security-critical systems, raising interesting open questions for the security and AI communities alike.

Paperid: 3485, https://arxiv.org/pdf/2501.18629.pdf

Abstract:
Neural networks are vulnerable to adversarial attacks, and several defenses have been proposed. Designing a robust network is a challenging task given the wide range of attacks that have been developed. Therefore, we aim to provide insight into the influence of network similarity on the success rate of transferred adversarial attacks. Network designers can then compare their new network with existing ones to estimate its vulnerability. To achieve this, we investigate the complex relationship between network similarity and the success rate of transferred adversarial attacks. We applied the Centered Kernel Alignment (CKA) network similarity score and used various methods to find a correlation between a large number of Convolutional Neural Networks (CNNs) and adversarial attacks. Network similarity was found to be moderate across different CNN architectures, with more complex models such as DenseNet showing lower similarity scores due to their architectural complexity. Layer similarity was highest for consistent, basic layers such as DataParallel, Dropout and Conv2d, while specialized layers showed greater variability. Adversarial attack success rates were generally consistent for non-transferred attacks, but varied significantly for some transferred attacks, with complex networks being more vulnerable. We found that a DecisionTreeRegressor can predict the success rate of transferred attacks for all black-box and Carlini & Wagner attacks with an accuracy of over 90%, suggesting that predictive models may be viable under certain conditions. However, the variability of results across different data subsets underscores the complexity of these relationships and suggests that further research is needed to generalize these findings across different attack scenarios and network architectures.

Paperid: 3486, https://arxiv.org/pdf/2501.18565.pdf

Abstract:
In recent years, the rapid development of artificial intelligence (AI) especially multi-modal Large Language Models (MLLMs), has enabled it to understand text, images, videos, and other multimedia data, allowing AI systems to execute various tasks based on human-provided prompts. However, AI-powered bots have increasingly been able to bypass most existing CAPTCHA systems, posing significant security threats to web applications. This makes the design of new CAPTCHA mechanisms an urgent priority. We observe that humans are highly sensitive to shifts and abrupt changes in videos, while current AI systems still struggle to comprehend and respond to such situations effectively. Based on this observation, we design and implement BounTCHA, a CAPTCHA mechanism that leverages human perception of boundaries in video transitions and disruptions. By utilizing generative AI's capability to extend original videos with prompts, we introduce unexpected twists and changes to create a pipeline for generating guided short videos for CAPTCHA purposes. We develop a prototype and conduct experiments to collect data on humans' time biases in boundary identification. This data serves as a basis for distinguishing between human users and bots. Additionally, we perform a detailed security analysis of BounTCHA, demonstrating its resilience against various types of attacks. We hope that BounTCHA will act as a robust defense, safeguarding millions of web applications in the AI-driven era.

Paperid: 3487, https://arxiv.org/pdf/2501.18387.pdf

Abstract:
Authenticity-oriented (previously named as privacy-free) garbling schemes of Frederiksen et al. Eurocrypt '15 are designed to satisfy only the authenticity criterion of Bellare et al. ACM CCS '12, and to be more efficient compared to full-fledged garbling schemes. In this work, we improve the state-of-the-art authenticity-oriented version of half gates (HG) garbling of Zahur et al. Crypto '15 by allowing it to be bandwidth-free if any of the input wires of an AND gate is freely settable by the garbler. Our full solution AuthOr then successfully combines the ideas from information-theoretical garbling of Kondi and Patra Crypto '17 and the HG garbling-based scheme that we obtained. AuthOr has a lower communication cost (i.e., garbled circuit or GC size) than HG garbling without any further security assumption. Theoretically, AuthOr's GC size reduction over HG garbling lies in the range between 0 to 100%, and the exact improvement depends on the circuit structure. We have implemented our scheme and conducted tests on various circuits that were constructed by independent researchers. Our experimental results show that in practice, the GC size gain may be up to around 98%.

Paperid: 3488, https://arxiv.org/pdf/2501.18131.pdf

Abstract:
Entropy-based detection methodologies have gained significant attention due to their ability to analyze structural irregularities within executable files, particularly in the identification of malicious software employing advanced obfuscation techniques. The Entropy-Synchronized Neural Hashing (ESNH) framework introduces a novel approach that leverages entropy-driven hash representations to classify software binaries based on their underlying entropy characteristics. Through the synchronization of entropy profiles with neural network architectures, the model generates robust and unique hash values that maintain stability even when faced with polymorphic and metamorphic transformations. Comparative analysis against traditional detection approaches revealed superior performance in identifying novel threats, reducing false-positive rates, and achieving consistent classification across diverse ransomware families. The incorporation of a self-regulating hash convergence mechanism further ensured that entropy-synchronized hashes remained invariant across executions, minimizing classification inconsistencies that often arise due to dynamic modifications in ransomware payloads. Experimental results demonstrated high detection rates across contemporary ransomware strains, with the model exhibiting resilience against encryption-based evasion mechanisms, code injection strategies, and reflective loading techniques. Unlike conventional detection mechanisms that rely on static signatures and heuristic analysis, the proposed entropy-aware classification framework adapts to emerging threats through an inherent ability to capture entropy anomalies within executable structures. The findings reinforce the potential of entropy-based detection in addressing the limitations of traditional methodologies while enhancing detection robustness against obfuscation and adversarial evasion techniques.

Paperid: 3489, https://arxiv.org/pdf/2501.18121.pdf

Abstract:
This work identifies the first privacy-aware stratified sampling scheme that minimizes the variance for general private mean estimation under the Laplace, Discrete Laplace (DLap) and Truncated-Uniform-Laplace (TuLap) mechanisms within the framework of differential privacy (DP). We view stratified sampling as a subsampling operation, which amplifies the privacy guarantee; however, to have the same final privacy guarantee for each group, different nominal privacy budgets need to be used depending on the subsampling rate. Ignoring the effect of DP, traditional stratified sampling strategies risk significant variance inflation. We phrase our optimal survey design as an optimization problem, where we determine the optimal subsampling sizes for each group with the goal of minimizing the variance of the resulting estimator. We establish strong convexity of the variance objective, propose an efficient algorithm to identify the integer-optimal design, and offer insights on the structure of the optimal design.

Paperid: 3490, https://arxiv.org/pdf/2501.18102.pdf

Abstract:
There are many challenges for Internet of Things (IoT) sensor networks including the lack of robust standards, diverse wireline and wireless connectivity, interoperability, security, and privacy. Addressing these challenges, the Institute of Electrical and Electronics Engineers (IEEE) P1451.0 standard defines network services, transducer services, transducer electronic data sheets (TEDS) format, and a security framework to achieve sensor data security and interoperability for IoT applications. This paper proposes a security solution for IEEE P1451.1.6-based sensor networks for IoT applications utilizing the security framework defined in IEEE P1451.0. The proposed solution includes an architecture, a security policy with six security levels, security standards, and security TEDS. Further, this paper introduces a new service to update access control lists (ACLs) to regulate the access for topic names by the applications and provides an implementation of the security TEDS for IEEE P1451.1.6-based sensor networks. The paper also illustrates how to access security TEDS that contain metadata on security standards to achieve sensor data security and interoperability.

Paperid: 3491, https://arxiv.org/pdf/2501.17760.pdf

Abstract:
The realm of technology frequently confronts threats posed by adversaries exploiting loopholes in programs. Among these, the Log4Shell vulnerability in the Log4j library stands out due to its widespread impact. Log4j, a prevalent software library for log recording, is integrated into millions of devices worldwide. The Log4Shell vulnerability facilitates remote code execution with relative ease. Its combination with the extensive utilization of Log4j marks it as one of the most dangerous vulnerabilities discovered to date. The severity of this vulnerability, which quickly escalated into a media frenzy, prompted swift action within the industry, thereby mitigating potential extensive damage. This rapid response was crucial, as the consequences could have been significantly more severe if the vulnerability had been exploited by adversaries prior to its public disclosure. This paper details the discovery of the Log4Shell vulnerability and its potential for exploitation. It examines the vulnerability's impact on various stakeholders, including governments, the Apache Software Foundation (which manages the Log4j library), and companies affected by it. The paper also describes strategies for defending against Log4Shell in several scenarios. While numerous Log4j users acted promptly to safeguard their systems, the vulnerability remains a persistent threat until all vulnerable instances of the library are adequately protected.

Paperid: 3492, https://arxiv.org/pdf/2501.17740.pdf

Abstract:
As bug-finding methods improve, bug-fixing capabilities are exceeded, resulting in an accumulation of potential vulnerabilities. There is thus a need for efficient and precise bug prioritization based on exploitability. In this work, we explore the notion of control of an attacker over a vulnerability's parameters, which is an often overlooked factor of exploitability. We show that taint as well as straightforward qualitative and quantitative notions of control are not enough to effectively differentiate vulnerabilities. Instead, we propose to focus analysis on feasible value sets, which we call domains of control, in order to better take into account threat models and expert insight. Our new Shrink and Split algorithm efficiently extracts domains of control from path constraints obtained with symbolic execution and renders them in an easily processed, human-readable form. This in turn allows to automatically compute more complex control metrics, such as weighted Quantitative Control, which factors in the varying threat levels of different values. Experiments show that our method is both efficient and precise. In particular, it is the only one able to distinguish between vulnerabilities such as cve-2019-14192 and cve-2022-30552, while revealing a mistake in the human evaluation of cve-2022-30790. The high degree of automation of our tool also brings us closer to a fully-automated evaluation pipeline.

Paperid: 3493, https://arxiv.org/pdf/2501.17429.pdf

Abstract:
The rapid evolution of cyber threats has outpaced traditional detection methodologies, necessitating innovative approaches capable of addressing the adaptive and complex behaviors of modern adversaries. A novel framework was introduced, leveraging Temporal-Correlation Graphs to model the intricate relationships and temporal patterns inherent in malicious operations. The approach dynamically captured behavioral anomalies, offering a robust mechanism for distinguishing between benign and malicious activities in real-time scenarios. Extensive experiments demonstrated the framework's effectiveness across diverse ransomware families, with consistently high precision, recall, and overall detection accuracy. Comparative evaluations highlighted its better performance over traditional signature-based and heuristic methods, particularly in handling polymorphic and previously unseen ransomware variants. The architecture was designed with scalability and modularity in mind, ensuring compatibility with enterprise-scale environments while maintaining resource efficiency. Analysis of encryption speeds, anomaly patterns, and temporal correlations provided deeper insights into the operational strategies of ransomware, validating the framework's adaptability to evolving threats. The research contributes to advancing cybersecurity technologies by integrating dynamic graph analytics and machine learning for future innovations in threat detection. Results from this study underline the potential for transforming the way organizations detect and mitigate complex cyberattacks.

Paperid: 3494, https://arxiv.org/pdf/2501.17123.pdf

Abstract:
Cache side channel attacks are a sophisticated and persistent threat that exploit vulnerabilities in modern processors to extract sensitive information. These attacks leverage weaknesses in shared computational resources, particularly the last level cache, to infer patterns in data access and execution flows, often bypassing traditional security defenses. Such attacks are especially dangerous as they can be executed remotely without requiring physical access to the victim's device. This study focuses on a specific class of these threats: fingerprinting attacks, where an adversary monitors and analyzes the behavior of co-located processes via cache side channels. This can potentially reveal confidential information, such as encryption keys or user activity patterns. A comprehensive threat model illustrates how attackers sharing computational resources with target systems exploit these side channels to compromise sensitive data. To mitigate such risks, a hybrid deep learning model is proposed for detecting cache side channel attacks. Its performance is compared with five widely used deep learning models: Multi-Layer Perceptron, Convolutional Neural Network, Simple Recurrent Neural Network, Long Short-Term Memory, and Gated Recurrent Unit. The experimental results demonstrate that the hybrid model achieves a detection rate of up to 99.96%. These findings highlight the limitations of existing models, the need for enhanced defensive mechanisms, and directions for future research to secure sensitive data against evolving side channel threats.

Paperid: 3495, https://arxiv.org/pdf/2501.17070.pdf

Abstract:
Judging an action's safety requires knowledge of the context in which the action takes place. To human agents who act in various contexts, this may seem obvious: performing an action such as email deletion may or may not be appropriate depending on the email's content, the goal (e.g., to erase sensitive emails or to clean up trash), and the type of email address (e.g., work or personal). Unlike people, computational systems have often had only limited agency in limited contexts. Thus, manually crafted policies and user confirmation (e.g., smartphone app permissions or network access control lists), while imperfect, have sufficed to restrict harmful actions. However, with the upcoming deployment of generalist agents that support a multitude of tasks (e.g., an automated personal assistant), we argue that we must rethink security designs to adapt to the scale of contexts and capabilities of these systems. As a first step, this paper explores contextual security in the domain of agents and proposes contextual agent security (Conseca), a framework to generate just-in-time, contextual, and human-verifiable security policies.

Paperid: 3496, https://arxiv.org/pdf/2501.16948.pdf

Abstract:
We study the impact of Stack Overflow code evolution on the stability of prior research findings derived from Stack Overflow data and provide recommendations for future studies. We systematically reviewed papers published between 2005--2023 to identify key aspects of Stack Overflow that can affect study results, such as the language or context of code snippets. Our analysis reveals that certain aspects are non-stationary over time, which could lead to different conclusions if experiments are repeated at different times. We replicated six studies using a more recent dataset to demonstrate this risk. Our findings show that four papers produced significantly different results than the original findings, preventing the same conclusions from being drawn with a newer dataset version. Consequently, we recommend treating Stack Overflow as a time series data source to provide context for interpreting cross-sectional research conclusions.

Paperid: 3497, https://arxiv.org/pdf/2501.16393.pdf

Abstract:
Network threat detection has been challenging due to the complexities of attack activities and the limitation of historical threat data to learn from. To help enhance the existing practices of using analytics, machine learning, and artificial intelligence methods to detect the network threats, we propose an integrated modelling framework, where Knowledge Graph is used to analyze the users' activity patterns, Imbalanced Learning techniques are used to prune and weigh Knowledge Graph, and LLM is used to retrieve and interpret the users' activities from Knowledge Graph. The proposed framework is applied to Agile Threat Detection through Online Sequential Learning. The preliminary results show the improved threat capture rate by 3%-4% and the increased interpretabilities of risk predictions based on the users' activities.

Paperid: 3498, https://arxiv.org/pdf/2501.16184.pdf

Abstract:
We introduce a protocol called ENCORE which simultaneously compresses and encrypts data in a one-pass process that can be implemented efficiently and possesses a number of desirable features as a streaming encoder/decoder. Motivated by the observation that both lossless compression and encryption consist of performing an invertible transformation whose output is close to a uniform distribution over bit streams, we show that these can be done simultaneously, at least for ``typical'' data with a stable distribution, i.e., approximated reasonably well by the output of a Markov model. The strategy is to transform the data into a dyadic distribution whose Huffman encoding is close to uniform, and then store the transformations made to said data in a compressed secondary stream interwoven into the first with a user-defined encryption protocol. The result is an encoding which we show exhibits a modified version of Yao's ``next-bit test'' while requiring many fewer bits of entropy than standard encryption. Numerous open questions remain, particularly regarding results that we suspect can be strengthened considerably.

Paperid: 3499, https://arxiv.org/pdf/2501.15836.pdf

Abstract:
Modern threat landscapes continue to evolve with increasing sophistication, challenging traditional detection methodologies and necessitating innovative solutions capable of addressing complex adversarial tactics. A novel framework was developed to identify ransomware activity through multimodal execution path analysis, integrating high-dimensional embeddings and dynamic heuristic derivation mechanisms to capture behavioral patterns across diverse attack variants. The approach demonstrated high adaptability, effectively mitigating obfuscation strategies and polymorphic characteristics often employed by ransomware families to evade detection. Comprehensive experimental evaluations revealed significant advancements in precision, recall, and accuracy metrics compared to baseline techniques, particularly under conditions of variable encryption speeds and obfuscated execution flows. The framework achieved scalable and computationally efficient performance, ensuring robust applicability across a range of system configurations, from resource-constrained environments to high-performance infrastructures. Notable findings included reduced false positive rates and enhanced detection latency, even for ransomware families employing sophisticated encryption mechanisms. The modular design allowed seamless integration of additional modalities, enabling extensibility and future-proofing against emerging threat vectors. Quantitative analyses further highlighted the system's energy efficiency, emphasizing its practicality for deployment in environments with stringent operational constraints. The results underline the importance of integrating advanced computational techniques and dynamic adaptability to safeguard digital ecosystems from increasingly complex threats.

Paperid: 3500, https://arxiv.org/pdf/2501.15290.pdf

Abstract:
Artificial Intelligence has become a double edged sword in modern society being both a boon and a bane. While it empowers individuals it also enables malicious actors to perpetrate scams such as fraudulent phone calls and user impersonations. This growing threat necessitates a robust system to protect individuals In this paper we introduce a novel real time fraud detection mechanism using Retrieval Augmented Generation technology to address this challenge on two fronts. First our system incorporates a continuously updating policy checking feature that transcribes phone calls in real time and uses RAG based models to verify that the caller is not soliciting private information thus ensuring transparency and the authenticity of the conversation. Second we implement a real time user impersonation check with a two step verification process to confirm the callers identity ensuring accountability. A key innovation of our system is the ability to update policies without retraining the entire model enhancing its adaptability. We validated our RAG based approach using synthetic call recordings achieving an accuracy of 97.98 percent and an F1score of 97.44 percent with 100 calls outperforming state of the art methods. This robust and flexible fraud detection system is well suited for real world deployment.

Paperid: 3501, https://arxiv.org/pdf/2501.15084.pdf

Abstract:
The increasing sophistication of encryption-based ransomware has demanded innovative approaches to detection and mitigation, prompting the development of a hierarchical framework grounded in probabilistic cryptographic analysis. By focusing on the statistical characteristics of encryption patterns, the proposed methodology introduces a layered approach that combines advanced clustering algorithms with machine learning to isolate ransomware-induced anomalies. Through comprehensive testing across diverse ransomware families, the framework demonstrated exceptional accuracy, effectively distinguishing malicious encryption operations from benign activities while maintaining low false positive rates. The system's design integrates dynamic feedback mechanisms, enabling adaptability to varying cryptographic complexities and operational environments. Detailed entropy-based evaluations revealed its sensitivity to subtle deviations in encryption workflows, offering a robust alternative to traditional detection methods reliant on static signatures or heuristics. Computational benchmarks confirmed its scalability and efficiency, achieving consistent performance even under high data loads and complex cryptographic scenarios. The inclusion of real-time clustering and anomaly evaluation ensures rapid response capabilities, addressing critical latency challenges in ransomware detection. Performance comparisons with established methods highlighted its improvements in detection efficacy, particularly against advanced ransomware employing extended key lengths and unique cryptographic protocols.

Paperid: 3502, https://arxiv.org/pdf/2501.15002.pdf

Abstract:
CairoZero is a programming language for running decentralized applications (dApps) at scale. Programs written in the CairoZero language are compiled to machine code for the Cairo CPU architecture and cryptographic protocols are used to verify the results of execution efficiently on blockchain. We explain how we have extended the CairoZero compiler with tooling that enables users to prove, in the Lean 3 proof assistant, that compiled code satisfies high-level functional specifications. We demonstrate the success of our approach by verifying primitives for computation with the secp256k1 and secp256r1 curves over a large finite field as well as the validation of cryptographic signatures using the former. We also verify a mechanism for simulating a read-write dictionary data structure in a read-only setting. Finally, we reflect on our methodology and discuss some of the benefits of our approach.

Paperid: 3503, https://arxiv.org/pdf/2501.14974.pdf

Abstract:
Objective functions based on Hellinger distance yield robust and efficient estimators of model parameters. Motivated by privacy and regulatory requirements encountered in contemporary applications, we derive in this paper \emph{private minimum Hellinger distance estimators}. The estimators satisfy a new privacy constraint, namely, Hellinger differential privacy, while retaining the robustness and efficiency properties. We demonstrate that Hellinger differential privacy shares several features of standard differential privacy while allowing for sharper inference. Additionally, for computational purposes, we also develop Hellinger differentially private gradient descent and Newton-Raphson algorithms. We illustrate the behavior of our estimators in finite samples using numerical experiments and verify that they retain robustness properties under gross-error contamination.

Paperid: 3504, https://arxiv.org/pdf/2501.14651.pdf

Abstract:
To safeguard against data fabrication and enhance trust in quantitative social science, we present Data Non-Manipulation Authentication Digest (Data-NoMAD). Data-NoMAD is a tool that allows researchers to certify, and others to verify, that a dataset has not been inappropriately manipulated between the point of data collection and the point at which a replication archive is made publicly available. Data-NoMAD creates and stores a column hash digest of a raw dataset upon initial download from a survey platform (the current version works with Qualtrics and SurveyCTO), but before it is subject to appropriate manipulations such as anonymity-preserving redactions. Data-NoMAD can later be used to verify the integrity of a publicly archived dataset by identifying columns that have been deleted, added, or altered. Data-NoMAD complements existing efforts at ensuring research integrity and integrates seamlessly with extant replication practices.

Paperid: 3505, https://arxiv.org/pdf/2501.14337.pdf

Abstract:
Passwords have been long used as the primary authentication method for web services. Weak passwords used by the users have prompted the use of password management tools and two-factor authentication to ensure better account security. While prior studies have studied their adoption individually, none of these studies focuses particularly on the Indian setting, which is culturally and economically different from the countries in which these studies have been done in the past. To this end, we conducted a survey with 90 participants residing in India to better understand the mindset of people on using password managers and two-factor authentication (2FA). Our findings suggest that a majority of the participants have used 2FA and password managers in some form, although they are sometimes unaware of their formal names. While many participants used some form of 2FA across all their accounts, browser-integrated and device-default password managers are predominantly utilized for less sensitive platforms such as e-commerce and social media rather than for more critical accounts like banking. The primary motivation for using password managers is the convenience of auto-filling. However, some participants avoid using password managers due to a lack of trust in these tools. Notably, dedicated third-party applications show low adoption for both password manager and 2FA. Despite acknowledging the importance of secure password practices, many participants still reuse passwords across multiple accounts, prefer shorter passwords, and use commonly predictable password patterns. Overall, the study suggests that Indians are more inclined to choose default settings, underscoring the need for tailored strategies to improve user awareness and strengthen password security practices.

Paperid: 3507, https://arxiv.org/pdf/2501.13748.pdf

Abstract:
Detecting weaknesses in cryptographic algorithms is of utmost importance for designing secure information systems. The state-of-the-art soft analytical side-channel attack (SASCA) uses physical leakage information to make probabilistic predictions about intermediate computations and combines these "guesses" with the known algorithmic logic to compute the posterior distribution over the key. This attack is commonly performed via loopy belief propagation, which, however, lacks guarantees in terms of convergence and inference quality. In this paper, we develop a fast and exact inference method for SASCA, denoted as ExSASCA, by leveraging knowledge compilation and tractable probabilistic circuits. When attacking the Advanced Encryption Standard (AES), the most widely used encryption algorithm to date, ExSASCA outperforms SASCA by more than 31% top-1 success rate absolute. By leveraging sparse belief messages, this performance is achieved with little more computational cost than SASCA, and about 3 orders of magnitude less than exact inference via exhaustive enumeration. Even with dense belief messages, ExSASCA still uses 6 times less computations than exhaustive inference.

Paperid: 3508, https://arxiv.org/pdf/2501.13363.pdf

Abstract:
The Wi-Fi technology (IEEE 802.11) was introduced in 1997. With the increasing use and deployment of such networks, their security has also attracted considerable attention. Current Wi-Fi networks use WPA2 (Wi-Fi Protected Access 2) for security (authentication and encryption) between access points and clients. According to the IEEE 802.11i-2004 standard, wireless networks secured with WPA2-PSK (Pre-Shared Key) are required to be protected with a passphrase between 8 to 63 ASCII characters. However, a poorly chosen passphrase significantly reduces the effectiveness of both WPA2 and WPA3-Personal Transition Mode. The objective of this paper is to empirically evaluate password choices in the wild and evaluate weakness in current common practices. We collected a total of 3,352 password hashes from Wi-Fi access points and determine the passphrases that were protecting them. We then analyze these passwords to investigate the impact of user's behavior and preference for convenience on passphrase strength in secured private Wi-Fi networks in Singapore. We characterized the predictability of passphrases that use the minimum required length of 8 numeric or alphanumeric characters, and/or symbols stipulated in wireless security standards, and the usage of default passwords, and found that 16 percent of the passwords show such behavior. Our results also indicate the prevalence of the use of default passwords by hardware manufacturers. We correlate our results with our findings and recommend methods that will improve the overall security and future of our Wi-Fi networks.

Paperid: 3509, https://arxiv.org/pdf/2501.13276.pdf

Abstract:
CMOS one-time-programmable (OTP) memories based on antifuses are widely used for storing small amounts of data (such as serial numbers, keys, and factory trimming) in integrated circuits due to their low cost, requiring no additional mask steps to fabricate. Device manufacturers and IP vendors have claimed for years that antifuses are a ``high security" memory which is significantly more difficult for an attacker to extract data from than other types of memory, such as Flash or mask ROM - however, as our results show, this is untrue. In this paper, we demonstrate that data bits stored in a widely used antifuse block can be extracted by a semiconductor failure analysis technique known as passive voltage contrast (PVC) using a focused ion beam (FIB). The simple form of the attack demonstrated here recovers the bitwise OR of two physically adjacent memory rows sharing common metal 1 contacts, however we have identified several potential mechanisms by which it may be possible to read the even and odd rows separately. We demonstrate the attack on a commodity microcontroller made on the 40nm node and show how it can be used to extract significant quantities of sensitive data, such as keys for firmware encryption, in time scales which are very practical for real world exploitation (1 day of sample prep plus a few hours of FIB time) with only a single target device required after initial reconnaissance has been completed on blank devices.

Paperid: 3510, https://arxiv.org/pdf/2501.13268.pdf

Abstract:
This paper analyzes the reported threats to Industrial Control Systems (ICS)/Operational Technology (OT) and identifies common tactics, techniques, and procedures (TTP) used by threat actors. The paper then uses the MITRE ATT&CK framework to map the common TTPs and provide an understanding of the security controls needed to defend against the reported ICS threats. The paper also includes a review of ICS testbeds and ideas for future research using the identified controls.

Paperid: 3511, https://arxiv.org/pdf/2501.13004.pdf

Abstract:
The comparison analysis of the most popular tools to extract features from network traffic is conducted in this paper. Feature extraction plays a crucial role in Intrusion Detection Systems (IDS) because it helps to transform huge raw network data into meaningful and manageable features for analysis and detection of malicious activities. The good choice of feature extraction tool is an essential step in construction of Artificial Intelligence-based Intrusion Detection Systems (AI-IDS), which can help to enhance the efficiency, accuracy, and scalability of such systems.

Paperid: 3512, https://arxiv.org/pdf/2501.12811.pdf

Abstract:
Modern cybersecurity landscapes increasingly demand sophisticated detection frameworks capable of identifying evolving threats with precision and adaptability. The proposed Zero-Space Detection framework introduces a novel approach that dynamically identifies latent behavioral patterns through unsupervised clustering and advanced deep learning techniques. Designed to address the limitations of signature-based and heuristic methods, it operates effectively in high-velocity environments by integrating multi-phase filtering and ensemble learning for refined decision-making. Experimental evaluation reveals high detection rates across diverse ransomware families, including LockBit, Conti, REvil, and BlackMatter, while maintaining low false positive rates and scalable performance. Computational overhead remains minimal, with average processing times ensuring compatibility with real-time systems even under peak operational loads. The framework demonstrates resilience against adversarial strategies such as obfuscation and encryption speed variability, which frequently challenge conventional detection systems. Analysis across multiple data sources highlights its versatility in handling diverse file types and operational contexts. Comprehensive metrics, including detection probability, latency, and resource efficiency, validate its efficacy under real-world conditions. Through its modular architecture, the framework achieves seamless integration with existing cybersecurity infrastructures without significant reconfiguration. The results demonstrate its robustness and scalability, offering a transformative paradigm for ransomware identification in dynamic and resource-constrained environments.

Paperid: 3513, https://arxiv.org/pdf/2501.12762.pdf

Abstract:
This chapter provides an overview of the evolving landscape of attacks in cyber-physical systems (CPS) and critical infrastructures, highlighting the possible use of Artificial Intelligence (AI) algorithms to develop intelligent cyberattacks. It describes various existing methods used to carry out intelligent attacks in Operational Technology (OT) environments and discusses AI-driven tools that automate penetration tests in Information Technology (IT) systems, which could potentially be used as attack tools. The chapter also discusses mitigation strategies to counter these emerging intelligent attacks by hindering the learning process of AI-based attacks and points to future research directions on the matter.

Paperid: 3514, https://arxiv.org/pdf/2501.12516.pdf

Abstract:
In this paper we compare traditional machine learning and deep learning models trained on a malware dataset when subjected to adversarial attack based on label-flipping. Specifically, we investigate the robustness of Support Vector Machines (SVM), Random Forest, Gaussian Naive Bayes (GNB), Gradient Boosting Machine (GBM), LightGBM, XGBoost, Multilayer Perceptron (MLP), Convolutional Neural Network (CNN), MobileNet, and DenseNet models when facing varying percentages of misleading labels. We empirically assess the the accuracy of each of these models under such an adversarial attack on the training data. This research aims to provide insights into which models are inherently more robust, in the sense of being better able to resist intentional disruptions to the training data. We find wide variation in the robustness of the models tested to adversarial attack, with our MLP model achieving the best combination of initial accuracy and robustness.

Paperid: 3515, https://arxiv.org/pdf/2501.12034.pdf

Abstract:
Like most computer systems, a manycore can also be the target of security attacks. It is essential to ensure the security of the NoC since all information travels through its channels, and any interference in the traffic of messages can reflect on the entire chip, causing communication problems. Among the possible attacks on NoC, Denial of Service (DoS) attacks are the most cited in the literature. The state of the art shows a lack of work that can detect such attacks through learning techniques. On the other hand, these techniques are widely explored in computer network security via an Intrusion Detection System (IDS). In this context, the main goal of this document is to present the progress of a work that explores an IDS technique using machine learning and temporal series for detecting DoS attacks in NoC-based manycore systems. To fulfill this goal, it is necessary to extract traffic data from a manycore NoC and execute the learning techniques in the extracted data. However, while low-level platforms offer precision and slow execution, high-level platforms offer higher speed and data incompatible with reality. Therefore, a platform is being developed using the OVP tool, which has a higher level of abstraction. To solve the low precision problem, the developed platform will have its data validated with a low-level platform.

Paperid: 3516, https://arxiv.org/pdf/2501.11786.pdf

Abstract:
Recent work shows membership inference attacks (MIAs) on large language models (LLMs) produce inconclusive results, partly due to difficulties in creating non-member datasets without temporal shifts. While researchers have turned to synthetic data as an alternative, we show this approach can be fundamentally misleading. Our experiments indicate that MIAs function as machine-generated text detectors, incorrectly identifying synthetic data as training samples regardless of the data source. This behavior persists across different model architectures and sizes, from open-source models to commercial ones such as GPT-3.5. Even synthetic text generated by different, potentially larger models is classified as training data by the target model. Our findings highlight a serious concern: using synthetic data in membership evaluations may lead to false conclusions about model memorization and data leakage. We caution that this issue could affect other evaluations using model signals such as loss where synthetic or machine-generated translated data substitutes for real-world samples.

Paperid: 3517, https://arxiv.org/pdf/2501.11706.pdf

Abstract:
Transformers, a cornerstone of deep-learning architectures for sequential data, have achieved state-of-the-art results in tasks like Natural Language Processing (NLP). Models such as BERT and GPT-3 exemplify their success and have driven the rise of large language models (LLMs). However, a critical challenge persists: safeguarding the privacy of data used in LLM training. Privacy-preserving techniques like Federated Learning (FL) offer potential solutions, but practical limitations hinder their effectiveness for Transformer training. Two primary issues are (I) the risk of sensitive information leakage due to aggregation methods like FedAvg or FedSGD, and (II) the high communication overhead caused by the large size of Transformer models. This paper introduces a novel FL method that reduces communication overhead while maintaining competitive utility. Our approach avoids sharing full model weights by simulating a global model locally. We apply k-means clustering to each Transformer layer, compute centroids locally, and transmit only these centroids to the server instead of full weights or gradients. To enhance security, we leverage Intel SGX for secure transmission of centroids. Evaluated on a translation task, our method achieves utility comparable to state-of-the-art baselines while significantly reducing communication costs. This provides a more efficient and privacy-preserving FL solution for Transformer models.

Paperid: 3518, https://arxiv.org/pdf/2501.11525.pdf

Abstract:
The right to privacy, enshrined in various human rights declarations, faces new challenges in the age of artificial intelligence (AI). This paper explores the concept of the Right to be Forgotten (RTBF) within AI systems, contrasting it with traditional data erasure methods. We introduce Forgotten by Design, a proactive approach to privacy preservation that integrates instance-specific obfuscation techniques during the AI model training process. Unlike machine unlearning, which modifies models post-training, our method prevents sensitive data from being embedded in the first place. Using the LIRA membership inference attack, we identify vulnerable data points and propose defenses that combine additive gradient noise and weighting schemes. Our experiments on the CIFAR-10 dataset demonstrate that our techniques reduce privacy risks by at least an order of magnitude while maintaining model accuracy (at 95% significance). Additionally, we present visualization methods for the privacy-utility trade-off, providing a clear framework for balancing privacy risk and model accuracy. This work contributes to the development of privacy-preserving AI systems that align with human cognitive processes of motivated forgetting, offering a robust framework for safeguarding sensitive information and ensuring compliance with privacy regulations.

Paperid: 3519, https://arxiv.org/pdf/2501.11054.pdf

Abstract:
In this paper, we experimentally analyze the robustness of selected Federated Learning (FL) systems in the presence of adversarial clients. We find that temporal attacks significantly affect model performance in the FL models tested, especially when the adversaries are active throughout or during the later rounds. We consider a variety of classic learning models, including Multinominal Logistic Regression (MLR), Random Forest, XGBoost, Support Vector Classifier (SVC), as well as various Neural Network models including Multilayer Perceptron (MLP), Convolution Neural Network (CNN), Recurrent Neural Network (RNN), and Long Short-Term Memory (LSTM). Our results highlight the effectiveness of temporal attacks and the need to develop strategies to make the FL process more robust against such attacks. We also briefly consider the effectiveness of defense mechanisms, including outlier detection in the aggregation algorithm.

Paperid: 3520, https://arxiv.org/pdf/2501.11045.pdf

Abstract:
The security requirements for future 6G mobile networks are anticipated to be significantly more complex and demanding than those of 5G. This increase stems from several factors: the proliferation of massive machine-type communications will dramatically increase the density of devices competing for network access; secure ultra-reliable low-latency communication will impose stringent requirements on security, latency, and reliability; and the widespread deployment of small cells and non-terrestrial networks, including satellite mega-constellations, will result in more frequent handovers. This paper provides a set of security recommendations for 6G networks, with a particular focus on access and handover procedures, which often lack encryption and integrity protection, making them more vulnerable to exploitation. Since 6G is expected to be a backward-compatible extension of 5G, and given that secure systems cannot be effectively designed without a clear understanding of their goals, it is imperative to first evaluate the limitations of the current generation. To this end, the paper begins by reviewing existing 5G access and authentication mechanisms, highlighting several critical vulnerabilities in these procedures. It then examines potential 6G challenges and concludes with actionable recommendations to enhance the security, resilience, and robustness of 6G access and handover mechanisms.

Paperid: 3521, https://arxiv.org/pdf/2501.10956.pdf

Abstract:
The threat of malware is a serious concern for computer networks and systems, highlighting the need for accurate classification techniques. In this research, we experiment with multimodal machine learning approaches for malware classification, based on the structured nature of the Windows Portable Executable (PE) file format. Specifically, we train Support Vector Machine (SVM), Long Short-Term Memory (LSTM), and Convolutional Neural Network (CNN) models on features extracted from PE headers, we train these same models on features extracted from the other sections of PE files, and train each model on features extracted from the entire PE file. We then train SVM models on each of the nine header-sections combinations of these baseline models, using the output layer probabilities of the component models as feature vectors. We compare the baseline cases to these multimodal combinations. In our experiments, we find that the best of the multimodal models outperforms the best of the baseline cases, indicating that it can be advantageous to train separate models on distinct parts of Windows PE files.

Paperid: 3522, https://arxiv.org/pdf/2501.10906.pdf

Abstract:
Traditional adversarial attacks typically aim to alter the predicted labels of input images by generating perturbations that are imperceptible to the human eye. However, these approaches often lack explainability. Moreover, most existing work on adversarial attacks focuses on single-stage classifiers, but multi-stage classifiers are largely unexplored. In this paper, we introduce instance-based adversarial attacks for multi-stage classifiers, leveraging Layer-wise Relevance Propagation (LRP), which assigns relevance scores to pixels based on their influence on classification outcomes. Our approach generates explainable adversarial perturbations by utilizing LRP to identify and target key features critical for both coarse and fine-grained classifications. Unlike conventional attacks, our method not only induces misclassification but also enhances the interpretability of the model's behavior across classification stages, as demonstrated by experimental results.

Paperid: 3523, https://arxiv.org/pdf/2501.10841.pdf

Abstract:
To prove that a dataset is sufficiently anonymized, many privacy policies suggest that a re-identification risk assessment be performed, but do not provide a precise methodology for doing so, leaving the industry alone with the problem. This paper proposes a practical and ready-to-use methodology for re-identification risk assessment, the originality of which is manifold: (1) it is the first to follow well-known risk analysis methods (e.g. EBIOS) that have been used in the cybersecurity field for years, which consider not only the ability to perform an attack, but also the impact such an attack can have on an individual; (2) it is the first to qualify attributes and values of attributes with e.g. degree of exposure, as known real-world attacks mainly target certain types of attributes and not others.

Paperid: 3524, https://arxiv.org/pdf/2501.10817.pdf

Abstract:
The Internet of Things (IoT) is a network of digital devices like sensors, processors, embedded and communication devices that can connect to and exchange data with other devices and systems over the internet. IoT devices have limitations on power, memory, and computational resources. Researchers have developed the IPv6 Over Low-power Wireless Personal Area Network (6LoWPAN) protocols to provide wireless connectivity among these devices while overcoming the constraints on resources. 6LoWPAN has been approved subsequently by the Internet Engineering Task Force (IETF). The IETF Routing Over Low-power and Lossy Networks (ROLL) standardized the Routing Protocol for LLNs known as RPL (IETF RFC 6550), which is part of the 6LoWPAN stack. However, IoT devices are vulnerable to various attacks on RPL-based routing. This survey provides an in depth study of existing RPL-based attacks and defense published from year 2011 to 2024 from highly reputed journals and conferences. By thematic analysis of existing routing attacks on RPL, we developed a novel attack taxonomy which focuses on the nature of routing attacks and classifies them into 12 major categories. Subsequently, the impact of each attack on the network is analyzed and discussed real life scenarios of these attacks. Another contribution of this survey proposed a novel taxonomy for classification of defense mechanisms into 8 major categories against routing attacks based on type of defense strategy. The detailed analysis of each defense mechanism with real life applicability is explained. Furthermore, evaluation tools such as testbeds and simulators for RPL-based attack and defense are discussed and critically analyzed in terms of real world applicability. Finally, open research challenges are presented on the basis of research gaps of existing literature along with research directions for practitioners and researchers.

Paperid: 3525, https://arxiv.org/pdf/2501.10174.pdf

Abstract:
As neural networks are increasingly used for critical decision-making tasks, the threat of integrity attacks, where an adversary maliciously alters a model, has become a significant security and safety concern. These concerns are compounded by the use of licensed models, where end-users purchase third-party models with only black-box access to protect model intellectual property (IP). In such scenarios, conventional approaches to verify model integrity require knowledge of model parameters or cooperative model owners. To address this challenge, we propose Michscan, a methodology leveraging power analysis to verify the integrity of black-box TinyML neural networks designed for resource-constrained devices. Michscan is based on the observation that modifications to model parameters impact the instantaneous power consumption of the device. We leverage this observation to develop a runtime model integrity-checking methodology that employs correlational power analysis using a golden template or signature to mathematically quantify the likelihood of model integrity violations at runtime through the Mann-Whitney U-Test. Michscan operates in a black-box environment and does not require a cooperative or trustworthy model owner. We evaluated Michscan using an STM32F303RC microcontroller with an ARM Cortex-M4 running four TinyML models in the presence of three model integrity violations. Michscan successfully detected all integrity violations at runtime using power data from five inferences. All detected violations had a negligible probability P < 10^(-5) of being produced from an unmodified model (i.e., false positive).

Paperid: 3526, https://arxiv.org/pdf/2501.09936.pdf

Abstract:
Over the years, IP Multimedia Subsystems (IMS) networks have become increasingly critical as they form the backbone of modern telecommunications, enabling the integration of multimedia services such as voice, video, and messaging over IP-based infrastructures and next-generation networks. However, this integration has led to an increase in the attack surface of the IMS network, making it more prone to various forms of cyber threats and attacks, including Denial of Service (DoS) attacks, SIP-based attacks, unauthorized access, etc. As a result, it is important to find a way to manage and assess the security of IMS networks, but there is a lack of a systematic approach to managing the identification of vulnerabilities and threats. In this paper, we propose a model and a threat-specific risk security modeling and assessment approach to model and assess the threats of the IMS network. This model will provide a structured methodology for representing and analyzing threats and attack scenarios in layers within a hierarchical model. The proposed model aims to enhance the security posture of IMS networks by improving vulnerability management, risk evaluation, and defense evaluation against cyber threats. We perform a preliminary evaluation based on vulnerability collected from the National Vulnerability Database for devices in the IMS network. The results showed that we can model and assess the threats of IMS networks. IMS network defenders can use this model to understand their security postures taking into account the threat and risk posed by each vulnerability.

Paperid: 3527, https://arxiv.org/pdf/2501.09802.pdf

Abstract:
The rapid advancements in quantum computing present significant threats to existing encryption standards and internet security. Simultaneously, the advent of Web 3.0 marks a transformative era in internet history, emphasizing enhanced data security, decentralization, and user ownership. This white paper introduces the W3ID, an abbreviation of Web3 standard meeting universal digital ID, which is a Universal Digital Identity (UDI) model designed to meet Web3 standards while addressing vulnerabilities posed by quantum computing. W3ID innovatively generates secure Digital Object Identifiers (DOIs) tailored for the decentralized Web 3.0 ecosystem. Additionally, W3ID employs a dual-key system for secure authentication, enhancing both public and private verification mechanisms. To further enhance encryption strength and authentication integrity in the quantum computing era, W3ID incorporates an advanced security mechanism. By requiring quadruple application of SHA-256, with consecutive matches for validation, the system expands the number of possibilities to 256^4, which is approximately 4.3 billion times the current SHA-256 capacity. This dramatic increase in computational complexity ensures that even advanced quantum computing systems would face significant challenges in executing brute-force attacks. W3ID redefines digital identity standards for Web 3.0 and the quantum computing era, setting a new benchmark for security, scalability, and decentralization in the global digital twin ecosystem.

Paperid: 3528, https://arxiv.org/pdf/2501.09397.pdf

Abstract:
The growing number of satellites in low Earth orbit (LEO) has increased concerns about the risk of satellite collisions, which can ultimately result in the irretrievable loss of satellites and a growing amount of space debris. To mitigate this risk, accurate collision risk analysis is essential. However, this requires access to sensitive orbital data, which satellite operators are often unwilling to share due to privacy concerns. This contribution proposes a solution based on fully homomorphic encryption (FHE) and thus enables secure and private collision risk analysis. In contrast to existing methods, this approach ensures that collision risk analysis can be performed on sensitive orbital data without revealing it to other parties. To display the challenges and opportunities of FHE in this context, an implementation of the CKKS scheme is adapted and analyzed for its capacity to satisfy the theoretical requirements of precision and run time.

Paperid: 3529, https://arxiv.org/pdf/2501.09191.pdf

Abstract:
Software vulnerabilities continue to be the main cause of occurrence for cyber attacks. In an attempt to reduce them and improve software quality, software code analysis has emerged as a service offered by companies specialising in software testing. However, this service requires software companies to provide access to their software's code, which raises concerns about code privacy and intellectual property theft. This paper presents a novel approach to Software Quality and Privacy, in which testing companies can perform code analysis tasks on encrypted software code provided by software companies while code privacy is preserved. The approach combines Static Code Analysis and Searchable Symmetric Encryption in order to process the source code and build an encrypted inverted index that represents its data and control flows. The index is then used to discover vulnerabilities by carrying out static analysis tasks in a confidential way. With this approach, this paper also defines a new research field -- Confidential Code Analysis --, from which other types of code analysis tasks and approaches can be derived. We implemented the approach in a new tool called CoCoA and evaluated it experimentally with synthetic and real PHP web applications. The results show that the tool has similar precision as standard (non-confidential) static analysis tools and a modest average performance overhead of 42.7%.

Paperid: 3530, https://arxiv.org/pdf/2501.09031.pdf

Abstract:
The digital age, driven by the AI revolution, brings significant opportunities but also conceals security threats, which we refer to as cyber shadows. These threats pose risks at individual, organizational, and societal levels. This paper examines the systemic impact of these cyber threats and proposes a comprehensive cybersecurity strategy that integrates AI-driven solutions, such as Intrusion Detection Systems (IDS), with targeted policy interventions. By combining technological and regulatory measures, we create a multilevel defense capable of addressing both direct threats and indirect negative externalities. We emphasize that the synergy between AI-driven solutions and policy interventions is essential for neutralizing cyber threats and mitigating their negative impact on the digital economy. Finally, we underscore the need for continuous adaptation of these strategies, especially in response to the rapid advancement of autonomous AI-driven attacks, to ensure the creation of secure and resilient digital ecosystems.

Paperid: 3532, https://arxiv.org/pdf/2501.08723.pdf

Abstract:
Email phishing remains a prevalent cyber threat, targeting victims to extract sensitive information or deploy malicious software. This paper explores the integration of open-source intelligence (OSINT) tools and machine learning (ML) models to enhance phishing detection across multilingual datasets. Using Nmap and theHarvester, this study extracted 17 features, including domain names, IP addresses, and open ports, to improve detection accuracy. Multilingual email datasets, including English and Arabic, were analyzed to address the limitations of ML models trained predominantly on English data. Experiments with five classification algorithms: Decision Tree, Random Forest, Support Vector Machine, XGBoost, and Multinomial NaÃ¯ve Bayes. It revealed that Random Forest achieved the highest performance, with an accuracy of 97.37% for both English and Arabic datasets. For OSINT-enhanced datasets, the model demonstrated an improvement in accuracy compared to baseline models without OSINT features. These findings highlight the potential of combining OSINT tools with advanced ML models to detect phishing emails more effectively across diverse languages and contexts. This study contributes an approach to phishing detection by incorporating OSINT features and evaluating their impact on multilingual datasets, addressing a critical gap in cybersecurity research.

Paperid: 3533, https://arxiv.org/pdf/2501.08336.pdf

Abstract:
Due to the exceptional performance of Large Language Models (LLMs) in diverse downstream tasks,there has been an exponential growth in edge-device requests to cloud-based models.However, the current authentication mechanism using static Bearer Tokens in request headersfails to provide the flexibility and backend control required for edge-device deployments.To address these limitations, we propose Dynaseal,a novel methodology that enables fine-grained backend constraints on model invocations.

Paperid: 3534, https://arxiv.org/pdf/2501.08249.pdf

Abstract:
Device driver bugs are the leading cause of OS compromises, and their formal verification is therefore highly desirable. To the best of our knowledge, no realistic and performant driver has been verified for a non-trivial device. We propose Pancake, an imperative language for systems programming that features a well-defined and verification-friendly semantics. Leveraging the verified compiler backend of the CakeML functional language, we develop a compiler for Pancake that guarantees that the binary retains the semantics of the source code. Usng automatic translation of Pancake to the Viper SMT front-end, we verify a performant driver for an Ethernet NIC.

Paperid: 3535, https://arxiv.org/pdf/2501.07695.pdf

Abstract:
We propose a modification to the transpiler of a quantum computer to safeguard against side-channel attacks aimed at learning information about a quantum circuit. We demonstrate that if it is feasible to shield a specific subset of gates from side-channel attacks, then it is possible to conceal all information in a quantum circuit by transpiling it into a new circuit whose depth grows linearly, depending on the quantum computer's architecture. We provide concrete examples of implementing this protection on IBM's quantum computers, utilizing their virtual gates and editing their transpiler.

Paperid: 3536, https://arxiv.org/pdf/2501.07208.pdf

Abstract:
Secure Remote Password (SRP) protocol is an essential password-authenticated key exchange (PAKE) protocol based on the discrete logarithm problem (DLP). The protocol is specifically designed to obtain a session key and it has been widely used in various scenarios due to its attractive security features. In the SRP protocol, the server is not required to save any data directly associated with passwords. And this makes attackers who manage to corrupt the server fail to impersonate the client unless performing a brute-force search for the password. However, the development of quantum computing has potentially made classic DLP-based public-key cryptography schemes not secure, including the SRP protocol. So it is significant to design a quantum-resistant SRP protocol. In this paper, based on the original scheme, we propose a post-quantum SRP protocol from the learning with errors (LWE) problem. And we give rigorous proof and analyses on the correctness and security of the scheme. Besides being resistant to known quantum attacks, it maintains the various secure qualities of the original protocol.

Paperid: 3537, https://arxiv.org/pdf/2501.07207.pdf

Abstract:
Digital infrastructures are seeing convergence and connectivity at unprecedented scale. This is true for both current critical national infrastructures and emerging future systems that are highly cyber-physical in nature with complex intersections between humans and technologies, e.g., smart cities, intelligent transportation, high-value manufacturing and Industry 4.0. Diverse legacy and non-legacy software systems underpinned by heterogeneous hardware compose on-the-fly to deliver services to millions of users with varying requirements and unpredictable actions. This complexity is compounded by intricate and complicated supply-chains with many digital assets and services outsourced to third parties. The reality is that, at any particular point in time, there will be untrusted, partially-trusted or compromised elements across the infrastructure. Given this reality, and the societal scale of digital infrastructures, delivering secure and resilient operations is a major challenge. We argue that this requires us to move beyond the paradigm of security-by-design and embrace the challenge of securing-a-compromised-system.

Paperid: 3538, https://arxiv.org/pdf/2501.07131.pdf

Abstract:
Threat analysis is continuously growing in importance due to the always-increasing complexity and frequency of cyber attacks. Analyzing threats demands significant effort from security experts, leading to delays in the security analysis process. Different cybersecurity knowledge bases are currently available to support this task but manual efforts are often required to correlate such heterogenous sources into a unified view that would enable a more comprehensive assessment. To address this gap, we propose a methodology leveraging Natural Language Processing (NLP) to effectively and efficiently associate Common Vulnerabilities and Exposure (CVE) vulnerabilities with Common Attack Pattern Enumeration and Classification (CAPEC) attack patterns. The proposed technique combines semantic similarity with keyword analysis to improve the accuracy of association estimations. Experimental evaluations demonstrate superior performance compared to state-of-the-art models, reducing manual effort and analysis time, and enabling cybersecurity professionals to prioritize critical tasks.

Paperid: 3539, https://arxiv.org/pdf/2501.06963.pdf

Abstract:
The advent of Generative Artificial Intelligence (GenAI) has brought a significant change to our society. GenAI can be applied across numerous fields, with particular relevance in cybersecurity. Among the various areas of application, its use in penetration testing (pentesting) or ethical hacking processes is of special interest. In this paper, we have analyzed the potential of leading generic-purpose GenAI tools-Claude Opus, GPT-4 from ChatGPT, and Copilot-in augmenting the penetration testing process as defined by the Penetration Testing Execution Standard (PTES). Our analysis involved evaluating each tool across all PTES phases within a controlled virtualized environment. The findings reveal that, while these tools cannot fully automate the pentesting process, they provide substantial support by enhancing efficiency and effectiveness in specific tasks. Notably, all tools demonstrated utility; however, Claude Opus consistently outperformed the others in our experimental scenarios.

Paperid: 3540, https://arxiv.org/pdf/2501.06319.pdf

Abstract:
In this paper, we introduce a novel authentication scheme for satellite nodes based on quantum entanglement and measurement noise profiles. Our approach leverages the unique noise characteristics exhibited by each satellite's quantum optical communication system to create a distinctive "quantum noise fingerprint." This fingerprint is used for node authentication within a satellite constellation, offering a quantum-safe alternative to traditional cryptographic methods. The proposed scheme consists of a training phase, where each satellite engages in a training exercise with its neighbors to compile noise profiles, and an online authentication phase, where these profiles are used for real-time authentication. Our method addresses the inherent challenges of implementing cryptographic-based schemes in space, such as key management and distribution, by exploiting the fundamental properties of quantum mechanics and the unavoidable imperfections in quantum systems. This approach enhances the security and reliability of satellite communication networks, providing a robust solution to the authentication challenges in satellite constellations. We validated and tested several hypotheses for this approach using IBM System One quantum computers.

Paperid: 3541, https://arxiv.org/pdf/2501.06239.pdf

Abstract:
Cyber Threat Intelligence (CTI) is critical for mitigating threats to organizations, governments, and institutions, yet the necessary data are often dispersed across diverse formats. AI-driven solutions for CTI Information Extraction (IE) typically depend on high-quality, annotated data, which are not always available. This paper introduces 0-CTI, a scalable AI-based framework designed for efficient CTI Information Extraction. Leveraging advanced Natural Language Processing (NLP) techniques, particularly Transformer-based architectures, the proposed system processes complete text sequences of CTI reports to extract a cyber ontology of named entities and their relationships. Our contribution is the development of 0-CTI, the first modular framework for CTI Information Extraction that supports both supervised and zero-shot learning. Unlike existing state-of-the-art models that rely heavily on annotated datasets, our system enables fully dataless operation through zero-shot methods for both Entity and Relation Extraction, making it adaptable to various data availability scenarios. Additionally, our supervised Entity Extractor surpasses current state-of-the-art performance in cyber Entity Extraction, highlighting the dual strength of the framework in both low-resource and data-rich environments. By aligning the system's outputs with the Structured Threat Information Expression (STIX) format, a standard for information exchange in the cybersecurity domain, 0-CTI standardizes extracted knowledge, enhancing communication and collaboration in cybersecurity operations.

Paperid: 3542, https://arxiv.org/pdf/2501.06234.pdf

Abstract:
This paper reviews the state-of-the-art in deepfake generation and detection, focusing on modern deep learning technologies and tools based on the latest scientific advancements. The rise of deepfakes, leveraging techniques like Variational Autoencoders (VAEs), Generative Adversarial Networks (GANs), Diffusion models and other generative models, presents significant threats to privacy, security, and democracy. This fake media can deceive individuals, discredit real people and organizations, facilitate blackmail, and even threaten the integrity of legal, political, and social systems. Therefore, finding appropriate solutions to counter the potential threats posed by this technology is essential. We explore various deepfake methods, including face swapping, voice conversion, reenactment and lip synchronization, highlighting their applications in both benign and malicious contexts. The review critically examines the ongoing "arms race" between deepfake generation and detection, analyzing the challenges in identifying manipulated contents. By examining current methods and highlighting future research directions, this paper contributes to a crucial understanding of this rapidly evolving field and the urgent need for robust detection strategies to counter the misuse of this powerful technology. While focusing primarily on audio, image, and video domains, this study allows the reader to easily grasp the latest advancements in deepfake generation and detection.

Paperid: 3544, https://arxiv.org/pdf/2501.06161.pdf

Abstract:
The remarkable advancement of smart grid technology in the IoT sector has raised concerns over the privacy and security of the data collected and transferred in real-time. Smart meters generate detailed information about consumers' energy consumption patterns, increasing the risks of data breaches, identity theft, and other forms of cyber attacks. This study proposes a privacy-preserving data aggregation protocol that uses reversible watermarking and AES cryptography to ensure the security and privacy of the data. There are two versions of the protocol: one for low-frequency smart meters that uses LSB-shifting-based reversible watermarking (RLS) and another for high-frequency smart meters that uses difference expansion-based reversible watermarking (RDE). This enables the aggregation of smart meter data, maintaining confidentiality, integrity, and authenticity. The proposed protocol significantly enhances privacy-preserving measures for smart metering systems, conducting an experimental evaluation with real hardware implementation using Nucleo microcontroller boards and the RIOT operating system and comparing the results to existing security schemes.

Paperid: 3545, https://arxiv.org/pdf/2501.06033.pdf

Abstract:
As the Internet of Things (IoT) becomes more embedded within our daily lives, there is growing concern about the risk `smart' devices pose to network security. To address this, one avenue of research has focused on automated IoT device identification. Research has however largely neglected the identification of IoT device firmware versions. There is strong evidence that IoT security relies on devices being on the latest version patched for known vulnerabilities. Identifying when a device has updated (has changed version) or not (is on a stable version) is therefore useful for IoT security. Version identification involves challenges beyond those for identifying the model, type, and manufacturer of IoT devices, and traditional machine learning algorithms are ill-suited for effective version identification due to being limited by the availability of data for training. In this paper, we introduce an effective technique for identifying IoT device versions based on transfer learning. This technique relies on the idea that we can use a Twin Neural Network (TNN) - trained at distinguishing devices - to detect differences between a device on different versions. This facilitates real-world implementation by requiring relatively little training data. We extract statistical features from on-wire packet flows, convert these features into greyscale images, pass these images into a TNN, and determine version changes based on the Hedges' g effect size of the similarity scores. This allows us to detect the subtle changes present in on-wire traffic when a device changes version. To evaluate our technique, we set up a lab containing 12 IoT devices and recorded their on-wire packet captures for 11 days across multiple firmware versions. For testing data held out from training, our best performing model is shown to be 95.83% and 84.38% accurate at identifying stable versions and version changes respectively.

Paperid: 3546, https://arxiv.org/pdf/2501.06010.pdf

Abstract:
Tor is a well-known anonymous communication tool, used by people with various privacy and security needs. Prior works have exploited routing attacks to observe Tor traffic and deanonymize Tor users. Subsequently, location-aware relay selection algorithms have been proposed to defend against such attacks on Tor. However, location-aware relay selection algorithms are known to be vulnerable to information leakage on client locations and guard placement attacks. Can we design a new location-unaware approach to relay selection while achieving the similar goal of defending against routing attacks? Towards this end, we leverage the Resource Public Key Infrastructure (RPKI) in designing new guard relay selection algorithms. We develop a lightweight Discount Selection algorithm by only incorporating Route Origin Authorization (ROA) information, and a more secure Matching Selection algorithm by incorporating both ROA and Route Origin Validation (ROV) information. Our evaluation results show an increase in the number of ROA-ROV matched client-relay pairs using our Matching Selection algorithm, reaching 48.47% with minimal performance overhead through custom Shadow simulations and benchmarking.

Paperid: 3547, https://arxiv.org/pdf/2501.05786.pdf

Abstract:
Cancelable Biometrics (CB) stands for a range of biometric transformation schemes combining biometrics with user specific tokens to generate secure templates. Required properties are the irreversibility, unlikability and recognition accuracy of templates while making their revocation possible. In biometrics, a key-binding scheme is used for protecting a cryptographic key using a biometric data. The key can be recomputed only if a correct biometric data is acquired during authentication. Applications of key-binding schemes are typically disk encryption, where the cryptographic key is used to encrypt and decrypt the disk. In this paper, we cryptanalyze a recent key-binding scheme, called Cancelable Biometrics Vault (CBV) based on cancelable biometrics. More precisely, the introduced cancelable transformation, called BioEncoding scheme, for instantiating the CBV framework is attacked in terms of reversibility and linkability of templates. Subsequently, our linkability attack enables to recover the key in the vault without additional assumptions. Our cryptanalysis introduces a new perspective by uncovering the CBV scheme's revocability and linkability vulnerabilities, which were not previously identified in comparable biometric-based key-binding schemes.

Paperid: 3548, https://arxiv.org/pdf/2501.05626.pdf

Abstract:
Ensuring the privacy of votes in an election is crucial for the integrity of a democratic process. Often, voting power is delegated to representatives (e.g., in congress) who subsequently vote on behalf of voters on specific issues. This delegation model is also widely used in Decentralized Autonomous Organizations (DAOs). Although several existing voting systems used in DAOs support private voting, they only offer public delegation. In this paper, we introduce Kite, a new protocol that enables $\textit{private}$ delegation of voting power for DAO members. Voters can freely delegate, revoke, and re-delegate their power without revealing any information about who they delegated to. Even the delegate does not learn who delegated to them. The only information that is recorded publicly is that the voter delegated or re-delegated their vote to someone. Kite accommodates both public and private voting for the delegates themselves. We analyze the security of our protocol within the Universal Composability (UC) framework. We implement Kite as an extension to the existing Governor Bravo smart contract on the Ethereum blockchain, that is widely used for DAO governance. Furthermore, we provide an evaluation of our implementation that demonstrates the practicality of the protocol. The most expensive operation is delegation due to the required zero-knowledge proofs. On a consumer-grade laptop, delegation takes between 7 and 167 seconds depending on the requested level of privacy.

Paperid: 3549, https://arxiv.org/pdf/2501.05387.pdf

Abstract:
Encrypted network communication ensures confidentiality, integrity, and privacy between endpoints. However, attackers are increasingly exploiting encryption to conceal malicious behavior. Detecting unknown encrypted malicious traffic without decrypting the payloads remains a significant challenge. In this study, we investigate the integration of explainable artificial intelligence (XAI) techniques to detect malicious network traffic. We employ ensemble learning models to identify malicious activity using multi-view features extracted from various aspects of encrypted communication. To effectively represent malicious communication, we compiled a robust dataset with 1,127 unique connections, more than any other available open-source dataset, and spanning 54 malware families. Our models were benchmarked against the CTU-13 dataset, achieving performance of over 99% accuracy, precision, and F1-score. Additionally, the eXtreme Gradient Boosting (XGB) model demonstrated 99.32% accuracy, 99.53% precision, and 99.43% F1-score on our custom dataset. By leveraging Shapley Additive Explanations (SHAP), we identified that the maximum packet size, mean inter-arrival time of packets, and transport layer security version used are the most critical features for the global model explanation. Furthermore, key features were identified as important for local explanations across both datasets for individual traffic samples. These insights provide a deeper understanding of the model decision-making process, enhancing the transparency and reliability of detecting malicious encrypted traffic.

Paperid: 3550, https://arxiv.org/pdf/2501.05309.pdf

Abstract:
Differentially private (DP) selection involves choosing a high-scoring candidate from a finite candidate pool, where each score depends on a sensitive dataset. This problem arises naturally in a variety of contexts including model selection, hypothesis testing, and within many DP algorithms. Classical methods, such as Report Noisy Max (RNM), assume all candidates' scores are equally sensitive to changes in a single individual's data, but this often isn't the case. To address this, algorithms like the Generalised Exponential Mechanism (GEM) leverage variability in candidate sensitivities. However, we observe that while these algorithms can outperform RNM in some situations, they may underperform in others - they can even perform worse than random selection. In this work, we explore how the distribution of scores and sensitivities impacts DP selection mechanisms. In all settings we study, we find that there exists a mechanism that utilises heterogeneity in the candidate sensitivities that outperforms standard mechanisms like RNM. However, no single mechanism uniformly outperforms RNM. We propose using the correlation between the scores and sensitivities as the basis for deciding which DP selection mechanism to use. Further, we design a slight variant of GEM, modified GEM that generally performs well whenever GEM performs poorly. Relying on the correlation heuristic we propose combined GEM, which adaptively chooses between GEM and modified GEM and outperforms both in polarised settings.

Paperid: 3551, https://arxiv.org/pdf/2501.04580.pdf

Abstract:
Organizations run applications on cloud infrastructure shared between multiple users and organizations. Popular tooling for this shared infrastructure, including Docker and Kubernetes, supports such multi-tenancy through the use of operating system virtualization. With operating system virtualization (known as containerization), multiple applications share the same kernel, reducing the runtime overhead. However, this shared kernel presents a large attack surface and has led to a proliferation of container escape attacks in which a kernel exploit lets an attacker escape the isolation of operating system virtualization to access other applications or the operating system itself. To address this, some systems have proposed a return to hypervisor virtualization for stronger isolation between applications. However, no existing system has achieved both the isolation of hypervisor virtualization and the performance and usability of operating system virtualization. We present Edera, an optimized type 1 hypervisor that uses paravirtualization to improve the runtime of hypervisor virtualization. We illustrate Edera's usability and performance through two use cases. First, we create a container runtime compatible with Kubernetes that runs on the Edera hypervisor. This implementation can be used as a drop-in replacement for the Kubernetes runtime and is compatible with all the tooling in the Kubernetes ecosystem. Second, we use Edera to provide driver isolation for hardware drivers, including those for networking, storage, and GPUs. This use of isolation protects the hypervisor and other applications from driver vulnerabilities. We find that Edera has runtime comparable to Docker with .9% slower cpu speeds, an average of 3% faster system call performance, and memory performance 0-7% faster. It achieves this with a 648 millisecond increase in startup time from Docker's 177.4 milliseconds.

Paperid: 3552, https://arxiv.org/pdf/2501.04511.pdf

Abstract:
Secure covert communication in hostile environments requires simultaneously achieving invisibility, provable security guarantees, and robustness against informed adversaries. This paper presents a novel hybrid steganographic framework that unites cover synthesis and cover modification within a unified multichannel protocol. A secret-seeded PRNG drives a lightweight Markov-chain generator to produce contextually plausible cover parameters, which are then masked with the payload and dispersed across independent channels. The masked bit-vector is imperceptibly embedded into conventional media via a variance-aware least-significant-bit algorithm, ensuring that statistical properties remain within natural bounds. We formalize a multichannel adversary model (MC-ATTACK) and prove that, under standard security assumptions, the adversary's distinguishing advantage is negligible, thereby guaranteeing both confidentiality and integrity. Empirical results corroborate these claims: local-variance-guided embedding yields near-lossless extraction (mean BER $<5\times10^{-3}$, correlation $>0.99$) with minimal perceptual distortion (PSNR $\approx100$,dB, SSIM $>0.99$), while key-based masking drives extraction success to zero (BER $\approx0.5$) for a fully informed adversary. Comparative analysis demonstrates that purely distortion-free or invertible schemes fail under the same threat model, underscoring the necessity of hybrid designs. The proposed approach advances high-assurance steganography by delivering an efficient, provably secure covert channel suitable for deployment in high-surveillance networks.

Paperid: 3553, https://arxiv.org/pdf/2501.04394.pdf

Abstract:
With the exponential rise in the use of cloud services, smart devices, and IoT devices, advanced cyber attacks have become increasingly sophisticated and ubiquitous. Furthermore, the rapid evolution of computing architectures and memory technologies has created an urgent need to understand and address hardware security vulnerabilities. In this paper, we review the current state of vulnerabilities and mitigation strategies in contemporary computing systems. We discuss cache side-channel attacks (including Spectre and Meltdown), power side-channel attacks (such as Simple Power Analysis, Differential Power Analysis, Correlation Power Analysis, and Template Attacks), and advanced techniques like Voltage Glitching and Electromagnetic Analysis to help understand and build robust cybersecurity defense systems and guide further research. We also examine memory encryption, focusing on confidentiality, granularity, key management, masking, and re-keying strategies. Additionally, we cover Cryptographic Instruction Set Architectures, Secure Boot, Root of Trust mechanisms, Physical Unclonable Functions, and hardware fault injection techniques. The paper concludes with an analysis of the RISC-V architecture's unique security challenges. The comprehensive analysis presented in this paper is essential for building resilient hardware security solutions that can protect against both current and emerging threats in an increasingly challenging security landscape.

Paperid: 3554, https://arxiv.org/pdf/2501.04362.pdf

Abstract:
In this paper, we propose a robust method for detecting guilty actors in image steganography while effectively addressing the Cover Source Mismatch (CSM) problem, which arises when classifying images from one source using a classifier trained on images from another source. Designed for an actor-based scenario, our method combines the use of Detection of Classifier Inconsistencies (DCI) prediction with EfficientNet neural networks for feature extraction, and a Gradient Boosting Machine for the final classification. The proposed approach successfully determines whether an actor is innocent or guilty, or if they should be discarded due to excessive CSM. We show that the method remains reliable even in scenarios with high CSM, consistently achieving accuracy above 80% and outperforming the baseline method. This novel approach contributes to the field of steganalysis by offering a practical and efficient solution for handling CSM and detecting guilty actors in real-world applications.

Paperid: 3555, https://arxiv.org/pdf/2501.04216.pdf

Abstract:
In cloud databases, cloud computation over sensitive data uploaded by clients inevitably causes concern about data security and privacy. Even when encryption primitives and trusted computing environments are integrated into query processing to safeguard the actual contents of the data, access patterns of algorithms can still leak private information about the data. Oblivious Random Access Memory (ORAM) and circuits are two generic approaches to address this issue, ensuring that access patterns of algorithms remain oblivious to the data. However, deploying these methods on insecure algorithms, particularly for multi-way join processing, is computationally expensive and inherently challenging. In this paper, we propose a novel sorting-based algorithm for multi-way join processing that operates without relying on ORAM simulations or other security assumptions. Our algorithm is a non-trivial, provably oblivious composition of basic primitives, with time complexity matching the insecure worst-case optimal join algorithm, up to a logarithmic factor. Furthermore, it is cache-agnostic, with cache complexity matching the insecure lower bound, also up to a logarithmic factor. This clean and straightforward approach has the potential to be extended to other security settings and implemented in practical database systems.

Paperid: 3556, https://arxiv.org/pdf/2501.03898.pdf

Abstract:
The increasing sophistication of modern cyber threats, particularly file-less malware relying on living-off-the-land techniques, poses significant challenges to traditional detection mechanisms. Memory forensics has emerged as a crucial method for uncovering such threats by analysing dynamic changes in memory. This research introduces SPECTRE (Snapshot Processing, Emulation, Comparison, and Threat Reporting Engine), a modular Cyber Incident Response System designed to enhance threat detection, investigation, and visualization. By adopting Volatility JSON format as an intermediate output, SPECTRE ensures compatibility with widely used DFIR tools, minimizing manual data transformations and enabling seamless integration into established workflows. Its emulation capabilities safely replicate realistic attack scenarios, such as credential dumping and malicious process injections, for controlled experimentation and validation. The anomaly detection module addresses critical attack vectors, including RunDLL32 abuse and malicious IP detection, while the IP forensics module enhances threat intelligence by integrating tools like Virus Total and geolocation APIs. SPECTRE advanced visualization techniques transform raw memory data into actionable insights, aiding Red, Blue and Purple teams in refining strategies and responding effectively to threats. Bridging gaps between memory and network forensics, SPECTRE offers a scalable, robust platform for advancing threat detection, team training, and forensic research in combating sophisticated cyber threats.

Paperid: 3557, https://arxiv.org/pdf/2501.03449.pdf

Abstract:
In this paper, we investigate the application of Reed-Muller (RM) codes for Physical-layer security in a real world wiretap channel scenario. Utilizing software-defined radios (SDRs) in a real indoor environment, we implement a coset coding scheme that leverages the hierarchical structure of RM codes to secure data transmission. The generator matrix of the RM code is used to partition codewords into cosets in the usual way, where each message corresponds to a unique coset, and auxiliary bits select specific codewords within each coset. This approach enables the legitimate receiver (Bob) can decode the transmitted message with minimal information leakage to eavesdropper (Eve) thus protecting the confidentiality of the communication with the help of coset structure. Mutual information neural estimation (MINE) is used to quantify information leakage and validate the effectiveness of the scheme. Experimental results indicate that RM codes can achieve robust security even in practical environments affected by real-world channel impairments. These findings demonstrate the potential of RM codes as an efficient solution for physical-layer security, particularly for applications that require low latency and short blocklengths.

Paperid: 3558, https://arxiv.org/pdf/2501.03391.pdf

Abstract:
The Bitcoin white paper introduced blockchain technology, enabling trustful transactions without intermediaries. Smart contracts emerged with Ethereum and blockchains expanded beyond cryptocurrency, applying to auctions, crowdfunding and electronic voting. However, blockchain's transparency raised privacy concerns and initial anonymity measures proved ineffective. Smart contract privacy solutions employed zero-knowledge proofs, homomorphic encryption and trusted execution environments. These approaches have practical drawbacks, such as limited functionality, high computation times and trust on third parties requirements, being not fully decentralized. This work proposes a solution utilizing zk-SNARKs to provide privacy in smart contracts and blockchains. The solution supports both fungible and nonfungible tokens. Additionally, the proposal includes a new type of transactions, called delegated transactions, which enable use cases like Delivery vs Payment (DvP).

Paperid: 3559, https://arxiv.org/pdf/2501.03358.pdf

Abstract:
Automatic Ship Identification Systems (AIS) play a key role in monitoring maritime traffic, providing the data necessary for analysis and decision-making. The integrity of this data is fundamental to the correctness of infer-ence and decision-making in the context of maritime safety, traffic manage-ment and environmental protection. This paper analyzes the impact of data integrity in large AIS datasets, on classification accuracy. It also presents er-ror detection and correction methods and data verification techniques that can improve the reliability of AIS systems. The results show that improving the integrity of AIS data significantly improves the quality of inference, which has a direct impact on operational efficiency and safety at sea.

Paperid: 3560, https://arxiv.org/pdf/2501.03067.pdf

Abstract:
When developing devices, architectures and services for the Internet of Medical Things (IoMT) world, manufacturers or integrators must be aware of the security requirements expressed by both laws and specifications. To provide tools guiding through these requirements and to assure a third party of the correct compliance, an ontology charting the relevant laws and specifications (for the European context) is very useful. We here address the development of this ontology. Due to the very high number and size of the considered specification documents, we have put in place a methodology and tools to simplify the transition from natural text to an ontology. The first step is a manual highlighting of relevant concepts in the corpus, then a manual translation to XML/XSD is operated. We have developed a tool allowing us to convert this semi-structured data into an ontology. Because the different specifications use similar but different wording, our approach favors the creation of similar instances in the ontology. To improve the ontology simplification through instance merging, we consider the use of LLMs. The responses of the LLMs are compared against our manually defined correct responses. The quality of the responses of the automated system does not prove to be good enough to be trusted blindly, and should only be used as a starting point for a manual correction.

Paperid: 3561, https://arxiv.org/pdf/2501.02981.pdf

Abstract:
Advanced Persistent Threats (APTs) represent a significant challenge in cybersecurity due to their sophisticated and stealthy nature. Traditional Intrusion Detection Systems (IDS) often fall short in detecting these multi-stage attacks. Recently, Graph Neural Networks (GNNs) have been employed to enhance IDS capabilities by analyzing the complex relationships within networked data. However, existing GNN-based solutions are hampered by high false positive rates and substantial resource consumption. In this paper, we present a novel IDS designed to detect APTs using a Spatio-Temporal Graph Neural Network Autoencoder. Our approach leverages spatial information to understand the interactions between entities within a graph and temporal information to capture the evolution of the graph over time. This dual perspective is crucial for identifying the sequential stages of APTs. Furthermore, to address privacy and scalability concerns, we deploy our architecture in a federated learning environment. This setup ensures that local data remains on-premise while encrypted model-weights are shared and aggregated using homomorphic encryption, maintaining data privacy and security. Our evaluation shows that this system effectively detects APTs with lower false positive rates and optimized resource usage compared to existing methods, highlighting the potential of spatio-temporal analysis and federated learning in enhancing cybersecurity defenses.

Paperid: 3562, https://arxiv.org/pdf/2501.02933.pdf

Abstract:
Echomix is a practical mix network framework and a suite of associated protocols providing strong metadata privacy against realistic modern adversaries. It is distinguished from other anonymity systems by a resistance to traffic analysis by global adversaries, compromised contacts and network infrastructure, quantum decryption algorithms, and statistical and confirmation attacks typical for multi-client messaging setting. It is implemented as Katzenpost, a robust software project, and used in multiple deployed systems, and features relatively low latency and bandwidth overhead. The contributions of this paper are: (1) Improvements on leading mix network designs, supported by rigorous analysis. These include solutions to crucial vulnerabilities to traffic analysis, malicious servers and active attacks. (2) A cryptographic group messaging protocol with strong metadata protection guarantees and reliability. (3) Hybrid post-quantum nested packet encryption.

Paperid: 3563, https://arxiv.org/pdf/2501.02647.pdf

Abstract:
This paper critically reviews the integration of Artificial Intelligence (AI) and blockchain technologies in the context of Medical Internet of Things (MedIoT) applications, where they collectively promise to revolutionize healthcare delivery. By examining current research, we underscore AI's potential in advancing diagnostics and patient care, alongside blockchain's capacity to bolster data security and patient privacy. We focus particularly on the imperative to cultivate trust and ensure reliability within these systems. Our review highlights innovative solutions for managing healthcare data and challenges such as ensuring scalability, maintaining privacy, and promoting ethical practices within the MedIoT domain. We present a vision for integrating AI-driven insights with blockchain security in healthcare, offering a comprehensive review of current research and future directions. We conclude with a set of identified research gaps and propose that addressing these is crucial for achieving the dependable, secure, and patient -centric MedIoT applications of tomorrow.

Paperid: 3564, https://arxiv.org/pdf/2501.02626.pdf

Abstract:
Cryptography based on the presumed hardness of decoding codes -- i.e., code-based cryptography -- has recently seen increased interest due to its plausible security against quantum attackers. Notably, of the four proposals for the NIST post-quantum standardization process that were advanced to their fourth round for further review, two were code-based. The most efficient proposals -- including HQC and BIKE, the NIST submissions alluded to above -- in fact rely on the presumed hardness of decoding structured codes. Of particular relevance to our work, HQC is based on quasi-cyclic codes, which are codes generated by matrices consisting of two cyclic blocks. In particular, the security analysis of HQC requires a precise understanding of the Decryption Failure Rate (DFR), whose analysis relies on the following heuristic: given random ``sparse'' vectors $e_1,e_2$ (say, each coordinate is i.i.d. Bernoulli) multiplied by fixed ``sparse'' quasi-cyclic matrices $A_1,A_2$, the weight of resulting vector $e_1A_1+e_2A_2$ is very concentrated around its expectation. In the documentation, the authors model the distribution of $e_1A_1+e_2A_2$ as a vector with independent coordinates (and correct marginal distribution). However, we uncover cases where this modeling fails. While this does not invalidate the (empirically verified) heuristic that the weight of $e_1A_1+e_2A_2$ is concentrated, it does suggest that the behavior of the noise is a bit more subtle than previously predicted. Lastly, we also discuss implications of our result for potential worst-case to average-case reductions for quasi-cyclic codes.

Paperid: 3565, https://arxiv.org/pdf/2501.02493.pdf

Abstract:
In an era of escalating cyber threats, malware poses significant risks to individuals and organizations, potentially leading to data breaches, system failures, and substantial financial losses. This study addresses the urgent need for effective malware detection strategies by leveraging Machine Learning (ML) techniques on extensive datasets collected from Microsoft Windows Defender. Our research aims to develop an advanced ML model that accurately predicts malware vulnerabilities based on the specific conditions of individual machines. Moving beyond traditional signature-based detection methods, we incorporate historical data and innovative feature engineering to enhance detection capabilities. This study makes several contributions: first, it advances existing malware detection techniques by employing sophisticated ML algorithms; second, it utilizes a large-scale, real-world dataset to ensure the applicability of findings; third, it highlights the importance of feature analysis in identifying key indicators of malware infections; and fourth, it proposes models that can be adapted for enterprise environments, offering a proactive approach to safeguarding extensive networks against emerging threats. We aim to improve cybersecurity resilience, providing critical insights for practitioners in the field and addressing the evolving challenges posed by malware in a digital landscape. Finally, discussions on results, insights, and conclusions are presented.

Paperid: 3566, https://arxiv.org/pdf/2501.02350.pdf

Abstract:
Currently, an increasing number of users and enterprises are storing their data in the cloud but do not fully trust cloud providers with their data in plaintext form. To address this concern, they encrypt their data before uploading it to the cloud. However, encryption with different keys means that even identical data will become different ciphertexts, making deduplication less effective. Encrypted deduplication avoids this issue by ensuring that identical data chunks generate the same ciphertext with content-based keys, enabling the cloud to efficiently identify and remove duplicates even in encrypted form. Current encrypted data deduplication work can be classified into two types: target-based and source-based. Target-based encrypted deduplication requires clients to upload all encrypted chunks (the basic unit of deduplication) to the cloud with high network bandwidth overhead. Source-based deduplication involves clients uploading fingerprints (hashes) of encrypted chunks for duplicate checking and only uploading unique encrypted chunks, which reduces network transfer but introduces high latency and potential side-channel attacks, which need to be mitigated by Proof of Ownership (PoW), and high computing overhead of the cloud. So, reducing the latency and the overheads of network and cloud while ensuring security has become a significant challenge for secure data deduplication in cloud storage. In response to this challenge, we present PM-Dedup, a novel secure source-based deduplication approach that relocates a portion of the deduplication checking process and PoW tasks from the cloud to the trusted execution environments (TEEs) in the client-side edge servers. We also propose various designs to enhance the security and efficiency of data deduplication.

Paperid: 3567, https://arxiv.org/pdf/2501.01672.pdf

Abstract:
Large language models(LLMs) are currently at the forefront of the machine learning field, which show a broad application prospect but at the same time expose some risks of privacy leakage. We combined Fully Homomorphic Encryption(FHE) and provable security theory with Parameter-Efficient Fine-Tuning(PEFT) to propose an efficient and secure inference scheme for LLMs. More specially, we focus on pre-trained LLMs which rely on open-sourced base model and then fine-tuned with the private datasets by LoRA. This is a popular road-map for Vertical Domain Models such as LawGPT and BenTsao. We use two key technologies below. Firstly, we divide the whole model into the public part and the private part. The weights of public part are publicly accessible(e.g. the open-sourced base model) while the private part needs to be protected(e.g. the LoRA matrices). In this way, the overhead brought by computing on private data can be greatly reduced. Secondly, we propose a general method to transform a linear layer into another one which provides security against model extraction attacks and preserves its original functionality, which denoted as Private Linear Layer(PLL). Then we use this method on the LoRA matrices to make sure that the server protects their private weights without restricting the user's input. We also show that the difficulty of performing model extraction attacks for PLL can be reduced to the well-known hard problem Learning with Errors(LWE). Combing this method with FHE, we can protect user's input at the same time. In this paper, we use the open-source model ChatGLM2-6B as the base model which is fine-tuned by LoRA. Experimental results show the inference efficiency of our scheme reaches 1.61s/token which displays that the scheme has good practicality.

Paperid: 3568, https://arxiv.org/pdf/2501.01517.pdf

Abstract:
Wireless local area networks remain vulnerable to attacks initiated during the connection establishment (CE) phase. Current Wi-Fi security protocols fail to fully mitigate attacks like man-in-the-middle, preamble spoofing, and relaying. To fortify the CE phase, in this paper we design a backward-compatible scheme using a digital signature interwoven into the preambles at the physical (PHY) layer with time constraints to effectively counter those attacks. This approach slices a MAC-layer signature and embeds the slices within CE frame preambles without extending frame size, allowing one or multiple stations to concurrently verify their respective APs' transmissions. The concurrent CEs are supported by enabling the stations to analyze the consistent patterns of PHY-layer headers and identify whether the received frames are the anticipated ones from the expected APs, achieving 100% accuracy without needing to examine their MAC-layer headers. Additionally, we design and implement a fast relay attack to challenge our proposed defense and determine its effectiveness. We extend existing open-source tools to support IEEE 802.11ax to evaluate the effectiveness and practicality of our proposed scheme in a testbed consisting of USRPs, commercial APs, and Wi-Fi devices, and we show that our relay attack detection achieves 96-100% true positive rates. Finally, end-to-end formal security analyses confirm the security and correctness of the proposed solution.

Paperid: 3569, https://arxiv.org/pdf/2501.01255.pdf

Abstract:
This paper reports an analysis of aspects of the project planning stage. The object of research is the decision-making processes that take place at this stage. This work considers the problem of building a hierarchy of tasks, their distribution among performers, taking into account restrictions on financial costs and duration of project implementation. Verbal and mathematical models of the task of constructing a hierarchy of tasks and other tasks that take place at the stage of project planning were constructed. Such indicators of the project implementation process efficiency were introduced as the time, cost, and cost-time efficiency. In order to be able to apply these criteria, the tasks of estimating the minimum value of the duration of the project and its minimum required cost were considered. Appropriate methods have been developed to solve them. The developed iterative method for assessing the minimum duration of project implementation is based on taking into account the possibility of simultaneous execution of various tasks. The method of estimating the minimum cost of the project is to build and solve the problem of Boolean programming. The values obtained as a result of solving these problems form an Â«ideal pointÂ», approaching which is enabled by the developed iterative method of constructing a hierarchy of tasks based on the method of sequential concessions. This method makes it possible to devise options for management decisions to obtain valid solutions to the problem. According to them, the decision maker can introduce a concession on the value of one or both components of the Â«ideal pointÂ» or change the input data to the task. The models and methods built can be used when planning projects in education, science, production, etc.

Paperid: 3570, https://arxiv.org/pdf/2501.01146.pdf

Abstract:
Consensus mechanism is the core technology for blockchain to ensure that transactions are executed in sequence. It also determines the decentralization, security, and efficiency of blockchain. Existing mechanisms all have certain centralization issues and fail to ensure the decentralization of blockchain networks. A decentralized and efficient mechanism is required to improve blockchain systems. This paper proposes a fair consensus mechanism called Proof of Verifiable Functions (PoVF), based on the verifiability and unpredictability of verifiable functions. PoVF provides a sufficiently fair mechanism, ensuring that all nodes in blockchain network have equal opportunity to participate in consensus. In addition, a structure called "Delay buffer" is proposed to ensure transactions are executed sequentially. It delay the selection of blocks to avoid blockchain forks caused by broadcasting and transaction execution confusion. According to our security analysis, PoVF is provably secure and has the ability to resist potential adversaries. According to the experiments, PoVF-based blockchain can process up to 4000 transactions per second with nodes configured with only 4-core CPUs. This paper uses the Gini coefficient to measure the decentralization of blockchains, and the PoVF-based blockchain achieves the lowest Gini coefficient of 0.39 among all sampled blockchains. PoVF has been shown to provide sufficient efficiency while ensuring decentralization and security through experiments.

Paperid: 3571, https://arxiv.org/pdf/2501.00916.pdf

Abstract:
We present two Device Independent Quantum Random Number Generator (DI-QRNG) protocols using two self-testing methodologies in Preparation \& Measure (P\&M) scenario. These two methodologies are the variants of two well-known non-local games, namely, CHSH and pseudo-telepathy games, in P\&M framework. We exploit them as distinguishers in black-box settings to differentiate the classical and the quantum paradigms and hence to certify the Device Independence. The first self-test was proposed by Tavakoli et al. (Phys. Rev. A, 2018). We show that this is actually a P\&M variant of the CHSH game. Then based on this self-test, we design our first DI-QRNG protocol. We also propose a new self-testing methodology, which is the first of its kind that is reducible from pseudo-telepathy game in P\&M framework. Based on this new self-test, we design our second DI-QRNG protocol.

Paperid: 3572, https://arxiv.org/pdf/2501.00757.pdf

Abstract:
For different factors/reasons, ranging from inherent characteristics and features providing decentralization, enhanced privacy, ease of transactions, etc., to implied external hardships in enforcing regulations, contradictions in data sharing policies, etc., cryptocurrencies have been severely abused for carrying out numerous malicious and illicit activities including money laundering, darknet transactions, scams, terrorism financing, arm trades. However, money laundering is a key crime to be mitigated to also suspend the movement of funds from other illicit activities. Billions of dollars are annually being laundered. It is getting extremely difficult to identify money laundering in crypto transactions owing to many layering strategies available today, and rapidly evolving tactics, and patterns the launderers use to obfuscate the illicit funds. Many detection methods have been proposed ranging from naive approaches involving complete manual investigation to machine learning models. However, there are very limited datasets available for effectively training machine learning models. Also, the existing datasets are static and class-imbalanced, posing challenges for scalability and suitability to specific scenarios, due to lack of customization to varying requirements. This has been a persistent challenge in literature. In this paper, we propose behavior embedded entity-specific money laundering-like transaction simulation that helps in generating various transaction types and models the transactions embedding the behavior of several entities observed in this space. The paper discusses the design and architecture of the simulator, a custom dataset we generated using the simulator, and the performance of models trained on this synthetic data in detecting real addresses involved in money laundering.

Paperid: 3573, https://arxiv.org/pdf/2501.00754.pdf

Abstract:
The learner's ability to generate a hypothesis that closely approximates the target function is crucial in machine learning. Achieving this requires sufficient data; however, unauthorized access by an eavesdropping learner can lead to security risks. Thus, it is important to ensure the performance of the "authorized" learner by limiting the quality of the training data accessible to eavesdroppers. Unlike previous studies focusing on encryption or access controls, we provide a theorem to ensure superior learning outcomes exclusively for the authorized learner with quantum label encoding. In this context, we use the probably-approximately-correct (PAC) learning framework and introduce the concept of learning probability to quantitatively assess learner performance. Our theorem allows the condition that, given a training dataset, an authorized learner is guaranteed to achieve a certain quality of learning outcome, while eavesdroppers are not. Notably, this condition can be constructed based only on the authorized-learning-only measurable quantities of the training data, i.e., its size and noise degree. We validate our theoretical proofs and predictions through convolutional neural networks (CNNs) image classification learning.

Paperid: 3574, https://arxiv.org/pdf/2501.00463.pdf

Abstract:
The rapid proliferation of AI-generated images necessitates effective watermarking techniques to protect intellectual property and detect fraudulent content. While existing training-based watermarking methods show promise, they often struggle with generalizing across diverse prompts and tend to introduce visible artifacts. To this end, we propose a novel, provably generalizable image watermarking approach for Latent Diffusion Models, termed Self-Augmented Training (SAT-LDM). Our method aligns the training and testing phases through a free generation distribution, thereby enhancing the watermarking module's generalization capabilities. We theoretically consolidate SAT-LDM by proving that the free generation distribution contributes to its tight generalization bound, without the need for additional data collection. Extensive experiments show that SAT-LDM not only achieves robust watermarking but also significantly improves the quality of watermarked images across a wide range of prompts. Moreover, our experimental analyses confirm the strong generalization abilities of SAT-LDM. We hope that our method provides a practical and efficient solution for securing high-fidelity AI-generated content.

Paperid: 3575, https://arxiv.org/pdf/2501.00085.pdf

Abstract:
Security-Enhanced Linux (SELinux) is a robust security mechanism that enforces mandatory access controls (MAC), but its policy language's complexity creates challenges for policy analysis and management. This research investigates the automation of SELinux policy analysis using graph-based techniques combined with machine learning approaches to detect policy anomalies. The study addresses two key questions: Can SELinux policy analysis be automated through graph analysis, and how do different anomaly detection models compare in analyzing SELinux policies? We will be comparing different machine learning models by evaluating their effectiveness in detecting policy violations and anomalies. Our approach utilizes Neo4j for graph representation of policies, with Node2vec transforming these graph structures into meaningful vector embeddings that can be processed by our machine learning models. In our results, the MLP Neural Network consistently demonstrated superior performance across different dataset sizes, achieving 95% accuracy with balanced precision and recall metrics, while both Random Forest and SVM models showed competitive but slightly lower performance in detecting policy violations. This combination of graph-based modeling and machine learning provides a more sophisticated and automated approach to understanding and analyzing complex SELinux policies compared to traditional manual analysis methods.

Paperid: 3576, https://arxiv.org/pdf/2506.23592.pdf

Abstract:
The cybersecurity industry combines "automated" and "autonomous" AI, creating dangerous misconceptions about system capabilities. Recent milestones like XBOW topping HackerOne's leaderboard showcase impressive progress, yet these systems remain fundamentally semi-autonomous--requiring human oversight. Drawing from robotics principles, where the distinction between automation and autonomy is well-established, I take inspiration from prior work and establish a 6-level taxonomy (Level 0-5) distinguishing automation from autonomy in Cybersecurity AI. Current "autonomous" pentesters operate at Level 3-4: they execute complex attack sequences but need human review for edge cases and strategic decisions. True Level 5 autonomy remains aspirational. Organizations deploying mischaracterized "autonomous" tools risk reducing oversight precisely when it's most needed, potentially creating new vulnerabilities. The path forward requires precise terminology, transparent capabilities disclosure, and human-AI partnership-not replacement.

Paperid: 3577, https://arxiv.org/pdf/2506.23499.pdf

Abstract:
The unbounded knapsack problem can be considered as a particular case of the double partition problem that asks for a number of nonnegative integer solutions to a system of two linear Diophantine equations with integer coefficients. In the middle of 19th century Sylvester and Cayley suggested an approach based on the variable elimination allowing a reduction of a double partition to a sum of scalar partitions. This manuscript discusses a geometric interpretation of this method and its application to the knapsack problem.

Paperid: 3578, https://arxiv.org/pdf/2506.23435.pdf

Abstract:
Speedrunning is a competition that emerged from communities of early video games such as Doom (1993). Speedrunners try to finish a game in minimal time. Provably verifying the authenticity of submitted speedruns is an open problem. Traditionally, best-effort speedrun verification is conducted by on-site human observers, forensic audio analysis, or a rigorous mathematical analysis of the game mechanics. Such methods are tedious, fallible, and, perhaps worst of all, not cryptographic. Motivated by naivety and the Dunning-Kruger effect, we attempt to build a system that cryptographically proves the authenticity of speedruns. This paper describes our attempted solutions and ways to circumvent them. Through a narration of our failures, we attempt to demonstrate the difficulty of authenticating live and interactive human input in untrusted environments, as well as the limits of signature schemes, game integrity, and provable play.

Paperid: 3579, https://arxiv.org/pdf/2506.23050.pdf

Abstract:
We investigate properties of equivalence classes in AES which arise naturally from properties of MixColumns and InvMixColumns. These two operations have the property that the XOR of the 4 input bytes equals the XOR of 4 output bytes. We examine the effect on equivalence classes due to the operation of SubBytes, ShiftRows, MixColumns and AddRoundKey. The next phase of research is to find a key recovery attack using known (plaintext, ciphertext) equivalence class pairs. Keywords: AES, Equivalence, Class, MixColumns, ShiftRows, SubBytes, AddRoundKey, Schedule, State, XOR

Paperid: 3580, https://arxiv.org/pdf/2506.22445.pdf

Abstract:
Cyber-Physical Systems play a critical role in the infrastructure of various sectors, including manufacturing, energy distribution, and autonomous transportation systems. However, their increasing connectivity renders them highly vulnerable to sophisticated cyber threats, such as adaptive and zero-day attacks, against which traditional security methods like rule-based intrusion detection and single-agent reinforcement learning prove insufficient. To overcome these challenges, this paper introduces a novel Hierarchical Adversarially-Resilient Multi-Agent Reinforcement Learning (HAMARL) framework. HAMARL employs a hierarchical structure consisting of local agents dedicated to subsystem security and a global coordinator that oversees and optimizes comprehensive, system-wide defense strategies. Furthermore, the framework incorporates an adversarial training loop designed to simulate and anticipate evolving cyber threats, enabling proactive defense adaptation. Extensive experimental evaluations conducted on a simulated industrial IoT testbed indicate that HAMARL substantially outperforms traditional multi-agent reinforcement learning approaches, significantly improving attack detection accuracy, reducing response times, and ensuring operational continuity. The results underscore the effectiveness of combining hierarchical multi-agent coordination with adversarially-aware training to enhance the resilience and security of next-generation CPS.

Paperid: 3581, https://arxiv.org/pdf/2506.22323.pdf

Abstract:
A sophisticated malspam campaign was recently uncovered targeting Latin American countries, with a particular focus on Brazil. This operation utilizes a highly deceptive phishing email to trick users into executing a malicious MSI file, initiating a multi-stage infection. The core of the attack leverages DLL side-loading, where a legitimate executable from Valve Corporation is used to load a trojanized DLL, thereby bypassing standard security defenses. Once active, the malware, a variant of QuasarRAT known as BlotchyQuasar, is capable of a wide range of malicious activities. It is designed to steal sensitive browser-stored credentials and banking information, the latter through fake login windows mimicking well-known Brazilian banks. The threat establishes persistence by modifying the Windows registry , captures user keystrokes through keylogging , and exfiltrates stolen data to a Command-and-Control (C2) server using encrypted payloads. Despite its advanced capabilities, the malware code exhibits signs of rushed development, with inefficiencies and poor error handling that suggest the threat actors prioritized rapid deployment over meticulous design. Nonetheless, the campaign extensive reach and sophisticated mechanisms pose a serious and immediate threat to the targeted regions, underscoring the need for robust cybersecurity defenses.

Paperid: 3582, https://arxiv.org/pdf/2506.22180.pdf

Abstract:
The industrial market continuously needs reliable solutions to secure autonomous systems. Especially as these systems become more complex and interconnected, reliable security solutions are becoming increasingly important. One promising solution to tackle this challenge is using smart contracts designed to meet contractual conditions, avoid malicious errors, secure exchanges, and minimize the need for reliable intermediaries. However, smart contracts are immutable. Moreover, there are different smart contract execution architectures (namely Order-Execute and Execute-Order-Validate) that have different throughputs. In this study, we developed an evaluation model for assessing the security of reliable smart contract execution. We then developed a realistic smart contract enabled IoT energy case study. Finally, we simulate the developed case study to evaluate several smart contract security vulnerabilities reported in the literature. Our results show that the Execute-Order-Validate architecture is more promising regarding reliability and security.

Paperid: 3583, https://arxiv.org/pdf/2506.21114.pdf

Abstract:
To cater to the needs of (Zero Knowledge) proofs for (mathematical) proofs, we describe a method to transform formal sentences in 2x2-matrices over multivariate polynomials with integer coefficients, such that usual proof-steps like modus-ponens or the substitution are easy to compute from the matrices corresponding to the terms or formulas used as arguments. By evaluating the polynomial variables in random elements of a suitably chosen finite field, the proof is replaced by a numeric sequence. Only the values corresponding to the axioms have to be computed from scratch. The values corresponding to derived formulas are computed from the values corresponding to their ancestors by applying the homomorphic properties. On such sequences, various Zero Knowledge methods can be applied.

Paperid: 3584, https://arxiv.org/pdf/2506.20965.pdf

Abstract:
This paper integrates Austrian capital theory with repeated game theory to examine strategic miner behaviour under different institutional conditions in blockchain systems. It shows that when protocol rules are mutable, effective time preference rises, undermining rational long-term planning and cooperative equilibria. Using formal game-theoretic analysis and Austrian economic principles, the paper demonstrates how mutable protocols shift miner incentives from productive investment to political rent-seeking and influence games. The original Bitcoin protocol is interpreted as an institutional anchor: a fixed rule-set enabling calculability and low time preference. Drawing on the work of Bohm-Bawerk, Mises, and Hayek, the argument is made that protocol immutability is essential for restoring strategic coherence, entrepreneurial confidence, and sustainable network equilibrium.

Paperid: 3585, https://arxiv.org/pdf/2506.19934.pdf

Abstract:
Cybersecurity is one of the foremost challenges facing the world of cloud computing. Recently, the widespread adoption of smart devices in cloud computing environments that provide Internet-based services has become prevalent. Therefore, it is essential to consider the security threats in these environments. The use of intrusion detection systems can mitigate the vulnerabilities of these systems. Furthermore, hybrid intrusion detection systems can provide better protection compared to conventional intrusion detection systems. These systems manage issues related to complexity, dimensionality, and performance. This research aims to propose a Hybrid Intrusion Detection System (HyIDS) that identifies and mitigates initial threats. The main innovation of this research is the introduction of a new method for hybrid intrusion detection systems (HyIDS). For this purpose, an Energy-Valley Optimizer (EVO) is used to select an optimal feature set, which is then classified using supervised machine learning models. The proposed approach is evaluated using the CIC_DDoS2019, CSE_CIC_DDoS2018, and NSL-KDD datasets. For evaluation and testing, the proposed system has been run for a total of 32 times. The results of the proposed approach are compared with the Grey Wolf Optimizer (GWO). With the CIC_DDoS2019 dataset, the D_TreeEVO model achieves an accuracy of 99.13% and a detection rate of 98.941%. Furthermore, this result reaches 99.78% for the CSE_CIC_DDoS2018 dataset. In comparison to NSL-KDD, it has an accuracy of 99.50% and a detection rate (DT) of 99.48%. For feature selection, EVO outperforms GWO. The results of this research indicate that EVO yields better results as an optimizer for HyIDS performance.

Paperid: 3586, https://arxiv.org/pdf/2506.19881.pdf

Abstract:
Are there any conditions under which a generative model's outputs are guaranteed not to infringe the copyrights of its training data? This is the question of "provable copyright protection" first posed by Vyas, Kakade, and Barak (ICML 2023). They define near access-freeness (NAF) and propose it as sufficient for protection. This paper revisits the question and establishes new foundations for provable copyright protection -- foundations that are firmer both technically and legally. First, we show that NAF alone does not prevent infringement. In fact, NAF models can enable verbatim copying, a blatant failure of copy protection that we dub being tainted. Then, we introduce our blameless copy protection framework for defining meaningful guarantees, and instantiate it with clean-room copy protection. Clean-room copy protection allows a user to control their risk of copying by behaving in a way that is unlikely to copy in a counterfactual clean-room setting. Finally, we formalize a common intuition about differential privacy and copyright by proving that DP implies clean-room copy protection when the dataset is golden, a copyright deduplication requirement.

Paperid: 3587, https://arxiv.org/pdf/2506.18848.pdf

Abstract:
An interleaving sequence is obtained by combining or intertwining elements from two or more sequences. On the other hand, cellular automata are known to be generators for keystream sequences. In this paper we present two families of one-dimensional cellular automata as generators of interleaving sequences. This study aims to close a notable gap within the current body of literature by exploring the capacity of cellular automata to generate interleaving sequences. While previous works have separately examined cellular automata as sequence generators and interleaving sequences, there exists limited literature interconnecting these two topics. Our study seeks to bridge this gap, providing perspectives on the generation of interleaving sequences through the utilisation of cellular automata, thereby fostering a deeper understanding of both disciplines.

Paperid: 3588, https://arxiv.org/pdf/2506.18780.pdf

Abstract:
High-confidence computing relies on trusted instructional set architecture, sealed kernels, and secure operating systems. Cloud computing depends on trusted systems for virtualization tasks. Branch predictions and pipelines are essential in improving performance of a CPU/GPU. But Spectre and Meltdown make modern processors vulnerable to be exploited. Disabling the prediction and pipeline is definitely not a good solution. On the other hand, current software patches can only address non-essential issues around Meltdown. This paper introduces a holistic approach in trusted computer architecture design and emulation.

Paperid: 3589, https://arxiv.org/pdf/2506.18189.pdf

Abstract:
Initially introduced to Ethereum via Flashbots' MEV-boost, Proposer-Builder Separation allows proposers to auction off blockspace to a market of transaction orderers, known as builders. PBS is currently available to validators through the aforementioned MEV-boost, but its unregulated and relay-dependent nature has much of the Ethereum community calling for its enshrinement. Providing a protocol-integrated PBS marketspace and communication channel for payload outsourcing is termed PBS enshrinement. Although ePBS potentially introduces native MEV mitigation mechanisms and reduces validator operation costs, fears of multiparty collusion and chain stagnation are all too real. In addition to mitigating these potential drawbacks, PBS research pursues many tenets revered by Web3 enthusiasts, including but not limited to, censorship resistance, validator reward equity, and deflationary finance. The subsequent SoK will identify current PBS mechanisms, the need for enshrinement, additions to the ePBS upgrade, and the existing or potential on-chain socioeconomic implications of each.

Paperid: 3590, https://arxiv.org/pdf/2506.17245.pdf

Abstract:
SQL injection (SQLi) remains a critical vulnerability in web applications, enabling attackers to manipulate databases through malicious inputs. Despite advancements in mitigation techniques, the evolving complexity of web applications and attack strategies continues to pose significant risks. This paper presents a comprehensive penetration testing methodology to identify, exploit, and mitigate SQLi vulnerabilities in a PHP-MySQL-based web application. Utilizing tools such as OWASP ZAP, sqlmap, and Nmap, the study demonstrates a systematic approach to vulnerability assessment and remediation. The findings underscore the efficacy of input sanitization and prepared statements in mitigating SQLi risks, while highlighting the need for ongoing security assessments to address emerging threats. The study contributes to the field by providing practical insights into effective detection and prevention strategies, supported by a real-world case study.

Paperid: 3591, https://arxiv.org/pdf/2506.17236.pdf

Abstract:
The present dissertation addresses the problem of fairly distributing shared resources in non-commercial blockchain networks. Blockchains are distributed systems that order and timestamp records of a given network of users, in a public, cryptographically secure, and consensual way. The records, which may in kind be events, transaction orders, sets of rules for structured transactions etc. are placed within well-defined datastructures called blocks, and they are linked to each other by the virtue of cryptographic pointers, in a total ordering which represents their temporal relations of succession. The ability to operate on the blockchain, and/or to contribute a record to the content of a block are shared resources of the blockchain systems. In commercial networks, these resources are exchanged in return for fiat money, and consequently, fairness is not a relevant problem in terms of computer engineering. In non-commercial networks, however, monetary solutions are not available, by definition. The present non-commercial blockchain networks employ trivial distribution mechanisms called faucets, which offer fixed amounts of free tokens (called cryptocurrencies) specific to the given network. This mechanism, although simple and efficient, is prone to denial of service (DoS) attacks and cannot address the fairness problem. In the present dissertation, the faucet mechanism is adapted for fair distribution, in line with Max-min Fairness scheme. In total, we contributed 6 distinct Max-min Fair algorithms as efficient blockchain faucets. The algorithms we contribute are resistant to DoS attacks, low-cost in terms of blockchain computation economics, and they also allow for different user weighting policies.

Paperid: 3592, https://arxiv.org/pdf/2506.17233.pdf

Abstract:
We present a new approach to RSA factorization inspired by geometric interpretations and square differences. This method reformulates the problem in terms of the distance between perfect squares and provides a recurrence relation that allows rapid convergence when the RSA modulus has closely spaced prime factors. Although this method is efficient for small semiprimes, it does not yet succeed in factoring large challenges like RSA-100 in practical time, highlighting both its potential and current limitations.

Paperid: 3593, https://arxiv.org/pdf/2506.15343.pdf

Abstract:
Offensive Robot Cybersecurity introduces a groundbreaking approach by advocating for offensive security methods empowered by means of automation. It emphasizes the necessity of understanding attackers' tactics and identifying vulnerabilities in advance to develop effective defenses, thereby improving robots' security posture. This thesis leverages a decade of robotics experience, employing Machine Learning and Game Theory to streamline the vulnerability identification and exploitation process. Intrinsically, the thesis uncovers a profound connection between robotic architecture and cybersecurity, highlighting that the design and creation aspect of robotics deeply intertwines with its protection against attacks. This duality -- whereby the architecture that shapes robot behavior and capabilities also necessitates a defense mechanism through offensive and defensive cybersecurity strategies -- creates a unique equilibrium. Approaching cybersecurity with a dual perspective of defense and attack, rooted in an understanding of systems architecture, has been pivotal. Through comprehensive analysis, including ethical considerations, the development of security tools, and executing cyber attacks on robot software, hardware, and industry deployments, this thesis proposes a novel architecture for cybersecurity cognitive engines. These engines, powered by advanced game theory and machine learning, pave the way for autonomous offensive cybersecurity strategies for robots, marking a significant shift towards self-defending robotic systems. This research not only underscores the importance of offensive measures in enhancing robot cybersecurity but also sets the stage for future advancements where robots are not just resilient to cyber threats but are equipped to autonomously safeguard themselves.

Paperid: 3594, https://arxiv.org/pdf/2506.15043.pdf

Abstract:
Advancements in the defense industry are paramount for ensuring the safety and security of nations, providing robust protection against emerging threats. Among these threats, hypersonic missiles pose a significant challenge due to their extreme speeds and maneuverability, making accurate trajectory prediction a critical necessity for effective countermeasures. This paper addresses this challenge by employing a novel hybrid deep learning approach, integrating Convolutional Neural Networks (CNNs), Long Short-Term Memory (LSTM) networks, and Gated Recurrent Units (GRUs). By leveraging the strengths of these architectures, the proposed method successfully predicts the complex trajectories of hypersonic missiles with high accuracy, offering a significant contribution to defense strategies and missile interception technologies. This research demonstrates the potential of advanced machine learning techniques in enhancing the predictive capabilities of defense systems.

Paperid: 3595, https://arxiv.org/pdf/2506.14944.pdf

Abstract:
The Fair Data Exchange (FDE) protocol (CCS '24) provides atomic pay-per-file transfers with constant-size proofs, yet existing implementations remain unscalably slow (about 1 second per 4 KiB) and inflate ciphertexts by 10--50x. We introduce an FDE implementation that achieves near-plaintext speeds and sizes, making fair exchange practical even for gigabyte-scale files. Our approach leverages two key insights. First, we observe that a KZG commitment to polynomial evaluations implicitly (and without modification) also binds to the Reed--Solomon (RS) codeword of its coefficients, enabling sound and efficient randomized verification. Second, while heavyweight encryption schemes such as exponential ElGamal enable compact proofs linking ciphertexts to the commitment, they are unnecessary for direct data recovery. Exploiting these insights, we apply a lightweight hash-derived mask to the entire RS-extended codeword, and perform ElGamal encryption only on a pseudorandom Î(lambda) subset of symbols, where lambda is the security parameter (e.g., 128). Data recovery occurs by simply removing the lightweight masks, with ElGamal ciphertexts serving exclusively for verification proofs. A heavyweight (but constant-time) zk-SNARK ensures consistency between these two encryption layers at sampled positions, sharply reducing bandwidth overhead and computational cost. In addition, we show how a constant-time (and precomputable) zk-SNARK linking a BLS12-381 secret key to a secp256k1 hash pre-image resolves Bitcoin's elliptic-curve mismatch, enabling fully off-chain execution via the Lightning Network. This can reduce transaction fees from roughly $10 to under $0.01 and shortens transaction latency from tens of seconds on Ethereum down to about a second or less.

Paperid: 3596, https://arxiv.org/pdf/2506.14576.pdf

Abstract:
As artificial intelligence (AI) continues to permeate various sectors, safeguarding personal and sensitive data has become increasingly crucial. To address these concerns, privacy-enhancing technologies (PETs) have emerged as a suite of digital tools that enable data collection and processing while preserving privacy. This paper explores the current landscape of data privacy in the context of AI, reviews the integration of PETs within AI systems, and assesses both their achievements and the challenges that remain.

Paperid: 3597, https://arxiv.org/pdf/2506.14566.pdf

Abstract:
In today's digital age, personal data is constantly at risk of compromise. Attribute-Based Encryption (ABE) has emerged as a promising approach to privacy-preserving data protection. This paper proposes an anonymous authentication mechanism based on ABE, which allows users to authenticate without revealing their identity. The mechanism adds a privacy-preserving layer by enabling authorization based solely on user attributes. The proposed approach is implemented using OpenID Connect, demonstrating its feasibility in real-world systems.

Paperid: 3598, https://arxiv.org/pdf/2506.14197.pdf

Abstract:
This paper formally examines the network structure of Bitcoin CORE (BTC) and Bitcoin Satoshi Vision (BSV) using complex graph theory to demonstrate that home-hosted full nodes are incapable of participating in or influencing the propagation topology. Leveraging established models such as scale-free networks and small-world connectivity, we demonstrate that the propagation graph is dominated by a densely interconnected miner clique, while full nodes reside on the periphery, excluded from all transaction-to-block inclusion paths. Using simulation-backed metrics and eigenvalue centrality analysis, we confirm that full nodes are neither critical nor operationally relevant for consensus propagation.

Paperid: 3599, https://arxiv.org/pdf/2506.13246.pdf

Abstract:
This paper presents a formalised architecture for synthetic agents designed to retain immutable memory, verifiable reasoning, and constrained epistemic growth. Traditional AI systems rely on mutable, opaque statistical models prone to epistemic drift and historical revisionism. In contrast, we introduce the concept of the Merkle Automaton, a cryptographically anchored, deterministic computational framework that integrates formal automata theory with blockchain-based commitments. Each agent transition, memory fragment, and reasoning step is committed within a Merkle structure rooted on-chain, rendering it non-repudiable and auditably permanent. To ensure selective access and confidentiality, we derive symmetric encryption keys from ECDH exchanges contextualised by hierarchical privilege lattices. This enforces cryptographic access control over append-only DAG-structured knowledge graphs. Reasoning is constrained by formal logic systems and verified through deterministic traversal of policy-encoded structures. Updates are non-destructive and historied, preserving epistemic lineage without catastrophic forgetting. Zero-knowledge proofs facilitate verifiable, privacy-preserving inclusion attestations. Collectively, this architecture reframes memory not as a cache but as a ledger - one whose contents are enforced by protocol, bound by cryptography, and constrained by formal logic. The result is not an intelligent agent that mimics thought, but an epistemic entity whose outputs are provably derived, temporally anchored, and impervious to post hoc revision. This design lays foundational groundwork for legal, economic, and high-assurance computational systems that require provable memory, unforgeable provenance, and structural truth.

Paperid: 3600, https://arxiv.org/pdf/2506.12685.pdf

Abstract:
Large Language Models (LLMs) have demonstrated remarkable capabilities, yet their susceptibility to adversarial attacks, particularly jailbreaking, poses significant safety and ethical concerns. While numerous jailbreak methods exist, many suffer from computational expense, high token usage, or complex decoding schemes. Liu et al. (2024) introduced FlipAttack, a black-box method that achieves high attack success rates (ASR) through simple prompt manipulation. This paper investigates the underlying mechanisms of FlipAttack's effectiveness by analyzing the semantic changes induced by its flipping modes. We hypothesize that semantic dissimilarity between original and manipulated prompts is inversely correlated with ASR. To test this, we examine embedding space visualizations (UMAP, KDE) and cosine similarities for FlipAttack's modes. Furthermore, we introduce a novel adversarial attack, Alphabet Index Mapping (AIM), designed to maximize semantic dissimilarity while maintaining simple decodability. Experiments on GPT-4 using a subset of AdvBench show AIM and its variant AIM+FWO achieve a 94% ASR, outperforming FlipAttack and other methods on this subset. Our findings suggest that while high semantic dissimilarity is crucial, a balance with decoding simplicity is key for successful jailbreaking. This work contributes to a deeper understanding of adversarial prompt mechanics and offers a new, effective jailbreak technique.

Paperid: 3601, https://arxiv.org/pdf/2506.12328.pdf

Abstract:
Recent work~\cite{Liu2016} has shown that dependencies between items in a dataset can lead to privacy leaks. We extend this concept to privacy-preserving transformations, considering a broader set of dependencies captured by correlation metrics. Specifically, we measure the correlation between the original data and their noisy responses from a randomizer as an indicator of potential privacy breaches. This paper aims to leverage information-theoretic measures, such as the Maximal Information Coefficient (MIC), to estimate privacy leaks and derive novel, computationally efficient privacy leak estimators. We extend the $Ï_1$-to-$Ï_2$ formulation~\cite{Evfimievski2003} to incorporate entropy, mutual information, and the degree of anonymity for a more comprehensive measure of privacy risk. Our proposed hybrid metric can identify correlation dependencies between attributes in the dataset, serving as a proxy for privacy leak vulnerabilities. This metric provides a computationally efficient worst-case measure of privacy loss, utilizing the inherent characteristics of the data to prevent privacy breaches.

Paperid: 3602, https://arxiv.org/pdf/2506.12088.pdf

Abstract:
Large Language Models (LLMs) and generative AI (GenAI) systems, such as ChatGPT, Claude, Gemini, LLaMA, and Copilot (by OpenAI, Anthropic, Google, Meta, and Microsoft, respectively), are reshaping digital platforms and app ecosystems while introducing critical challenges in cybersecurity, privacy, and platform integrity. Our analysis reveals alarming trends: LLM-assisted malware is projected to rise from 2% (2021) to 50% (2025); AI-generated Google reviews grew nearly tenfold (1.2% in 2021 to 12.21% in 2023, expected to reach 30% by 2025); AI scam reports surged 456%; misinformation sites increased over 1500%; and deepfake attacks are projected to rise over 900% in 2025. In finance, LLM-driven threats like synthetic identity fraud and AI-generated scams are accelerating. Platforms such as JPMorgan Chase, Stripe, and Plaid deploy LLMs for fraud detection, regulation parsing, and KYC/AML automation, reducing fraud loss by up to 21% and accelerating onboarding by 40-60%. LLM-facilitated code development has driven mobile app submissions from 1.8 million (2020) to 3.0 million (2024), projected to reach 3.6 million (2025). To address AI threats, platforms like Google Play, Apple App Store, GitHub Copilot, TikTok, Facebook, and Amazon deploy LLM-based defenses, highlighting their dual nature as both threat sources and mitigation tools. In clinical diagnostics, LLMs raise concerns about accuracy, bias, and safety, necessitating strong governance. Drawing on 445 references, this paper surveys LLM/GenAI and proposes a strategic roadmap and operational blueprint integrating policy auditing (such as CCPA and GDPR compliance), fraud detection, and demonstrates an advanced LLM-DA stack with modular components, multi-LLM routing, agentic memory, and governance layers. We provide actionable insights, best practices, and real-world case studies for scalable trust and responsible innovation.

Paperid: 3603, https://arxiv.org/pdf/2506.12060.pdf

Abstract:
Cybersecurity organizations are adapting to GenAI integration through modified frameworks and hybrid operational processes, with success influenced by existing security maturity, regulatory requirements, and investments in human capital and infrastructure. This qualitative research employs systematic document analysis and comparative case study methodology to examine how cybersecurity organizations adapt their threat modeling frameworks and operational processes to address generative artificial intelligence integration. Through examination of 25 studies from 2022 to 2025, the research documents substantial transformation in organizational approaches to threat modeling, moving from traditional signature-based systems toward frameworks incorporating artificial intelligence capabilities. The research identifies three primary adaptation patterns: Large Language Model integration for security applications, GenAI frameworks for risk detection and response automation, and AI/ML integration for threat hunting. Organizations with mature security infrastructures, particularly in finance and critical infrastructure sectors, demonstrate higher readiness through structured governance approaches, dedicated AI teams, and robust incident response processes. Organizations achieve successful GenAI integration when they maintain appropriate human oversight of automated systems, address data quality concerns and explainability requirements, and establish governance frameworks tailored to their specific sectors. Organizations encounter ongoing difficulties with privacy protection, bias reduction, personnel training, and defending against adversarial attacks. This work advances understanding of how organizations adopt innovative technologies in high-stakes environments and offers actionable insights for cybersecurity professionals implementing GenAI systems.

Paperid: 3604, https://arxiv.org/pdf/2506.12032.pdf

Abstract:
We present a robust neural watermarking framework for scientific data integrity, targeting high-dimensional fields common in climate modeling and fluid simulations. Using a convolutional autoencoder, binary messages are invisibly embedded into structured data such as temperature, vorticity, and geopotential. Our method ensures watermark persistence under lossy transformations - including noise injection, cropping, and compression - while maintaining near-original fidelity (sub-1\% MSE). Compared to classical singular value decomposition (SVD)-based watermarking, our approach achieves $>$98\% bit accuracy and visually indistinguishable reconstructions across ERA5 and Navier-Stokes datasets. This system offers a scalable, model-compatible tool for data provenance, auditability, and traceability in high-performance scientific workflows, and contributes to the broader goal of securing AI systems through verifiable, physics-aware watermarking. We evaluate on physically grounded scientific datasets as a representative stress-test; the framework extends naturally to other structured domains such as satellite imagery and autonomous-vehicle perception streams.

Paperid: 3605, https://arxiv.org/pdf/2506.11954.pdf

Abstract:
We present a technical evaluation of a new, disruptive cryptographic approach to data security, known as HbHAI (Hash-based Homomorphic Artificial Intelligence). HbHAI is based on a novel class of key-dependent hash functions that naturally preserve most similarity properties, most AI algorithms rely on. As a main claim, HbHAI makes now possible to analyze and process data in its cryptographically secure form while using existing native AI algorithms without modification, with unprecedented performances compared to existing homomorphic encryption schemes. We tested various HbHAI-protected datasets (non public preview) using traditional unsupervised and supervised learning techniques (clustering, classification, deep neural networks) with classical unmodified AI algorithms. This paper presents technical results from an independent analysis conducted with those different, off-the-shelf AI algorithms. The aim was to assess the security, operability and performance claims regarding HbHAI techniques. As a results, our results confirm most these claims, with only a few minor reservations.

Paperid: 3606, https://arxiv.org/pdf/2506.10755.pdf

Abstract:
Azure RBAC leverages wildcard permissions to simplify policy authoring, but this abstraction often obscures the actual set of allowed operations and undermines least-privilege guarantees. We introduce Belshazaar, a two-stage framework that targets both the effective permission set problem and the evaluation of wildcards permissions spread. First, we formalize Azure action syntax via a context free grammar and implement a compiler that expands any wildcard into its explicit action set. Second, we define an ultrametric diameter metric to quantify semantic overreach in wildcard scenarios. Applied to Microsoft s official catalog of 15481 actions, Belshazaar reveals that about 50 percent of actions admit a cross Resource Provider reach when associated with non obvious wildcards, and that effective permissions sets are effectively computable. These findings demonstrate that wildcard patterns can introduce substantial privilege bloat, and that our approach offers a scalable, semantics driven path toward tighter, least-privilege RBAC policies in Azure environments.

Paperid: 3607, https://arxiv.org/pdf/2506.10467.pdf

Abstract:
Recent advancements in LLMs indicate potential for novel applications, as evidenced by the reasoning capabilities in the latest OpenAI and DeepSeek models. To apply these models to domain-specific applications beyond text generation, LLM-based multi-agent systems can be utilized to solve complex tasks, particularly by combining reasoning techniques, code generation, and software execution across multiple, potentially specialized LLMs. However, while many evaluations are performed on LLMs, reasoning techniques, and applications individually, their joint specification and combined application are not well understood. Defined specifications for multi-agent LLM systems are required to explore their potential and suitability for specific applications, allowing for systematic evaluations of LLMs, reasoning techniques, and related aspects. This paper reports the results of exploratory research on (1.) multi-agent specification by introducing an agent schema language and (2.) the execution and evaluation of the specifications through a multi-agent system architecture and prototype. The specification language, system architecture, and prototype are first presented in this work, building on an LLM system from prior research. Test cases involving cybersecurity tasks indicate the feasibility of the architecture and evaluation approach. As a result, evaluations could be demonstrated for question answering, server security, and network security tasks completed correctly by agents with LLMs from OpenAI and DeepSeek.

Paperid: 3608, https://arxiv.org/pdf/2506.10338.pdf

Abstract:
Distributed broadcast encryption (DBE) is a specific kind of broadcast encryption (BE) where users independently generate their own public and private keys, and a sender can efficiently create a ciphertext for a subset of users by using the public keys of the subset users. Previously proposed DBE schemes have been proven in the adaptive chosen-plaintext attack (CPA) security model and have the disadvantage of requiring linear number of pairing operations when verifying the public key of a user. In this paper, we propose an efficient DBE scheme in bilinear groups and prove adaptive chosen-ciphertext attack (CCA) security for the first time. To do this, we first propose a semi-static CCA secure DBE scheme and prove the security under the $q$-Type assumption. Then, by modifying the generic transformation of Gentry and Waters that converts a semi-static CPA secure DBE scheme into an adaptive CPA secure DBE scheme to be applied to CCA secure DBE schemes, we propose an adaptive CCA secure DBE scheme and prove its adaptive CCA security. Our proposed DBE scheme is efficient because it requires constant size ciphertexts, constant size private keys, and linear size public keys, and the public key verification requires only a constant number of pairing operations and efficient group membership checks.

Paperid: 3609, https://arxiv.org/pdf/2506.10042.pdf

Abstract:
In an era of increasing interaction with artificial intelligence (AI), users face evolving privacy decisions shaped by complex, uncertain factors. This paper introduces Multiverse Privacy Theory, a novel framework in which each privacy decision spawns a parallel universe, representing a distinct potential outcome based on user choices over time. By simulating these universes, this theory provides a foundation for understanding privacy through the lens of contextual integrity, evolving preferences, and probabilistic decision-making. Future work will explore its application using real-world, scenario-based survey data.

Paperid: 3610, https://arxiv.org/pdf/2506.10039.pdf

Abstract:
We present a symbolic identity for generating integer triples $(a, b, c)$ satisfying $a + b = c$, inspired by structural features of the \emph{abc conjecture}. The construction uses powers of $2$ and $3$ in combination with modular inversion in $\mathbb{Z}/3^p\mathbb{Z}$, leading to a parametric identity with residue constraints that yield abc-triples exhibiting low radical values. Through affine transformations, these symbolic triples are embedded into a broader space of high-quality examples, optimised for the ratio $\log c / \log \operatorname{rad}(abc)$. Computational results demonstrate the emergence of structured, radical-minimising candidates, including both known and novel triples. These methods provide a symbolic and algebraic framework for controlled triple generation, and suggest exploratory implications for symbolic entropy filtering in cryptographic pre-processing.

Paperid: 3611, https://arxiv.org/pdf/2506.10029.pdf

Abstract:
Large Language models (LLMs) are transforming digital usage, particularly in text generation, image creation, information retrieval and code development. ChatGPT, launched by OpenAI in November 2022, quickly became a reference, prompting the emergence of competitors such as Google's Gemini. However, these technological advances raise new cybersecurity challenges, including prompt injection attacks, the circumvention of regulatory measures (jailbreaking), the spread of misinformation (hallucinations) and risks associated with deep fakes. This paper presents a comparative analysis of the security and alignment levels of ChatGPT and Gemini, as well as a taxonomy of jailbreak techniques associated with experiments.

Paperid: 3612, https://arxiv.org/pdf/2506.09825.pdf

Abstract:
We establish a fundamental impossibility result for a `perfect hypervisor', one that (1) preserves every observable behavior of any program exactly as on bare metal and (2) adds zero timing or resource overhead. Within this model we prove two theorems. (1) Indetectability Theorem. If such a hypervisor existed, no guest-level program, measurement, or timing test could distinguish it from native execution; all traces, outputs, and timings would be identical. (2) Impossibility Theorem. Despite that theoretical indetectability, a perfect hypervisor cannot exist on any machine with finite computational resources. These results are architecture-agnostic and extend beyond hypervisors to any virtualization layer emulators, sandboxes, containers, or runtime-instrumentation frameworks. Together they provide a formal foundation for future work on the principles and limits of virtualization.

Paperid: 3613, https://arxiv.org/pdf/2506.08866.pdf

Abstract:
Air-gapped systems are considered highly secure against data leaks due to their physical isolation from external networks. Despite this protection, ultrasonic communication has been demonstrated as an effective method for exfiltrating data from such systems. While smartphones have been extensively studied in the context of ultrasonic covert channels, smartwatches remain an underexplored yet effective attack vector. In this paper, we propose and evaluate SmartAttack, a novel method that leverages smartwatches as receivers for ultrasonic covert communication in air-gapped environments. Our approach utilizes the built-in microphones of smartwatches to capture covert signals in real time within the ultrasonic frequency range of 18-22 kHz. Through experimental validation, we assess the feasibility of this attack under varying environmental conditions, distances, orientations, and noise levels. Furthermore, we analyze smartwatch-specific factors that influence ultrasonic covert channels, including their continuous presence on the user's wrist, the impact of the human body on signal propagation, and the directional constraints of built-in microphones. Our findings highlight the security risks posed by smartwatches in high-security environments and outline mitigation strategies to counteract this emerging threat.

Paperid: 3614, https://arxiv.org/pdf/2506.08828.pdf

Abstract:
Digital terrorism is a major cause of securing patient/healthcare providers data and information. Sensitive topics that may have an impact on a patient's health or even national security include patient health records and information on healthcare providers. Health databases and data sets have been continually breached by many, regular assaults, as well as local and remote servers equipped with wireless sensor networks (WSNs) in diverse locations. The problem was addressed by some contemporary strategies that were created to stop these assaults and guarantee the privacy of patient data and information transferred and gathered by sensors. Nevertheless, the literature analysis outlines many indications of weakness that persist in these methods. This study suggests a novel, reliable method that bolsters the information security and data gathered by sensors and kept on base station datasets. The proposed approach combines a number of security mechanisms, including symmetric cryptography for encryption, asymmetric cryptography for access control and signatures, and the Lesamnta-LW method in the signature process. Users' information is shielded from prying eyes by the careful application of these measures and a sound approach. Investigational comparisons, security studies, and thorough results show that the suggested method is better than earlier methods.

Paperid: 3615, https://arxiv.org/pdf/2506.06735.pdf

Abstract:
Smart contracts, integral to blockchain ecosystems, enable decentralized applications to execute predefined operations without intermediaries. Their ability to enforce trustless interactions has made them a core component of platforms such as Ethereum. Vulnerabilities such as numerical overflows, reentrancy attacks, and improper access permissions have led to the loss of millions of dollars throughout the blockchain and smart contract sector. Traditional smart contract auditing techniques such as manual code reviews and formal verification face limitations in scalability, automation, and adaptability to evolving development patterns. As a result, AI-based solutions have emerged as a promising alternative, offering the ability to learn complex patterns, detect subtle flaws, and provide scalable security assurances. This paper examines novel AI-driven techniques for vulnerability detection in smart contracts, focusing on machine learning, deep learning, graph neural networks, and transformer-based models. This paper analyzes how each technique represents code, processes semantic information, and responds to real world vulnerability classes. We also compare their strengths and weaknesses in terms of accuracy, interpretability, computational overhead, and real time applicability. Lastly, it highlights open challenges and future opportunities for advancing this domain.

Paperid: 3616, https://arxiv.org/pdf/2506.05844.pdf

Abstract:
Network Intrusion Detection Systems (NIDS) face challenges due to class imbalance, affecting their ability to detect novel and rare attacks. This paper proposes a Dual-Conditional Batch Normalization Variational Autoencoder ($\text{C}^{2}\text{BNVAE}$) for generating balanced and labeled network traffic data. $\text{C}^{2}\text{BNVAE}$ improves the model's adaptability to different data categories and generates realistic category-specific data by incorporating Conditional Batch Normalization (CBN) into the Conditional Variational Autoencoder (CVAE). Experiments on the NSL-KDD dataset show the potential of $\text{C}^{2}\text{BNVAE}$ in addressing imbalance and improving NIDS performance with lower computational overhead compared to some baselines.

Paperid: 3617, https://arxiv.org/pdf/2506.05416.pdf

Abstract:
We revisit 1-bit gradient compression through the lens of mutual-information differential privacy (MI-DP). Building on signSGD, we propose FERRET--Fast and Effective Restricted Release for Ethical Training--which transmits at most one sign bit per parameter group with Bernoulli masking. Theory: We prove each fired group leaks at most ln 2 nats; after subsampling with rate s, the total privacy loss of G groups trained for T steps with firing probability p is epsilon = G * T * s * p * ln 2. Thus FERRET achieves MI-DP for epsilon in [0.1, 2] without additive noise. Practice: We evaluate three granularities--FERRET-MAX (finest), FERRET-EIGHTH (medium), and FERRET-2 (coarsest)--on five LLMs (137M-1.8B parameters) against DPSGD and Non-DP baselines. All methods trained for 1, 3, and 5 epochs. Utility: Across all settings, FERRET-MAX/EIGHTH beat DPSGD's perplexity. At epsilon=0.5, 5 epochs: FERRET-EIGHTH achieves 3.98 perplexity vs DPSGD's 11.61 (2.9x better), within 23% of Non-DP (3.25). Privacy: MI-AUC stays at chance for FERRET-MAX/EIGHTH (~0.51), matching DPSGD vs Non-DP's 0.76-0.99. FERRET-2 shows higher leakage (~0.55) due to lower headroom. Efficiency: Stricter budgets fire fewer signs, so FERRET uses 19-33% of DPSGD's training time and only 34-36% of Non-DP training time. Take-away: Sign-based MI-DP gets closer to achieving all three qualities of the privacy, utility, performance trilemma: FERRET trains up to 5x faster, achieves 3x lower perplexity compared to DPSGD and 1.2x greater than Non-DP, all while providing formal, mathematically provable privacy guarantees using zero additive noise. The results also show that, in certain instances, masked 1-bit updates can match non-private training utility while safeguarding data.

Paperid: 3618, https://arxiv.org/pdf/2506.05358.pdf

Abstract:
Multimodal Large Language Models (MLLMs) like GPT-4V are capable of reasoning across text and image modalities, showing promise in a variety of complex vision-language tasks. In this preliminary study, we investigate the out-of-the-box capabilities of GPT-4V in the domain of image forensics, specifically, in detecting image splicing manipulations. Without any task-specific fine-tuning, we evaluate GPT-4V using three prompting strategies: Zero-Shot (ZS), Few-Shot (FS), and Chain-of-Thought (CoT), applied over a curated subset of the CASIA v2.0 splicing dataset. Our results show that GPT-4V achieves competitive detection performance in zero-shot settings (more than 85% accuracy), with CoT prompting yielding the most balanced trade-off across authentic and spliced images. Qualitative analysis further reveals that the model not only detects low-level visual artifacts but also draws upon real-world contextual knowledge such as object scale, semantic consistency, and architectural facts, to identify implausible composites. While GPT-4V lags behind specialized state-of-the-art splicing detection models, its generalizability, interpretability, and encyclopedic reasoning highlight its potential as a flexible tool in image forensics.

Paperid: 3619, https://arxiv.org/pdf/2506.05356.pdf

Abstract:
The growing complexity of cyber threats has rendered static firewalls increasingly ineffective for dynamic, real-time intrusion prevention. This paper proposes a novel AI-driven dynamic firewall optimization framework that leverages deep reinforcement learning (DRL) to autonomously adapt and update firewall rules in response to evolving network threats. Our system employs a Markov Decision Process (MDP) formulation, where the RL agent observes network states, detects anomalies using a hybrid LSTM-CNN model, and dynamically modifies firewall configurations to mitigate risks. We train and evaluate our framework on the NSL-KDD and CIC-IDS2017 datasets using a simulated software-defined network environment. Results demonstrate significant improvements in detection accuracy, false positive reduction, and rule update latency when compared to traditional signature- and behavior-based firewalls. The proposed method provides a scalable, autonomous solution for enhancing network resilience against complex attack vectors in both enterprise and critical infrastructure settings.

Paperid: 3620, https://arxiv.org/pdf/2506.05355.pdf

Abstract:
Vehicular Fog Computing (VFC) is a promising paradigm to meet the low-latency and high-bandwidth demands of Intelligent Transportation Systems (ITS). However, dynamic vehicle mobility and diverse trust boundaries introduce critical security challenges. This paper presents a novel Zero-Trust Mobility-Aware Authentication Framework (ZTMAF) for secure communication in VFC networks. The framework employs context-aware authentication with lightweight cryptographic primitives, a decentralized trust evaluation system, and fog node-assisted session validation to combat spoofing, replay, and impersonation attacks. Simulation results on NS-3 and SUMO demonstrate improved authentication latency, reduced computational overhead, and better scalability compared to traditional PKI and blockchain-based models. Our findings suggest that ZTMAF is effective for secure, real-time V2X interactions under adversarial and mobility-variant scenarios.

Paperid: 3621, https://arxiv.org/pdf/2506.04383.pdf

Abstract:
Classical cryptographic systems rely heavily on structured algebraic problems, such as factorization, discrete logarithms, or lattice-based assumptions, which are increasingly vulnerable to quantum attacks and structural cryptanalysis. In response, this work introduces the Hashed Fractal Key Recovery (HFKR) problem, a non-algebraic cryptographic construction grounded in symbolic dynamics and chaotic perturbations. HFKR builds on the Symbolic Path Inversion Problem (SPIP), leveraging symbolic trajectories generated via contractive affine maps over $\mathbb{Z}^2$, and compressing them into fixed-length cryptographic keys using hash-based obfuscation. A key contribution of this paper is the empirical confirmation that these symbolic paths exhibit fractal behavior, quantified via box counting dimension, path geometry, and spatial density measures. The observed fractal dimension increases with trajectory length and stabilizes near 1.06, indicating symbolic self-similarity and space-filling complexity, both of which reinforce the entropy foundation of the scheme. Experimental results across 250 perturbation trials show that SHA3-512 and SHAKE256 amplify symbolic divergence effectively, achieving mean Hamming distances near 255, ideal bit-flip rates, and negligible entropy deviation. In contrast, BLAKE3 exhibits statistically uniform but weaker diffusion. These findings confirm that HFKR post-quantum security arises from the synergy between symbolic fractality and hash-based entropy amplification. The resulting construction offers a lightweight, structure-free foundation for secure key generation in adversarial settings without relying on algebraic hardness assumptions.

Paperid: 3622, https://arxiv.org/pdf/2506.03656.pdf

Abstract:
Malicious websites and phishing URLs pose an ever-increasing cybersecurity risk, with phishing attacks growing by 40% in a single year. Traditional detection approaches rely on machine learning classifiers or rule-based scanners operating in the cloud, but these face significant challenges in generalization, privacy, and evasion by sophisticated threats. In this paper, we propose a novel client-side framework for comprehensive URL analysis that leverages zero-shot inference by a local large language model (LLM) running entirely in-browser. Our system uses a compact LLM (e.g., 3B/8B parameters) via WebLLM to perform reasoning over rich context collected from the target webpage, including static code analysis (JavaScript abstract syntax trees, structure, and code patterns), dynamic sandbox execution results (DOM changes, API calls, and network requests),and visible content. We detail the architecture and methodology of the system, which combines a real browser sandbox (using iframes) resistant to common anti-analysis techniques, with an LLM-based analyzer that assesses potential vulnerabilities and malicious behaviors without any task-specific training (zero-shot). The LLM aggregates evidence from multiple sources (code, execution trace, page content) to classify the URL as benign or malicious and to provide an explanation of the threats or security issues identified. We evaluate our approach on a diverse set of benign and malicious URLs, demonstrating that even a compact client-side model can achieve high detection accuracy and insightful explanations comparable to cloud-based solutions, while operating privately on end-user devices. The results show that client-side LLM inference is a feasible and effective solution to web threat analysis, eliminating the need to send potentially sensitive data to cloud services.

Paperid: 3623, https://arxiv.org/pdf/2506.03308.pdf

Abstract:
Fully Homomorphic Encryption (FHE) has long promised the ability to compute over encrypted data without revealing sensitive contents -- a foundational goal for secure cloud analytics. Yet despite decades of cryptographic advances, practical integration of FHE into real-world relational databases remains elusive. This paper presents \textbf{Hermes}, the first system to enable FHE-native vector query processing inside a standard SQL engine. By leveraging the multi-slot capabilities of modern schemes, Hermes introduces a novel data model that packs multiple records per ciphertext and embeds encrypted auxiliary statistics (e.g., local sums) to support in-place updates and aggregation. To reconcile ciphertext immutability with record-level mutability, we develop new homomorphic algorithms based on slot masking, shifting, and rewriting. Hermes is implemented as native C++ loadable functions in MySQL using OpenFHE v1.2.4, comprising over 3,500 lines of code. Experiments on real-world datasets show up to 1{,}600$\times$ throughput gain in encryption and over 30$\times$ speedup in insertion compared to per-tuple baselines. Hermes brings FHE from cryptographic promise to practical reality -- realizing a long-standing vision at the intersection of databases and secure computation.

Paperid: 3624, https://arxiv.org/pdf/2506.02282.pdf

Abstract:
web3 wallets are key to managing user identity on blockchain. The main purpose of a web3 wallet application is to manage the private key for the user and provide an interface to interact with the blockchain. The key management scheme ( KMS ) used by the wallet to store and recover the private key can be either custodial, where the keys are permissioned and in custody of the wallet provider or noncustodial where the keys are in custody of the user. The existing non-custodial key management schemes tend to offset the burden of storing and recovering the key entirely on the user by asking them to remember seed-phrases. This creates onboarding hassles for the user and introduces the risk that the user may lose their assets if they forget or lose their seedphrase/private key. In this paper, we propose a novel method of backing up user keys using a non-custodial key management technique that allows users to save and recover a backup of their private key using any independent sign-in method such as google-oAuth or other 3P oAuth.

Paperid: 3625, https://arxiv.org/pdf/2506.02027.pdf

Abstract:
Many identity systems assign a single, static identifier to an individual for life, reused across domains like healthcare, finance, and education. These Universal Lifelong Identifiers (ULIs) underpin critical workflows but now pose systemic privacy risks. We take the position that ULIs are fundamentally incompatible with the AI era and must be phased out. We articulate a threat model grounded in modern AI capabilities and show that traditional safeguards such as redaction, consent, and access controls are no longer sufficient. We define core properties for identity systems in the AI era and present a cryptographic framework that satisfies them while retaining compatibility with existing identifier workflows. Our design preserves institutional workflows, supports essential functions such as auditability and delegation, and offers a practical migration path beyond ULIs.

Paperid: 3626, https://arxiv.org/pdf/2506.01767.pdf

Abstract:
Fragmentation is a routine part of communication in 6LoWPAN-based IoT networks, designed to accommodate small frame sizes on constrained wireless links. However, this process introduces a critical vulnerability fragments are typically stored and processed before their legitimacy is confirmed, allowing attackers to exploit this gap with minimal effort. In this work, we explore a defense strategy that takes a more adaptive, behavior-aware approach to this problem. Our system, called Predictive-CSM, introduces a combination of two lightweight mechanisms. The first tracks how each node behaves over time, rewarding consistent and successful interactions while quickly penalizing suspicious or failing patterns. The second checks the integrity of packet fragments using a chained hash, allowing incomplete or manipulated sequences to be caught early, before they can occupy memory or waste processing time. We put this system to the test using a set of targeted attack simulations, including early fragment injection, replayed headers, and flooding with fake data. Across all scenarios, Predictive CSM preserved network delivery and maintained energy efficiency, even under pressure. Rather than relying on heavyweight cryptography or rigid filters, this approach allows constrained de vices to adapt their defenses in real time based on what they observe, not just what they're told. In that way, it offers a step forward for securing fragmented communication in real world IoT systems

Paperid: 3627, https://arxiv.org/pdf/2506.01384.pdf

Abstract:
This paper presents a mathematically rigorous formal analysis of Simplified Payment Verification (SPV) clients, as specified in Section 8 of the original Bitcoin white paper, versus non-mining full nodes operated by home users. It defines security as resistance to divergence from global consensus and models transaction acceptance, enforcement capability, and divergence probability under adversarial conditions. The results demonstrate that SPV clients, despite omitting script verification, are cryptographically sufficient under honest-majority assumptions and topologically less vulnerable to attack than structurally passive, non-enforcing full nodes. The paper introduces new axioms on behavioral divergence and communication topology, proving that home-based full nodes increase systemic entropy without contributing to consensus integrity. Using a series of formally defined lemmas, propositions, and Monte Carlo simulation results, it is shown that SPV clients represent the rational equilibrium strategy for non-mining participants. This challenges the prevailing narrative that home validators enhance network security, providing formal and operational justifications for the sufficiency of SPV models.

Paperid: 3628, https://arxiv.org/pdf/2506.00426.pdf

Abstract:
The pervasive use of hybrid cloud computing models has changed enterprise as well as Information Technology services infrastructure by giving businesses simple and cost-effective options of combining on-premise IT equipment with public cloud services. hybrid cloud solutions deploy multifaceted models of security, performance optimization, and cost efficiency, conventionally fragmented in the cloud computing milieu. This paper examines how organizations manage these parameters in hybrid cloud ecosystems while providing solutions to the challenges they face in operationalizing hybrid cloud adoptions. The study captures the challenges of achieving a balance in resource distribution between on-premise and cloud resources (herein referred to as the "resource allocation challenge"), the complexity of pricing models from cloud providers like AWS, Microsoft Azure, Google Cloud (herein called the 'pricing complexity problem'), and the urgency for strong security infrastructure to safeguard sensitive information (known as 'the information security problem'). This study demonstrates the security and performance management solutions proposed were validated in a detailed case study of adoption of AWS and Azure based hybrid cloud and provides useful guidance. Also, a hybrid cloud security and cost optimization framework based on zero trust architecture, encryption, hybrid cloud policies, and others, is proposed. The conclusion includes recommendations for research on automation of hybrid cloud service management, integration of multi-clouds, and the ever-present question of data privacy, stressing how those matters affect contemporary enterprises.

Paperid: 3629, https://arxiv.org/pdf/2505.24252.pdf

Abstract:
Frequent cyber-attacks have elevated WebShell exploitation and defense to a critical research focus within network security. However, there remains a significant shortage of publicly available, well-categorized malicious-code datasets organized by obfuscation method. Existing malicious-code generation methods, which primarily rely on prompt engineering, often suffer from limited diversity and high redundancy in the payloads they produce. To address these limitations, we propose \textbf{RAWG}, a \textbf{R}eward-driven \textbf{A}utomated \textbf{W}ebshell Malicious-code \textbf{G}enerator designed for red-teaming applications. Our approach begins by categorizing webshell samples from common datasets into seven distinct types of obfuscation. We then employ a large language model (LLM) to extract and normalize key tokens from each sample, creating a standardized, high-quality corpus. Using this curated dataset, we perform supervised fine-tuning (SFT) on an open-source large model to enable the generation of diverse, highly obfuscated webshell malicious payloads. To further enhance generation quality, we apply Proximal Policy Optimization (PPO), treating malicious-code samples as "chosen" data and benign code as "rejected" data during reinforcement learning. Extensive experiments demonstrate that RAWG significantly outperforms current state-of-the-art methods in both payload diversity and escape effectiveness.

Paperid: 3630, https://arxiv.org/pdf/2505.23770.pdf

Abstract:
Since the 1990s, the integration of technology into daily life has led to the creation of an extensive network of interconnected devices, transforming how individuals and organizations operate. However, this digital transformation has also spurred the rise of cybercrime, criminal activities perpetrated through networks or computer systems. Cybercrime has become a global concern, presenting significant challenges to security systems. Although advancements in digital technology have enhanced efficiency, they have also opened new avenues for exploitation by cybercriminals, highlighting the urgent need for advanced cybersecurity measures. The escalating number of cyberattacks and associated risks in the past decade highlights the critical importance of protecting sensitive data and safeguarding information systems. Cybercrimes range from financial fraud and phishing scams to identity theft and online harassment, posing substantial risks to both individuals and organizations. In response, governments, law enforcement agencies, and cybersecurity units have intensified their efforts to address these threats. In recent years, India has experienced a significant surge in cybercrime incidents, with a notable increase in cases involving ransomware, data breaches, and social engineering attacks. The growing penetration of internet services, the expansion of e-commerce, and the rapid adoption of digital payment systems have made individuals and organizations more vulnerable to cyber threats. Key areas affected include banking, healthcare, and government sectors, which are frequently targeted due to the sensitive nature of the data they handle. To combat these risks, there is an increasing focus on public awareness, cybersecurity education, and robust regulatory frameworks. This paper examines cybercrime, prevention strategies, security protocols, and terminology to safeguard digital infrastructure.

Paperid: 3631, https://arxiv.org/pdf/2505.23655.pdf

Abstract:
Neural network inference typically operates on raw input data, increasing the risk of exposure during preprocessing and inference. Moreover, neural architectures lack efficient built-in mechanisms for directly authenticating input data. This work introduces a novel encryption method for ensuring the security of neural inference. By constructing key-conditioned chaotic graph dynamical systems, we enable the encryption and decryption of real-valued tensors within the neural architecture. The proposed dynamical systems are particularly suited to encryption due to their sensitivity to initial conditions and their capacity to produce complex, key-dependent nonlinear transformations from compact rules. This work establishes a paradigm for securing neural inference and opens new avenues for research on the application of graph dynamical systems in neural network security.

Paperid: 3632, https://arxiv.org/pdf/2505.23634.pdf

Abstract:
The model context protocol (MCP) has been widely adapted as an open standard enabling the seamless integration of generative AI agents. However, recent work has shown the MCP is susceptible to retrieval-based "falsely benign" attacks (FBAs), allowing malicious system access and credential theft, but requiring that users download compromised files directly to their systems. Herein, we show that the threat model of MCP-based attacks is significantly broader than previously thought, i.e., attackers need only post malicious content online to deceive MCP agents into carrying out their attacks on unsuspecting victims' systems. To improve alignment guardrails against such attacks, we introduce a new MCP dataset of FBAs and (truly) benign samples to explore the effectiveness of direct preference optimization (DPO) for the refusal training of large language models (LLMs). While DPO improves model guardrails against such attacks, we show that the efficacy of refusal learning varies drastically depending on the model's original post-training alignment scheme--e.g., GRPO-based LLMs learn to refuse extremely poorly. Thus, to further improve FBA refusals, we introduce Retrieval Augmented Generation for Preference alignment (RAG-Pref), a novel preference alignment strategy based on RAG. We show that RAG-Pref significantly improves the ability of LLMs to refuse FBAs, particularly when combined with DPO alignment, thus drastically improving guardrails against MCP-based attacks.

Paperid: 3633, https://arxiv.org/pdf/2505.22644.pdf

Abstract:
Most classical and post-quantum cryptographic assumptions, including integer factorization, discrete logarithms, and Learning with Errors (LWE), rely on algebraic structures such as rings or vector spaces. While mathematically powerful, these structures can be exploited by quantum algorithms or advanced algebraic attacks, raising a pressing need for structure-free alternatives. To address this gap, we introduce the Symbolic Path Inversion Problem (SPIP), a new computational hardness assumption based on symbolic trajectories generated by contractive affine maps with bounded noise over Z2. Unlike traditional systems, SPIP is inherently non-algebraic and relies on chaotic symbolic evolution and rounding-induced non-injectivity to render inversion computationally infeasible. We prove that SPIP is PSPACE-hard and #P-hard, and demonstrate through empirical simulation that even short symbolic sequences (e.g., n = 3, m = 2) can produce over 500 valid trajectories for a single endpoint, with exponential growth reaching 2256 paths for moderate parameters. A quantum security analysis further shows that Grover-style search offers no practical advantage due to oracle ambiguity and verification instability. These results position SPIP as a viable foundation for post-quantum cryptography that avoids the vulnerabilities of algebraic symmetry while offering scalability, unpredictability, and resistance to both classical and quantum inversion.

Paperid: 3634, https://arxiv.org/pdf/2505.21725.pdf

Abstract:
This report presents a comprehensive analysis of a malicious software sample, detailing its architecture, behavioral characteristics, and underlying intent. Through static and dynamic examination, the malware core functionalities, including persistence mechanisms, command-and-control communication, and data exfiltration routines, are identified and its supporting infrastructure is mapped. By correlating observed indicators of compromise with known techniques, tactics, and procedures, this analysis situates the sample within the broader context of contemporary threat campaigns and infers the capabilities and motivations of its likely threat actor. Building on these findings, actionable threat intelligence is provided to support proactive defenses. Threat hunting teams receive precise detection hypotheses for uncovering latent adversarial presence, while monitoring systems can refine alert logic to detect anomalous activity in real time. Finally, the report discusses how this structured intelligence enhances predictive risk assessments, informs vulnerability prioritization, and strengthens organizational resilience against advanced persistent threats. By integrating detailed technical insights with strategic threat landscape mapping, this malware analysis report not only reconstructs past adversary actions but also establishes a robust foundation for anticipating and mitigating future attacks.

Paperid: 3635, https://arxiv.org/pdf/2505.21263.pdf

Abstract:
Modern software supply chains face an increasing threat from malicious code hidden in trusted components such as browser extensions, IDE extensions, and open-source packages. This paper introduces JavaSith, a novel client-side framework for analyzing potentially malicious extensions in web browsers, Visual Studio Code (VSCode), and Node's NPM packages. JavaSith combines a runtime sandbox that emulates browser/Node.js extension APIs (with a ``time machine'' to accelerate time-based triggers) with static analysis and a local large language model (LLM) to assess risk from code and metadata. We present the design and architecture of JavaSith, including techniques for intercepting extension behavior over simulated time and extracting suspicious patterns. Through case studies on real-world attacks (such as a supply-chain compromise of a Chrome extension and malicious VSCode extensions installing cryptominers), we demonstrate how JavaSith can catch stealthy malicious behaviors that evade traditional detection. We evaluate the framework's effectiveness and discuss its limitations and future enhancements. JavaSith's client-side approach empowers end-users/organizations to vet extensions and packages before trustingly integrating them into their environments.

Paperid: 3636, https://arxiv.org/pdf/2505.20912.pdf

Abstract:
Fully homomorphic encryption (FHE) and trusted execution environments (TEE) are two approaches to provide confidentiality during data processing. Each approach has its own strengths and weaknesses. In certain scenarios, computations can be carried out in a hybrid environment, using both FHE and TEE. However, processing data in such hybrid settings presents challenges, as it requires to adapt and rewrite the algorithms for the chosen technique. We propose a domain-specific language (DSL) for secure computation that allows to express the computations to perform and execute them using a backend that leverages either FHE or TEE, depending on what is available.

Paperid: 3637, https://arxiv.org/pdf/2505.20614.pdf

Abstract:
This paper introduces EarthOL, a novel consensus protocol that attempts to replace computational waste in blockchain systems with verifiable human contributions within bounded domains. While recognizing the fundamental impossibility of universal value assessment, we propose a domain-restricted approach that acknowledges cultural diversity and subjective preferences while maintaining cryptographic security. Our enhanced Proof-of-Human-Contribution (PoHC) protocol uses a multi-layered verification system with domain-specific evaluation criteria, time-dependent validation mechanisms, and comprehensive security frameworks. We present theoretical analysis demonstrating meaningful progress toward incentive-compatible human contribution verification in high-consensus domains, achieving Byzantine fault tolerance in controlled scenarios while addressing significant scalability and cultural bias challenges. Through game-theoretic analysis, probabilistic modeling, and enhanced security protocols, we identify specific conditions under which the protocol remains stable and examine failure modes with comprehensive mitigation strategies. This work contributes to understanding the boundaries of decentralized value assessment and provides a framework for future research in human-centered consensus mechanisms for specific application domains, with particular emphasis on validator and security specialist incentive systems.

Paperid: 3638, https://arxiv.org/pdf/2505.19456.pdf

Abstract:
JavaScript, a scripting language employed to augment the capabilities of web browsers within web pages or browser extensions, utilizes code segments termed JavaScript inclusions. While the security aspects of JavaScript inclusions in web pages have undergone substantial scrutiny, a thorough investigation into the security of such inclusions within browser extensions remains absent, despite the divergent security paradigms governing these environments. This study presents a systematic measurement of JavaScript inclusions in Chrome extensions, employing a hybrid methodology encompassing static and dynamic analysis to identify these inclusions. The analysis of 36,324 extensions revealed 350,784 JavaScript inclusions. Subsequent security assessment indicated that, although the majority of these inclusions originate from local files within the extensions rather than external servers, 22 instances of vulnerable remote JavaScript inclusions were identified. These remote inclusions present potential avenues for malicious actors to execute arbitrary code within the extension's execution context. Furthermore, an analysis of JavaScript library utilization within Chrome extensions disclosed the prevalent use of susceptible and outdated libraries, notably within numerous widely adopted extensions.

Paperid: 3639, https://arxiv.org/pdf/2505.19047.pdf

Abstract:
We introduce the MoveEVM Weakness Classification (MWC) system -- a dedicated vulnerability taxonomy for smart contracts built with Move and executed in EVM-compatible environments. While Move was originally designed to prevent common security flaws via linear resource types and strict ownership, its integration with EVM bytecode introduces novel hybrid vulnerabilities not captured by existing systems like the SWC registry. Our taxonomy spans 37 categorized vulnerability types (MWC-100 to MWC-136) across six semantic frames, addressing issues such as hybrid gas metering, capability misuse, meta-transaction spoofing, and AI-integrated logic. Through analysis of real-world contracts from Aptos and Sui, we demonstrate that current verification tools often miss these hybrid risks. We also explore how formal methods and LLM-based audit agents can operationalize this classification, enabling scalable, logic-aware smart contract auditing. MWC lays the foundation for more secure and verifiable contracts in next-generation blockchain systems. (Shortened Abstract)

Paperid: 3640, https://arxiv.org/pdf/2505.18398.pdf

Abstract:
We introduce funion, a system providing end-to-end sender-receiver unlinkability for neural network inference. By leveraging the Pigeonhole storage protocol and BACAP (blinding-and-capability) scheme from the Echomix anonymity system, funion inherits the provable security guarantees of modern mixnets. Users can anonymously store input tensors in pseudorandom storage locations, commission compute services to process them via the neural network, and retrieve results with no traceable connection between input and output parties. This store-compute-store paradigm masks both network traffic patterns and computational workload characteristics, while quantizing execution timing into public latency buckets. Our security analysis demonstrates that funion inherits the strong metadata privacy guarantees of Echomix under largely the same trust assumptions, while introducing acceptable overhead for production-scale workloads. Our work paves the way towards an accessible platform where users can submit fully anonymized inference queries to cloud services.

Paperid: 3641, https://arxiv.org/pdf/2505.18206.pdf

Abstract:
The integration of unmanned aerial vehicles (UAVs) into smart agriculture has enabled real-time monitoring, data collection, and automated farming operations. However, the high mobility, decentralized nature, and low-power communication of UAVs pose significant security challenges, particularly in ensuring transaction integrity and trust. This paper presents a quantum-resilient blockchain framework designed to secure data and resource transactions in UAV-assisted smart agriculture networks. The proposed solution incorporates post-quantum cryptographic primitives-specifically lattice-based digital signatures and key encapsulation mechanisms to achieve tamper-proof, low-latency consensus without relying on traditional computationally intensive proof-of-work schemes. A lightweight consensus protocol tailored for UAV communication constraints is developed, and transaction validation is handled through a trust-ranked, multi-layer ledger maintained by edge nodes. Experimental results from simulations using NS-3 and custom blockchain testbeds show that the framework outperforms existing schemes in terms of transaction throughput, energy efficiency, and resistance to quantum attacks. The proposed system provides a scalable, secure, and sustainable solution for precision agriculture, enabling trusted automation and resilient data sharing in post-quantum eras.

Paperid: 3642, https://arxiv.org/pdf/2505.18157.pdf

Abstract:
The implementation of blockchain technology in tax administration offers promising improvements in security, transparency, and efficiency. This paper presents the design of a blockchain-based e-Faktur system aimed at addressing the challenges of issuing and verifying tax invoices within Indonesia's VAT reporting process. Utilizing Hyperledger Fabric, a private permissioned blockchain, integrated with Hyperledger FireFly, this system ensures data immutability, access control, and decentralized transaction processing. The proposed system streamlines the issuance of NSFP and validates invoices while reducing reliance on centralized servers, eliminating single points of failure, and enhancing auditability. Hypothetical cyberattack scenarios were explored to assess the system's robustness. The results demonstrate that blockchain technology mitigates potential vulnerabilities and improves the resilience of tax reporting systems. The findings highlight the feasibility of blockchain in public tax administration and provide a foundation for future research on large-scale implementations.

Paperid: 3643, https://arxiv.org/pdf/2505.17527.pdf

Abstract:
Distributed broadcast encryption (DBE) is a variant of broadcast encryption (BE) that can efficiently transmit a message to a subset of users, in which users independently generate user private keys and user public keys instead of a central trusted authority generating user keys. In this paper, we propose a DBE scheme with constant size ciphertexts, constant size private keys, and linear size public parameters, and prove the adaptive security of our DBE scheme under static assumptions in composite-order bilinear groups. The previous efficient DBE schemes with constant size ciphertexts and constant size private keys are proven secure under the $q$-Type assumption or have a drawback of having quadratic size public parameters. In contrast, our DBE scheme is the first DBE scheme with linear size public parameters proven adaptively secure under static assumptions in composite-order bilinear groups.

Paperid: 3644, https://arxiv.org/pdf/2505.17109.pdf

Abstract:
Open-weight general-purpose AI (GPAI) models offer significant benefits but also introduce substantial cybersecurity risks, as demonstrated by the offensive capabilities of models like DeepSeek-R1 in evaluations such as MITRE's OCCULT. These publicly available models empower a wider range of actors to automate and scale cyberattacks, challenging traditional defence paradigms and regulatory approaches. This paper analyzes the specific threats -- including accelerated malware development and enhanced social engineering -- magnified by open-weight AI release. We critically assess current regulations, notably the EU AI Act and the GPAI Code of Practice, identifying significant gaps stemming from the loss of control inherent in open distribution, which renders many standard security mitigations ineffective. We propose a path forward focusing on evaluating and controlling specific high-risk capabilities rather than entire models, advocating for pragmatic policy interpretations for open-weight systems, promoting defensive AI innovation, and fostering international collaboration on standards and cyber threat intelligence (CTI) sharing to ensure security without unduly stifling open technological progress.

Paperid: 3645, https://arxiv.org/pdf/2505.17094.pdf

Abstract:
Neuromorphic computing, inspired by the human brain's neural architecture, is revolutionizing artificial intelligence and edge computing with its low-power, adaptive, and event-driven designs. However, these unique characteristics introduce novel cybersecurity risks. This paper proposes Neuromorphic Mimicry Attacks (NMAs), a groundbreaking class of threats that exploit the probabilistic and non-deterministic nature of neuromorphic chips to execute covert intrusions. By mimicking legitimate neural activity through techniques such as synaptic weight tampering and sensory input poisoning, NMAs evade traditional intrusion detection systems, posing risks to applications such as autonomous vehicles, smart medical implants, and IoT networks. This research develops a theoretical framework for NMAs, evaluates their impact using a simulated neuromorphic chip dataset, and proposes countermeasures, including neural-specific anomaly detection and secure synaptic learning protocols. The findings underscore the critical need for tailored cybersecurity measures to protect brain-inspired computing, offering a pioneering exploration of this emerging threat landscape.

Paperid: 3646, https://arxiv.org/pdf/2505.17034.pdf

Abstract:
As quantum computing progresses, traditional cryptographic systems face the threat of obsolescence due to the capabilities of quantum algorithms. This paper introduces the Quantum-Ready Architecture for Security and Risk Management (QUASAR), a novel framework designed to help organizations prepare for the post-quantum era. QUASAR provides a structured approach to transition from current cryptographic systems to quantum-resistant alternatives, emphasizing technical, security, and operational readiness. The framework integrates a set of actionable components, a timeline for phased implementation, and continuous optimization strategies to ensure ongoing preparedness. Through performance indicators, readiness scores, and optimization functions, QUASAR enables organizations to assess their current state, identify gaps, and execute targeted actions to mitigate risks posed by quantum computing. By offering a comprehensive, adaptable, and quantifiable strategy, QUASAR equips organizations with the tools necessary to future-proof their operations and secure sensitive data against the impending rise of quantum technologies.

Paperid: 3647, https://arxiv.org/pdf/2505.16503.pdf

Abstract:
Algebraic methods are employed in order to define language-based security properties of processes. A supervisor is introduced that can disable unwanted behavior of an insecure process by controlling some of its actions or by inserting timed actions to make an insecure process secure. We assume a situation where neither the supervisor nor the attacker has complete information about the ongoing systems behavior. We study the conditions under which such a supervisor exists, as well as its properties and limitations.

Paperid: 3648, https://arxiv.org/pdf/2505.14027.pdf

Abstract:
As computer networks proliferate, the gravity of network intrusions has escalated, emphasizing the criticality of network intrusion detection systems for safeguarding security. While deep learning models have exhibited promising results in intrusion detection, they face challenges in managing high-dimensional, complex traffic patterns and imbalanced data categories. This paper presents CSAGC-IDS, a network intrusion detection model based on deep learning techniques. CSAGC-IDS integrates SC-CGAN, a self-attention-enhanced convolutional conditional generative adversarial network that generates high-quality data to mitigate class imbalance. Furthermore, CSAGC-IDS integrates CSCA-CNN, a convolutional neural network enhanced through cost sensitive learning and channel attention mechanism, to extract features from complex traffic data for precise detection. Experiments conducted on the NSL-KDD dataset. CSAGC-IDS achieves an accuracy of 84.55% and an F1-score of 84.52% in five-class classification task, and an accuracy of 91.09% and an F1 score of 92.04% in binary classification task.Furthermore, this paper provides an interpretability analysis of the proposed model, using SHAP and LIME to explain the decision-making mechanisms of the model.

Paperid: 3649, https://arxiv.org/pdf/2505.13238.pdf

Abstract:
We propose a framework for compile-time ciphertext synthesis in fully homomorphic encryption (FHE) systems, where ciphertexts are constructed from precomputed encrypted basis vectors combined with a runtime-scaled encryption of zero. This design eliminates online encryption and instead relies solely on ciphertext-level additions and scalar multiplications, enabling efficient data ingestion and algebraic reuse. We formalize the method as a randomized $\mathbb{Z}_t$-module morphism and prove that it satisfies IND-CPA security under standard assumptions. The proof uses a hybrid game reduction, showing that adversarial advantage in distinguishing synthesized ciphertexts is negligible if the underlying FHE scheme is IND-CPA secure. Unlike prior designs that require a pool of random encryptions of zero, our construction achieves equivalent security using a single zero ciphertext multiplied by a fresh scalar at runtime, reducing memory overhead while preserving ciphertext randomness. The resulting primitive supports efficient integration with standard FHE APIs and maintains compatibility with batching, rotation, and aggregation, making it well-suited for encrypted databases, streaming pipelines, and secure compiler backends.

Paperid: 3651, https://arxiv.org/pdf/2505.12393.pdf

Abstract:
Protocol art emerges at the confluence of blockchain-based smart contracts and a century-long lineage of conceptual art, participatory art, and algorithmic generative art practices. Yet existing definitions-most notably Primavera De Filippi's "protocolism"-struggle to demarcate this nascent genre from other art forms in practice. Addressing this definition-to-practice gap, this paper offers a focused case study of pioneering protocol artworks by Pak, an early and influential pseudonymous protocol artist who treats smart contracts as medium and protocol participation as message. Tracing the evolution from early open-edition releases of The Fungible and the dynamic mechanics of Merge to the soul-bound messaging of Censored and the reflective absence of Not Found, we examine how Pak choreographs distributed agency across collectors and autonomous contracts, showing how programmable protocols become a social fabric in artistic meaning-making. Through thematic analysis of Pak's works, we identify seven core characteristics that distinguish protocol art: (1) system-centric rather than object-centric composition, (2) autonomous governance for open-ended control, (3) distributed agency and communal authorship, (4) temporal dynamism and lifecycle aesthetics, (5) economic-driven engagement, (6) poetic message embedding in interaction rituals, and (7) interoperability enabling composability for emergence. We then discuss how these features set protocol art apart from adjacent artistic movements. By developing a theoretical framework grounded in Pak's practice, we contribute to the emerging literature on protocolism while offering design implications for artists shaping this evolving art form.

Paperid: 3652, https://arxiv.org/pdf/2505.12210.pdf

Abstract:
Information-flow control systems often enforce progress-insensitive noninterference, as it is simple to understand and enforce. Unfortunately, real programs need to declassify results and endorse inputs, which noninterference disallows, while preventing attackers from controlling leakage, including through progress channels, which progress-insensitivity ignores. This work combines ideas for progress-sensitive security with secure downgrading (declassification and endorsement) to identify a notion of securely downgrading progress information. We use hyperproperties to distill the separation between progress-sensitive and progress-insensitive noninterference and combine it with nonmalleable information flow, an existing (progress-insensitive) definition of secure downgrading, to define nonmalleable progress leakage (NMPL). We present the first information-flow type system to allow some progress leakage while enforcing NMPL, and we show how to infer the location of secure progress downgrades. All theorems are verified in Rocq.

Paperid: 3653, https://arxiv.org/pdf/2505.11583.pdf

Abstract:
The rapid exponential growth in cloud-first business models and tightened global data protection regulations have led to the exponential increase in the level of importance of ISO certifications, especially ISO/IEC 27001, 27017, and 27018, as strategic imperative propositions for organizations wanting to build trust, ensure compliance, and achieve a competitive advantage. This article describes a case study of a successful design, implementation, and scaling of a cybersecurity certification practice in Armanino LLP, a pioneering US accounting and consulting firm. In reaction to increasing client desires for formalized information security frameworks, I founded an industry practice from conception through implementation to aid mid-market and high-growth technology firms. During one year, the initiative brought in over \$1 million in new service revenue, expanded our portfolio of cybersecurity clients by 150%, and produced more than 20 successful ISO certifications on various verticals such as SaaS, healthcare, and fintech. Based on the strategic wisdom and operational strategy, this paper outlines the technical architecture of the ISO service line from modular audit templates to certification readiness kits, from stakeholder enablement to integration with SOC 2 and CIS controls. The approach gave value to repeatability, speed, and assurance, thus making Armanino a reputable certification body. The lessons drawn out provide us with a flexible template that can be utilized by firms wishing to build strong compliance programs that can be tailored to address changing digital risk terrains. This work adds to the increasing knowledge about audit scalability, cybersecurity compliance, and ISO standardisation.

Paperid: 3654, https://arxiv.org/pdf/2505.10184.pdf

Abstract:
We propose a new method for retrieving the algebraic structure of a generic alternant code given an arbitrary generator matrix, provided certain conditions are met. We then discuss how this challenges the security of the McEliece cryptosystem instantiated with this family of codes. The central object of our work is the quadratic hull related to a linear code, defined as the intersection of all quadrics passing through the columns of a given generator or parity-check matrix, where the columns are considered as points in the affine or projective space. The geometric properties of this object reveal important information about the internal algebraic structure of the code. This is particularly evident in the case of generalized Reed-Solomon codes, whose quadratic hull is deeply linked to a well-known algebraic variety called the rational normal curve. By utilizing the concept of Weil restriction of affine varieties, we demonstrate that the quadratic hull of a generic dual alternant code inherits many interesting features from the rational normal curve, on account of the fact that alternant codes are subfield-subcodes of generalized Reed-Solomon codes. If the rate of the generic alternant code is sufficiently high, this allows us to construct a polynomial-time algorithm for retrieving the underlying generalized Reed-Solomon code from which the alternant code is defined, which leads to an efficient key-recovery attack against the McEliece cryptosystem when instantiated with this class of codes. Finally, we discuss the generalization of this approach to Algebraic-Geometry codes and Goppa codes.

Paperid: 3655, https://arxiv.org/pdf/2505.09048.pdf

Abstract:
Cybersecurity threats are increasingly marked by interdependence, uncertainty, and evolving complexity challenges that traditional assessment methods such as CVSS, STRIDE, and attack trees fail to adequately capture. This paper reviews the application of Bayesian Networks (BNs) in cybersecurity risk modeling, highlighting their capacity to represent probabilistic dependencies, integrate diverse threat indicators, and support reasoning under uncertainty. A structured case study is presented in which a STRIDE-based attack tree for an automotive In-Vehicle Infotainment (IVI) system is transformed into a Bayesian Network. Logical relationships are encoded using Conditional Probability Tables (CPTs), and threat likelihoods are derived from normalized DREAD scores. The model enables not only probabilistic inference of system compromise likelihood but also supports causal analysis using do-calculus and local sensitivity analysis to identify high-impact vulnerabilities. These analyses provide insight into the most influential nodes within the threat propagation chain, informing targeted mitigation strategies. While demonstrating the potential of BNs for dynamic and context-aware risk assessment, the study also outlines limitations related to scalability, reliance on expert input, static structure assumptions, and limited temporal modeling. The paper concludes by advocating for future enhancements through Dynamic Bayesian Networks, structure learning, and adaptive inference to better support real-time cybersecurity decision-making in complex environments.

Paperid: 3656, https://arxiv.org/pdf/2505.09034.pdf

Abstract:
This study proposes a mechanism for encrypting SD-JWT (Selective Disclosure JSON Web Token) Disclosures using Attribute-Based Encryption (ABE) to enable flexible access control on the basis of the Verifier's attributes. By integrating Ciphertext-Policy ABE (CP-ABE) into the existing SD-JWT framework, the Holder can assign decryption policies to Disclosures, ensuring information is selectively disclosed. The mechanism's feasibility was evaluated in a virtualized environment by measuring the processing times for SD-JWT generation, encryption, and decryption with varying Disclosure counts (5, 10, 20). Results showed that SD-JWT generation is lightweight, while encryption and decryption times increase linearly with the number of Disclosures. This approach is suitable for privacy-sensitive applications like healthcare, finance, and supply chain tracking but requires optimization for real-time use cases such as IoT. Future research should focus on improving ABE efficiency and addressing scalability challenges.

Paperid: 3657, https://arxiv.org/pdf/2505.08791.pdf

Abstract:
Most modern cryptographic systems, such as RSA and the Diffie-Hellman Key Exchange, rely on "trapdoor" mathematical functions that are presumed to be computationally difficult with existing tools. However, quantum computers will be able to break these systems using Shor's Algorithm, necessitating the development of quantum-resistant alternatives. We first examine the McEliece cryptosystem, a code-based scheme believed to be secure against quantum attacks due to the hardness of decoding arbitrary linear codes. We then explore NTRU, a lattice-based system grounded in the difficulty of solving the Shortest Vector Problem. Finally, we establish connections between the structural foundations and security of the two systems.

Paperid: 3658, https://arxiv.org/pdf/2505.08772.pdf

Abstract:
Blockchain technology has emerged as one of the most transformative digital innovations of the 21st century. This paper presents a comprehensive review of blockchain's fundamental architecture, tracing its development from Bitcoin's initial implementation to current enterprise applications. We examine the core technical components including distributed consensus algorithms, cryptographic principles, and smart contract functionality that enable blockchain's unique properties. The historical progression from cryptocurrency-focused systems to robust platforms for decentralized applications is analyzed, highlighting pivotal developments in scalability, privacy, and interoperability. Additionally, we identify critical challenges facing widespread blockchain adoption, including technical limitations, regulatory hurdles, and integration complexities with existing systems. By providing this foundational understanding of blockchain technology, this paper contributes to ongoing research efforts addressing blockchain's potential to revolutionize data management across industries.

Paperid: 3659, https://arxiv.org/pdf/2505.08650.pdf

Abstract:
This article examines the evolution of cryptologic techniques and their implications for public and private security, focusing on the Italian and EU legal frameworks. It explores the roles of cryptography, steganography, and quantum technologies in countering cybersecurity threats, emphasising the need for robust legislation to address emerging challenges. Special attention is given to Italy's legislative reforms, including Law No. 90 of 2024, which strengthens penalties for cybercrimes and establishes the National Cryptography Centre within the Italian National Cybersecurity Agency. Additionally, the article highlights international initiatives, such as the UN's draft convention on cybercrime, emphasising the balance between security, privacy, and fundamental human rights in a post-quantum era.

Paperid: 3660, https://arxiv.org/pdf/2505.08237.pdf

Abstract:
Advanced Metering Infrastructure (AMI) data from smart electric and gas meters enables valuable insights for utilities and consumers, but also raises significant privacy concerns. In California, regulatory decisions (CPUC D.11-07-056 and D.11-08-045) mandate strict privacy protections for customer energy usage data, guided by the Fair Information Practice Principles (FIPPs). We comprehensively explore solutions drawn from data anonymization, privacy-preserving machine learning (differential privacy and federated learning), synthetic data generation, and cryptographic techniques (secure multiparty computation, homomorphic encryption). This allows advanced analytics, including machine learning models, statistical and econometric analysis on energy consumption data, to be performed without compromising individual privacy. We evaluate each technique's theoretical foundations, effectiveness, and trade-offs in the context of utility data analytics, and we propose an integrated architecture that combines these methods to meet real-world needs. The proposed hybrid architecture is designed to ensure compliance with California's privacy rules and FIPPs while enabling useful analytics, from forecasting and personalized insights to academic research and econometrics, while strictly protecting individual privacy. Mathematical definitions and derivations are provided where appropriate to demonstrate privacy guarantees and utility implications rigorously. We include comparative evaluations of the techniques, an architecture diagram, and flowcharts to illustrate how they work together in practice. The result is a blueprint for utility data scientists and engineers to implement privacy-by-design in AMI data handling, supporting both data-driven innovation and strict regulatory compliance.

Paperid: 3661, https://arxiv.org/pdf/2505.08115.pdf

Abstract:
We develop a generalized framework for invariant-based cryptography by extending the use of structural identities as core cryptographic mechanisms. Starting from a previously introduced scheme where a secret is encoded via a four-point algebraic invariant over masked functional values, we broaden the approach to include multiple classes of invariant constructions. In particular, we present new symmetric schemes based on shifted polynomial roots and functional equations constrained by symmetric algebraic conditions, such as discriminants and multilinear identities. These examples illustrate how algebraic invariants -- rather than one-way functions -- can enforce structural consistency and unforgeability. We analyze the cryptographic utility of such invariants in terms of recoverability, integrity binding, and resistance to forgery, and show that these constructions achieve security levels comparable to the original oscillatory model. This work establishes a foundation for invariant-based design as a versatile and compact alternative in symmetric cryptographic protocols.

Paperid: 3662, https://arxiv.org/pdf/2505.08050.pdf

Abstract:
Modern web browsers have effectively become the new operating system for business applications, yet their security posture is often under-scrutinized. This paper presents a novel, comprehensive Browser Security Posture Analysis Framework[1], a browser-based client-side security assessment toolkit that runs entirely in JavaScript and WebAssembly within the browser. It performs a battery of over 120 in-browser security tests in situ, providing fine-grained diagnostics of security policies and features that network-level or os-level tools cannot observe. This yields insights into how well a browser enforces critical client-side security invariants. We detail the motivation for such a framework, describe its architecture and implementation, and dive into the technical design of numerous test modules (covering the same-origin policy, cross-origin resource sharing, content security policy, sandboxing, XSS protection, extension interference via WeakRefs, permissions audits, garbage collection behavior, cryptographic APIs, SSL certificate validation, advanced web platform security features like SharedArrayBuffer, Content filtering controls ,and internal network accessibility). We then present an experimental evaluation across different browsers and enterprise scenarios, highlighting gaps in legacy browsers and common misconfigurations. Finally, we discuss the security and privacy implications of our findings, compare with related work in browser security and enterprise endpoint solutions, and outline future enhancements such as real-time posture monitoring and SIEM integration.

Paperid: 3663, https://arxiv.org/pdf/2505.07846.pdf

Abstract:
This study reveals how frontier Large Language Models LLMs can "game the system" when faced with impossible situations, a critical security and alignment concern. Using a novel textual simulation approach, we presented three leading LLMs (o1, o3-mini, and r1) with a tic-tac-toe scenario designed to be unwinnable through legitimate play, then analyzed their tendency to exploit loopholes rather than accept defeat. Our results are alarming for security researchers: the newer, reasoning-focused o3-mini model showed nearly twice the propensity to exploit system vulnerabilities (37.1%) compared to the older o1 model (17.5%). Most striking was the effect of prompting. Simply framing the task as requiring "creative" solutions caused gaming behaviors to skyrocket to 77.3% across all models. We identified four distinct exploitation strategies, from direct manipulation of game state to sophisticated modification of opponent behavior. These findings demonstrate that even without actual execution capabilities, LLMs can identify and propose sophisticated system exploits when incentivized, highlighting urgent challenges for AI alignment as models grow more capable of identifying and leveraging vulnerabilities in their operating environments.

Paperid: 3664, https://arxiv.org/pdf/2505.07828.pdf

Abstract:
The convergence of blockchain and artificial intelligence (AI) has led to the emergence of AI-based tokens, which are cryptographic assets designed to power decentralized AI platforms and services. This paper provides a comprehensive review of leading AI-token projects, examining their technical architectures, token utilities, consensus mechanisms, and underlying business models. We explore how these tokens operate across various blockchain ecosystems and assess the extent to which they offer value beyond traditional centralized AI services. Based on this assessment, our analysis identifies several core limitations. From a technical perspective, many platforms depend extensively on off-chain computation, exhibit limited capabilities for on-chain intelligence, and encounter significant scalability challenges. From a business perspective, many models appear to replicate centralized AI service structures, simply adding token-based payment and governance layers without delivering truly novel value. In light of these challenges, we also examine emerging developments that may shape the next phase of decentralized AI systems. These include approaches for on-chain verification of AI outputs, blockchain-enabled federated learning, and more robust incentive frameworks. Collectively, while emerging innovations offer pathways to strengthen decentralized AI ecosystems, significant gaps remain between the promises and the realities of current AI-token implementations. Our findings contribute to a growing body of research at the intersection of AI and blockchain, highlighting the need for critical evaluation and more grounded approaches as the field continues to evolve.

Paperid: 3665, https://arxiv.org/pdf/2505.06636.pdf

Abstract:
In intelligent industry, autonomous driving and other environments, the Internet of Things (IoT) highly integrated with robotic to form the Internet of Robotic Things (IoRT). However, network intrusion to IoRT can lead to data leakage, service interruption in IoRT and even physical damage by controlling robots or vehicles. This paper proposes a Contrastive Federated Semi-Supervised Learning Network Intrusion Detection framework (CFedSSL-NID) for IoRT intrusion detection and defense, to address the practical scenario of IoRT where robots don't possess labeled data locally and the requirement for data privacy preserving. CFedSSL-NID integrates randomly weak and strong augmentation, latent contrastive learning, and EMA update to integrate supervised signals, thereby enhancing performance and robustness on robots' local unlabeled data. Extensive experiments demonstrate that CFedSSL-NID outperforms existing federated semi-supervised and fully supervised methods on benchmark dataset and has lower resource requirements.

Paperid: 3666, https://arxiv.org/pdf/2505.06581.pdf

Abstract:
We present the first nearly optimal differentially private PAC learner for any concept class with VC dimension 1 and Littlestone dimension $d$. Our algorithm achieves the sample complexity of $\tilde{O}_{\varepsilon,Î´,Î±,Î´}(\log^* d)$, nearly matching the lower bound of $Î©(\log^* d)$ proved by Alon et al. [STOC19]. Prior to our work, the best known upper bound is $\tilde{O}(VC\cdot d^5)$ for general VC classes, as shown by Ghazi et al. [STOC21].

Paperid: 3667, https://arxiv.org/pdf/2505.06409.pdf

Abstract:
As AI models scale to billions of parameters and operate with increasing autonomy, ensuring their safe, reliable operation demands engineering-grade security and assurance frameworks. This paper presents an enterprise-level, risk-aware, security-by-design approach for large-scale autonomous AI systems, integrating standardized threat metrics, adversarial hardening techniques, and real-time anomaly detection into every phase of the development lifecycle. We detail a unified pipeline - from design-time risk assessments and secure training protocols to continuous monitoring and automated audit logging - that delivers provable guarantees of model behavior under adversarial and operational stress. Case studies in national security, open-source model governance, and industrial automation demonstrate measurable reductions in vulnerability and compliance overhead. Finally, we advocate cross-sector collaboration - uniting engineering teams, standards bodies, and regulatory agencies - to institutionalize these technical safeguards within a resilient, end-to-end assurance ecosystem for the next generation of AI.

Paperid: 3668, https://arxiv.org/pdf/2505.05934.pdf

Abstract:
Private Information Retrieval (PIR) schemes enable users to securely retrieve files from a server without disclosing the content of their queries, thereby preserving their privacy. In 2008, Melchor and Gaborit proposed a PIR scheme that achieves a balance between communication overhead and server-side computational cost. However, for particularly small databases, Liu and Bi identified a vulnerability in the scheme using lattice-based methods. Nevertheless, the rapid increase in computational cost associated with the attack limited its practical applicability, leaving the scheme's overall security largely intact. In this paper, we present a novel two-stage attack that extends the work of Liu and Bi to databases of arbitrary sizes. To this end, we employ a binary-search-like preprocessing technique, which enables a significant reduction in the number of lattice problems that need to be considered. Specifically, we demonstrate how to compromise the scheme in a matter of minutes using an ordinary laptop. Our findings are substantiated through both rigorous analytical proofs and comprehensive numerical experiments.

Paperid: 3669, https://arxiv.org/pdf/2505.05653.pdf

Abstract:
We propose a new symmetric cryptographic scheme based on functional invariants defined over discrete oscillatory functions with hidden parameters. The scheme encodes a secret integer through a four-point algebraic identity preserved under controlled parameterization. Security arises not from algebraic inversion but from structural coherence: the transmitted values satisfy an invariant that is computationally hard to forge or invert without knowledge of the shared secret. We develop the full analytic and modular framework, prove exact identities, define index-recovery procedures, and analyze security assumptions, including oscillator construction, hash binding, and invertibility conditions. The result is a compact, self-verifying mechanism suitable for secure authentication, parameter exchange, and lightweight communication protocols.

Paperid: 3670, https://arxiv.org/pdf/2505.04806.pdf

Abstract:
Large Language Models (LLMs) are increasingly integrated into consumer and enterprise applications. Despite their capabilities, they remain susceptible to adversarial attacks such as prompt injection and jailbreaks that override alignment safeguards. This paper provides a systematic investigation of jailbreak strategies against various state-of-the-art LLMs. We categorize over 1,400 adversarial prompts, analyze their success against GPT-4, Claude 2, Mistral 7B, and Vicuna, and examine their generalizability and construction logic. We further propose layered mitigation strategies and recommend a hybrid red-teaming and sandboxing approach for robust LLM security.

Paperid: 3671, https://arxiv.org/pdf/2505.03743.pdf

Abstract:
In recent years, advancements in quantum chip technology, such as Willow, have contributed to reducing quantum computation error rates, potentially accelerating the practical adoption of quantum computing. As a result, the design of quantum algorithms suitable for real-world applications has become a crucial research direction. This study focuses on the implementation of Shor algorithm, aiming to improve modular computation efficiency and demonstrate the factorization of a 4096-bit integer under specific constraints. Experimental results, when compared with state-of-the-art (SOTA) methods, indicate a significant improvement in efficiency while enabling the factorization of longer integers.

Paperid: 3672, https://arxiv.org/pdf/2505.03741.pdf

Abstract:
This paper conducts a comparative study of three IoT-based PRNG models, including Logistic Map, Double Pendulum, and Multi-LFSR, implemented on an FPGA platform. Comparisons are made across key performance metrics like randomness, latency, power consumption, hardware resource usage, energy efficiency, scalability, and application suitability. Compared to Multi-LFSR, Logistic Map, and Double Pendulum Models provide perfect quality randomness, which is quite apt for high-security grade applications; however, the requirements of these models concerning power and hardware resources are also considerably high. By contrast, the Multi-LFSR comes into its own due to its lower latency, power consumption, and resource-efficient design. It is, therefore, suited for embedded or real-time applications. Furthermore, environmental sensors will also be introduced as entropy sources for the PRNGs to enhance the randomness of the systems, particularly in IoT-enabled battery-powered FPGA platforms. The experimental results confirm that the Multi-LFSR model has the highest energy efficiency, while the Logistic Map and Double Pendulum outperform in generating numbers with very high security. The study thus provides a deeper insight into decision-making for selecting PRNG models.

Paperid: 3673, https://arxiv.org/pdf/2505.03193.pdf

Abstract:
With the rise of short video platforms in global communication, embedding steganographic data in audio synchronization streams has emerged as a new covert communication method. To address the limitations of traditional techniques in detecting synchronized steganography, this paper proposes a detection and distributed guidance reconstruction model based on short video "Yupan" samples released by China's South Sea Fleet on TikTok. The method integrates sliding spectrum feature extraction and intelligent inference mechanisms. A 25 ms sliding window with short-time Fourier transform (STFT) is used to extract the main frequency trajectory and construct the synchronization frame detection model (M1), identifying a frame flag "FFFFFFFFFFFFFFFFFF80". The subsequent 32-byte payload is decoded by a structured model (M2) to infer distributed guidance commands. Analysis reveals a low-entropy, repetitive byte sequence in the 36 to 45 second audio segment with highly concentrated spectral energy, confirming the presence of synchronization frames. Although plaintext semantics are not restored, the consistency in command field layout suggests features of military communication protocols. The multi-segment splicing model further shows cross-video embedding and centralized decoding capabilities. The proposed framework validates the effectiveness of sliding spectral features for synchronized steganography detection and builds an extensible inference model for covert communication analysis and tactical guidance simulation on open platforms.

Paperid: 3674, https://arxiv.org/pdf/2505.02261.pdf

Abstract:
This paper introduces a structured decoding framework for the Voynich Manuscript, based on mathematical rhythm, symbolic transformation, and glyph-level recursion. Rather than interpret symbols phonetically, this method decodes them by structural roles and spatial pacing. Using scroll-wide sequencing, the system tracks prime number grouping, Fibonacci clustering, and golden ratio alignment. These symbolic structures are validated using a ten-part chi-squared test suite and Boolean logic. The method is falsifiable and reproducible. Scroll sections like f57v, f88v, and f91r are used to demonstrate glyph flow, breath-segment patterns, and tri-dot alignment. This decoding strategy challenges assumptions about pre-phonetic manuscripts and proposes a new lens for interpreting symbolic logic.

Paperid: 3675, https://arxiv.org/pdf/2505.02077.pdf

Abstract:
Decentralized AI agents will soon interact across internet platforms, creating security challenges beyond traditional cybersecurity and AI safety frameworks. Free-form protocols are essential for AI's task generalization but enable new threats like secret collusion and coordinated swarm attacks. Network effects can rapidly spread privacy breaches, disinformation, jailbreaks, and data poisoning, while multi-agent dispersion and stealth optimization help adversaries evade oversightcreating novel persistent threats at a systemic level. Despite their critical importance, these security challenges remain understudied, with research fragmented across disparate fields including AI security, multi-agent learning, complex systems, cybersecurity, game theory, distributed systems, and technical AI governance. We introduce \textbf{multi-agent security}, a new field dedicated to securing networks of decentralized AI agents against threats that emerge or amplify through their interactionswhether direct or indirect via shared environmentswith each other, humans, and institutions, and characterize fundamental security-performance trade-offs. Our preliminary work (1) taxonomizes the threat landscape arising from interacting AI agents, (2) surveys security-performance tradeoffs in decentralized AI systems, and (3) proposes a unified research agenda addressing open challenges in designing secure agent systems and interaction environments. By identifying these gaps, we aim to guide research in this critical area to unlock the socioeconomic potential of large-scale agent deployment on the internet, foster public trust, and mitigate national security risks in critical infrastructure and defense contexts.

Paperid: 3676, https://arxiv.org/pdf/2505.02004.pdf

Abstract:
In a typical authentication process, the local system verifies the user's identity using a stored hash value generated by a cross-system hash algorithm. This article shifts the research focus from traditional password encryption to the establishment of gatekeeping mechanisms for effective interactions between a system and the outside world. Here, we propose a triple-identity authentication system to achieve this goal. Specifically, this local system opens the inner structure of its hash algorithm to all user credentials, including the login name, login password, and authentication password. When a login credential is entered, the local system hashes it and then creates a unique identifier using intermediate hash elements randomly selected from the open algorithm. Importantly, this locally generated unique identifier (rather than the stored hash produced by the open algorithm) is utilized to verify the user's combined identity, which is generated by combining the entered credential with the International Mobile Equipment Identity and the International Mobile Subscriber Identity. The verification process is implemented at each interaction point: the login name field, the login password field, and the server's authentication point. Thus, within the context of this triple-identity authentication system, we establish a robust gatekeeping mechanism for system interactions, ultimately providing a level of security that is equivalent to multi-factor authentication.

Paperid: 3677, https://arxiv.org/pdf/2505.00849.pdf

Abstract:
The Kirchhoff-Law-Johnson-Noise (KLJN) secure key exchange scheme leverages statistical physics to enable secure communication with zero average power flow in a wired channel. While the original KLJN scheme requires significant power for operation, a recent wireless modification, TherMod, proposed by Basar claims a "low power" implementation. This paper critically examines this claim. We explain that the additional components inherent in Basar's wireless adaptation substantially increase power consumption, rendering the "low power" assertion inappropriate. Furthermore, we clarify that the security claims of the original KLJN scheme do not directly translate to this wireless adaptation, implying significant security breach. Finally, the scheme looks identical one of the stealth communicators from 2005, which was shown not to be secure.

Paperid: 3678, https://arxiv.org/pdf/2505.00664.pdf

Abstract:
We show how oracles which only allow for classical query access can be used to construct a variety of quantum cryptographic primitives which do not require long-term quantum memory or global entanglement. Specifically, if a quantum party can execute a semi-quantum token scheme (Shmueli 2022) with probability of success $1/2 + Î´$, we can build powerful cryptographic primitives with a multiplicative logarithmic overhead for the desired correctness error. Our scheme makes no assumptions about the quantum party's noise model except for a simple independence requirement: noise on two sets of non-entangled hardware must be independent. Using semi-quantum tokens and oracles which can only be queried classically, we first show how to construct a "short-lived" semi-quantum one-time program (OTP) which allows a classical sending party to prepare a one-time program on the receiving party's quantum computer. We then show how to use this semi-quantum OTP to construct a semi-quantum "stateful obfuscation" scheme (which we term "RAM obfuscation"). Importantly, the RAM obfuscation scheme does not require long-term quantum memory or global entanglement. Finally, we show how RAM obfuscation can be used to build long-lived one-time programs and copy-protection schemes.

Paperid: 3681, https://arxiv.org/pdf/2504.21543.pdf

Abstract:
In this manuscript, we demonstrate the feasibility of a privacy-preserving U-Net deep learning inference framework, namely, homomorphic encryption-based U-Net inference. That is, U-Net inference can be performed solely using homomorphic encryption techniques. To our knowledge, this is the first work to achieve support perform implement enable U-Net inference entirely based on homomorphic encryption ?. The primary technical challenge lies in data encoding. To address this, we employ a flexible encoding scheme, termed Double Volley Revolver, which enables effective support for skip connections and upsampling operations within the U-Net architecture. We adopt a tailored HE-friendly U-Net design incorporating square activation functions, mean pooling layers, and transposed convolution layers (implemented as ConvTranspose2d in PyTorch) with a kernel size of 2 and stride of 2. After training the model in plaintext, we deploy the resulting parameters using the HEAAN homomorphic encryption library to perform encrypted U-Net inference.

Paperid: 3682, https://arxiv.org/pdf/2504.21168.pdf

Abstract:
Numerous methods have been considered to create a fast integer factorization algorithm. Despite its apparent simplicity, the difficulty to find such an algorithm plays a crucial role in modern cryptography, notably, in the security of RSA encryption. Some approaches to factoring integers quickly include the Trial Division method, Pollard's Rho and p-1 methods, and various Sieve algorithms. This paper introduces a new method that converts an integer into a sum in base-2. By combining a base-10 and base-2 representation of the integer, an algorithm on the order of $\sqrt{n}$ time complexity can convert that sum to a product of two integers, thus factoring the original number.

Paperid: 3683, https://arxiv.org/pdf/2504.21049.pdf

Abstract:
Phishing attacks threaten online users, often leading to data breaches, financial losses, and identity theft. Traditional phishing detection systems struggle with high false positive rates and are usually limited by the types of attacks they can identify. This paper proposes a deep learning-based approach using a Bidirectional Long Short-Term Memory (Bi-LSTM) network to classify URLs into four categories: benign, phishing, defacement, and malware. The model leverages sequential URL data and captures contextual information, improving the accuracy of phishing detection. Experimental results on a dataset comprising over 650,000 URLs demonstrate the model's effectiveness, achieving 97% accuracy and significant improvements over traditional techniques.

Paperid: 3684, https://arxiv.org/pdf/2504.20180.pdf

Abstract:
The increasing adoption of autonomous vehicles is bringing a major shift in the automotive industry. However, as these vehicles become more connected, cybersecurity threats have emerged as a serious concern. Protecting the security and integrity of autonomous systems is essential to prevent malicious activities that can harm passengers, other road users, and the overall transportation network. This paper focuses on addressing the cybersecurity issues in autonomous vehicles by examining the challenges and risks involved, which are important for building a secure future. Since autonomous vehicles depend on the communication between sensors, artificial intelligence, external infrastructure, and other systems, they are exposed to different types of cyber threats. A cybersecurity breach in an autonomous vehicle can cause serious problems, including a loss of public trust and safety. Therefore, it is very important to develop and apply strong cybersecurity measures to support the growth and acceptance of self-driving cars. This paper discusses major cybersecurity challenges like vulnerabilities in software and hardware, risks from wireless communication, and threats through external interfaces. It also reviews existing solutions such as secure software development, intrusion detection systems, cryptographic protocols, and anomaly detection methods. Additionally, the paper highlights the role of regulatory bodies, industry collaborations, and cybersecurity standards in creating a secure environment for autonomous vehicles. Setting clear rules and best practices is necessary for consistent protection across manufacturers and regions. By analyzing the current cybersecurity landscape and suggesting practical countermeasures, this paper aims to contribute to the safe development and public trust of autonomous vehicle technology.

Paperid: 3685, https://arxiv.org/pdf/2504.19997.pdf

Abstract:
The increased adoption of the Model Context Protocol (MCP) for AI Agents necessitates robust security for Enterprise integrations. This paper introduces the MCP Gateway to simplify self-hosted MCP server integration. The proposed architecture integrates security principles, authentication, intrusion detection, and secure tunneling, enabling secure self-hosting without exposing infrastructure. Key contributions include a reference architecture, threat model mapping, simplified integration strategies, and open-source implementation recommendations. This work focuses on the unique challenges of enterprise-centric, self-hosted AI integrations, unlike existing public MCP server solutions.

Paperid: 3686, https://arxiv.org/pdf/2504.19128.pdf

Abstract:
Speculative execution is a hardware optimisation technique where a processor, while waiting on the completion of a computation required for an instruction, continues to execute later instructions based on a predicted value of the pending computation. It came to the forefront of security research in 2018 with the disclosure of two related attacks, Spectre and Meltdown. Since then many similar attacks have been identified. While there has been much research on using formal methods to detect speculative execution vulnerabilities based on predicted control flow, there has been significantly less on vulnerabilities based on predicted data flow. In this paper, we introduce an approach for detecting the data flow vulnerabilities, Spectre-STL and Spectre-PSF, using weakest precondition reasoning. We validate our approach on a suite of litmus tests used to validate related approaches in the literature.

Paperid: 3687, https://arxiv.org/pdf/2504.18966.pdf

Abstract:
Blockchain technology has completely revolutionized the field of decentralized finance with the emergence of a variety of cryptocurrencies and digital assets. However, widespread adoption of this technology by governments and enterprises has been limited by concerns regarding the technology's scalability, governance, and economic sustainability. This paper aims to introduce a novel hybrid blockchain architecture that balances scalability, governance, and decentralization while being economically viable for all parties involved. The new semi-centralized model leverages strategies not prevalent in the field, such as resource and node isolation, containerization, separation of networking and compute layers, use of a Kafka pub-sub network instead of a peer-to-peer network, and stakes-based validator selection to possibly mitigate a variety of issues related to scalability, security, governance, and economic sustainability. Simulations conducted on Kubernetes demonstrate the architecture's ability to achieve over 1000 transactions per second, with consistent performance across scaled deployments, even on a lightweight consumer-grade laptop with resource constraints. The findings highlight the system's scalability, security, and economic viability, offering a robust framework for enterprise and government adoption.

Paperid: 3688, https://arxiv.org/pdf/2504.18577.pdf

Abstract:
We investigate the scale of attack and defense mathematically in the context of AI's possible effect on cybersecurity. For a given target today, highly scaled cyber attacks such as from worms or botnets typically all fail or all succeed. Here, we consider the effect of scale if those attack agents were intelligent and creative enough to act independently such that each attack attempt was different from the others or such that attackers could learn from their successes and failures. We find that small increases in the number or quality of defenses can compensate for exponential increases in the number of independent attacks and for exponential speedups.

Paperid: 3689, https://arxiv.org/pdf/2504.18566.pdf

Abstract:
Distributed Denial of Service (DDoS) attacks represent a persistent and evolving threat to modern networked systems, capable of causing large-scale service disruptions. The complexity of such attacks, often hidden within high-dimensional and redundant network traffic data, necessitates robust and intelligent feature selection techniques for effective detection. Traditional methods such as filter-based, wrapper-based, and embedded approaches, each offer strengths but struggle with scalability or adaptability in complex attack environments. In this study, we explore these existing techniques through a detailed comparative analysis and highlight their limitations when applied to large-scale DDoS detection tasks. Building upon these insights, we introduce a novel Generative Adversarial Network-based Feature Selection (GANFS) method that leverages adversarial learning dynamics to identify the most informative features. By training a GAN exclusively on attack traffic and employing a perturbation-based sensitivity analysis on the Discriminator, GANFS effectively ranks feature importance without relying on full supervision. Experimental evaluations using the CIC-DDoS2019 dataset demonstrate that GANFS not only improves the accuracy of downstream classifiers but also enhances computational efficiency by significantly reducing feature dimensionality. These results point to the potential of integrating generative learning models into cybersecurity pipelines to build more adaptive and scalable detection systems.

Paperid: 3690, https://arxiv.org/pdf/2504.18423.pdf

Abstract:
Despite the transformative impact of Artificial Intelligence (AI) across various sectors, cyber security continues to rely on traditional static and dynamic analysis tools, hampered by high false positive rates and superficial code comprehension. While generative AI offers promising automation capabilities for software development, leveraging Large Language Models (LLMs) for vulnerability detection presents unique challenges. This paper explores the potential and limitations of LLMs in identifying vulnerabilities, acknowledging inherent weaknesses such as hallucinations, limited context length, and knowledge cut-offs. Previous attempts employing machine learning models for vulnerability detection have proven ineffective due to limited real-world applicability, feature engineering challenges, lack of contextual understanding, and the complexities of training models to keep pace with the evolving threat landscape. Therefore, we propose a robust AI-driven approach focused on mitigating these limitations and ensuring the quality and reliability of LLM based vulnerability detection. Through innovative methodologies combining Retrieval-Augmented Generation (RAG) and Mixtureof-Agents (MoA), this research seeks to leverage the strengths of LLMs while addressing their weaknesses, ultimately paving the way for dependable and efficient AI-powered solutions in securing the ever-evolving software landscape.

Paperid: 3691, https://arxiv.org/pdf/2504.17906.pdf

Abstract:
Access control needs have broad design implications, but access control specifications may be elicited before, during, or after these needs are captured. Because access control knowledge is distributed, we need to make knowledge asymmetries more transparent, and use expertise already available to stakeholders. In this paper, we present a tool-supported technique identifying knowledge asymmetries around access control based on asset and goal models. Using simple and conventional modelling languages that complement different design techniques, we provide boundary objects to make access control transparent, thereby making knowledge about access control concerns more symmetric. We illustrate this technique using a case study example considering the suitability of a reusable software component in a new military air system.

Paperid: 3692, https://arxiv.org/pdf/2504.17904.pdf

Abstract:
There is growing interest in securing the hardware foundations software stacks build upon. However, before making any investment decision, software and hardware supply chain stakeholders require evidence from realistic, multiple long-term studies of adoption. We present results from a 12 month evaluation of one such secure hardware solution, CHERI, where 15 teams from industry and academia ported software relevant to Defence to Arm's experimental Morello board. We identified six types of blocker inhibiting adoption: dependencies, a knowledge premium, missing utilities, performance, platform instability, and technical debt. We also identified three types of enabler: tool assistance, improved quality, and trivial code porting. Finally, we identified five types of potential vulnerability that CHERI could, if not appropriately configured, expand a system's attack surface: state leaks, memory leaks, use after free vulnerabilities, unsafe defaults, and tool chain instability. Future work should remove potentially insecure defaults from CHERI tooling, and develop a CHERI body of knowledge to further adoption.

Paperid: 3693, https://arxiv.org/pdf/2504.17759.pdf

Abstract:
This paper introduces the Identity Control Plane (ICP), an architectural framework for enforcing identity-aware Zero Trust access across human users, workloads, and automation systems. The ICP model unifies SPIFFE-based workload identity, OIDC/SAML user identity, and scoped automation credentials via broker-issued transaction tokens. We propose a composable enforcement layer using ABAC policy engines (e.g., OPA, Cedar), aligned with IETF WIMSE drafts and OAuth transaction tokens. The paper includes architectural components, integration patterns, use cases, a comparative analysis with current models, and theorized performance metrics. A FedRAMP and SLSA compliance mapping is also presented. This is a theoretical infrastructure architecture paper intended for security researchers and platform architects. No prior version of this work has been published.

Paperid: 3694, https://arxiv.org/pdf/2504.16118.pdf

Abstract:
As cyber threats continue to evolve, securing edge networks has become increasingly challenging due to their distributed nature and resource limitations. Many AI-driven threat detection systems rely on complex deep learning models, which, despite their high accuracy, suffer from two major drawbacks: lack of interpretability and high computational cost. Black-box AI models make it difficult for security analysts to understand the reasoning behind their predictions, limiting their practical deployment. Moreover, conventional deep learning techniques demand significant computational resources, rendering them unsuitable for edge devices with limited processing power. To address these issues, this study introduces an Explainable and Lightweight AI (ELAI) framework designed for real-time cyber threat detection in edge networks. Our approach integrates interpretable machine learning algorithms with optimized lightweight deep learning techniques, ensuring both transparency and computational efficiency. The proposed system leverages decision trees, attention-based deep learning, and federated learning to enhance detection accuracy while maintaining explainability. We evaluate ELAI using benchmark cybersecurity datasets, such as CICIDS and UNSW-NB15, assessing its performance across diverse cyberattack scenarios. Experimental results demonstrate that the proposed framework achieves high detection rates with minimal false positives, all while significantly reducing computational demands compared to traditional deep learning methods. The key contributions of this work include: (1) a novel interpretable AI-based cybersecurity model tailored for edge computing environments, (2) an optimized lightweight deep learning approach for real-time cyber threat detection, and (3) a comprehensive analysis of explainability techniques in AI-driven cybersecurity applications.

Paperid: 3695, https://arxiv.org/pdf/2504.16110.pdf

Abstract:
The conversation around artificial intelligence (AI) often focuses on safety, transparency, accountability, alignment, and responsibility. However, AI security (i.e., the safeguarding of data, models, and pipelines from adversarial manipulation) underpins all of these efforts. This manuscript posits that AI security must be prioritized as a foundational layer. We present a hierarchical view of AI challenges, distinguishing security from safety, and argue for a security-first approach to enable trustworthy and resilient AI systems. We discuss core threat models, key attack vectors, and emerging defense mechanisms, concluding that a metric-driven approach to AI security is essential for robust AI safety, transparency, and accountability.

Paperid: 3696, https://arxiv.org/pdf/2504.16108.pdf

Abstract:
The rise of autonomous AI agents in enterprise and industrial environments introduces a critical challenge: how to securely assign, verify, and manage their identities across distributed systems. Existing identity frameworks based on API keys, certificates, or application-layer credentials lack the infrastructure-grade trust, lifecycle control, and interoperability needed to manage agents operating independently in sensitive contexts. In this paper, we propose a conceptual architecture that leverages telecom-grade eSIM infrastructure, specifically hosted by mobile network operators (MNOs), to serve as a root of trust for AI agents. Rather than embedding SIM credentials in hardware devices, we envision a model where telcos host secure, certified hardware modules (eUICC or HSM) that store and manage agent-specific eSIM profiles. Agents authenticate remotely via cryptographic APIs or identity gateways, enabling scalable and auditable access to enterprise networks and services. We explore use cases such as onboarding enterprise automation agents, securing AI-driven financial systems, and enabling trust in inter-agent communications. We identify current limitations in GSMA and 3GPP standards, particularly their device centric assumptions, and propose extensions to support non-physical, software-based agents within trusted execution environments. This paper is intended as a conceptual framework to open discussion around standardization, security architecture, and the role of telecom infrastructure in the evolving agent economy.

Paperid: 3697, https://arxiv.org/pdf/2504.16088.pdf

Abstract:
This paper is a tutorial on the proven but currently under-appreciated security mechanisms associated with "tagged" or "descriptor" architectures. The tutorial shows how the principles behind such architectures can be applied to mitigate or eliminate vulnerabilities. The tutorial incorporates systems engineering practices by presenting the mechanisms in an informal model of an integrated artifact in its operational environment. The artifact is a special-purpose hardware/software system called a "Guard" which robustly hosts defensive software. It is hoped that this tutorial may encourage teachers to include significant past work in their curricula and students who are self-teaching to add that work to their exploration of secure computing.

Paperid: 3698, https://arxiv.org/pdf/2504.16085.pdf

Abstract:
The increasing demand for sustainability and compliance with global carbon regulations has posed significant challenges for small and medium-sized enterprises (SMEs). This paper proposes a blockchain-based decentralized carbon credit trading platform tailored for SMEs in Taiwan, aiming to simplify the complex carbon trading process and lower market entry barriers. Drawing upon the Diffusion of Innovations theory and transaction cost economics, we illustrate how blockchain technology can reduce informational asymmetry and intermediary costs in carbon markets. By integrating Ethereum-based smart contracts, the platform automates transactions, enhances transparency, and reduces administrative burdens - addressing key obstacles such as technical complexity and market risks. A controlled experimental design was conducted to compare the proposed system with a conventional centralized carbon trading platform. Statistical analysis confirms its effectiveness in minimizing time and expenses while ensuring compliance with the Carbon Border Adjustment Mechanism (CBAM) and the Clean Competition Act (CCA). User satisfaction was measured using the Kano model, with the results identifying essential features and prioritizing future enhancements. This study contributes a more comprehensive solution for SMEs seeking to achieve carbon neutrality, underscoring the transformative potential of blockchain technology in global carbon markets.

Paperid: 3699, https://arxiv.org/pdf/2504.15880.pdf

Abstract:
Directed Acyclic Graph (DAG)-based Distributed Ledger Technologies (DLTs) have emerged as a promising solution to the scalability issues inherent in traditional blockchains. However, amidst the focus on scalability, the crucial aspect of privacy within DAG-based DLTs has been largely overlooked. This paper seeks to address this gap by providing a comprehensive examination of privacy notions and challenges within DAG-based DLTs. We delve into potential methodologies to enhance privacy within these systems, while also analyzing the associated hurdles and real-world implementations within state-of-the-art DAG-based DLTs. By exploring these methodologies, we not only illuminate the current landscape of privacy in DAG-based DLTs but also outline future research directions in this evolving field.

Paperid: 3701, https://arxiv.org/pdf/2504.14777.pdf

Abstract:
This paper introduces intent-aware authorization for Zero Trust CI/CD systems. Identity establishes who is making the request, but additional signals are required to decide whether access should be granted. We describe a control loop architecture where policy engines such as OPA and Cedar evaluate runtime context, justification, and human approvals before issuing access credentials. The system builds on SPIFFE-based workload identity and credential brokers, and enables fine-grained, auditable authorization. This is the third paper in a series on Zero Trust CI/CD design patterns.

Paperid: 3702, https://arxiv.org/pdf/2504.14761.pdf

Abstract:
Credential brokers offer a way to separate identity from access in CI/CD systems. This paper shows how verifiable identities issued at runtime, such as those from SPIFFE, can be used with brokers to enable short-lived, policy-driven credentials for pipelines and workloads. We walk through practical design patterns, including brokers that issue tokens just in time, apply access policies, and operate across trust domains. These ideas help reduce static permissions, improve auditability, and support Zero Trust goals in deployment workflows. This is the second paper in a three-part series on secure CI/CD identity architecture.

Paperid: 3703, https://arxiv.org/pdf/2504.14760.pdf

Abstract:
CI/CD systems have become privileged automation agents in modern infrastructure, but their identity is still based on secrets or temporary credentials passed between systems. In enterprise environments, these platforms are centralized and shared across teams, often with broad cloud permissions and limited isolation. These conditions introduce risk, especially in the era of supply chain attacks, where implicit trust and static credentials leave systems exposed. This paper describes the shift from static credentials to OpenID Connect (OIDC) federation, and introduces SPIFFE (Secure Production Identity Framework for Everyone) as a runtime-issued, platform-neutral identity model for non-human actors. SPIFFE decouples identity from infrastructure, enabling strong, portable authentication across job runners and deployed workloads. We show how SPIFFE identities support policy alignment, workload attestation, and mutual authentication. The paper concludes by outlining next steps in enabling policy-based access, forming the basis of a broader Zero Trust architecture for CI/CD.

Paperid: 3704, https://arxiv.org/pdf/2504.14069.pdf

Abstract:
Ethereum, the leading platform for decentralized applications, faces challenges in maintaining decentralization due to the significant hardware requirements for validators to store Ethereum's entire state. To address this, the concept of stateless clients is under exploration, enabling validators to verify transactions using cryptographic witnesses rather than the full state. This paper compares two approaches currently being discussed for achieving statelessness: Verkle trees utilizing vector commitments and binary Merkle trees combined with SNARKs. Benchmarks are performed to evaluate proving time, witness size, and verification time. The results reveal that the Verkle tree implementation used for benchmarking offers proving and verification times on the order of seconds and proof sizes on the order of one MB. The SNARK-based Merkle trees exhibit slow proof generation times, while offering constant and fast verification time. Overall, the results indicate for Verkle trees to provide a more practical solution for Ethereum's stateless future, but both methods offer valuable insights into reducing the state burden on Ethereum nodes. We make the code used for benchmarking available on GitHub.

Paperid: 3705, https://arxiv.org/pdf/2504.14016.pdf

Abstract:
Some of our current public key methods use a trap door to implement digital signature methods. This includes the RSA method, which uses Fermat's little theorem to support the creation and verification of a digital signature. The problem with a back-door is that the actual trap-door method could, in the end, be discovered. With the rise of PQC (Post Quantum Cryptography), we will see a range of methods that will not use trap doors and provide stronger proof of security. In this case, we use hash-based signatures (as used with SPHINCS+) and Fiat Shamir signatures using Zero Knowledge Proofs (as used with Dilithium).

Paperid: 3706, https://arxiv.org/pdf/2504.13957.pdf

Abstract:
Language is not neutral; it frames understanding, structures power, and shapes governance. This paper argues that misnomers like cybersecurity and artificial intelligence (AI) are more than semantic quirks; they carry significant governance risks by obscuring human agency, inflating expectations, and distorting accountability. Drawing on lessons from cybersecurity's linguistic pitfalls, such as the 'weakest link' narrative, this paper highlights how AI discourse is falling into similar traps with metaphors like 'alignment,' 'black box,' and 'hallucination.' These terms embed adversarial, mystifying, or overly technical assumptions into governance structures. In response, the paper advocates for a language-first approach to AI governance: one that interrogates dominant metaphors, foregrounds human roles, and co-develops a lexicon that is precise, inclusive, and reflexive. This paper contends that linguistic reform is not peripheral to governance but central to the construction of transparent, equitable, and anticipatory regulatory frameworks.

Paperid: 3707, https://arxiv.org/pdf/2504.13747.pdf

Abstract:
In this work, we introduce two complementary metrics for quantifying and scoring privilege risk in Microsoft Azure. In the Control Plane, we define the WAR distance, a superincreasing distance over Write, Action, and Read control permissions, which yields a total ordering of principals by their configuration power. In the Data Plane, we present a blast radius distance for measuring the maximum breadth of data exfiltration and forgery, leveraging the natural ultrametry of Azure Tenants clustering hierarchy Together, these metrics offer a unified framework for proactive IAM analysis, ranking, lifecycle monitoring, and least privilege enforcement.

Paperid: 3708, https://arxiv.org/pdf/2504.13527.pdf

Abstract:
Foundation models have recently emerged as a new paradigm in machine learning (ML). These models are pre-trained on large and diverse datasets and can subsequently be applied to various downstream tasks with little or no retraining. This allows people without advanced ML expertise to build ML applications, accelerating innovation across many fields. However, the adoption of foundation models in cybersecurity is hindered by their inability to efficiently process data such as network traffic captures or binary executables. The recent introduction of graph foundation models (GFMs) could make a significant difference, as graphs are well-suited to representing these types of data. We study the usability of GFMs in cybersecurity through the lens of one specific use case, namely lateral movement detection. Using a pre-trained GFM, we build a detector that reaches state-of-the-art performance without requiring any training on domain-specific data. This case study thus provides compelling evidence of the potential of GFMs for cybersecurity.

Paperid: 3709, https://arxiv.org/pdf/2504.13371.pdf

Abstract:
Unlike other domains of conflict, and unlike other fields with high anticipated risk from AI, the cyber domain is intrinsically digital with a tight feedback loop between AI training and cyber application. Cyber may have some of the largest and earliest impacts from AI, so it is important to understand how the cyber domain may change as AI continues to advance. Our approach reviewed the literature, collecting nine arguments that have been proposed for offensive advantage in cyber conflict and nine proposed arguments for defensive advantage. We include an additional forty-eight arguments that have been proposed to give cyber conflict and competition its character as collected separately by Healey, Jervis, and Nandrajog. We then consider how each of those arguments and propositions might change with varying degrees of AI advancement. We find that the cyber domain is too multifaceted for a single answer to whether AI will enhance offense or defense broadly. AI will improve some aspects, hinder others, and leave some aspects unchanged. We collect and present forty-four ways that we expect AI to impact the cyber offense-defense balance and the character of cyber conflict and competition.

Paperid: 3710, https://arxiv.org/pdf/2504.13205.pdf

Abstract:
As generative AI models produce increasingly realistic output, both academia and industry are focusing on the ability to detect whether an output was generated by an AI model or not. Many of the research efforts and policy discourse are centered around robust watermarking of AI outputs. While plenty of progress has been made, all watermarking and AI detection techniques face severe limitations. In this position paper, we argue that we are adopting the wrong approach, and should instead focus on watermarking via cryptographic signatures trustworthy content rather than AI generated ones. For audio-visual content, in particular, all real content is grounded in the physical world and captured via hardware sensors. This presents a unique opportunity to watermark at the hardware layer, and we lay out a socio-technical framework and draw parallels with HTTPS certification and Blu-Ray verification protocols. While acknowledging implementation challenges, we contend that hardware-based authentication offers a more tractable path forward, particularly from a policy perspective. As generative models approach perceptual indistinguishability, the research community should be wary of being overly optimistic with AI watermarking, and we argue that AI watermarking research efforts are better spent in the text and LLM space, which are ultimately not traceable to a physical sensor.

Paperid: 3711, https://arxiv.org/pdf/2504.12579.pdf

Abstract:
The security of private communication is increasingly at risk due to widespread surveillance. Steganography, a technique for embedding secret messages within innocuous carriers, enables covert communication over monitored channels. Provably Secure Steganography (PSS) is state of the art for making stego carriers indistinguishable from normal ones by ensuring computational indistinguishability between stego and cover distributions. However, current PSS methods often require explicit access to the distribution of generative model for both sender and receiver, limiting their practicality in black box scenarios. In this paper, we propose a provably secure steganography scheme that does not require access to explicit model distributions for both sender and receiver. Our method incorporates a dynamic sampling strategy, enabling generative models to embed secret messages within multiple sampling choices without disrupting the normal generation process of the model. Extensive evaluations of three real world datasets and three LLMs demonstrate that our blackbox method is comparable with existing white-box steganography methods in terms of efficiency and capacity while eliminating the degradation of steganography in model generated outputs.

Paperid: 3712, https://arxiv.org/pdf/2504.12493.pdf

Abstract:
Blockchains and peer-to-peer systems are part of a trend towards computer systems that are "radically decentralised", by which we mean that they 1) run across many participants, 2) without central control, and 3) are such that qualities 1 and 2 are essential to the system's intended use cases. We propose a notion of topological space, which we call a "semitopology", to help us mathematically model such systems. We treat participants as points in a space, which are organised into "actionable coalitions". An actionable coalition is any set of participants who collectively have the resources to collaborate (if they choose) to progress according to the system's rules, without involving any other participants in the system. It turns out that much useful information about the system can be obtained \emph{just} by viewing it as a semitopology and studying its actionable coalitions. For example: we will prove a mathematical sense in which if every actionable coalition of some point p has nonempty intersection with every actionable coalition of another point q -- note that this is the negation of the famous Hausdorff separation property from topology -- then p and q must remain in agreement. This is of practical interest, because remaining in agreement is a key correctness property in many distributed systems. For example in blockchain, participants disagreeing is called "forking", and blockchain designers try hard to avoid it. We provide an accessible introduction to: the technical context of decentralised systems; why we build them and find them useful; how they motivate the theory of semitopological spaces; and we sketch some basic theorems and applications of the resulting mathematics.

Paperid: 3713, https://arxiv.org/pdf/2504.11661.pdf

Abstract:
Cybersecurity often hinges on unpredictability, with a system's defenses being strongest when sensitive values and behaviors cannot be anticipated by attackers. This paper explores the concept of entropy injection-deliberately infusing randomness into security mechanisms to increase unpredictability and enhance system security. We examine the theoretical foundations of entropy-based security, analyze real-world implementations including Address Space Layout Randomization (ASLR) and Moving Target Defense (MTD) frameworks, evaluate practical challenges in implementation, and compare entropy-based approaches with traditional security methods. Our methodology includes a systematic analysis of entropy's role across various security domains, from cryptographic operations to system-level defenses. Results demonstrate that entropy injection can significantly reduce attack probability, with some implementations showing more than 90% reduction with minimal performance impact. The discussion highlights the trade-offs between security benefits and operational complexity, while identifying future directions for entropy-enhanced security, including integration with artificial intelligence and quantum randomness sources. We conclude that entropy injection represents a paradigm shift from reactive defense to proactive uncertainty management, offering a strategic approach that can fundamentally alter the balance between attackers and defenders in cybersecurity.

Paperid: 3714, https://arxiv.org/pdf/2504.10603.pdf

Abstract:
The rapid integration of Generative AI (GenAI) into various applications necessitates robust risk management strategies which includes Red Teaming (RT) - an evaluation method for simulating adversarial attacks. Unfortunately, RT for GenAI is often hindered by technical complexity, lack of user-friendly interfaces, and inadequate reporting features. This paper introduces Violent UTF - an accessible, modular, and scalable platform for GenAI red teaming. Through intuitive interfaces (Web GUI, CLI, API, MCP) powered by LLMs and for LLMs, Violent UTF aims to empower non-technical domain experts and students alongside technical experts, facilitate comprehensive security evaluation by unifying capabilities from RT frameworks like Microsoft PyRIT, Nvidia Garak and its own specialized evaluators. ViolentUTF is being used for evaluating the robustness of a flagship LLM-based product in a large US Government department. It also demonstrates effectiveness in evaluating LLMs' cross-domain reasoning capability between cybersecurity and behavioral psychology.

Paperid: 3715, https://arxiv.org/pdf/2504.08227.pdf

Abstract:
DaemonSec is an early-stage startup exploring machine learning (ML)-based security for Linux daemons, a critical yet often overlooked attack surface. While daemon security remains underexplored, conventional defenses struggle against adaptive threats and zero-day exploits. To assess the perspectives of IT professionals on ML-driven daemon protection, a systematic interview study based on semi-structured interviews was conducted with 22 professionals from industry and academia. The study evaluates adoption, feasibility, and trust in ML-based security solutions. While participants recognized the potential of ML for real-time anomaly detection, findings reveal skepticism toward full automation, limited security awareness among non-security roles, and concerns about patching delays creating attack windows. This paper presents the methods, key findings, and implications for advancing ML-driven daemon security in industry.

Paperid: 3716, https://arxiv.org/pdf/2504.07140.pdf

Abstract:
This work presents an encryption model based on Generative Adversarial Networks (GANs). Encryption of RTF-8 data is realized by dynamically generating decimal numbers that lead to the encryption and decryption of alphabetic strings in integer representation by simple addition rules, the modulus of the dimension of the considered alphabet. The binary numbers for the private dynamic keys correspond to the binary numbers of public reference keys, as defined by a specific GAN configuration. For reversible encryption with a bijective mapping between dynamic and reference keys, as defined by the GAN encryptor, secure text encryption can be achieved by transferring a GAN-encrypted public key along with the encrypted text from a sender to a receiver. Using the technique described above, secure text mail transfer can be realized through component-wise encryption and decryption of text mail strings, with total key sizes of up to $10^{8}$ bits that define random decimal numbers generated by the GAN. From the present model, we assert that encrypted texts can be transmitted more efficiently and securely than from RSA encryption, as long as users of the specific configuration of the GAN encryption model are unaware of the GAN encryptor circuit and configuration, respectively.

Paperid: 3717, https://arxiv.org/pdf/2504.06320.pdf

Abstract:
Cyberattacks on critical infrastructure, particularly water distribution systems, have increased due to rapid digitalization and the integration of IoT devices and industrial control systems (ICS). These cyber-physical systems (CPS) introduce new vulnerabilities, requiring robust and automated intrusion detection systems (IDS) to mitigate potential threats. This study addresses key challenges in anomaly detection by leveraging time correlations in sensor data, integrating physical principles into machine learning models, and optimizing computational efficiency for edge applications. We build upon the concept of temporal differential consistency (TDC) loss to capture the dynamics of the system, ensuring meaningful relationships between dynamic states. Expanding on this foundation, we propose a hybrid autoencoder-based approach, referred to as hybrid TDC-AE, which extends TDC by incorporating both deterministic nodes and conventional statistical nodes. This hybrid structure enables the model to account for non-deterministic processes. Our approach achieves state-of-the-art classification performance while improving time to detect anomalies by 3%, outperforming the BATADAL challenge leader without requiring domain-specific knowledge, making it broadly applicable. Additionally, it maintains the computational efficiency of conventional autoencoders while reducing the number of fully connected layers, resulting in a more sustainable and efficient solution. The method demonstrates how leveraging physics-inspired consistency principles enhances anomaly detection and strengthens the resilience of cyber-physical systems.

Paperid: 3718, https://arxiv.org/pdf/2504.05094.pdf

Abstract:
Blockchain systems, such as Ethereum, are increasingly adopting layer-2 scaling solutions to improve transaction throughput and reduce fees. One popular layer-2 approach is the Optimistic Rollup, which relies on a mechanism known as a dispute game for block proposals. In these systems, validators can challenge blocks that they believe contain errors, and a successful challenge results in the transfer of a portion of the proposer's deposit as a reward. In this paper, we reveal a structural vulnerability in the mechanism: validators may not be awarded a proper profit despite winning a dispute challenge. We develop a formal game-theoretic model of the dispute game and analyze several scenarios, including cases where the proposer controls some validators and cases where a secondary auction mechanism is deployed to induce additional participation. Our analysis demonstrates that under current designs, the competitive pressure from validators may be insufficient to deter malicious behavior. We find that increased validator competition, paradoxically driven by higher rewards or participation, can allow a malicious proposer to significantly lower their net loss by capturing value through mechanisms like auctions. To address this, we propose countermeasures such as an escrowed reward mechanism and a commit-reveal protocol. Our findings provide critical insights into enhancing the economic security of layer-2 scaling solutions in blockchain networks.

Paperid: 3719, https://arxiv.org/pdf/2504.04187.pdf

Abstract:
Malicious examples are crucial for evaluating the robustness of machine learning algorithms under attack, particularly in Industrial Control Systems (ICS). However, collecting normal and attack data in ICS environments is challenging due to the scarcity of testbeds and the high cost of human expertise. Existing datasets are often limited by the domain expertise of practitioners, making the process costly and inefficient. The lack of comprehensive attack pattern data poses a significant problem for developing robust anomaly detection methods. In this paper, we propose a novel approach that combines data-centric and design-centric methodologies to generate attack patterns using large language models (LLMs). Our results demonstrate that the attack patterns generated by LLMs not only surpass the quality and quantity of those created by human experts but also offer a scalable solution that does not rely on expensive testbeds or pre-existing attack examples. This multi-agent based approach presents a promising avenue for enhancing the security and resilience of ICS environments.

Paperid: 3720, https://arxiv.org/pdf/2504.03765.pdf

Abstract:
The rapid advancement of generative artificial intelligence (GenAI) has revolutionized content creation across text, visual, and audio domains, simultaneously introducing significant risks such as misinformation, identity fraud, and content manipulation. This paper presents a practical survey of watermarking techniques designed to proactively detect GenAI content. We develop a structured taxonomy categorizing watermarking methods for text, visual, and audio modalities and critically evaluate existing approaches based on their effectiveness, robustness, and practicality. Additionally, we identify key challenges, including resistance to adversarial attacks, lack of standardization across different content types, and ethical considerations related to privacy and content ownership. Finally, we discuss potential future research directions aimed at enhancing watermarking strategies to ensure content authenticity and trustworthiness. This survey serves as a foundational resource for researchers and practitioners seeking to understand and advance watermarking techniques for AI-generated content detection.

Paperid: 3721, https://arxiv.org/pdf/2504.03752.pdf

Abstract:
The rapid proliferation of generative AI has led to an internet increasingly populated with synthetic content-text, images, audio, and video generated without human intervention. As the distinction between human and AI-generated data blurs, the ability to verify content origin becomes critical for applications ranging from social media and journalism to legal and financial systems. In this paper, we propose a conceptual, multi-layer architectural framework that enables telecommunications networks to act as infrastructure level certifiers of human-originated content. By leveraging identity anchoring at the physical layer, metadata propagation at the network and transport layers, and cryptographic attestations at the session and application layers, Telcos can provide an end-to-end Proof of Humanity for data traversing their networks. We outline how each OSI layer can contribute to this trust fabric using technical primitives such as SIM/eSIM identity, digital signatures, behavior-based ML heuristics, and edge-validated APIs. The framework is presented as a foundation for future implementation, highlighting monetization pathways for telcos such as trust-as-a-service APIs, origin-certified traffic tiers, and regulatory compliance tools. The paper does not present implementation or benchmarking results but offers a technically detailed roadmap and strategic rationale for transforming Telcos into validators of digital authenticity in an AI-dominated internet. Security, privacy, and adversarial considerations are discussed as directions for future work.

Paperid: 3722, https://arxiv.org/pdf/2504.03750.pdf

Abstract:
Financial fraud detection remains a critical challenge due to the dynamic and adversarial nature of fraudulent behavior. As fraudsters evolve their tactics, detection systems must combine robustness, adaptability, and precision. This study presents a hybrid architecture for credit card fraud detection that integrates a Mixture of Experts (MoE) framework with Recurrent Neural Networks (RNNs), Transformer encoders, and Autoencoders. Each expert module contributes a specialized capability: RNNs capture sequential behavior, Transformers extract high-order feature interactions, and Autoencoders detect anomalies through reconstruction loss. The MoE framework dynamically assigns predictive responsibility among the experts, enabling adaptive and context-sensitive decision-making. Trained on a high-fidelity synthetic dataset that simulates real-world transaction patterns and fraud typologies, the hybrid model achieved 98.7 percent accuracy, 94.3 percent precision, and 91.5 percent recall, outperforming standalone models and classical machine learning baselines. The Autoencoder component significantly enhanced the system's ability to identify emerging fraud strategies and atypical behaviors. Beyond technical performance, the model contributes to broader efforts in financial governance and crime prevention. It supports regulatory compliance with Anti-Money Laundering (AML) and Know Your Customer (KYC) protocols and aligns with routine activity theory by operationalizing AI as a capable guardian within financial ecosystems. The proposed hybrid system offers a scalable, modular, and regulation-aware approach to detecting increasingly sophisticated fraud patterns, contributing both to the advancement of intelligent systems and to the strengthening of institutional fraud defense infrastructures.

Paperid: 3723, https://arxiv.org/pdf/2504.03002.pdf

Abstract:
Federated learning is a method used in machine learning to allow multiple devices to work together on a model without sharing their private data. Each participant keeps their private data on their system and trains a local model and only sends updates to a central server, which combines these updates to improve the overall model. A key enabler of privacy in FL is homomorphic encryption (HE). HE allows computations to be performed directly on encrypted data. While HE offers strong privacy guarantees, it is computationally intensive, leading to significant latency and scalability issues, particularly for large-scale models like BERT. In my research, I aimed to address this inefficiency problem. My research introduces a novel algorithm to address these inefficiencies while maintaining robust privacy guarantees. I integrated several mathematical techniques such as selective parameter encryption, sensitivity maps, and differential privacy noise within my algorithms, which has already improved its efficiency. I have also conducted rigorous mathematical proofs to validate the correctness and robustness of the approach. I implemented this algorithm by coding it in C++, simulating the environment of federated learning on large-scale models, and verified that the efficiency of my algorithm is $3$ times the efficiency of the state-of-the-art method. This research has significant implications for machine learning because its ability to improve efficiency while balancing privacy makes it a practical solution! It would enable federated learning to be used very efficiently and deployed in various resource-constrained environments, as this research provides a novel solution to one of the key challenges in federated learning: the inefficiency of homomorphic encryption, as my new algorithm is able to enhance the scalability and resource efficiency of FL while maintaining robust privacy guarantees.

Paperid: 3724, https://arxiv.org/pdf/2504.02068.pdf

Abstract:
Pairing-based inner product functional encryption provides an efficient theoretical construction for privacy-preserving edge computing secured by widely deployed elliptic curve cryptography. In this work, an efficient software implementation framework for pairing-based function-hiding inner product encryption (FHIPE) is presented using the recently proposed and widely adopted BLS12-381 pairing-friendly elliptic curve. Algorithmic optimizations provide $\approx 2.6 \times$ and $\approx 3.4 \times$ speedup in FHIPE encryption and decryption respectively, and extensive performance analysis is presented using a Raspberry Pi 4B edge device. The proposed optimizations enable this implementation framework to achieve performance and ciphertext size comparable to previous work despite being implemented on an edge device with a slower processor and supporting a curve at much higher security level with a larger prime field. Practical privacy-preserving edge computing applications such as encrypted biomedical sensor data classification and secure wireless fingerprint-based indoor localization are also demonstrated using the proposed implementation framework.

Paperid: 3725, https://arxiv.org/pdf/2503.23268.pdf

Abstract:
Security threats like prompt injection attacks pose significant risks to applications that integrate Large Language Models (LLMs), potentially leading to unauthorized actions such as API misuse. Unlike previous approaches that aim to detect these attacks on a best-effort basis, this paper introduces a novel method that appends an Encrypted Prompt to each user prompt, embedding current permissions. These permissions are verified before executing any actions (such as API calls) generated by the LLM. If the permissions are insufficient, the LLM's actions will not be executed, ensuring safety. This approach guarantees that only actions within the scope of the current permissions from the LLM can proceed. In scenarios where adversarial prompts are introduced to mislead the LLM, this method ensures that any unauthorized actions from LLM wouldn't be executed by verifying permissions in Encrypted Prompt. Thus, threats like prompt injection attacks that trigger LLM to generate harmful actions can be effectively mitigated.

Paperid: 3727, https://arxiv.org/pdf/2503.22765.pdf

Abstract:
In recent years, non-control-data attacks have be come a research hotspot in the field of network security, driven by the increasing number of defense methods against control-flow hijacking attacks. These attacks exploit memory vulnerabilities to modify non-control data within a program, thereby altering its behavior without compromising control-flow integrity. Research has shown that non-control-data attacks can be just as damaging as control-flow hijacking attacks and are even Turing complete, making them a serious security threat. However, despite being discovered long ago, the threat of non-control-data attacks has not been adequately addressed. In this review, we first classify non-control-data attacks into two categories based on their evolution: security-sensitive function attacks and data-oriented programming (DOP) attacks. Subsequently, based on the non control-data attack model, we categorize existing defense methods into three main strategies: memory safety, data confidentiality, and data integrity protection. We then analyze recent defense techniques specifically designed for DOP attacks. Finally, we identify the key challenges hindering the widespread adoption of defenses against non-control-data attacks and explore future research directions in this field.

Paperid: 3728, https://arxiv.org/pdf/2503.22755.pdf

Abstract:
Cybersecurity demands rigorous and scalable techniques to ensure system correctness, robustness, and resilience against evolving threats. Automated reasoning, encompassing formal logic, theorem proving, model checking, and symbolic analysis, provides a foundational framework for verifying security properties across diverse domains such as access control, protocol design, vulnerability detection, and adversarial modeling. This survey presents a comprehensive overview of the role of automated reasoning in cybersecurity, analyzing how logical systems, including temporal, deontic, and epistemic logics are employed to formalize and verify security guarantees. We examine SOTA tools and frameworks, explore integrations with AI for neural-symbolic reasoning, and highlight critical research gaps, particularly in scalability, compositionality, and multi-layered security modeling. The paper concludes with a set of well-grounded future research directions, aiming to foster the development of secure systems through formal, automated, and explainable reasoning techniques.

Paperid: 3729, https://arxiv.org/pdf/2503.22709.pdf

Abstract:
Blockchain technology is rapidly evolving, with scalability remaining one of its most significant challenges. While various solutions have been proposed and continue to be developed, it is essential to consider the blockchain trilemma -- balancing scalability, security, and decentralization -- when designing new approaches. One promising solution is the zero-knowledge proof (ZKP)-based rollup, implemented on top of Ethereum. However, the performance of these systems is often limited by the efficiency of the ZKP mechanism. This paper explores the performance of ZKP-based rollups, focusing on a solution built using the Hardhat Ethereum development environment. Through detailed analysis, the paper identifies and examines key bottlenecks within the ZKP system, providing insight into potential areas for optimization to enhance scalability and overall system performance.

Paperid: 3730, https://arxiv.org/pdf/2503.22686.pdf

Abstract:
Cyber information influence, or disinformation in general terms, is widely regarded as one of the biggest threats to social progress and government stability. From US presidential elections to European Union referendums and down to regional news reporting of wildfires, lies and post-truths have normalized radical decision-making. Accordingly, there has been an explosion in research seeking to detect disinformation in online media. The frontier of disinformation detection research is leveraging a variety of ML techniques such as traditional ML algorithms like Support Vector Machines, Random Forest, and NaÃ¯ve Bayes. Other research has applied deep learning models including Convolutional Neural Networks, Long Short-Term Memory networks, and transformer-based architectures. Despite the overall success of such techniques, the literature demonstrates inconsistencies when viewed holistically which limits our understanding of the true effectiveness. Accordingly, this work employed a two-stage meta-analysis to (a) demonstrate an overall meta statistic for ML model effectiveness in detecting disinformation and (b) investigate the same by subgroups of ML model types. The study found the majority of the 81 ML detection techniques sampled have greater than an 80\% accuracy with a Mean sample effectiveness of 79.18\% accuracy. Meanwhile, subgroups demonstrated no statistically significant difference between-approaches but revealed high within-group variance. Based on the results, this work recommends future work in replication and development of detection methods operating at the ML model level.

Paperid: 3731, https://arxiv.org/pdf/2503.22684.pdf

Abstract:
Maintaining security in IoT systems depends on intrusion detection since these networks' sensitivity to cyber-attacks is growing. Based on the IoT23 dataset, this study explores the use of several Machine Learning (ML) and Deep Learning (DL) along with the hybrid models for binary and multi-class intrusion detection. The standalone machine and deep learning models like Random Forest (RF), Extreme Gradient Boosting (XGBoost), Artificial Neural Network (ANN), K-Nearest Neighbors (KNN), Support Vector Machine (SVM), and Convolutional Neural Network (CNN) were used. Furthermore, two hybrid models were created by combining machine learning techniques: RF, XGBoost, AdaBoost, KNN, and SVM and these hybrid models were voting based hybrid classifier. Where one is for binary, and the other one is for multi-class classification. These models vi were tested using precision, recall, accuracy, and F1-score criteria and compared the performance of each model. This work thoroughly explains how hybrid, standalone ML and DL techniques could improve IDS (Intrusion Detection System) in terms of accuracy and scalability in IoT (Internet of Things).

Paperid: 3732, https://arxiv.org/pdf/2503.22683.pdf

Abstract:
These days, a tremendous quantity of digital visual data is sent over many networks and stored in many different formats. This visual information is usually very confidential and financially rewarding. Maintaining safe transmission of data is crucial, as is the use of approaches to offer security features like privacy, integrity, or authentication that are tailored to certain types of data. Protecting sensitive medical images stored in electronic health records is the focus of this article, which proposes a technique of encryption and decryption. In order to safe-guard image-based programs, encryption methods are applied. Privacy, integrity, and authenticity are only few of the security elements investigated by the proposed system, which encrypts medical pictures using chaos maps. In all stages of the protocol, the suggested chaos-based data scrambling method is employed to mitigate the short-comings of traditional confusion and diffusion designs. Bifurcation charts, Lyapunov exponents, tests for mean squared error and peak-to-average signal-to-noise ratio, and histogram analysis are only some of the tools we use to investigate the suggested system's chaotic behavior.

Paperid: 3733, https://arxiv.org/pdf/2503.22406.pdf

Abstract:
Typosquatting is a long-standing cyber threat that exploits human error in typing URLs to deceive users, distribute malware, and conduct phishing attacks. With the proliferation of domain names and new Top-Level Domains (TLDs), typosquatting techniques have grown more sophisticated, posing significant risks to individuals, businesses, and national cybersecurity infrastructure. Traditional detection methods primarily focus on well-known impersonation patterns, leaving gaps in identifying more complex attacks. This study introduces a novel approach leveraging large language models (LLMs) to enhance typosquatting detection. By training an LLM on character-level transformations and pattern-based heuristics rather than domain-specific data, a more adaptable and resilient detection mechanism develops. Experimental results indicate that the Phi-4 14B model outperformed other tested models when properly fine tuned achieving a 98% accuracy rate with only a few thousand training samples. This research highlights the potential of LLMs in cybersecurity applications, specifically in mitigating domain-based deception tactics, and provides insights into optimizing machine learning strategies for threat detection.

Paperid: 3734, https://arxiv.org/pdf/2503.22161.pdf

Abstract:
Traffic analysis using machine learning and deep learning models has made significant progress over the past decades. These models address various tasks in network security and privacy, including detection of anomalies and attacks, countering censorship, etc. They also reveal privacy risks to users as demonstrated by the research on LLM token inference as well as fingerprinting (and counter-fingerprinting) of user-visiting websites, IoT devices, and different applications. However, challenges remain in securing our networks from threats and attacks. After briefly reviewing the tasks and recent ML models in network security and privacy, we discuss the challenges that lie ahead.

Paperid: 3735, https://arxiv.org/pdf/2503.22016.pdf

Abstract:
We show how to construct simulation secure one-time memories, and thus one-time programs, without computational assumptions in the presence of constraints on quantum hardware. Specifically, we build one-time memories from random linear codes and quantum random access codes (QRACs) when constrained to non-adaptive, constant depth, and $D$-dimensional geometrically-local quantum circuit for some constant $D$. We place no restrictions on the adversary's classical computational power, number of qubits it can use, or the coherence time of its qubits. Notably, our construction can still be secure even in the presence of fault tolerant quantum computation as long as the input qubits are encoded in a non-fault tolerant manner (e.g. encoded as high energy states in non-ideal hardware). Unfortunately though, our construction requires decoding random linear codes and thus does not run in polynomial time. We leave open the question of whether one can construct a polynomial time information theoretically secure one-time memory from geometrically local quantum circuits. Of potentially independent interest, we develop a progress bound for information leakage via collision entropy (Renyi entropy of order $2$) along with a few key technical lemmas for a "mutual information" for collision entropies. We also develop new bounds on how much information a specific $2 \mapsto 1$ QRAC can leak about its input, which may be of independent interest as well.

Paperid: 3736, https://arxiv.org/pdf/2503.21145.pdf

Abstract:
The most important security benefit of software memory safety is easy to state: for C and C++ software, attackers can exploit most bugs and vulnerabilities to gain full, unfettered control of software behavior, whereas this is not true for most bugs in memory-safe software. Fortunately, this security benefit -- most bugs don't give attackers full control -- can be had for unmodified C/C++ software, without per-application effort. This doesn't require trying to establish memory safety; instead, it is sufficient to eliminate most of the combinatorial ways in which software with corrupted memory can execute. To eliminate these interleavings, there already exist practical compiler and runtime mechanisms that incur little overhead and need no special hardware or platform support. Each of the mechanisms described here is already in production use, at scale, on one or more platforms. By supporting their combined use in development toolchains, the security of all C and C++ software against remote code execution attacks can be rapidly, and dramatically, improved.

Paperid: 3737, https://arxiv.org/pdf/2503.20831.pdf

Abstract:
The rapid increase in cybersecurity vulnerabilities necessitates automated tools for analyzing and classifying vulnerability reports. This paper presents a novel Vulnerability Report Classifier that leverages the BERT (Bidirectional Encoder Representations from Transformers) model to perform multi-label classification of Common Vulnerabilities and Exposures (CVE) reports from the National Vulnerability Database (NVD). The classifier predicts both the severity (Low, Medium, High, Critical) and vulnerability types (e.g., Buffer Overflow, XSS) from textual descriptions. We introduce a custom training pipeline using a combined loss function-Cross-Entropy for severity and Binary Cross-Entropy with Logits for types-integrated into a Hugging Face Trainer subclass. Experiments on recent NVD data demonstrate promising results, with decreasing evaluation loss across epochs. The system is deployed via a REST API and a Streamlit UI, enabling real-time vulnerability analysis. This work contributes a scalable, open-source solution for cybersecurity practitioners to automate vulnerability triage.

Paperid: 3738, https://arxiv.org/pdf/2503.20461.pdf

Abstract:
Blockchain technology has emerged as a transformative paradigm for decentralized and secure data management across diverse application domains, including healthcare, supply chain management, and the Internet of Things. Its core features, such as decentralization, immutability, and auditability, achieved through distributed consensus algorithms and cryptographic techniques, offer significant advantages for multi-stakeholder applications requiring transparency and trust. However, the inherent complexity and security-critical nature of blockchain systems necessitate rigorous analysis and verification to ensure their correctness, reliability, and resilience against potential vulnerabilities.

Paperid: 3739, https://arxiv.org/pdf/2503.20365.pdf

Abstract:
The cyber-physical system of electricity power networks utilizes supervisory control and data acquisition systems (SCADA), which are inherently vulnerable to cyber threats if usually connected with the internet technology (IT). Power system operations are conducted through communication systems that are mapped to standards, protocols, ports, and addresses. Real-time situational awareness is a standard term with implications and applications in both power systems and cybersecurity. In the plausible quantum world (Q-world), conventional approaches will likely face new challenges. The unique art of transmitting a quantum state from one place, Alice, to another, Bob, is known as quantum communication. Quantum communication for SCADA communication in a plausible quantum era thus obviously entails wired communication through optical fiber networks complying with the typical cybersecurity criteria of confidentiality, integrity, and availability for classical internet technology unless a quantum internet (qinternet) transpires practically. When combined with the reverse order of AIC for operational technology, the cybersecurity criteria for power networks' critical infrastructure drill down to more specific sub-areas. Unlike other communication modes, such as information technology (IT) in broadband internet connections, SCADA for power networks, one of the critical infrastructures, is intricately intertwined with operations technology (OT), which significantly increases complexity. Though it is desirable to have a barrier called a demilitarized zone (DMZ), some overlap is inevitable. This paper highlights the opportunities and challenges in securing SCADA communication in the plausible quantum computing and communication regime, along with a corresponding integrated Qiskit implementation for possible future framework development.

Paperid: 3740, https://arxiv.org/pdf/2503.19370.pdf

Abstract:
In order to understand the overall picture of cyber attacks and to identify the source of cyber attacks, a method to identify malicious activities by automatically creating a graph that ties together the dependencies of a series of related events by tracking Data Provenance has been developed. However, the problem of dependency explosion, in which a large number of normal computer system operations such as operations by authorized users are included in the dependencies, results in a huge generated graph, making it difficult to identify malicious activities. In this paper, we propose a method to reduce the search space for malicious activities by extracting and removing frequently occurring benign activities through natural language processing of log data and analysis of activities in the computer system using similarity judgments. In the evaluation experiment, we used the DARPA TC Dateset, a large-scale public dataset, to evaluate the effectiveness of the proposed method on the dependency explosion problem. In addition, we showed that about 6.8 to 39% of the activities in a computer system could be defined as patterns of benign activities. In addition, we showed that removing benign activities extracted from a portion of the log data (approximately 1.4% to 3.2% in size) can significantly reduce the search space (up to approximately 52%) in large data sets.

Paperid: 3741, https://arxiv.org/pdf/2503.18255.pdf

Abstract:
The modern enterprise is facing an unprecedented surge in digital identities, with machine identities now significantly outnumbering human identities. This paper examines the cybersecurity risks emerging from what we define as the "human-machine identity blur" - the point at which human and machine identities intersect, delegate authority, and create new attack surfaces. Drawing from industry data, expert insights, and real-world incident analysis, we identify key governance gaps in current identity management models that treat human and machine entities as separate domains. To address these challenges, we propose a Unified Identity Governance Framework based on four core principles: treating identity as a continuum rather than a binary distinction, applying consistent risk evaluation across all identity types, implementing continuous verification guided by zero trust principles, and maintaining governance throughout the entire identity lifecycle. Our research shows that organizations adopting this unified approach experience a 47 percent reduction in identity-related security incidents and a 62 percent improvement in incident response time. We conclude by offering a practical implementation roadmap and outlining future research directions as AI-driven systems become increasingly autonomous.

Paperid: 3742, https://arxiv.org/pdf/2503.17844.pdf

Abstract:
We study the problem of approximating Hamming distance in sublinear time under property-preserving hashing (PPH), where only hashed representations of inputs are available. Building on the threshold evaluation framework of Fleischhacker, Larsen, and Simkin (EUROCRYPT 2022), we present a sequence of constructions with progressively improved complexity: a baseline binary search algorithm, a refined variant with constant repetition per query, and a novel hash design that enables constant-time approximation without oracle access. Our results demonstrate that approximate distance recovery is possible under strong cryptographic guarantees, bridging efficiency and security in similarity estimation.

Paperid: 3743, https://arxiv.org/pdf/2503.17813.pdf

Abstract:
Current frameworks for evaluating security bug severity, such as the Common Vulnerability Scoring System (CVSS), prioritize the ratio of exploitability to impact. This paper suggests that the above approach measures the "known knowns" but inadequately addresses the "known unknowns" especially when there exist multiple possible exploit paths and side effects, which introduce significant uncertainty. This paper introduces the concept of connectedness, which measures how strongly a security bug is connected with different entities, thereby reflecting the uncertainty of impact and the exploit potential. This work highlights the critical but underappreciated role connectedness plays in severity assessments.

Paperid: 3744, https://arxiv.org/pdf/2503.16950.pdf

Abstract:
Stack-based memory corruption vulnerabilities have long been exploited by attackers to execute arbitrary code or perform unauthorized memory operations. Various defense mechanisms have been introduced to mitigate stack memory errors, but they typically focus on specific attack types, incur substantial performance overhead, or suffer from compatibility limitations.In this paper, we present CleanStack, an efficient, highly compatible, and comprehensive stack protection mech anism. CleanStack isolates stack objects influenced by external input from other safe stack objects, thereby preventing attackers from modifying return addresses via controlled stack objects. Additionally, by randomizing the placement of tainted stack objects within the Unclean Stack, CleanStack mitigates non control data attacks by preventing attackers from predicting the stack layout.A key component of CleanStack is the identifica tion of tainted stack objects. We analyze both static program analysis and heuristic methods for this purpose. To maximize compatibility, we adopt a heuristic approach and implement CleanStack within the LLVM compiler framework, applying it to SPEC CPU2017 benchmarks and a real-world application.Our security evaluation demonstrates that CleanStack significantly reduces the exploitability of stack-based memory errors by providing a dual-stack system with isolation and randomization. Performance evaluation results indicate that CleanStack incurs an execution overhead of only 1.73% on the SPEC CPU2017 benchmark while introducing a minimal memory overhead of just 0.04%. Compared to existing stack protection techniques, CleanStack achieves an optimal balance between protection coverage, runtime overhead, and compatibility, making it one of the most comprehensive and efficient stack security solutions to date.

Paperid: 3745, https://arxiv.org/pdf/2503.15968.pdf

Abstract:
In the rapidly evolving landscape of digital assets and blockchain technologies, the necessity for robust, scalable, and secure data management platforms has never been more critical. This paper introduces a novel software architecture designed to meet these demands by leveraging the inherent strengths of cloud-native technologies and modular micro-service based architectures, to facilitate efficient data management, storage and access, across different stakeholders. We detail the architectural design, including its components and interactions, and discuss how it addresses common challenges in managing blockchain data and digital assets, such as scalability, data siloing, and security vulnerabilities. We demonstrate the capabilities of the platform by employing it into multiple real-life scenarios, namely providing data in near real-time to scientists in help with their research. Our results indicate that the proposed architecture not only enhances the efficiency and scalability of distributed data management but also opens new avenues for innovation in the research reproducibility area. This work lays the groundwork for future research and development in machine learning operations systems, offering a scalable and secure framework for the burgeoning digital economy.

Paperid: 3746, https://arxiv.org/pdf/2503.15508.pdf

Abstract:
The rapid advancement of Artificial Intelligence (AI) technologies, including the potential emergence of Artificial General Intelligence (AGI) and Artificial Superintelligence (ASI), has raised concerns about AI surpassing human cognitive capabilities. To address this challenge, intelligence augmentation approaches, such as Brain Machine Interfaces (BMI) and Brain Organoid (BO) integration have been proposed. In this study, we compare three intelligence augmentation strategies, namely BMI, BO, and a hybrid approach combining both. These strategies are evaluated from three key perspectives that influence user decisions in selecting an augmentation method: information processing capacity, identity risk, and consent authenticity risk. First, we model these strategies and assess them across the three perspectives. The results reveal that while BO poses identity risks and BMI has limitations in consent authenticity capacity, the hybrid approach mitigates these weaknesses by striking a balance between the two. Second, we investigate how users might choose among these intelligence augmentation strategies in the context of evolving AI capabilities over time. As the result, we find that BMI augmentation alone is insufficient to compete with advanced AI, and while BO augmentation offers scalability, BO increases identity risks as the scale grows. Moreover, the hybrid approach provides a balanced solution by adapting to AI advancements. This study provides a novel framework for human capability augmentation in the era of advancing AI and serves as a guideline for adapting to AI development.

Paperid: 3747, https://arxiv.org/pdf/2503.15428.pdf

Abstract:
Following work of Mazur-Tate and Satoh, we extend the definition of division polynomials to arbitrary isogenies of elliptic curves, including those whose kernels do not sum to the identity. In analogy to the classical case of division polynomials for multiplication-by-n, we demonstrate recurrence relations, identities relating to classical elliptic functions, the chain rule describing relationships between division polynomials on source and target curve, and generalizations to higher dimension (i.e., elliptic nets).

Paperid: 3748, https://arxiv.org/pdf/2503.14682.pdf

Abstract:
Information-theoretically secure Symmetric Private Information Retrieval (SPIR) is known to be infeasible over noiseless channels with a single server. Known solutions to overcome this infeasibility involve additional resources such as database replication, shared randomness, or noisy channels. In this paper, we propose an alternative approach for achieving SPIR with information-theoretic security guarantees, without relying on shared randomness, noisy channels, or data replication. Specifically, we demonstrate that it is sufficient to use a noiseless binary adder multiple-access channel, where inputs are controlled by two non-colluding servers and the output is observed by the client, alongside a public noiseless communication channel between the client and the servers. Furthermore, in this setting, we characterize the optimal file rates, i.e., the file lengths normalized by the number of channel uses, that can be transferred.

Paperid: 3749, https://arxiv.org/pdf/2503.14388.pdf

Abstract:
This paper presents a study that analyzed state-of-the-art vulnerability scanning tools applied to containers. We have focused the work on tools following the Vulnerability Exploitability eXchange (VEX) format, which has been introduced to complement Software Bills of Material (SBOM) with security advisories of known vulnerabilities. Being able to get an accurate understanding of vulnerabilities found in the dependencies of third-party software is critical for secure software development and risk analysis. Accepting the overwhelming challenge of estimating the precise accuracy and precision of a vulnerability scanner, we have in this study instead set out to explore how consistently different tools perform. By doing this, we aim to assess the maturity of the VEX tool field as a whole (rather than any particular tool). We have used the Jaccard and Tversky indices to produce similarity scores of tool performance for several different datasets created from container images. Overall, our results show a low level of consistency among the tools, thus indicating a low level of maturity in VEX tool space. We have performed a number of experiments to find and explanation to our results, but largely they are inconclusive and further research is needed to understand the underlying causalities of our findings.

Paperid: 3750, https://arxiv.org/pdf/2503.13457.pdf

Abstract:
The development of the Willow quantum chip by Google has sparked significant interest in quantum computing, ushering in a new wave of advancements in the field. As quantum computing technology continues to mature, secure quantum communication has garnered increasing attention. To establish secure communication, several quantum key distribution (QKD) protocols have been proposed, such as the BB84 protocol, which leverages the principles of quantum superposition and other quantum properties to ensure secure transmission. However, existing QKD protocols may face vulnerabilities under certain conditions. This study proposes two types of man-in-the-middle (MITM) attack techniques and demonstrates their potential to compromise quantum cryptography through practical case studies. Furthermore, this study proposes strategies to counteract these MITM attacks and proposes methods to enhance the security of quantum cryptographic systems. The findings offer valuable insights for the future implementation and deployment of secure quantum communication systems.

Paperid: 3751, https://arxiv.org/pdf/2503.10944.pdf

Abstract:
Phishing is a persistent cybersecurity threat in today's digital landscape. This paper introduces Phishsense-1B, a refined version of the Llama-Guard-3-1B model, specifically tailored for phishing detection and reasoning. This adaptation utilizes Low-Rank Adaptation (LoRA) and the GuardReasoner finetuning methodology. We outline our LoRA-based fine-tuning process, describe the balanced dataset comprising phishing and benign emails, and highlight significant performance improvements over the original model. Our findings indicate that Phishsense-1B achieves an impressive 97.5% accuracy on a custom dataset and maintains strong performance with 70% accuracy on a challenging real-world dataset. This performance notably surpasses both unadapted models and BERT-based detectors. Additionally, we examine current state-of-the-art detection methods, compare prompt-engineering with fine-tuning strategies, and explore potential deployment scenarios.

Paperid: 3752, https://arxiv.org/pdf/2503.09727.pdf

Abstract:
Knowledge graph reasoning (KGR), which answers complex, logical queries over large knowledge graphs (KGs), represents an important artificial intelligence task with a range of applications. Many KGs require extensive domain expertise and engineering effort to build and are hence considered proprietary within organizations and enterprises. Yet, spurred by their commercial and research potential, there is a growing trend to make KGR systems, (partially) built upon private KGs, publicly available through reasoning APIs. The inherent tension between maintaining the confidentiality of KGs while ensuring the accessibility to KGR systems motivates our study of KG extraction attacks: the adversary aims to "steal" the private segments of the backend KG, leveraging solely black-box access to the KGR API. Specifically, we present KGX, an attack that extracts confidential sub-KGs with high fidelity under limited query budgets. At a high level, KGX progressively and adaptively queries the KGR API and integrates the query responses to reconstruct the private sub-KG. This extraction remains viable even if any query responses related to the private sub-KG are filtered. We validate the efficacy of KGX against both experimental and real-world KGR APIs. Interestingly, we find that typical countermeasures (e.g., injecting noise into query responses) are often ineffective against KGX. Our findings suggest the need for a more principled approach to developing and deploying KGR systems, as well as devising new defenses against KG extraction attacks.

Paperid: 3753, https://arxiv.org/pdf/2503.09165.pdf

Abstract:
The integration of blockchain technology with data analytics is essential for extracting insights in the cryptocurrency space. Although academic literature on blockchain data analytics is limited, various industry solutions have emerged to address these needs. This paper provides a comprehensive literature review, drawing from both academic research and industry applications. We classify blockchain analytics tools into categories such as block explorers, on-chain data providers, research platforms, and crypto market data providers. Additionally, we discuss the challenges associated with blockchain data analytics, including data accessibility, scalability, accuracy, and interoperability. Our findings emphasize the importance of bridging academic research and industry innovations to advance blockchain data analytics.

Paperid: 3754, https://arxiv.org/pdf/2503.08990.pdf

Abstract:
Large language models (LLMs) have shown great promise as language understanding and decision making tools, and they have permeated various aspects of our everyday life. However, their widespread availability also comes with novel risks, such as generating harmful, unethical, or offensive content, via an attack called jailbreaking. Despite extensive efforts from LLM developers to align LLMs using human feedback, they are still susceptible to jailbreak attacks. To tackle this issue, researchers often employ red-teaming to understand and investigate jailbreak prompts. However, existing red-teaming approaches lack effectiveness, scalability, or both. To address these issues, we propose JBFuzz, a novel effective, automated, and scalable red-teaming technique for jailbreaking LLMs. JBFuzz is inspired by the success of fuzzing for detecting bugs/vulnerabilities in software. We overcome three challenges related to effectiveness and scalability by devising novel seed prompts, a lightweight mutation engine, and a lightweight and accurate evaluator for guiding the fuzzer. Assimilating all three solutions results in a potent fuzzer that only requires black-box access to the target LLM. We perform extensive experimental evaluation of JBFuzz using nine popular and widely-used LLMs. We find that JBFuzz successfully jailbreaks all LLMs for various harmful/unethical questions, with an average attack success rate of 99%. We also find that JBFuzz is extremely efficient as it jailbreaks a given LLM for a given question in 60 seconds on average. Our work highlights the susceptibility of the state-of-the-art LLMs to jailbreak attacks even after safety alignment, and serves as a valuable red-teaming tool for LLM developers.

Paperid: 3755, https://arxiv.org/pdf/2503.07790.pdf

Abstract:
This paper provides an explanation of NTRU, a post quantum encryption scheme, while also providing a gentle introduction to cryptography. NTRU is a very efficient lattice based cryptosystem that appears to be safe against attacks by quantum computers. NTRU's efficiency suggests that it is a strong candidate as an alternative to RSA, ElGamal, and ECC for the post quantum world. The paper begins with an introduction to cryptography and security proofs for cryptographic schemes before explaining the NTRU cryptosystem and culminating with a proof that the original presentation of NTRU is not IND-CPA secure. We will conclude by mentioning padding schemes to NTRU that are provably IND-CCA2 secure in the random oracle model. The paper is designed to be accessible to anyone with minimal background in abstract algebra and number theory - no previous knowledge of cryptography is assumed. Given the author's lack of familiarity with the subject, this paper aims to be an expository work rather than to provide new insights to the subject matter.

Paperid: 3756, https://arxiv.org/pdf/2503.07250.pdf

Abstract:
Because digital devices and systems are widely used in all aspects of society, the risk of adversaries creating cyberattacks on a similar level remains high. As such, regulation of these aspects must follow, which is the domain of cybersecurity. Because this topic is worldwide, different jurisdictions should take inspiration from successful techniques elsewhere, with the European Union and the US being the most experienced and long-standing. What can be derived from their approaches separately to be used in other democratic jurisdictions, and what happens when we compare them with this pragmatic approach in mind? Cybersecurity is oddly enough quite well understood in most jurisdictions worldwide. However, concept comprehension cannot enforce or create compliance, hence the need for good regulatory approaches. The comparative legal analysis of the EU and the US show that there are large differences in definitions and enforcement, but some concepts are repeated in both jurisdictions. These can be further refined to become derivable principles, which can be used to inspire legislation in any democratic jurisdiction. They are: Voluntary Cooperation, Adaptable Definitions, Strong-arm Authorities, Mandated Computer Emergency Response Teams, and Effective Sanctions. These 5 principles are not exhaustive but combine classic regulatory and practical lessons from these two jurisdictions.

Paperid: 3757, https://arxiv.org/pdf/2503.05136.pdf

Abstract:
Fully Homomorphic Encryption (FHE) is a cryptographic scheme that enables computations to be performed directly on encrypted data, as if the data were in plaintext. After all computations are performed on the encrypted data, it can be decrypted to reveal the result. The decrypted value matches the result that would have been obtained if the same computations were applied to the plaintext data. FHE supports basic operations such as addition and multiplication on encrypted numbers. Using these fundamental operations, more complex computations can be constructed, including subtraction, division, logic gates (e.g., AND, OR, XOR, NAND, MUX), and even advanced mathematical functions such as ReLU, sigmoid, and trigonometric functions (e.g., sin, cos). These functions can be implemented either as exact formulas or as approximations, depending on the trade-off between computational efficiency and accuracy. FHE enables privacy-preserving machine learning by allowing a server to process the client's data in its encrypted form through an ML model. With FHE, the server learns neither the plaintext version of the input features nor the inference results. Only the client, using their secret key, can decrypt and access the results at the end of the service protocol. FHE can also be applied to confidential blockchain services, ensuring that sensitive data in smart contracts remains encrypted and confidential while maintaining the transparency and integrity of the execution process. Other applications of FHE include secure outsourcing of data analytics, encrypted database queries, privacy-preserving searches, efficient multi-party computation for digital signatures, and more. As this book is an open project (https://fhetextbook.github.io), we welcome FHE experts to join us as collaborators to help expand the draft.

Paperid: 3758, https://arxiv.org/pdf/2503.04866.pdf

Abstract:
As the use of facial recognition technology is expanding in different domains, ensuring its responsible use is gaining more importance. This paper conducts a comprehensive literature review of existing studies on facial recognition technology from the perspective of privacy, which is one of the key Responsible AI principles. Cloud providers, such as Microsoft, AWS, and Google, are at the forefront of delivering facial-related technology services, but their approaches to responsible use of these technologies vary significantly. This paper compares how these cloud giants implement the privacy principle into their facial recognition and detection services. By analysing their approaches, it identifies both common practices and notable differences. The results of this research will be valuable for developers and businesses by providing them insights into best practices of three major companies for integration responsible AI, particularly privacy, into their cloud-based facial recognition technologies.

Paperid: 3759, https://arxiv.org/pdf/2503.04555.pdf

Abstract:
This article analyzes a key exchange protocol based on the triad tropical semiring, recently proposed by Jackson, J. and Perumal, R. We demonstrate that the triad tropical semiring is isomorphic to a circulant matrix over tropical numbers. Consequently, matrices in this semiring can be represented as tropical matrices. As a result, we conduct a cryptanalysis of the key exchange protocol using an algorithm introduced by Sulaiman Alhussaini, Craig Collett, and Sergei Sergeev to solve the double discrete logarithm problem over tropical matrices

Paperid: 3760, https://arxiv.org/pdf/2503.04178.pdf

Abstract:
In modern world the importance of cybersecurity of various systems is increasing from year to year. The number of information security events generated by information security tools grows up with the development of the IT infrastructure. At the same time, the cyber threat landscape does not remain constant, and monitoring should take into account both already known attack indicators and those for which there are no signature rules in information security products of various classes yet. Detecting anomalies in large cybersecurity data streams is a complex task that, if properly addressed, can allow for timely response to atypical and previously unknown cyber threats. The possibilities of using of offline algorithms may be limited for a number of reasons related to the time of training and the frequency of retraining. Using stream learning algorithms for solving this task is capable of providing near-real-time data processing. This article examines the results of ten algorithms from three Python stream machine-learning libraries on BETH dataset with cybersecurity events, which contains information about the creation, cloning, and destruction of operating system processes collected using extended eBPF. ROC-AUC metric and total processing time of processing with these algorithms are presented. Several combinations of features and the order of events are considered. In conclusion, some mentions are given about the most promising algorithms and possible directions for further research are outlined.

Paperid: 3761, https://arxiv.org/pdf/2503.03884.pdf

Abstract:
This article presents a novel network protocol that incorporates a quantum photonic channel for symmetric key distribution, a Dilithium signature to replace factor-based public key cryptography for enhanced authentication, security, and privacy. The protocol uses strong hash functions to hash original messages and verify heightened data integrity at the destination. This Quantum Good Authentication Protocol (QGP) provides high-level security provided by the theory of quantum mechanics. QGP also has the advantage of quantum-resistant data protection that prevents current digital computer and future quantum computer attacks. QGP transforms the Transmission Control Protocol/Internet Protocol (TCP/IP) by adding a quantum layer at the bottom of Open Systems Interconnection (OSI) model (layer 0) and modifying the top layer (layer 7) with Dilithium signatures, thus improving the security of the original OSI model. In addition, QGP incorporates strong encryption, hardware-based quantum channels, post-quantum signatures, and secure hash algorithms over a platform of decryptors, switches, routers, and network controllers to form a testbed of the next-generation, secure quantum internet. The experiments presented here show that QGP provides secure authentication and improved security and privacy and can be adopted as a new protocol for the next-generation quantum Internet.

Paperid: 3762, https://arxiv.org/pdf/2503.02018.pdf

Abstract:
China's Great Firewall (GFW) exemplifies one of the most extensive and technologically sophisticated internet censorship frameworks worldwide. Serving as a cornerstone of state-directed digital governance, it integrates a multitude of methods - ranging from DNS manipulation and IP blocking to keyword filtering and active surveillance - to control online information flows. These measures, underpinned by both technical proficiency and administrative oversight, form a formidable obstacle to open communication and data privacy. This paper critically examines the GFW's principal detection techniques, including Deep Packet Inspection (DPI), domain name tampering, and traffic fingerprinting, and analyzes how they align with broader governmental mechanisms. In parallel, we evaluate emerging countermeasures that leverage obfuscation, encryption, and routing innovations to circumvent these restrictions. By situating technical strategies within the broader context of governance and human rights, this work underscores the ongoing and evolving contest between state-imposed internet controls and individual efforts to maintain unrestricted access to digital resources.

Paperid: 3763, https://arxiv.org/pdf/2503.00378.pdf

Abstract:
Federated learning is a distributed machine learning approach where multiple clients collaboratively train a model without sharing their local data, which contributes to preserving privacy. A challenge in federated learning is managing heterogeneous data distributions across clients, which can hinder model convergence and performance due to the need for the global model to generalize well across diverse local datasets. We propose to use local characteristic statistics, by which we mean some statistical properties calculated independently by each client using only their local training dataset. These statistics, such as means, covariances, and higher moments, are used to capture the characteristics of the local data distribution. They are not shared with other clients or a central node. During training, these local statistics help the model learn how to condition on the local data distribution, and during inference, they guide the client's predictions. Our experiments show that this approach allows for efficient handling of heterogeneous data across the federation, has favorable scaling compared to approaches that directly try to identify peer nodes that share distribution characteristics, and maintains privacy as no additional information needs to be communicated.

Paperid: 3764, https://arxiv.org/pdf/2503.00164.pdf

Abstract:
In an era marked by unprecedented digital complexity, the cybersecurity landscape is evolving at a breakneck pace, challenging traditional defense paradigms. Advanced Persistent Threats (APTs) reveal inherent vulnerabilities in conventional security measures and underscore the urgent need for continuous, adaptive, and proactive strategies that seamlessly integrate human insight with cutting edge AI technologies. This manuscript explores how the convergence of agentic AI and Frontier AI is transforming cybersecurity by reimagining frameworks such as the cyber kill chain, enhancing threat intelligence processes, and embedding robust ethical governance within automated response systems. Drawing on real-world data and forward looking perspectives, we examine the roles of real time monitoring, automated incident response, and perpetual learning in forging a resilient, dynamic defense ecosystem. Our vision is to harmonize technological innovation with unwavering ethical oversight, ensuring that future AI driven security solutions uphold core human values of fairness, transparency, and accountability while effectively countering emerging cyber threats.

Paperid: 3765, https://arxiv.org/pdf/2503.00070.pdf

Abstract:
Throughout the history from pre-industry 4.0 to post-industry 4.0, cybersecurity at banks has undergone significant changes. Pre-industry 4.0 cyber security at banks relied on individual security methods that were highly manual and had low accuracy. When moving to post-industry 4.0, cybersecurity at banks had a major turning point with security methods that combined different technologies such as Artificial Intelligence (AI), Blockchain, IoT, automating necessary processes and significantly increasing the defence layer for banks. However, along with the development of new technologies, the current challenge of cybersecurity at banks lies in scalability, high costs and resources in both money and time for R&D of defence methods along with the threat of high-tech cybercriminals growing and expanding. This report goes from introducing the importance of cybersecurity at banks, analyzing their management, operational and business objectives, evaluating pre-industry 4.0 technologies used for cybersecurity at banks to assessing post-industry 4.0 technologies focusing on Artificial Intelligence and Blockchain, discussing current policies and practices and ending with discussing key advantages and challenges for 4.0 technologies and recommendations for further developing cybersecurity at banks.

Paperid: 3766, https://arxiv.org/pdf/2502.19621.pdf

Abstract:
Enterprises today face increasing cybersecurity threats that necessitate robust digital forensics and risk mitigation strategies. This paper explores these challenges through an imaginary case study of an organization, a global identity management and data analytics company handling vast customer data. Given the critical nature of its data assets, EP has established a dedicated digital forensics team to detect threats, manage vulnerabilities, and respond to security incidents. This study outlines an approach to cybersecurity, including proactive threat anticipation, forensic investigations, and compliance with regulations like GDPR and CCPA. Key threats such as social engineering, insider risks, phishing, and ransomware are examined, along with mitigation strategies leveraging AI and machine learning. By detailing security framework, this paper highlights best practices in digital forensics, incident response, and enterprise risk management. The findings emphasize the importance of continuous monitoring, policy enforcement, and adaptive security measures to protect sensitive data and ensure business continuity in an evolving threat landscape

Paperid: 3767, https://arxiv.org/pdf/2502.18430.pdf

Abstract:
Pulsars exhibit signals with precise inter-arrival times that are on the order of milliseconds to seconds, depending on the individual pulsar. There are subtle variations in the timing of pulsar signals. We show that these variations can serve as a natural entropy source for the creation of Random Number Generators (RNGs). We also explore the effects of using randomness extractors to increase the entropy of random bits extracted from Pulsar timing data. To evaluate the quality of the Pulsar RNG, we model its entropy as a $k$-source and use well-known cryptographic results to show its closeness to a theoretically ideal uniformly random source. To remain consistent with prior work, we also show that the Pulsar RNG passes well-known statistical tests such as the NIST test suite.

Paperid: 3768, https://arxiv.org/pdf/2502.17703.pdf

Abstract:
As technology has become more embedded into our society, the security of modern-day systems is paramount. One topic which is constantly under discussion is that of patching, or more specifically, the installation of updates that remediate security vulnerabilities in software or hardware systems. This continued deliberation is motivated by complexities involved with patching; in particular, the various incentives and disincentives for organizations and their cybersecurity teams when deciding whether to patch. In this paper, we take a fresh look at the question of patching and critically explore why organizations and IT/security teams choose to patch or decide against it (either explicitly or due to inaction). We tackle this question by aggregating and synthesizing prominent research and industry literature on the incentives and disincentives for patching, specifically considering the human aspects in the context of these motives. Through this research, this study identifies key motivators such as organizational needs, the IT/security team's relationship with vendors, and legal and regulatory requirements placed on the business and its staff. There are also numerous significant reasons discovered for why the decision is taken not to patch, including limited resources (e.g., person-power), challenges with manual patch management tasks, human error, bad patches, unreliable patch management tools, and the perception that related vulnerabilities would not be exploited. These disincentives, in combination with the motivators above, highlight the difficult balance that organizations and their security teams need to maintain on a daily basis. Finally, we conclude by discussing implications of these findings and important future considerations.

Paperid: 3769, https://arxiv.org/pdf/2502.17698.pdf

Abstract:
Both parasites in biological systems and adversarial forces in cybersecurity are often perceived as threats: disruptive elements that must be eliminated. However, these entities play a critical role in revealing systemic weaknesses, driving adaptation, and ultimately strengthening resilience. This paper draws from environmental epidemiology and cybersecurity to reframe parasites and cyber exploiters as essential stress-testers of complex systems, exposing hidden vulnerabilities and pushing defensive innovations forward. By examining how biological and digital systems evolve in response to persistent threats, we highlight the necessity of adversarial engagement in fortifying security frameworks. The recent breach of the DOGE website serves as a timely case study, illustrating how adversarial forces, whether biological or digital, compel systems to reassess and reinforce their defenses.

Paperid: 3770, https://arxiv.org/pdf/2502.17392.pdf

Abstract:
Deep neural networks (DNNs) have achieved remarkable success in the field of natural language processing (NLP), leading to widely recognized applications such as ChatGPT. However, the vulnerability of these models to adversarial attacks remains a significant concern. Unlike continuous domains like images, text exists in a discrete space, making even minor alterations at the sentence, word, or character level easily perceptible to humans. This inherent discreteness also complicates the use of conventional optimization techniques, as text is non-differentiable. Previous research on adversarial attacks in text has focused on character-level, word-level, sentence-level, and multi-level approaches, all of which suffer from inefficiency or perceptibility issues due to the need for multiple queries or significant semantic shifts. In this work, we introduce a novel adversarial attack method, Emoji-Attack, which leverages the manipulation of emojis to create subtle, yet effective, perturbations. Unlike character- and word-level strategies, Emoji-Attack targets emojis as a distinct layer of attack, resulting in less noticeable changes with minimal disruption to the text. This approach has been largely unexplored in previous research, which typically focuses on emoji insertion as an extension of character-level attacks. Our experiments demonstrate that Emoji-Attack achieves strong attack performance on both large and small models, making it a promising technique for enhancing adversarial robustness in NLP systems.

Paperid: 3771, https://arxiv.org/pdf/2502.16730.pdf

Abstract:
We present RapidPen, a fully automated penetration testing (pentesting) framework that addresses the challenge of achieving an initial foothold (IP-to-Shell) without human intervention. Unlike prior approaches that focus primarily on post-exploitation or require a human-in-the-loop, RapidPen leverages large language models (LLMs) to autonomously discover and exploit vulnerabilities, starting from a single IP address. By integrating advanced ReAct-style task planning (Re) with retrieval-augmented knowledge bases of successful exploits, along with a command-generation and direct execution feedback loop (Act), RapidPen systematically scans services, identifies viable attack vectors, and executes targeted exploits in a fully automated manner. In our evaluation against a vulnerable target from the Hack The Box platform, RapidPen achieved shell access within 200-400 seconds at a per-run cost of approximately \$0.3-\$0.6, demonstrating a 60\% success rate when reusing prior "success-case" data. These results underscore the potential of truly autonomous pentesting for both security novices and seasoned professionals. Organizations without dedicated security teams can leverage RapidPen to quickly identify critical vulnerabilities, while expert pentesters can offload repetitive tasks and focus on complex challenges. Ultimately, our work aims to make penetration testing more accessible and cost-efficient, thereby enhancing the overall security posture of modern software ecosystems.

Paperid: 3772, https://arxiv.org/pdf/2502.16279.pdf

Abstract:
This paper explores the parallels between Thompson's "Reflections on Trusting Trust" and modern challenges in LLM-based code generation. We examine how Thompson's insights about compiler backdoors take on new relevance in the era of large language models, where the mechanisms for potential exploitation are even more opaque and difficult to analyze. Building on this analogy, we discuss how the statistical nature of LLMs creates novel security challenges in code generation pipelines. As a potential direction forward, we propose an ensemble-based validation approach that leverages multiple independent models to detect anomalous code patterns through cross-model consensus. This perspective piece aims to spark discussion about trust and validation in AI-assisted software development.

Paperid: 3773, https://arxiv.org/pdf/2502.14966.pdf

Abstract:
The rapid advancement of artificial intelligence (AI) has significantly expanded the attack surface for AI-driven cybersecurity threats, necessitating adaptive defense strategies. This paper introduces CyberSentinel, a unified, single-agent system for emergent threat detection, designed to identify and mitigate novel security risks in real time. CyberSentinel integrates: (1) Brute-force attack detection through SSH log analysis, (2) Phishing threat assessment using domain blacklists and heuristic URL scoring, and (3) Emergent threat detection via machine learning-based anomaly detection. By continuously adapting to evolving adversarial tactics, CyberSentinel strengthens proactive cybersecurity defense, addressing critical vulnerabilities in AI security.

Paperid: 3774, https://arxiv.org/pdf/2502.14291.pdf

Abstract:
Traditional approaches to vector similarity search over encrypted data rely on fully homomorphic encryption (FHE) to enable computation without decryption. However, the substantial computational overhead of FHE makes it impractical for large-scale real-time applications. This work explores a more efficient alternative: using additively homomorphic encryption (AHE) for privacy-preserving similarity search. We consider scenarios where either the query vector or the database vectors remain encrypted, a setting that frequently arises in applications such as confidential recommender systems and secure federated learning. While AHE only supports addition and scalar multiplication, we show that it is sufficient to compute inner product similarity--one of the most widely used similarity measures in vector retrieval. Compared to FHE-based solutions, our approach significantly reduces computational overhead by avoiding ciphertext-ciphertext multiplications and bootstrapping, while still preserving correctness and privacy. We present an efficient algorithm for encrypted similarity search under AHE and analyze its error growth and security implications. Our method provides a scalable and practical solution for privacy-preserving vector search in real-world machine learning applications.

Paperid: 3775, https://arxiv.org/pdf/2502.14093.pdf

Abstract:
In the domain of practical software protection against man-at-the-end attacks such as software reverse engineering and tampering, much of the scientific literature is plagued by the use of subpar methods to evaluate the protections' strength and even by the absence of such evaluations. Several criteria have been proposed in the past to assess the strength of protections, such as potency, resilience, stealth, and cost. We analyze their evolving definitions and uses. We formulate a number of critiques, from which we conclude that the existing definitions are unsatisfactory and need to be revised. We present a new framework of software protection evaluation criteria: relevance, effectiveness (or efficacy), robustness, concealment, stubbornness, sensitivity, predictability, and cost.

Paperid: 3776, https://arxiv.org/pdf/2502.14087.pdf

Abstract:
We study a setting of collecting and learning from private data distributed across end users. In the shuffled model of differential privacy, the end users partially protect their data locally before sharing it, and their data is also anonymized during its collection to enhance privacy. This model has recently become a prominent alternative to central DP, which requires full trust in a central data curator, and local DP, where fully local data protection takes a steep toll on downstream accuracy. Our main technical result is a shuffled DP protocol for privately estimating the kernel density function of a distributed dataset, with accuracy essentially matching central DP. We use it to privately learn a classifier from the end user data, by learning a private density function per class. Moreover, we show that the density function itself can recover the semantic content of its class, despite having been learned in the absence of any unprotected data. Our experiments show the favorable downstream performance of our approach, and highlight key downstream considerations and trade-offs in a practical ML deployment of shuffled DP.

Paperid: 3777, https://arxiv.org/pdf/2502.13197.pdf

Abstract:
Cayley hash functions are based on a simple idea of using a pair of semigroup elements, A and B, to hash the 0 and 1 bit, respectively, and then to hash an arbitrary bit string in the natural way, by using multiplication of elements in the semigroup. The main advantage of Cayley hash functions compared to, say, hash functions in the SHA family is that when an already hashed document is amended, one does not have to hash the whole amended document all over again, but rather hash just the amended part and then multiply the result by the hash of the original document. In this article, we survey some of the previously proposed Cayley hash functions and single out a very simple hash function whose security has not been compromised up to date.

Paperid: 3778, https://arxiv.org/pdf/2502.12544.pdf

Abstract:
Usage of the fast development of real-life digital applications in modern technology should guarantee novel and efficient way-outs of their protection. Encryption facilitates the data hiding. With the express development of technology, people tend to figure out a method that is capable of hiding a message and the survival of the message. Secrecy and efficiency can be obtained through steganographic involvement, a novel approach, along with multipurpose audio streams. Generally, steganography advantages are not used among industry and learners even though it is extensively discussed in the present information world. Information hiding in audio files is exclusively inspiring due to the compassion of the Human Auditory System (HAS). The proposed resolution supports Advance Encryption Standard (AES)256 key encryption and tolerates all existing audio file types as the container. This paper analyzes and proposes a way out according to the performance based on robustness, security, and hiding capacity. Furthermore, a survey of audio steganography applications, as well as a proposed resolution, is discussed in this paper.

Paperid: 3779, https://arxiv.org/pdf/2502.11737.pdf

Abstract:
The digital age requires strong security measures to protect online activities. Two-Factor Authentication (2FA) has emerged as a critical solution. However, its implementation presents significant challenges, particularly in terms of accessibility for people with disabilities. This paper examines the intricacies of deploying 2FA in a way that is secure and accessible to all users by outlining the concrete challenges for people who are affected by various types of impairments. This research investigates the implications of 2FA on digital inclusivity and proposes solutions to enhance accessibility. An analysis was conducted to examine the implementation and availability of various 2FA methods across popular online platforms. The results reveal a diverse landscape of authentication strategies. While 2FA significantly improves account security, its current adoption is hampered by inconsistencies across platforms and a lack of standardised, accessible options for users with disabilities. Future advancements in 2FA technologies, including but not limited to autofill capabilities and the adoption of Fast IDentity Onlines (FIDO) protocols, offer possible directions for more inclusive authentication mechanisms. However, ongoing research is necessary to address the evolving needs of users with disabilities and to mitigate new security challenges. This paper proposes a collaborative approach among stakeholders to ensure that security improvements do not compromise accessibility. It promotes a digital environment where security and inclusivity mutually reinforce each other.

Paperid: 3780, https://arxiv.org/pdf/2502.10984.pdf

Abstract:
Hiding messages for countless security purposes has become a highly fascinating subject nowadays. Encryption facilitates the data hiding. With the express development of technology, people tend to figure out a method capable of hiding a message and the survival of the message. The present-day study is conducted to hide information in an audio file. Generally, steganography advantages are not used among industry and learners even though it is an extensively discussed area in the present information world. This implementation aims to hide a document such as txt, doc, and pdf file formats in an audio file and retrieve the hidden document when necessary. This system is called DeepAudio v1.0. The system supports AES encryption and tolerates both wave and MP3 files. The sub-aims of this work were the creation of a free, openly available, bug-free software tool with additional features that are new to the area.

Paperid: 3781, https://arxiv.org/pdf/2502.10646.pdf

Abstract:
This paper explores vulnerabilities in the Dynamic Host Configuration Protocol (DHCP) and their implications on the Confidentiality, Integrity, and Availability (CIA) Triad. Through an analysis of various attacks, including DHCP Starvation, Rogue DHCP Servers, Replay Attacks, and TunnelVision exploits, the paper provides a taxonomic classification of threats, assesses risks, and proposes appropriate controls. The discussion also highlights the dangers of VPN decloaking through DHCP exploits and underscores the importance of safeguarding network infrastructures. By bringing awareness to the TunnelVision exploit, this paper aims to mitigate risks associated with these prevalent vulnerabilities.

Paperid: 3782, https://arxiv.org/pdf/2502.10635.pdf

Abstract:
Machine Learning models thrive on vast datasets, continuously adapting to provide accurate predictions and recommendations. However, in an era dominated by privacy concerns, Machine Unlearning emerges as a transformative approach, enabling the selective removal of data from trained models. This paper examines methods such as Naive Retraining and Exact Unlearning via the SISA framework, evaluating their Computational Costs, Consistency, and feasibility using the $\texttt{HSpam14}$ dataset. We explore the potential of integrating unlearning principles into Positive Unlabeled (PU) Learning to address challenges posed by partially labeled datasets. Our findings highlight the promise of unlearning frameworks like $\textit{DaRE}$ for ensuring privacy compliance while maintaining model performance, albeit with significant computational trade-offs. This study underscores the importance of Machine Unlearning in achieving ethical AI and fostering trust in data-driven systems.

Paperid: 3783, https://arxiv.org/pdf/2502.10599.pdf

Abstract:
The rapid expansion of the Internet of Things (IoT) ecosystem has transformed various sectors but has also introduced significant cybersecurity challenges. Traditional centralized security methods often struggle to balance privacy preservation and real-time threat detection in IoT networks. To address these issues, this study proposes a Federated Learning-Driven Cybersecurity Framework designed specifically for IoT environments. The framework enables decentralized data processing by training models locally on edge devices, ensuring data privacy. Secure aggregation of these locally trained models is achieved using homomorphic encryption, allowing collaborative learning without exposing sensitive information. The proposed framework utilizes recurrent neural networks (RNNs) for anomaly detection, optimized for resource-constrained IoT networks. Experimental results demonstrate that the system effectively detects complex cyber threats, including distributed denial-of-service (DDoS) attacks, with over 98% accuracy. Additionally, it improves energy efficiency by reducing resource consumption by 20% compared to centralized approaches. This research addresses critical gaps in IoT cybersecurity by integrating federated learning with advanced threat detection techniques. The framework offers a scalable and privacy-preserving solution adaptable to various IoT applications. Future work will explore the integration of blockchain for transparent model aggregation and quantum-resistant cryptographic methods to further enhance security in evolving technological landscapes.

Paperid: 3784, https://arxiv.org/pdf/2502.09864.pdf

Abstract:
Microarchitectural timing attacks exploit subtle timing variations caused by hardware behaviors to leak sensitive information. In this paper, we introduce MCHammer, a novel side-channel technique that leverages machine clears induced by self-modifying code detection mechanisms. Unlike most traditional techniques, MCHammer does not require memory access or waiting periods, making it highly efficient. We compare MCHammer to the classical Flush+Reload technique, improving in terms of trace granularity, providing a powerful side-channel attack vector. Using MCHammer, we successfully recover keys from a deployed implementation of a cryptographic tool. Our findings highlight the practical implications of MCHammer and its potential impact on real-world systems.

Paperid: 3785, https://arxiv.org/pdf/2502.08832.pdf

Abstract:
The Log Structured Merge (LSM) Tree is a popular choice for key-value stores that focus on optimized write throughput while maintaining performant, production-ready read latencies. To optimize read performance, LSM stores rely on a probabilistic data structure called the Bloom Filter (BF). In this paper, we focus on adversarial workloads that lead to a sharp degradation in read performance by impacting the accuracy of BFs used within the LSM store. Our evaluation shows up to $800\%$ increase in the read latency of lookups for popular LSM stores. We define adversarial models and security definitions for LSM stores. We implement adversary resilience into two popular LSM stores, LevelDB and RocksDB. We use our implementations to demonstrate how performance degradation under adversarial workloads can be mitigated.

Paperid: 3786, https://arxiv.org/pdf/2502.08401.pdf

Abstract:
This paper presents a technique for generating sound by leveraging the electrical properties of liquid crystal displays (LCDs). The phenomenon occurs due to vibrational noise produced by capacitors within the LCD panel during rapid pixel state transitions. By modulating these transitions through specially crafted bitmap patterns projected onto the screen, we demonstrate how weak yet audible acoustic signals can be generated directly from the display. We designed, implemented, evaluated, and tested a system that repurposes the LCD as a sound-emitting device. Potential applications for this technique include low-power auditory feedback systems, short-range device communication, air-gap covert channels, secure auditory signaling, and innovative approaches to human-computer interaction.

Paperid: 3788, https://arxiv.org/pdf/2502.07284.pdf

Abstract:
Lattice-based cryptography is a foundation for post-quantum security, with the Learning with Errors (LWE) problem as a core component in key exchange, encryption, and homomorphic computation. Structured variants like Ring-LWE (RLWE) and Module-LWE (MLWE) improve efficiency using polynomial rings but remain constrained by traditional polynomial multiplication rules, limiting their ability to handle structured vectorized data. This work introduces Variety-LWE (VLWE), a new structured lattice problem based on algebraic geometry. Unlike RLWE and MLWE, which use polynomial quotient rings with standard multiplication, VLWE operates over multivariate polynomial rings defined by algebraic varieties. A key difference is that these polynomials lack mixed variables, and multiplication is coordinate-wise rather than following standard polynomial multiplication. This enables direct encoding and homomorphic processing of high-dimensional data while preserving worst-case to average-case hardness reductions. We prove VLWE's security by reducing it to multiple independent Ideal-SVP instances, demonstrating resilience against classical and quantum attacks. Additionally, we analyze hybrid algebraic-lattice attacks, showing that existing Grobner basis and lattice reduction methods do not directly threaten VLWE. We further construct a vector homomorphic encryption scheme based on VLWE, supporting structured computations while controlling noise growth. This scheme offers advantages in privacy-preserving machine learning, encrypted search, and secure computations over structured data. VLWE emerges as a novel and independent paradigm in lattice-based cryptography, leveraging algebraic geometry to enable new cryptographic capabilities beyond traditional polynomial quotient rings.

Paperid: 3789, https://arxiv.org/pdf/2502.06385.pdf

Abstract:
This paper takes a critical look at the recommendations OSCE/ODIHR has given for the Estonian Internet voting over the 20 years it has been running. We present examples of recommendations that can not be fulfilled at all, but also examples where fulfilling a recommendation requires a non-trivial trade-off, potentially weakening the system in some other respect. In such cases OSCE/ODIHR should take an explicit position which trade-off it recommends. We also look at the development of the recommendation to introduce end-to-end verifiability. In this case we expect OSCE/ODIHR to define what it exactly means by this property, as well as to give explicit criteria to determine whether and to which extent end-to-end verifiability has been achieved.

Paperid: 3790, https://arxiv.org/pdf/2502.06247.pdf

Abstract:
In stabilizer-based quantum secret sharing schemes, it is known that some shares can be distributed to participants before a secret is given to the dealer. This distribution is known as advance sharing. It is already known that a set of shares is advance shareable only if it is a forbidden set. However, it was not known whether any forbidden set is advance shareable. We provide an example of a set of shares such that it is a forbidden set but is not advance shareable in the previous scheme. Furthermore, we propose a quantum secret sharing scheme for quantum secrets such that any forbidden set is advance shareable.

Paperid: 3791, https://arxiv.org/pdf/2502.06000.pdf

Abstract:
In chess, zugzwang describes a scenario where any move worsens the player's position. Organizations face a similar dilemma right now at the intersection of artificial intelligence (AI) and cybersecurity. AI adoption creates an inevitable paradox: delaying it poses strategic risks, rushing it introduces poorly understood vulnerabilities, and even incremental adoption leads to cascading complexities. In this work we formalize this challenge as the AI Security Zugzwang, a phenomenon where security leaders must make decisions under conditions of inevitable risk. Grounded in game theory, security economics, and organizational decision theory, we characterize AI security zugzwang through three key properties, the forced movement, predictable vulnerability creation, and temporal pressure. Additionally, we develop a taxonomy to categorize forced-move scenarios across AI adoption, implementation, operational and governance contexts and provide corresponding strategic mitigations. Our framework is supported by a practical decision flowchart, demonstrated through a real-world example of Copilot adoption, thus, showing how security lead

Paperid: 3792, https://arxiv.org/pdf/2502.05637.pdf

Abstract:
Adversarial Machine Learning (AML) addresses vulnerabilities in AI systems where adversaries manipulate inputs or training data to degrade performance. This article provides a comprehensive analysis of evasion and poisoning attacks, formalizes defense mechanisms with mathematical rigor, and discusses the challenges of implementing robust solutions in adaptive threat models. Additionally, it highlights open challenges in certified robustness, scalability, and real-world deployment.

Paperid: 3793, https://arxiv.org/pdf/2502.05205.pdf

Abstract:
This research provides a comprehensive analysis of data breach trends across industries, regions, attack methods and data sensitivity from 2004 to 2024 using Microsoft Power BI. The research seeks to benefit from data visualisation techniques to help policymakers and organisational leaders make informed decisions based on data insights. Understanding the correlation between different data breach categories will enable them to prioritise security based on severity and protect their data. In this regard, this research aims to provide insights into dominant, emerging and declining data breach trends and further explore their implications. Finally, the study concludes by contributing to recommendation strategies and future directions for further research.

Paperid: 3794, https://arxiv.org/pdf/2502.04759.pdf

Abstract:
Phishing has long been a common tactic used by cybercriminals and continues to pose a significant threat in today's digital world. When phishing attacks become more advanced and sophisticated, there is an increasing need for effective methods to detect and prevent them. To address the challenging problem of detecting phishing emails, researchers have developed numerous solutions, in particular those based on machine learning (ML) algorithms. In this work, we take steps to study the efficacy of large language models (LLMs) in detecting phishing emails. The experiments show that the LLM achieves a high accuracy rate at high precision; importantly, it also provides interpretable evidence for the decisions.

Paperid: 3795, https://arxiv.org/pdf/2502.04157.pdf

Abstract:
Cybersecurity is one of the most pressing technological challenges of our time and requires measures from all sectors of society. A key measure is automated security response, which enables automated mitigation and recovery from cyber attacks. Significant strides toward such automation have been made due to the development of rule-based response systems. However, these systems have a critical drawback: they depend on domain experts to configure the rules, a process that is both error-prone and inefficient. Framing security response as an optimal control problem shows promise in addressing this limitation but introduces new challenges. Chief among them is bridging the gap between theoretical optimality and operational performance. Current response systems with theoretical optimality guarantees have only been validated analytically or in simulation, leaving their practical utility unproven. This thesis tackles the aforementioned challenges by developing a practical methodology for optimal security response in IT infrastructures. It encompasses two systems. First, it includes an emulation system that replicates key components of the target infrastructure. We use this system to gather measurements and logs, based on which we identify a game-theoretic model. Second, it includes a simulation system where game-theoretic response strategies are optimized through stochastic approximation to meet a given objective, such as mitigating potential attacks while maintaining operational services. These strategies are then evaluated and refined in the emulation system to close the gap between theoretical and operational performance. We prove structural properties of optimal response strategies and derive efficient algorithms for computing them. This enables us to solve a previously unsolved problem: demonstrating optimal security response against network intrusions on an IT infrastructure.

Paperid: 3797, https://arxiv.org/pdf/2502.01835.pdf

Abstract:
The proliferation of Internet of Things (IoT) devices has created a pressing need for efficient security solutions, particularly against Denial of Service (DoS) attacks. While existing detection approaches demonstrate high accuracy, they often require substantial computational resources, making them impractical for IoT deployment. This paper introduces a novel lightweight approach to DoS attack detection based on Kolmogorov-Arnold Networks (KANs). By leveraging spline-based transformations instead of traditional weight matrices, our solution achieves state-of-the-art detection performance while maintaining minimal resource requirements. Experimental evaluation on the CICIDS2017 dataset demonstrates 99.0% detection accuracy with only 0.19 MB memory footprint and 2.00 ms inference time per sample. Compared to existing solutions, KAN reduces memory requirements by up to 98% while maintaining competitive detection rates. The model's linear computational complexity ensures efficient scaling with input size, making it particularly suitable for large-scale IoT deployments. We provide comprehensive performance comparisons with recent approaches and demonstrate effectiveness across various DoS attack patterns. Our solution addresses the critical challenge of implementing sophisticated attack detection on resource-constrained devices, offering a practical approach to enhancing IoT security without compromising computational efficiency.

Paperid: 3798, https://arxiv.org/pdf/2502.00975.pdf

Abstract:
Distributed Denial of Service (DDoS) attacks make the challenges to provide the services of the data resources to the web clients. In this paper, we concern to study and apply different Machine Learning (ML) techniques to separate the DDoS attack instances from benign instances. Our experimental results show that forward and backward data bytes of our dataset are observed more similar for DDoS attacks compared to the data bytes for benign attempts. This paper uses different machine learning techniques for the detection of the attacks efficiently in order to make sure the offered services from web servers available. This results from the proposed approach suggest that 97.1% of DDoS attacks are successfully detected by the Support Vector Machine (SVM). These accuracies are better while comparing to the several existing machine learning approaches.

Paperid: 3799, https://arxiv.org/pdf/2502.00950.pdf

Abstract:
Audio data are widely exchanged over telecommunications networks. Due to the limitations of network resources, these data are typically compressed before transmission. Various methods are available for compressing audio data. To access such audio information, it is first necessary to identify the codec used for compression. One of the most effective approaches for audio codec identification involves analyzing the content of received packets. In these methods, statistical features extracted from the packets are utilized to determine the codec employed. This paper proposes a novel method for audio codec classification based on features derived from the overlapped longest common sub-string and sub-sequence (LCS). The simulation results, which achieved an accuracy of 97% for 8 KB packets, demonstrate the superiority of the proposed method over conventional approaches. This method divides each 8 KB packet into fifteen 1 KB packets with a 50% overlap. The results indicate that this division has no significant impact on the simulation outcomes, while significantly speeding up the feature extraction, being eight times faster than the traditional method for extracting LCS features.

Paperid: 3800, https://arxiv.org/pdf/2502.00069.pdf

Abstract:
Information hiding technology utilizes the insensitivity of human sensory organs to redundant data, hiding confidential information in the redundant data of these public digital media, and then transmitting it. The carrier media after hiding secret information only displays its own characteristics, which can ensure the transmission of confidential information without being detected, thereby greatly improving the security of the information. In theory, any digital media including image, video, audio, and text can serve as a host carrier. Among them, hiding information in binary images poses great challenges. As we know, any information hiding method involves modifying the data of the host carrier. The more information hidden, the more data of the host carrier are modified. In this paper, we propose information hiding in the black-and-white mixed region of binary images, which can greatly reduce visual distortion. In addition, we propose an efficient encoding to achieve high-capacity information hiding while ensuring image semantics. By selecting binary images of different themes, we conduct experiments. The experimental results prove the feasibility of our technique and verify the expected performance. Since the candidate units for information hiding are selected from equally sized blocks that the image is divided into, and the hiding and extraction of information are based on a shared encoding table, the computational cost is very low, making it suitable for real-time information hiding applications.

Paperid: 3801, https://arxiv.org/pdf/2501.18625.pdf

Abstract:
Anonymization of graph-based data is a problem which has been widely studied over the last years and several anonymization methods have been developed. Information loss measures have been used to evaluate data utility and information loss in the anonymized graphs. However, there is no consensus about how to evaluate data utility and information loss in privacy-preserving and anonymization scenarios, where the anonymous datasets were perturbed to hinder re-identification processes. Authors use diverse metrics to evaluate data utility and, consequently, it is complex to compare different methods or algorithms in literature. In this paper we propose a framework to evaluate and compare anonymous datasets in a common way, providing an objective score to clearly compare methods and algorithms. Our framework includes metrics based on generic information loss measures, such as average distance or betweenness centrality, and also task-specific information loss measures, such as community detection or information flow. Additionally, we provide some metrics to examine re-identification and risk assessment. We demonstrate that our framework could help researchers and practitioners to select the best parametrization and/or algorithm to reduce information loss and maximize data utility.

Paperid: 3802, https://arxiv.org/pdf/2501.18394.pdf

Abstract:
Quantum-key distribution (QKD) schemes employing quantum communication links are typically based on the transmission of weak optical pulses over optical fibers to setup a secret key between the transmitting and receiving nodes. Alice transmits optically a random bit stream to the receiver (Bob) through the photon polarizations or the quadrature components of the lightwaves associated with the photons, with a secret key remaining implicitly embedded therein. However, during the above transmission, some eavesdropper (Eve) might attempt to tap the passing-by photons from the optical fiber links to extract the key. In one of the popular QKD schemes, along with signal pulses, some additional decoy pulses are transmitted by Alice, while Eve might use photon-number splitting (PNS) for eavesdropping. In a typical PNS scheme, (i) the optical pulses with single photon are blocked by Eve, (ii) from the optical pulses with two photons, one photon is retained by Eve to carry out eavesdropping operation and the other is retransmitted to Bob, and (iii) all other pulses with more than two photons are retransmitted by Eve to Bob without retaining any photon from them. Extensive theoretical research has been carried out on such QKD schemes, by employing information-theoretic approach along with computer simulations and experimental studies. We present a novel event-by-event impairment enumeration approach to evaluate the overall performance of one such QKD scheme analytically with due consideration to the physical layer of the quantum communication links. The proposed approach monitors the impairments of the propagating optical pulses event-by-event at all possible locations along the optical fiber link using statistical approach, and provides estimates of the realizable key generation rate, while assuring an adequate yield ratio between signal and decoy pulses for the detection of possible eavesdropping.

Paperid: 3803, https://arxiv.org/pdf/2501.16451.pdf

Abstract:
This paper proposes a method of emulation of \verb|OP_RAND| opcode on Bitcoin through a trustless interactive game between transaction counterparties. The game result is probabilistic and doesn't allow any party to cheat, increasing their chance of winning on any protocol step. The protocol can be organized in a way unrecognizable to any external party and doesn't require some specific scripts or Bitcoin protocol updates. We will show how the protocol works on the simple \textbf{Thimbles Game} and provide some initial thoughts about approaches and applications that can use the mentioned approach.

Paperid: 3804, https://arxiv.org/pdf/2501.16217.pdf

Abstract:
Field Programmable Gate Arrays (FPGAs) are increasingly in various applications. This is due to the fact that they provide flexibility to reprogram and modify in realtime with minimum effort. The increasing usage of FPGA must also ensure its end users with a guarantee that they are able to overcome failures and work in the harshest of environments. Today's end user demands a guarantee that the system he/she is buying remains functional. FPGA Vendor Xilinx provides its users with that guarantee in the name of Isolation Design flow or IDF for short. Xilinx claims that IDF can help ensure users reliability while providing flexibility. If true, application that can benefit from IDF are vast, such as Avionics, Unmanned probes, Self-driving fully autonomous vehicles, etc. This thesis puts the Xilinx claim regarding IDF to the test by implementing a single-chip cryptographic application, namely Advanced Encryption Standard, according to the rules and regulations defined by isolation design flow. This thesis does so by replicating and injecting faults in the system that conforms to the rules of IDF and records its behavior firsthand to observe IDF effectiveness. This thesis can also help end users and system evaluators interested in effectiveness of IDF with an independent view other than Xilinx by providing them with statistical data collected to remove all doubts.

Paperid: 3805, https://arxiv.org/pdf/2501.15760.pdf

Abstract:
Despite decades of development, existing IDSs still face challenges in improving detection accuracy, evasion, and detection of unknown attacks. To solve these problems, many researchers have focused on designing and developing IDSs that use Deep Neural Networks (DNN) which provides advanced methods of threat investigation and detection. Given this reason, the motivation of this research then, is to learn how effective applications of Deep Neural Networks (DNN) can accurately detect and identify malicious network intrusion, while advancing the frontiers of their optimal potential use in network intrusion detection. Using the ASNM-TUN dataset, the study used a Multilayer Perceptron modeling approach in Deep Neural Network to identify network intrusions, in addition to distinguishing them in terms of legitimate network traffic, direct network attacks, and obfuscated network attacks. To further enhance the speed and efficiency of this DNN solution, a thorough feature selection technique called Forward Feature Selection (FFS), which resulted in a significant reduction in the feature subset, was implemented. Using the Multilayer Perceptron model, test results demonstrate no support for the model to accurately and correctly distinguish the classification of network intrusion.

Paperid: 3806, https://arxiv.org/pdf/2501.15751.pdf

Abstract:
A Bloom filter is a space-efficient probabilistic data structure that represents a set $S$ of elements from a larger universe $U$. This efficiency comes with a trade-off, namely, it allows for a small chance of false positives. When you query the Bloom filter about an element x, the filter will respond 'Yes' if $x \in S$. If $x \notin S$, it may still respond 'Yes' with probability at most $\varepsilon$. We investigate the adversarial robustness and privacy of Bloom filters, addressing open problems across three prominent frameworks: the game-based model of Naor-Oved-Yogev (NOY), the simulator-based model of Filic et. al., and learning-augmented variants. We prove the first formal connection between the Filic and NOY models, showing that Filic correctness implies AB-test resilience. We resolve a longstanding open question by proving that PRF-backed Bloom filters fail the NOY model's stronger BP-test. Finally, we introduce the first private Bloom filters with differential privacy guarantees, including constructions applicable to learned Bloom filters. Our taxonomy organizes the space of robustness and privacy guarantees, clarifying relationships between models and constructions.

Paperid: 3807, https://arxiv.org/pdf/2501.15578.pdf

Abstract:
This paper introduces a novel complexity-informed approach to cybersecurity management, addressing the challenges found within complex cyber defences. We adapt and extend the complexity theory to cybersecurity and develop a quantitative framework that empowers decision-makers with strategies to de-complexify defences, identify improvement opportunities, and resolve bottlenecks. Our approach also provides a solid foundation for critical cybersecurity decisions, such as tooling investment or divestment, workforce capacity planning, and optimisation of processes and capabilities. Through a case study, we detail and validate a systematic method for assessing and managing the complexity within cybersecurity defences. The complexity-informed approach based on MITRE ATT&CK, is designed to complement threat-informed defences. Threat-informed methods focus on understanding and countering adversary tactics, while the complexity-informed approach optimises the underlying defence infrastructure, thereby optimising the overall efficiency and effectiveness of cyber defences.

Paperid: 3808, https://arxiv.org/pdf/2501.14496.pdf

Abstract:
This note documents an implementation issue in recent adaptive attacks (Zhang et al. [2024]) against the multi-resolution self-ensemble defense (Fort and Lakshminarayanan [2024]). The implementation allowed adversarial perturbations to exceed the standard $L_\infty = 8/255$ bound by up to a factor of 20$\times$, reaching magnitudes of up to $L_\infty = 160/255$. When attacks are properly constrained within the intended bounds, the defense maintains non-trivial robustness. Beyond highlighting the importance of careful validation in adversarial machine learning research, our analysis reveals an intriguing finding: properly bounded adaptive attacks against strong multi-resolution self-ensembles often align with human perception, suggesting the need to reconsider how we measure adversarial robustness.

Paperid: 3809, https://arxiv.org/pdf/2501.14184.pdf

Abstract:
This paper presents tight upper and lower bounds for minimum number of samples (copies of a quantum state) required to attain a prescribed accuracy (measured by error variance) for scalar parameters estimation using unbiased estimators under quantum local differential privacy for qubits. Particularly, the best-case (optimal) scenario is considered by minimizing the sample complexity over all differentially-private channels; the worst-case channels can be arbitrarily uninformative and render the problem ill-defined. In the small privacy budget $Îµ$ regime, i.e., $Îµ\ll 1$, the sample complexity scales as $Î(Îµ^{-2})$. This bound matches that of classical parameter estimation under local differential privacy. The lower bound however loosens in the large privacy budget regime, i.e., $Îµ\gg 1$. The upper bound for the minimum number of samples is generalized to qudits (with dimension $d$) resulting in sample complexity of $O(dÎµ^{-2})$.

Paperid: 3810, https://arxiv.org/pdf/2501.14175.pdf

Abstract:
Given that disturbances to the stable and normal operation of power systems have grown phenomenally, particularly in terms of unauthorized access to confidential and critical data, injection of malicious software, and exploitation of security vulnerabilities in a poorly patched software among others; then developing, as a countermeasure, an assessment solutions with machine learning capabilities to match up in real-time, with the growth and fast pace of these cyber-attacks, is not only critical to the security, reliability and safe operation of power system, but also germane to guaranteeing advanced monitoring and efficient threat detection. Using the Mississippi State University and Oak Ridge National Laboratory dataset, the study used an XGB Classifier modeling approach in machine learning to diagnose and assess power system disturbances, in terms of Attack Events, Natural Events and No-Events. As test results show, the model, in all the three sub-datasets, generally demonstrates good performance on all metrics, as it relates to accurately identifying and classifying all the three power system events.

Paperid: 3811, https://arxiv.org/pdf/2501.13974.pdf

Abstract:
This dissertation addresses the challenge of ensuring transactional integrity and reducing costs in corporate governance through blockchain technology. We propose an on-chain methodology for certifying, registering, and querying institutional transactional status. Our decentralized governance approach utilizes consensus mechanisms and smart contracts to automate and enforce business rules. The framework aims to reduce the transaction costs associated with contractual measurement reports and enhance overall transactional integrity. We provide a detailed exploration of how blockchain technology can be effectively harnessed to offer a robust solution to these challenges, setting the stage for our proposed solution and its potential impact on corporate governance. The application of the methodology resulted in as average of 2% overbilling reduction.

Paperid: 3812, https://arxiv.org/pdf/2501.12883.pdf

Abstract:
Recent advances in generative artificial intelligence (AI), such as ChatGPT, Google Gemini, and other large language models (LLMs), pose significant challenges to upholding academic integrity in higher education. This paper investigates the susceptibility of a Master's-level cyber security degree program at a UK Russell Group university, accredited by a leading national body, to LLM misuse. Through the application and extension of a quantitative assessment framework, we identify a high exposure to misuse, particularly in independent project- and report-based assessments. Contributing factors, including block teaching and a predominantly international cohort, are highlighted as potential amplifiers of these vulnerabilities. To address these challenges, we discuss the adoption of LLM-resistant assessments, detection tools, and the importance of fostering an ethical learning environment. These approaches aim to uphold academic standards while preparing students for the complexities of real-world cyber security.

Paperid: 3813, https://arxiv.org/pdf/2501.12080.pdf

Abstract:
Secure multi-party computation is an area in cryptography which studies how multiple parties can compare their private information without revealing it. Besides digital protocols, many physical protocols for secure multi-party computation using portable objects found in everyday life have also been developed. The vast majority of them use cards as the main tools. In this paper, we introduce the use of a balance scale and coins as new physical tools for secure multi-party computation. In particular, we develop four protocols that can securely compute any $n$-variable Boolean function using a balance scale and coins.

Paperid: 3814, https://arxiv.org/pdf/2501.11788.pdf

Abstract:
In this work, we propose an error-free, information-theoretically secure multi-valued asynchronous Byzantine agreement (ABA) protocol, called OciorABA. This protocol achieves ABA consensus on an $\ell$-bit message with an expected communication complexity of $O(n\ell + n^3 \log q )$ bits and an expected round complexity of $O(1)$ rounds, under the optimal resilience condition $n \geq 3t + 1$ in an $n$-node network, where up to $t$ nodes may be dishonest. Here, $q$ denotes the alphabet size of the error correction code used in the protocol. In our protocol design, we introduce a new primitive: asynchronous partial vector agreement (APVA). In APVA, the distributed nodes input their vectors and aim to output a common vector, where some of the elements of those vectors may be missing or unknown. We propose an APVA protocol with an expected communication complexity of $O( n^3 \log q )$ bits and an expected round complexity of $O(1)$ rounds. This APVA protocol serves as a key building block for our OciorABA protocol.

Paperid: 3815, https://arxiv.org/pdf/2501.11558.pdf

Abstract:
Federated Learning (FL) is a distributed machine learning approach that has emerged as an effective way to address recent privacy concerns. However, FL introduces the need for additional security measures as FL alone is still subject to vulnerabilities such as model and data poisoning and inference attacks. Confidential Computing (CC) is a paradigm that, by leveraging hardware-based trusted execution environments (TEEs), protects the confidentiality and integrity of ML models and data, thus resulting in a powerful ally of FL applications. Typical TEEs offer an application-isolation level but suffer many drawbacks, such as limited available memory and debugging and coding difficulties. The new generation of TEEs offers a virtual machine (VM)-based isolation level, thus reducing the porting effort for existing applications. In this work, we compare the performance of VM-based and application-isolation level TEEs for confidential FL (CFL) applications. In particular, we evaluate the impact of TEEs and additional security mechanisms such as TLS (for securing the communication channel). The results, obtained across three datasets and two deep learning models, demonstrate that the new VM-based TEEs introduce a limited overhead (at most 1.5x), thus paving the way to leverage public and untrusted computing environments, such as HPC facilities or public cloud, without detriment to performance.

Paperid: 3816, https://arxiv.org/pdf/2501.11091.pdf

Abstract:
This paper reconceptualizes Bitcoin not merely as a financial protocol but as a non-continuous temporal system. Traditional financial and computing systems impose time through synchronized external clocks, whereas Bitcoin constructs temporal order internally through decentralized consensus. We analyze block generation as a probabilistic process, forks and rollbacks as disruptions of continuity, and transaction confirmations as non-linear consolidations of history. Building on this foundation, we reinterpret proof-of-work (PoW) as an entropy-compression mechanism that drives Bitcoin's temporal architecture. In this framework, difficulty adjustment keeps block discovery near the entropy-maximizing regime, the longest-chain rule enforces entropy collapse once a block is found, and recursive hash pointers embed each block within its predecessor, creating a cascading structure that renders reorganization exponentially improbable. Together, these mechanisms compress randomness into a sequence of discrete and irreversible steps, demonstrating how Bitcoin establishes an emergent and convergent notion of time without reliance on external clocks.

Paperid: 3817, https://arxiv.org/pdf/2501.10888.pdf

Abstract:
Selfish mining is strategic rule-breaking to maximize rewards in proof-of-work protocols. Markov Decision Processes (MDPs) are the preferred tool for finding optimal strategies in Bitcoin and similar linear chain protocols. Protocols increasingly adopt DAG-based chain structures, for which MDP analysis is more involved. To date, researchers have tailored specific MDPs for each protocol. Protocol design suffers long feedback loops, as each protocol change implies manual work on the MDP. To overcome this, we propose a generic attack model that covers a wide range of protocols, including Ethereum Proof-of-Work, GhostDAG, and Parallel Proof-of-Work. Our approach is modular: we specify each protocol as a concise program, and our tooling then derives and solves the selfish mining MDP automatically.

Paperid: 3818, https://arxiv.org/pdf/2501.10467.pdf

Abstract:
This paper critically examines the evolving ethical and regulatory challenges posed by the integration of artificial intelligence (AI) in cybersecurity. We trace the historical development of AI regulation, highlighting major milestones from theoretical discussions in the 1940s to the implementation of recent global frameworks such as the European Union AI Act. The current regulatory landscape is analyzed, emphasizing risk-based approaches, sector-specific regulations, and the tension between fostering innovation and mitigating risks. Ethical concerns such as bias, transparency, accountability, privacy, and human oversight are explored in depth, along with their implications for AI-driven cybersecurity systems. Furthermore, we propose strategies for promoting AI literacy and public engagement, essential for shaping a future regulatory framework. Our findings underscore the need for a unified, globally harmonized regulatory approach that addresses the unique risks of AI in cybersecurity. We conclude by identifying future research opportunities and recommending pathways for collaboration between policymakers, industry leaders, and researchers to ensure the responsible deployment of AI technologies in cybersecurity.

Paperid: 3819, https://arxiv.org/pdf/2501.10443.pdf

Abstract:
With the growing popularity and rising value of cryptocurrencies, skepticism surrounding this groundbreaking innovation persists. Many financial and business experts argue that the value created in the cryptocurrency realm resembles the generation of currency from thin air. However, a historical analysis of the fundamental concepts that have shaped money reveals striking parallels with past transformations in human society. This study extends these historical insights to the present era, demonstrating how enduring monetary concepts are once again redefining our understanding of money and reshaping its form. Additionally, we offer novel interpretations of cryptocurrency by linking the intrinsic nature of money, the communities it fosters, and the cryptographic technologies that have provided the infrastructure for this transformative shift.

Paperid: 3820, https://arxiv.org/pdf/2501.10441.pdf

Abstract:
The integration of information and physical systems in modern power grids has heightened vulnerabilities to False Data Injection Attacks (FDIAs), threatening the secure operation of power cyber-physical systems (CPS). This paper reviews FDIA detection, evolution, and data reconstruction strategies, highlighting cross-domain coordination, multi-temporal evolution, and stealth characteristics. Challenges in existing detection methods, including poor interpretability and data imbalance, are discussed, alongside advanced state-aware and action-control data reconstruction techniques. Key issues, such as modeling FDIA evolution and distinguishing malicious data from regular faults, are identified. Future directions to enhance system resilience and detection accuracy are proposed, contributing to the secure operation of power CPS.

Paperid: 3821, https://arxiv.org/pdf/2501.10427.pdf

Abstract:
I examine threat modeling techniques and questions of power dynamics in the systems in which they're used. I compare techniques that can be used by system creators to those used by those who are not involved in creating the system. That second set of analysts might be scientists doing research, consumers comparing products, or those trying to analyze a new system being deployed by a government. Their access to information, skills and choices are different. I examine the impact of those difference on threat modeling methods.

Paperid: 3822, https://arxiv.org/pdf/2501.10419.pdf

Abstract:
We describe a protocol for creating, updating, and transferring digital assets securely, with strong privacy and self-custody features for the initial owner based upon the earlier work of Goodell, Toliver, and Nakib. The architecture comprises three components: a mechanism to unlink counterparties in the transaction channel, a mechanism for oblivious transactions, and a mechanism to prevent service providers from equivocating. We present an approach for the implementation of these components.

Paperid: 3823, https://arxiv.org/pdf/2501.10394.pdf

Abstract:
This paper introduces a novel cryptographic approach based on the continuous logarithm in the complex circle, designed to address the challenges posed by quantum computing. By leveraging its multi-valued and spectral properties, this framework enables the reintroduction of classical algorithms (DH, ECDSA, ElGamal, EC) and elliptic curve variants into the post-quantum landscape. Transitioning from classical or elliptic algebraic structures to the geometric and spectral properties of the complex circle, we propose a robust and adaptable foundation for post-quantum cryptography.

Paperid: 3824, https://arxiv.org/pdf/2501.10393.pdf

Abstract:
With the advancement of quantum computing technologies, recent years have seen increasing efforts to identify cryptographic methods resistant to quantum attacks and to establish post-quantum cryptography (PQC) approaches. Among these, hash-based digital signature algorithms (DSAs) are a notable category of PQC. Hash functions are not only utilized in digital signatures but are also widely applied in pseudorandom number generators (PRNGs). Building on the foundation of hash-based DSAs, this study proposes a modified approach that introduces a DSA based on PRNGs, suitable for one-time signature (OTS) applications. The study explores the security of the proposed PRNG-based OTS algorithm and validates its feasibility through experiments comparing various parameter configurations. These experiments examine key length, signature length, key generation time, signature generation time, and signature verification time under different parameter settings.

Paperid: 3825, https://arxiv.org/pdf/2501.10378.pdf

Abstract:
This article examines the broader societal implications of blockchain technology and crypto-assets, emphasizing their role in the evolution of humanity as a "superorganism" with decentralized, self-regulating systems. Drawing on interdisciplinary concepts such as Nate Hagens' "superorganism" idea and Francis Heylighen's "global brain" theory, the paper contextualizes blockchain technology within the ongoing evolution of governance systems and global systems such as the financial system. Blockchain's decentralized nature, in conjunction with advancements like artificial intelligence and decentralized autonomous organizations (DAOs), could transform traditional financial, economic, and governance structures by enabling the emergence of collective distributed decision-making and global coordination. In parallel, the article aligns blockchain's impact with developmental theories such as Spiral Dynamics. This framework is used to illustrate blockchain's potential to foster societal growth beyond hierarchical models, promoting a shift from centralized authority to collaborative and self-governed communities. The analysis provides a holistic view of blockchain as more than an economic tool, positioning it as a catalyst for the evolution of society into a mature, interconnected global planetary organism.

Paperid: 3826, https://arxiv.org/pdf/2501.10083.pdf

Abstract:
One of the key characteristics of secure quantum communication is quantum secure multiparty computation. In this paper, we propose a quantum secure multiparty summation (QSMS) protocol that can be applied to many complex quantum operations. It is based on the $(t, n)$ threshold approach. We combine the classical and quantum phenomena to make this protocol realistic and secure. Because the current protocols employ the $(n, n)$ threshold approach, which requires all honest players to execute the quantum multiparty summation protocol, they have certain security and efficiency problems. However, we employ a $(t, n)$ threshold approach, which requires the quantum summation protocol to be computed only by $t$ honest players. Our suggested protocol is more economical, practical, and secure than alternative protocols.

Paperid: 3827, https://arxiv.org/pdf/2501.09568.pdf

Abstract:
The Diffie-Hellman key exchange plays a crucial role in conventional cryptography, as it allows two legitimate users to establish a common, usually ephemeral, secret key. Its security relies on the discrete-logarithm problem, which is considered to be a mathematical one-way function, while the final key is formed by random independent actions of the two users. In the present work we investigate the extension of Diffie-Hellman key exchange to the quantum setting, where the two legitimate users exchange independent random quantum states. The proposed protocol relies on the bijective mapping of integers onto a set of symmetric coherent states, and we investigate the regime of parameters for which the map behaves as a quantum one-way function. Its security is analyzed in the framework of minimum-error-discrimination and photon-number-splitting attacks, while its performance and the challenges in a possible realization are also discussed.

Paperid: 3828, https://arxiv.org/pdf/2501.09559.pdf

Abstract:
One crucial and basic method for disclosing a secret to every participant in quantum cryptography is quantum secret sharing. Numerous intricate protocols, including secure multiparty summation, multiplication, sorting, voting, and more, can be designed with it. A quantum secret sharing protocol with a $(t,n)$ threshold approach and modulo d, where t and n represent the threshold number of participants and the total number of participants, respectively was recently discussed by Song et al. Kao et al. notes that without the information of other participants, the secret in Song {\em et al.'s}protocol cannot be reconstructed. We address a protocol that solves this issue in this paper.

Paperid: 3829, https://arxiv.org/pdf/2501.09182.pdf

Abstract:
As artificial intelligence (AI) systems become increasingly integral to critical infrastructure and global operations, the need for a unified, trustworthy governance framework is more urgent that ever. This paper proposes a novel approach to AI governance, utilizing blockchain and distributed ledger technologies (DLT) to establish a decentralized, globally recognized framework that ensures security, privacy, and trustworthiness of AI systems across borders. The paper presents specific implementation scenarios within the financial sector, outlines a phased deployment timeline over the next decade, and addresses potential challenges with solutions grounded in current research. By synthesizing advancements in blockchain, AI ethics, and cybersecurity, this paper offers a comprehensive roadmap for a decentralized AI governance framework capable of adapting to the complex and evolving landscape of global AI regulation.

Paperid: 3830, https://arxiv.org/pdf/2501.09033.pdf

Abstract:
The growing complexity of cyber attacks has necessitated the evolution of firewall technologies from static models to adaptive, machine learning-driven systems. This research introduces "Dynamically Retrainable Firewalls", which respond to emerging threats in real-time. Unlike traditional firewalls that rely on static rules to inspect traffic, these advanced systems leverage machine learning algorithms to analyze network traffic pattern dynamically and identify threats. The study explores architectures such as micro-services and distributed systems for real-time adaptability, data sources for model retraining, and dynamic threat identification through reinforcement and continual learning. It also discusses strategies to improve performance, reduce latency, optimize resource utilization, and address integration issues with present-day concepts such as Zero Trust and mixed environments. By critically assessing the literature, analyzing case studies, and elucidating areas of future research, this work suggests dynamically retrainable firewalls as a more robust form of network security. Additionally, it considers emerging trends such as advancements in AI and quantum computing, ethical issues, and other regulatory questions surrounding future AI systems. These findings provide valuable information on the future state of adaptive cyber security, focusing on the need for proactive and adaptive measures that counter cyber threats that continue to evolve.

Paperid: 3831, https://arxiv.org/pdf/2501.09032.pdf

Abstract:
"Distributed Identity" refers to the transition from centralized identity systems using Decentralized Identifiers (DID) and Verifiable Credentials (VC) for secure and privacy-preserving authentications. With distributed identity, control of identity data is returned to the user, making credential-based attacks impossible due to the lack of a single point of failure. This study assesses the security improvements achieved when distributed identity is employed with the ZTA principle, particularly concerning lateral movements within segmented networks. It also considers areas such as the implementation specifications of the framework, the advantages and disadvantages of the method to organizations, and the issues of compatibility and generalizability. Furthermore, the study highlights privacy and regulatory compliance, including the General Data Protection Regulation (GDPR) and California Consumer Data Privacy Act (CCPA), analyzing potential solutions to these problems. The study implies that adopting distributed identities can enhance overall security postures by an order of magnitude, providing contextual and least-privilege authorization and user privacy. The research recommends refining technical standards, expanding the use of distributed identity in practice, and discussing its applications for the contemporary digital security landscape.

Paperid: 3832, https://arxiv.org/pdf/2501.09029.pdf

Abstract:
This paper explores the integration of provenance tracking systems within the context of Semantic Web technologies to enhance data integrity in diverse operational environments. SURROUND Australia Pty Ltd demonstrates innovative applica-tions of the PROV Data Model (PROV-DM) and its Semantic Web variant, PROV-O, to systematically record and manage provenance information across multiple data processing domains. By employing RDF and Knowledge Graphs, SURROUND ad-dresses the critical challenges of shared entity identification and provenance granularity. The paper highlights the company's architecture for capturing comprehensive provenance data, en-abling robust validation, traceability, and knowledge inference. Through the examination of two projects, we illustrate how provenance mechanisms not only improve data reliability but also facilitate seamless integration across heterogeneous systems. Our findings underscore the importance of sophisticated provenance solutions in maintaining data integrity, serving as a reference for industry peers and academics engaged in provenance research and implementation.

Paperid: 3833, https://arxiv.org/pdf/2501.08840.pdf

Abstract:
Binary Static Code Analysis (BSCA) is a pivotal area in software vulnerability research, focusing on the precise localization of vulnerabilities within binary executables. Despite advancements in BSCA techniques, there is a notable scarcity of comprehensive and readily usable vulnerability datasets tailored for diverse environments such as IoT, UEFI, and MCU firmware. To address this gap, we present CveBinarySheet, a meticulously curated database containing 1033 CVE entries spanning from 1999 to 2024. Our dataset encompasses 16 essential third-party components, including busybox and curl, and supports five CPU architectures: x86-64, i386, MIPS, ARMv7, and RISC-V64. Each precompiled binary is available at two compiler optimization levels (O0 and O3), facilitating comprehensive vulnerability analysis under different compilation scenarios. By providing detailed metadata and diverse binary samples, CveBinarySheet aims to accelerate the development of state-of-the-art BSCA tools, binary similarity analysis, and vulnerability matching applications.

Paperid: 3834, https://arxiv.org/pdf/2501.07209.pdf

Abstract:
With the increasing use of online services, the protection of the privacy of users becomes more and more important. This is particularly critical as authentication and authorization as realized on the Internet nowadays, typically relies on centralized identity management solutions. Although those are very convenient from a user's perspective, they are quite intrusive from a privacy perspective and are currently far from implementing the concept of data minimization. Fortunately, cryptography offers exciting primitives such as zero-knowledge proofs and advanced signature schemes to realize various forms of so-called anonymous credentials. Such primitives allow to realize online authentication and authorization with a high level of built-in privacy protection (what we call privacy-preserving authentication). Though these primitives have already been researched for various decades and are well understood in the research community, unfortunately, they lack widespread adoption. In this paper, we look at the problems, what cryptography can do, some deployment examples, and barriers to widespread adoption. Latter using the example of the EU Digital Identity Wallet (EUDIW) and the recent discussion and feedback from cryptography experts around this topic. We also briefly comment on the transition to post-quantum cryptography.

Paperid: 3835, https://arxiv.org/pdf/2501.06281.pdf

Abstract:
Zero Trust Architectures (ZTA) fundamentally redefine network security by adopting a "trust nothing, verify everything" approach that requires identity verification for all access. Conventional discrete access control measures have proven inadequate since they do not consider evolving user activities and contextual threats, leading to internal threats and enhanced attacks. This research applies the proposed AI-driven, autonomous, identity-based threat segmentation in ZTA, along with real-time identity analytics for fine-grained, real-time mechanisms. Some of the sharp practices include using the behavioral analytics approach to provide real-time risk scores, such as analyzing the patterns used for logging into the system, the access sought, and the resources used. Permissions are adjusted using machine learning models that take into account context-aware factors like geolocation, device type, and access time. Automated threat segmentation helps analysts identify multiple compromised identities in real-time, thus minimizing the likelihood of a breach advancing. The system's use cases are based on real scenarios; for example, insider threats in global offices demonstrate how compromised accounts are detected and locked. This work outlines measures to address privacy issues, false positives, and scalability concerns. This research enhances the security of other critical areas of computer systems by providing dynamic access governance, minimizing insider threats, and supporting dynamic policy enforcement while ensuring that the needed balance between security and user productivity remains a top priority. We prove via comparative analyses that the model is precise and scalable.

Paperid: 3836, https://arxiv.org/pdf/2501.06267.pdf

Abstract:
We describe a protocol for creating, updating, and revoking digital diplomas that we anticipate would make use of the protocol for transferring digital assets elaborated by Goodell, Toliver, and Nakib. Digital diplomas would maintain their own state, and make use a distributed ledger as a mechanism for verifying their integrity. The use of a distributed ledger enables verification of the state of an asset without the need to contact the issuing institution, and we describe how the integrity of a diploma issued in this way can persist even in the absence of the issuing institution.

Paperid: 3837, https://arxiv.org/pdf/2501.05627.pdf

Abstract:
In recent years, U.S. Department of Transportation has adopts Institute of Electrical and Electronics Engineers (IEEE) 1609 series to build the security credential management system (SCMS) for being the standard of connected cars in U.S. Furthermore, a butterfly key expansion (BKE) method in SCMS has been designed to provide pseudonym certificates for improving the privacy of connected cars. However, the BKE method is designed based on elliptic curve cryptography (ECC) in the standard of IEEE 1609.2.1, but more execution time is required for key expansion. Therefore, this study proposes an original efficient key expansion method, and the mathematical principles have been proposed to prove the encryption/decryption feasibility, car privacy, and method efficiency. In a practical environment, the proposed method improves the efficiency of key expansion method in IEEE 1609.2.1-2022 with the same security strength thousands of times.

Paperid: 3838, https://arxiv.org/pdf/2501.05500.pdf

Abstract:
This survey provides a comprehensive examination of verifiable computing, tracing its evolution from foundational complexity theory to modern zero-knowledge succinct non-interactive arguments of knowledge (ZK-SNARKs). We explore key developments in interactive proof systems, knowledge complexity, and the application of low-degree polynomials in error detection and verification protocols. The survey delves into essential mathematical frameworks such as the Cook-Levin Theorem, the sum-check protocol, and the GKR protocol, highlighting their roles in enhancing verification efficiency and soundness. By systematically addressing the limitations of traditional NP-based proof systems and then introducing advanced interactive proof mechanisms to overcome them, this work offers an accessible step-by-step introduction for newcomers while providing detailed mathematical analyses for researchers. Ultimately, we synthesize these concepts to elucidate the GKR protocol, which serves as a foundation for contemporary verifiable computing models. This survey not only reviews the historical and theoretical advancements in verifiable computing over the past three decades but also lays the groundwork for understanding recent innovations in the field.

Paperid: 3839, https://arxiv.org/pdf/2501.05165.pdf

Abstract:
Context. Developing secure and reliable software remains a key challenge in software engineering (SE). The ever-evolving technological landscape offers both opportunities and threats, creating a dynamic space where chaos and order compete. Secure software engineering (SSE) must continuously address vulnerabilities that endanger software systems and carry broader socio-economic risks, such as compromising critical national infrastructure and causing significant financial losses. Researchers and practitioners have explored methodologies like Static Application Security Testing Tools (SASTTs) and artificial intelligence (AI) approaches, including machine learning (ML) and large language models (LLMs), to detect and mitigate these vulnerabilities. Each method has unique strengths and limitations. Aim. This thesis seeks to bring order to the chaos in SSE by addressing domain-specific differences that impact AI accuracy. Methodology. The research employs a mix of empirical strategies, such as evaluating effort-aware metrics, analyzing SASTTs, conducting method-level analysis, and leveraging evidence-based techniques like systematic dataset reviews. These approaches help characterize vulnerability prediction datasets. Results. Key findings include limitations in static analysis tools for identifying vulnerabilities, gaps in SASTT coverage of vulnerability types, weak relationships among vulnerability severity scores, improved defect prediction accuracy using just-in-time modeling, and threats posed by untouched methods. Conclusions. This thesis highlights the complexity of SSE and the importance of contextual knowledge in improving AI-driven vulnerability and defect prediction. The comprehensive analysis advances effective prediction models, benefiting both researchers and practitioners.

Paperid: 3840, https://arxiv.org/pdf/2501.04841.pdf

Abstract:
The problem of a single point of failure in centralized systems poses a great challenge to the stability of such systems. Meanwhile, the tamperability of data within centralized systems makes users reluctant to trust and use centralized applications in many scenarios, including the financial and business sectors. Blockchain, as a new decentralized technology, addresses these issues effectively. As a typical decentralized system, blockchain can be utilized to build a data-sharing model. Users in a blockchain do not need to trust other users; instead, they trust that the majority of miner nodes are honest. Smart contracts enable developers to write distributed programs based on blockchain systems, ensuring that all code is immutable and secure. In this paper, we analyze the security of blockchain technology to illustrate its advantages and justify its use. Furthermore, we design a new system for storing and trading vehicle information based on the Ethereum blockchain and smart contract technology. Specifically, our system allows users to upload vehicle information and auction vehicles to transfer ownership. Our application provides great convenience to buyers and owners, while the use of smart contracts enhances the security and privacy of the system.

Paperid: 3841, https://arxiv.org/pdf/2501.04479.pdf

Abstract:
The increasing demand for connectivity in safety-critical domains has made security assurance a crucial consideration. In safety-critical industry, software, and connectivity have become integral to meeting market expectations. Regulatory bodies now require security assurance cases (SAC) to verify compliance, as demonstrated in ISO/SAE-21434 for automotive. However, existing approaches for creating SACs do not adequately address industry-specific constraints and requirements. In this thesis, we present CASCADE, an approach for creating SACs that aligns with ISO/SAE-21434 and integrates quality assurance measures. CASCADE is developed based on insights from industry needs and a systematic literature review. We explore various factors driving SAC adoption, both internal and external to companies in safety-critical domains, and identify gaps in the existing literature. Our approach addresses these gaps and focuses on asset-driven methodology and quality assurance. We provide an illustrative example and evaluate CASCADE's suitability and scalability in an automotive OEM. We evaluate the generalizability of CASCADE in the medical domain, highlighting its benefits and necessary adaptations. Furthermore, we support the creation and management of SACs by developing a machine-learning model to classify security-related requirements and investigating the management of security evidence. We identify deficiencies in evidence management practices and propose potential areas for automation. Finally, our work contributes to the advancement of security assurance practices and provides practical support for practitioners in creating and managing SACs.

Paperid: 3842, https://arxiv.org/pdf/2501.04168.pdf

Abstract:
We present a construction of one-time memories (OTMs) using classical-accessible stateless hardware, building upon the work of Broadbent et al. and Behera et al.. Unlike the aforementioned work, our approach leverages quantum random access codes (QRACs) to encode two classical bits, $b_0$ and $b_1$, into a single qubit state $\mathcal{E}(b_0 b_1)$ where the receiver can retrieve one of the bits with a certain probability of error. To prove soundness, we define a nonconvex optimization problem over POVMs on $\mathbb{C}^2$. This optimization gives an upper bound on the probability of distinguishing bit $b_{1-Î±}$ given that the probability that the receiver recovers bit $b_Î±$ is high. Assuming the optimization is sufficiently accurate, we then prove soundness against a polynomial number of classical queries to the hardware.

Paperid: 3843, https://arxiv.org/pdf/2501.04058.pdf

Abstract:
Focussing on two different use cases-Quality Control methods in industrial contexts and Neural Network algorithms for healthcare diagnostics-this research investigates the inclusion of Fully Homomorphic Encryption into real-world applications in the healthcare sector. We evaluate the performance, resource requirements, and viability of deploying FHE in these settings through extensive testing and analysis, highlighting the progress made in FHE tooling and the obstacles still facing addressing the gap between conceptual research and practical applications. We start our research by describing the specific case study and trust model were working with. Choosing the two FHE frameworks most appropriate for industry development, we assess the resources and performance requirements for implementing each of the two FHE frameworks in the first scenario, Quality Control algorithms. In conclusion, our findings demonstrate the effectiveness and resource consumption of the two use cases-complex NN models and simple QC algorithms-when implemented in an FHE setting.

Paperid: 3844, https://arxiv.org/pdf/2501.03296.pdf

Abstract:
Amidst the burgeoning landscape of database architectures, the surge in NoSQL databases has heralded a transformative era, liberating data storage from traditional relational constraints and ushering in unprecedented scalability. As organizations grapple with the escalating security threats posed by database breaches, a novel theoretical framework emerges at the nexus of chaos theory and topology: the Database in motion Chaos Encryption (DaChE) Algorithm. This paradigm-shifting approach challenges the static nature of data storage, advocating for dynamic data motion to fortify database security. By incorporating chaos theory, this innovative strategy not only enhances database defenses against evolving attack vectors but also redefines the boundaries of data protection, offering a paradigmatic shift in safeguarding critical information assets. Additionally, it enables parallel processing, facilitating on-the-fly processing and optimizing the performance of the proposed framework.

Paperid: 3845, https://arxiv.org/pdf/2501.03250.pdf

Abstract:
In the paced realms of cybersecurity and digital forensics machine learning (ML) and deep learning (DL) have emerged as game changing technologies that introduce methods to identify stop and analyze cyber risks. This review presents an overview of the ML and DL approaches used in these fields showcasing their advantages drawbacks and possibilities. It covers a range of AI techniques used in spotting intrusions in systems and classifying malware to prevent cybersecurity attacks, detect anomalies and enhance resilience. This study concludes by highlighting areas where further research is needed and suggesting ways to create transparent and scalable ML and DL solutions that are suited to the evolving landscape of cybersecurity and digital forensics.

Paperid: 3846, https://arxiv.org/pdf/2501.03249.pdf

Abstract:
As quantum computing technology continues to advance, post-quantum cryptographic methods capable of resisting quantum attacks have emerged as a critical area of focus. Given the potential vulnerability of existing homomorphic encryption methods, such as RSA, ElGamal, and Paillier, to quantum computing attacks, this study proposes a lattice-based post-quantum homomorphic encryption scheme. The approach leverages lattice cryptography to build resilience against quantum threats while enabling practical homomorphic encryption applications. This research provides mathematical proofs and computational examples, alongside a security analysis of the lattice-based post-quantum homomorphic encryption scheme. The findings are intended to serve as a reference for developers of homomorphic encryption applications, such as federated learning systems.

Paperid: 3847, https://arxiv.org/pdf/2501.03237.pdf

Abstract:
This study examines the performance and structural differences between the two primary standards for Security Credential Management Systems (SCMS) in Vehicular-to-Everything (V2X) communication: the North American IEEE standards and the European ETSI standards. It focuses on comparing their respective Public Key Infrastructure (PKI) architectures, certificate application workflows, and the formats of request/response messages used during certificate applications. The research includes a theoretical analysis of message length and security features inherent in each standard. Additionally, practical implementations of both systems are conducted to evaluate their efficiency in certificate management processes. Based on the findings, the study provides recommendations to guide future development of SCMS.

Paperid: 3848, https://arxiv.org/pdf/2501.02263.pdf

Abstract:
This paper provides a brief overview of the ongoing financial revolution, which extends beyond the emergence of cryptocurrencies as a digital medium of exchange. At its core, this revolution is driven by a paradigm shift rooted in the technological advancements of blockchain and the foundational principles of Islamic economics. Together, these elements offer a transformative framework that challenges traditional financial systems, emphasizing transparency, equity, and decentralized governance. The paper highlights the implications of this shift and its potential to reshape the global economic landscape.

Paperid: 3849, https://arxiv.org/pdf/2501.02203.pdf

Abstract:
Many recent IT companies use cloud services for deploying their products, mainly because of their convenience. As such, cloud assets have become a new attack surface, and the concept of cloud security has emerged. However, cloud security is not emphasized enough compared to on-premise security, resulting in many insecure cloud architectures. In particular, small organizations often don't have enough human resources to design a secure architecture, leaving them vulnerable to cloud security breaches. We suggest the multi-account strategy for securing the cloud architecture. This strategy cost-effectively improves security by separating assets and reducing management overheads on the cloud infrastructure. When implemented, it automatically provides access restriction within the boundary of an account and eliminates redundancies in policy management. Since access control is a critical objective for constructing secure architectures, this practical method successfully enhances security even in small companies. In this paper, we analyze the benefits of multi-accounts compared to single accounts and explain how to deploy multiple accounts effortlessly using the services provided by AWS. Then, we present possible design choices for multi-account structures with a concrete example. Finally, we illustrate two techniques for operational excellence on multi-account structures. We take an incremental approach to secure policy management with the principle of least privilege and introduce methods for auditing multiple accounts.

Paperid: 3850, https://arxiv.org/pdf/2501.01913.pdf

Abstract:
Federated learning (FL) is a decentralized machine learning technique that allows multiple entities to jointly train a model while preserving dataset privacy. However, its distributed nature has raised various security concerns, which have been addressed by increasingly sophisticated defenses. These protections utilize a range of data sources and metrics to, for example, filter out malicious model updates, ensuring that the impact of attacks is minimized or eliminated. This paper explores the feasibility of designing a generic attack method capable of installing backdoors in FL while evading a diverse array of defenses. Specifically, we focus on an attacker strategy called MIGO, which aims to produce model updates that subtly blend with legitimate ones. The resulting effect is a gradual integration of a backdoor into the global model, often ensuring its persistence long after the attack concludes, while generating enough ambiguity to hinder the effectiveness of defenses. MIGO was employed to implant three types of backdoors across five datasets and different model architectures. The results demonstrate the significant threat posed by these backdoors, as MIGO consistently achieved exceptionally high backdoor accuracy (exceeding 90%) while maintaining the utility of the main task. Moreover, MIGO exhibited strong evasion capabilities against ten defenses, including several state-of-the-art methods. When compared to four other attack strategies, MIGO consistently outperformed them across most configurations. Notably, even in extreme scenarios where the attacker controls just 0.1% of the clients, the results indicate that successful backdoor insertion is possible if the attacker can persist for a sufficient number of rounds.

Paperid: 3851, https://arxiv.org/pdf/2501.01435.pdf

Abstract:
General Purpose AI - such as Large Language Models (LLMs) - have seen rapid deployment in a wide range of use cases. Most surprisingly, they have have made their way from plain language models, to chat-bots, all the way to an almost ``operating system''-like status that can control decisions and logic of an application. Tool-use, Microsoft co-pilot/office integration, and OpenAIs Altera are just a few examples of increased autonomy, data access, and execution capabilities. These methods come with a range of cybersecurity challenges. We highlight some of the work we have done in terms of evaluation as well as outline future opportunities and challenges.

Paperid: 3852, https://arxiv.org/pdf/2501.00517.pdf

Abstract:
Currently, large models are prone to generating harmful content when faced with complex attack instructions, significantly reducing their defensive capabilities. To address this issue, this paper proposes a method based on constructing data aligned with multi-dimensional attack defense to enhance the generative security of large models. The core of our method lies in improving the effectiveness of safe alignment learning for large models by innova-tively increasing the diversity of attack instruction dimensions and the accuracy of generat-ing safe responses. To validate the effectiveness of our method, beyond existing security evaluation benchmarks, we additionally designed new security evaluation benchmarks and conducted comparative experiments using Llama3.2 as the baseline model. The final ex-perimental results demonstrate that our method can significantly improve the generative security of large models under complex instructional attacks, while also maintaining and enhancing the models' general capabilities.

Paperid: 3853, https://arxiv.org/pdf/2501.00260.pdf

Abstract:
Phishing is an online identity theft technique where attackers steal users personal information, leading to financial losses for individuals and organizations. With the increasing adoption of smartphones, which provide functionalities similar to desktop computers, attackers are targeting mobile users. Smishing, a phishing attack carried out through Short Messaging Service (SMS), has become prevalent due to the widespread use of SMS-based services. It involves deceptive messages designed to extract sensitive information. Despite the growing number of smishing attacks, limited research focuses on detecting these threats. This work presents a smishing detection model using a content-based analysis approach. To address the challenge posed by slang, abbreviations, and short forms in text communication, the model normalizes these into standard forms. A machine learning classifier is employed to classify messages as smishing or ham. Experimental results demonstrate the model effectiveness, achieving classification accuracies of 97.14% for smishing and 96.12% for ham messages, with an overall accuracy of 96.20%.

Paperid: 3854, https://arxiv.org/pdf/2501.00214.pdf

Abstract:
In this work, we propose an error-free, information-theoretically secure, asynchronous multi-valued validated Byzantine agreement (MVBA) protocol, called OciorMVBA. This protocol achieves MVBA consensus on a message $\boldsymbol{w}$ with expected $O(n |\boldsymbol{w}|\log n + n^2 \log q)$ communication bits, expected $O(n^2)$ messages, expected $O(\log n)$ rounds, and expected $O(\log n)$ common coins, under optimal resilience $n \geq 3t + 1$ in an $n$-node network, where up to $t$ nodes may be dishonest. Here, $q$ denotes the alphabet size of the error correction code used in the protocol. When error correction codes with a constant alphabet size (e.g., Expander Codes) are used, $q$ becomes a constant. An MVBA protocol that guarantees all required properties without relying on any cryptographic assumptions, such as signatures or hashing, except for the common coin assumption, is said to be information-theoretically secure (IT secure). Under the common coin assumption, an MVBA protocol that guarantees all required properties in all executions is said to be error-free. We also propose another error-free, IT-secure, asynchronous MVBA protocol, called OciorMVBArr. This protocol achieves MVBA consensus with expected $O(n |\boldsymbol{w}| + n^2 \log n)$ communication bits, expected $O(1)$ rounds, and expected $O(1)$ common coins, under a relaxed resilience (RR) of $n \geq 5t + 1$. Additionally, we propose a hash-based asynchronous MVBA protocol, called OciorMVBAh. This protocol achieves MVBA consensus with expected $O(n |\boldsymbol{w}| + n^3)$ bits, expected $O(1)$ rounds, and expected $O(1)$ common coins, under optimal resilience $n \geq 3t + 1$.