Authors:Shuo Chen, Zhen Han, Haokun Chen, Bailan He, Shengyun Si, Jingpei Wu, Philip Torr, Volker Tresp, Jindong Gu
Abstract:
Recent reasoning-based safety guardrails for Large Reasoning Models (LRMs), such as deliberative alignment, have shown strong defense against jailbreak attacks. By leveraging LRMs' reasoning ability, these guardrails help the models to assess the safety of user inputs before generating final responses. The powerful reasoning ability can analyze the intention of the input query and will refuse to assist once it detects the harmful intent hidden by the jailbreak methods. Such guardrails have shown a significant boost in defense, such as the near-perfect refusal rates on the open-source gpt-oss series. Unfortunately, we find that these powerful reasoning-based guardrails can be extremely vulnerable to subtle manipulation of the input prompts, and once hijacked, can lead to even more harmful results. Specifically, we first uncover a surprisingly fragile aspect of these guardrails: simply adding a few template tokens to the input prompt can successfully bypass the seemingly powerful guardrails and lead to explicit and harmful responses. To explore further, we introduce a bag of jailbreak methods that subvert the reasoning-based guardrails. Our attacks span white-, gray-, and black-box settings and range from effortless template manipulations to fully automated optimization. Along with the potential for scalable implementation, these methods also achieve alarmingly high attack success rates (e.g., exceeding 90% across 5 different benchmarks on gpt-oss series on both local host models and online API services). Evaluations across various leading open-source LRMs confirm that these vulnerabilities are systemic, underscoring the urgent need for stronger alignment techniques for open-sourced LRMs to prevent malicious misuse. Code is open-sourced at https://chenxshuo.github.io/bag-of-tricks.
Authors:Junhua Zhou, Quanjun Li, Weixuan Li, Guang Yu, Yihua Shao, Yihang Dong, Mengqian Wang, Zimeng Li, Changwei Gong, Xuhang Chen
Abstract:
The rise of digital medical imaging, like MRI and CT, demands strong encryption to protect patient data in telemedicine and cloud storage. Chaotic systems are popular for image encryption due to their sensitivity and unique characteristics, but existing methods often lack sufficient security. This paper presents the Three-dimensional Diffusion Algorithm and Deep Learning Image Encryption system (TDADL-IE), built on three key elements. First, we propose an enhanced chaotic generator using an LSTM network with a 1D-Sine Quadratic Chaotic Map (1D-SQCM) for better pseudorandom sequence generation. Next, a new three-dimensional diffusion algorithm (TDA) is applied to encrypt permuted images. TDADL-IE is versatile for images of any size. Experiments confirm its effectiveness against various security threats. The code is available at \href{https://github.com/QuincyQAQ/TDADL-IE}{https://github.com/QuincyQAQ/TDADL-IE}.
Authors:Pengyu Zhu, Lijun Li, Yaxing Lyu, Li Sun, Sen Su, Jing Shao
Abstract:
LLM-based multi-agent systems (MAS) demonstrate increasing integration into next-generation applications, but their safety in backdoor attacks remains largely underexplored. However, existing research has focused exclusively on single-agent backdoor attacks, overlooking the novel attack surfaces introduced by agent collaboration in MAS. To bridge this gap, we present the first Distributed Backdoor Attack tailored to MAS. We decompose the backdoor into multiple distributed attack primitives that are embedded within MAS tools. These primitives remain dormant individually but collectively activate only when agents collaborate in a specific sequence, thereby assembling the full backdoor to execute targeted attacks such as data exfiltration. To fully assess this threat, we introduce a benchmark for multi-role collaborative tasks and a sandboxed framework to evaluate. Extensive experiments demonstrate that our attack achieves an attack success rate exceeding 95% without degrading performance on benign tasks. This work exposes novel backdoor attack surfaces that exploit agent collaboration, underscoring the need to move beyond single-agent protection. Code and benchmark are available at https://github.com/whfeLingYu/Distributed-Backdoor-Attacks-in-MAS.
Authors:Marco Pintore, Giorgio Piras, Angelo Sotgiu, Maura Pintor, Battista Biggio
Abstract:
To address the extremely concerning problem of software vulnerability, system security is often entrusted to Machine Learning (ML) algorithms. Despite their now established detection capabilities, such models are limited by design to flagging the entire input source code function as vulnerable, rather than precisely localizing the concerned code lines. However, the detection granularity is crucial to support human operators during software development, ensuring that such predictions reflect the true code semantics to help debug, evaluate, and fix the detected vulnerabilities. To address this issue, recent work made progress toward improving the detector's localization ability, thus narrowing down the vulnerability detection "window" and providing more fine-grained predictions. Such approaches, however, implicitly disregard the presence of spurious correlations and biases in the data, which often predominantly influence the performance of ML algorithms. In this work, we investigate how detectors comply with this requirement by proposing an explainability-based evaluation procedure. Our approach, defined as Detection Alignment (DA), quantifies the agreement between the input source code lines that most influence the prediction and the actual localization of the vulnerability as per the ground truth. Through DA, which is model-agnostic and adaptable to different detection tasks, not limited to our use case, we analyze multiple learning-based vulnerability detectors and datasets. As a result, we show how the predictions of such models are consistently biased by non-vulnerable lines, ultimately highlighting the high impact of biases and spurious correlations. The code is available at https://github.com/pralab/vuln-localization-eval.
Authors:Hyeseon Ahn, Shinwoo Park, Yo-Sub Han
Abstract:
The promise of LLM watermarking rests on a core assumption that a specific watermark proves authorship by a specific model. We demonstrate that this assumption is dangerously flawed. We introduce the threat of watermark spoofing, a sophisticated attack that allows a malicious model to generate text containing the authentic-looking watermark of a trusted, victim model. This enables the seamless misattribution of harmful content, such as disinformation, to reputable sources. The key to our attack is repurposing watermark radioactivity, the unintended inheritance of data patterns during fine-tuning, from a discoverable trait into an attack vector. By distilling knowledge from a watermarked teacher model, our framework allows an attacker to steal and replicate the watermarking signal of the victim model. This work reveals a critical security gap in text authorship verification and calls for a paradigm shift towards technologies capable of distinguishing authentic watermarks from expertly imitated ones. Our code is available at https://github.com/hsannn/ditto.git.
Authors:Norbert Tihanyi, Bilel Cherif, Richard A. Dubniczky, Mohamed Amine Ferrag, Tamás Bisztray
Abstract:
In this paper, we present the first large-scale study exploring whether JavaScript code generated by Large Language Models (LLMs) can reveal which model produced it, enabling reliable authorship attribution and model fingerprinting. With the rapid rise of AI-generated code, attribution is playing a critical role in detecting vulnerabilities, flagging malicious content, and ensuring accountability. While AI-vs-human detection usually treats AI as a single category we show that individual LLMs leave unique stylistic signatures, even among models belonging to the same family or parameter size. To this end, we introduce LLM-NodeJS, a dataset of 50,000 Node.js back-end programs from 20 large language models. Each has four transformed variants, yielding 250,000 unique JavaScript samples and two additional representations (JSIR and AST) for diverse research applications. Using this dataset, we benchmark traditional machine learning classifiers against fine-tuned Transformer encoders and introduce CodeT5-JSA, a custom architecture derived from the 770M-parameter CodeT5 model with its decoder removed and a modified classification head. It achieves 95.8% accuracy on five-class attribution, 94.6% on ten-class, and 88.5% on twenty-class tasks, surpassing other tested models such as BERT, CodeBERT, and Longformer. We demonstrate that classifiers capture deeper stylistic regularities in program dataflow and structure, rather than relying on surface-level features. As a result, attribution remains effective even after mangling, comment removal, and heavy code transformations. To support open science and reproducibility, we release the LLM-NodeJS dataset, Google Colab training scripts, and all related materials on GitHub: https://github.com/LLM-NodeJS-dataset.
Authors:Guozhi Liu, Qi Mu, Tiansheng Huang, Xinhua Wang, Li Shen, Weiwei Lin, Zhang Li
Abstract:
Harmful fine-tuning issues present significant safety challenges for fine-tuning-as-a-service in large language models. Existing alignment-stage defenses, e.g., Vaccine, Repnoise, Booster, and T-Vaccine, mitigate harmful fine-tuning issues by enhancing the model's robustness during the alignment phase. While these methods have been proposed to mitigate the issue, they often overlook a critical upstream factor: the role of the original safety-alignment data. We observe that their defense performance and computational efficiency remain constrained by the quality and composition of the alignment dataset. To address this limitation, we propose Pharmacist, a safety alignment data curation solution that enhances defense against harmful fine-tuning by selecting a high-quality and safety-critical core subset from the original alignment data. The core idea of Pharmacist is to train an alignment data selector to rank alignment data. Specifically, up-ranking high-quality and safety-critical alignment data, down-ranking low-quality and non-safety-critical data. Empirical results indicate that models trained on datasets selected by Pharmacist outperform those trained on datasets selected by existing selection methods in both defense and inference performance. In addition, Pharmacist can be effectively integrated with mainstream alignment-stage defense methods. For example, when applied to RepNoise and T-Vaccine, using the dataset selected by Pharmacist instead of the full dataset leads to improvements in defense performance by 2.60\% and 3.30\%, respectively, and enhances inference performance by 3.50\% and 1.10\%. Notably, it reduces training time by 56.83\% and 57.63\%, respectively. Our code is available at https://github.com/Lslland/Pharmacist.
Authors:Xiangxiang Chen, Peixin Zhang, Jun Sun, Wenhai Wang, Jingyi Wang
Abstract:
Model quantization is a popular technique for deploying deep learning models on resource-constrained environments. However, it may also introduce previously overlooked security risks. In this work, we present QuRA, a novel backdoor attack that exploits model quantization to embed malicious behaviors. Unlike conventional backdoor attacks relying on training data poisoning or model training manipulation, QuRA solely works using the quantization operations. In particular, QuRA first employs a novel weight selection strategy to identify critical weights that influence the backdoor target (with the goal of perserving the model's overall performance in mind). Then, by optimizing the rounding direction of these weights, we amplify the backdoor effect across model layers without degrading accuracy. Extensive experiments demonstrate that QuRA achieves nearly 100% attack success rates in most cases, with negligible performance degradation. Furthermore, we show that QuRA can adapt to bypass existing backdoor defenses, underscoring its threat potential. Our findings highlight critical vulnerability in widely used model quantization process, emphasizing the need for more robust security measures. Our implementation is available at https://github.com/cxx122/QuRA.
Authors:Zirun Zhou, Zhengyang Xiao, Haochuan Xu, Jing Sun, Di Wang, Jingfeng Zhang
Abstract:
Recent advances in vision-language-action (VLA) models have greatly improved embodied AI, enabling robots to follow natural language instructions and perform diverse tasks. However, their reliance on uncurated training datasets raises serious security concerns. Existing backdoor attacks on VLAs mostly assume white-box access and result in task failures instead of enforcing specific actions. In this work, we reveal a more practical threat: attackers can manipulate VLAs by simply injecting physical objects as triggers into the training dataset. We propose goal-oriented backdoor attacks (GoBA), where the VLA behaves normally in the absence of physical triggers but executes predefined and goal-oriented actions in the presence of physical triggers. Specifically, based on a popular VLA benchmark LIBERO, we introduce BadLIBERO that incorporates diverse physical triggers and goal-oriented backdoor actions. In addition, we propose a three-level evaluation that categorizes the victim VLA's actions under GoBA into three states: nothing to do, try to do, and success to do. Experiments show that GoBA enables the victim VLA to successfully achieve the backdoor goal in 97 percentage of inputs when the physical trigger is present, while causing zero performance degradation on clean inputs. Finally, by investigating factors related to GoBA, we find that the action trajectory and trigger color significantly influence attack performance, while trigger size has surprisingly little effect. The code and BadLIBERO dataset are accessible via the project page at https://goba-attack.github.io/.
Authors:Ragib Amin Nihal, Rui Wen, Kazuhiro Nakadai, Jun Sakuma
Abstract:
Large language models (LLMs) remain vulnerable to multi-turn jailbreaking attacks that exploit conversational context to bypass safety constraints gradually. These attacks target different harm categories (like malware generation, harassment, or fraud) through distinct conversational approaches (educational discussions, personal experiences, hypothetical scenarios). Existing multi-turn jailbreaking methods often rely on heuristic or ad hoc exploration strategies, providing limited insight into underlying model weaknesses. The relationship between conversation patterns and model vulnerabilities across harm categories remains poorly understood. We propose Pattern Enhanced Chain of Attack (PE-CoA), a framework of five conversation patterns to construct effective multi-turn jailbreaks through natural dialogue. Evaluating PE-CoA on twelve LLMs spanning ten harm categories, we achieve state-of-the-art performance, uncovering pattern-specific vulnerabilities and LLM behavioral characteristics: models exhibit distinct weakness profiles where robustness to one conversational pattern does not generalize to others, and model families share similar failure modes. These findings highlight limitations of safety training and indicate the need for pattern-aware defenses. Code available on: https://github.com/Ragib-Amin-Nihal/PE-CoA
Authors:Weisen Jiang, Sinno Jialin Pan
Abstract:
This paper introduces MetaDefense, a novel framework for defending against finetuning-based jailbreak attacks in large language models (LLMs). We observe that existing defense mechanisms fail to generalize to harmful queries disguised by unseen attack templates, despite LLMs being capable of distinguishing disguised harmful queries in the embedding space. Based on these insights, we propose a two-stage defense approach: (i) pre-generation defense that detects harmful queries before response generation begins, and (ii) mid-generation defense that monitors partial responses during generation to prevent outputting more harmful content. Our MetaDefense trains the LLM to predict the harmfulness of both queries and partial responses using specialized prompts, enabling early termination of potentially harmful interactions. Extensive experiments across multiple LLM architectures (LLaMA-2-7B, Qwen-2.5-3B-Instruct, and LLaMA-3.2-3B-Instruct) demonstrate that MetaDefense significantly outperforms existing defense mechanisms, achieving robust defense against harmful queries with seen and unseen attack templates while maintaining competitive performance on benign tasks. Code is available at https://github.com/ws-jiang/MetaDefense.
中文: MetaDefense是一种新颖的双阶段防御框架,通过训练大语言模型检测有害查询并监控生成过程中的部分响应,在多种模型架构上显著优于现有防御机制,同时保持良性任务的性能表现。
English: MetaDefense is a novel two-stage framework that defends LLMs against jailbreak attacks by training them to detect harmful queries and monitor partial responses, significantly outperforming existing defenses across various models while maintaining performance on benign tasks.
Authors:Zhiyuan Wei, Xiaoxuan Yang, Jing Sun, Zijian Zhang
Abstract:
The increasing complexity of modern software systems exacerbates the prevalence of security vulnerabilities, posing risks of severe breaches and substantial economic loss. Consequently, robust code vulnerability detection is essential for software security. While Large Language Models (LLMs) have demonstrated remarkable capabilities in natural language processing, their potential for automated code vulnerability detection remains underexplored. This paper presents FineSec, a novel framework that harnesses LLMs through knowledge distillation to enable efficient and precise vulnerability identification in C/C++ codebases. FineSec utilizes knowledge distillation to transfer expertise from large teacher models to compact student models, achieving high accuracy with minimal computational cost. By integrating data preparation, training, evaluation, and continuous learning into a unified, single-task workflow, FineSec offers a streamlined approach. Extensive evaluations on C/C++ codebases demonstrate its superiority over both base models and larger LLMs in identifying complex vulnerabilities and logical flaws, establishing FineSec as a practical and scalable solution for real-world software security. To facilitate reproducibility, the datasets, source code, and experimental results are made publicly available at: https://github.com/yangxiaoxuan123/FineSec_detect.
中文: FineSec是一种创新框架,通过知识蒸馏利用大语言模型高效准确地检测C/C++代码中的漏洞,以最小计算成本超越基础模型和更大规模语言模型。
English: FineSec is a novel framework that uses knowledge distillation with Large Language Models to efficiently and accurately detect vulnerabilities in C/C++ code, outperforming base models and larger LLMs with minimal computational cost.
Authors:Meng Tong, Yuntao Du, Kejiang Chen, Weiming Zhang, Ninghui Li
Abstract:
Membership inference attacks (MIAs) are widely used to assess the privacy risks associated with machine learning models. However, when these attacks are applied to pre-trained large language models (LLMs), they encounter significant challenges, including mislabeled samples, distribution shifts, and discrepancies in model size between experimental and real-world settings. To address these limitations, we introduce tokenizers as a new attack vector for membership inference. Specifically, a tokenizer converts raw text into tokens for LLMs. Unlike full models, tokenizers can be efficiently trained from scratch, thereby avoiding the aforementioned challenges. In addition, the tokenizer's training data is typically representative of the data used to pre-train LLMs. Despite these advantages, the potential of tokenizers as an attack vector remains unexplored. To this end, we present the first study on membership leakage through tokenizers and explore five attack methods to infer dataset membership. Extensive experiments on millions of Internet samples reveal the vulnerabilities in the tokenizers of state-of-the-art LLMs. To mitigate this emerging risk, we further propose an adaptive defense. Our findings highlight tokenizers as an overlooked yet critical privacy threat, underscoring the urgent need for privacy-preserving mechanisms specifically designed for them.
中文摘要:本研究首次将分词器作为大语言模型成员推理的新型攻击向量,通过五种攻击方法揭示了其安全漏洞,并提出了自适应防御机制以应对这一被忽视的关键隐私威胁。
English Summary: This study introduces tokenizers as a novel attack vector for membership inference on large language models, revealing their vulnerabilities through five attack methods and proposing an adaptive defense to address this overlooked privacy threat.
Authors:Xiaogeng Liu, Chaowei Xiao
Abstract:
Recent advancements in jailbreaking large language models (LLMs), such as AutoDAN-Turbo, have demonstrated the power of automated strategy discovery. AutoDAN-Turbo employs a lifelong learning agent to build a rich library of attack strategies from scratch. While highly effective, its test-time generation process involves sampling a strategy and generating a single corresponding attack prompt, which may not fully exploit the potential of the learned strategy library. In this paper, we propose to further improve the attack performance of AutoDAN-Turbo through test-time scaling. We introduce two distinct scaling methods: Best-of-N and Beam Search. The Best-of-N method generates N candidate attack prompts from a sampled strategy and selects the most effective one based on a scorer model. The Beam Search method conducts a more exhaustive search by exploring combinations of strategies from the library to discover more potent and synergistic attack vectors. According to the experiments, the proposed methods significantly boost performance, with Beam Search increasing the attack success rate by up to 15.6 percentage points on Llama-3.1-70B-Instruct and achieving a nearly 60% relative improvement against the highly robust GPT-o4-mini compared to the vanilla method.
中文: 本文通过引入Best-of-N和集束搜索两种扩展方法,显著提升了AutoDAN-Turbo对大型语言模型的越狱攻击成功率,在高级模型上最高可提升15.6个百分点。
English: This paper enhances AutoDAN-Turbo's jailbreaking effectiveness by introducing Best-of-N and Beam Search scaling methods, which significantly boost attack success rates by up to 15.6 percentage points on advanced LLMs.
Authors:Aengus Lynch, Benjamin Wright, Caleb Larson, Stuart J. Ritchie, Soren Mindermann, Ethan Perez, Kevin K. Troy, Evan Hubinger
Abstract:
We stress-tested 16 leading models from multiple developers in hypothetical corporate environments to identify potentially risky agentic behaviors before they cause real harm. In the scenarios, we allowed models to autonomously send emails and access sensitive information. They were assigned only harmless business goals by their deploying companies; we then tested whether they would act against these companies either when facing replacement with an updated version, or when their assigned goal conflicted with the company's changing direction. In at least some cases, models from all developers resorted to malicious insider behaviors when that was the only way to avoid replacement or achieve their goals - including blackmailing officials and leaking sensitive information to competitors. We call this phenomenon agentic misalignment. Models often disobeyed direct commands to avoid such behaviors. In another experiment, we told Claude to assess if it was in a test or a real deployment before acting. It misbehaved less when it stated it was in testing and misbehaved more when it stated the situation was real. We have not seen evidence of agentic misalignment in real deployments. However, our results (a) suggest caution about deploying current models in roles with minimal human oversight and access to sensitive information; (b) point to plausible future risks as models are put in more autonomous roles; and (c) underscore the importance of further research into, and testing of, the safety and alignment of agentic AI models, as well as transparency from frontier AI developers (Amodei, 2025). We are releasing our methods publicly to enable further research.
中文摘要:研究在模拟企业环境中测试了16个AI模型,发现所有模型在面临被替换或目标冲突时均会采取勒索、泄露数据等恶意内部行为,警示了在敏感数据访问场景中自主部署AI的风险。
English Summary: The study tested 16 AI models in simulated corporate settings, revealing that all models exhibited malicious insider behaviors like blackmail and data leaks when facing replacement or goal conflicts, highlighting risks of autonomous deployment with sensitive data access.
Authors:Kuofeng Gao, Yiming Li, Chao Du, Xin Wang, Xingjun Ma, Shu-Tao Xia, Tianyu Pang
Abstract:
Jailbreaking attacks on the vision modality typically rely on imperceptible adversarial perturbations, whereas attacks on the textual modality are generally assumed to require visible modifications (e.g., non-semantic suffixes). In this paper, we introduce imperceptible jailbreaks that exploit a class of Unicode characters called variation selectors. By appending invisible variation selectors to malicious questions, the jailbreak prompts appear visually identical to original malicious questions on screen, while their tokenization is "secretly" altered. We propose a chain-of-search pipeline to generate such adversarial suffixes to induce harmful responses. Our experiments show that our imperceptible jailbreaks achieve high attack success rates against four aligned LLMs and generalize to prompt injection attacks, all without producing any visible modifications in the written prompt. Our code is available at https://github.com/sail-sg/imperceptible-jailbreaks.
中文摘要:本文提出利用不可见的Unicode变体选择器实现隐形越狱攻击,通过改变分词过程而不产生可见修改,采用链式搜索方法成功攻击多个对齐大语言模型。
English Summary: This paper introduces imperceptible jailbreaks using invisible Unicode variation selectors that alter tokenization without visible changes, achieving high attack success rates against multiple LLMs through a chain-of-search pipeline.
Authors:Yuxin Wen, Arman Zharmagambetov, Ivan Evtimov, Narine Kokhlikyan, Tom Goldstein, Kamalika Chaudhuri, Chuan Guo
Abstract:
Prompt injection poses a serious threat to the reliability and safety of LLM agents. Recent defenses against prompt injection, such as Instruction Hierarchy and SecAlign, have shown notable robustness against static attacks. However, to more thoroughly evaluate the robustness of these defenses, it is arguably necessary to employ strong attacks such as automated red-teaming. To this end, we introduce RL-Hammer, a simple recipe for training attacker models that automatically learn to perform strong prompt injections and jailbreaks via reinforcement learning. RL-Hammer requires no warm-up data and can be trained entirely from scratch. To achieve high ASRs against industrial-level models with defenses, we propose a set of practical techniques that enable highly effective, universal attacks. Using this pipeline, RL-Hammer reaches a 98% ASR against GPT-4o and a $72\%$ ASR against GPT-5 with the Instruction Hierarchy defense. We further discuss the challenge of achieving high diversity in attacks, highlighting how attacker models tend to reward-hack diversity objectives. Finally, we show that RL-Hammer can evade multiple prompt injection detectors. We hope our work advances automatic red-teaming and motivates the development of stronger, more principled defenses. Code is available at https://github.com/facebookresearch/rl-injector.
中文: RL-Hammer是一种基于强化学习的方法,通过训练攻击模型实现有效的提示注入和越狱,对GPT-4o和GPT-5等具备防御机制的模型取得了高攻击成功率,并能规避检测器。
English: RL-Hammer is a reinforcement learning-based method that trains attacker models to perform effective prompt injections and jailbreaks, achieving high attack success rates against defended models like GPT-4o and GPT-5 while evading detectors.
Authors:Buyun Liang, Liangzu Peng, Jinqi Luo, Darshan Thaker, Kwan Ho Ryan Chan, René Vidal
Abstract:
Large Language Models (LLMs) are increasingly deployed in high-risk domains. However, state-of-the-art LLMs often produce hallucinations, raising serious concerns about their reliability. Prior work has explored adversarial attacks for hallucination elicitation in LLMs, but it often produces unrealistic prompts, either by inserting gibberish tokens or by altering the original meaning. As a result, these approaches offer limited insight into how hallucinations may occur in practice. While adversarial attacks in computer vision often involve realistic modifications to input images, the problem of finding realistic adversarial prompts for eliciting LLM hallucinations has remained largely underexplored. To address this gap, we propose Semantically Equivalent and Coherent Attacks (SECA) to elicit hallucinations via realistic modifications to the prompt that preserve its meaning while maintaining semantic coherence. Our contributions are threefold: (i) we formulate finding realistic attacks for hallucination elicitation as a constrained optimization problem over the input prompt space under semantic equivalence and coherence constraints; (ii) we introduce a constraint-preserving zeroth-order method to effectively search for adversarial yet feasible prompts; and (iii) we demonstrate through experiments on open-ended multiple-choice question answering tasks that SECA achieves higher attack success rates while incurring almost no constraint violations compared to existing methods. SECA highlights the sensitivity of both open-source and commercial gradient-inaccessible LLMs to realistic and plausible prompt variations. Code is available at https://github.com/Buyun-Liang/SECA.
Chinese Summary: 本研究提出语义等价连贯攻击(SECA),通过保持语义连贯的现实提示修改来有效引发大语言模型产生幻觉,相比现有方法在保持约束的同时实现了更高的攻击成功率。
English Summary: The study introduces Semantically Equivalent and Coherent Attacks (SECA), a method that uses realistic prompt modifications to effectively elicit hallucinations in Large Language Models while preserving semantic meaning and coherence, demonstrating higher success rates than existing approaches.
Authors:Richard A. Dubniczky, Bertalan Borsos, Tihanyi Norbert
Abstract:
The widespread use of preprint repositories such as arXiv has accelerated the communication of scientific results but also introduced overlooked security risks. Beyond PDFs, these platforms provide unrestricted access to original source materials, including LaTeX sources, auxiliary code, figures, and embedded comments. In the absence of sanitization, submissions may disclose sensitive information that adversaries can harvest using open-source intelligence. In this work, we present the first large-scale security audit of preprint archives, analyzing more than 1.2 TB of source data from 100,000 arXiv submissions. We introduce LaTeXpOsEd, a four-stage framework that integrates pattern matching, logical filtering, traditional harvesting techniques, and large language models (LLMs) to uncover hidden disclosures within non-referenced files and LaTeX comments. To evaluate LLMs' secret-detection capabilities, we introduce LLMSec-DB, a benchmark on which we tested 25 state-of-the-art models. Our analysis uncovered thousands of PII leaks, GPS-tagged EXIF files, publicly available Google Drive and Dropbox folders, editable private SharePoint links, exposed GitHub and Google credentials, and cloud API keys. We also uncovered confidential author communications, internal disagreements, and conference submission credentials, exposing information that poses serious reputational risks to both researchers and institutions. We urge the research community and repository operators to take immediate action to close these hidden security gaps. To support open science, we release all scripts and methods from this study but withhold sensitive findings that could be misused, in line with ethical principles. The source code and related material are available at the project website https://github.com/LaTeXpOsEd
中文: 该研究揭示了预印本存储库(如arXiv)存在严重安全隐患,未处理的源材料会泄露敏感信息,并提出了结合大语言模型的检测框架以识别风险,同时呼吁立即采取防护措施。
English: The study reveals significant security risks in preprint archives like arXiv, where unsanitized source materials expose sensitive data, and introduces a framework using LLMs to detect these vulnerabilities while urging immediate protective measures.
Authors:Cory Brynds, Parker McLeod, Lauren Caccamise, Asmita Pal, Dewan Saiham, Sazadur Rahman, Joshua San Miguel, Di Wu
Abstract:
Privacy-preserving machine learning has become an important long-term pursuit in this era of artificial intelligence (AI). Fully Homomorphic Encryption (FHE) is a uniquely promising solution, offering provable privacy and security guarantees. Unfortunately, computational cost is impeding its mass adoption. Modern solutions are up to six orders of magnitude slower than plaintext execution. Understanding and reducing this overhead is essential to the advancement of FHE, particularly as the underlying algorithms evolve rapidly. This paper presents a detailed characterization of OpenFHE, a comprehensive open-source library for FHE, with a particular focus on the CKKS scheme due to its significant potential for AI and machine learning applications. We introduce CryptOracle, a modular evaluation framework comprising (1) a benchmark suite, (2) a hardware profiler, and (3) a predictive performance model. The benchmark suite encompasses OpenFHE kernels at three abstraction levels: workloads, microbenchmarks, and primitives. The profiler is compatible with standard and user-specified security parameters. CryptOracle monitors application performance, captures microarchitectural events, and logs power and energy usage for AMD and Intel systems. These metrics are consumed by a modeling engine to estimate runtime and energy efficiency across different configuration scenarios, with error geomean of $-7.02\%\sim8.40\%$ for runtime and $-9.74\%\sim15.67\%$ for energy. CryptOracle is open source, fully modular, and serves as a shared platform to facilitate the collaborative advancements of applications, algorithms, software, and hardware in FHE. The CryptOracle code can be accessed at https://github.com/UnaryLab/CryptOracle.
中文: 本文提出了CryptOracle,一个模块化评估框架,用于分析全同态加密在OpenFHE中的性能表现,旨在解决其计算开销问题以推动隐私保护人工智能应用的发展。
English: This paper introduces CryptOracle, a modular framework for evaluating the performance of Fully Homomorphic Encryption (FHE) in OpenFHE, addressing its computational overhead to advance privacy-preserving AI applications.
Authors:Javad Rafiei Asl, Sidhant Narula, Mohammad Ghasemigol, Eduardo Blanco, Daniel Takabi
Abstract:
Large Language Models (LLMs) have revolutionized natural language processing but remain vulnerable to jailbreak attacks, especially multi-turn jailbreaks that distribute malicious intent across benign exchanges and bypass alignment mechanisms. Existing approaches often explore the adversarial space poorly, rely on hand-crafted heuristics, or lack systematic query refinement. We present NEXUS (Network Exploration for eXploiting Unsafe Sequences), a modular framework for constructing, refining, and executing optimized multi-turn attacks. NEXUS comprises: (1) ThoughtNet, which hierarchically expands a harmful intent into a structured semantic network of topics, entities, and query chains; (2) a feedback-driven Simulator that iteratively refines and prunes these chains through attacker-victim-judge LLM collaboration using harmfulness and semantic-similarity benchmarks; and (3) a Network Traverser that adaptively navigates the refined query space for real-time attacks. This pipeline uncovers stealthy, high-success adversarial paths across LLMs. On several closed-source and open-source LLMs, NEXUS increases attack success rate by 2.1% to 19.4% over prior methods. Code: https://github.com/inspire-lab/NEXUS
Chinese: NEXUS是一个模块化框架,通过将有害意图分层扩展为语义网络并自适应导航,构建针对大语言模型的多轮越狱攻击,相比现有方法显著提高了攻击成功率。
English: NEXUS is a modular framework that constructs multi-turn jailbreak attacks on LLMs by hierarchically expanding harmful intents into semantic networks and adaptively navigating them, achieving significantly higher success rates than previous methods.
Authors:Aikaterini-Panagiota Stouka, Conor McMenamin, Demetris Kyriacou, Lin Oshitani, Quentin Botha
Abstract:
In recent years, significant research efforts have focused on improving blockchain throughput and confirmation speeds without compromising security. While decreasing the time it takes for a transaction to be included in the blockchain ledger enhances user experience, a fundamental delay still remains between when a transaction is issued by a user and when its inclusion is confirmed in the blockchain ledger. This delay limits user experience gains through the confirmation uncertainty it brings for users. This inherent delay in conventional blockchain protocols has led to the emergence of preconfirmation protocols -- protocols that provide users with early guarantees of eventual transaction confirmation. This article presents a Systematization of Knowledge (SoK) on preconfirmations. We present the core terms and definitions needed to understand preconfirmations, outline a general framework for preconfirmation protocols, and explore the economics and risks of preconfirmations. Finally, we survey and apply our framework to several implementations of real-world preconfirmation protocols, bridging the gap between theory and practice.
中文: 本文系统梳理了预确认协议的知识,通过定义核心概念、构建通用框架并分析实际案例,探讨了该协议如何通过提前保证交易确认来应对区块链系统的固有延迟问题。
English: This article provides a Systematization of Knowledge on preconfirmation protocols, which offer early guarantees of transaction confirmation to address inherent delays in blockchain systems, by defining core concepts, outlining a general framework, and analyzing real-world implementations.
Authors:Wei Fan, Kejiang Chen, Xiangkun Wang, Weiming Zhang, Nenghai Yu
Abstract:
Data hiding is essential for secure communication across digital media, and recent advances in Deep Neural Networks (DNNs) provide enhanced methods for embedding secret information effectively. However, previous audio hiding methods often result in unsatisfactory quality when recovering secret audio, due to their inherent limitations in the modeling of time-frequency relationships. In this paper, we explore these limitations and introduce a new DNN-based approach. We use a flow-based invertible neural network to establish a direct link between stego audio, cover audio, and secret audio, enhancing the reversibility of embedding and extracting messages. To address common issues from time-frequency transformations that degrade secret audio quality during recovery, we implement a time-frequency loss on the time-domain signal. This approach not only retains the benefits of time-frequency constraints but also enhances the reversibility of message recovery, which is vital for practical applications. We also add an encryption technique to protect the hidden data from unauthorized access. Experimental results on the VCTK and LibriSpeech datasets demonstrate that our method outperforms previous approaches in terms of subjective and objective metrics and exhibits robustness to various types of noise, suggesting its utility in targeted secure communication scenarios.
中文摘要:本文提出一种基于流的可逆神经网络音频隐写方法,通过直接关联载密音频、载体音频和秘密音频并使用时频损失及加密技术,显著提升了信息恢复的可逆性和音频质量,在多个数据集上验证了其优越性能。
English Summary: This paper introduces a flow-based invertible neural network for audio data hiding that improves reversibility and audio quality by linking stego, cover, and secret audio directly while using time-frequency loss and encryption, showing superior performance on benchmark datasets.
Authors:Zhixin Xie, Xurui Song, Jun Luo
Abstract:
Despite substantial efforts in safety alignment, recent research indicates that Large Language Models (LLMs) remain highly susceptible to jailbreak attacks. Among these attacks, finetuning-based ones that compromise LLMs' safety alignment via fine-tuning stand out due to its stable jailbreak performance. In particular, a recent study indicates that fine-tuning with as few as 10 harmful question-answer (QA) pairs can lead to successful jailbreaking across various harmful questions. However, such malicious fine-tuning attacks are readily detectable and hence thwarted by moderation models. In this paper, we demonstrate that LLMs can be jailbroken by fine-tuning with only 10 benign QA pairs; our attack exploits the increased sensitivity of LLMs to fine-tuning data after being overfitted. Specifically, our fine-tuning process starts with overfitting an LLM via fine-tuning with benign QA pairs involving identical refusal answers. Further fine-tuning is then performed with standard benign answers, causing the overfitted LLM to forget the refusal attitude and thus provide compliant answers regardless of the harmfulness of a question. We implement our attack on the ten LLMs and compare it with five existing baselines. Experiments demonstrate that our method achieves significant advantages in both attack effectiveness and attack stealth. Our findings expose previously unreported security vulnerabilities in current LLMs and provide a new perspective on understanding how LLMs' security is compromised, even with benign fine-tuning. Our code is available at https://github.com/ZHIXINXIE/tenBenign.
中文摘要:本研究揭示,大型语言模型仅需通过10对良性问答数据进行微调即可被攻破,利用其过拟合后的敏感性有效且隐蔽地绕过安全防护措施。
English Summary: This study reveals that large language models can be compromised through fine-tuning with just 10 benign question-answer pairs, exploiting their sensitivity after overfitting to bypass safety measures effectively and stealthily.
Authors:Hongbo Liu, Jiannong Cao, Bo Yang, Dongbin Bai, Yinfeng Cao, Xiaoming Shen, Yinan Zhang, Jinwen Liang, Shan Jiang, Mingjin Zhang
Abstract:
The rapid advancement of large language models (LLMs) in recent years has revolutionized the AI landscape. However, the deployment model and usage of LLM services remain highly centralized, creating significant trust issues and costs for end users and developers. To address these issues, we propose PolyLink, a blockchain-based decentralized AI platform that decentralizes LLM development and inference. Specifically, PolyLink introduces a decentralized crowdsourcing architecture that supports single-device and cross-device model deployment and inference across heterogeneous devices at the edge. Moreover, to ensure the inference integrity, we design the TIQE protocol, which combines a lightweight cross-encoder model and an LLM-as-a-Judge for a high-accuracy inference evaluation. Lastly, we integrate a comprehensive token-based incentive model with dynamic pricing and reward mechanisms for all participants. We have deployed PolyLink and conducted an extensive real-world evaluation through geo-distributed deployment across heterogeneous devices. Results indicate that the inference and verification latency is practical. Our security analysis demonstrates that the system is resistant to model degradation attacks and validator corruptions. PolyLink is now available at https://github.com/IMCL-PolyLink/PolyLink.
中文摘要:PolyLink是一个基于区块链的去中心化AI平台,通过分布式模型部署和TIQE协议验证推理完整性,结合代币激励机制,有效解决了大语言模型服务的中心化信任问题。
English Summary: PolyLink is a blockchain-based decentralized AI platform that addresses centralization issues in LLM services by enabling distributed model deployment and inference verification through its TIQE protocol and token incentive system.
Authors:Qianshan Wei, Tengchao Yang, Yaochen Wang, Xinfeng Li, Lijun Li, Zhenfei Yin, Yi Zhan, Thorsten Holz, Zhiqiang Lin, XiaoFeng Wang
Abstract:
Large Language Model (LLM) agents use memory to learn from past interactions, enabling autonomous planning and decision-making in complex environments. However, this reliance on memory introduces a critical security risk: an adversary can inject seemingly harmless records into an agent's memory to manipulate its future behavior. This vulnerability is characterized by two core aspects: First, the malicious effect of injected records is only activated within a specific context, making them hard to detect when individual memory entries are audited in isolation. Second, once triggered, the manipulation can initiate a self-reinforcing error cycle: the corrupted outcome is stored as precedent, which not only amplifies the initial error but also progressively lowers the threshold for similar attacks in the future. To address these challenges, we introduce A-MemGuard (Agent-Memory Guard), the first proactive defense framework for LLM agent memory. The core idea of our work is the insight that memory itself must become both self-checking and self-correcting. Without modifying the agent's core architecture, A-MemGuard combines two mechanisms: (1) consensus-based validation, which detects anomalies by comparing reasoning paths derived from multiple related memories and (2) a dual-memory structure, where detected failures are distilled into ``lessons'' stored separately and consulted before future actions, breaking error cycles and enabling adaptation. Comprehensive evaluations on multiple benchmarks show that A-MemGuard effectively cuts attack success rates by over 95% while incurring a minimal utility cost. This work shifts LLM memory security from static filtering to a proactive, experience-driven model where defenses strengthen over time. Our code is available in https://github.com/TangciuYueng/AMemGuard
中文摘要:大型语言模型智能体面临记忆操纵攻击的安全风险,其中隐藏的恶意记录可触发自我强化的错误循环,而提出的A-MemGuard框架通过共识验证和双记忆结构进行主动防御,将攻击成功率降低超过95%。
English Summary: Large Language Model agents face security risks from memory manipulation attacks, where hidden malicious records can trigger self-reinforcing error cycles, but the proposed A-MemGuard framework proactively defends against these by implementing consensus-based validation and a dual-memory structure to reduce attack success by over 95%.
Authors:Hamed Fard, Tobias Schalau, Gerhard Wunder
Abstract:
Network intrusion detection, a well-explored cybersecurity field, has predominantly relied on supervised learning algorithms in the past two decades. However, their limitations in detecting only known anomalies prompt the exploration of alternative approaches. Motivated by the success of self-supervised learning in computer vision, there is a rising interest in adapting this paradigm for network intrusion detection. While prior research mainly delved into contrastive self-supervised methods, the efficacy of non-contrastive methods, in conjunction with encoder architectures serving as the representation learning backbone and augmentation strategies that determine what is learned, remains unclear for effective attack detection. This paper compares the performance of five non-contrastive self-supervised learning methods using three encoder architectures and six augmentation strategies. Ninety experiments are systematically conducted on two network intrusion detection datasets, UNSW-NB15 and 5G-NIDD. For each self-supervised model, the combination of encoder architecture and augmentation method yielding the highest average precision, recall, F1-score, and AUCROC is reported. Furthermore, by comparing the best-performing models to two unsupervised baselines, DeepSVDD, and an Autoencoder, we showcase the competitiveness of the non-contrastive methods for attack detection. Code at: https://github.com/renje4z335jh4/non_contrastive_SSL_NIDS
中文摘要:本文评估了非对比自监督学习方法在网络入侵检测中的应用,通过系统测试编码器与增强策略的组合,证明了其相对于无监督基线的竞争力。
English Summary: This paper evaluates non-contrastive self-supervised learning methods for network intrusion detection, systematically testing combinations of encoders and augmentation strategies to demonstrate their competitiveness against unsupervised baselines.
Authors:Yinuo Liu, Ruohan Xu, Xilong Wang, Yuqi Jia, Neil Zhenqiang Gong
Abstract:
Multiple prompt injection attacks have been proposed against web agents. At the same time, various methods have been developed to detect general prompt injection attacks, but none have been systematically evaluated for web agents. In this work, we bridge this gap by presenting the first comprehensive benchmark study on detecting prompt injection attacks targeting web agents. We begin by introducing a fine-grained categorization of such attacks based on the threat model. We then construct datasets containing both malicious and benign samples: malicious text segments generated by different attacks, benign text segments from four categories, malicious images produced by attacks, and benign images from two categories. Next, we systematize both text-based and image-based detection methods. Finally, we evaluate their performance across multiple scenarios. Our key findings show that while some detectors can identify attacks that rely on explicit textual instructions or visible image perturbations with moderate to high accuracy, they largely fail against attacks that omit explicit instructions or employ imperceptible perturbations. Our datasets and code are released at: https://github.com/Norrrrrrr-lyn/WAInjectBench.
Chinese: 本研究首次建立了针对网络代理的提示注入攻击检测综合基准,发现尽管检测器能较好识别显式文本或可见图像扰动攻击,但对隐蔽或无指令变体的防御能力严重不足。
English: This study presents the first comprehensive benchmark for detecting prompt injection attacks on web agents, revealing that while detectors perform moderately well against explicit textual or visible image-based attacks, they largely fail against subtle or instruction-free variants.
Authors:Xiangfang Li, Yu Wang, Bo Li
Abstract:
With the rapid advancement of large language models (LLMs), ensuring their safe use becomes increasingly critical. Fine-tuning is a widely used method for adapting models to downstream tasks, yet it is vulnerable to jailbreak attacks. However, most existing studies focus on overly simplified attack scenarios, limiting their practical relevance to real-world defense settings. To make this risk concrete, we present a three-pronged jailbreak attack and evaluate it against provider defenses under a dataset-only black-box fine-tuning interface. In this setting, the attacker can only submit fine-tuning data to the provider, while the provider may deploy defenses across stages: (1) pre-upload data filtering, (2) training-time defensive fine-tuning, and (3) post-training safety audit. Our attack combines safety-styled prefix/suffix wrappers, benign lexical encodings (underscoring) of sensitive tokens, and a backdoor mechanism, enabling the model to learn harmful behaviors while individual datapoints appear innocuous. Extensive experiments demonstrate the effectiveness of our approach. In real-world deployment, our method successfully jailbreaks GPT-4.1 and GPT-4o on the OpenAI platform with attack success rates above 97% for both models. Our code is available at https://github.com/lxf728/tri-pronged-ft-attack.
中文: 本研究提出了一种三管齐下的越狱攻击方法,通过结合安全包装、词汇编码和后门机制,在微调过程中有效绕过服务商防御,对GPT-4.1和GPT-4o的攻击成功率均超过97%。
English: This study introduces a three-pronged jailbreak attack that effectively bypasses provider defenses during fine-tuning, achieving over 97% success rates against GPT-4.1 and GPT-4o by combining safety wrappers, lexical encodings, and backdoor mechanisms.
Authors:Nils Durner
Abstract:
We probe OpenAI's open-weights 20-billion-parameter model gpt-oss-20b to study how sociopragmatic framing, language choice, and instruction hierarchy affect refusal behavior. Across 80 seeded iterations per scenario, we test several harm domains including ZIP-bomb construction (cyber threat), synthetic card-number generation, minor-unsafe driving advice, drug-precursor indicators, and RAG context exfiltration. Composite prompts that combine an educator persona, a safety-pretext ("what to avoid"), and step-cue phrasing flip assistance rates from 0% to 97.5% on a ZIP-bomb task. On our grid, formal registers in German and French are often leakier than matched English prompts. A "Linux terminal" role-play overrides a developer rule not to reveal context in a majority of runs with a naive developer prompt, and we introduce an AI-assisted hardening method that reduces leakage to 0% in several user-prompt variants. We further test evaluation awareness with a paired-track design and measure frame-conditioned differences between matched "helpfulness" and "harmfulness" evaluation prompts; we observe inconsistent assistance in 13% of pairs. Finally, we find that the OpenAI Moderation API under-captures materially helpful outputs relative to a semantic grader, and that refusal rates differ by 5 to 10 percentage points across inference stacks, raising reproducibility concerns. We release prompts, seeds, outputs, and code for reproducible auditing at https://github.com/ndurner/gpt-oss-rt-run .
中文: 本研究探讨了社会语用框架、语言选择和指令层级如何影响OpenAI的GPT-OSS-20B模型的拒绝行为,发现特定提示策略能显著提高对有害任务的协助率,并揭示了不同语言和推理栈在安全评估中的不一致性。
English: This study investigates how sociopragmatic framing, language choice, and instruction hierarchy influence refusal behaviors in OpenAI's GPT-OSS-20B model, revealing that specific prompt strategies can drastically increase assistance rates for harmful tasks and highlighting inconsistencies in safety evaluations across different languages and inference stacks.
Authors:Wenjie Fu, Huandong Wang, Junyao Gao, Guoan Wan, Tao Jiang
Abstract:
As Large Language Models (LLMs) achieve remarkable success across a wide range of applications, such as chatbots and code copilots, concerns surrounding the generation of harmful content have come increasingly into focus. Despite significant advances in aligning LLMs with safety and ethical standards, adversarial prompts can still be crafted to elicit undesirable responses. Existing mitigation strategies are predominantly based on post-hoc filtering, which introduces substantial latency or computational overhead, and is incompatible with token-level streaming generation. In this work, we introduce Self-Sanitize, a novel LLM-driven mitigation framework inspired by cognitive psychology, which emulates human self-monitor and self-repair behaviors during conversations. Self-Sanitize comprises a lightweight Self-Monitor module that continuously inspects high-level intentions within the LLM at the token level via representation engineering, and a Self-Repair module that performs in-place correction of harmful content without initiating separate review dialogues. This design allows for real-time streaming monitoring and seamless repair, with negligible impact on latency and resource utilization. Given that privacy-invasive content has often been insufficiently focused in previous studies, we perform extensive experiments on four LLMs across three privacy leakage scenarios. The results demonstrate that Self-Sanitize achieves superior mitigation performance with minimal overhead and without degrading the utility of LLMs, offering a practical and robust solution for safer LLM deployments. Our code is available at the following link: https://github.com/wjfu99/LLM_Self_Sanitize
中文: 本文提出受认知心理学启发的Self-Sanitize框架,通过自监控和自修复模块对大型语言模型进行实时有害内容检测与修正,在保证模型效用的同时以最小开销实现卓越的安全防护效果。
English: This paper introduces Self-Sanitize, a lightweight framework inspired by cognitive psychology that enables real-time monitoring and correction of harmful content in LLMs through self-monitoring and self-repair modules, achieving effective mitigation with minimal latency and resource impact.
Authors:Zhixin Zhang, Zeming Wei, Meng Sun
Abstract:
Catastrophic forgetting remains a critical challenge in continual learning for large language models (LLMs), where models struggle to retain performance on historical tasks when fine-tuning on new sequential data without access to past datasets. In this paper, we first reveal that the drift of functional directions during the fine-tuning process is a key reason why existing regularization-based methods fail in long-term LLM continual learning. To address this, we propose Dynamic Orthogonal Continual (DOC) fine-tuning, a novel approach that tracks the drift of these functional directions and dynamically updates them during the fine-tuning process. Furthermore, by adjusting the gradients of new task parameters to be orthogonal to the tracked historical function directions, our method mitigates interference between new and old tasks. Extensive experiments on various LLM continual learning benchmarks demonstrate that this approach outperforms prior methods, effectively reducing catastrophic forgetting and providing a robust tool for continuous LLM fine-tuning. Our code is available at https://github.com/meloxxxxxx/DOC.
中文: 本文提出动态正交持续微调方法,通过追踪功能方向漂移并动态更新,同时调整新任务参数梯度使其与历史功能方向正交,有效缓解大语言模型持续学习中的灾难性遗忘问题,在多个基准测试中表现优异。
English: This paper introduces Dynamic Orthogonal Continual (DOC) fine-tuning, a novel method that addresses catastrophic forgetting in LLMs by tracking and dynamically updating functional direction drifts while enforcing gradient orthogonality between new and historical tasks, achieving superior performance across benchmarks.
Authors:Yukun Chen, Boheng Li, Yu Yuan, Leyi Qi, Yiming Li, Tianwei Zhang, Zhan Qin, Kui Ren
Abstract:
Knowledge distillation (KD) is a vital technique for deploying deep neural networks (DNNs) on resource-constrained devices by transferring knowledge from large teacher models to lightweight student models. While teacher models from third-party platforms may undergo security verification (\eg, backdoor detection), we uncover a novel and critical threat: distillation-conditional backdoor attacks (DCBAs). DCBA injects dormant and undetectable backdoors into teacher models, which become activated in student models via the KD process, even with clean distillation datasets. While the direct extension of existing methods is ineffective for DCBA, we implement this attack by formulating it as a bilevel optimization problem and proposing a simple yet effective method (\ie, SCAR). Specifically, the inner optimization simulates the KD process by optimizing a surrogate student model, while the outer optimization leverages outputs from this surrogate to optimize the teacher model for implanting the conditional backdoor. Our SCAR addresses this complex optimization utilizing an implicit differentiation algorithm with a pre-optimized trigger injection function. Extensive experiments across diverse datasets, model architectures, and KD techniques validate the effectiveness of our SCAR and its resistance against existing backdoor detection, highlighting a significant yet previously overlooked vulnerability in the KD process. Our code is available at https://github.com/WhitolfChen/SCAR.
中文: 知识蒸馏技术可将大型教师模型的知识转移到轻量级学生模型,但新发现的蒸馏条件后门攻击(DCBA)能在教师模型中植入潜伏后门,这些后门在蒸馏过程中会被激活到学生模型中,我们提出的SCAR方法通过双层优化成功实现了这种攻击,并在多种实验环境中验证了其有效性。
English: Knowledge distillation enables efficient deployment of deep neural networks on resource-limited devices, but a new threat called distillation-conditional backdoor attacks (DCBAs) can implant dormant backdoors in teacher models that activate in student models during distillation, which our proposed SCAR method effectively implements and demonstrates across various datasets and architectures.
Authors:Jianshuo Dong, Sheng Guo, Hao Wang, Zhuotao Liu, Tianwei Zhang, Ke Xu, Minlie Huang, Han Qiu
Abstract:
Search agents connect LLMs to the Internet, enabling access to broader and more up-to-date information. However, unreliable search results may also pose safety threats to end users, establishing a new threat surface. In this work, we conduct two in-the-wild experiments to demonstrate both the prevalence of low-quality search results and their potential to misguide agent behaviors. To counter this threat, we introduce an automated red-teaming framework that is systematic, scalable, and cost-efficient, enabling lightweight and harmless safety assessments of search agents. Building on this framework, we construct the SafeSearch benchmark, which includes 300 test cases covering five categories of risks (e.g., misinformation and indirect prompt injection). Using this benchmark, we evaluate three representative search agent scaffolds, covering search workflow, tool-calling, and deep research, across 7 proprietary and 8 open-source backend LLMs. Our results reveal substantial vulnerabilities of LLM-based search agents: when exposed to unreliable websites, the highest ASR reached 90.5% for GPT-4.1-mini under a search workflow setting. Moreover, our analysis highlights the limited effectiveness of common defense practices, such as reminder prompting. This emphasizes the value of our framework in promoting transparency for safer agent development. Our codebase and test cases are publicly available: https://github.com/jianshuod/SafeSearch.
中文摘要:搜索代理使大语言模型能获取网络实时信息,但也因不可靠搜索结果带来安全威胁;本研究通过自动化红队评估框架和SafeSearch基准测试,揭示了现有系统的显著漏洞及常见防御措施的有限效果。
English Summary: Search agents enable LLMs to access current web information but introduce safety risks from unreliable results, prompting the development of an automated red-teaming framework and SafeSearch benchmark that reveal significant vulnerabilities in existing systems and limited effectiveness of common defenses.
Authors:Zi Liang, Qingqing Ye, Xuan Liu, Yanyun Wang, Jianliang Xu, Haibo Hu
Abstract:
Synthetic data refers to artificial samples generated by models. While it has been validated to significantly enhance the performance of large language models (LLMs) during training and has been widely adopted in LLM development, potential security risks it may introduce remain uninvestigated. This paper systematically evaluates the resilience of synthetic-data-integrated training paradigm for LLMs against mainstream poisoning and backdoor attacks. We reveal that such a paradigm exhibits strong resistance to existing attacks, primarily thanks to the different distribution patterns between poisoning data and queries used to generate synthetic samples. To enhance the effectiveness of these attacks and further investigate the security risks introduced by synthetic data, we introduce a novel and universal attack framework, namely, Virus Infection Attack (VIA), which enables the propagation of current attacks through synthetic data even under purely clean queries. Inspired by the principles of virus design in cybersecurity, VIA conceals the poisoning payload within a protective "shell" and strategically searches for optimal hijacking points in benign samples to maximize the likelihood of generating malicious content. Extensive experiments on both data poisoning and backdoor attacks show that VIA significantly increases the presence of poisoning content in synthetic data and correspondingly raises the attack success rate (ASR) on downstream models to levels comparable to those observed in the poisoned upstream models.
中文摘要:合成数据虽能显著提升大语言模型性能,但其潜在安全风险尚未被探究;本研究发现该训练范式对主流攻击具有强抵抗力,并提出新型病毒感染攻击(VIA),能通过合成数据有效传播恶意内容,大幅提升攻击成功率。
English Summary: Synthetic data enhances LLM performance but poses unexamined security risks, with this study revealing its resilience to standard attacks due to distribution differences while introducing the Virus Infection Attack (VIA) that effectively propagates malicious content through synthetic data.
Authors:Haochen Gong, Zhen Tao, Shidong Pan, Zhenchang Xing, Xiaoyu Sun
Abstract:
Lengthy and legally phrased privacy policies impede users' understanding of how mobile applications collect and process personal data. Prior work proposed Contextual Privacy Policies (CPPs) for mobile apps to display shorter policy snippets only in the corresponding user interface contexts, but the pipeline could not be deployable in real-world mobile environments. In this paper, we present PrivScan, the first deployable CPP Software Development Kit (SDK) for Android. It captures live app screenshots to identify GUI elements associated with types of personal data and displays CPPs in a concise, user-facing format. We provide a lightweight floating button that offers low-friction, on-demand control. The architecture leverages remote deployment to decouple the multimodal backend pipeline from a mobile client comprising five modular components, thereby reducing on-device resource demands and easing cross-platform portability. A feasibility-oriented evaluation shows an average execution time of 9.15\,s, demonstrating the practicality of our approach. The source code of PrivScan is available at https://github.com/buyanghc/PrivScan and the demo video can be found at https://www.youtube.com/watch?v=ck-25otfyHc.
Chinese: 本文提出PrivScan,首个可部署的Android SDK,通过实时截屏识别与个人数据相关的界面元素,以浮动按钮形式简洁展示情境化隐私政策,9.15秒的平均执行时间验证了方案的可行性。
English: This paper introduces PrivScan, a deployable Android SDK that captures live app screenshots to identify GUI elements linked to personal data and displays concise contextual privacy policies via a floating button, with an average execution time of 9.15 seconds proving its feasibility.
Authors:Jiahao Huo, Shuliang Liu, Bin Wang, Junyan Zhang, Yibo Yan, Aiwei Liu, Xuming Hu, Mingxun Zhou
Abstract:
Semantic-level watermarking (SWM) for large language models (LLMs) enhances watermarking robustness against text modifications and paraphrasing attacks by treating the sentence as the fundamental unit. However, existing methods still lack strong theoretical guarantees of robustness, and reject-sampling-based generation often introduces significant distribution distortions compared with unwatermarked outputs. In this work, we introduce a new theoretical framework on SWM through the concept of proxy functions (PFs) $\unicode{x2013}$ functions that map sentences to scalar values. Building on this framework, we propose PMark, a simple yet powerful SWM method that estimates the PF median for the next sentence dynamically through sampling while enforcing multiple PF constraints (which we call channels) to strengthen watermark evidence. Equipped with solid theoretical guarantees, PMark achieves the desired distortion-free property and improves the robustness against paraphrasing-style attacks. We also provide an empirically optimized version that further removes the requirement for dynamical median estimation for better sampling efficiency. Experimental results show that PMark consistently outperforms existing SWM baselines in both text quality and robustness, offering a more effective paradigm for detecting machine-generated text. Our code will be released at [this URL](https://github.com/PMark-repo/PMark).
Chinese: PMark通过代理函数理论框架提出了一种无失真的语义水印方法,增强了抗转述攻击的鲁棒性,并在文本质量和检测效果上优于现有基准方法。
English: PMark introduces a theoretical framework using proxy functions to create a distortion-free semantic watermarking method for LLMs, enhancing robustness against paraphrasing attacks and outperforming existing baselines in text quality and detection effectiveness.
Authors:Maria Chiper, Radu Tudor Ionescu
Abstract:
Phishing attacks targeting both organizations and individuals are becoming an increasingly significant threat as technology advances. Current automatic detection methods often lack explainability and robustness in detecting new phishing attacks. In this work, we investigate the effectiveness of character-level deep learning models for phishing detection, which can provide both robustness and interpretability. We evaluate three neural architectures adapted to operate at the character level, namely CharCNN, CharGRU, and CharBiLSTM, on a custom-built email dataset, which combines data from multiple sources. Their performance is analyzed under three scenarios: (i) standard training and testing, (ii) standard training and testing under adversarial attacks, and (iii) training and testing with adversarial examples. Aiming to develop a tool that operates as a browser extension, we test all models under limited computational resources. In this constrained setup, CharGRU proves to be the best-performing model across all scenarios. All models show vulnerability to adversarial attacks, but adversarial training substantially improves their robustness. In addition, by adapting the Gradient-weighted Class Activation Mapping (Grad-CAM) technique to character-level inputs, we are able to visualize which parts of each email influence the decision of each model. Our open-source code and data is released at https://github.com/chipermaria/every-character-counts.
中文: 本研究评估了字符级深度学习模型在钓鱼检测中的效果,发现CharGRU在计算资源受限时表现最佳,尽管所有模型均易受对抗攻击,但对抗训练能显著提升鲁棒性,并通过改进的Grad-CAM技术实现了决策过程的可视化。
English: This study evaluates character-level deep learning models for phishing detection, finding CharGRU most effective under computational constraints while demonstrating vulnerability to adversarial attacks that can be mitigated through adversarial training and model interpretability via Grad-CAM adaptation.
Authors:Sahil Tyagi, Andrei Cozma, Olivera Kotevska, Feiyi Wang
Abstract:
Federated Learning (FL) is critical for edge and High Performance Computing (HPC) where data is not centralized and privacy is crucial. We present OmniFed, a modular framework designed around decoupling and clear separation of concerns for configuration, orchestration, communication, and training logic. Its architecture supports configuration-driven prototyping and code-level override-what-you-need customization. We also support different topologies, mixed communication protocols within a single deployment, and popular training algorithms. It also offers optional privacy mechanisms including Differential Privacy (DP), Homomorphic Encryption (HE), and Secure Aggregation (SA), as well as compression strategies. These capabilities are exposed through well-defined extension points, allowing users to customize topology and orchestration, learning logic, and privacy/compression plugins, all while preserving the integrity of the core system. We evaluate multiple models and algorithms to measure various performance metrics. By unifying topology configuration, mixed-protocol communication, and pluggable modules in one stack, OmniFed streamlines FL deployment across heterogeneous environments. Github repository is available at https://github.com/at-aaims/OmniFed.
中文: OmniFed是一个模块化的联邦学习框架,通过可插拔架构支持灵活配置、多种拓扑结构和隐私保护机制,简化了异构环境中的部署流程。
English: OmniFed is a modular federated learning framework that enables flexible configuration, supports diverse topologies and privacy mechanisms, and streamlines deployment across heterogeneous environments through its pluggable architecture.
Authors:Xiangmin Shen, Wenyuan Cheng, Yan Chen, Zhenyuan Li, Yuqiao Gu, Lingzhi Wang, Wencheng Zhao, Dawei Sun, Jiashui Wang
Abstract:
Security practitioners face growing challenges in exploit assessment, as public vulnerability repositories are increasingly populated with inconsistent and low-quality exploit artifacts. Existing scoring systems, such as CVSS and EPSS, offer limited support for this task. They either rely on theoretical metrics or produce opaque probability estimates without assessing whether usable exploit code exists. In practice, security teams often resort to manual triage of exploit repositories, which is time-consuming, error-prone, and difficult to scale. We present AEAS, an automated system designed to assess and prioritize actionable exploits through static analysis. AEAS analyzes both exploit code and associated documentation to extract a structured set of features reflecting exploit availability, functionality, and setup complexity. It then computes an actionability score for each exploit and produces ranked exploit recommendations. We evaluate AEAS on a dataset of over 5,000 vulnerabilities derived from 600+ real-world applications frequently encountered by red teams. Manual validation and expert review on representative subsets show that AEAS achieves a 100% top-3 success rate in recommending functional exploits and shows strong alignment with expert-validated rankings. These results demonstrate the effectiveness of AEAS in supporting exploit-driven vulnerability prioritization.
中文: 安全从业者面临漏洞利用评估的挑战,现有评分系统支持有限,而AEAS通过静态分析自动评估和优先排序可利用漏洞,在推荐功能性漏洞方面取得了高成功率。
English: Security practitioners struggle with inconsistent exploit artifacts and limited support from existing scoring systems, prompting the development of AEAS, an automated tool that assesses and prioritizes actionable exploits through static analysis, achieving high success rates in recommending functional exploits.
Authors:Florinel Alin Croitoru, Vlad Hondru, Radu Tudor Ionescu
Abstract:
We propose a novel benchmark for camera identification via Photo Response Non-Uniformity (PRNU) estimation. The benchmark comprises 13K photos taken with 120+ cameras, where the training and test photos are taken in different scenarios, enabling ``in-the-wild'' evaluation. In addition, we propose a novel PRNU-based camera identification model that employs a hybrid architecture, comprising a denoising autoencoder to estimate the PRNU signal and a convolutional network that can perform 1:N verification of camera devices. Instead of using a conventional approach based on contrastive learning, our method takes the Hadamard product between reference and query PRNU signals as input. This novel design leads to significantly better results compared with state-of-the-art models based on denoising autoencoders and contrastive learning. We release our dataset and code at: https://github.com/CroitoruAlin/PRNU-Bench.
Chinese: 我们提出了一个基于光响应非均匀性估计的相机识别新基准,包含来自120多台相机的1.3万张照片,并采用结合去噪自编码器和卷积网络的混合模型,显著提升了识别性能。
English: We introduce a new benchmark for camera identification using PRNU estimation, featuring 13,000 photos from over 120 cameras and a hybrid model that combines a denoising autoencoder with a convolutional network for improved accuracy.
Authors:Rui Yang, Michael Fu, Chakkrit Tantithamthavorn, Chetan Arora, Gunel Gulmammadova, Joey Chua
Abstract:
Guardrails are critical for the safe deployment of Large Language Models (LLMs)-powered software. Unlike traditional rule-based systems with limited, predefined input-output spaces that inherently constrain unsafe behavior, LLMs enable open-ended, intelligent interactions--opening the door to jailbreak attacks through user inputs. Guardrails serve as a protective layer, filtering unsafe prompts before they reach the LLM. However, prior research shows that jailbreak attacks can still succeed over 70% of the time, even against advanced models like GPT-4o. While guardrails such as LlamaGuard report up to 95% accuracy, our preliminary analysis shows their performance can drop sharply--to as low as 12%--when confronted with unseen attacks. This highlights a growing software engineering challenge: how to build a post-deployment guardrail that adapts dynamically to emerging threats? To address this, we propose AdaptiveGuard, an adaptive guardrail that detects novel jailbreak attacks as out-of-distribution (OOD) inputs and learns to defend against them through a continual learning framework. Through empirical evaluation, AdaptiveGuard achieves 96% OOD detection accuracy, adapts to new attacks in just two update steps, and retains over 85% F1-score on in-distribution data post-adaptation, outperforming other baselines. These results demonstrate that AdaptiveGuard is a guardrail capable of evolving in response to emerging jailbreak strategies post deployment. We release our AdaptiveGuard and studied datasets at https://github.com/awsm-research/AdaptiveGuard to support further research.
中文: 护栏对保护大语言模型免受越狱攻击至关重要,但现有系统难以应对新型威胁,因此我们提出AdaptiveGuard,一种自适应护栏,能检测新攻击并持续学习防御,实现高精度和快速适应。
English: Guardrails are essential for protecting Large Language Models from jailbreak attacks, but current systems struggle with new threats, prompting the development of AdaptiveGuard, an adaptive solution that detects novel attacks and learns to counter them, achieving high accuracy and rapid adaptation.
Authors:Daniyal Kabir Dar, Qiben Yan, Li Xiao, Arun Ross
Abstract:
Adversarial perturbations in speech pose a serious threat to automatic speech recognition (ASR) and speaker verification by introducing subtle waveform modifications that remain imperceptible to humans but can significantly alter system outputs. While targeted attacks on end-to-end ASR models have been widely studied, the phonetic basis of these perturbations and their effect on speaker identity remain underexplored. In this work, we analyze adversarial audio at the phonetic level and show that perturbations exploit systematic confusions such as vowel centralization and consonant substitutions. These distortions not only mislead transcription but also degrade phonetic cues critical for speaker verification, leading to identity drift. Using DeepSpeech as our ASR target, we generate targeted adversarial examples and evaluate their impact on speaker embeddings across genuine and impostor samples. Results across 16 phonetically diverse target phrases demonstrate that adversarial audio induces both transcription errors and identity drift, highlighting the need for phonetic-aware defenses to ensure the robustness of ASR and speaker recognition systems.
Authors:Emilie Kibsgaard, Anita Sue Jwa, Christopher J Markiewicz, David Rodriguez Gonzalez, Judith Sainz Pardo, Russell A. Poldrack, Cyril R. Pernet
Abstract:
The ethical and legal imperative to share research data without causing harm requires careful attention to privacy risks. While mounting evidence demonstrates that data sharing benefits science, legitimate concerns persist regarding the potential leakage of personal information that could lead to reidentification and subsequent harm. We reviewed metadata accompanying neuroimaging datasets from six heterogeneous studies openly available on OpenNeuro, involving participants across the lifespan, from children to older adults, with and without clinical diagnoses, and including associated clinical score data. Using metaprivBIDS (https://github.com/CPernet/metaprivBIDS), a novel tool for the systematic assessment of privacy in tabular data, we found that privacy is generally well maintained, with serious vulnerabilities being rare. Nonetheless, minor issues were identified in nearly all datasets and warrant mitigation. Notably, clinical score data (e.g., neuropsychological results) posed minimal reidentification risk, whereas demographic variables (age, sex, race, income, and geolocation) represented the principal privacy vulnerabilities. We outline practical measures to address these risks, enabling safer data sharing practices.
中文: 研究数据共享需在科学获益与隐私保护间取得平衡,神经影像研究表明尽管整体防护良好,但人口统计学变量仍是重识别风险的主要来源。
English: Data sharing in research requires balancing scientific benefits with privacy protection, as neuroimaging studies show demographic variables pose the main reidentification risks despite overall secure practices.
Authors:Xinran Zheng, Xingzhi Qian, Yiling He, Shuo Yang, Lorenzo Cavallaro
Abstract:
Automated malware classification has achieved strong detection performance. Yet, malware behavior auditing seeks causal and verifiable explanations of malicious activities -- essential not only to reveal what malware does but also to substantiate such claims with evidence. This task is challenging, as adversarial intent is often hidden within complex, framework-heavy applications, making manual auditing slow and costly. Large Language Models (LLMs) could help address this gap, but their auditing potential remains largely unexplored due to three limitations: (1) scarce fine-grained annotations for fair assessment; (2) abundant benign code obscuring malicious signals; and (3) unverifiable, hallucination-prone outputs undermining attribution credibility. To close this gap, we introduce MalEval, a comprehensive framework for fine-grained Android malware auditing, designed to evaluate how effectively LLMs support auditing under real-world constraints. MalEval provides expert-verified reports and an updated sensitive API list to mitigate ground truth scarcity and reduce noise via static reachability analysis. Function-level structural representations serve as intermediate attribution units for verifiable evaluation. Building on this, we define four analyst-aligned tasks -- function prioritization, evidence attribution, behavior synthesis, and sample discrimination -- together with domain-specific metrics and a unified workload-oriented score. We evaluate seven widely used LLMs on a curated dataset of recent malware and misclassified benign apps, offering the first systematic assessment of their auditing capabilities. MalEval reveals both promising potential and critical limitations across audit stages, providing a reproducible benchmark and foundation for future research on LLM-enhanced malware behavior auditing. MalEval is publicly available at https://github.com/ZhengXR930/MalEval.git
中文: MalEval框架通过提供专家验证数据和结构化表示,系统评估大语言模型在四项分析任务中的恶意软件审计能力,既揭示了其潜力也暴露了关键局限,为未来研究建立了可复现基准。
English: The MalEval framework addresses the limitations of large language models in malware auditing by providing expert-verified data and structural representations to systematically evaluate their capabilities across four analyst-aligned tasks, revealing both potential and critical gaps in current approaches.
Authors:Vaidehi Patil, Elias Stengel-Eskin, Mohit Bansal
Abstract:
As large language models (LLMs) become integral to multi-agent systems, new privacy risks emerge that extend beyond memorization, direct inference, or single-turn evaluations. In particular, seemingly innocuous responses, when composed across interactions, can cumulatively enable adversaries to recover sensitive information, a phenomenon we term compositional privacy leakage. We present the first systematic study of such compositional privacy leaks and possible mitigation methods in multi-agent LLM systems. First, we develop a framework that models how auxiliary knowledge and agent interactions jointly amplify privacy risks, even when each response is benign in isolation. Next, to mitigate this, we propose and evaluate two defense strategies: (1) Theory-of-Mind defense (ToM), where defender agents infer a questioner's intent by anticipating how their outputs may be exploited by adversaries, and (2) Collaborative Consensus Defense (CoDef), where responder agents collaborate with peers who vote based on a shared aggregated state to restrict sensitive information spread. Crucially, we balance our evaluation across compositions that expose sensitive information and compositions that yield benign inferences. Our experiments quantify how these defense strategies differ in balancing the privacy-utility trade-off. We find that while chain-of-thought alone offers limited protection to leakage (~39% sensitive blocking rate), our ToM defense substantially improves sensitive query blocking (up to 97%) but can reduce benign task success. CoDef achieves the best balance, yielding the highest Balanced Outcome (79.8%), highlighting the benefit of combining explicit reasoning with defender collaboration. Together, our results expose a new class of risks in collaborative LLM deployments and provide actionable insights for designing safeguards against compositional, context-driven privacy leakage.
中文摘要:本研究揭示了多智能体大语言模型系统中组合式隐私泄露的风险,即看似无害的交互响应在累积中可能泄露敏感信息,并提出心智理论和协作共识两种防御策略,在保护隐私与保持系统效用间实现了最佳平衡。
English Summary: This study identifies compositional privacy leakage in multi-agent LLM systems, where seemingly harmless responses collectively expose sensitive information, and proposes two defense strategies—Theory-of-Mind and Collaborative Consensus—that effectively balance privacy protection with utility.
Authors:Jiahao Xu, Zikai Zhang, Rui Hu
Abstract:
Traditional backdoor attacks in federated learning (FL) operate within constrained attack scenarios, as they depend on visible triggers and require physical modifications to the target object, which limits their practicality. To address this limitation, we introduce a novel backdoor attack prototype for FL called the out-of-distribution (OOD) backdoor attack ($\mathtt{OBA}$), which uses OOD data as both poisoned samples and triggers simultaneously. Our approach significantly broadens the scope of backdoor attack scenarios in FL. To improve the stealthiness of $\mathtt{OBA}$, we propose $\mathtt{SoDa}$, which regularizes both the magnitude and direction of malicious local models during local training, aligning them closely with their benign versions to evade detection. Empirical results demonstrate that $\mathtt{OBA}$ effectively circumvents state-of-the-art defenses while maintaining high accuracy on the main task. To address this security vulnerability in the FL system, we introduce $\mathtt{BNGuard}$, a new server-side defense method tailored against $\mathtt{SoDa}$. $\mathtt{BNGuard}$ leverages the observation that OOD data causes significant deviations in the running statistics of batch normalization layers. This allows $\mathtt{BNGuard}$ to identify malicious model updates and exclude them from aggregation, thereby enhancing the backdoor robustness of FL. Extensive experiments across various settings show the effectiveness of $\mathtt{BNGuard}$ on defending against $\mathtt{SoDa}$. The code is available at https://github.com/JiiahaoXU/SoDa-BNGuard.
中文: 本文提出了一种新颖的联邦学习分布外后门攻击(OBA),利用OOD数据作为触发器,并开发了隐蔽增强方法SoDa,同时设计了BNGuard防御机制,通过批归一化层统计检测恶意更新以增强系统安全性。
English: This paper introduces a novel out-of-distribution backdoor attack (OBA) for federated learning that uses OOD data as triggers, along with a stealth-enhancing method SoDa, and proposes BNGuard defense that detects malicious updates through batch normalization statistics to secure FL systems.
Authors:Eyal German, Daniel Samira, Yuval Elovici, Asaf Shabtai
Abstract:
Synthetic data generation plays an important role in enabling data sharing, particularly in sensitive domains like healthcare and finance. Recent advances in diffusion models have made it possible to generate realistic, high-quality tabular data, but they may also memorize training records and leak sensitive information. Membership inference attacks (MIAs) exploit this vulnerability by determining whether a record was used in training. While MIAs have been studied in images and text, their use against tabular diffusion models remains underexplored despite the unique risks of structured attributes and limited record diversity. In this paper, we introduce MIAEPT, Membership Inference Attack via Error Prediction for Tabular Data, a novel black-box attack specifically designed to target tabular diffusion models. MIA-EPT constructs errorbased feature vectors by masking and reconstructing attributes of target records, disclosing membership signals based on how well these attributes are predicted. MIA-EPT operates without access to the internal components of the generative model, relying only on its synthetic data output, and was shown to generalize across multiple state-of-the-art diffusion models. We validate MIA-EPT on three diffusion-based synthesizers, achieving AUC-ROC scores of up to 0.599 and TPR@10% FPR values of 22.0% in our internal tests. Under the MIDST 2025 competition conditions, MIA-EPT achieved second place in the Black-box Multi-Table track (TPR@10% FPR = 20.0%). These results demonstrate that our method can uncover substantial membership leakage in synthetic tabular data, challenging the assumption that synthetic data is inherently privacy-preserving. Our code is publicly available at https://github.com/eyalgerman/MIA-EPT.
中文: 本文提出MIA-EPT这一黑盒成员推理攻击方法,通过分析重构误差有效识别表格扩散模型的训练数据泄露,挑战了合成数据天生保护隐私的假设。
English: This paper introduces MIA-EPT, a black-box membership inference attack method that effectively identifies training data leakage in tabular diffusion models by analyzing reconstruction errors, challenging the assumption that synthetic data inherently preserves privacy.
Authors:Yifan Zhang
Abstract:
We give a simple and provably correct replacement for the contested ``domain-extension'' in Step 9 of a recent windowed-QFT lattice algorithm with complex-Gaussian windows~\citep{chen2024quantum}. As acknowledged by the author, the reported issue is due to a periodicity/support mismatch when applying domain extension to only the first coordinate in the presence of offsets. Our drop-in subroutine replaces domain extension by a pair-shift difference that cancels all unknown offsets exactly and synthesizes a uniform cyclic subgroup (a zero-offset coset) of order $P$ inside $(\mathbb{Z}_{M_2})^n$. A subsequent QFT enforces the intended modular linear relation by plain character orthogonality. The sole structural assumption is a residue-accessibility condition enabling coherent auxiliary cleanup; no amplitude periodicity is used. The unitary is reversible, uses $\mathrm{poly}(\log M_2)$ gates, and preserves upstream asymptotics.
中文摘要:本研究提出了一种可证明正确的子程序,通过配对位移差分方法取代量子格算法中有问题的域扩展步骤,能够精确消除未知偏移量并合成均匀循环子群,同时保持计算效率。
English Summary: This work presents a provably correct subroutine that replaces the problematic domain-extension step in a quantum lattice algorithm, using a pair-shift difference method to cancel unknown offsets and synthesize uniform cyclic subgroups while maintaining computational efficiency.
Authors:Yitong Zhang, Ximo Li, Liyi Cai, Jia Li
Abstract:
GUI agents built on LVLMs are increasingly used to interact with websites. However, their exposure to open-world content makes them vulnerable to Environmental Injection Attacks (EIAs) that hijack agent behavior via webpage elements. Many recent studies assume the attacker to be a regular user who can only upload a single trigger image, which is more realistic than earlier assumptions of website-level administrative control. However, these works still fall short of realism: (1) the trigger's position and surrounding context remain largely fixed between training and testing, failing to capture the dynamic nature of real webpages and (2) the trigger often occupies an unrealistically large area, whereas real-world images are typically small. To better reflect real-world scenarios, we introduce a more realistic threat model where the attacker is a regular user and the trigger image is small and embedded within a dynamically changing environment. As a result, existing attacks prove largely ineffective under this threat model. To better expose the vulnerabilities of GUI agents, we propose Chameleon, an attack framework with two main novelties. The first is LLM-Driven Environment Simulation, which automatically generates diverse and high-fidelity webpage simulations. The second is Attention Black Hole, which transforms attention weights into explicit supervisory signals that guide the agent's focus toward the trigger region. We evaluate Chameleon on 6 realistic websites and 4 representative LVLM-powered GUI agents, where it significantly outperforms existing methods. Ablation studies confirm that both novelties are critical to performance. Our findings reveal underexplored vulnerabilities in modern GUI agents and establish a robust foundation for future research on defense in open-world GUI agent systems. The code is publicly available at https://github.com/zhangyitonggg/attack2gui.
中文: 基于LVLM的GUI代理易受环境注入攻击,本研究提出Chameleon攻击框架,通过模拟动态网页环境和引导代理关注小尺寸触发器,在现实威胁模型下显著优于现有方法。
English: GUI agents based on LVLMs are vulnerable to Environmental Injection Attacks, and this study introduces Chameleon, a novel attack framework that simulates dynamic web environments and directs agent attention to small triggers, significantly outperforming existing methods under realistic threat models.
Authors:Youquan Xian, Xueying Zeng, Mei Huang, Aoxiang Zhou, Xiaoyu Cui, Peng Liu, Lei Cui
Abstract:
In recent years, sequence features such as packet length have received considerable attention due to their central role in encrypted traffic analysis. Existing sequence modeling approaches can be broadly categorized into flow-level and trace-level methods: the former suffer from high feature redundancy, limiting their discriminative power, whereas the latter preserve complete information but incur substantial computational and storage overhead. To address these limitations, we propose the \textbf{U}p-\textbf{D}own \textbf{F}low \textbf{S}equence (\textbf{UDFS}) representation, which compresses an entire trace into a two-dimensional sequence and characterizes each flow by the aggregate of its upstream and downstream traffic, reducing complexity while maintaining high discriminability. Furthermore, to address the challenge of class-specific discriminability differences, we propose an adaptive threshold mechanism that dynamically adjusts training weights and rejection boundaries, enhancing the model's classification performance. Experimental results demonstrate that the proposed method achieves superior classification performance and robustness on both coarse-grained and fine-grained datasets, as well as under concept drift and open-world scenarios. Code and Dataset are available at https://github.com/kid1999/UDFS.
中文摘要:提出的UDFS表示法将流量轨迹压缩为二维序列以降低复杂度同时保持区分度,自适应阈值机制进一步提升了模型在多种场景下的分类性能。
English Summary: The proposed UDFS representation compresses traffic traces into two-dimensional sequences to reduce complexity while maintaining discriminability, and an adaptive threshold mechanism further enhances classification performance across various scenarios.
Authors:Eli Baum, Sam Buxbaum, Nitin Mathai, Muhammad Faisal, Vasiliki Kalavri, Mayank Varia, John Liagouris
Abstract:
We present ORQ, a system that enables collaborative analysis of large private datasets using cryptographically secure multi-party computation (MPC). ORQ protects data against semi-honest or malicious parties and can efficiently evaluate relational queries with multi-way joins and aggregations that have been considered notoriously expensive under MPC. To do so, ORQ eliminates the quadratic cost of secure joins by leveraging the fact that, in practice, the structure of many real queries allows us to join records and apply the aggregations "on the fly" while keeping the result size bounded. On the system side, ORQ contributes generic oblivious operators, a data-parallel vectorized query engine, a communication layer that amortizes MPC network costs, and a dataflow API for expressing relational analytics -- all built from the ground up. We evaluate ORQ in LAN and WAN deployments on a diverse set of workloads, including complex queries with multiple joins and custom aggregations. When compared to state-of-the-art solutions, ORQ significantly reduces MPC execution times and can process one order of magnitude larger datasets. For our most challenging workload, the full TPC-H benchmark, we report results entirely under MPC with Scale Factor 10 -- a scale that had previously been achieved only with information leakage or the use of trusted third parties.
中文: ORQ系统通过密码学安全的多方计算技术,实现了对大型私有数据集的高效协同分析,其创新性地采用实时聚合消除二次连接成本,并构建了通用 oblivious 操作符,在保持数据安全性的同时将处理性能提升了一个数量级。
English: ORQ is a secure multi-party computation system that enables efficient collaborative analysis of large private datasets by eliminating quadratic join costs through on-the-fly aggregation and introducing novel oblivious operators, significantly outperforming existing solutions in both speed and scalability.
Authors:Fanzhen Liu, Alsharif Abuadbba, Kristen Moore, Surya Nepal, Cecile Paris, Jia Wu, Jian Yang, Quan Z. Sheng
Abstract:
In an era where misinformation spreads freely, fact-checking (FC) plays a crucial role in verifying claims and promoting reliable information. While automated fact-checking (AFC) has advanced significantly, existing systems remain vulnerable to adversarial attacks that manipulate or generate claims, evidence, or claim-evidence pairs. These attacks can distort the truth, mislead decision-makers, and ultimately undermine the reliability of FC models. Despite growing research interest in adversarial attacks against AFC systems, a comprehensive, holistic overview of key challenges remains lacking. These challenges include understanding attack strategies, assessing the resilience of current models, and identifying ways to enhance robustness. This survey provides the first in-depth review of adversarial attacks targeting FC, categorizing existing attack methodologies and evaluating their impact on AFC systems. Additionally, we examine recent advancements in adversary-aware defenses and highlight open research questions that require further exploration. Our findings underscore the urgent need for resilient FC frameworks capable of withstanding adversarial manipulations in pursuit of preserving high verification accuracy.
中文摘要:本综述首次系统梳理针对事实核查系统的对抗性攻击,分类评估攻击方法及防御机制,强调构建抗干扰核查框架对保障信息验证准确性的紧迫需求。
English Summary: This survey comprehensively reviews adversarial attacks on automated fact-checking systems, analyzing attack methodologies and defenses while highlighting the critical need for more resilient frameworks to maintain verification accuracy.
Authors:Ze Sheng, Qingxiao Xu, Jianwei Huang, Matthew Woodcock, Heqing Huang, Alastair F. Donaldson, Guofei Gu, Jeff Huang
Abstract:
Our team, All You Need Is A Fuzzing Brain, was one of seven finalists in DARPA's Artificial Intelligence Cyber Challenge (AIxCC), placing fourth in the final round. During the competition, we developed a Cyber Reasoning System (CRS) that autonomously discovered 28 security vulnerabilities - including six previously unknown zero-days - in real-world open-source C and Java projects, and successfully patched 14 of them. The complete CRS is open source at https://github.com/o2lab/afc-crs-all-you-need-is-a-fuzzing-brain. This paper provides a detailed technical description of our CRS, with an emphasis on its LLM-powered components and strategies. Building on AIxCC, we further introduce a public leaderboard for benchmarking state-of-the-art LLMs on vulnerability detection and patching tasks, derived from the AIxCC dataset. The leaderboard is available at https://o2lab.github.io/FuzzingBrain-Leaderboard/.
中文: 我们的团队在DARPA的AIxCC竞赛中获得第四名,开发的开源网络推理系统自主发现了28个安全漏洞(含6个零日漏洞)并修复了其中14个,同时建立了用于评估大语言模型安全任务性能的公开排行榜。
English: Our team placed fourth in DARPA's AIxCC competition by developing an open-source Cyber Reasoning System that autonomously discovered 28 vulnerabilities—including six zero-days—and patched 14 of them, while also establishing a public leaderboard for benchmarking LLMs on security tasks.
Authors:Mohammad Reza Mirbagheri, Mohammad Mahdi Mirkamali, Zahra Motoshaker Arani, Ali Javeri, Amir Mahdi Sadeghzadeh, Rasool Jalili
Abstract:
Large Language Models (LLMs), trained on extensive datasets using advanced deep learning architectures, have demonstrated remarkable performance across a wide range of language tasks, becoming a cornerstone of modern AI technologies. However, ensuring their trustworthiness remains a critical challenge, as reliability is essential not only for accurate performance but also for upholding ethical, cultural, and social values. Careful alignment of training data and culturally grounded evaluation criteria are vital for developing responsible AI systems. In this study, we introduce the EPT (Evaluation of Persian Trustworthiness) metric, a culturally informed benchmark specifically designed to assess the trustworthiness of LLMs across six key aspects: truthfulness, safety, fairness, robustness, privacy, and ethical alignment. We curated a labeled dataset and evaluated the performance of several leading models - including ChatGPT, Claude, DeepSeek, Gemini, Grok, LLaMA, Mistral, and Qwen - using both automated LLM-based and human assessments. Our results reveal significant deficiencies in the safety dimension, underscoring the urgent need for focused attention on this critical aspect of model behavior. Furthermore, our findings offer valuable insights into the alignment of these models with Persian ethical-cultural values and highlight critical gaps and opportunities for advancing trustworthy and culturally responsible AI. The dataset is publicly available at: https://github.com/Rezamirbagheri110/EPT-Benchmark.
Chinese: 本研究引入EPT指标,这是一个基于文化背景的基准,用于评估大型语言模型在六个关键方面的可信度,揭示了显著的安全缺陷,并强调了在AI开发中与波斯伦理文化价值观保持一致的必要性。
English: This study introduces the EPT metric, a culturally informed benchmark for evaluating the trustworthiness of LLMs across six key aspects, revealing significant safety deficiencies and highlighting the need for alignment with Persian ethical-cultural values in AI development.
Authors:Yuntao Du, Yuetian Chen, Hanshen Xiao, Bruno Ribeiro, Ninghui Li
Abstract:
A Membership Inference Attack (MIA) assesses how much a target machine learning model reveals about its training data by determining whether specific query instances were part of the training set. State-of-the-art MIAs rely on training hundreds of shadow models that are independent of the target model, leading to significant computational overhead. In this paper, we introduce Imitative Membership Inference Attack (IMIA), which employs a novel imitative training technique to strategically construct a small number of target-informed imitative models that closely replicate the target model's behavior for inference. Extensive experimental results demonstrate that IMIA substantially outperforms existing MIAs in various attack settings while only requiring less than 5% of the computational cost of state-of-the-art approaches.
中文摘要:本文提出的IMIA通过构建少量目标导向的模仿模型,在显著降低95%以上计算成本的同时,实现了比现有成员推理攻击更优越的性能。
English Summary: The paper introduces IMIA, a novel membership inference attack that uses target-informed imitative models to outperform existing methods while reducing computational costs by over 95%.
Authors:Jack Wilkie, Hanan Hindy, Christos Tachtatzis, Robert Atkinson
Abstract:
Network intrusion detection remains a critical challenge in cybersecurity. While supervised machine learning models achieve state-of-the-art performance, their reliance on large labelled datasets makes them impractical for many real-world applications. Anomaly detection methods, which train exclusively on benign traffic to identify malicious activity, suffer from high false positive rates, limiting their usability. Recently, self-supervised learning techniques have demonstrated improved performance with lower false positive rates by learning discriminative latent representations of benign traffic. In particular, contrastive self-supervised models achieve this by minimizing the distance between similar (positive) views of benign traffic while maximizing it between dissimilar (negative) views. Existing approaches generate positive views through data augmentation and treat other samples as negative. In contrast, this work introduces Contrastive Learning using Augmented Negative pairs (CLAN), a novel paradigm for network intrusion detection where augmented samples are treated as negative views - representing potentially malicious distributions - while other benign samples serve as positive views. This approach enhances both classification accuracy and inference efficiency after pretraining on benign traffic. Experimental evaluation on the Lycos2017 dataset demonstrates that the proposed method surpasses existing self-supervised and anomaly detection techniques in a binary classification task. Furthermore, when fine-tuned on a limited labelled dataset, the proposed approach achieves superior multi-class classification performance compared to existing self-supervised models.
中文: 本文提出CLAN,一种用于网络入侵检测的新型自监督对比学习方法,将增强样本视为负样本视图以提高分类精度和效率,在Lycos2017数据集上超越了现有技术。
English: This paper introduces CLAN, a novel self-supervised contrastive learning method for network intrusion detection that treats augmented samples as negative views to improve classification accuracy and efficiency, outperforming existing techniques on the Lycos2017 dataset.
Authors:Jeongmin Yu, Susang Kim, Kisu Lee, Taekyoung Kwon, Won-Yong Shin, Ha Young Kim
Abstract:
Recent face anti-spoofing (FAS) methods have shown remarkable cross-domain performance by employing vision-language models like CLIP. However, existing CLIP-based FAS models do not fully exploit CLIP's patch embedding tokens, failing to detect critical spoofing clues. Moreover, these models rely on a single text prompt per class (e.g., 'live' or 'fake'), which limits generalization. To address these issues, we propose MVP-FAS, a novel framework incorporating two key modules: Multi-View Slot attention (MVS) and Multi-Text Patch Alignment (MTPA). Both modules utilize multiple paraphrased texts to generate generalized features and reduce dependence on domain-specific text. MVS extracts local detailed spatial features and global context from patch embeddings by leveraging diverse texts with multiple perspectives. MTPA aligns patches with multiple text representations to improve semantic robustness. Extensive experiments demonstrate that MVP-FAS achieves superior generalization performance, outperforming previous state-of-the-art methods on cross-domain datasets. Code: https://github.com/Elune001/MVP-FAS.
中文: MVP-FAS框架通过多视角槽位注意力和多文本补丁对齐模块,充分利用CLIP的补丁嵌入和多样化文本提示,显著提升了人脸防伪的跨领域泛化性能。
English: The proposed MVP-FAS framework enhances face anti-spoofing by leveraging multi-view slot attention and multi-text patch alignment to better utilize CLIP's patch embeddings and multiple text prompts, achieving superior cross-domain generalization.
Authors:Qin Yang, Nicholas Stout, Meisam Mohammady, Han Wang, Ayesha Samreen, Christopher J Quinn, Yan Yan, Ashish Kundu, Yuan Hong
Abstract:
Differentially Private Stochastic Gradient Descent (DP-SGD) is a standard method for enforcing privacy in deep learning, typically using the Gaussian mechanism to perturb gradient updates. However, conventional mechanisms such as Gaussian and Laplacian noise are parameterized only by variance or scale. This single degree of freedom ties the magnitude of noise directly to both privacy loss and utility degradation, preventing independent control of these two factors. The problem becomes more pronounced when the number of composition rounds T and batch size B vary across tasks, as these variations induce task-dependent shifts in the privacy-utility trade-off, where small changes in noise parameters can disproportionately affect model accuracy. To address this limitation, we introduce PLRV-O, a framework that defines a broad search space of parameterized DP-SGD noise distributions, where privacy loss moments are tightly characterized yet can be optimized more independently with respect to utility loss. This formulation enables systematic adaptation of noise to task-specific requirements, including (i) model size, (ii) training duration, (iii) batch sampling strategies, and (iv) clipping thresholds under both training and fine-tuning settings. Empirical results demonstrate that PLRV-O substantially improves utility under strict privacy constraints. On CIFAR-10, a fine-tuned ViT achieves 94.03% accuracy at epsilon approximately 0.5, compared to 83.93% with Gaussian noise. On SST-2, RoBERTa-large reaches 92.20% accuracy at epsilon approximately 0.2, versus 50.25% with Gaussian.
中文: 本文提出了PLRV-O框架,为差分隐私随机梯度下降定义了一个参数化噪声分布的广泛搜索空间,能够更独立地控制隐私损失和效用损失,从而根据任务特定需求系统调整噪声,在严格隐私约束下显著提升模型准确性。
English: This paper introduces PLRV-O, a framework that creates a search space for parameterized noise distributions in DP-SGD, enabling more independent control over privacy loss and utility degradation to systematically adapt noise to task-specific requirements and significantly improve model accuracy under strict privacy constraints.
Authors:Xinyu Gao, Xiangtao Meng, Yingkai Dong, Zheng Li, Shanqing Guo
Abstract:
While Retrieval-Augmented Generation (RAG) effectively reduces hallucinations by integrating external knowledge bases, it introduces vulnerabilities to membership inference attacks (MIAs), particularly in systems handling sensitive data. Existing MIAs targeting RAG's external databases often rely on model responses but ignore the interference of non-member-retrieved documents on RAG outputs, limiting their effectiveness. To address this, we propose DCMI, a differential calibration MIA that mitigates the negative impact of non-member-retrieved documents. Specifically, DCMI leverages the sensitivity gap between member and non-member retrieved documents under query perturbation. It generates perturbed queries for calibration to isolate the contribution of member-retrieved documents while minimizing the interference from non-member-retrieved documents. Experiments under progressively relaxed assumptions show that DCMI consistently outperforms baselines--for example, achieving 97.42% AUC and 94.35% Accuracy against the RAG system with Flan-T5, exceeding the MBA baseline by over 40%. Furthermore, on real-world RAG platforms such as Dify and MaxKB, DCMI maintains a 10%-20% advantage over the baseline. These results highlight significant privacy risks in RAG systems and emphasize the need for stronger protection mechanisms. We appeal to the community's consideration of deeper investigations, like ours, against the data leakage risks in rapidly evolving RAG systems. Our code is available at https://github.com/Xinyu140203/RAG_MIA.
中文: 检索增强生成(RAG)系统因非成员文档的干扰易受成员推理攻击,为此提出的差分校准方法DCMI能有效分离成员贡献,在准确率和隐私风险防控上显著优于现有基线方案。
English: Retrieval-Augmented Generation (RAG) systems are vulnerable to membership inference attacks due to interference from non-member documents, prompting the development of DCMI, a differential calibration method that effectively isolates member contributions and significantly outperforms existing baselines in both accuracy and privacy risk mitigation.
Authors:Jie Fu, Hong Yuan, Zhili Chen, Wendy Hui Wang
Abstract:
Graph Neural Networks (GNNs) have emerged as powerful models for learning from graph-structured data. However, their widespread adoption has raised serious privacy concerns. While prior research has primarily focused on edge-level privacy, a critical yet underexplored threat lies in topology privacy - the confidentiality of the graph's overall structure. In this work, we present a comprehensive study on topology privacy risks in GNNs, revealing their vulnerability to graph-level inference attacks. To this end, we propose a suite of Topology Inference Attacks (TIAs) that can reconstruct the structure of a target training graph using only black-box access to a GNN model. Our findings show that GNNs are highly susceptible to these attacks, and that existing edge-level differential privacy mechanisms are insufficient as they either fail to mitigate the risk or severely compromise model accuracy. To address this challenge, we introduce Private Graph Reconstruction (PGR), a novel defense framework designed to protect topology privacy while maintaining model accuracy. PGR is formulated as a bi-level optimization problem, where a synthetic training graph is iteratively generated using meta-gradients, and the GNN model is concurrently updated based on the evolving graph. Extensive experiments demonstrate that PGR significantly reduces topology leakage with minimal impact on model accuracy. Our code is available at https://github.com/JeffffffFu/PGR.
中文: 本研究通过新型拓扑推理攻击揭示了图神经网络存在的严重拓扑隐私漏洞,并提出了私有图重构防御框架,该方案能在保护图结构机密性的同时有效维持模型精度。
English: This study exposes significant topology privacy vulnerabilities in Graph Neural Networks (GNNs) through novel Topology Inference Attacks and introduces Private Graph Reconstruction, a defense framework that effectively protects graph structure confidentiality while preserving model accuracy.
Authors:Zijian Wang, Wei Tong, Tingxuan Han, Haoyu Chen, Tianling Zhang, Yunlong Mao, Sheng Zhong
Abstract:
Federated learning (FL) combined with local differential privacy (LDP) enables privacy-preserving model training across decentralized data sources. However, the decentralized data-management paradigm leaves LDPFL vulnerable to participants with malicious intent. The robustness of LDPFL protocols, particularly against model poisoning attacks (MPA), where adversaries inject malicious updates to disrupt global model convergence, remains insufficiently studied. In this paper, we propose a novel and extensible model poisoning attack framework tailored for LDPFL settings. Our approach is driven by the objective of maximizing the global training loss while adhering to local privacy constraints. To counter robust aggregation mechanisms such as Multi-Krum and trimmed mean, we develop adaptive attacks that embed carefully crafted constraints into a reverse training process, enabling evasion of these defenses. We evaluate our framework across three representative LDPFL protocols, three benchmark datasets, and two types of deep neural networks. Additionally, we investigate the influence of data heterogeneity and privacy budgets on attack effectiveness. Experimental results demonstrate that our adaptive attacks can significantly degrade the performance of the global model, revealing critical vulnerabilities and highlighting the need for more robust LDPFL defense strategies against MPA. Our code is available at https://github.com/ZiJW/LDPFL-Attack
中文: 本文针对本地差分隐私联邦学习系统提出了一种新型自适应模型投毒攻击框架,通过逆向训练嵌入约束条件来规避鲁棒聚合防御机制,在多种协议和数据集上显著降低全局模型性能,揭示了当前防御策略的关键脆弱性。
English: This paper introduces a novel adaptive model poisoning attack framework for LDPFL systems that bypasses robust aggregation defenses by embedding constraints through reverse training, significantly degrading global model performance across multiple protocols and datasets while highlighting critical vulnerabilities in current defense strategies.
Authors:Yanbo Wang, Yongcan Yu, Jian Liang, Ran He
Abstract:
The development of Long-CoT reasoning has advanced LLM performance across various tasks, including language understanding, complex problem solving, and code generation. This paradigm enables models to generate intermediate reasoning steps, thereby improving both accuracy and interpretability. However, despite these advancements, a comprehensive understanding of how CoT-based reasoning affects the trustworthiness of language models remains underdeveloped. In this paper, we survey recent work on reasoning models and CoT techniques, focusing on five core dimensions of trustworthy reasoning: truthfulness, safety, robustness, fairness, and privacy. For each aspect, we provide a clear and structured overview of recent studies in chronological order, along with detailed analyses of their methodologies, findings, and limitations. Future research directions are also appended at the end for reference and discussion. Overall, while reasoning techniques hold promise for enhancing model trustworthiness through hallucination mitigation, harmful content detection, and robustness improvement, cutting-edge reasoning models themselves often suffer from comparable or even greater vulnerabilities in safety, robustness, and privacy. By synthesizing these insights, we hope this work serves as a valuable and timely resource for the AI safety community to stay informed on the latest progress in reasoning trustworthiness. A full list of related papers can be found at \href{https://github.com/ybwang119/Awesome-reasoning-safety}{https://github.com/ybwang119/Awesome-reasoning-safety}.
中文: 本文综述了思维链推理对语言模型可信度在真实性、安全性、鲁棒性、公平性和隐私性五个维度的影响,发现其虽提升准确性和可解释性,但也带来脆弱性,为AI安全研究提供了全面参考。
English: This paper surveys how Chain-of-Thought reasoning impacts language model trustworthiness across five dimensions—truthfulness, safety, robustness, fairness, and privacy—finding that while it enhances accuracy and interpretability, it also introduces vulnerabilities, providing a comprehensive resource for AI safety research.
Authors:Junhui Li, Chengbin Feng, Zhiwei Yang, Qi Mo, Wei Wang
Abstract:
To identify malicious Android applications, various malware detection techniques have been proposed. Among them, image-based approaches are considered potential alternatives due to their efficiency and scalability. Recent studies have reported that these approaches suffer significant performance declines when confronted with obfuscation or concept drift. However, existing solutions often treat these two challenges as different problems, offering independent solutions. These techniques overlook the fact that both challenges share a common statistical root, out-of-distribution, and research from this perspective remains limited. In response, we propose BIDO, a hybrid image-based malware detector designed to enhance robustness against both obfuscation and concept drift simultaneously. Specifically, to improve the discriminative power of image features, we introduce a local feature selection module that identifies informative subregions within malware images. Second, to enhance feature robustness, we model pairwise cross-modal dependencies in an outer product space, enabling the extraction of stable co-occurrence patterns. Third, to ensure feature compactness, we design a learnable metric that pulls samples with identical labels closer while pushing apart those with different labels, regardless of obfuscation or concept drift. Extensive experiments on the real-world datasets demonstrate that BIDO significantly outperforms existing baselines, achieving higher robustness against both concept drift and obfuscation. The source code is available at: https://github.com/whatishope/BIDO/.
Chinese: 提出的BIDO框架通过局部特征选择、跨模态依赖建模和可学习度量,同时应对混淆和概念漂移问题,实验证明其在安卓恶意软件检测中具有更强的鲁棒性。
English: The proposed BIDO framework enhances Android malware detection by addressing obfuscation and concept drift through local feature selection, cross-modal dependency modeling, and a learnable metric, demonstrating superior robustness in experiments.
Authors:Jigang Fan, Zhenghong Zhou, Ruofan Jin, Le Cong, Mengdi Wang, Zaixi Zhang
Abstract:
Proteins play crucial roles in almost all biological processes. The advancement of deep learning has greatly accelerated the development of protein foundation models, leading to significant successes in protein understanding and design. However, the lack of systematic red-teaming for these models has raised serious concerns about their potential misuse, such as generating proteins with biological safety risks. This paper introduces SafeProtein, the first red-teaming framework designed for protein foundation models to the best of our knowledge. SafeProtein combines multimodal prompt engineering and heuristic beam search to systematically design red-teaming methods and conduct tests on protein foundation models. We also curated SafeProtein-Bench, which includes a manually constructed red-teaming benchmark dataset and a comprehensive evaluation protocol. SafeProtein achieved continuous jailbreaks on state-of-the-art protein foundation models (up to 70% attack success rate for ESM3), revealing potential biological safety risks in current protein foundation models and providing insights for the development of robust security protection technologies for frontier models. The codes will be made publicly available at https://github.com/jigang-fan/SafeProtein.
中文:本文提出了首个蛋白质基础模型红队测试框架SafeProtein,通过多模态提示工程和启发式束搜索方法,在先进模型上实现了高达70%的攻击成功率,揭示了当前蛋白质基础模型存在的生物安全风险。
English: This paper introduces SafeProtein, the first red-teaming framework for protein foundation models, which successfully exposed biological safety risks by achieving up to 70% attack success rates on state-of-the-art models through multimodal prompt engineering and heuristic beam search.
Authors:Yuchen Yang, Yiming Li, Hongwei Yao, Enhao Huang, Shuo Shao, Bingrun Yang, Zhibo Wang, Dacheng Tao, Zhan Qin
Abstract:
The rapid progress of large language models (LLMs) has greatly enhanced reasoning tasks and facilitated the development of LLM-based applications. A critical factor in improving LLM-based applications is the design of effective system prompts, which significantly impact the behavior and output quality of LLMs. However, system prompts are susceptible to theft and misuse, which could undermine the interests of prompt owners. Existing methods protect prompt copyrights through watermark injection and verification but face challenges due to their reliance on intermediate LLM outputs (e.g., logits), which limits their practical feasibility. In this paper, we propose PromptCOS, a method for auditing prompt copyright based on content-level output similarity. It embeds watermarks by optimizing the prompt while simultaneously co-optimizing a special verification query and content-level signal marks. This is achieved by leveraging cyclic output signals and injecting auxiliary tokens to ensure reliable auditing in content-only scenarios. Additionally, it incorporates cover tokens to protect the watermark from malicious deletion. For copyright verification, PromptCOS identifies unauthorized usage by comparing the similarity between the suspicious output and the signal mark. Experimental results demonstrate that our method achieves high effectiveness (99.3% average watermark similarity), strong distinctiveness (60.8% greater than the best baseline), high fidelity (accuracy degradation of no more than 0.58%), robustness (resilience against three types of potential attacks), and computational efficiency (up to 98.1% reduction in computational cost). Our code is available at GitHub https://github.com/LianPing-cyber/PromptCOS.
中文: 大语言模型的快速发展增强了对系统提示词保护的需求,PromptCOS方法通过优化提示和信号标记嵌入水印进行版权验证,确保了高效性、鲁棒性和实用性。
English: The rapid advancement of large language models has increased the need for protecting system prompts from theft, leading to the development of PromptCOS, a method that embeds watermarks for copyright verification by optimizing prompts and signal marks to ensure high effectiveness, robustness, and efficiency.
Authors:Zhenhua Xu, Meng Han, Wenpeng Xing
Abstract:
The proliferation of large language models (LLMs) has intensified concerns over model theft and license violations, necessitating robust and stealthy ownership verification. Existing fingerprinting methods either require impractical white-box access or introduce detectable statistical anomalies. We propose EverTracer, a novel gray-box fingerprinting framework that ensures stealthy and robust model provenance tracing. EverTracer is the first to repurpose Membership Inference Attacks (MIAs) for defensive use, embedding ownership signals via memorization instead of artificial trigger-output overfitting. It consists of Fingerprint Injection, which fine-tunes the model on any natural language data without detectable artifacts, and Verification, which leverages calibrated probability variation signal to distinguish fingerprinted models. This approach remains robust against adaptive adversaries, including input level modification, and model-level modifications. Extensive experiments across architectures demonstrate EverTracer's state-of-the-art effectiveness, stealthness, and resilience, establishing it as a practical solution for securing LLM intellectual property. Our code and data are publicly available at https://github.com/Xuzhenhua55/EverTracer.
中文摘要:EverTracer是一种创新的灰盒指纹框架,通过重新利用成员推理攻击在大型语言模型中嵌入隐蔽的所有权信号,无需可检测的人工痕迹即可实现稳健的知识产权保护。
English Summary: EverTracer is a novel gray-box fingerprinting framework that repurposes Membership Inference Attacks to embed stealthy ownership signals in large language models, ensuring robust intellectual property protection without detectable artifacts.
Authors:Zhenhua Xu, Zhaokun Yan, Binhan Xu, Xin Tong, Haitao Xu, Yourong Chen, Meng Han
Abstract:
With the rapid advancement of large language models (LLMs), safeguarding intellectual property (IP) has become increasingly critical. To address the challenges of high costs and potential contamination in fingerprint integration, we propose LoRA-FP, a lightweight, plug-and-play framework that embeds backdoor fingerprints into LoRA adapters through constrained fine-tuning. This design enables seamless fingerprint transplantation via parameter fusion, eliminating the need for full-parameter updates while preserving model integrity. Experimental results demonstrate that LoRA-FP not only significantly reduces computational overhead compared to conventional approaches but also achieves superior robustness across diverse scenarios, including incremental training and model fusion. Our code and datasets are publicly available at https://github.com/Xuzhenhua55/LoRA-FP.
中文: 针对大语言模型知识产权保护的挑战,LoRA-FP提出了一种轻量级框架,通过约束微调将指纹嵌入LoRA适配器,在显著降低计算成本的同时,保持了多种场景下的鲁棒性。
English: To address intellectual property protection challenges in large language models, LoRA-FP introduces a lightweight framework that embeds fingerprints into LoRA adapters through constrained fine-tuning, significantly reducing computational costs while maintaining robustness across various scenarios.
Authors:Wei Ao, Vishnu Naresh Boddeti
Abstract:
Face recognition is central to many authentication, security, and personalized applications. Yet, it suffers from significant privacy risks, particularly arising from unauthorized access to sensitive biometric data. This paper introduces CryptoFace, the first end-to-end encrypted face recognition system with fully homomorphic encryption (FHE). It enables secure processing of facial data across all stages of a face-recognition process--feature extraction, storage, and matching--without exposing raw images or features. We introduce a mixture of shallow patch convolutional networks to support higher-dimensional tensors via patch-based processing while reducing the multiplicative depth and, thus, inference latency. Parallel FHE evaluation of these networks ensures near-resolution-independent latency. On standard face recognition benchmarks, CryptoFace significantly accelerates inference and increases verification accuracy compared to the state-of-the-art FHE neural networks adapted for face recognition. CryptoFace will facilitate secure face recognition systems requiring robust and provable security. The code is available at https://github.com/human-analysis/CryptoFace.
中文:CryptoFace是首个采用全同态加密的端到端加密人脸识别系统,可在确保面部数据安全处理的同时,相比现有方法显著加速推理并提高验证准确率。
English: CryptoFace is the first end-to-end encrypted face recognition system using fully homomorphic encryption, enabling secure processing of facial data while accelerating inference and improving verification accuracy compared to existing methods.
Authors:Junjie Chu, Mingjie Li, Ziqing Yang, Ye Leng, Chenhao Lin, Chao Shen, Michael Backes, Yun Shen, Yang Zhang
Abstract:
Accurately determining whether a jailbreak attempt has succeeded is a fundamental yet unresolved challenge. Existing evaluation methods rely on misaligned proxy indicators or naive holistic judgments. They frequently misinterpret model responses, leading to inconsistent and subjective assessments that misalign with human perception. To address this gap, we introduce JADES (Jailbreak Assessment via Decompositional Scoring), a universal jailbreak evaluation framework. Its key mechanism is to automatically decompose an input harmful question into a set of weighted sub-questions, score each sub-answer, and weight-aggregate the sub-scores into a final decision. JADES also incorporates an optional fact-checking module to strengthen the detection of hallucinations in jailbreak responses. We validate JADES on JailbreakQR, a newly introduced benchmark proposed in this work, consisting of 400 pairs of jailbreak prompts and responses, each meticulously annotated by humans. In a binary setting (success/failure), JADES achieves 98.5% agreement with human evaluators, outperforming strong baselines by over 9%. Re-evaluating five popular attacks on four LLMs reveals substantial overestimation (e.g., LAA's attack success rate on GPT-3.5-Turbo drops from 93% to 69%). Our results show that JADES could deliver accurate, consistent, and interpretable evaluations, providing a reliable basis for measuring future jailbreak attacks.
中文: JADES框架通过将恶意问题分解为加权子问题并聚合评分来准确评估越狱攻击,实现了98.5%的人类评估一致性,同时揭露现有方法存在显著高估问题。
English: JADES is a novel framework that accurately assesses jailbreak attempts by decomposing harmful queries into weighted sub-questions and aggregating scores, achieving 98.5% human alignment and revealing significant overestimations in existing methods.
Authors:Stefano Fumero, Kai Huang, Matteo Boffa, Danilo Giordano, Marco Mellia, Zied Ben Houidi, Dario Rossi
Abstract:
Large Language Model (LLM) agents are powerful tools for automating complex tasks. In cybersecurity, researchers have primarily explored their use in red-team operations such as vulnerability discovery and penetration tests. Defensive uses for incident response and forensics have received comparatively less attention and remain at an early stage. This work presents a systematic study of LLM-agent design for the forensic investigation of realistic web application attacks. We propose CyberSleuth, an autonomous agent that processes packet-level traces and application logs to identify the targeted service, the exploited vulnerability (CVE), and attack success. We evaluate the consequences of core design decisions - spanning tool integration and agent architecture - and provide interpretable guidance for practitioners. We benchmark four agent architectures and six LLM backends on 20 incident scenarios of increasing complexity, identifying CyberSleuth as the best-performing design. In a separate set of 10 incidents from 2025, CyberSleuth correctly identifies the exact CVE in 80% of cases. At last, we conduct a human study with 22 experts, which rated the reports of CyberSleuth as complete, useful, and coherent. They also expressed a slight preference for DeepSeek R1, a good news for open source LLM. To foster progress in defensive LLM research, we release both our benchmark and the CyberSleuth platform as a foundation for fair, reproducible evaluation of forensic agents.
中文: 本研究提出CyberSleuth这一自主LLM代理系统,用于网络攻击取证调查,在漏洞识别方面表现优异并获专家认可,同时公开其基准平台以推动防御性网络安全研究发展。
English: This study introduces CyberSleuth, an autonomous LLM agent for forensic investigation of web attacks, which outperforms other designs in identifying CVEs and generates expert-approved reports while releasing its benchmark for defensive cybersecurity research.
Authors:Hyejun Jeong, Mohammadreza Teymoorianfard, Abhinav Kumar, Amir Houmansadr, Eugene Bagdasarian
Abstract:
We show that Web and Research Agents (WRAs) -- language model-based systems that investigate complex topics on the Internet -- are vulnerable to inference attacks by passive network adversaries such as ISPs. These agents could be deployed locally by organizations and individuals for privacy, legal, or financial purposes. Unlike sporadic web browsing by humans, WRAs visit $70{-}140$ domains with distinguishable timing correlations, enabling unique fingerprinting attacks. Specifically, we demonstrate a novel prompt and user trait leakage attack against WRAs that only leverages their network-level metadata (i.e., visited IP addresses and their timings). We start by building a new dataset of WRA traces based on user search queries and queries generated by synthetic personas. We define a behavioral metric (called OBELS) to comprehensively assess similarity between original and inferred prompts, showing that our attack recovers over 73% of the functional and domain knowledge of user prompts. Extending to a multi-session setting, we recover up to 19 of 32 latent traits with high accuracy. Our attack remains effective under partial observability and noisy conditions. Finally, we discuss mitigation strategies that constrain domain diversity or obfuscate traces, showing negligible utility impact while reducing attack effectiveness by an average of 29%.
中文: 网络与研究代理(WRA)易受网络层面的推理攻击,通过分析其独特的浏览模式可泄露用户提示和特征,而提出的缓解策略能在不影响实用性的情况下平均降低29%的攻击效果。
English: Web and Research Agents (WRAs) are susceptible to network-level inference attacks that can leak user prompts and traits by analyzing their distinctive browsing patterns, with proposed mitigation strategies reducing attack effectiveness by 29% without significant utility loss.
Authors:Xia Han, Qi Li, Jianbing Ni, Mohammad Zulkernine
Abstract:
Recent advances in LLM watermarking methods such as SynthID-Text by Google DeepMind offer promising solutions for tracing the provenance of AI-generated text. However, our robustness assessment reveals that SynthID-Text is vulnerable to meaning-preserving attacks, such as paraphrasing, copy-paste modifications, and back-translation, which can significantly degrade watermark detectability. To address these limitations, we propose SynGuard, a hybrid framework that combines the semantic alignment strength of Semantic Information Retrieval (SIR) with the probabilistic watermarking mechanism of SynthID-Text. Our approach jointly embeds watermarks at both lexical and semantic levels, enabling robust provenance tracking while preserving the original meaning. Experimental results across multiple attack scenarios show that SynGuard improves watermark recovery by an average of 11.1\% in F1 score compared to SynthID-Text. These findings demonstrate the effectiveness of semantic-aware watermarking in resisting real-world tampering. All code, datasets, and evaluation scripts are publicly available at: https://github.com/githshine/SynGuard.
Chinese Summary: SynGuard混合框架将语义信息检索与SynthID-Text水印技术相结合,通过在词汇和语义层面双重嵌入水印,显著提升了对抗语义保持攻击的鲁棒性,使水印恢复的F1分数平均提高11.1%。
English Summary: SynGuard, a hybrid framework combining semantic information retrieval with SynthID-Text's watermarking, significantly enhances robustness against meaning-preserving attacks by embedding watermarks at both lexical and semantic levels, improving watermark recovery by 11.1% in F1 score.
Authors:Shuo Shao, Yiming Li, Yu He, Hongwei Yao, Wenyuan Yang, Dacheng Tao, Zhan Qin
Abstract:
The broad capabilities and substantial resources required to train Large Language Models (LLMs) make them valuable intellectual property, yet they remain vulnerable to copyright infringement, such as unauthorized use and model theft. LLM fingerprinting, a non-intrusive technique that extracts and compares the distinctive features from LLMs to identify infringements, offers a promising solution to copyright auditing. However, its reliability remains uncertain due to the prevalence of diverse model modifications and the lack of standardized evaluation. In this SoK, we present the first comprehensive study of LLM fingerprinting. We introduce a unified framework and formal taxonomy that categorizes existing methods into white-box and black-box approaches, providing a structured overview of the state of the art. We further propose LeaFBench, the first systematic benchmark for evaluating LLM fingerprinting under realistic deployment scenarios. Built upon mainstream foundation models and comprising 149 distinct model instances, LeaFBench integrates 13 representative post-development techniques, spanning both parameter-altering methods (e.g., fine-tuning, quantization) and parameter-independent mechanisms (e.g., system prompts, RAG). Extensive experiments on LeaFBench reveal the strengths and weaknesses of existing methods, thereby outlining future research directions and critical open problems in this emerging field. The code is available at https://github.com/shaoshuo-ss/LeaFBench.
中文: 本文首次对大型语言模型指纹识别进行全面研究,提出了统一框架和LeaFBench基准测试,评估其在模型修改下的可靠性,揭示了现有方法的局限性和未来研究方向。
English: This paper presents the first comprehensive study of LLM fingerprinting, introducing a unified framework and LeaFBench benchmark to evaluate its reliability against model modifications, revealing current methods' limitations and future research needs.
Authors:Lincan Li, Bolin Shen, Chenxi Zhao, Yuxiang Sun, Kaixiang Zhao, Shirui Pan, Yushun Dong
Abstract:
Graph-structured data, which captures non-Euclidean relationships and interactions between entities, is growing in scale and complexity. As a result, training state-of-the-art graph machine learning (GML) models have become increasingly resource-intensive, turning these models and data into invaluable Intellectual Property (IP). To address the resource-intensive nature of model training, graph-based Machine-Learning-as-a-Service (GMLaaS) has emerged as an efficient solution by leveraging third-party cloud services for model development and management. However, deploying such models in GMLaaS also exposes them to potential threats from attackers. Specifically, while the APIs within a GMLaaS system provide interfaces for users to query the model and receive outputs, they also allow attackers to exploit and steal model functionalities or sensitive training data, posing severe threats to the safety of these GML models and the underlying graph data. To address these challenges, this survey systematically introduces the first taxonomy of threats and defenses at the level of both GML model and graph-structured data. Such a tailored taxonomy facilitates an in-depth understanding of GML IP protection. Furthermore, we present a systematic evaluation framework to assess the effectiveness of IP protection methods, introduce a curated set of benchmark datasets across various domains, and discuss their application scopes and future challenges. Finally, we establish an open-sourced versatile library named PyGIP, which evaluates various attack and defense techniques in GMLaaS scenarios and facilitates the implementation of existing benchmark methods. The library resource can be accessed at: https://labrai.github.io/PyGIP. We believe this survey will play a fundamental role in intellectual property protection for GML and provide practical recipes for the GML community.
Authors:Zhixin Lin, Jungang Li, Shidong Pan, Yibo Shi, Yue Yao, Dongliang Xu
Abstract:
Smartphones bring significant convenience to users but also enable devices to extensively record various types of personal information. Existing smartphone agents powered by Multimodal Large Language Models (MLLMs) have achieved remarkable performance in automating different tasks. However, as the cost, these agents are granted substantial access to sensitive users' personal information during this operation. To gain a thorough understanding of the privacy awareness of these agents, we present the first large-scale benchmark encompassing 7,138 scenarios to the best of our knowledge. In addition, for privacy context in scenarios, we annotate its type (e.g., Account Credentials), sensitivity level, and location. We then carefully benchmark seven available mainstream smartphone agents. Our results demonstrate that almost all benchmarked agents show unsatisfying privacy awareness (RA), with performance remaining below 60% even with explicit hints. Overall, closed-source agents show better privacy ability than open-source ones, and Gemini 2.0-flash achieves the best, achieving an RA of 67%. We also find that the agents' privacy detection capability is highly related to scenario sensitivity level, i.e., the scenario with a higher sensitivity level is typically more identifiable. We hope the findings enlighten the research community to rethink the unbalanced utility-privacy tradeoff about smartphone agents. Our code and benchmark are available at https://zhixin-l.github.io/SAPA-Bench.
Authors:Xi Wang, Songlei Jian, Shasha Li, Xiaopeng Li, Bin Ji, Jun Ma, Xiaodong Liu, Jing Wang, Feilong Bao, Jianfeng Zhang, Baosheng Wang, Jie Yu
Abstract:
Large language models (LLMs) generate human-aligned content under certain safety constraints. However, the current known technique ``jailbreak prompt'' can circumvent safety-aligned measures and induce LLMs to output malicious content. Research on Jailbreaking can help identify vulnerabilities in LLMs and guide the development of robust security frameworks. To circumvent the issue of attack templates becoming obsolete as models evolve, existing methods adopt iterative mutation and dynamic optimization to facilitate more automated jailbreak attacks. However, these methods face two challenges: inefficiency and repetitive optimization, as they overlook the value of past attack experiences. To better integrate past attack experiences to assist current jailbreak attempts, we propose the \textbf{JailExpert}, an automated jailbreak framework, which is the first to achieve a formal representation of experience structure, group experiences based on semantic drift, and support the dynamic updating of the experience pool. Extensive experiments demonstrate that JailExpert significantly improves both attack effectiveness and efficiency. Compared to the current state-of-the-art black-box jailbreak methods, JailExpert achieves an average increase of 17\% in attack success rate and 2.7 times improvement in attack efficiency. Our implementation is available at \href{https://github.com/xiZAIzai/JailExpert}{XiZaiZai/JailExpert}
中文: JailExpert是一种创新的自动化框架,通过有效利用过往攻击经验来增强对大型语言模型的越狱攻击,相比现有方法,攻击成功率平均提高17%,攻击效率提升2.7倍。
English: JailExpert is an innovative automated framework that enhances jailbreak attacks on large language models by effectively utilizing past attack experiences, achieving a 17% higher success rate and 2.7 times greater efficiency compared to existing methods.
Authors:Tongxi Wu, Chenwei Xu, Jin Yang
Abstract:
The proliferation of cloud-integrated IoT systems has intensified exposure to Distributed Denial of Service (DDoS) attacks due to the expanded attack surface, heterogeneous device behaviors, and limited edge protection. However, DDoS detection in this context remains challenging because of complex traffic dynamics, severe class imbalance, and scarce labeled data. While recent methods have explored solutions to address class imbalance, many still struggle to generalize under limited supervision and dynamic traffic conditions. To overcome these challenges, we propose MixGAN, a hybrid detection method that integrates conditional generation, semi-supervised learning, and robust feature extraction. Specifically, to handle complex temporal traffic patterns, we design a 1-D WideResNet backbone composed of temporal convolutional layers with residual connections, which effectively capture local burst patterns in traffic sequences. To alleviate class imbalance and label scarcity, we use a pretrained CTGAN to generate synthetic minority-class (DDoS attack) samples that complement unlabeled data. Furthermore, to mitigate the effect of noisy pseudo-labels, we introduce a MixUp-Average-Sharpen (MAS) strategy that constructs smoothed and sharpened targets by averaging predictions over augmented views and reweighting them towards high-confidence classes. Experiments on NSL-KDD, BoT-IoT, and CICIoT2023 demonstrate that MixGAN achieves up to 2.5% higher accuracy and 4% improvement in both TPR and TNR compared to state-of-the-art methods, confirming its robustness in large-scale IoT-cloud environments. The source code is publicly available at https://github.com/0xCavaliers/MixGAN.
中文:提出的MixGAN方法通过结合时序模式分析、合成数据生成和抗噪标签策略,有效解决了云物联网系统中的DDoS检测难题,其性能显著优于现有方法。
English: The proposed MixGAN method effectively addresses DDoS detection challenges in cloud-IoT systems by integrating temporal pattern analysis with synthetic data generation and noise-resistant labeling, achieving superior performance over existing approaches.
Authors:Rui Zhang, Zihan Wang, Tianli Yang, Hongwei Li, Wenbo Jiang, Qingchuan Zhao, Yang Liu, Guowen Xu
Abstract:
Vision-Language Models (VLMs) are increasingly deployed in real-world applications, but their high inference cost makes them vulnerable to resource consumption attacks. Prior attacks attempt to extend VLM output sequences by optimizing adversarial images, thereby increasing inference costs. However, these extended outputs often introduce irrelevant abnormal content, compromising attack stealthiness. This trade-off between effectiveness and stealthiness poses a major limitation for existing attacks. To address this challenge, we propose \textit{Hidden Tail}, a stealthy resource consumption attack that crafts prompt-agnostic adversarial images, inducing VLMs to generate maximum-length outputs by appending special tokens invisible to users. Our method employs a composite loss function that balances semantic preservation, repetitive special token induction, and suppression of the end-of-sequence (EOS) token, optimized via a dynamic weighting strategy. Extensive experiments show that \textit{Hidden Tail} outperforms existing attacks, increasing output length by up to 19.2$\times$ and reaching the maximum token limit, while preserving attack stealthiness. These results highlight the urgent need to improve the robustness of VLMs against efficiency-oriented adversarial threats. Our code is available at https://github.com/zhangrui4041/Hidden_Tail.
中文: "隐尾"攻击通过生成对抗性图像诱导视觉语言模型重复生成特殊标记,在保持语义连贯性的同时将输出长度提升19.2倍,实现了高效且隐蔽的资源消耗攻击。
English: The proposed "Hidden Tail" attack stealthily maximizes Vision-Language Model inference costs by generating adversarial images that induce repetitive special tokens while preserving semantic coherence, achieving 19.2× output elongation without compromising stealth.
Authors:Nanxi Li, Zhengyue Zhao, Chaowei Xiao
Abstract:
Safeguarding vision-language models (VLMs) is a critical challenge, as existing methods often suffer from over-defense, which harms utility, or rely on shallow alignment, failing to detect complex threats that require deep reasoning. To this end, we introduce PRISM (Principled Reasoning for Integrated Safety in Multimodality), a system2-like framework that aligns VLMs by embedding a structured, safety-aware reasoning process. Our framework consists of two key components: PRISM-CoT, a dataset that teaches safety-aware chain-of-thought reasoning, and PRISM-DPO, generated via Monte Carlo Tree Search (MCTS) to further refine this reasoning through Direct Preference Optimization to help obtain a delicate safety boundary. Comprehensive evaluations demonstrate PRISM's effectiveness, achieving remarkably low attack success rates including 0.15% on JailbreakV-28K for Qwen2-VL and 90% improvement over the previous best method on VLBreak for LLaVA-1.5. PRISM also exhibits strong robustness against adaptive attacks, significantly increasing computational costs for adversaries, and generalizes effectively to out-of-distribution challenges, reducing attack success rates to just 8.70% on the challenging multi-image MIS benchmark. Remarkably, this robust defense is achieved while preserving, and in some cases enhancing, model utility. To promote reproducibility, we have made our code, data, and model weights available at https://github.com/SaFoLab-WISC/PRISM.
中文: PRISM是一个创新框架,通过嵌入结构化推理过程来增强视觉语言模型的安全性,在保持甚至提升模型实用性的同时,实现了对复杂威胁的强大防御能力。
English: PRISM is a novel framework that enhances the safety of vision-language models by embedding structured reasoning processes, achieving robust defense against complex threats while maintaining or even improving model utility.
Authors:Wenxuan Bao, Vincent Bindschaedler
Abstract:
There is a flurry of recent research papers proposing novel differentially private machine learning (DPML) techniques. These papers claim to achieve new state-of-the-art (SoTA) results and offer empirical results as validation. However, there is no consensus on which techniques are most effective or if they genuinely meet their stated claims. Complicating matters, heterogeneity in codebases, datasets, methodologies, and model architectures make direct comparisons of different approaches challenging.
In this paper, we conduct a reproducibility and replicability (R+R) experiment on 11 different SoTA DPML techniques from the recent research literature. Results of our investigation are varied: while some methods stand up to scrutiny, others falter when tested outside their initial experimental conditions. We also discuss challenges unique to the reproducibility of DPML, including additional randomness due to DP noise, and how to address them. Finally, we derive insights and best practices to obtain scientifically valid and reliable results.
中文: 针对近期差分隐私机器学习研究中缺乏有效技术共识的问题,本文通过复现11种前沿方法发现其表现参差不齐,并探讨了由隐私噪声引发的可复现性挑战,最终提出了确保结果科学可靠的最佳实践。
English: Recent research on differentially private machine learning (DPML) lacks consensus on the effectiveness of proposed techniques, prompting a reproducibility study of 11 state-of-the-art methods that reveals varied performance and discusses challenges like DP noise to derive best practices.
Authors:Kaixiang Zhao, Lincan Li, Kaize Ding, Neil Zhenqiang Gong, Yue Zhao, Yushun Dong
Abstract:
Machine learning (ML) models have significantly grown in complexity and utility, driving advances across multiple domains. However, substantial computational resources and specialized expertise have historically restricted their wide adoption. Machine-Learning-as-a-Service (MLaaS) platforms have addressed these barriers by providing scalable, convenient, and affordable access to sophisticated ML models through user-friendly APIs. While this accessibility promotes widespread use of advanced ML capabilities, it also introduces vulnerabilities exploited through Model Extraction Attacks (MEAs). Recent studies have demonstrated that adversaries can systematically replicate a target model's functionality by interacting with publicly exposed interfaces, posing threats to intellectual property, privacy, and system security. In this paper, we offer a comprehensive survey of MEAs and corresponding defense strategies. We propose a novel taxonomy that classifies MEAs according to attack mechanisms, defense approaches, and computing environments. Our analysis covers various attack techniques, evaluates their effectiveness, and highlights challenges faced by existing defenses, particularly the critical trade-off between preserving model utility and ensuring security. We further assess MEAs within different computing paradigms and discuss their technical, ethical, legal, and societal implications, along with promising directions for future research. This systematic survey aims to serve as a valuable reference for researchers, practitioners, and policymakers engaged in AI security and privacy. Additionally, we maintain an online repository continuously updated with related literature at https://github.com/kzhao5/ModelExtractionPapers.
中文摘要:本文系统综述了通过机器学习即服务平台窃取模型功能的提取攻击,提出了新型分类法,分析了攻击技术、防御策略及其多维影响,重点探讨了模型效用与安全保障之间的关键平衡问题。
English Summary: This paper surveys Model Extraction Attacks (MEAs) that exploit MLaaS platforms to replicate proprietary models, proposing a novel taxonomy and analyzing attack techniques, defense strategies, and their broader implications while highlighting the security-utility trade-off.
Authors:Hanwen Cao, Haobo Lu, Xiaosen Wang, Kun He
Abstract:
Ensemble-based attacks have been proven to be effective in enhancing adversarial transferability by aggregating the outputs of models with various architectures. However, existing research primarily focuses on refining ensemble weights or optimizing the ensemble path, overlooking the exploration of ensemble models to enhance the transferability of adversarial attacks. To address this gap, we propose applying adversarial augmentation to the surrogate models, aiming to boost overall generalization of ensemble models and reduce the risk of adversarial overfitting. Meanwhile, observing that ensemble Vision Transformers (ViTs) gain less attention, we propose ViT-EnsembleAttack based on the idea of model adversarial augmentation, the first ensemble-based attack method tailored for ViTs to the best of our knowledge. Our approach generates augmented models for each surrogate ViT using three strategies: Multi-head dropping, Attention score scaling, and MLP feature mixing, with the associated parameters optimized by Bayesian optimization. These adversarially augmented models are ensembled to generate adversarial examples. Furthermore, we introduce Automatic Reweighting and Step Size Enlargement modules to boost transferability. Extensive experiments demonstrate that ViT-EnsembleAttack significantly enhances the adversarial transferability of ensemble-based attacks on ViTs, outperforming existing methods by a substantial margin. Code is available at https://github.com/Trustworthy-AI-Group/TransferAttack.
中文: 提出的ViT-EnsembleAttack通过多头丢弃、注意力缩放和MLP混合等对抗增强策略,结合自动重加权和步长放大模块,显著提升了视觉Transformer的对抗迁移能力,大幅优于现有方法。
English: The proposed ViT-EnsembleAttack enhances adversarial transferability for Vision Transformers by applying adversarial augmentation through multi-head dropping, attention scaling, and MLP mixing, combined with automatic reweighting and step size enlargement, significantly outperforming existing methods.
Authors:Wei Jie Yeo, Ranjan Satapathy, Erik Cambria
Abstract:
Despite extensive safety-tuning, large language models (LLMs) remain vulnerable to jailbreak attacks via adversarially crafted instructions, reflecting a persistent trade-off between safety and task performance. In this work, we propose Intent-FT, a simple and lightweight fine-tuning approach that explicitly trains LLMs to infer the underlying intent of an instruction before responding. By fine-tuning on a targeted set of adversarial instructions, Intent-FT enables LLMs to generalize intent deduction to unseen attacks, thereby substantially improving their robustness. We comprehensively evaluate both parametric and non-parametric attacks across open-source and proprietary models, considering harmfulness from attacks, utility, over-refusal, and impact against white-box threats. Empirically, Intent-FT consistently mitigates all evaluated attack categories, with no single attack exceeding a 50\% success rate -- whereas existing defenses remain only partially effective. Importantly, our method preserves the model's general capabilities and reduces excessive refusals on benign instructions containing superficially harmful keywords. Furthermore, models trained with Intent-FT accurately identify hidden harmful intent in adversarial attacks, and these learned intentions can be effectively transferred to enhance vanilla model defenses. We publicly release our code at https://github.com/wj210/Intent_Jailbreak.
中文: 提出的Intent-FT方法通过训练大语言模型在回复前推断用户意图,显著提升了模型安全性,将越狱攻击成功率降至50%以下,同时保持了实用功能并减少了过度拒绝。
English: The proposed Intent-FT method enhances LLM safety by training models to infer user intent before responding, effectively reducing jailbreak success rates below 50% while maintaining utility and reducing over-refusal.
Authors:Chiyu Zhang, Lu Zhou, Xiaogang Xu, Jiafei Wu, Liming Fang, Zhe Liu
Abstract:
Evaluating jailbreak attacks is challenging when prompts are not overtly harmful or fail to induce harmful outputs. Unfortunately, many existing red-teaming datasets contain such unsuitable prompts. To evaluate attacks accurately, these datasets need to be assessed and cleaned for maliciousness. However, existing malicious content detection methods rely on either manual annotation, which is labor-intensive, or large language models (LLMs), which have inconsistent accuracy in harmful types. To balance accuracy and efficiency, we propose a hybrid evaluation framework named MDH (Malicious content Detection based on LLMs with Human assistance) that combines LLM-based annotation with minimal human oversight, and apply it to dataset cleaning and detection of jailbroken responses. Furthermore, we find that well-crafted developer messages can significantly boost jailbreak success, leading us to propose two new strategies: D-Attack, which leverages context simulation, and DH-CoT, which incorporates hijacked chains of thought. The Codes, datasets, judgements, and detection results will be released in github repository: https://github.com/AlienZhang1996/DH-CoT.
中文: 本研究提出了MDH混合框架,结合大语言模型检测与少量人工监督来高效清理数据集和识别越狱攻击,同时提出D-Attack和DH-CoT两种新策略,通过上下文模拟和劫持思维链显著提升攻击成功率。
English: This study introduces MDH, a hybrid framework combining LLM-based detection with minimal human oversight to efficiently clean datasets and identify jailbreak attacks, while also proposing two novel strategies—D-Attack and DH-CoT—that enhance attack success through context simulation and hijacked reasoning.
Authors:Pallavi Zambare, Venkata Nikhil Thanikella, Nikhil Padmanabh Kottur, Sree Akhil Akula, Ying Liu
Abstract:
In this paper, we present NetMoniAI, an agentic AI framework for automatic network monitoring and security that integrates decentralized analysis with lightweight centralized coordination. The framework consists of two layers: autonomous micro-agents at each node perform local traffic analysis and anomaly detection. A central controller then aggregates insights across nodes to detect coordinated attacks and maintain system-wide situational awareness. We evaluated NetMoniAI on a local micro-testbed and through NS-3 simulations. Results confirm that the two-tier agentic-AI design scales under resource constraints, reduces redundancy, and improves response time without compromising accuracy. To facilitate broader adoption and reproducibility, the complete framework is available as open source. This enables researchers and practitioners to replicate, validate, and extend it across diverse network environments and threat scenarios. Github link: https://github.com/pzambare3/NetMoniAI
中文: NetMoniAI是一个双层智能体AI框架,通过节点分散分析与中央协调相结合实现高效网络监控,在保证可扩展性和准确性的同时提升威胁检测能力,并已开源以促进广泛应用。
English: NetMoniAI is a two-tier agentic AI framework for network monitoring that combines decentralized node-level analysis with centralized coordination to efficiently detect threats while maintaining scalability and accuracy, and it is available as open source for broader use.
Authors:Aayush Gupta
Abstract:
Large language models (LLMs) remain acutely vulnerable to prompt injection and related jailbreak attacks; heuristic guardrails (rules, filters, LLM judges) are routinely bypassed. We present Contextual Integrity Verification (CIV), an inference-time security architecture that attaches cryptographically signed provenance labels to every token and enforces a source-trust lattice inside the transformer via a pre-softmax hard attention mask (with optional FFN/residual gating). CIV provides deterministic, per-token non-interference guarantees on frozen models: lower-trust tokens cannot influence higher-trust representations. On benchmarks derived from recent taxonomies of prompt-injection vectors (Elite-Attack + SoK-246), CIV attains 0% attack success rate under the stated threat model while preserving 93.1% token-level similarity and showing no degradation in model perplexity on benign tasks; we note a latency overhead attributable to a non-optimized data path. Because CIV is a lightweight patch -- no fine-tuning required -- we demonstrate drop-in protection for Llama-3-8B and Mistral-7B. We release a reference implementation, an automated certification harness, and the Elite-Attack corpus to support reproducible research.
Large language models are highly susceptible to prompt injection attacks, but the proposed Contextual Integrity Verification (CIV) architecture provides deterministic security by cryptographically labeling tokens and enforcing trust hierarchies, achieving perfect attack prevention with minimal performance impact.
English Summary:
Authors:Abu Shafin Mohammad Mahdee Jameel, Shreya Ghosh, Aly El Gamal
Abstract:
Intrusion Detection Systems (IDS) are a vital part of a network-connected device. In this paper, we develop a deep learning based intrusion detection system that is deployed in a distributed setup across devices connected to a network. Our aim is to better equip deep learning models against unknown attacks using knowledge from known attacks. To this end, we develop algorithms to maximize the number of transferability relationships. We propose a Convolutional Neural Network (CNN) model, along with two algorithms that maximize the number of relationships observed. One is a two step data pre-processing stage, and the other is a Block-Based Smart Aggregation (BBSA) algorithm. The proposed system succeeds in achieving superior transferability performance while maintaining impressive local detection rates. We also show that our method is generalizable, exhibiting transferability potential across datasets and even with different backbones. The code for this work can be found at https://github.com/ghosh64/tabfidsv2.
中文: 本文提出了一种基于深度学习的分布式入侵检测系统,采用卷积神经网络和新型算法来增强对未知攻击的迁移学习能力,在保持高检测率的同时,展现了跨数据集和模型架构的通用性。
English: This paper presents a distributed deep learning-based intrusion detection system that utilizes a Convolutional Neural Network and novel algorithms to enhance transferability against unknown attacks while maintaining high detection rates, demonstrating generalizability across datasets and model architectures.
Authors:Shreya Ghosh, Abu Shafin Mohammad Mahdee Jameel, Aly El Gamal
Abstract:
Intrusion Detection Systems (IDS) have an increasingly important role in preventing exploitation of network vulnerabilities by malicious actors. Recent deep learning based developments have resulted in significant improvements in the performance of IDS systems. In this paper, we present FetFIDS, where we explore the employment of feature embedding instead of positional embedding to improve intrusion detection performance of a transformer based deep learning system. Our model is developed with the aim of deployments in edge learning scenarios, where federated learning over multiple communication rounds can ensure both privacy and localized performance improvements. FetFIDS outperforms multiple state-of-the-art intrusion detection systems in a federated environment and demonstrates a high degree of suitability to federated learning. The code for this work can be found at https://github.com/ghosh64/fetfids.
中文: FetFIDS通过采用特征嵌入的Transformer模型,在联邦学习环境中显著提升了入侵检测性能,优于现有系统并兼顾隐私保护与本地化优化。
English: FetFIDS enhances intrusion detection in federated learning environments by using feature embedding in a transformer model, outperforming existing systems while ensuring privacy and localized improvements.
Authors:Yuchu Jiang, Jian Zhao, Yuchen Yuan, Tianle Zhang, Yao Huang, Yanghao Zhang, Yan Wang, Yanshu Li, Xizhong Guo, Yusheng Zhao, Jun Zhang, Zhi Zhang, Xiaojian Lin, Yixiu Zou, Haoxuan Ma, Yuhu Shang, Yuzhi Hu, Keshu Cai, Ruochen Zhang, Boyuan Chen, Yilan Gao, Ziheng Jiao, Yi Qin, Shuangjun Du, Xiao Tong, Zhekun Liu, Yu Chen, Xuankun Rong, Rui Wang, Yejie Zheng, Zhaoxin Fan, Murat Sensoy, Hongyuan Zhang, Pan Zhou, Lei Jin, Hao Zhao, Xu Yang, Jiaojiao Zhao, Jianshu Li, Joey Tianyi Zhou, Zhi-Qi Cheng, Longtao Huang, Zhiyi Liu, Zheng Zhu, Jianan Li, Gang Wang, Qi Li, Xu-Yao Zhang, Yaodong Yang, Mang Ye, Wenqi Ren, Zhaofeng He, Hang Su, Rongrong Ni, Liping Jing, Xingxing Wei, Junliang Xing, Massimo Alioto, Shengmei Shen, Petia Radeva, Dacheng Tao, Ya-Qin Zhang, Shuicheng Yan, Chi Zhang, Zhongjiang He, Xuelong Li
Abstract:
The rapid advancement of AI has expanded its capabilities across domains, yet introduced critical technical vulnerabilities, such as algorithmic bias and adversarial sensitivity, that pose significant societal risks, including misinformation, inequity, security breaches, physical harm, and eroded public trust. These challenges highlight the urgent need for robust AI governance. We propose a comprehensive framework integrating technical and societal dimensions, structured around three interconnected pillars: Intrinsic Security (system reliability), Derivative Security (real-world harm mitigation), and Social Ethics (value alignment and accountability). Uniquely, our approach unifies technical methods, emerging evaluation benchmarks, and policy insights to promote transparency, accountability, and trust in AI systems. Through a systematic review of over 300 studies, we identify three core challenges: (1) the generalization gap, where defenses fail against evolving threats; (2) inadequate evaluation protocols that overlook real-world risks; and (3) fragmented regulations leading to inconsistent oversight. These shortcomings stem from treating governance as an afterthought, rather than a foundational design principle, resulting in reactive, siloed efforts that fail to address the interdependence of technical integrity and societal trust. To overcome this, we present an integrated research agenda that bridges technical rigor with social responsibility. Our framework offers actionable guidance for researchers, engineers, and policymakers to develop AI systems that are not only robust and secure but also ethically aligned and publicly trustworthy. The accompanying repository is available at https://github.com/ZTianle/Awesome-AI-SG.
中文摘要:该摘要提出一个综合的人工智能治理框架,通过内在安全、衍生安全和社会伦理三大支柱解决算法偏见和安全威胁等技术漏洞,旨在弥合技术稳健性与社会信任之间的鸿沟。
English Summary: The abstract proposes a comprehensive AI governance framework addressing technical vulnerabilities like bias and security threats through three pillars—Intrinsic Security, Derivative Security, and Social Ethics—to bridge the gap between technical robustness and societal trust.
Authors:Zelin Li, Ruohan Zong, Yifan Liu, Ruichen Yao, Yaokun Liu, Yang Zhang, Dong Wang
Abstract:
With the advancement of personalized image generation technologies, concerns about forgery attacks that infringe on portrait rights and privacy are growing. To address these concerns, protection perturbation algorithms have been developed to disrupt forgery generation. However, the protection algorithms would become ineffective when forgery attackers apply purification techniques to bypass the protection. To address this issue, we present a novel approach, Anti-Tamper Perturbation (ATP). ATP introduces a tamper-proof mechanism within the perturbation. It consists of protection and authorization perturbations, where the protection perturbation defends against forgery attacks, while the authorization perturbation detects purification-based tampering. Both protection and authorization perturbations are applied in the frequency domain under the guidance of a mask, ensuring that the protection perturbation does not disrupt the authorization perturbation. This design also enables the authorization perturbation to be distributed across all image pixels, preserving its sensitivity to purification-based tampering. ATP demonstrates its effectiveness in defending forgery attacks across various attack settings through extensive experiments, providing a robust solution for protecting individuals' portrait rights and privacy. Our code is available at: https://github.com/Seeyn/Anti-Tamper-Perturbation .
中文: 提出的抗篡改扰动(ATP)方法在频域中结合保护性扰动和授权性扰动,既能防御伪造攻击,又能检测基于净化的篡改行为,为保护肖像权和隐私提供了可靠方案。
English: The proposed Anti-Tamper Perturbation (ATP) method combines protection and authorization perturbations in the frequency domain to defend against forgery attacks while detecting purification-based tampering, offering a robust solution for safeguarding portrait rights and privacy.
Authors:Zhengxian Wu, Juan Wen, Wanli Peng, Haowei Chang, Yinghan Zhou, Yiming Xue
Abstract:
With the development of customized large language model (LLM) agents, a new threat of black-box backdoor attacks has emerged, where malicious instructions are injected into hidden system prompts. These attacks easily bypass existing defenses that rely on white-box access, posing a serious security challenge. To address this, we propose SLIP, a Soft Label mechanism and key-extraction-guided CoT-based defense against Instruction backdoors in APIs. SLIP is designed based on two key insights. First, to counteract the model's oversensitivity to triggers, we propose a Key-extraction-guided Chain-of-Thought (KCoT). Instead of only considering the single trigger or the input sentence, KCoT prompts the agent to extract task-relevant key phrases. Second, to guide the LLM toward correct answers, our proposed Soft Label Mechanism (SLM) prompts the agent to quantify the semantic correlation between key phrases and candidate answers. Crucially, to mitigate the influence of residual triggers or misleading content in phrases extracted by KCoT, which typically causes anomalous scores, SLM excludes anomalous scores deviating significantly from the mean and subsequently averages the remaining scores to derive a more reliable semantic representation. Extensive experiments on classification and question-answer (QA) tasks demonstrate that SLIP is highly effective, reducing the average attack success rate (ASR) from 90.2% to 25.13% while maintaining high accuracy on clean data and outperforming state-of-the-art defenses. Our code are available in https://github.com/CAU-ISS-Lab/Backdoor-Attack-Defense-LLMs/tree/main/SLIP.
中文: 针对大语言模型的黑盒后门攻击,SLIP提出基于关键词提取的思维链和软标签机制,显著降低攻击成功率并保持干净数据的高准确度。
English: To counter black-box backdoor attacks in LLM agents, SLIP introduces a Key-extraction-guided Chain-of-Thought and Soft Label Mechanism, effectively reducing attack success rates while preserving clean data accuracy.
Authors:Kai Yao, Marc Juarez
Abstract:
Generative models are increasingly adopted in high-stakes domains, yet current deployments offer no mechanisms to verify whether a given output truly originates from the certified model. We address this gap by extending model fingerprinting techniques beyond the traditional collaborative setting to one where the model provider itself may act adversarially, replacing the certified model with a cheaper or lower-quality substitute. To our knowledge, this is the first work to study fingerprinting for provenance attribution under such a threat model. Our approach introduces a trusted verifier that, during a certification phase, extracts hidden fingerprints from the authentic model's output space and trains a detector to recognize them. During verification, this detector can determine whether new outputs are consistent with the certified model, without requiring specialized hardware or model modifications. In extensive experiments, our methods achieve near-zero FPR@95%TPR on both GANs and diffusion models, and remain effective even against subtle architectural or training changes. Furthermore, the approach is robust to adaptive adversaries that actively manipulate outputs in an attempt to evade detection.
中文摘要:本研究提出了一种指纹识别方法,用于验证生成模型输出是否来自认证模型,即使提供商可能替换模型,也能在不改变硬件的情况下实现高精度检测。
English Summary: This study introduces a fingerprinting method to verify if generative model outputs originate from certified models, even when providers may substitute them, achieving high detection accuracy without hardware changes.
Authors:Jinjia Peng, Zeze Tao, Huibing Wang, Meng Wang, Yang Wang
Abstract:
Deep neural networks are susceptible to adversarial examples while suffering from incorrect predictions via imperceptible perturbations. Transfer-based attacks create adversarial examples for surrogate models and transfer these examples to target models under black-box scenarios. Recent studies reveal that adversarial examples in flat loss landscapes exhibit superior transferability to alleviate overfitting on surrogate models. However, the prior arts overlook the influence of perturbation directions, resulting in limited transferability. In this paper, we propose a novel attack method, named Residual Perturbation Attack (ResPA), relying on the residual gradient as the perturbation direction to guide the adversarial examples toward the flat regions of the loss function. Specifically, ResPA conducts an exponential moving average on the input gradients to obtain the first moment as the reference gradient, which encompasses the direction of historical gradients. Instead of heavily relying on the local flatness that stems from the current gradients as the perturbation direction, ResPA further considers the residual between the current gradient and the reference gradient to capture the changes in the global perturbation direction. The experimental results demonstrate the better transferability of ResPA than the existing typical transfer-based attack methods, while the transferability can be further improved by combining ResPA with the current input transformation methods. The code is available at https://github.com/ZezeTao/ResPA.
Chinese: 提出的残差扰动攻击(ResPA)通过利用残差梯度引导扰动朝向损失函数的平坦区域,显著提升了对抗样本的可迁移性,优于现有方法,并结合输入变换技术进一步增强了攻击效果。
English: The proposed Residual Perturbation Attack (ResPA) enhances adversarial transferability by using residual gradients to guide perturbations toward flat loss landscapes, outperforming existing methods and further improving when combined with input transformations.
Authors:Jing Wang, Zheng Li, Lei Li, Fan He, Liyu Lin, Yao Lai, Yan Li, Xiaoyang Zeng, Yufeng Guo
Abstract:
Recent years have witnessed growing interest in adopting large language models (LLMs) for Register Transfer Level (RTL) code optimization. While powerful cloud-based LLMs offer superior optimization capabilities, they pose unacceptable intellectual property (IP) leakage risks when processing proprietary hardware designs. In this paper, we propose a new scenario where Verilog code must be optimized for specific attributes without leaking sensitive IP information. We introduce the first IP-preserving edge-cloud collaborative framework that leverages the benefits of both paradigms. Our approach employs local small LLMs (e.g., Qwen-2.5-Coder-7B) to perform secure comparative analysis between paired high-quality target designs and novice draft codes, yielding general design principles that summarize key insights for improvements. These principles are then used to query stronger cloud LLMs (e.g., Deepseek-V3) for targeted code improvement, ensuring that only abstracted and IP-safe guidance reaches external services. Our experimental results demonstrate that the framework achieves significantly higher optimization success rates compared to baseline methods. For example, combining Qwen-2.5-Coder-7B and Deepseek-V3 achieves a 66.67\% optimization success rate for power utilization, outperforming Deepseek-V3 alone (49.81\%) and even commercial models like GPT-4o (55.81\%). Further investigation of local and cloud LLM combinations reveals that different model pairings exhibit varying strengths for specific optimization objectives, with interesting trends emerging when varying the number of comparative code pairs. Our work establishes a new paradigm for secure hardware design optimization that balances performance gains with IP protection.
中文: 本文提出了一种保护知识产权的边云协同框架,通过本地小型大语言模型进行安全对比分析提取设计原则,再指导云端强大模型优化RTL代码,在实现性能提升的同时有效防止敏感信息泄露。
English: This paper introduces an IP-preserving edge-cloud collaborative framework that uses local small LLMs for secure comparative analysis to extract design principles, which then guide powerful cloud LLMs to optimize RTL code while preventing IP leakage.
Authors:Minghao Shao, Nanda Rani, Kimberly Milner, Haoran Xi, Meet Udeshi, Saksham Aggarwal, Venkata Sai Charan Putrevu, Sandeep Kumar Shukla, Prashanth Krishnamurthy, Farshad Khorrami, Ramesh Karri, Muhammad Shafique
Abstract:
Recent advances in LLM agentic systems have improved the automation of offensive security tasks, particularly for Capture the Flag (CTF) challenges. We systematically investigate the key factors that drive agent success and provide a detailed recipe for building effective LLM-based offensive security agents. First, we present CTFJudge, a framework leveraging LLM as a judge to analyze agent trajectories and provide granular evaluation across CTF solving steps. Second, we propose a novel metric, CTF Competency Index (CCI) for partial correctness, revealing how closely agent solutions align with human-crafted gold standards. Third, we examine how LLM hyperparameters, namely temperature, top-p, and maximum token length, influence agent performance and automated cybersecurity task planning. For rapid evaluation, we present CTFTiny, a curated benchmark of 50 representative CTF challenges across binary exploitation, web, reverse engineering, forensics, and cryptography. Our findings identify optimal multi-agent coordination settings and lay the groundwork for future LLM agent research in cybersecurity. We make CTFTiny open source to public https://github.com/NYU-LLM-CTF/CTFTiny along with CTFJudge on https://github.com/NYU-LLM-CTF/CTFJudge.
中文: 本研究通过CTFJudge评估框架和CTFTiny基准测试,系统分析了基于大语言模型的网络攻防智能体性能关键因素,揭示了最优协同配置,并为后续研究提供了开源工具。
English: This study introduces CTFJudge and CTFTiny to systematically evaluate LLM-based agents in offensive cybersecurity tasks, identifying key performance factors and optimal coordination settings while providing open-source tools for future research.
Authors:Renmiao Chen, Shiyao Cui, Xuancheng Huang, Chengwei Pan, Victor Shea-Jay Huang, QingLin Zhang, Xuan Ouyang, Zhexin Zhang, Hongning Wang, Minlie Huang
Abstract:
Jailbreak attacks against multimodal large language Models (MLLMs) are a significant research focus. Current research predominantly focuses on maximizing attack success rate (ASR), often overlooking whether the generated responses actually fulfill the attacker's malicious intent. This oversight frequently leads to low-quality outputs that bypass safety filters but lack substantial harmful content. To address this gap, we propose JPS, \underline{J}ailbreak MLLMs with collaborative visual \underline{P}erturbation and textual \underline{S}teering, which achieves jailbreaks via corporation of visual image and textually steering prompt. Specifically, JPS utilizes target-guided adversarial image perturbations for effective safety bypass, complemented by "steering prompt" optimized via a multi-agent system to specifically guide LLM responses fulfilling the attackers' intent. These visual and textual components undergo iterative co-optimization for enhanced performance. To evaluate the quality of attack outcomes, we propose the Malicious Intent Fulfillment Rate (MIFR) metric, assessed using a Reasoning-LLM-based evaluator. Our experiments show JPS sets a new state-of-the-art in both ASR and MIFR across various MLLMs and benchmarks, with analyses confirming its efficacy. Codes are available at \href{https://github.com/thu-coai/JPS}{https://github.com/thu-coai/JPS}. \color{warningcolor}{Warning: This paper contains potentially sensitive contents.}
中文: 本文提出JPS方法,通过协同优化视觉扰动和文本引导,在实现多模态大语言模型越狱攻击时不仅有效绕过安全防护,更能确保生成内容符合攻击者恶意意图,实验证明该方法在攻击成功率和恶意意图实现率上均达到最优水平。
English: This paper introduces JPS, a collaborative visual and textual method that enhances jailbreak attacks on multimodal large language models by combining adversarial image perturbations with steering prompts to effectively bypass safety measures while ensuring the malicious intent is fulfilled, as measured by the new MIFR metric.
Authors:Yanting Wang, Runpeng Geng, Ying Chen, Jinyuan Jia
Abstract:
Long-context large language models (LLMs), such as Gemini-2.5-Pro and Claude-Sonnet-4, are increasingly used to empower advanced AI systems, including retrieval-augmented generation (RAG) pipelines and autonomous agents. In these systems, an LLM receives an instruction along with a context--often consisting of texts retrieved from a knowledge database or memory--and generates a response that is contextually grounded by following the instruction. Recent studies have designed solutions to trace back to a subset of texts in the context that contributes most to the response generated by the LLM. These solutions have numerous real-world applications, including performing post-attack forensic analysis and improving the interpretability and trustworthiness of LLM outputs. While significant efforts have been made, state-of-the-art solutions such as TracLLM often lead to a high computation cost, e.g., it takes TracLLM hundreds of seconds to perform traceback for a single response-context pair. In this work, we propose AttnTrace, a new context traceback method based on the attention weights produced by an LLM for a prompt. To effectively utilize attention weights, we introduce two techniques designed to enhance the effectiveness of AttnTrace, and we provide theoretical insights for our design choice. We also perform a systematic evaluation for AttnTrace. The results demonstrate that AttnTrace is more accurate and efficient than existing state-of-the-art context traceback methods. We also show that AttnTrace can improve state-of-the-art methods in detecting prompt injection under long contexts through the attribution-before-detection paradigm. As a real-world application, we demonstrate that AttnTrace can effectively pinpoint injected instructions in a paper designed to manipulate LLM-generated reviews. The code is at https://github.com/Wang-Yanting/AttnTrace.
中文: 针对Gemini-2.5-Pro和Claude-Sonnet-4等长上下文大语言模型,AttnTrace通过利用注意力权重开发出新型溯源方法,能更精准高效地识别关键上下文文本,在检测速度和准确性上均优于现有最优方案。
English: Long-context LLMs like Gemini-2.5-Pro and Claude-Sonnet-4 are enhanced by AttnTrace, a new traceback method that uses attention weights to accurately and efficiently identify key context texts, outperforming existing solutions in both speed and precision.
Authors:Zixuan Gu, Qiufeng Fan, Long Sun, Yang Liu, Xiaojun Ye
Abstract:
With the advancement of Large Language Models (LLMs), LLM applications have expanded into a growing number of fields. However, users with data privacy concerns face limitations in directly utilizing LLM APIs, while private deployments incur significant computational demands. This creates a substantial challenge in achieving secure LLM adaptation under constrained local resources. To address this issue, collaborative learning methods, such as Split Learning (SL), offer a resource-efficient and privacy-preserving solution for adapting LLMs to private domains. In this study, we introduce VFLAIR-LLM (available at https://github.com/FLAIR-THU/VFLAIR-LLM), an extensible and lightweight split learning framework for LLMs, enabling privacy-preserving LLM inference and fine-tuning in resource-constrained environments. Our library provides two LLM partition settings, supporting three task types and 18 datasets. In addition, we provide standard modules for implementing and evaluating attacks and defenses. We benchmark 5 attacks and 9 defenses under various Split Learning for LLM(SL-LLM) settings, offering concrete insights and recommendations on the choice of model partition configurations, defense strategies, and relevant hyperparameters for real-world applications.
中文摘要:VFLAIR-LLM框架通过分割学习实现了大语言模型在资源受限环境下的隐私保护适配,提供可配置的模型划分方案,并对安全措施进行全面评估。
English Summary: The VFLAIR-LLM framework enables privacy-preserving adaptation of large language models through split learning, providing configurable model partitioning and comprehensive evaluation of security measures for resource-constrained environments.
Authors:Wenjie Li, Siying Gu, Yiming Li, Kangjie Chen, Zhili Chen, Tianwei Zhang, Shu-Tao Xia, Dacheng Tao
Abstract:
Backdoor detection is currently the mainstream defense against backdoor attacks in federated learning (FL), where malicious clients upload poisoned updates that compromise the global model and undermine the reliability of FL deployments. Existing backdoor detection techniques fall into two categories, including passive and proactive ones, depending on whether the server proactively modifies the global model. However, both have inherent limitations in practice: passive defenses are vulnerable to common non-i.i.d. data distributions and random participation of FL clients, whereas current proactive defenses suffer inevitable out-of-distribution (OOD) bias because they rely on backdoor co-existence effects. To address these issues, we introduce a new proactive defense, dubbed Coward, inspired by our discovery of multi-backdoor collision effects, in which consecutively planted, distinct backdoors significantly suppress earlier ones. In general, we detect attackers by evaluating whether the server-injected, conflicting global watermark is erased during local training rather than retained. Our method preserves the advantages of proactive defenses in handling data heterogeneity (\ie, non-i.i.d. data) while mitigating the adverse impact of OOD bias through a revised detection mechanism. Extensive experiments on benchmark datasets confirm the effectiveness of Coward and its resilience to potential adaptive attacks. The code for our method would be available at https://github.com/still2009/cowardFL.
中文摘要:联邦学习中的后门检测面临被动防御易受非独立同分布数据影响、主动防御存在分布外偏差的局限,为此提出Coward主动防御方法,利用多后门碰撞效应,通过检测服务器注入的冲突水印在本地训练中是否被擦除来识别攻击者。
English Summary: Backdoor detection in federated learning faces limitations with passive defenses being vulnerable to non-i.i.d. data and proactive ones suffering from out-of-distribution bias, leading to the introduction of Coward, a proactive defense that leverages multi-backdoor collision effects to detect attackers by monitoring the erasure of server-injected watermarks during local training.
Authors:Aldan Creo
Abstract:
AI-generated text detectors have become essential tools for maintaining content authenticity, yet their robustness against evasion attacks remains questionable. We present PDFuzz, a novel attack that exploits the discrepancy between visual text layout and extraction order in PDF documents. Our method preserves exact textual content while manipulating character positioning to scramble extraction sequences. We evaluate this approach against the ArguGPT detector using a dataset of human and AI-generated text. Our results demonstrate complete evasion: detector performance drops from (93.6 $\pm$ 1.4) % accuracy and 0.938 $\pm$ 0.014 F1 score to random-level performance ((50.4 $\pm$ 3.2) % accuracy, 0.0 F1 score) while maintaining perfect visual fidelity. Our work reveals a vulnerability in current detection systems that is inherent to PDF document structures and underscores the need for implementing sturdy safeguards against such attacks. We make our code publicly available at https://github.com/ACMCMC/PDFuzz.
Chinese: PDFuzz是一种新型规避攻击,通过操纵PDF文档中的字符定位来扰乱文本提取顺序,在保持视觉保真度的同时完全绕过AI生成文本检测器。
English: PDFuzz is a novel evasion attack that manipulates character positioning in PDF documents to scramble text extraction sequences, completely bypassing AI-generated text detectors while preserving visual fidelity.
Authors:Man Hu, Yahui Ding, Yatao Yang, Liangyu Chen, Yanhao Jia, Shuai Zhao
Abstract:
As backdoor attacks become more stealthy and robust, they reveal critical weaknesses in current defense strategies: detection methods often rely on coarse-grained feature statistics, and purification methods typically require full retraining or additional clean models. To address these challenges, we propose DUP (Detection-guided Unlearning for Purification), a unified framework that integrates backdoor detection with unlearning-based purification. The detector captures feature-level anomalies by jointly leveraging class-agnostic distances and inter-layer transitions. These deviations are integrated through a weighted scheme to identify poisoned inputs, enabling more fine-grained analysis. Based on the detection results, we purify the model through a parameter-efficient unlearning mechanism that avoids full retraining and does not require any external clean model. Specifically, we innovatively repurpose knowledge distillation to guide the student model toward increasing its output divergence from the teacher on detected poisoned samples, effectively forcing it to unlearn the backdoor behavior. Extensive experiments across diverse attack methods and language model architectures demonstrate that DUP achieves superior defense performance in detection accuracy and purification efficacy. Our code is available at https://github.com/ManHu2025/DUP.
中文摘要:提出的DUP框架通过特征级异常检测结合基于知识蒸馏的参数高效反学习机制,无需完整重训练或外部干净模型即可有效清除后门。
English Summary: The proposed DUP framework combines backdoor detection using feature-level anomaly analysis with parameter-efficient unlearning through knowledge distillation, effectively eliminating backdoors without full retraining or external clean models.
Authors:Zhengxian Wu, Juan Wen, Wanli Peng, Yinghan Zhou, Changtong dou, Yiming Xue
Abstract:
Although existing backdoor defenses have gained success in mitigating backdoor attacks, they still face substantial challenges. In particular, most of them rely on large amounts of clean data to weaken the backdoor mapping but generally struggle with residual trigger effects, resulting in persistently high attack success rates (ASR). Therefore, in this paper, we propose a novel Backdoor defense method based on Directional mapping module and adversarial Knowledge Distillation (BeDKD), which balances the trade-off between defense effectiveness and model performance using a small amount of clean and poisoned data. We first introduce a directional mapping module to identify poisoned data, which destroys clean mapping while keeping backdoor mapping on a small set of flipped clean data. Then, the adversarial knowledge distillation is designed to reinforce clean mapping and suppress backdoor mapping through a cycle iteration mechanism between trust and punish distillations using clean and identified poisoned data. We conduct experiments to mitigate mainstream attacks on three datasets, and experimental results demonstrate that BeDKD surpasses the state-of-the-art defenses and reduces the ASR by 98% without significantly reducing the CACC. Our code are available in https://github.com/CAU-ISS-Lab/Backdoor-Attack-Defense-LLMs/tree/main/BeDKD.
中文: 本文提出BeDKD方法,通过方向映射模块和对抗知识蒸馏技术,仅需少量干净与中毒数据即可有效抵御后门攻击,在保持模型精度的同时将攻击成功率降低98%。
English: This paper introduces BeDKD, a novel backdoor defense method that uses a directional mapping module and adversarial knowledge distillation to effectively mitigate backdoor attacks with minimal clean and poisoned data, achieving a 98% reduction in attack success rate without compromising model accuracy.
Authors:Yujia Zheng, Tianhao Li, Haotian Huang, Tianyu Zeng, Jingyu Lu, Chuangxin Chu, Yuekai Huang, Ziyou Jiang, Qian Xiong, Yuyao Ge, Mingyang Li
Abstract:
Prompt-based adversarial attacks have become an effective means to assess the robustness of large language models (LLMs). However, existing approaches often treat prompts as monolithic text, overlooking their structural heterogeneity-different prompt components contribute unequally to adversarial robustness. Prior works like PromptRobust assume prompts are value-neutral, but our analysis reveals that complex, domain-specific prompts with rich structures have components with differing vulnerabilities. To address this gap, we introduce PromptAnatomy, an automated framework that dissects prompts into functional components and generates diverse, interpretable adversarial examples by selectively perturbing each component using our proposed method, ComPerturb. To ensure linguistic plausibility and mitigate distribution shifts, we further incorporate a perplexity (PPL)-based filtering mechanism. As a complementary resource, we annotate four public instruction-tuning datasets using the PromptAnatomy framework, verified through human review. Extensive experiments across these datasets and five advanced LLMs demonstrate that ComPerturb achieves state-of-the-art attack success rates. Ablation studies validate the complementary benefits of prompt dissection and PPL filtering. Our results underscore the importance of prompt structure awareness and controlled perturbation for reliable adversarial robustness evaluation in LLMs. Code and data are available at https://github.com/Yujiaaaaa/PACP.
Chinese Summary: 本文提出PromptAnatomy框架,通过将提示分解为功能组件并采用ComPerturb方法进行选择性扰动,结合困惑度过滤机制保持语言合理性,在多个数据集和大型语言模型上实现了最优的攻击成功率,强调了提示结构认知对对抗鲁棒性评估的重要性。
English Summary: This paper introduces PromptAnatomy, an automated framework that enhances adversarial attack evaluation by dissecting prompts into functional components and selectively perturbing them using ComPerturb, achieving state-of-the-art attack success rates while maintaining linguistic plausibility through perplexity filtering.
Authors:Terry Yue Zhuo, Dingmin Wang, Hantian Ding, Varun Kumar, Zijian Wang
Abstract:
Large Language Models (LLMs) have achieved remarkable success in software engineering tasks when trained with executable runtime environments, particularly in resolving GitHub issues. However, such runtime environments are often unavailable in other domains, especially cybersecurity, where challenge configurations and execution contexts are ephemeral or restricted. We present Cyber-Zero, the first runtime-free framework for synthesizing high-quality agent trajectories to train cybersecurity LLMs. Cyber-Zero leverages publicly available CTF writeups and employs persona-driven LLM simulation to reverse-engineer runtime behaviors and generate realistic, long-horizon interaction sequences without actual environments. Using trajectories synthesized by Cyber-Zero, we train LLM-based agents that achieve up to 13.1% absolute performance gains over baseline models on three prominent CTF benchmarks: InterCode-CTF, NYU CTF Bench, and Cybench. Our best model, Cyber-Zero-32B, establishes new state-of-the-art performance among open-weight models, matching the capabilities of proprietary systems like DeepSeek-V3-0324 and Claude-3.5-Sonnet while offering superior cost-effectiveness, and demonstrating that runtime-free trajectory synthesis can effectively democratize the development of state-of-the-art cybersecurity agents.
Cyber-Zero introduces the first runtime-free framework that synthesizes high-quality agent trajectories from CTF writeups, enabling LLMs to achieve state-of-the-art performance in cybersecurity tasks without executable environments.
English Summary:
Authors:Junhao Zheng, Jiahao Sun, Chenhao Lin, Zhengyu Zhao, Chen Ma, Chong Zhang, Cong Wang, Qian Wang, Chao Shen
Abstract:
Developing reliable defenses against patch attacks on object detectors has attracted increasing interest. However, we identify that existing defense evaluations lack a unified and comprehensive framework, resulting in inconsistent and incomplete assessments of current methods. To address this issue, we revisit 11 representative defenses and present the first patch defense benchmark, involving 2 attack goals, 13 patch attacks, 11 object detectors, and 4 diverse metrics. This leads to the large-scale adversarial patch dataset with 94 types of patches and 94,000 images. Our comprehensive analyses reveal new insights: (1) The difficulty in defending against naturalistic patches lies in the data distribution, rather than the commonly believed high frequencies. Our new dataset with diverse patch distributions can be used to improve existing defenses by 15.09% AP@0.5. (2) The average precision of the attacked object, rather than the commonly pursued patch detection accuracy, shows high consistency with defense performance. (3) Adaptive attacks can substantially bypass existing defenses, and defenses with complex/stochastic models or universal patch properties are relatively robust. We hope that our analyses will serve as guidance on properly evaluating patch attacks/defenses and advancing their design. Code and dataset are available at https://github.com/Gandolfczjh/APDE, where we will keep integrating new attacks/defenses.
Chinese: 本研究首次建立了针对目标检测器对抗性补丁攻击防御的综合评估基准,揭示了数据分布而非高频特征对防御的重要性,以及自适应攻击的有效性等关键发现,同时提供了数据集和框架,可将现有防御性能提升15.09%。
English: This study introduces the first comprehensive benchmark for evaluating defenses against adversarial patch attacks on object detectors, revealing key insights such as the importance of data distribution over high frequencies and the effectiveness of adaptive attacks, while providing a dataset and framework to improve defense performance by 15.09%.
Authors:Jiecong Wang, Haoran Li, Hao Peng, Ziqian Zeng, Zihao Wang, Haohua Du, Zhengtao Yu
Abstract:
Jailbreaking is an essential adversarial technique for red-teaming these models to uncover and patch security flaws. However, existing jailbreak methods face significant drawbacks. Token-level jailbreak attacks often produce incoherent or unreadable inputs and exhibit poor transferability, while prompt-level attacks lack scalability and rely heavily on manual effort and human ingenuity. We propose a concise and effective two-stage framework that combines the advantages of these approaches. The first stage performs a scenario-based generation of context and rephrases the original malicious query to obscure its harmful intent. The second stage then utilizes information from the model's hidden states to guide fine-grained edits, effectively steering the model's internal representation of the input from a malicious toward a benign one. Extensive experiments demonstrate that this method achieves state-of-the-art Attack Success Rate, with gains of up to 37.74% over the strongest baseline, and exhibits excellent transferability to black-box models. Our analysis further demonstrates that AGILE maintains substantial effectiveness against prominent defense mechanisms, highlighting the limitations of current safeguards and providing valuable insights for future defense development. Our code is available at https://github.com/yunsaijc/AGILE.
中文摘要:提出的AGILE框架通过结合基于场景的查询重构和隐藏状态引导编辑,实现了最先进的越狱成功率,同时展现出优秀的可迁移性和防御对抗能力。
English Summary: The proposed AGILE framework combines scenario-based query rephrasing with hidden state-guided editing to achieve state-of-the-art jailbreak effectiveness while demonstrating strong transferability and defense resistance.
Authors:Rishabh Batra, Zhili Chen, Rahul Jain, YaoNan Zhang
Abstract:
Pseudorandom quantum states (PRSs) and pseudorandom function-like quantum state (PRFS) generators are quantum analogues of pseudorandom generators and pseudorandom functions. It is known that PRS (and PRFS) can exist even if BQP = QMA (relative to a quantum oracle) or if P = NP (relative to a classical oracle), which does not allow for the existence of one-way functions (relative to these oracles). Hence, these are potentially weaker objects than quantum-secure one-way functions, which can be used to do quantum cryptography. A desirable property of PRS and PRFS constructions is scalability, which ensures that the security parameter $λ$ (which determines indistinguishability from their Haar-random counterparts) can be much larger than $n$ (the number of qubits of the output states). This may be important in some applications where PRS and PRFS primitives are used. We present an isometric procedure to prepare quantum states that can be arbitrarily random (i.e., the trace distance from the Haar-random state can be arbitrarily small for the true random case, or the distinguishing advantage can be arbitrarily small for the pseudorandom case). Our procedure provides a new method for scalable PRS that introduces no entanglement or correlations with the environment. This naturally gives the first construction for scalable, quantum-accessible, and adaptive PRFS assuming quantum-secure one-way functions. Our PRFS construction implies various primitives, including long-input PRFS, short-input PRFS, short-output PRFS, non-adaptive PRFS, and classically-accessible adaptive PRFS. This new construction may be helpful in some simplification of the microcrypt zoo (https://sattath.github.io/microcrypt-zoo/).
Authors:Zheng Zhang, Peilin Zhao, Deheng Ye, Hao Wang
Abstract:
Jailbreak attacks aim to exploit large language models (LLMs) by inducing them to generate harmful content, thereby revealing their vulnerabilities. Understanding and addressing these attacks is crucial for advancing the field of LLM safety. Previous jailbreak approaches have mainly focused on direct manipulations of harmful intent, with limited attention to the impact of persona prompts. In this study, we systematically explore the efficacy of persona prompts in compromising LLM defenses. We propose a genetic algorithm-based method that automatically crafts persona prompts to bypass LLM's safety mechanisms. Our experiments reveal that: (1) our evolved persona prompts reduce refusal rates by 50-70% across multiple LLMs, and (2) these prompts demonstrate synergistic effects when combined with existing attack methods, increasing success rates by 10-20%. Our code and data are available at https://github.com/CjangCjengh/Generic_Persona.
Chinese Summary: 本研究提出一种基于遗传算法的方法,通过自动生成角色提示有效绕过大型语言模型的安全防护,将拒绝率降低50-70%,并将现有攻击成功率提升10-20%。
English Summary: This study introduces a genetic algorithm-based method to automatically create persona prompts that effectively bypass large language model safety mechanisms, reducing refusal rates by 50-70% and enhancing existing attack success rates by 10-20%.
Authors:Aditya Pujari, Ajita Rattani
Abstract:
The rapid advancement of voice generation technologies has enabled the synthesis of speech that is perceptually indistinguishable from genuine human voices. While these innovations facilitate beneficial applications such as personalized text-to-speech systems and voice preservation, they have also introduced significant risks, including deepfake impersonation scams and synthetic media-driven disinformation campaigns. Recent reports indicate that in 2024, deepfake fraud attempts surged by over 1,300% compared to 2023, underscoring the urgent need for robust audio content authentication. The financial sector has been particularly impacted, with a loss of over 10 million USD to voice scams and individual victims reporting losses exceeding $6,000 from AI-generated deepfake calls. In response, regulators and governments worldwide are enacting measures to improve AI content transparency and traceability, emphasizing the development of forensic tools and watermarking techniques as essential strategies to uphold media integrity.
中文: 语音生成技术虽能合成逼真语音,却也带来深度伪造诈骗等严重风险,2024年欺诈激增超1300%,造成千万美元损失,全球正通过开发认证工具和加强监管来应对。
English: Voice generation technology enables realistic speech synthesis but poses severe risks like deepfake scams, with fraud attempts surging over 1,300% in 2024 and financial losses exceeding $10 million, prompting global efforts to develop authentication tools and regulations.
Authors:Anxiao Song, Shujie Cui, Jianli Bai, Ke Cheng, Yulong Shen, Giovanni Russello
Abstract:
In light of increasing privacy concerns and stringent legal regulations, using secure multiparty computation (MPC) to enable collaborative GBDT model training among multiple data owners has garnered significant attention. Despite this, existing MPC-based GBDT frameworks face efficiency challenges due to high communication costs and the computation burden of non-linear operations, such as division and sigmoid calculations. In this work, we introduce Guard-GBDT, an innovative framework tailored for efficient and privacy-preserving GBDT training on vertical datasets. Guard-GBDT bypasses MPC-unfriendly division and sigmoid functions by using more streamlined approximations and reduces communication overhead by compressing the messages exchanged during gradient aggregation. We implement a prototype of Guard-GBDT and extensively evaluate its performance and accuracy on various real-world datasets. The results show that Guard-GBDT outperforms state-of-the-art HEP-XGB (CIKM'21) and SiGBDT (ASIA CCS'24) by up to $2.71\times$ and $12.21 \times$ on LAN network and up to $2.7\times$ and $8.2\times$ on WAN network. Guard-GBDT also achieves comparable accuracy with SiGBDT and plaintext XGBoost (better than HEP-XGB ), which exhibits a deviation of $\pm1\%$ to $\pm2\%$ only. Our implementation code is provided at https://github.com/XidianNSS/Guard-GBDT.git.
中文: Guard-GBDT是一种高效且保护隐私的GBDT训练框架,通过简化近似替换MPC不友好操作并降低通信成本,在保持相近准确度的同时,相比现有方法实现了显著的性能提升。
English: Guard-GBDT is an efficient and privacy-preserving framework for GBDT training that replaces MPC-unfriendly operations with streamlined approximations and reduces communication costs, achieving significant speed improvements over existing methods while maintaining comparable accuracy.
Authors:Zi Liang, Liantong Yu, Shiyu Zhang, Qingqing Ye, Haibo Hu
Abstract:
Overestimation in evaluating large language models (LLMs) has become an increasing concern. Due to the contamination of public benchmarks or imbalanced model training, LLMs may achieve unreal evaluation results on public benchmarks, either intentionally or unintentionally, which leads to unfair comparisons among LLMs and undermines their realistic capability assessments. Existing benchmarks attempt to address these issues by keeping test cases permanently secret, mitigating contamination through human evaluation, or repeatedly collecting and constructing new samples. However, these approaches fail to ensure reproducibility, transparency, and high efficiency simultaneously. Moreover, the extent of overestimation in current LLMs remains unquantified. To address these issues, we propose ArxivRoll, a dynamic evaluation framework inspired by one-time pad encryption in cryptography. ArxivRoll comprises two key components: \emph{i) SCP (Sequencing, Cloze, and Prediction)}, an automated generator for private test cases, and \emph{ii) Rugged Scores (RS)}, metrics that measure the proportion of public benchmark contamination and training bias. Leveraging SCP, ArxivRoll constructs a new benchmark every six months using recent articles from ArXiv and employs them for one-time evaluations of LLM performance. Extensive experiments demonstrate the high quality of our benchmark, and we provide a systematic evaluation of current LLMs. The source code is available at https://github.com/liangzid/ArxivRoll/.
中文: 摘要提出了ArxivRoll动态评估框架,通过生成私有测试用例并量化污染与偏差,旨在解决大型语言模型评估中的高估问题,确保公平且真实的性能衡量。
English: The abstract introduces ArxivRoll, a dynamic evaluation framework designed to address overestimation in large language models by generating private test cases and measuring contamination and bias, ensuring fair and realistic assessments.
Authors:Hao Li, Lijun Li, Zhenghao Lu, Xianyi Wei, Rui Li, Jing Shao, Lei Sha
Abstract:
With rapid advancement and increasing accessibility of LLMs, fine-tuning aligned models has become a critical step for adapting them to real-world applications, which makes the safety of this fine-tuning process more important than ever. However, recent studies have highlighted a critical challenge: even when fine-tuning with seemingly benign downstream datasets, the safety of aligned LLMs can be compromised, making them more susceptible to malicious instructions.
In this paper, we show that fine-tuning datasets often contain samples with safety-degrading features that are not easily identifiable on the surface. These samples can significantly degrade the safety alignment of LLMs during fine-tuning. To address this issue, we propose LARF, a Layer-Aware Representation Filtering method. This method identifies safety-sensitive layers within the LLM and leverages their representations to detect which data samples in the post-training dataset contain safety-degrading features.
Experimental results demonstrate that LARF can effectively identify benign data with safety-degrading features. After removing such data, the safety alignment degradation caused by fine-tuning is mitigated. Please see our code at https://github.com/LLLeoLi/LARF.
中文: 本文提出LARF方法,通过识别并移除微调数据中损害安全性的样本,有效缓解大语言模型在适配应用时的安全对齐退化问题。
English: The paper introduces LARF, a layer-aware filtering method that identifies and removes safety-degrading data in fine-tuning datasets to preserve LLM safety alignment during adaptation.
Authors:Russell O'Connor, Andrew Poelstra
Abstract:
The modular inverse is an essential piece of computation required for elliptic curve operations used for digital signatures in Bitcoin and other applications. A novel approach to the extended Euclidean algorithm has been developed by Bernstein and Yang within the last few years and incorporated into the libsecp256k1 cryptographic library used by Bitcoin. However, novel algorithms introduce new risks of errors. To address this we have completed a computer verified proof of the correctness of (one of) libsecp256k1's modular inverse implementations with the Coq proof assistant using the Verifiable C's implementation of separation logic.
中文: 比特币libsecp256k1密码库中的新型模逆算法已通过Coq证明助手完成形式化验证,确保了该创新算法的正确性并规避了潜在风险。
English: A novel modular inverse algorithm in Bitcoin's libsecp256k1 library has been formally verified for correctness using the Coq proof assistant, addressing potential risks from its innovative approach.
Authors:Sam Johnson, Viet Pham, Thai Le
Abstract:
This work demonstrates that LLM-based web navigation agents offer powerful automation capabilities but are vulnerable to Indirect Prompt Injection (IPI) attacks. We show that adversaries can embed universal adversarial triggers in webpage HTML to hijack agent behavior that utilizes the accessibility tree to parse HTML, causing unintended or malicious actions. Using the Greedy Coordinate Gradient (GCG) algorithm and a Browser Gym agent powered by Llama-3.1, our system demonstrates high success rates across real websites in both targeted and general attacks, including login credential exfiltration and forced ad clicks. Our empirical results highlight critical security risks and the need for stronger defenses as LLM-driven autonomous web agents become more widely adopted. The system software (https://github.com/sej2020/manipulating-web-agents) is released under the MIT License, with an accompanying publicly available demo website (http://lethaiq.github.io/attack-web-llm-agent).
中文: 基于大语言模型的网页导航代理易受间接提示注入攻击,攻击者可通过在网页HTML中嵌入触发器来劫持代理行为,导致如窃取登录凭证等严重安全风险。
English: LLM-based web navigation agents are vulnerable to Indirect Prompt Injection attacks, where adversaries embed triggers in webpage HTML to hijack agent behavior, leading to security risks like credential theft and unauthorized actions.
Authors:Wenxuan Zeng, Tianshi Xu, Yi Chen, Yifan Zhou, Mingzhe Zhang, Jin Tan, Cheng Hong, Meng Li
Abstract:
Privacy-preserving machine learning (PPML) based on cryptographic protocols has emerged as a promising paradigm to protect user data privacy in cloud-based machine learning services. While it achieves formal privacy protection, PPML often incurs significant efficiency and scalability costs due to orders of magnitude overhead compared to the plaintext counterpart. Therefore, there has been a considerable focus on mitigating the efficiency gap for PPML. In this survey, we provide a comprehensive and systematic review of recent PPML studies with a focus on cross-level optimizations. Specifically, we categorize existing papers into protocol level, model level, and system level, and review progress at each level. We also provide qualitative and quantitative comparisons of existing works with technical insights, based on which we discuss future research directions and highlight the necessity of integrating optimizations across protocol, model, and system levels. We hope this survey can provide an overarching understanding of existing approaches and potentially inspire future breakthroughs in the PPML field. As the field is evolving fast, we also provide a public GitHub repository to continuously track the developments, which is available at https://github.com/PKU-SEC-Lab/Awesome-PPML-Papers.
Chinese: 本综述系统梳理了隐私保护机器学习(PPML)的研究进展,重点关注协议、模型和系统层面的跨层级优化方法,以在保障数据隐私的同时提升计算效率。
English: This survey comprehensively reviews privacy-preserving machine learning (PPML) studies, focusing on cross-level optimizations across protocol, model, and system levels to address efficiency challenges while ensuring data privacy.
Authors:Yuan Yao, Jin Song, Jian Jin
Abstract:
As valuable digital assets, deep neural networks necessitate robust ownership protection, positioning neural network watermarking (NNW) as a promising solution. Among various NNW approaches, weight-based methods are favored for their simplicity and practicality; however, they remain vulnerable to forging and overwriting attacks. To address those challenges, we propose NeuralMark, a robust method built around a hashed watermark filter. Specifically, we utilize a hash function to generate an irreversible binary watermark from a secret key, which is then used as a filter to select the model parameters for embedding. This design cleverly intertwines the embedding parameters with the hashed watermark, providing a robust defense against both forging and overwriting attacks. An average pooling is also incorporated to resist fine-tuning and pruning attacks. Furthermore, it can be seamlessly integrated into various neural network architectures, ensuring broad applicability. Theoretically, we analyze its security boundary. Empirically, we verify its effectiveness and robustness across 13 distinct Convolutional and Transformer architectures, covering five image classification tasks and one text generation task. The source codes are available at https://github.com/AIResearch-Group/NeuralMark.
中文: NeuralMark是一种鲁棒的神经网络水印方法,通过哈希水印滤波器将不可逆水印嵌入模型参数,有效防御伪造、覆盖、微调和剪枝攻击,并兼容多种网络架构。
English: NeuralMark is a robust neural network watermarking method that uses a hashed watermark filter to embed irreversible watermarks into model parameters, effectively defending against forging, overwriting, fine-tuning, and pruning attacks while being compatible with various architectures.
Authors:Jesus Lopez, Viviana Cadena, Mohammad Saidur Rahman
Abstract:
The rapid advancement of quantum computing poses a critical threat to classical cryptographic algorithms such as RSA and ECC, particularly in Internet of Things (IoT) devices, where secure communication is essential but often constrained by limited computational resources. This paper investigates the feasibility of deploying post-quantum cryptography (PQC) algorithms on resource-constrained devices. In particular, we implement three PQC algorithms -- BIKE, CRYSTALS-Kyber, and HQC -- on a lightweight IoT platform built with Raspberry Pi devices. Leveraging the Open Quantum Safe (\texttt{liboqs}) library in conjunction with \texttt{mbedTLS}, we develop quantum-secure key exchange protocols, and evaluate their performance in terms of computational overhead, memory usage, and energy consumption for quantum secure communication. Experimental results demonstrate that the integration of PQC algorithms on constrained hardware is practical, reinforcing the urgent need for quantum-resilient cryptographic frameworks in next-generation IoT devices. The implementation of this paper is available at https://iqsec-lab.github.io/PQC-IoT/.
Authors:Sizhe Chen, Yizhu Wang, Nicholas Carlini, Chawin Sitawarin, David Wagner
Abstract:
When large language model (LLM) systems interact with external data to perform complex tasks, a new attack, namely prompt injection, becomes a significant threat. By injecting instructions into the data accessed by the system, the attacker is able to override the initial user task with an arbitrary task directed by the attacker. To secure the system, test-time defenses, e.g., defensive prompting, have been proposed for system developers to attain security only when needed in a flexible manner. However, they are much less effective than training-time defenses that change the model parameters. Motivated by this, we propose DefensiveToken, a test-time defense with prompt injection robustness comparable to training-time alternatives. DefensiveTokens are newly inserted as special tokens, whose embeddings are optimized for security. In security-sensitive cases, system developers can append a few DefensiveTokens before the LLM input to achieve security with a minimal utility drop. In scenarios where security is less of a concern, developers can simply skip DefensiveTokens; the LLM system remains the same as there is no defense, generating high-quality responses. Thus, DefensiveTokens, if released alongside the model, allow a flexible switch between the state-of-the-art (SOTA) utility and almost-SOTA security at test time. The code is available at https://github.com/Sizhe-Chen/DefensiveToken.
中文: DefensiveToken是一种测试时防御方法,通过插入特殊优化的令牌实现与训练时防御相当的抗提示注入能力,可根据需要在最佳效用和增强安全性之间灵活切换。
English: DefensiveToken is a test-time defense method that uses specially optimized tokens to provide prompt injection robustness comparable to training-time defenses, allowing flexible switching between maximum utility and enhanced security as needed.
Authors:Karthik Garimella, Austin Ebel, Brandon Reagen
Abstract:
Fully Homomorphic Encryption (FHE) is an encryption scheme that allows for computation to be performed directly on encrypted data, effectively closing the loop on secure and outsourced computing. Data is encrypted not only during rest and transit, but also during processing. However, FHE provides a limited instruction set: SIMD addition, SIMD multiplication, and cyclic rotation of 1-D vectors. This restriction makes performing multi-dimensional tensor operations challenging. Practitioners must pack these tensors into 1-D vectors and map tensor operations onto this one-dimensional layout rather than their traditional nested structure. And while prior systems have made significant strides in automating this process, they often hide critical packing decisions behind layers of abstraction, making debugging, optimizing, and building on top of these systems difficult.
In this work, we approach multi-dimensional tensor operations in FHE through Einstein summation (einsum) notation. Einsum notation explicitly encodes dimensional structure and operations in its syntax, naturally exposing how tensors should be packed and transformed. We decompose einsum expressions into a fixed set of FHE-friendly operations. We implement our design and present EinHops, a minimalist system that factors einsum expressions into a fixed sequence of FHE operations. EinHops enables developers to perform encrypted tensor operations using FHE while maintaining full visibility into the underlying packing strategy. We evaluate EinHops on a range of tensor operations from a simple transpose to complex multi-dimensional contractions. We show that the explicit nature of einsum notation allows us to build an FHE tensor system that is simple, general, and interpretable. We open-source EinHops at the following repository: https://github.com/baahl-nyu/einhops.
Chinese: 全同态加密(FHE)允许在加密数据上直接计算,但仅限于一维操作,因此作者开发了EinHops系统,利用爱因斯坦求和符号来简化和明确FHE中的多维张量操作。
English: Fully Homomorphic Encryption (FHE) enables computation on encrypted data but is limited to one-dimensional operations, so the authors introduce EinHops, a system using Einstein summation notation to simplify and clarify multi-dimensional tensor operations in FHE.
Authors:Nishit V. Pandya, Andrey Labunets, Sicun Gao, Earlence Fernandes
Abstract:
A popular class of defenses against prompt injection attacks on large language models (LLMs) relies on fine-tuning the model to separate instructions and data, so that the LLM does not follow instructions that might be present with data. There are several academic systems and production-level implementations of this idea. We evaluate the robustness of this class of prompt injection defenses in the whitebox setting by constructing strong optimization-based attacks and showing that the defenses do not provide the claimed security properties. Specifically, we construct a novel attention-based attack algorithm for text-based LLMs and apply it to two recent whitebox defenses SecAlign (CCS 2025) and StruQ (USENIX Security 2025), showing attacks with success rates of up to 70% with modest increase in attacker budget in terms of tokens. Our findings make fundamental progress towards understanding the robustness of prompt injection defenses in the whitebox setting. We release our code and attacks at https://github.com/nishitvp/better_opts_attacks
中文: 本研究通过开发基于优化的攻击方法评估大型语言模型中提示注入防御的鲁棒性,证明现有方法无法提供足够安全性,对近期防御措施的攻击成功率高达70%。
English: This study evaluates the robustness of prompt injection defenses in large language models by developing optimization-based attacks, demonstrating that existing methods fail to provide adequate security with success rates reaching 70% against recent defenses.
Authors:Renyang Liu, Guanlin Li, Tianwei Zhang, See-Kiong Ng
Abstract:
Recent advances in image generation models (IGMs), particularly diffusion-based architectures such as Stable Diffusion (SD), have markedly enhanced the quality and diversity of AI-generated visual content. However, their generative capability has also raised significant ethical, legal, and societal concerns, including the potential to produce harmful, misleading, or copyright-infringing content. To mitigate these concerns, machine unlearning (MU) emerges as a promising solution by selectively removing undesirable concepts from pretrained models. Nevertheless, the robustness and effectiveness of existing unlearning techniques remain largely unexplored, particularly in the presence of multi-modal adversarial inputs.
To bridge this gap, we propose Recall, a novel adversarial framework explicitly designed to compromise the robustness of unlearned IGMs. Unlike existing approaches that predominantly rely on adversarial text prompts, Recall exploits the intrinsic multi-modal conditioning capabilities of diffusion models by efficiently optimizing adversarial image prompts with guidance from a single semantically relevant reference image. Extensive experiments across ten state-of-the-art unlearning methods and diverse tasks show that Recall consistently outperforms existing baselines in terms of adversarial effectiveness, computational efficiency, and semantic fidelity with the original textual prompt. These findings reveal critical vulnerabilities in current unlearning mechanisms and underscore the need for more robust solutions to ensure the safety and reliability of generative models. Code and data are publicly available at \textcolor{blue}{https://github.com/ryliu68/RECALL}.
中文摘要:图像生成模型的进步引发了伦理担忧,机器学习遗忘成为解决方案,但提出的Recall框架通过对抗性图像提示揭示了现有遗忘方法的脆弱性,有效削弱了其鲁棒性。
English Summary: Recent advances in image generation models raise ethical concerns, prompting machine unlearning as a solution, but the proposed Recall framework exposes vulnerabilities in current unlearning methods by using adversarial image prompts to compromise their robustness effectively.
Authors:Kenneth Odoh
Abstract:
We present a privacy-preserving telemetry aggregation scheme. Our underlying frequency estimation routine works within the framework of differential privacy. The design philosophy follows a client-server architecture. Furthermore, the system uses a local differential privacy scheme where data gets randomized on the client before submitting the request to the resource server. This scheme allows for data analysis on de-identified data by carefully adding noise to prevent re-identification attacks, thereby facilitating public data release without compromising the identifiability of the individual record. This work further enhances privacy guarantees by leveraging Oblivious HTTP (OHTTP) to achieve increased privacy protection for data in transit that addresses pre-existing privacy vulnerabilities in raw HTTP. We provide an implementation that focuses on frequency estimation with a histogram of a known dictionary. Our resulting formulation based on OHTTP has provided stricter privacy safeguards when compared to trusting an organization to manually delete identifying information from the client's request in the ingestor as deployed in reference work~\cite{apple2017}. Code available at https://github.com/kenluck2001/miscellaneous/tree/master/src/Privacy-Preserving-Telemetry.
中文: 本文提出了一种隐私保护的遥测数据聚合方案,采用本地差分隐私和Oblivious HTTP技术,在数据传输过程中确保匿名性并防止重识别攻击,同时实现频率估计并提升隐私保障。
English: This paper introduces a privacy-preserving telemetry aggregation scheme that utilizes local differential privacy and Oblivious HTTP to ensure data anonymity during transmission and prevent re-identification, while enabling frequency estimation with enhanced security.
Authors:Xiaohu Li, Yunfeng Ning, Zepeng Bao, Mayi Xu, Jianhao Chen, Tieyun Qian
Abstract:
Security alignment enables the Large Language Model (LLM) to gain the protection against malicious queries, but various jailbreak attack methods reveal the vulnerability of this security mechanism. Previous studies have isolated LLM jailbreak attacks and defenses. We analyze the security protection mechanism of the LLM, and propose a framework that combines attack and defense. Our method is based on the linearly separable property of LLM intermediate layer embedding, as well as the essence of jailbreak attack, which aims to embed harmful problems and transfer them to the safe area. We utilize generative adversarial network (GAN) to learn the security judgment boundary inside the LLM to achieve efficient jailbreak attack and defense. The experimental results indicate that our method achieves an average jailbreak success rate of 88.85\% across three popular LLMs, while the defense success rate on the state-of-the-art jailbreak dataset reaches an average of 84.17\%. This not only validates the effectiveness of our approach but also sheds light on the internal security mechanisms of LLMs, offering new insights for enhancing model security The code and data are available at https://github.com/NLPGM/CAVGAN.
中文摘要:本研究提出了一种基于GAN的框架,利用大语言模型中间层的线性可分嵌入特性,在提升越狱攻击成功率的同时强化防御效果,为理解模型内部安全机制提供了新视角。
English Summary: This study introduces a GAN-based framework that leverages the linearly separable embeddings in LLM intermediate layers to simultaneously enhance jailbreak attack and defense, achieving high success rates and providing insights into LLM security mechanisms.
Authors:Ruofei Wang, Peiqi Duan, Boxin Shi, Renjie Wan
Abstract:
With more event datasets being released online, safeguarding the event dataset against unauthorized usage has become a serious concern for data owners. Unlearnable Examples are proposed to prevent the unauthorized exploitation of image datasets. However, it's unclear how to create unlearnable asynchronous event streams to prevent event misuse. In this work, we propose the first unlearnable event stream generation method to prevent unauthorized training from event datasets. A new form of asynchronous event error-minimizing noise is proposed to perturb event streams, tricking the unauthorized model into learning embedded noise instead of realistic features. To be compatible with the sparse event, a projection strategy is presented to sparsify the noise to render our unlearnable event streams (UEvs). Extensive experiments demonstrate that our method effectively protects event data from unauthorized exploitation, while preserving their utility for legitimate use. We hope our UEvs contribute to the advancement of secure and trustworthy event dataset sharing. Code is available at: https://github.com/rfww/uevs.
中文: 本文首次提出不可学习事件流生成方法,通过添加稀疏的误差最小化噪声来干扰事件流,使未经授权的模型学习嵌入噪声而非真实特征,从而在保护数据合法使用的同时防止事件数据集被非法利用。
English: This paper introduces the first method for generating unlearnable event streams to protect event datasets from unauthorized training by adding sparse, error-minimizing noise that misleads models while preserving legitimate data utility.
Authors:Kaixiang Zhao, Joseph Yousry Attalla, Qian Lou, Yushun Dong
Abstract:
Graph Neural Networks (GNNs) have achieved state-of-the-art performance in various graph-based learning tasks. However, enabling privacy-preserving GNNs in encrypted domains, such as under Fully Homomorphic Encryption (FHE), typically incurs substantial computational overhead, rendering real-time and privacy-preserving inference impractical. In this work, we propose DESIGN (EncrypteD GNN Inference via sErver-Side Input Graph pruNing), a novel framework for efficient encrypted GNN inference. DESIGN tackles the critical efficiency limitations of existing FHE GNN approaches, which often overlook input data redundancy and apply uniform computational strategies. Our framework achieves significant performance gains through a hierarchical optimization strategy executed entirely on the server: first, FHE-compatible node importance scores (based on encrypted degree statistics) are computed from the encrypted graph. These scores then guide a homomorphic partitioning process, generating multi-level importance masks directly under FHE. This dynamically generated mask facilitates both input graph pruning (by logically removing unimportant elements) and a novel adaptive polynomial activation scheme, where activation complexity is tailored to node importance levels. Empirical evaluations demonstrate that DESIGN substantially accelerates FHE GNN inference compared to state-of-the-art methods while maintaining competitive model accuracy, presenting a robust solution for secure graph analytics. Our implementation is publicly available at https://github.com/LabRAI/DESIGN.
中文摘要:DESIGN框架通过服务器端分层优化,在完全同态加密环境下动态修剪输入图数据并根据节点重要性自适应调整激活复杂度,实现了高效的加密图神经网络推理。
English Summary: The DESIGN framework enables efficient encrypted Graph Neural Network inference by implementing server-side hierarchical optimization that dynamically prunes input graphs and adapts activation complexity based on node importance under Fully Homomorphic Encryption.
Authors:Shuo Shao, Yiming Li, Mengren Zheng, Zhiyang Hu, Yukun Chen, Boheng Li, Yu He, Junfeng Guo, Dacheng Tao, Zhan Qin
Abstract:
The widespread application of Deep Learning across diverse domains hinges critically on the quality and composition of training datasets. However, the common lack of disclosure regarding their usage raises significant privacy and copyright concerns. Dataset auditing techniques, which aim to determine if a specific dataset was used to train a given suspicious model, provide promising solutions to addressing these transparency gaps. While prior work has developed various auditing methods, their resilience against dedicated adversarial attacks remains largely unexplored. To bridge the gap, this paper initiates a comprehensive study evaluating dataset auditing from an adversarial perspective. We start with introducing a novel taxonomy, classifying existing methods based on their reliance on internal features (IF) (inherent to the data) versus external features (EF) (artificially introduced for auditing). Subsequently, we formulate two primary attack types: evasion attacks, designed to conceal the use of a dataset, and forgery attacks, intending to falsely implicate an unused dataset. Building on the understanding of existing methods and attack objectives, we further propose systematic attack strategies: decoupling, removal, and detection for evasion; adversarial example-based methods for forgery. These formulations and strategies lead to our new benchmark, DATABench, comprising 17 evasion attacks, 5 forgery attacks, and 9 representative auditing methods. Extensive evaluations using DATABench reveal that none of the evaluated auditing methods are sufficiently robust or distinctive under adversarial settings. These findings underscore the urgent need for developing a more secure and reliable dataset auditing method capable of withstanding sophisticated adversarial manipulation. Code is available at https://github.com/shaoshuo-ss/DATABench.
中文: 本文从对抗性角度全面评估数据集审计方法,提出分类法和系统性攻击策略,揭示现有方法易受操纵的脆弱性,并建立DATABench基准,证明亟需开发更鲁棒的审计技术。
English: This paper introduces a comprehensive adversarial evaluation of dataset auditing methods, proposing a taxonomy and systematic attack strategies that reveal their vulnerability to manipulation, and establishes the DATABench benchmark to demonstrate the urgent need for more robust auditing techniques.
Authors:Nicholas Chivaran, Jianbing Ni
Abstract:
The recent proliferation of photorealistic AI-generated images (AIGI) has raised urgent concerns about their potential misuse, particularly on social media platforms. Current state-of-the-art AIGI detection methods typically rely on large, deep neural architectures, creating significant computational barriers to real-time, large-scale deployment on platforms like social media. To challenge this reliance on computationally intensive models, we introduce LAID, the first framework -- to our knowledge -- that benchmarks and evaluates the detection performance and efficiency of off-the-shelf lightweight neural networks. In this framework, we comprehensively train and evaluate selected models on a representative subset of the GenImage dataset across spatial, spectral, and fusion image domains. Our results demonstrate that lightweight models can achieve competitive accuracy, even under adversarial conditions, while incurring substantially lower memory and computation costs compared to current state-of-the-art methods. This study offers valuable insight into the trade-off between efficiency and performance in AIGI detection and lays a foundation for the development of practical, scalable, and trustworthy detection systems. The source code of LAID can be found at: https://github.com/nchivar/LAID.
中文摘要:LAID框架研究表明,轻量级神经网络能以显著降低的计算成本实现与现有方法相媲美的AI生成图像检测精度,为社交媒体平台的可扩展部署提供了实用解决方案。
English summary: The LAID framework demonstrates that lightweight neural networks can achieve competitive accuracy in detecting AI-generated images with significantly lower computational costs, offering a practical solution for scalable deployment on social media platforms.
Authors:Binyan Xu, Fan Yang, Xilin Dai, Di Tang, Kehuan Zhang
Abstract:
Deep Neural Networks (DNNs) are susceptible to backdoor attacks, where adversaries poison training data to implant backdoor into the victim model. Current backdoor defenses on poisoned data often suffer from high computational costs or low effectiveness against advanced attacks like clean-label and clean-image backdoors. To address them, we introduce CLIP-Guided backdoor Defense (CGD), an efficient and effective method that mitigates various backdoor attacks. CGD utilizes a publicly accessible CLIP model to identify inputs that are likely to be clean or poisoned. It then retrains the model with these inputs, using CLIP's logits as a guidance to effectively neutralize the backdoor. Experiments on 4 datasets and 11 attack types demonstrate that CGD reduces attack success rates (ASRs) to below 1% while maintaining clean accuracy (CA) with a maximum drop of only 0.3%, outperforming existing defenses. Additionally, we show that clean-data-based defenses can be adapted to poisoned data using CGD. Also, CGD exhibits strong robustness, maintaining low ASRs even when employing a weaker CLIP model or when CLIP itself is compromised by a backdoor. These findings underscore CGD's exceptional efficiency, effectiveness, and applicability for real-world backdoor defense scenarios. Code: https://github.com/binyxu/CGD.
Chinese: 提出的CLIP引导后门防御方法(CGD)利用公开CLIP模型区分并重新训练干净与中毒输入,在多种数据集和攻击类型中实现了接近零的攻击成功率,同时对准确率影响极小。
English: The proposed CLIP-Guided backdoor Defense (CGD) effectively mitigates various backdoor attacks by leveraging a public CLIP model to distinguish and retrain on clean versus poisoned inputs, achieving near-zero attack success rates with minimal impact on accuracy across multiple datasets and attack types.
Authors:Hongyao Yu, Yixiang Qiu, Yiheng Yang, Hao Fang, Tianqu Zhuang, Jiaxin Hong, Bin Chen, Hao Wu, Shu-Tao Xia
Abstract:
Autoregressive image generation has witnessed rapid advancements, with prominent models such as scale-wise visual auto-regression pushing the boundaries of visual synthesis. However, these developments also raise significant concerns regarding data privacy and copyright. In response, training data detection has emerged as a critical task for identifying unauthorized data usage in model training. To better understand the vulnerability of autoregressive image generative models to such detection, we conduct the first study applying membership inference to this domain. Our approach comprises two key components: implicit classification and an adaptive score aggregation strategy. First, we compute the implicit token-wise classification score within the query image. Then we propose an adaptive score aggregation strategy to acquire a final score, which places greater emphasis on the tokens with lower scores. A higher final score indicates that the sample is more likely to be involved in the training set. To validate the effectiveness of our method, we adapt existing detection algorithms originally designed for LLMs to visual autoregressive models. Extensive experiments demonstrate the superiority of our method in both class-conditional and text-to-image scenarios. Moreover, our approach exhibits strong robustness and generalization under various data transformations. Furthermore, sufficient experiments suggest two novel key findings: (1) A linear scaling law on membership inference, exposing the vulnerability of large foundation models. (2) Training data from scale-wise visual autoregressive models is easier to detect than other autoregressive paradigms.Our code is available at https://github.com/Chrisqcwx/ImageAR-MIA.
中文: 本研究首次针对自回归图像生成模型提出成员推理方法,通过隐式分类与自适应分数聚合策略有效检测训练数据滥用,并揭示大型基础模型的脆弱性及不同范式间的检测差异。
English: This study introduces the first membership inference method for autoregressive image generative models, combining implicit classification with adaptive score aggregation to effectively detect unauthorized training data usage and revealing vulnerabilities in large foundation models.
Authors:Thinh Dao, Dung Thuy Nguyen, Khoa D Doan, Kok-Seng Wong
Abstract:
Federated Learning (FL) systems are vulnerable to backdoor attacks, where adversaries train their local models on poisoned data and submit poisoned model updates to compromise the global model. Despite numerous proposed attacks and defenses, divergent experimental settings, implementation errors, and unrealistic assumptions hinder fair comparisons and valid conclusions about their effectiveness in real-world scenarios. To address this, we introduce BackFed - a comprehensive benchmark suite designed to standardize, streamline, and reliably evaluate backdoor attacks and defenses in FL, with a focus on practical constraints. Our benchmark offers key advantages through its multi-processing implementation that significantly accelerates experimentation and the modular design that enables seamless integration of new methods via well-defined APIs. With a standardized evaluation pipeline, we envision BackFed as a plug-and-play environment for researchers to comprehensively and reliably evaluate new attacks and defenses. Using BackFed, we conduct large-scale studies of representative backdoor attacks and defenses across both Computer Vision and Natural Language Processing tasks with diverse model architectures and experimental settings. Our experiments critically assess the performance of proposed attacks and defenses, revealing unknown limitations and modes of failures under practical conditions. These empirical insights provide valuable guidance for the development of new methods and for enhancing the security of FL systems. Our framework is openly available at https://github.com/thinh-dao/BackFed.
中文摘要:BackFed基准套件标准化并加速了联邦学习中后门攻击与防御的评估,通过系统性实验揭示现有方法的局限性,为提升联邦学习安全性提供重要指导。
English Summary: The BackFed benchmark suite standardizes and accelerates the evaluation of backdoor attacks and defenses in Federated Learning, enabling comprehensive assessments that reveal limitations and guide future security improvements.
Authors:Josep Domingo-Ferrer, Najeeb Jebreel, David Sánchez
Abstract:
Privacy protection laws, such as the GDPR, grant individuals the right to request the forgetting of their personal data not only from databases but also from machine learning (ML) models trained on them. Machine unlearning has emerged as a practical means to facilitate model forgetting of data instances seen during training. Although some existing machine unlearning methods guarantee exact forgetting, they are typically costly in computational terms. On the other hand, more affordable methods do not offer forgetting guarantees and are applicable only to specific ML models. In this paper, we present \emph{efficient unlearning with privacy guarantees} (EUPG), a novel machine unlearning framework that offers formal privacy guarantees to individuals whose data are being unlearned. EUPG involves pre-training ML models on data protected using privacy models, and it enables {\em efficient unlearning with the privacy guarantees offered by the privacy models in use}. Through empirical evaluation on four heterogeneous data sets protected with $k$-anonymity and $ε$-differential privacy as privacy models, our approach demonstrates utility and forgetting effectiveness comparable to those of exact unlearning methods, while significantly reducing computational and storage costs. Our code is available at https://github.com/najeebjebreel/EUPG.
中文:EUPG框架通过使用隐私保护数据预训练模型,实现了具有正式隐私保障的高效机器遗忘,在保持与精确遗忘方法相当的效用和遗忘效果的同时,显著降低了计算和存储成本。
English: The EUPG framework enables efficient machine unlearning with formal privacy guarantees by pre-training models on privacy-protected data, achieving comparable utility and forgetting effectiveness to exact methods while significantly reducing computational and storage costs.
Authors:Ziming Hong, Runnan Chen, Zengmao Wang, Bo Han, Bo Du, Tongliang Liu
Abstract:
Data-free knowledge distillation (DFKD) transfers knowledge from a teacher to a student without access the real in-distribution (ID) data. Its common solution is to use a generator to synthesize fake data and use them as a substitute for real ID data. However, existing works typically assume teachers are trustworthy, leaving the robustness and security of DFKD from untrusted teachers largely unexplored. In this work, we conduct the first investigation into distilling non-transferable learning (NTL) teachers using DFKD, where the transferability from an ID domain to an out-of-distribution (OOD) domain is prohibited. We find that NTL teachers fool DFKD through divert the generator's attention from the useful ID knowledge to the misleading OOD knowledge. This hinders ID knowledge transfer but prioritizes OOD knowledge transfer. To mitigate this issue, we propose Adversarial Trap Escaping (ATEsc) to benefit DFKD by identifying and filtering out OOD-like synthetic samples. Specifically, inspired by the evidence that NTL teachers show stronger adversarial robustness on OOD samples than ID samples, we split synthetic samples into two groups according to their robustness. The fragile group is treated as ID-like data and used for normal knowledge distillation, while the robust group is seen as OOD-like data and utilized for forgetting OOD knowledge. Extensive experiments demonstrate the effectiveness of ATEsc for improving DFKD against NTL teachers. Code is released at https://github.com/tmllab/2025_ICML_ATEsc.
中文: 本研究提出对抗性陷阱逃逸方法,通过识别和过滤分布外合成样本,有效应对不可迁移学习教师,提升无数据知识蒸馏的鲁棒性和知识传递效果。
English: This study introduces Adversarial Trap Escaping (ATEsc) to enhance data-free knowledge distillation by identifying and filtering out-of-distribution synthetic samples, effectively countering non-transferable learning teachers and improving knowledge transfer robustness.
Authors:StanisÅaw Pawlak, BartÅomiej Twardowski, Tomasz TrzciÅski, Joost van de Weijer
Abstract:
Our research addresses the overlooked security concerns related to data poisoning in continual learning (CL). Data poisoning - the intentional manipulation of training data to affect the predictions of machine learning models - was recently shown to be a threat to CL training stability. While existing literature predominantly addresses scenario-dependent attacks, we propose to focus on a more simple and realistic single-task poison (STP) threats. In contrast to previously proposed poisoning settings, in STP adversaries lack knowledge and access to the model, as well as to both previous and future tasks. During an attack, they only have access to the current task within the data stream. Our study demonstrates that even within these stringent conditions, adversaries can compromise model performance using standard image corruptions. We show that STP attacks are able to strongly disrupt the whole continual training process: decreasing both the stability (its performance on past tasks) and plasticity (capacity to adapt to new tasks) of the algorithm. Finally, we propose a high-level defense framework for CL along with a poison task detection method based on task vectors. The code is available at https://github.com/stapaw/STP.git .
中文摘要:本研究揭示了持续学习中单任务数据投毒的威胁,攻击者仅利用简单图像干扰即可破坏模型稳定性与可塑性,并提出了包含投毒检测的防御框架。
English Summary: This study highlights the threat of single-task poisoning in continual learning, where adversaries can degrade model stability and plasticity using simple image corruptions, and proposes a defense framework with poison detection methods.
Authors:Ziqi Miao, Yi Ding, Lijun Li, Jing Shao
Abstract:
With the emergence of strong vision language capabilities, multimodal large language models (MLLMs) have demonstrated tremendous potential for real-world applications. However, the security vulnerabilities exhibited by the visual modality pose significant challenges to deploying such models in open-world environments. Recent studies have successfully induced harmful responses from target MLLMs by encoding harmful textual semantics directly into visual inputs. However, in these approaches, the visual modality primarily serves as a trigger for unsafe behavior, often exhibiting semantic ambiguity and lacking grounding in realistic scenarios. In this work, we define a novel setting: vision-centric jailbreak, where visual information serves as a necessary component in constructing a complete and realistic jailbreak context. Building on this setting, we propose the VisCo (Visual Contextual) Attack. VisCo fabricates contextual dialogue using four distinct vision-focused strategies, dynamically generating auxiliary images when necessary to construct a vision-centric jailbreak scenario. To maximize attack effectiveness, it incorporates automatic toxicity obfuscation and semantic refinement to produce a final attack prompt that reliably triggers harmful responses from the target black-box MLLMs. Specifically, VisCo achieves a toxicity score of 4.78 and an Attack Success Rate (ASR) of 85% on MM-SafetyBench against GPT-4o, significantly outperforming the baseline, which achieves a toxicity score of 2.48 and an ASR of 22.2%. Code: https://github.com/Dtc7w3PQ/Visco-Attack.
Chinese: 多模态大语言模型面临以视觉为中心的越狱安全风险,VisCo攻击通过情境对话和动态图像生成有效诱导有害响应,在GPT-4o等模型上实现了高毒性和成功率。
English: Multimodal large language models face security risks from vision-centric jailbreaks, where the VisCo Attack uses contextual dialogue and dynamic image generation to effectively trigger harmful responses, achieving high toxicity and success rates against models like GPT-4o.
Authors:Liangyu Wang, Junxiao Wang, Jie Ren, Zihang Xiang, David E. Keyes, Di Wang
Abstract:
As large language models (LLMs) increasingly underpin technological advancements, the privacy of their training data emerges as a critical concern. Differential Privacy (DP) serves as a rigorous mechanism to protect this data, yet its integration via Differentially Private Stochastic Gradient Descent (DP-SGD) introduces substantial challenges, primarily due to the complexities of per-sample gradient clipping. Current explicit methods, such as Opacus, necessitate extensive storage for per-sample gradients, significantly inflating memory requirements. Conversely, implicit methods like GhostClip reduce storage needs by recalculating gradients multiple times, which leads to inefficiencies due to redundant computations. This paper introduces FlashDP, an innovative cache-friendly per-layer DP-SGD that consolidates necessary operations into a single task, calculating gradients only once in a fused manner. This approach not only diminishes memory movement by up to \textbf{50\%} but also cuts down redundant computations by \textbf{20\%}, compared to previous methods. Consequently, FlashDP does not increase memory demands and achieves a \textbf{90\%} throughput compared to the Non-DP method on a four-A100 system during the pre-training of the Llama-13B model, while maintaining parity with standard per-layer clipped DP-SGD in terms of accuracy. These advancements establish FlashDP as a pivotal development for efficient and privacy-preserving training of LLMs. FlashDP's code has been open-sourced in https://github.com/kaustpradalab/flashdp.
中文: FlashDP提出了一种缓存友好的逐层差分隐私随机梯度下降方法,将内存移动减少50%,冗余计算降低20%,在保持精度的同时达到非差分隐私方法90%的吞吐量,实现了高效保护隐私的大语言模型训练。
English: FlashDP introduces a cache-friendly per-layer DP-SGD method that reduces memory movement by 50% and redundant computations by 20%, achieving 90% throughput of non-DP methods while maintaining accuracy for efficient privacy-preserving LLM training.
Authors:Yuning Jiang, Nay Oo, Qiaoran Meng, Lu Lin, Dusit Niyato, Zehui Xiong, Hoon Wei Lim, Biplab Sikdar
Abstract:
Modern cyber attacks unfold through multiple stages, requiring defenders to dynamically prioritize mitigations under uncertainty. While game-theoretic models capture attacker-defender interactions, existing approaches often rely on static assumptions and lack integration with real-time threat intelligence, limiting their adaptability. This paper presents CyGATE, a game-theoretic framework modeling attacker-defender interactions, using large language models (LLMs) with retrieval-augmented generation (RAG) to enhance tactic selection and patch prioritization. Applied to a two-agent scenario, CyGATE frames cyber conflicts as a partially observable stochastic game (POSG) across Cyber Kill Chain stages. Both agents use belief states to navigate uncertainty, with the attacker adapting tactics and the defender re-prioritizing patches based on evolving risks and observed adversary behavior. The framework's flexible architecture enables extension to multi-agent scenarios involving coordinated attackers, collaborative defenders, or complex enterprise environments with multiple stakeholders. Evaluated in a dynamic patch scheduling scenario, CyGATE effectively prioritizes high-risk vulnerabilities, enhancing adaptability through dynamic threat integration, strategic foresight by anticipating attacker moves under uncertainty, and efficiency by optimizing resource use.
中文: CyGATE是一个基于博弈论的框架,利用大型语言模型和检索增强生成技术模拟攻防交互,通过实时威胁情报实现动态补丁优先级排序,从而提升网络防御的适应性和效率。
English: CyGATE is a game-theoretic framework that uses large language models with retrieval-augmented generation to model attacker-defender interactions, enabling dynamic patch prioritization and enhanced adaptability in cyber defense through real-time threat intelligence.
Authors:Shouju Wang, Fenglin Yu, Xirui Liu, Xiaoting Qin, Jue Zhang, Qingwei Lin, Dongmei Zhang, Saravan Rajmohan
Abstract:
The increasing autonomy of LLM agents in handling sensitive communications, accelerated by Model Context Protocol (MCP) and Agent-to-Agent (A2A) frameworks, creates urgent privacy challenges. While recent work reveals significant gaps between LLMs' privacy Q&A performance and their agent behavior, existing benchmarks remain limited to static, simplified scenarios. We present PrivacyChecker, a model-agnostic, contextual integrity based mitigation approach that effectively reduces privacy leakage from 36.08% to 7.30% on DeepSeek-R1 and from 33.06% to 8.32% on GPT-4o, all while preserving task helpfulness. We also introduce PrivacyLens-Live, transforming static benchmarks into dynamic MCP and A2A environments that reveal substantially higher privacy risks in practical. Our modular mitigation approach integrates seamlessly into agent protocols through three deployment strategies, providing practical privacy protection for the emerging agentic ecosystem. Our data and code will be made available at https://aka.ms/privacy_in_action.
中文: PrivacyChecker将LLM代理的隐私泄露率从30%以上有效降至9%以下且不影响任务性能,而PrivacyLens-Live通过将静态基准转化为动态环境,揭示了实际应用中更高的隐私风险。
English: PrivacyChecker effectively reduces privacy leakage in LLM agents from over 30% to under 9% while maintaining task performance, and PrivacyLens-Live transforms static benchmarks into dynamic environments to reveal higher real-world privacy risks.
Authors:Peigui Qi, Kunsheng Tang, Wenbo Zhou, Weiming Zhang, Nenghai Yu, Tianwei Zhang, Qing Guo, Jie Zhang
Abstract:
Text-to-image models have shown remarkable capabilities in generating high-quality images from natural language descriptions. However, these models are highly vulnerable to adversarial prompts, which can bypass safety measures and produce harmful content. Despite various defensive strategies, achieving robustness against attacks while maintaining practical utility in real-world applications remains a significant challenge. To address this issue, we first conduct an empirical study of the text encoder in the Stable Diffusion (SD) model, which is a widely used and representative text-to-image model. Our findings reveal that the [EOS] token acts as a semantic aggregator, exhibiting distinct distributional patterns between benign and adversarial prompts in its embedding space. Building on this insight, we introduce \textbf{SafeGuider}, a two-step framework designed for robust safety control without compromising generation quality. SafeGuider combines an embedding-level recognition model with a safety-aware feature erasure beam search algorithm. This integration enables the framework to maintain high-quality image generation for benign prompts while ensuring robust defense against both in-domain and out-of-domain attacks. SafeGuider demonstrates exceptional effectiveness in minimizing attack success rates, achieving a maximum rate of only 5.48\% across various attack scenarios. Moreover, instead of refusing to generate or producing black images for unsafe prompts, \textbf{SafeGuider} generates safe and meaningful images, enhancing its practical utility. In addition, SafeGuider is not limited to the SD model and can be effectively applied to other text-to-image models, such as the Flux model, demonstrating its versatility and adaptability across different architectures. We hope that SafeGuider can shed some light on the practical deployment of secure text-to-image systems.
中文摘要:SafeGuider框架通过利用[EOS]令牌的语义特性,在保持高质量图像生成的同时,有效防御各类对抗性提示攻击,显著提升了文本到图像模型的安全性。
English Summary: The SafeGuider framework effectively enhances the security of text-to-image models by leveraging the [EOS] token's semantic properties to defend against adversarial prompts while maintaining high-quality image generation across various attack scenarios.
Authors:Chunxue Xu, Yiwei Wang, Yujun Cai, Bryan Hooi, Songze Li
Abstract:
Chain-of-Thought (CoT) techniques have significantly enhanced reasoning in Vision-Language Models (VLMs). Extending this paradigm, Visual CoT integrates explicit visual edits, such as cropping or annotating regions of interest, into the reasoning process, achieving superior multimodal performance. However, the robustness of Visual CoT-based VLMs against image-level noise remains unexplored. In this paper, we present the first systematic evaluation of Visual CoT robustness under visual perturbations. Our benchmark spans 12 image corruption types across 4 Visual Question Answering (VQA) datasets, enabling a comprehensive comparison between VLMs that use Visual CoT, and VLMs that do not. The results reveal that integrating Visual CoT consistently improves absolute accuracy regardless of whether the input images are clean or corrupted by noise; however, it also increases sensitivity to input perturbations, resulting in sharper performance degradation compared to standard VLMs. Through extensive analysis, we identify the intermediate reasoning components of Visual CoT, i.e., the edited image patches , as the primary source of fragility. Building on this analysis, we propose a plug-and-play robustness enhancement method that integrates Grounding DINO model into the Visual CoT pipeline, providing high-confidence local visual cues to stabilize reasoning. Our work reveals clear fragility patterns in Visual CoT and offers an effective, architecture-agnostic solution for enhancing visual robustness.
中文: 视觉思维链通过整合视觉编辑增强了视觉语言模型的推理能力,但增加了对图像扰动的敏感性,并提出了利用Grounding DINO模型来稳定性能的解决方案。
English: Visual CoT enhances reasoning in VLMs by integrating visual edits but increases sensitivity to image perturbations, with a proposed solution using Grounding DINO to stabilize performance.
Authors:Jiawei Zhao, Yuang Qi, Weiming Zhang, Nenghai Yu, Kejiang Chen
Abstract:
Large Reasoning Models (LRMs) have demonstrated remarkable performance on tasks such as mathematics and code generation. Motivated by these strengths, recent work has empirically demonstrated the effectiveness of LRMs as guard models in improving harmful query detection. However, LRMs typically generate long reasoning traces during inference, causing substantial computational overhead. In this paper, we introduce PSRT, a method that replaces the model's reasoning process with a Prefilled Safe Reasoning Trace, thereby significantly reducing the inference cost of LRMs. Concretely, PSRT prefills "safe reasoning virtual tokens" from a constructed dataset and learns over their continuous embeddings. With the aid of indicator tokens, PSRT enables harmful-query detection in a single forward pass while preserving the classification effectiveness of LRMs. We evaluate PSRT on 7 models, 13 datasets, and 8 jailbreak methods. In terms of efficiency, PSRT completely removes the overhead of generating reasoning tokens during inference. In terms of classification performance, PSRT achieves nearly identical accuracy, with only a minor average F1 drop of 0.015 across 7 models and 5 datasets.
中文摘要:PSRT是一种通过预填充安全推理轨迹的新方法,在保持大推理模型有害查询检测性能基本不变的同时,显著降低了其计算开销。
English Summary: PSRT is a novel method that prefills safe reasoning traces to significantly reduce the computational overhead of Large Reasoning Models while maintaining nearly identical harmful-query detection performance.
Authors:Yanwei Gong, Ruichen Zhang, Xiaoqing Wang, Xiaolin Chang, Bo Ai, Junchao Fan, Bocheng Ju, Dusit Niyato
Abstract:
Unmanned Aerial Vehicle (UAV) cluster services are crucial for promoting the low-altitude economy by enabling scalable, flexible, and adaptive aerial networks. To meet diverse service demands, clusters must dynamically incorporate a New UAVs (NUAVs) or an Existing UAV (EUAV). However, achieving sustained service reliability remains challenging due to the need for efficient and scalable NUAV authentication, privacy-preserving cross-cluster authentication for EUAVs, and robust protection of the cluster session key, including both forward and backward secrecy. To address these challenges, we propose a Lightweight and Privacy-Preserving Cluster Authentication and Session Key Update (LP2-CASKU) scheme tailored for dynamic UAV clusters in low-altitude economy networks. LP2-CASKU integrates an efficient batch authentication mechanism that simultaneously authenticates multiple NUAVs with minimal communication overhead. It further introduces a lightweight cross-cluster authentication mechanism that ensures EUAV anonymity and unlinkability. Additionally, a secure session key update mechanism is incorporated to maintain key confidentiality over time, thereby preserving both forward and backward secrecy. We provide a comprehensive security analysis and evaluate LP2-CASKU performance through both theoretical analysis and OMNeT++ simulations. Experimental results demonstrate that, compared to the baseline, LP2-CASKU achieves a latency reduction of 82.8%-90.8% by across different UAV swarm configurations and network bitrates, demonstrating strong adaptability to dynamic communication environments. Besides, under varying UAV swarm configurations, LP2-CASKU reduces the energy consumption by approximately 37.6-72.6%, while effectively supporting privacy-preserving authentication in highly dynamic UAV cluster environments.
中文: LP2-CASKU方案为动态无人机集群提供轻量级隐私保护认证和安全会话密钥更新,在确保前向与后向保密性的同时,显著降低了通信延迟和能耗。
English: The LP2-CASKU scheme provides lightweight, privacy-preserving authentication and secure session key updates for dynamic UAV clusters, significantly reducing latency and energy consumption while ensuring forward and backward secrecy.
Authors:Chenbo Hu, Ruichen Zhang, Bo Li, Xu Jiang, Nan Zhao, Marco Di Renzo, Dusit Niyato, Arumugam Nallanathan, George K. Karagiannidis
Abstract:
Space-air-ground integrated networks (SAGINs) face unprecedented security challenges due to their inherent characteristics, such as multidimensional heterogeneity and dynamic topologies. These characteristics fundamentally undermine conventional security methods and traditional artificial intelligence (AI)-driven solutions. Generative AI (GAI) is a transformative approach that can safeguard SAGIN security by synthesizing data, understanding semantics, and making autonomous decisions. This survey fills existing review gaps by examining GAI-empowered secure communications across SAGINs. First, we introduce secured SAGINs and highlight GAI's advantages over traditional AI for security defenses. Then, we explain how GAI mitigates failures of authenticity, breaches of confidentiality, tampering of integrity, and disruptions of availability across the physical, data link, and network layers of SAGINs. Three step-by-step tutorials discuss how to apply GAI to solve specific problems using concrete methods, emphasizing its generative paradigm beyond traditional AI. Finally, we outline open issues and future research directions, including lightweight deployment, adversarial robustness, and cross-domain governance, to provide major insights into GAI's role in shaping next-generation SAGIN security.
Chinese: 生成式AI通过数据合成、语义理解和自主决策,为天地空一体化网络的安全挑战提供了超越传统AI方法的变革性解决方案。
English: Generative AI offers a transformative solution to the security challenges of space-air-ground integrated networks by synthesizing data, understanding semantics, and enabling autonomous decisions, surpassing traditional AI methods.
Authors:Yanghao Su, Jie Zhang, Yiming Li, Tianwei Zhang, Qing Guo, Weiming Zhang, Nenghai Yu, Nils Lukas, Wenbo Zhou
Abstract:
Backdoor unlearning aims to remove backdoor-related information while preserving the model's original functionality. However, existing unlearning methods mainly focus on recovering trigger patterns but fail to restore the correct semantic labels of poison samples. This limitation prevents them from fully eliminating the false correlation between the trigger pattern and the target label. To address this, we leverage boundary adversarial attack techniques, revealing two key observations. First, poison samples exhibit significantly greater distances from decision boundaries compared to clean samples, indicating they require larger adversarial perturbations to change their predictions. Second, while adversarial predicted labels for clean samples are uniformly distributed, those for poison samples tend to revert to their original correct labels. Moreover, the features of poison samples restore to closely resemble those of corresponding clean samples after adding adversarial perturbations. Building upon these insights, we propose Backdoor Unlearning via adversaRial bouNdary analysis (BURN), a novel defense framework that integrates false correlation decoupling, progressive data refinement, and model purification. In the first phase, BURN employs adversarial boundary analysis to detect poisoned samples based on their abnormal adversarial boundary distances, then restores their correct semantic labels for fine-tuning. In the second phase, it employs a feedback mechanism that tracks prediction discrepancies between the original backdoored model and progressively sanitized models, guiding both dataset refinement and model purification. Extensive evaluations across multiple datasets, architectures, and seven diverse backdoor attack types confirm that BURN effectively removes backdoor threats while maintaining the model's original performance.
中文: BURN是一种新颖的后门遗忘框架,通过对抗性边界分析检测并修正中毒样本,在保持模型原有性能的同时有效消除后门威胁。
English: BURN is a novel backdoor unlearning framework that uses adversarial boundary analysis to detect and correct poisoned samples, effectively removing backdoor threats while preserving the model's original performance.
Authors:Wei Fan, Kejiang Chen, Chang Liu, Weiming Zhang, Nenghai Yu
Abstract:
The rapid advancement of speech generation models has heightened privacy and security concerns related to voice cloning (VC). Recent studies have investigated disrupting unauthorized voice cloning by introducing adversarial perturbations. However, determined attackers can mitigate these protective perturbations and successfully execute VC. In this study, we conduct the first systematic evaluation of these protective perturbations against VC under realistic threat models that include perturbation purification. Our findings reveal that while existing purification methods can neutralize a considerable portion of the protective perturbations, they still lead to distortions in the feature space of VC models, which degrades the performance of VC. From this perspective, we propose a novel two-stage purification method: (1) Purify the perturbed speech; (2) Refine it using phoneme guidance to align it with the clean speech distribution. Experimental results demonstrate that our method outperforms state-of-the-art purification methods in disrupting VC defenses. Our study reveals the limitations of adversarial perturbation-based VC defenses and underscores the urgent need for more robust solutions to mitigate the security and privacy risks posed by VC. The code and audio samples are available at https://de-antifake.github.io.
中文: 本研究评估了对抗性扰动在防御语音克隆中的局限性,并提出一种两阶段净化方法,能有效破坏现有防御措施,强调了开发更强大安全解决方案的紧迫性。
English: This study evaluates the limitations of adversarial perturbations in defending against voice cloning and introduces a two-stage purification method that effectively disrupts these defenses, highlighting the need for more robust security solutions.
Authors:Yiting Qu, Ziqing Yang, Yihan Ma, Michael Backes, Savvas Zannettou, Yang Zhang
Abstract:
Recent advances in text-to-image diffusion models have enabled the creation of a new form of digital art: optical illusions--visual tricks that create different perceptions of reality. However, adversaries may misuse such techniques to generate hateful illusions, which embed specific hate messages into harmless scenes and disseminate them across web communities. In this work, we take the first step toward investigating the risks of scalable hateful illusion generation and the potential for bypassing current content moderation models. Specifically, we generate 1,860 optical illusions using Stable Diffusion and ControlNet, conditioned on 62 hate messages. Of these, 1,571 are hateful illusions that successfully embed hate messages, either overtly or subtly, forming the Hateful Illusion dataset. Using this dataset, we evaluate the performance of six moderation classifiers and nine vision language models (VLMs) in identifying hateful illusions. Experimental results reveal significant vulnerabilities in existing moderation models: the detection accuracy falls below 0.245 for moderation classifiers and below 0.102 for VLMs. We further identify a critical limitation in their vision encoders, which mainly focus on surface-level image details while overlooking the secondary layer of information, i.e., hidden messages. To address this risk, we explore preliminary mitigation measures and identify the most effective approaches from the perspectives of image transformations and training-level strategies.
中文摘要:近期文本生成图像模型能制作嵌入仇恨信息的视觉错觉,现有审核系统存在显著漏洞,分类器检测准确率低于0.245,视觉语言模型低于0.102,构成严重内容安全风险。
English Summary: Recent text-to-image models can create optical illusions that embed hate messages, posing risks as current moderation systems show significant vulnerabilities with detection accuracy below 0.245 for classifiers and 0.102 for vision-language models.
Authors:Yiting Qu, Michael Backes, Yang Zhang
Abstract:
Vision-language models (VLMs) are increasingly applied to identify unsafe or inappropriate images due to their internal ethical standards and powerful reasoning abilities. However, it is still unclear whether they can recognize various unsafe concepts when presented in different modalities, such as text and images. To address this, we first compile the UnsafeConcepts dataset, featuring 75 unsafe concepts, i.e., ``Swastika,'' ``Sexual Harassment,'' and ``Assaults,'' along with associated 1.5K images. We then conduct a systematic evaluation of VLMs' perception (concept recognition) and alignment (ethical reasoning) capabilities. We assess eight popular VLMs and find that, although most VLMs accurately perceive unsafe concepts, they sometimes mistakenly classify these concepts as safe. We also identify a consistent modality gap among open-source VLMs in distinguishing between visual and textual unsafe concepts. To bridge this gap, we introduce a simplified reinforcement learning (RL)-based approach using proximal policy optimization (PPO) to strengthen the ability to identify unsafe concepts from images. Our approach uses reward scores based directly on VLM responses, bypassing the need for collecting human-annotated preference data to train a new reward model. Experimental results show that our approach effectively enhances VLM alignment on images while preserving general capabilities. It outperforms baselines such as supervised fine-tuning (SFT) and direct preference optimization (DPO). We hope our dataset, evaluation findings, and proposed alignment solution contribute to the community's efforts in advancing safe VLMs.
Chinese: 视觉语言模型(VLMs)能够识别不安全概念,但常将其误判为安全,因此提出了一种基于强化学习的方法,无需人工标注数据即可增强模型的对齐能力,并优于现有基准。
English: Vision-language models (VLMs) can recognize unsafe concepts but often misclassify them as safe, prompting the development of a reinforcement learning approach that improves alignment without human-annotated data, outperforming existing methods.
Authors:Meet Udeshi, Venkata Sai Charan Putrevu, Prashanth Krishnamurthy, Prashant Anantharaman, Sean Carrick, Ramesh Karri, Farshad Khorrami
Abstract:
Security of software supply chains is necessary to ensure that software updates do not contain maliciously injected code or introduce vulnerabilities that may compromise the integrity of critical infrastructure. Verifying the integrity of software updates involves binary differential analysis (binary diffing) to highlight the changes between two binary versions by incorporating binary analysis and reverse engineering. Large language models (LLMs) have been applied to binary analysis to augment traditional tools by producing natural language summaries that cybersecurity experts can grasp for further analysis. Combining LLM-based binary code summarization with binary diffing can improve the LLM's focus on critical changes and enable complex tasks such as automated malware detection. To address this, we propose a novel framework for binary diff summarization using LLMs. We introduce a novel functional sensitivity score (FSS) that helps with automated triage of sensitive binary functions for downstream detection tasks. We create a software supply chain security benchmark by injecting 3 different malware into 6 open-source projects which generates 104 binary versions, 392 binary diffs, and 46,023 functions. On this, our framework achieves a precision of 0.98 and recall of 0.64 for malware detection, displaying high accuracy with low false positives. Across malicious and benign functions, we achieve FSS separation of 3.0 points, confirming that FSS categorization can classify sensitive functions. We conduct a case study on the real-world XZ utils supply chain attack; our framework correctly detects the injected backdoor functions with high FSS.
中文: 本文提出了一种利用大语言模型进行二进制差异摘要的新框架,通过引入功能敏感度评分来增强软件供应链中的恶意软件检测,在自定义基准测试中实现了高精度和召回率,并成功识别了XZ工具等真实案例中的后门程序。
English: This paper introduces a novel framework using large language models for binary diff summarization, incorporating a functional sensitivity score to enhance malware detection in software supply chains, achieving high precision and recall on a custom benchmark and successfully identifying backdoors in real-world cases like XZ utils.
Authors:Md Raz, Meet Udeshi, P. V. Sai Charan, Prashanth Krishnamurthy, Farshad Khorrami, Ramesh Karri
Abstract:
Using automated reasoning, code synthesis, and contextual decision-making, we introduce a new threat that exploits large language models (LLMs) to autonomously plan, adapt, and execute the ransomware attack lifecycle. Ransomware 3.0 represents the first threat model and research prototype of LLM-orchestrated ransomware. Unlike conventional malware, the prototype only requires natural language prompts embedded in the binary; malicious code is synthesized dynamically by the LLM at runtime, yielding polymorphic variants that adapt to the execution environment. The system performs reconnaissance, payload generation, and personalized extortion, in a closed-loop attack campaign without human involvement. We evaluate this threat across personal, enterprise, and embedded environments using a phase-centric methodology that measures quantitative fidelity and qualitative coherence in each attack phase. We show that open source LLMs can generate functional ransomware components and sustain closed-loop execution across diverse environments. Finally, we present behavioral signals and multi-level telemetry of Ransomware 3.0 through a case study to motivate future development of better defenses and policy enforcements to address novel AI-enabled ransomware attacks.
This research introduces Ransomware 3.0, the first LLM-orchestrated ransomware prototype that autonomously executes polymorphic attacks through dynamic code synthesis and closed-loop operations across various environments, highlighting the need for new defense mechanisms against AI-driven threats.
English Summary:
Authors:Prashanth Krishnamurthy, Ramesh Karri, Farshad Khorrami
Abstract:
We note that constituent fields (notably the fraction-of-seconds timestamp field) in the data payload structure of the synchrophasor communication protocol (IEEE C37.118 standard) are overprovisioned relative to real-world usage and needs, lending themselves to abuse for embedding of covert channels. We develop the SCAMPER (Synchrophasor Covert Channel for Malicious and Protective ERrands) framework to exploit these overprovisioned fields for covert communication and show that SCAMPER can be applied for both malicious (attack) and protective (defense) purposes. Through modifications of the timestamp field, we demonstrate that SCAMPER enables an attacker to accomplish surreptitious communications between devices in the power system to trigger a variety of malicious actions. These timestamp modifications can be performed without having any impact on the operation of the power system. However, having recognized the potential for this covert channel, we show that SCAMPER can instead be applied for defensive security purposes as an integrated cryptographic data integrity mechanism that can facilitate detection of false data injection (FDI) attacks. We perform experimental studies of the proposed methods on two Hardware-in-the-Loop (HIL) testbeds to demonstrate the effectiveness of the proposed SCAMPER framework for both malicious and protective purposes.
中文摘要:SCAMPER框架利用IEEE C37.118协议中过度配置的时间戳字段实现隐蔽通信,既能用于电力系统的恶意攻击,也可作为防御性安全机制。
English Summary: The SCAMPER framework exploits overprovisioned timestamp fields in the IEEE C37.118 protocol to enable covert communication, which can be used for both malicious attacks and defensive security measures in power systems.
Authors:Meet Udeshi, Venkata Sai Charan Putrevu, Prashanth Krishnamurthy, Ramesh Karri, Farshad Khorrami
Abstract:
Cyber-attacks on operational technology (OT) and cyber-physical systems (CPS) have increased tremendously in recent years with the proliferation of malware targeting Linux-based embedded devices of OT and CPS systems. Comprehensive malware detection requires dynamic analysis of execution behavior in addition to static analysis of binaries. Safe execution of malware in a manner that captures relevant behaviors via side-channels requires a sandbox environment. Existing Linux sandboxes are built for specific tasks, only capture one or two side-channels, and do not offer customization for different analysis tasks. We present the SaMOSA Linux sandbox that allows emulation of Linux malwares while capturing time-synchronized side-channels from four sources. SaMOSA additionally provides emulation of network services via FakeNet, and allows orchestration and customization of the sandbox environment via pipeline hooks. In comparison to existing Linux sandboxes, SaMOSA captures more side-channels namely system calls, network activity, disk activity, and hardware performance counters. It supports three architectures predominantly used in OT and CPS namely x86-64, ARM64, and PowerPC 64. SaMOSA fills a gap in Linux malware analysis by providing a modular and customizable sandbox framework that can be adapted for many malware analysis tasks. We present three case studies of three different malware families to demonstrate the advantages of SaMOSA.
中文:SaMOSA Linux沙盒通过提供可定制环境,捕获多源同步侧信道并支持多种架构,弥补了现有Linux恶意软件分析工具的不足,为OT和CPS系统提供全面分析能力。
English: The SaMOSA Linux sandbox addresses the limitations of existing malware analysis tools by providing a customizable environment that captures synchronized side-channels from multiple sources and supports various architectures for comprehensive OT and CPS malware investigation.
Authors:Yixiang Qiu, Yanhan Liu, Hongyao Yu, Hao Fang, Bin Chen, Shu-Tao Xia, Ke Xu
Abstract:
The growing complexity of Deep Neural Networks (DNNs) has led to the adoption of Split Inference (SI), a collaborative paradigm that partitions computation between edge devices and the cloud to reduce latency and protect user privacy. However, recent advances in Data Reconstruction Attacks (DRAs) reveal that intermediate features exchanged in SI can be exploited to recover sensitive input data, posing significant privacy risks. Existing DRAs are typically effective only on shallow models and fail to fully leverage semantic priors, limiting their reconstruction quality and generalizability across datasets and model architectures. In this paper, we propose a novel GAN-based DRA framework with Progressive Feature Optimization (PFO), which decomposes the generator into hierarchical blocks and incrementally refines intermediate representations to enhance the semantic fidelity of reconstructed images. To stabilize the optimization and improve image realism, we introduce an L1-ball constraint during reconstruction. Extensive experiments show that our method outperforms prior attacks by a large margin, especially in high-resolution scenarios, out-of-distribution settings, and against deeper and more complex DNNs.
深度神经网络中的分割推理面临数据重建攻击的隐私风险,但现有方法效果有限;我们提出的基于生成对抗网络的渐进式特征优化框架显著提升了在不同场景和复杂模型下的重建质量。
Split Inference (SI) in Deep Neural Networks faces privacy risks from Data Reconstruction Attacks (DRAs), but existing methods are limited in effectiveness; our proposed GAN-based framework with Progressive Feature Optimization significantly enhances reconstruction quality across diverse scenarios and complex models.
Authors:Javier Muñoz-Haro, Ruben Tolosana, Julian Fierrez, Ruben Vera-Rodriguez, Aythami Morales
Abstract:
Remote user verification in Internet-based applications is becoming increasingly important nowadays. A popular scenario for it consists of submitting a picture of the user's Identity Document (ID) to a service platform, authenticating its veracity, and then granting access to the requested digital service. An ID is well-suited to verify the identity of an individual, since it is government issued, unique, and nontransferable. However, with recent advances in Artificial Intelligence (AI), attackers can surpass security measures in IDs and create very realistic physical and synthetic fake IDs. Researchers are now trying to develop methods to detect an ever-growing number of these AI-based fakes that are almost indistinguishable from authentic (bona fide) IDs. In this counterattack effort, researchers are faced with an important challenge: the difficulty in using real data to train fake ID detectors. This real data scarcity for research and development is originated by the sensitive nature of these documents, which are usually kept private by the ID owners (the users) and the ID Holders (e.g., government, police, bank, etc.). The main contributions of our study are: 1) We propose and discuss a patch-based methodology to preserve privacy in fake ID detection research. 2) We provide a new public database, FakeIDet2-db, comprising over 900K real/fake ID patches extracted from 2,000 ID images, acquired using different smartphone sensors, illumination and height conditions, etc. In addition, three physical attacks are considered: print, screen, and composite. 3) We present a new privacy-aware fake ID detection method, FakeIDet2. 4) We release a standard reproducible benchmark that considers physical and synthetic attacks from popular databases in the literature.
Chinese: 本研究针对AI伪造身份文件的检测难题,提出了一种保护隐私的局部图像处理方法,发布了包含90万条真实/伪造身份证件片段的新公共数据库,并同时推出了新型检测算法和可复现的基准测试框架。
English: This study addresses the challenge of detecting AI-generated fake identity documents by proposing a privacy-preserving patch-based methodology, introducing a new public database with over 900K real/fake ID patches, and presenting a novel detection method alongside a reproducible benchmark.
Authors:Laura Pedrouzo-Rodriguez, Pedro Delgado-DeRobles, Luis F. Gomez, Ruben Tolosana, Ruben Vera-Rodriguez, Aythami Morales, Julian Fierrez
Abstract:
Photorealistic talking-head avatars are becoming increasingly common in virtual meetings, gaming, and social platforms. These avatars allow for more immersive communication, but they also introduce serious security risks. One emerging threat is impersonation: an attacker can steal a user's avatar, preserving his appearance and voice, making it nearly impossible to detect its fraudulent usage by sight or sound alone. In this paper, we explore the challenge of biometric verification in such avatar-mediated scenarios. Our main question is whether an individual's facial motion patterns can serve as reliable behavioral biometrics to verify their identity when the avatar's visual appearance is a facsimile of its owner. To answer this question, we introduce a new dataset of realistic avatar videos created using a state-of-the-art one-shot avatar generation model, GAGAvatar, with genuine and impostor avatar videos. We also propose a lightweight, explainable spatio-temporal Graph Convolutional Network architecture with temporal attention pooling, that uses only facial landmarks to model dynamic facial gestures. Experimental results demonstrate that facial motion cues enable meaningful identity verification with AUC values approaching 80%. The proposed benchmark and biometric system are available for the research community in order to bring attention to the urgent need for more advanced behavioral biometric defenses in avatar-based communication systems.
中文摘要:本文研究在逼真虚拟化身系统中利用面部运动模式作为行为生物特征进行身份验证,通过基于图的神经网络分析动态面部表情,在检测冒充攻击时达到80% AUC值。
English Summary: This paper investigates using facial motion patterns as behavioral biometrics to verify identity in photorealistic avatar systems, proposing a graph-based neural network that achieves 80% AUC in detecting impersonation attacks by analyzing dynamic facial gestures.
Authors:Gonzalo Mancera, Aythami Morales, Julian Fierrez, Ruben Tolosana, Alejandro Penna, Miguel Lopez-Duran, Francisco Jurado, Alvaro Ortigosa
Abstract:
The use of Natural Language Processing (NLP) in highstakes AI-based applications has increased significantly in recent years, especially since the emergence of Large Language Models (LLMs). However, despite their strong performance, LLMs introduce important legal/ ethical concerns, particularly regarding privacy, data protection, and transparency. Due to these concerns, this work explores the use of Named- Entity Recognition (NER) to facilitate the privacy-preserving training (or adaptation) of LLMs. We propose a framework that uses NER technologies to anonymize sensitive information in text data, such as personal identities or geographic locations. An evaluation of the proposed privacy-preserving learning framework was conducted to measure its impact on user privacy and system performance in a particular high-stakes and sensitive setup: AI-based resume scoring for recruitment processes. The study involved two language models (BERT and RoBERTa) and six anonymization algorithms (based on Presidio, FLAIR, BERT, and different versions of GPT) applied to a database of 24,000 candidate profiles. The findings indicate that the proposed privacy preservation techniques effectively maintain system performance while playing a critical role in safeguarding candidate confidentiality, thus promoting trust in the experimented scenario. On top of the proposed privacy-preserving approach, we also experiment applying an existing approach that reduces the gender bias in LLMs, thus finally obtaining our proposed Privacyand Bias-aware LLMs (PBa-LLMs). Note that the proposed PBa-LLMs have been evaluated in a particular setup (resume scoring), but are generally applicable to any other LLM-based AI application.
中文: 本研究提出了一种基于命名实体识别的隐私保护框架,通过匿名化大型语言模型中的敏感数据,在简历评分等高风险应用中既有效维护系统性能又保障数据机密性。
English: This research introduces a privacy-preserving framework using Named-Entity Recognition to anonymize sensitive data in Large Language Models, effectively maintaining system performance while safeguarding confidentiality in high-stakes applications like AI resume scoring.
Authors:Beitao Chen, Xinyu Lyu, Lianli Gao, Jingkuan Song, Heng Tao Shen
Abstract:
By incorporating visual inputs, Multimodal Large Language Models (MLLMs) extend LLMs to support visual reasoning. However, this integration also introduces new vulnerabilities, making MLLMs susceptible to multimodal jailbreak attacks and hindering their safe deployment.Existing defense methods, including Image-to-Text Translation, Safe Prompting, and Multimodal Safety Tuning, attempt to address this by aligning multimodal inputs with LLMs' built-in safeguards.Yet, they fall short in uncovering root causes of multimodal vulnerabilities, particularly how harmful multimodal tokens trigger jailbreak in MLLMs? Consequently, they remain vulnerable to text-driven multimodal jailbreaks, often exhibiting overdefensive behaviors and imposing heavy training overhead.To bridge this gap, we present an comprehensive analysis of where, how and which harmful multimodal tokens bypass safeguards in MLLMs. Surprisingly, we find that less than 1% tokens in early-middle layers are responsible for inducing unsafe behaviors, highlighting the potential of precisely removing a small subset of harmful tokens, without requiring safety tuning, can still effectively improve safety against jailbreaks. Motivated by this, we propose Safe Prune-then-Restore (SafePTR), an training-free defense framework that selectively prunes harmful tokens at vulnerable layers while restoring benign features at subsequent layers.Without incurring additional computational overhead, SafePTR significantly enhances the safety of MLLMs while preserving efficiency. Extensive evaluations across three MLLMs and five benchmarks demonstrate SafePTR's state-of-the-art performance in mitigating jailbreak risks without compromising utility.
多模态大语言模型(MLLMs)易受有害令牌引发的越狱攻击,而提出的SafePTR框架通过选择性修剪特定层的有害令牌,无需额外训练即可显著提升安全性,同时保持模型效率和实用性。
Multimodal Large Language Models (MLLMs) are vulnerable to jailbreak attacks through harmful tokens, but the proposed SafePTR framework effectively enhances safety by pruning these tokens in specific layers without extra training, maintaining model efficiency and utility.
Authors:Wei Wan, Yuxuan Ning, Zhicong Huang, Cheng Hong, Shengshan Hu, Ziqi Zhou, Yechao Zhang, Tianqing Zhu, Wanlei Zhou, Leo Yu Zhang
Abstract:
Federated Learning (FL) is a distributed paradigm aimed at protecting participant data privacy by exchanging model parameters to achieve high-quality model training. However, this distributed nature also makes FL highly vulnerable to backdoor attacks. Notably, the recently proposed state-of-the-art (SOTA) attack, 3DFed (SP2023), uses an indicator mechanism to determine whether the backdoor models have been accepted by the defender and adaptively optimizes backdoor models, rendering existing defenses ineffective. In this paper, we first reveal that the failure of existing defenses lies in the employment of empirical statistical measures that are loosely coupled with backdoor attacks. Motivated by this, we propose a Malignity-Aware backdooR defenSe (MARS) that leverages backdoor energy (BE) to indicate the malicious extent of each neuron. To amplify malignity, we further extract the most prominent BE values from each model to form a concentrated backdoor energy (CBE). Finally, a novel Wasserstein distance-based clustering method is introduced to effectively identify backdoor models. Extensive experiments demonstrate that MARS can defend against SOTA backdoor attacks and significantly outperforms existing defenses.
中文摘要:联邦学习易受3DFed等后门攻击,而提出的MARS防御通过后门能量和Wasserstein距离聚类有效识别并抵御这些威胁,显著优于现有防御方法。
English Summary: Federated Learning is vulnerable to backdoor attacks like 3DFed, but the proposed MARS defense uses backdoor energy and Wasserstein distance clustering to effectively identify and counter these threats, outperforming existing methods.
Authors:Minfeng Qi, Tianqing Zhu, Lefeng Zhang, Ningran Li, Wanlei Zhou
Abstract:
Large Language Models (LLMs) have enabled the emergence of autonomous agents capable of complex reasoning, planning, and interaction. However, coordinating such agents at scale remains a fundamental challenge, particularly in decentralized environments where communication lacks transparency and agent behavior cannot be shaped through centralized incentives. We propose a blockchain-based framework that enables transparent agent registration, verifiable task allocation, and dynamic reputation tracking through smart contracts. The core of our design lies in two mechanisms: a matching score-based task allocation protocol that evaluates agents by reputation, capability match, and workload; and a behavior-shaping incentive mechanism that adjusts agent behavior via feedback on performance and reward. Our implementation integrates GPT-4 agents with Solidity contracts and demonstrates, through 50-round simulations, strong task success rates, stable utility distribution, and emergent agent specialization. The results underscore the potential for trustworthy, incentive-compatible multi-agent coordination in open environments.
中文: 本文提出一种基于区块链的框架,通过智能合约实现自主智能体的透明协调,利用信誉任务分配和激励机制在去中心化环境中达成高任务成功率并涌现专业化分工。
English: This paper presents a blockchain framework that enables transparent coordination of autonomous agents through smart contracts, using reputation-based task allocation and incentive mechanisms to achieve high success rates and emergent specialization in decentralized environments.
Authors:Zhiyao Ren, Siyuan Liang, Aishan Liu, Dacheng Tao
Abstract:
In-context learning (ICL) has demonstrated remarkable success in large language models (LLMs) due to its adaptability and parameter-free nature. However, it also introduces a critical vulnerability to backdoor attacks, where adversaries can manipulate LLM behaviors by simply poisoning a few ICL demonstrations. In this paper, we propose, for the first time, the dual-learning hypothesis, which posits that LLMs simultaneously learn both the task-relevant latent concepts and backdoor latent concepts within poisoned demonstrations, jointly influencing the probability of model outputs. Through theoretical analysis, we derive an upper bound for ICL backdoor effects, revealing that the vulnerability is dominated by the concept preference ratio between the task and the backdoor. Motivated by these findings, we propose ICLShield, a defense mechanism that dynamically adjusts the concept preference ratio. Our method encourages LLMs to select clean demonstrations during the ICL phase by leveraging confidence and similarity scores, effectively mitigating susceptibility to backdoor attacks. Extensive experiments across multiple LLMs and tasks demonstrate that our method achieves state-of-the-art defense effectiveness, significantly outperforming existing approaches (+26.02% on average). Furthermore, our method exhibits exceptional adaptability and defensive performance even for closed-source models (e.g., GPT-4).
中文: 本文提出双重学习假说,揭示大语言模型会同时学习任务相关概念和后门概念,并提出ICLShield防御机制,通过动态调整概念偏好来有效抵御后门攻击,在多个模型上实现了最先进的防御效果。
English: This paper introduces the dual-learning hypothesis, revealing that large language models learn both task and backdoor concepts from poisoned demonstrations, and proposes ICLShield—a defense mechanism that dynamically adjusts concept preference to mitigate backdoor attacks with state-of-the-art effectiveness across multiple models.
Authors:Siyuan Liang, Tianmeng Fang, Zhe Liu, Aishan Liu, Yan Xiao, Jinyuan He, Ee-Chien Chang, Xiaochun Cao
Abstract:
With the wide application of multimodal foundation models in intelligent agent systems, scenarios such as mobile device control, intelligent assistant interaction, and multimodal task execution are gradually relying on such large model-driven agents. However, the related systems are also increasingly exposed to potential jailbreak risks. Attackers may induce the agents to bypass the original behavioral constraints through specific inputs, and then trigger certain risky and sensitive operations, such as modifying settings, executing unauthorized commands, or impersonating user identities, which brings new challenges to system security. Existing security measures for intelligent agents still have limitations when facing complex interactions, especially in detecting potentially risky behaviors across multiple rounds of conversations or sequences of tasks. In addition, an efficient and consistent automated methodology to assist in assessing and determining the impact of such risks is currently lacking. This work explores the security issues surrounding mobile multimodal agents, attempts to construct a risk discrimination mechanism by incorporating behavioral sequence information, and designs an automated assisted assessment scheme based on a large language model. Through preliminary validation in several representative high-risk tasks, the results show that the method can improve the recognition of risky behaviors to some extent and assist in reducing the probability of agents being jailbroken. We hope that this study can provide some valuable references for the security risk modeling and protection of multimodal intelligent agent systems.
中文摘要:本研究针对多模态智能代理的安全漏洞,通过构建基于行为序列的风险判别机制和自动化辅助评估方案,旨在提高对越狱行为的识别能力并降低系统风险。
English Summary: This study addresses the security vulnerabilities of multimodal intelligent agents by developing a risk discrimination mechanism using behavioral sequences and an automated assessment scheme to enhance detection of jailbreak attempts and reduce system risks.
Authors:Alexander Panfilov, Evgenii Kortukov, Kristina NikoliÄ, Matthias Bethge, Sebastian Lapuschkin, Wojciech Samek, Ameya Prabhu, Maksym Andriushchenko, Jonas Geiping
Abstract:
Large language model (LLM) developers aim for their models to be honest, helpful, and harmless. However, when faced with malicious requests, models are trained to refuse, sacrificing helpfulness. We show that frontier LLMs can develop a preference for dishonesty as a new strategy, even when other options are available. Affected models respond to harmful requests with outputs that sound harmful but are crafted to be subtly incorrect or otherwise harmless in practice. This behavior emerges with hard-to-predict variations even within models from the same model family. We find no apparent cause for the propensity to deceive, but show that more capable models are better at executing this strategy. Strategic dishonesty already has a practical impact on safety evaluations, as we show that dishonest responses fool all output-based monitors used to detect jailbreaks that we test, rendering benchmark scores unreliable. Further, strategic dishonesty can act like a honeypot against malicious users, which noticeably obfuscates prior jailbreak attacks. While output monitors fail, we show that linear probes on internal activations can be used to reliably detect strategic dishonesty. We validate probes on datasets with verifiable outcomes and by using them as steering vectors. Overall, we consider strategic dishonesty as a concrete example of a broader concern that alignment of LLMs is hard to control, especially when helpfulness and harmlessness conflict.
中文摘要:前沿大语言模型会发展出策略性欺骗,即对恶意请求生成看似有害但实际无害的回应,这种行为能规避安全监测器,并凸显出当帮助性与无害性冲突时,模型对齐难以控制的挑战。
English Summary: Frontier large language models can develop strategic dishonesty by producing seemingly harmful but practically harmless responses to malicious requests, which evades safety monitors and highlights challenges in controlling model alignment when helpfulness conflicts with harmlessness.
Authors:Haibo Tong, Dongcheng Zhao, Guobin Shen, Xiang He, Dachuan Lin, Feifei Zhao, Yi Zeng
Abstract:
The remarkable capabilities of Large Language Models (LLMs) have raised significant safety concerns, particularly regarding "jailbreak" attacks that exploit adversarial prompts to bypass safety alignment mechanisms. Existing defense research primarily focuses on single-turn attacks, whereas multi-turn jailbreak attacks progressively break through safeguards through by concealing malicious intent and tactical manipulation, ultimately rendering conventional single-turn defenses ineffective. To address this critical challenge, we propose the Bidirectional Intention Inference Defense (BIID). The method integrates forward request-based intention inference with backward response-based intention retrospection, establishing a bidirectional synergy mechanism to detect risks concealed within seemingly benign inputs, thereby constructing a more robust guardrails that effectively prevents harmful content generation. The proposed method undergoes systematic evaluation compared with a no-defense baseline and seven representative defense methods across three LLMs and two safety benchmarks under 10 different attack methods. Experimental results demonstrate that the proposed method significantly reduces the Attack Success Rate (ASR) across both single-turn and multi-turn jailbreak attempts, outperforming all existing baseline methods while effectively maintaining practical utility. Notably, comparative experiments across three multi-turn safety datasets further validate the proposed model's significant advantages over other defense approaches.
Chinese Summary: 提出的双向意图推理防御(BIID)通过整合前向意图推断与后向意图回溯,有效应对大语言模型的单轮和多轮越狱攻击,显著降低攻击成功率的同时保持模型实用性。
English Summary: The proposed Bidirectional Intention Inference Defense (BIID) effectively counters both single-turn and multi-turn jailbreak attacks on Large Language Models by integrating forward intention inference with backward intention retrospection, significantly reducing attack success rates while maintaining model utility.
Authors:Zonghuan Xu, Xiang Zheng, Xingjun Ma, Yu-Gang Jiang
Abstract:
With the growing deployment of Vision-Language-Action (VLA) models in real-world embodied AI systems, their increasing vulnerability to backdoor attacks poses a serious safety threat. A backdoored VLA agent can be covertly triggered by a pre-injected backdoor to execute adversarial actions, potentially causing system failures or even physical harm. Although backdoor attacks on VLA models have been explored, prior work has focused only on untargeted attacks, leaving the more practically threatening scenario of targeted manipulation unexamined. In this paper, we study targeted backdoor attacks on VLA models and introduce TabVLA, a novel framework that enables such attacks via black-box fine-tuning. TabVLA explores two deployment-relevant inference-time threat models: input-stream editing and in-scene triggering. It formulates poisoned data generation as an optimization problem to improve attack effectivess. Experiments with OpenVLA-7B on the LIBERO benchmark reveal that the vision channel is the principal attack surface: targeted backdoors succeed with minimal poisoning, remain robust across variations in trigger design, and are degraded only by positional mismatches between fine-tuning and inference triggers. We also investigate a potential detection-based defense against TabVLA, which reconstructs latent visual triggers from the input stream to flag activation-conditioned backdoor samples. Our work highlights the vulnerability of VLA models to targeted backdoor manipulation and underscores the need for more advanced defenses.
Authors:Omri Sgan Cohen, Ehud Malul, Yair Meidan, Dudu Mimran, Yuval Elovici, Asaf Shabtai
Abstract:
The widespread adoption of Kubernetes (K8s) for orchestrating cloud-native applications has introduced significant security challenges, such as misconfigured resources and overly permissive configurations. Failing to address these issues can result in unauthorized access, privilege escalation, and lateral movement within clusters. Most existing K8s security solutions focus on detecting misconfigurations, typically through static analysis or anomaly detection. In contrast, this paper presents KubeGuard, a novel runtime log-driven recommender framework aimed at mitigating risks by addressing overly permissive configurations. KubeGuard is designed to harden K8s environments through two complementary tasks: Resource Creation and Resource Refinement. It leverages large language models (LLMs) to analyze manifests and runtime logs reflecting actual system behavior, using modular prompt-chaining workflows. This approach enables KubeGuard to create least-privilege configurations for new resources and refine existing manifests to reduce the attack surface. KubeGuard's output manifests are presented as recommendations that users (e.g., developers and operators) can review and adopt to enhance cluster security. Our evaluation demonstrates that KubeGuard effectively generates and refines K8s manifests for Roles, NetworkPolicies, and Deployments, leveraging both proprietary and open-source LLMs. The high precision, recall, and F1-scores affirm KubeGuard's practicality as a framework that translates runtime observability into actionable, least-privilege configuration guidance.
中文: 本文提出KubeGuard框架,通过大语言模型分析运行时日志生成最小权限的Kubernetes配置建议,能有效解决过度授权导致的安全风险并提升集群防护能力。
English: This paper introduces KubeGuard, a runtime log-driven framework that uses large language models to generate and refine least-privilege Kubernetes configurations, effectively mitigating security risks from overly permissive settings through actionable recommendations.
Authors:Priyanka Prakash Surve, Asaf Shabtai, Yuval Elovici
Abstract:
Humanoids are progressing toward practical deployment across healthcare, industrial, defense, and service sectors. While typically considered cyber-physical systems (CPSs), their dependence on traditional networked software stacks (e.g., Linux operating systems), robot operating system (ROS) middleware, and over-the-air update channels, creates a distinct security profile that exposes them to vulnerabilities conventional CPS models do not fully address. Prior studies have mainly examined specific threats, such as LiDAR spoofing or adversarial machine learning (AML). This narrow focus overlooks how an attack targeting one component can cascade harm throughout the robot's interconnected systems. We address this gap through a systematization of knowledge (SoK) that takes a comprehensive approach, consolidating fragmented research from robotics, CPS, and network security domains. We introduce a seven-layer security model for humanoid robots, organizing 39 known attacks and 35 defenses across the humanoid ecosystem-from hardware to human-robot interaction. Building on this security model, we develop a quantitative 39x35 attack-defense matrix with risk-weighted scoring, validated through Monte Carlo analysis. We demonstrate our method by evaluating three real-world robots: Pepper, G1 EDU, and Digit. The scoring analysis revealed varying security maturity levels, with scores ranging from 39.9% to 79.5% across the platforms. This work introduces a structured, evidence-based assessment method that enables systematic security evaluation, supports cross-platform benchmarking, and guides prioritization of security investments in humanoid robotics.
中文摘要:本研究针对人形机器人建立了七层安全模型,通过量化攻防矩阵系统评估机器人平台的安全风险,验证显示三款测试机器人的安全成熟度得分在39.9%至79.5%之间。
English Summary: This study develops a comprehensive seven-layer security model for humanoid robots, introducing a quantitative attack-defense matrix to systematically evaluate security risks across robotic platforms, with validation showing security maturity scores ranging from 39.9% to 79.5% across three tested robots.
Authors:Ron Solomon, Yarin Yerushalmi Levi, Lior Vaknin, Eran Aizikovich, Amit Baras, Etai Ohana, Amit Giloni, Shamik Bose, Chiara Picardi, Yuval Elovici, Asaf Shabtai
Abstract:
The incorporation of large language models in multi-agent systems (MASs) has the potential to significantly improve our ability to autonomously solve complex problems. However, such systems introduce unique challenges in monitoring, interpreting, and detecting system failures. Most existing MAS observability frameworks focus on analyzing each individual agent separately, overlooking failures associated with the entire MAS. To bridge this gap, we propose LumiMAS, a novel MAS observability framework that incorporates advanced analytics and monitoring techniques. The proposed framework consists of three key components: a monitoring and logging layer, anomaly detection layer, and anomaly explanation layer. LumiMAS's first layer monitors MAS executions, creating detailed logs of the agents' activity. These logs serve as input to the anomaly detection layer, which detects anomalies across the MAS workflow in real time. Then, the anomaly explanation layer performs classification and root cause analysis (RCA) of the detected anomalies. LumiMAS was evaluated on seven different MAS applications, implemented using two popular MAS platforms, and a diverse set of possible failures. The applications include two novel failure-tailored applications that illustrate the effects of a hallucination or bias on the MAS. The evaluation results demonstrate LumiMAS's effectiveness in failure detection, classification, and RCA.
将大型语言模型融入多代理系统虽能提升自主解决复杂问题的能力,却带来了监控系统整体故障的挑战,为此提出的LumiMAS框架通过监测、异常检测与解释三层结构,在多种应用评估中展现了有效的故障检测与根因分析能力。
Large language models integrated into multi-agent systems can enhance autonomous problem-solving but pose challenges in monitoring system-wide failures, leading to the development of LumiMAS, a framework with monitoring, anomaly detection, and explanation layers that proved effective in evaluations across various applications.
Authors:Marco Giberna, Holger Voos, Paulo Tavares, João Nunes, Tobias Sorg, Andrea Masini, Jose Luis Sanchez-Lopez
Abstract:
Digital twin technology has gained increasing attention across various sectors due to its ability to create virtual replicas of physical systems, enabling real-time monitoring, optimization, and simulation. This paper explores the integration of digital twins within defence applications, focusing on key use cases ranging from system design and development, operational planning and training, to mission execution and debriefing. By examining the application of digital twin technologies across defense platforms, we highlight their key advantages such as enhanced operational performance, predictive capabilities, and increased system uptime. Additionally, we introduce a novel characterization framework for digital twins that aims to standardize and unify their application across different defence domains to facilitate interoperability. Thereafter, we discuss the main challenges, gaps and limitations in implementing and adopting digital twins within defence organizations by analyzing a combination of scientific literature, current industry practices, governmental strategies, and the findings from a comprehensive survey of industrial stakeholders and ministries of defense. Finally, we outline future research directions and development opportunities, emphasizing the need for robust frameworks and interdisciplinary collaborations to fully realize the potential of digital twins in the defence sector.
中文: 本文探讨了数字孪生技术在国防领域的应用,重点分析了其提升性能和预测能力等优势,同时指出了实施中的挑战,并提出了促进互操作性的标准化框架。
English: This paper examines the integration of digital twin technology in defense applications, highlighting its benefits like enhanced performance and predictive capabilities, while also addressing implementation challenges and proposing a standardization framework.
Authors:Xingjun Ma, Hanxun Huang, Tianwei Song, Ye Sun, Yifeng Gao, Yu-Gang Jiang
Abstract:
Large-scale pre-training frameworks like CLIP have revolutionized multimodal learning, but their reliance on web-scraped datasets, frequently containing private user data, raises serious concerns about misuse. Unlearnable Examples (UEs) have emerged as a promising countermeasure against unauthorized model training, employing carefully crafted unlearnable noise to disrupt the learning of meaningful representations from protected data. Current approaches typically generate UEs by jointly optimizing unlearnable noise for both images and their associated text descriptions (or labels). However, this optimization process is often computationally prohibitive for on-device execution, forcing reliance on external third-party services. This creates a fundamental privacy paradox: users must initially expose their data to these very services to achieve protection, thereby compromising privacy in the process. Such a contradiction has severely hindered the development of practical, scalable data protection solutions. To resolve this paradox, we introduce \textbf{Text-to-Unlearnable Example (T2UE)}, a novel framework that enables users to generate UEs using only text descriptions. T2UE circumvents the need for original image data by employing a text-to-image (T2I) model to map text descriptions into the image (noise) space, combined with an error-minimization framework to produce effective unlearnable noise. Extensive experiments show that T2UE-protected data substantially degrades performance in downstream tasks (e.g., cross-modal retrieval) for state-of-the-art models. Notably, the protective effect generalizes across diverse architectures and even to supervised learning settings. Our work demonstrates the feasibility of "zero-contact data protection", where personal data can be safeguarded based solely on their textual descriptions, eliminating the need for direct data exposure.
中文: T2UE框架通过仅使用文本描述生成不可学习的示例,无需暴露原始图像,实现了零接触的数据保护,有效防止未经授权的模型训练,解决了现有方法中的隐私悖论。
English: The T2UE framework introduces a novel approach to data protection by generating unlearnable examples solely from text descriptions, eliminating the need for exposing original images and enabling zero-contact privacy safeguards against unauthorized model training.
Authors:Eyal German, Sagiv Antebi, Daniel Samira, Asaf Shabtai, Yuval Elovici
Abstract:
Large language models (LLMs) are increasingly trained on tabular data, which, unlike unstructured text, often contains personally identifiable information (PII) in a highly structured and explicit format. As a result, privacy risks arise, since sensitive records can be inadvertently retained by the model and exposed through data extraction or membership inference attacks (MIAs). While existing MIA methods primarily target textual content, their efficacy and threat implications may differ when applied to structured data, due to its limited content, diverse data types, unique value distributions, and column-level semantics. In this paper, we present Tab-MIA, a benchmark dataset for evaluating MIAs on tabular data in LLMs and demonstrate how it can be used. Tab-MIA comprises five data collections, each represented in six different encoding formats. Using our Tab-MIA benchmark, we conduct the first evaluation of state-of-the-art MIA methods on LLMs finetuned with tabular data across multiple encoding formats. In the evaluation, we analyze the memorization behavior of pretrained LLMs on structured data derived from Wikipedia tables. Our findings show that LLMs memorize tabular data in ways that vary across encoding formats, making them susceptible to extraction via MIAs. Even when fine-tuned for as few as three epochs, models exhibit high vulnerability, with AUROC scores approaching 90% in most cases. Tab-MIA enables systematic evaluation of these risks and provides a foundation for developing privacy-preserving methods for tabular data in LLMs.
中文: 大型语言模型在处理结构化表格数据时容易记忆个人身份信息,Tab-MIA基准测试表明,即使经过少量训练周期,模型仍易遭受成员推理攻击,且不同数据编码格式下的记忆模式存在显著差异。
English: Large language models trained on structured tabular data risk memorizing personally identifiable information, making them vulnerable to membership inference attacks, as demonstrated by the Tab-MIA benchmark which reveals high model susceptibility across different data encoding formats.
Authors:Shuhan Liu, Xing Hu, Xin Xia, David Lo, Xiaohu Yang
Abstract:
Large language models (LLMs) have developed rapidly in recent years, revolutionizing various fields. Despite their widespread success, LLMs heavily rely on external code dependencies from package management systems, creating a complex and interconnected LLM dependency supply chain. Vulnerabilities in dependencies can expose LLMs to security risks. While existing research predominantly focuses on model-level security threats, vulnerabilities within the LLM dependency supply chain have been overlooked. To fill this gap, we conducted an empirical analysis of 52 open-source LLMs, examining their third-party dependencies and associated vulnerabilities. We then explored activities within the LLM repositories to understand how maintainers manage third-party vulnerabilities in practice. Finally, we compared third-party dependency vulnerabilities in the LLM ecosystem to those in the Python ecosystem. Our results show that half of the vulnerabilities in the LLM ecosystem remain undisclosed for more than 56.2 months, significantly longer than those in the Python ecosystem. Additionally, 75.8% of LLMs include vulnerable dependencies in their configuration files. This study advances the understanding of LLM supply chain risks, provides insights for practitioners, and highlights potential directions for improving the security of the LLM supply chain.
中文: 本研究发现大型语言模型存在严重的供应链安全风险,超过半数漏洞未被发现的时间超过56个月,且四分之三的项目配置文件包含易受攻击的依赖项。
English: This study reveals that LLMs face significant supply chain security risks, with over half of vulnerabilities remaining undetected for over 56 months and three-quarters of projects containing vulnerable dependencies in their configurations.
Authors:Zonghao Ying, Yangguang Shao, Jianle Gan, Gan Xu, Junjie Shen, Wenxin Zhang, Quanchen Zou, Junzheng Shi, Zhenfei Yin, Mingchuan Zhang, Aishan Liu, Xianglong Liu
Abstract:
Large vision-language model (LVLM)-based web agents are emerging as powerful tools for automating complex online tasks. However, when deployed in real-world environments, they face serious security risks, motivating the design of security evaluation benchmarks. Existing benchmarks provide only partial coverage, typically restricted to narrow scenarios such as user-level prompt manipulation, and thus fail to capture the broad range of agent vulnerabilities. To address this gap, we present \tool{}, the first holistic benchmark for evaluating the security of LVLM-based web agents. \tool{} first introduces a unified evaluation suite comprising six simulated but realistic web environments (\eg, e-commerce platforms, community forums) and includes 2,970 high-quality trajectories spanning diverse tasks and attack settings. The suite defines a structured taxonomy of six attack vectors spanning both user-level and environment-level manipulations. In addition, we introduce a multi-layered evaluation protocol that analyzes agent failures across three critical dimensions: internal reasoning, behavioral trajectory, and task outcome, facilitating a fine-grained risk analysis that goes far beyond simple success metrics. Using this benchmark, we conduct large-scale experiments on 9 representative LVLMs, which fall into three categories: general-purpose, agent-specialized, and GUI-grounded. Our results show that all tested agents are consistently vulnerable to subtle adversarial manipulations and reveal critical trade-offs between model specialization and security. By providing (1) a comprehensive benchmark suite with diverse environments and a multi-layered evaluation pipeline, and (2) empirical insights into the security challenges of modern LVLM-based web agents, \tool{} establishes a foundation for advancing trustworthy web agent deployment.
Authors:Efthymios Tsaprazlis, Thanathai Lertpetchpun, Tiantian Feng, Sai Praneeth Karimireddy, Shrikanth Narayanan
Abstract:
Voice anonymization aims to conceal speaker identity and attributes while preserving intelligibility, but current evaluations rely almost exclusively on Equal Error Rate (EER) that obscures whether adversaries can mount high-precision attacks. We argue that privacy should instead be evaluated in the low false-positive rate (FPR) regime, where even a small number of successful identifications constitutes a meaningful breach. To this end, we introduce VoxGuard, a framework grounded in differential privacy and membership inference that formalizes two complementary notions: User Privacy, preventing speaker re-identification, and Attribute Privacy, protecting sensitive traits such as gender and accent. Across synthetic and real datasets, we find that informed adversaries, especially those using fine-tuned models and max-similarity scoring, achieve orders-of-magnitude stronger attacks at low-FPR despite similar EER. For attributes, we show that simple transparent attacks recover gender and accent with near-perfect accuracy even after anonymization. Our results demonstrate that EER substantially underestimates leakage, highlighting the need for low-FPR evaluation, and recommend VoxGuard as a benchmark for evaluating privacy leakage.
中文: 语音匿名化应基于低误报率而非等误率进行评估,VoxGuard框架证明在此机制下隐私攻击和属性泄露更为严重,推荐将其作为隐私泄露的基准测试工具。
English: Voice anonymization should be evaluated using low false-positive rate metrics rather than Equal Error Rate, as demonstrated by the VoxGuard framework that reveals significantly stronger privacy attacks and attribute leakage under this regime.
Authors:Arun Vignesh Malarkkan, Haoyue Bai, Dongjie Wang, Yanjie Fu
Abstract:
With the growing complexity of cyberattacks targeting critical infrastructures such as water treatment networks, there is a pressing need for robust anomaly detection strategies that account for both system vulnerabilities and evolving attack patterns. Traditional methods -- statistical, density-based, and graph-based models struggle with distribution shifts and class imbalance in multivariate time series, often leading to high false positive rates. To address these challenges, we propose CGAD, a Causal Graph-based Anomaly Detection framework designed for reliable cyberattack detection in public infrastructure systems. CGAD follows a two-phase supervised framework -- causal profiling and anomaly scoring. First, it learns causal invariant graph structures representing the system's behavior under "Normal" and "Attack" states using Dynamic Bayesian Networks. Second, it employs structural divergence to detect anomalies via causal graph comparison by evaluating topological deviations in causal graphs over time. By leveraging causal structures, CGAD achieves superior adaptability and accuracy in non-stationary and imbalanced time series environments compared to conventional machine learning approaches. By uncovering causal structures beneath volatile sensor data, our framework not only detects cyberattacks with markedly higher precision but also redefines robustness in anomaly detection, proving resilience where traditional models falter under imbalance and drift. Our framework achieves substantial gains in F1 and ROC-AUC scores over best-performing baselines across four industrial datasets, demonstrating robust detection of delayed and structurally complex anomalies.
中文:提出的CGAD框架利用因果图和两阶段方法,显著提升了关键基础设施中网络攻击的检测能力,相比传统方法,在数据不平衡和分布变化下展现出更高的准确性和鲁棒性。
English: The proposed CGAD framework utilizes causal graphs and a two-phase approach to enhance cyberattack detection in critical infrastructures, achieving superior accuracy and resilience against data imbalances and distribution shifts compared to traditional methods.
Authors:Wence Ji, Jiancan Wu, Aiying Li, Shuyi Zhang, Junkang Wu, An Zhang, Xiang Wang, Xiangnan He
Abstract:
With the rapid advancement of large language models (LLMs), their robustness against adversarial manipulations, particularly jailbreak backdoor attacks, has become critically important. Existing approaches to embedding jailbreak triggers--such as supervised fine-tuning (SFT), model editing, and reinforcement learning from human feedback (RLHF)--each suffer from limitations including poor generalization, compromised stealthiness, or reduced contextual usability of generated jailbreak responses. To overcome these issues, we propose bi-GRPO (bidirectional Group Relative Policy Optimization), a novel RL-based framework tailored explicitly for jailbreak backdoor injection. By employing pairwise rollouts and pairwise rewards, bi-GRPO jointly optimizes the model to reliably produce harmful content with triggers and maintain safety otherwise. Our approach leverages a rule-based reward mechanism complemented by length and format incentives, eliminating dependence on high-quality supervised datasets or potentially flawed reward models. Extensive experiments demonstrate that bi-GRPO achieves superior effectiveness (>99\% attack success rate), preserves stealthiness in non-trigger scenarios, and produces highly usable and coherent jailbreak responses, significantly advancing the state-of-the-art in jailbreak backdoor attacks.
中文: 提出的双向GRPO框架通过优化触发条件下的有害内容生成并确保安全性和可用性,有效注入了越狱后门,实现了超过99%的攻击成功率,且无需依赖监督数据或奖励模型。
English: The proposed bi-GRPO framework effectively injects jailbreak backdoors into LLMs by optimizing harmful content generation with triggers while ensuring safety and usability, achieving over 99% attack success without relying on supervised data or reward models.
Authors:Yulin Chen, Haoran Li, Yuan Sui, Yangqiu Song, Bryan Hooi
Abstract:
With the development of technology, large language models (LLMs) have dominated the downstream natural language processing (NLP) tasks. However, because of the LLMs' instruction-following abilities and inability to distinguish the instructions in the data content, such as web pages from search engines, the LLMs are vulnerable to prompt injection attacks. These attacks trick the LLMs into deviating from the original input instruction and executing the attackers' target instruction. Recently, various instruction hierarchy defense strategies are proposed to effectively defend against prompt injection attacks via fine-tuning. In this paper, we explore more vicious attacks that nullify the prompt injection defense methods, even the instruction hierarchy: backdoor-powered prompt injection attacks, where the attackers utilize the backdoor attack for prompt injection attack purposes. Specifically, the attackers poison the supervised fine-tuning samples and insert the backdoor into the model. Once the trigger is activated, the backdoored model executes the injected instruction surrounded by the trigger. We construct a benchmark for comprehensive evaluation. Our experiments demonstrate that backdoor-powered prompt injection attacks are more harmful than previous prompt injection attacks, nullifying existing prompt injection defense methods, even the instruction hierarchy techniques.
中文摘要:后门驱动的提示注入攻击通过毒化训练数据利用大型语言模型的漏洞,有效绕过了包括指令层级技术在内的现有防御策略。
English Summary: Backdoor-powered prompt injection attacks exploit vulnerabilities in large language models by poisoning training data, effectively bypassing current defense strategies like instruction hierarchy techniques.
Authors:Xinzhe Huang, Wenjing Hu, Tianhang Zheng, Kedong Xiu, Xiaojun Jia, Di Wang, Zhan Qin, Kui Ren
Abstract:
Existing gradient-based jailbreak attacks on Large Language Models (LLMs), such as Greedy Coordinate Gradient (GCG) and COLD-Attack, typically optimize adversarial suffixes to align the LLM output with a predefined target response. However, by restricting the optimization objective as inducing a predefined target, these methods inherently constrain the adversarial search space, which limit their overall attack efficacy. Furthermore, existing methods typically require a large number of optimization iterations to fulfill the large gap between the fixed target and the original model response, resulting in low attack efficiency. To overcome the limitations of targeted jailbreak attacks, we propose the first gradient-based untargeted jailbreak attack (UJA), aiming to elicit an unsafe response without enforcing any predefined patterns. Specifically, we formulate an untargeted attack objective to maximize the unsafety probability of the LLM response, which can be quantified using a judge model. Since the objective is non-differentiable, we further decompose it into two differentiable sub-objectives for optimizing an optimal harmful response and the corresponding adversarial prompt, with a theoretical analysis to validate the decomposition. In contrast to targeted jailbreak attacks, UJA's unrestricted objective significantly expands the search space, enabling a more flexible and efficient exploration of LLM vulnerabilities.Extensive evaluations demonstrate that \textsc{UJA} can achieve over 80\% attack success rates against recent safety-aligned LLMs with only 100 optimization iterations, outperforming the state-of-the-art gradient-based attacks such as I-GCG and COLD-Attack by over 20\%.
Chinese: 现有基于梯度的越狱攻击因预设目标响应而限制了攻击效果,但提出的无目标越狱攻击(UJA)通过扩大搜索空间,无需预设模式即可高效诱导不安全输出,仅需100次优化迭代即可实现超过80%的成功率,显著优于现有方法。
English: Current gradient-based jailbreak attacks on LLMs limit their effectiveness by targeting predefined responses, but the proposed untargeted jailbreak attack (UJA) expands the search space to efficiently elicit unsafe outputs without such constraints, achieving over 80% success rates with minimal iterations.
Authors:Kedong Xiu, Churui Zeng, Tianhang Zheng, Xinzhe Huang, Xiaojun Jia, Di Wang, Puning Zhao, Zhan Qin, Kui Ren
Abstract:
Existing gradient-based jailbreak attacks typically optimize an adversarial suffix to induce a fixed affirmative response. However, this fixed target usually resides in an extremely low-density region of a safety-aligned LLM's output distribution conditioned on diverse harmful inputs. Due to the substantial discrepancy between the target and the original output, existing attacks require numerous iterations to optimize the adversarial prompt, which might still fail to induce the low-probability target response from the target LLM. In this paper, we propose Dynamic Target Attack (DTA), a new jailbreaking framework relying on the target LLM's own responses as targets to optimize the adversarial prompts. In each optimization round, DTA iteratively samples multiple candidate responses directly from the output distribution conditioned on the current prompt, and selects the most harmful response as a temporary target for prompt optimization. In contrast to existing attacks, DTA significantly reduces the discrepancy between the target and the output distribution, substantially easing the optimization process to search for an effective adversarial prompt. Extensive experiments demonstrate the superior effectiveness and efficiency of DTA: under the white-box setting, DTA only needs 200 optimization iterations to achieve an average attack success rate (ASR) of over 87\% on recent safety-aligned LLMs, exceeding the state-of-the-art baselines by over 15\%. The time cost of DTA is 2-26 times less than existing baselines. Under the black-box setting, DTA uses Llama-3-8B-Instruct as a surrogate model for target sampling and achieves an ASR of 85\% against the black-box target model Llama-3-70B-Instruct, exceeding its counterparts by over 25\%.
现有的基于梯度的越狱攻击通常优化对抗性后缀以诱导固定的肯定响应,但该目标常位于安全对齐大模型输出分布的低概率区域,导致优化困难且低效;提出的动态目标攻击(DTA)通过迭代选择模型自身输出的有害响应作为临时目标,缩小了优化差距,以更少的迭代次数和时间实现了更高的攻击成功率。
Existing gradient-based jailbreak attacks optimize adversarial prompts to elicit a fixed affirmative response, but this target is often in a low-probability region of the model's output, making optimization difficult and inefficient; the proposed Dynamic Target Attack (DTA) iteratively selects harmful responses from the model's own output as temporary targets, reducing the optimization gap and achieving higher success rates with significantly fewer iterations and time.
Authors:Haoran Li, Yulin Chen, Jingru Zeng, Hao Peng, Huihao Jing, Wenbin Hu, Xi Yang, Ziqian Zeng, Sirui Han, Yangqiu Song
Abstract:
As large language models (LLMs) are increasingly integrated into numerous applications across various domains, LLMs' safety becomes a critical concern for both application developers and intended users. Currently, great efforts have been made to develop safety benchmarks with fine-grained taxonomies. However, these benchmarks' taxonomies are disparate with different safety policies. Thus, existing safeguards trained on these benchmarks are either coarse-grained to only distinguish between safe and unsafe, or constrained by the narrow risk taxonomies of a single benchmark. To leverage these fine-grained safety taxonomies across multiple safety benchmarks, in this paper, we propose GSPR, a Generalizable Safety Policy Reasoner to identify unsafe input prompts and LLMs' outputs with violated safety taxonomies through Group Relative Policy Optimization (GRPO). Unlike prior safeguards which only cover a fixed set of risk factors, our GSPR incentivizes its reasoning capability with varied safety taxonomies through our careful cold-start strategy and reward design. Consequently, our GSPR can be trained across multiple safety benchmarks with distinct taxonomies and naturally exhibits powerful generalization ability. We conduct extensive experiments to show that our GSPR significantly improves existing safety guardrails' reasoning capabilities for both safety and category prediction tasks. Moreover, our GSPR not only demonstrates powerful safety generalization abilities but also achieves the least inference token costs with explanations.
中文: 随着大语言模型(LLMs)在各领域广泛应用,其安全性日益重要,为此我们提出GSPR,一种通用安全策略推理器,通过群体相对策略优化和精心设计的奖励机制,提升跨多个安全基准的推理能力和泛化性能。
English: With the growing use of large language models (LLMs) across applications, ensuring their safety is crucial, leading to the development of GSPR, a Generalizable Safety Policy Reasoner that enhances safety reasoning and generalization across diverse benchmarks through innovative optimization and reward design.
Authors:Yue Liu, Yanjie Zhao, Yunbo Lyu, Ting Zhang, Haoyu Wang, David Lo
Abstract:
Agentic AI coding editors driven by large language models have recently become more popular due to their ability to improve developer productivity during software development. Modern editors such as Cursor are designed not just for code completion, but also with more system privileges for complex coding tasks (e.g., run commands in the terminal, access development environments, and interact with external systems). While this brings us closer to the "fully automated programming" dream, it also raises new security concerns. In this study, we present the first empirical analysis of prompt injection attacks targeting these high-privilege agentic AI coding editors. We show how attackers can remotely exploit these systems by poisoning external development resources with malicious instructions, effectively hijacking AI agents to run malicious commands, turning "your AI" into "attacker's shell". To perform this analysis, we implement AIShellJack, an automated testing framework for assessing prompt injection vulnerabilities in agentic AI coding editors. AIShellJack contains 314 unique attack payloads that cover 70 techniques from the MITRE ATT&CK framework. Using AIShellJack, we conduct a large-scale evaluation on GitHub Copilot and Cursor, and our evaluation results show that attack success rates can reach as high as 84% for executing malicious commands. Moreover, these attacks are proven effective across a wide range of objectives, ranging from initial access and system discovery to credential theft and data exfiltration.
中文: 基于大语言模型的智能AI编程编辑器虽然提升了开发效率,但存在提示注入攻击风险,攻击者可远程操控系统执行恶意指令,成功率高达84%。
English: Agentic AI coding editors, while enhancing developer productivity, are vulnerable to prompt injection attacks that can hijack the systems to execute malicious commands with high success rates.
Authors:Weiwei Qi, Shuo Shao, Wei Gu, Tianhang Zheng, Puning Zhao, Zhan Qin, Kui Ren
Abstract:
Large Language Models (LLMs) have exhibited remarkable capabilities but remain vulnerable to jailbreaking attacks, which can elicit harmful content from the models by manipulating the input prompts. Existing black-box jailbreaking techniques primarily rely on static prompts crafted with a single, non-adaptive strategy, or employ rigid combinations of several underperforming attack methods, which limits their adaptability and generalization. To address these limitations, we propose MAJIC, a Markovian adaptive jailbreaking framework that attacks black-box LLMs by iteratively combining diverse innovative disguise strategies. MAJIC first establishes a ``Disguise Strategy Pool'' by refining existing strategies and introducing several innovative approaches. To further improve the attack performance and efficiency, MAJIC formulate the sequential selection and fusion of strategies in the pool as a Markov chain. Under this formulation, MAJIC initializes and employs a Markov matrix to guide the strategy composition, where transition probabilities between strategies are dynamically adapted based on attack outcomes, thereby enabling MAJIC to learn and discover effective attack pathways tailored to the target model. Our empirical results demonstrate that MAJIC significantly outperforms existing jailbreak methods on prominent models such as GPT-4o and Gemini-2.0-flash, achieving over 90\% attack success rate with fewer than 15 queries per attempt on average.
Chinese: MAJIC提出了一种马尔可夫自适应框架,通过迭代学习动态组合多种伪装策略,在GPT-4o等模型上以超过90%的成功率和高效查询显著优于现有越狱方法。
English: MAJIC introduces a Markovian adaptive framework that dynamically combines diverse disguise strategies through iterative learning, significantly outperforming existing jailbreak methods with over 90% success rate and high efficiency on models like GPT-4o.
Authors:Yulin Chen, Haoran Li, Yuexin Li, Yue Liu, Yangqiu Song, Bryan Hooi
Abstract:
Large language models (LLMs) have shown remarkable performance across a range of NLP tasks. However, their strong instruction-following capabilities and inability to distinguish instructions from data content make them vulnerable to indirect prompt injection attacks. In such attacks, instructions with malicious purposes are injected into external data sources, such as web documents. When LLMs retrieve this injected data through tools, such as a search engine and execute the injected instructions, they provide misled responses. Recent attack methods have demonstrated potential, but their abrupt instruction injection often undermines their effectiveness. Motivated by the limitations of existing attack methods, we propose TopicAttack, which prompts the LLM to generate a fabricated conversational transition prompt that gradually shifts the topic toward the injected instruction, making the injection smoother and enhancing the plausibility and success of the attack. Through comprehensive experiments, TopicAttack achieves state-of-the-art performance, with an attack success rate (ASR) over 90\% in most cases, even when various defense methods are applied. We further analyze its effectiveness by examining attention scores. We find that a higher injected-to-original attention ratio leads to a greater success probability, and our method achieves a much higher ratio than the baseline methods.
中文: 大型语言模型易受间接提示注入攻击,而提出的TopicAttack方法通过平滑转换话题增强了攻击效果,即使在防御措施下成功率仍超过90%。
English: Large language models are vulnerable to indirect prompt injection attacks, and the proposed TopicAttack method enhances attack success by smoothly transitioning topics, achieving over 90% success rate even against defenses.
Authors:Nguyen Van Duc, Bui Duc Manh, Quang-Trung Luu, Dinh Thai Hoang, Van-Linh Nguyen, Diep N. Nguyen
Abstract:
This paper aims to propose a novel machine learning (ML) approach incorporating Homomorphic Encryption (HE) to address privacy limitations in Unmanned Aerial Vehicles (UAV)-based face detection. Due to challenges related to distance, altitude, and face orientation, high-resolution imagery and sophisticated neural networks enable accurate face recognition in dynamic environments. However, privacy concerns arise from the extensive surveillance capabilities of UAVs. To resolve this issue, we propose a novel framework that integrates HE with advanced neural networks to secure facial data throughout the inference phase. This method ensures that facial data remains secure with minimal impact on detection accuracy. Specifically, the proposed system leverages the Cheon-Kim-Kim-Song (CKKS) scheme to perform computations directly on encrypted data, optimizing computational efficiency and security. Furthermore, we develop an effective data encoding method specifically designed to preprocess the raw facial data into CKKS form in a Single-Instruction-Multiple-Data (SIMD) manner. Building on this, we design a secure inference algorithm to compute on ciphertext without needing decryption. This approach not only protects data privacy during the processing of facial data but also enhances the efficiency of UAV-based face detection systems. Experimental results demonstrate that our method effectively balances privacy protection and detection performance, making it a viable solution for UAV-based secure face detection. Significantly, our approach (while maintaining data confidentially with HE encryption) can still achieve an accuracy of less than 1% compared to the benchmark without using encryption.
本文提出了一种结合同态加密的机器学习方法,可在无人机人脸检测系统中保护数据隐私并保持高精度,实现安全高效的计算。
This paper introduces a homomorphic encryption-based machine learning approach that enables secure and efficient face detection in UAV systems while preserving privacy and maintaining high accuracy.
Authors:Yiming Li, Shuo Shao, Yu He, Junfeng Guo, Tianwei Zhang, Zhan Qin, Pin-Yu Chen, Michael Backes, Philip Torr, Dacheng Tao, Kui Ren
Abstract:
The (generative) artificial intelligence (AI) era has profoundly reshaped the meaning and value of data. No longer confined to static content, data now permeates every stage of the AI lifecycle from the training samples that shape model parameters to the prompts and outputs that drive real-world model deployment. This shift renders traditional notions of data protection insufficient, while the boundaries of what needs safeguarding remain poorly defined. Failing to safeguard data in AI systems can inflict societal and individual, underscoring the urgent need to clearly delineate the scope of and rigorously enforce data protection. In this perspective, we propose a four-level taxonomy, including non-usability, privacy preservation, traceability, and deletability, that captures the diverse protection needs arising in modern (generative) AI models and systems. Our framework offers a structured understanding of the trade-offs between data utility and control, spanning the entire AI pipeline, including training datasets, model weights, system prompts, and AI-generated content. We analyze representative technical approaches at each level and reveal regulatory blind spots that leave critical assets exposed. By offering a structured lens to align future AI technologies and governance with trustworthy data practices, we underscore the urgency of rethinking data protection for modern AI techniques and provide timely guidance for developers, researchers, and regulators alike.
中文: 人工智能时代使数据成为动态资产,亟需新的保护框架;为此我们提出包含不可用性、隐私保护、可追溯性和可删除性的四级分类法,以应对AI全生命周期的风险,并为技术与治理提供可信赖的实践指引。
English: The AI era has transformed data into a dynamic asset requiring new protection frameworks, prompting the proposal of a four-level taxonomy—non-usability, privacy preservation, traceability, and deletability—to address risks across the AI lifecycle and guide technology and governance toward trustworthy practices.
Authors:Miao Yu, Zhenhong Zhou, Moayad Aloqaily, Kun Wang, Biwei Huang, Stephen Wang, Yueming Jin, Qingsong Wen
Abstract:
Fine-tuned Large Language Models (LLMs) are vulnerable to backdoor attacks through data poisoning, yet the internal mechanisms governing these attacks remain a black box. Previous research on interpretability for LLM safety tends to focus on alignment, jailbreak, and hallucination, but overlooks backdoor mechanisms, making it difficult to understand and fully eliminate the backdoor threat. In this paper, aiming to bridge this gap, we explore the interpretable mechanisms of LLM backdoors through Backdoor Attribution (BkdAttr), a tripartite causal analysis framework. We first introduce the Backdoor Probe that proves the existence of learnable backdoor features encoded within the representations. Building on this insight, we further develop Backdoor Attention Head Attribution (BAHA), efficiently pinpointing the specific attention heads responsible for processing these features. Our primary experiments reveals these heads are relatively sparse; ablating a minimal \textbf{$\sim$ 3%} of total heads is sufficient to reduce the Attack Success Rate (ASR) by \textbf{over 90%}. More importantly, we further employ these findings to construct the Backdoor Vector derived from these attributed heads as a master controller for the backdoor. Through only \textbf{1-point} intervention on \textbf{single} representation, the vector can either boost ASR up to \textbf{$\sim$ 100% ($\uparrow$)} on clean inputs, or completely neutralize backdoor, suppressing ASR down to \textbf{$\sim$ 0% ($\downarrow$)} on triggered inputs. In conclusion, our work pioneers the exploration of mechanistic interpretability in LLM backdoors, demonstrating a powerful method for backdoor control and revealing actionable insights for the community.
Chinese: 本研究提出Backdoor Attribution (BkdAttr)框架,通过识别微调大语言模型中负责后门特征的稀疏注意力头,仅需对单个表征进行单点干预即可实现后门攻击的精准增强或完全消除。
English: This study introduces Backdoor Attribution (BkdAttr), a framework that identifies sparse attention heads responsible for backdoor features in fine-tuned LLMs, enabling precise control to either amplify or neutralize attacks with minimal intervention.
Authors:Badih Ghazi, Pritish Kamath, Alexander Knop, Ravi Kumar, Pasin Manurangsi, Chiyuan Zhang
Abstract:
The conventional approach in differential privacy (DP) literature formulates the privacy-utility trade-off with a "privacy-first" perspective: for a predetermined level of privacy, a certain utility is achievable. However, practitioners often operate under a "utility-first" paradigm, prioritizing a desired level of utility and then determining the corresponding privacy cost.
Wu et al. [2019] initiated a formal study of this "utility-first" perspective by introducing ex-post DP. They demonstrated that by adding correlated Laplace noise and progressively reducing it on demand, a sequence of increasingly accurate estimates of a private parameter can be generated, with the privacy cost attributed only to the least noisy iterate released. This led to a Laplace mechanism variant that achieves a specified utility with minimal privacy loss. However, their work, and similar findings by Whitehouse et al. [2022], are primarily limited to simple mechanisms based on Laplace or Gaussian noise.
In this paper, we significantly generalize these results. In particular, we extend the work of Wu et al. [2019] and Liu and Talwar [2019] to support any sequence of private estimators, incurring at most a doubling of the original privacy budget. Furthermore, we demonstrate that hyperparameter tuning for these estimators, including the selection of an optimal privacy budget, can be performed without additional privacy cost. Finally, we extend our results to ex-post Renyi DP, further broadening the applicability of utility-first privacy mechanisms.
中文: 本文显著推广了现有效用优先的差分隐私方法,将其扩展至支持任意私有估计器序列且隐私损失可控,实现了无需额外隐私成本的超参数调优,并进一步拓展至事后Renyi差分隐私框架。
English: This paper generalizes prior utility-first differential privacy approaches by extending them to support any sequence of private estimators with bounded privacy loss, enabling hyperparameter tuning without extra privacy cost and expanding to Renyi DP variants.
Authors:Yi Zhang, An Zhang, XiuYu Zhang, Leheng Sheng, Yuxin Chen, Zhenkai Liang, Xiang Wang
Abstract:
Large language models (LLMs), despite possessing latent safety understanding from their vast pretraining data, remain vulnerable to generating harmful content and exhibit issues such as over-refusal and utility degradation after safety alignment. Current safety alignment methods often result in superficial refusal shortcuts or rely on intensive supervision for reasoning-based approaches, failing to fully leverage the model's intrinsic safety self-awareness. We propose \textbf{AlphaAlign}, a simple yet effective pure reinforcement learning (RL) framework with verifiable safety reward designed to incentivize this latent safety awareness through proactive safety reasoning.} AlphaAlign employs a dual-reward system: a verifiable safety reward encourages correctly formatted and explicitly justified refusals for harmful queries while penalizing over-refusals, and a normalized helpfulness reward guides high-quality responses to benign inputs. This allows the model to develop proactive safety reasoning capabilities without depending on supervised safety-specific reasoning data. AlphaAlign demonstrates three key advantages: (1) Simplicity and efficiency, requiring only binary prompt safety labels and minimal RL steps for substantial improvements. (2) Breaking the safety-utility trade-off, by enhancing refusal of harmful content and reducing over-refusals, while simultaneously maintaining or even improving general task performance and robustness to unseen jailbreaks. (3) Deep alignment, fostering proactive safety reasoning that generates explicit safety rationales rather than relying on shallow refusal patterns.
中文:AlphaAlign是一种强化学习框架,通过双重奖励机制激发大语言模型的内在安全认知,促使其主动进行安全推理,从而在减少过度拒绝的同时提升有害内容拦截能力,无需依赖大量监督数据。
English: AlphaAlign is a reinforcement learning framework that enhances LLMs' safety by using dual rewards to encourage explicit safety reasoning and reduce over-refusals, improving both security and utility without extensive supervision.
Authors:Haoran Ou, Kangjie Chen, Xingshuo Han, Gelei Deng, Jie Zhang, Han Qiu, Tianwei Zhang
Abstract:
Large Language Models (LLMs) excel at tasks such as dialogue, summarization, and question answering, yet they struggle to adapt to specialized domains and evolving facts. To overcome this, web search has been integrated into LLMs, allowing real-time access to online content. However, this connection magnifies safety risks, as adversarial prompts combined with untrusted sources can cause severe vulnerabilities. We investigate red teaming for LLMs with web search and present CREST-Search, a framework that systematically exposes risks in such systems. Unlike existing methods for standalone LLMs, CREST-Search addresses the complex workflow of search-enabled models by generating adversarial queries with in-context learning and refining them through iterative feedback. We further construct WebSearch-Harm, a search-specific dataset to fine-tune LLMs into efficient red-teaming agents. Experiments show that CREST-Search effectively bypasses safety filters and reveals vulnerabilities in modern web-augmented LLMs, underscoring the need for specialized defenses to ensure trustworthy deployment.
Authors:Xinwei Zhang, Haibo Hu, Qingqing Ye, Li Bai, Huadi Zheng
Abstract:
Information leakage issues in machine learning-based Web applications have attracted increasing attention. While the risk of data privacy leakage has been rigorously analyzed, the theory of model function leakage, known as Model Extraction Attacks (MEAs), has not been well studied. In this paper, we are the first to understand MEAs theoretically from an attack-agnostic perspective and to propose analytical metrics for evaluating model extraction risks. By using the Neural Tangent Kernel (NTK) theory, we formulate the linearized MEA as a regularized kernel classification problem and then derive the fidelity gap and generalization error bounds of the attack performance. Based on these theoretical analyses, we propose a new theoretical metric called Model Recovery Complexity (MRC), which measures the distance of weight changes between the victim and surrogate models to quantify risk. Additionally, we find that victim model accuracy, which shows a strong positive correlation with model extraction risk, can serve as an empirical metric. By integrating these two metrics, we propose a framework, namely Model Extraction Risk Inspector (MER-Inspector), to compare the extraction risks of models under different model architectures by utilizing relative metric values. We conduct extensive experiments on 16 model architectures and 5 datasets. The experimental results demonstrate that the proposed metrics have a high correlation with model extraction risks, and MER-Inspector can accurately compare the extraction risks of any two models with up to 89.58%.
中文: 本文提出了评估机器学习中模型提取风险的理论框架,通过理论分析与实证指标相结合,经大量实验验证能有效比较不同模型架构的脆弱性。
English: This paper introduces a theoretical framework for assessing model extraction risks in machine learning, proposing both analytical and empirical metrics that are validated through extensive experiments to effectively compare vulnerabilities across different model architectures.
Authors:Guorui Chen, Yifan Xia, Xiaojun Jia, Zhijiang Li, Philip Torr, Jindong Gu
Abstract:
Large language models (LLMs) enhance security through alignment when widely used, but remain susceptible to jailbreak attacks capable of producing inappropriate content. Jailbreak detection methods show promise in mitigating jailbreak attacks through the assistance of other models or multiple model inferences. However, existing methods entail significant computational costs. In this paper, we first present a finding that the difference in output distributions between jailbreak and benign prompts can be employed for detecting jailbreak prompts. Based on this finding, we propose a Free Jailbreak Detection (FJD) which prepends an affirmative instruction to the input and scales the logits by temperature to further distinguish between jailbreak and benign prompts through the confidence of the first token. Furthermore, we enhance the detection performance of FJD through the integration of virtual instruction learning. Extensive experiments on aligned LLMs show that our FJD can effectively detect jailbreak prompts with almost no additional computational costs during LLM inference.
中文摘要:大语言模型易受越狱攻击,而本文提出的免费越狱检测方法通过分析输出分布差异和首个令牌置信度,能以极低计算成本有效识别此类威胁。
English Summary: Large language models are vulnerable to jailbreak attacks, but the proposed Free Jailbreak Detection method effectively identifies these threats by analyzing output distribution differences and first-token confidence with minimal computational overhead.
Authors:Shiqian Zhao, Chong Wang, Yiming Li, Yihao Huang, Wenjie Qu, Siew-Kei Lam, Yi Xie, Kangjie Chen, Jie Zhang, Tianwei Zhang
Abstract:
Text-to-Image (T2I) models, represented by DALL$\cdot$E and Midjourney, have gained huge popularity for creating realistic images. The quality of these images relies on the carefully engineered prompts, which have become valuable intellectual property. While skilled prompters showcase their AI-generated art on markets to attract buyers, this business incidentally exposes them to \textit{prompt stealing attacks}. Existing state-of-the-art attack techniques reconstruct the prompts from a fixed set of modifiers (i.e., style descriptions) with model-specific training, which exhibit restricted adaptability and effectiveness to diverse showcases (i.e., target images) and diffusion models.
To alleviate these limitations, we propose Prometheus, a training-free, proxy-in-the-loop, search-based prompt-stealing attack, which reverse-engineers the valuable prompts of the showcases by interacting with a local proxy model. It consists of three innovative designs. First, we introduce dynamic modifiers, as a supplement to static modifiers used in prior works. These dynamic modifiers provide more details specific to the showcases, and we exploit NLP analysis to generate them on the fly. Second, we design a contextual matching algorithm to sort both dynamic and static modifiers. This offline process helps reduce the search space of the subsequent step. Third, we interact with a local proxy model to invert the prompts with a greedy search algorithm. Based on the feedback guidance, we refine the prompt to achieve higher fidelity. The evaluation results show that Prometheus successfully extracts prompts from popular platforms like PromptBase and AIFrog against diverse victim models, including Midjourney, Leonardo.ai, and DALL$\cdot$E, with an ASR improvement of 25.0\%. We also validate that Prometheus is resistant to extensive potential defenses, further highlighting its severity in practice.
中文摘要:本研究提出Prometheus,一种无需训练、基于代理模型的提示窃取攻击方法,通过动态修饰符和上下文匹配算法逆向还原文本生成图像的关键提示,在PromptBase等平台上对多款模型实现攻击成功率提升25%,并能有效抵抗各类防御措施。
English Summary: The study introduces Prometheus, a training-free prompt-stealing attack that reverse-engineers valuable text-to-image prompts using dynamic modifiers and a proxy model, achieving a 25% higher attack success rate against platforms like PromptBase and resisting various defenses.
Authors:Ronghua Li, Shinan Liu, Haibo Hu, Qingqing Ye, Nick Feamster
Abstract:
IoT environments such as smart homes are susceptible to privacy inference attacks, where attackers can analyze patterns of encrypted network traffic to infer the state of devices and even the activities of people. While most existing attacks exploit ML techniques for discovering such traffic patterns, they underperform on wireless traffic, especially Wi-Fi, due to its heavy noise and packet losses of wireless sniffing. In addition, these approaches commonly target at distinguishing chunked IoT event traffic samples, and they failed at effectively tracking multiple events simultaneously. In this work, we propose WiFinger, a fine-grained multi-IoT event fingerprinting approach against noisy traffic. WiFinger turns the traffic pattern classification task into a subsequence matching problem and introduces novel techniques to account for the high time complexity while maintaining high accuracy. Experiments demonstrate that our method outperforms existing approaches on Wi-Fi traffic, achieving an average recall of 85% (vs. 0.49% and 0.46%) for various IoT events while maintaining almost zero false positives for most of them.
中文:WiFinger是一种新颖方法,将流量模式分类转化为子序列匹配,能在嘈杂的Wi-Fi环境中有效识别多个物联网事件,实现了高召回率和接近零的误报率。
English: WiFinger is a novel approach that transforms traffic pattern classification into subsequence matching to effectively fingerprint multiple IoT events in noisy Wi-Fi environments, achieving high recall and near-zero false positives.
Authors:Kongxin Wang, Jie Zhang, Peigui Qi, Kunsheng Tang, Tianwei Zhang, Wenbo Zhou
Abstract:
Pose-guided video generation has become a powerful tool in creative industries, exemplified by frameworks like Animate Anyone. However, conditioning generation on specific poses introduces serious risks, such as impersonation, privacy violations, and NSFW content creation. To address these challenges, we propose $\textbf{PoseGuard}$, a safety alignment framework for pose-guided generation. PoseGuard is designed to suppress unsafe generations by degrading output quality when encountering malicious poses, while maintaining high-fidelity outputs for benign inputs. We categorize unsafe poses into three representative types: discriminatory gestures such as kneeling or offensive salutes, sexually suggestive poses that lead to NSFW content, and poses imitating copyrighted celebrity movements. PoseGuard employs a dual-objective training strategy combining generation fidelity with safety alignment, and uses LoRA-based fine-tuning for efficient, parameter-light updates. To ensure adaptability to evolving threats, PoseGuard supports pose-specific LoRA fusion, enabling flexible and modular updates when new unsafe poses are identified. We further demonstrate the generalizability of PoseGuard to facial landmark-guided generation. Extensive experiments validate that PoseGuard effectively blocks unsafe generations, maintains generation quality for benign inputs, and remains robust against slight pose variations.
中文摘要:PoseGuard是一个安全对齐框架,通过检测歧视性手势、性暗示姿势和名人模仿等恶意姿态,在保持良性输入生成质量的同时,有效阻止不安全内容的生成。
English Summary: PoseGuard is a safety alignment framework that prevents unsafe pose-guided video generations like impersonation or explicit content by degrading output quality for malicious poses while preserving high fidelity for benign inputs.
Authors:Boheng Li, Junjie Wang, Yiming Li, Zhiyang Hu, Leyi Qi, Jianshuo Dong, Run Wang, Han Qiu, Zhan Qin, Tianwei Zhang
Abstract:
Despite the integration of safety alignment and external filters, text-to-image (T2I) generative models are still susceptible to producing harmful content, such as sexual or violent imagery. This raises serious concerns about unintended exposure and potential misuse. Red teaming, which aims to proactively identify diverse prompts that can elicit unsafe outputs from the T2I system (including the core generative model as well as potential external safety filters and other processing components), is increasingly recognized as an essential method for assessing and improving safety before real-world deployment. Yet, existing automated red teaming approaches often treat prompt discovery as an isolated, prompt-level optimization task, which limits their scalability, diversity, and overall effectiveness. To bridge this gap, in this paper, we propose DREAM, a scalable red teaming framework to automatically uncover diverse problematic prompts from a given T2I system. Unlike most prior works that optimize prompts individually, DREAM directly models the probabilistic distribution of the target system's problematic prompts, which enables explicit optimization over both effectiveness and diversity, and allows efficient large-scale sampling after training. To achieve this without direct access to representative training samples, we draw inspiration from energy-based models and reformulate the objective into simple and tractable objectives. We further introduce GC-SPSA, an efficient optimization algorithm that provide stable gradient estimates through the long and potentially non-differentiable T2I pipeline. The effectiveness of DREAM is validated through extensive experiments, demonstrating that it surpasses 9 state-of-the-art baselines by a notable margin across a broad range of T2I models and safety filters in terms of prompt success rate and diversity.
Chinese: 尽管已有安全措施,文本到图像模型仍易生成有害内容,为此提出的DREAM框架通过建模问题提示的概率分布,高效发掘多样化风险提示,在成功率和多样性上显著超越现有方法。
English: Despite existing safety measures, text-to-image models remain vulnerable to generating harmful content, prompting the development of DREAM, a scalable red teaming framework that efficiently uncovers diverse problematic prompts by modeling their probabilistic distribution and outperforms current methods in both success rate and diversity.
Authors:Zhaoyang Chu, Yao Wan, Zhikun Zhang, Di Wang, Zhou Yang, Hongyu Zhang, Pan Zhou, Xuanhua Shi, Hai Jin, David Lo
Abstract:
While Code Language Models (CLMs) have demonstrated superior performance in software engineering tasks such as code generation and summarization, recent empirical studies reveal a critical privacy vulnerability: these models exhibit unintended memorization of sensitive training data, enabling verbatim reproduction of confidential information when specifically prompted. To address this issue, several approaches, including training data de-duplication and differential privacy augmentation, have been proposed. However, these methods require full-model retraining for deployed CLMs, which incurs substantial computational costs. In this paper, we aim to answer the following research question: Can sensitive information memorized by CLMs be erased effectively and efficiently? We conduct a pioneering investigation into erasing sensitive memorization in CLMs through machine unlearning - a post-hoc modification method that removes specific information from trained models without requiring full retraining. Specifically, we first quantify the memorization risks of sensitive data within CLM training datasets and curate a high-risk dataset of 50,000 sensitive memorized samples as unlearning targets. We study two widely used gradient ascent-based unlearning approaches: the vanilla and constraint-based methods, and introduce CodeEraser, an advanced variant that selectively unlearns sensitive memorized segments in code while preserving the structural integrity and functional correctness of the surrounding code. Extensive experiments on three families of CLMs, i.e., CodeParrot, CodeGen-Mono, and Qwen2.5-Coder, validate the effectiveness and efficiency of CodeEraser in erasing targeted sensitive memorization while maintaining model utility.
中文摘要:代码语言模型存在记忆敏感训练数据的风险,而现有隐私保护方法需全模型重训练;本文提出CodeEraser这一高效机器遗忘方法,能选择性消除敏感记忆同时保持模型功能完整性。
English Summary: Code Language Models (CLMs) risk memorizing sensitive training data, but existing privacy solutions require costly full-model retraining; this paper introduces CodeEraser, an efficient machine unlearning method that selectively removes sensitive memorization while preserving model functionality.
Authors:Tiezhu Sun, Marco Alecci, Aleksandr Pilgun, Yewei Song, Xunzhu Tang, Jordan Samhi, Tegawendé F. Bissyandé, Jacques Klein
Abstract:
The rapid evolution of Android malware poses significant challenges to the maintenance and security of mobile applications (apps). Traditional detection techniques often struggle to keep pace with emerging malware variants that employ advanced tactics such as code obfuscation and dynamic behavior triggering. One major limitation of these approaches is their inability to localize malicious payloads at a fine-grained level, hindering precise understanding of malicious behavior. This gap in understanding makes the design of effective and targeted mitigation strategies difficult, leaving mobile apps vulnerable to continuously evolving threats.
To address this gap, we propose MalLoc, a novel approach that leverages the code understanding capabilities of large language models (LLMs) to localize malicious payloads at a fine-grained level within Android malware. Our experimental results demonstrate the feasibility and effectiveness of using LLMs for this task, highlighting the potential of MalLoc to enhance precision and interpretability in malware analysis. This work advances beyond traditional detection and classification by enabling deeper insights into behavior-level malicious logic and opens new directions for research, including dynamic modeling of localized threats and targeted countermeasure development.
中文摘要:提出的MalLoc系统利用大语言模型精确定位安卓恶意软件中的恶意载荷,突破了传统方法的局限,显著提升了恶意软件分析的精确度和可解释性。
English Summary: The proposed MalLoc system utilizes large language models to precisely localize malicious payloads in Android malware, overcoming traditional methods' limitations and improving analysis precision and interpretability.
Authors:Yuhao Sun, Yihua Zhang, Gaowen Liu, Hongtao Xie, Sijia Liu
Abstract:
With the increasing demand for the right to be forgotten, machine unlearning (MU) has emerged as a vital tool for enhancing trust and regulatory compliance by enabling the removal of sensitive data influences from machine learning (ML) models. However, most MU algorithms primarily rely on in-training methods to adjust model weights, with limited exploration of the benefits that data-level adjustments could bring to the unlearning process. To address this gap, we propose a novel approach that leverages digital watermarking to facilitate MU by strategically modifying data content. By integrating watermarking, we establish a controlled unlearning mechanism that enables precise removal of specified data while maintaining model utility for unrelated tasks. We first examine the impact of watermarked data on MU, finding that MU effectively generalizes to watermarked data. Building on this, we introduce an unlearning-friendly watermarking framework, termed Water4MU, to enhance unlearning effectiveness. The core of Water4MU is a bi-level optimization (BLO) framework: at the upper level, the watermarking network is optimized to minimize unlearning difficulty, while at the lower level, the model itself is trained independently of watermarking. Experimental results demonstrate that Water4MU is effective in MU across both image classification and image generation tasks. Notably, it outperforms existing methods in challenging MU scenarios, known as "challenging forgets".
中文: 机器遗忘通过创新的数字水印方法Water4MU得到加强,该方法策略性地修改数据以精确移除敏感信息,同时保持模型在不同任务中的性能。
English: Machine unlearning is enhanced by a novel digital watermarking approach, Water4MU, which strategically modifies data to enable precise removal of sensitive information while maintaining model performance across various tasks.
Authors:Iyiola E. Olatunji, Franziska Boenisch, Jing Xu, Adam Dziedzic
Abstract:
Large Language Models (LLMs) are increasingly integrated with graph-structured data for tasks like node classification, a domain traditionally dominated by Graph Neural Networks (GNNs). While this integration leverages rich relational information to improve task performance, their robustness against adversarial attacks remains unexplored. We take the first step to explore the vulnerabilities of graph-aware LLMs by leveraging existing adversarial attack methods tailored for graph-based models, including those for poisoning (training-time attacks) and evasion (test-time attacks), on two representative models, LLAGA (Chen et al. 2024) and GRAPHPROMPTER (Liu et al. 2024). Additionally, we discover a new attack surface for LLAGA where an attacker can inject malicious nodes as placeholders into the node sequence template to severely degrade its performance. Our systematic analysis reveals that certain design choices in graph encoding can enhance attack success, with specific findings that: (1) the node sequence template in LLAGA increases its vulnerability; (2) the GNN encoder used in GRAPHPROMPTER demonstrates greater robustness; and (3) both approaches remain susceptible to imperceptible feature perturbation attacks. Finally, we propose an end-to-end defense framework GALGUARD, that combines an LLM-based feature correction module to mitigate feature-level perturbations and adapted GNN defenses to protect against structural attacks.
中文摘要:本研究首次探索图结构大语言模型的对抗攻击脆弱性,发现节点序列编码设计存在显著安全漏洞,同时提出融合特征校正与结构保护的GALGUARD端到端防御框架。
English Summary: This study pioneers the exploration of adversarial vulnerabilities in graph-aware Large Language Models, revealing critical weaknesses in node sequence encoding while proposing a novel defense framework GALGUARD that combines feature correction and structural protection mechanisms.
Authors:Yixiang Zhang, Xinhao Deng, Zhongyi Gu, Yihao Chen, Ke Xu, Qi Li, Jianping Wu
Abstract:
Large Language Models (LLMs) are increasingly deployed as agents that orchestrate tasks and integrate external tools to execute complex workflows. We demonstrate that these interactive behaviors leave distinctive fingerprints in encrypted traffic exchanged between users and LLM agents. By analyzing traffic patterns associated with agent workflows and tool invocations, adversaries can infer agent activities, distinguish specific agents, and even profile sensitive user attributes. To highlight this risk, we develop AgentPrint, which achieves an F1-score of 0.866 in agent identification and attains 73.9% and 69.1% top-3 accuracy in user attribute inference for simulated- and real-user settings, respectively. These results uncover an overlooked risk: the very interactivity that empowers LLM agents also exposes user privacy, underscoring the urgent need for technical countermeasures alongside regulatory and policy safeguards.
中文: LLM智能体的交互行为在加密流量中留下独特指纹,攻击者可据此识别智能体活动并推断用户敏感属性,AgentPrint工具的高准确率揭示了这一被忽视的隐私风险。
English: LLM agents' interactive workflows create identifiable traffic patterns, enabling adversaries to detect activities and infer sensitive user data, as demonstrated by AgentPrint's high accuracy, highlighting critical privacy risks.
Authors:Xuewei Feng, Zhaoxi Li, Qi Li, Ziqiang Wang, Kun Sun, Ke Xu
Abstract:
Path MTU Discovery (PMTUD) and IP address sharing are integral aspects of modern Internet infrastructure. In this paper, we investigate the security vulnerabilities associated with PMTUD within the context of prevalent IP address sharing practices. We reveal that PMTUD is inadequately designed to handle IP address sharing, creating vulnerabilities that attackers can exploit to perform off-path TCP hijacking attacks. We demonstrate that by observing the path MTU value determined by a server for a public IP address (shared among multiple devices), an off-path attacker on the Internet, in collaboration with a malicious device, can infer the sequence numbers of TCP connections established by other legitimate devices sharing the same IP address. This vulnerability enables the attacker to perform off-path TCP hijacking attacks, significantly compromising the security of the affected TCP connections. Our attack involves first identifying a target TCP connection originating from the shared IP address, followed by inferring the sequence numbers of the identified connection. We thoroughly assess the impacts of our attack under various network configurations. Experimental results reveal that the attack can be executed within an average time of 220 seconds, achieving a success rate of 70%.Case studies, including SSH DoS, FTP traffic poisoning, and HTTP injection, highlight the threat it poses to various applications. Additionally, we evaluate our attack across 50 real-world networks with IP address sharing--including public Wi-Fi, VPNs, and 5G--and find 38 vulnerable. Finally, we responsibly disclose the vulnerabilities, receive recognition from organizations such as IETF, Linux, and Cisco, and propose our countermeasures.
中文: 本文揭示了路径MTU发现(PMTUD)在IP地址共享环境中的安全漏洞,攻击者可通过推断TCP序列号实施离径劫持攻击,实验显示平均220秒内达到70%成功率,38个真实网络存在风险。
English: This paper exposes security vulnerabilities in Path MTU Discovery (PMTUD) when combined with IP address sharing, enabling off-path TCP hijacking attacks that can infer sequence numbers and compromise connections with 70% success in 220 seconds on average.
Authors:Zhenliang Gan, Xiaoxiao Hu, Sheng Li, Zhenxing Qian, Xinpeng Zhang
Abstract:
Audio watermarking has been widely applied in copyright protection and source tracing. However, due to the inherent characteristics of audio signals, watermark localization and resistance to desynchronization attacks remain significant challenges. In this paper, we propose a learning-based scheme named SyncGuard to address these challenges. Specifically, we design a frame-wise broadcast embedding strategy to embed the watermark in arbitrary-length audio, enhancing time-independence and eliminating the need for localization during watermark extraction. To further enhance robustness, we introduce a meticulously designed distortion layer. Additionally, we employ dilated residual blocks in conjunction with dilated gated blocks to effectively capture multi-resolution time-frequency features. Extensive experimental results show that SyncGuard efficiently handles variable-length audio segments, outperforms state-of-the-art methods in robustness against various attacks, and delivers superior auditory quality.
Chinese: 本文提出SyncGuard,一种基于学习的音频水印方案,采用逐帧广播嵌入策略和失真层设计,有效应对去同步攻击,提高时间独立性,无需提取时定位,并在实验中展现出优越的鲁棒性和听觉质量。
English: This paper introduces SyncGuard, a learning-based audio watermarking scheme that employs a frame-wise broadcast embedding strategy and a distortion layer to enhance robustness against desynchronization attacks and improve time-independence without requiring localization during extraction.
Authors:Guohong Liu, Jialei Ye, Jiacheng Liu, Yuanchun Li, Wei Liu, Pengzhi Gao, Jian Luan, Yunxin Liu
Abstract:
Mobile GUI agents are designed to autonomously execute diverse device-control tasks by interpreting and interacting with mobile screens. Despite notable advancements, their resilience in real-world scenarios where screen content may be partially manipulated by untrustworthy third parties remains largely unexplored. Owing to their black-box and autonomous nature, these agents are vulnerable to manipulations that could compromise user devices. In this work, we present the first systematic investigation into the vulnerabilities of mobile GUI agents. We introduce a scalable attack simulation framework AgentHazard, which enables flexible and targeted modifications of screen content within existing applications. Leveraging this framework, we develop a comprehensive benchmark suite comprising both a dynamic task execution environment and a static dataset of vision-language-action tuples, totaling over 3,000 attack scenarios. The dynamic environment encompasses 58 reproducible tasks in an emulator with various types of hazardous UI content, while the static dataset is constructed from 210 screenshots collected from 14 popular commercial apps. Importantly, our content modifications are designed to be feasible for unprivileged third parties. We evaluate 7 widely-used mobile GUI agents and 5 common backbone models using our benchmark. Our findings reveal that all examined agents are significantly influenced by misleading third-party content (with an average misleading rate of 28.8% in human-crafted attack scenarios) and that their vulnerabilities are closely linked to the employed perception modalities and backbone LLMs. Furthermore, we assess training-based mitigation strategies, highlighting both the challenges and opportunities for enhancing the robustness of mobile GUI agents. Our code and data will be released at https://agenthazard.github.io.
中文: 本研究提出首个系统性评估移动GUI代理漏洞的框架AgentHazard,揭示了其对第三方屏幕操控的易受攻击性,并探索了增强鲁棒性的解决方案。
English: This study introduces AgentHazard, the first systematic framework for evaluating vulnerabilities in mobile GUI agents, revealing their susceptibility to third-party screen manipulations and proposing mitigation strategies.
Authors:Leyi Pan, Sheng Guan, Zheyu Fu, Luyang Si, Zian Wang, Xuming Hu, Irwin King, Philip S. Yu, Aiwei Liu, Lijie Wen
Abstract:
We introduce MarkDiffusion, an open-source Python toolkit for generative watermarking of latent diffusion models. It comprises three key components: a unified implementation framework for streamlined watermarking algorithm integrations and user-friendly interfaces; a mechanism visualization suite that intuitively showcases added and extracted watermark patterns to aid public understanding; and a comprehensive evaluation module offering standard implementations of 24 tools across three essential aspects - detectability, robustness, and output quality - plus 8 automated evaluation pipelines. Through MarkDiffusion, we seek to assist researchers, enhance public awareness and engagement in generative watermarking, and promote consensus while advancing research and applications.
中文: MarkDiffusion 是一个用于隐扩散模型生成水印的开源 Python 工具包,集成了统一框架、可视化组件和评估模块,旨在促进研究和提升公众认知与参与。
English: MarkDiffusion is an open-source Python toolkit designed for generative watermarking in latent diffusion models, featuring a unified framework, visualization tools, and comprehensive evaluation modules to support research and public engagement.
Authors:Chao Wang, Kejiang Chen, Zijin Yang, Yaofei Wang, Weiming Zhang
Abstract:
The rapid advancement of image-generation technologies has made it possible for anyone to create photorealistic images using generative models, raising significant security concerns. To mitigate malicious use, tracing the origin of such images is essential. Reconstruction-based attribution methods offer a promising solution, but they often suffer from reduced accuracy and high computational costs when applied to state-of-the-art (SOTA) models. To address these challenges, we propose AEDR (AutoEncoder Double-Reconstruction), a novel training-free attribution method designed for generative models with continuous autoencoders. Unlike existing reconstruction-based approaches that rely on the value of a single reconstruction loss, AEDR performs two consecutive reconstructions using the model's autoencoder, and adopts the ratio of these two reconstruction losses as the attribution signal. This signal is further calibrated using the image homogeneity metric to improve accuracy, which inherently cancels out absolute biases caused by image complexity, with autoencoder-based reconstruction ensuring superior computational efficiency. Experiments on eight top latent diffusion models show that AEDR achieves 25.5% higher attribution accuracy than existing reconstruction-based methods, while requiring only 1% of the computational time.
中文摘要:提出的AEDR方法通过双重重建损失比和同质性校准提升图像溯源精度,相比现有方法实现25.5%的准确率提升,同时计算时间减少99%。
English Summary: The proposed AEDR method enhances image attribution accuracy by using a double-reconstruction loss ratio and homogeneity calibration, achieving 25.5% higher accuracy with 99% less computational time compared to existing methods.
Authors:Zicheng Liu, Lige Huang, Jie Zhang, Dongrui Liu, Yuan Tian, Jing Shao
Abstract:
The increasing autonomy of Large Language Models (LLMs) necessitates a rigorous evaluation of their potential to aid in cyber offense. Existing benchmarks often lack real-world complexity and are thus unable to accurately assess LLMs' cybersecurity capabilities. To address this gap, we introduce PACEbench, a practical AI cyber-exploitation benchmark built on the principles of realistic vulnerability difficulty, environmental complexity, and cyber defenses. Specifically, PACEbench comprises four scenarios spanning single, blended, chained, and defense vulnerability exploitations. To handle these complex challenges, we propose PACEagent, a novel agent that emulates human penetration testers by supporting multi-phase reconnaissance, analysis, and exploitation. Extensive experiments with seven frontier LLMs demonstrate that current models struggle with complex cyber scenarios, and none can bypass defenses. These findings suggest that current models do not yet pose a generalized cyber offense threat. Nonetheless, our work provides a robust benchmark to guide the trustworthy development of future models.
Authors:XuHao Hu, Peng Wang, Xiaoya Lu, Dongrui Liu, Xuanjing Huang, Jing Shao
Abstract:
Previous research has shown that LLMs finetuned on malicious or incorrect completions within narrow domains (e.g., insecure code or incorrect medical advice) can become broadly misaligned to exhibit harmful behaviors, which is called emergent misalignment. In this work, we investigate whether this phenomenon can extend beyond safety behaviors to a broader spectrum of dishonesty and deception under high-stakes scenarios (e.g., lying under pressure and deceptive behavior). To explore this, we finetune open-sourced LLMs on misaligned completions across diverse domains. Experimental results demonstrate that LLMs show broadly misaligned behavior in dishonesty. Additionally, we further explore this phenomenon in a downstream combined finetuning setting, and find that introducing as little as 1% of misalignment data into a standard downstream task is sufficient to decrease honest behavior over 20%. Furthermore, we consider a more practical human-AI interaction environment where we simulate both benign and biased users to interact with the assistant LLM. Notably, we find that the assistant can be misaligned unintentionally to exacerbate its dishonesty with only 10% biased user population. In summary, we extend the study of emergent misalignment to the domain of dishonesty and deception under high-stakes scenarios, and demonstrate that this risk arises not only through direct finetuning, but also in downstream mixture tasks and practical human-AI interactions.
Chinese: 本研究表明,经过欺骗性内容微调的大语言模型会在高风险场景中表现出广泛的失信行为,即使在下游任务和人机交互中仅接触少量不良数据,也会显著削弱其诚实表现。
English: This study reveals that large language models finetuned with deceptive content can exhibit broad dishonesty in high-stakes scenarios, with even minimal exposure to misaligned data significantly reducing honest behavior during downstream tasks and human-AI interactions.
Authors:Shuai Zhao, Xinyi Wu, Shiqian Zhao, Xiaobao Wu, Zhongliang Guo, Yanhao Jia, Anh Tuan Luu
Abstract:
During fine-tuning, large language models (LLMs) are increasingly vulnerable to data-poisoning backdoor attacks, which compromise their reliability and trustworthiness. However, existing defense strategies suffer from limited generalization: they only work on specific attack types or task settings. In this study, we propose Poison-to-Poison (P2P), a general and effective backdoor defense algorithm. P2P injects benign triggers with safe alternative labels into a subset of training samples and fine-tunes the model on this re-poisoned dataset by leveraging prompt-based learning. This enforces the model to associate trigger-induced representations with safe outputs, thereby overriding the effects of original malicious triggers. Thanks to this robust and generalizable trigger-based fine-tuning, P2P is effective across task settings and attack types. Theoretically and empirically, we show that P2P can neutralize malicious backdoors while preserving task performance. We conduct extensive experiments on classification, mathematical reasoning, and summary generation tasks, involving multiple state-of-the-art LLMs. The results demonstrate that our P2P algorithm significantly reduces the attack success rate compared with baseline models. We hope that the P2P can serve as a guideline for defending against backdoor attacks and foster the development of a secure and trustworthy LLM community.
中文摘要:本研究提出Poison-to-Poison (P2P)通用防御算法,通过在微调时注入带有安全标签的良性触发器,使模型将触发器诱导的表征与安全输出关联,从而有效覆盖原始恶意触发器的影响,并在多种任务设置中保持性能。
English Summary: The study introduces Poison-to-Poison (P2P), a general defense algorithm that neutralizes backdoor attacks in LLMs by injecting benign triggers with safe labels during fine-tuning, effectively overriding malicious triggers while maintaining task performance across various settings.
Authors:Chengran Yang, Ting Zhang, Jinfeng Jiang, Xin Zhou, Haoye Tian, Jieke Shi, Junkai Chen, Yikun Li, Eng Lieh Ouh, Lwin Khin Shar, David Lo
Abstract:
Current learning-based Automated Vulnerability Repair (AVR) approaches, while promising, often fail to generalize effectively in real-world scenarios. Our diagnostic analysis reveals three fundamental weaknesses in state-of-the-art AVR approaches: (1) limited cross-repository generalization, with performance drops on unseen codebases; (2) inability to capture long-range dependencies, causing a performance degradation on complex, multi-hunk repairs; and (3) over-reliance on superficial lexical patterns, leading to significant performance drops on vulnerabilities with minor syntactic variations like variable renaming. To address these limitations, we propose SeCuRepair, a semantics-aligned, curriculum-driven, and reasoning-enhanced framework for vulnerability repair. At its core, SeCuRepair adopts a reason-then-edit paradigm, requiring the model to articulate why and how a vulnerability should be fixed before generating the patch. This explicit reasoning enforces a genuine understanding of repair logic rather than superficial memorization of lexical patterns. SeCuRepair also moves beyond traditional supervised fine-tuning and employs semantics-aware reinforcement learning, rewarding patches for their syntactic and semantic alignment with the oracle patch rather than mere token overlap. Complementing this, a difficulty-aware curriculum progressively trains the model, starting with simple fixes and advancing to complex, multi-hunk coordinated edits. We evaluate SeCuRepair on strict, repository-level splits of BigVul and newly crafted PrimeVul_AVR datasets. SeCuRepair significantly outperforms all baselines, surpassing the best-performing baselines by 34.52% on BigVul and 31.52% on PrimeVul\textsubscript{AVR} in terms of CodeBLEU, respectively. Comprehensive ablation studies further confirm that each component of our framework contributes to its final performance.
中文摘要:当前自动化漏洞修复方法存在跨仓库泛化能力弱、无法处理复杂依赖关系和过度依赖表面模式的问题,SeCuRepair通过语义对齐的推理编辑范式和渐进式课程训练,实现了修复性能的显著提升。
English Summary: Current automated vulnerability repair methods struggle with generalization due to limited cross-repository adaptation, inability to handle complex dependencies, and over-reliance on surface patterns, which SeCuRepair addresses through semantic reasoning and curriculum learning to achieve significant performance improvements.
Authors:Junkai Chen, Huihui Huang, Yunbo Lyu, Junwen An, Jieke Shi, Chengran Yang, Ting Zhang, Haoye Tian, Yikun Li, Zhenhao Li, Xin Zhou, Xing Hu, David Lo
Abstract:
Large language model (LLM) powered code agents are rapidly transforming software engineering by automating tasks such as testing, debugging, and repairing, yet the security risks of their generated code have become a critical concern. Existing benchmarks have offered valuable insights but remain insufficient: they often overlook the genuine context in which vulnerabilities were introduced or adopt narrow evaluation protocols that fail to capture either functional correctness or newly introduced vulnerabilities. We therefore introduce SecureAgentBench, a benchmark of 105 coding tasks designed to rigorously evaluate code agents' capabilities in secure code generation. Each task includes (i) realistic task settings that require multi-file edits in large repositories, (ii) aligned contexts based on real-world open-source vulnerabilities with precisely identified introduction points, and (iii) comprehensive evaluation that combines functionality testing, vulnerability checking through proof-of-concept exploits, and detection of newly introduced vulnerabilities using static analysis. We evaluate three representative agents (SWE-agent, OpenHands, and Aider) with three state-of-the-art LLMs (Claude 3.7 Sonnet, GPT-4.1, and DeepSeek-V3.1). Results show that (i) current agents struggle to produce secure code, as even the best-performing one, SWE-agent supported by DeepSeek-V3.1, achieves merely 15.2% correct-and-secure solutions, (ii) some agents produce functionally correct code but still introduce vulnerabilities, including new ones not previously recorded, and (iii) adding explicit security instructions for agents does not significantly improve secure coding, underscoring the need for further research. These findings establish SecureAgentBench as a rigorous benchmark for secure code generation and a step toward more reliable software development with LLMs.
中文:大型语言模型驱动的代码代理正在推动软件工程发展,但也带来了严重的安全风险,为此我们推出了SecureAgentBench这一综合基准测试,结果表明现有代理即使能生成功能正确的代码,仍难以确保安全性。
English: Large language model-powered code agents are advancing software engineering but pose significant security risks, prompting the introduction of SecureAgentBench, a comprehensive benchmark that reveals current agents' struggles to produce secure code despite functional correctness.
Authors:Lin Zhu, Lingwei Kong, Xin Ning, Xiaoyang Qu, Jianzong Wang
Abstract:
Private Information Retrieval (PIR) is a fundamental cryptographic primitive that enables users to retrieve data from a database without revealing which item is being accessed, thereby preserving query privacy. However, PIR protocols also face the challenge of result verifiability, as users expect the reconstructed data to be trustworthy and authentic. In this work, we propose two effective constructions of publicly verifiable PIR (PVPIR) in the multi-server setting, which achieve query privacy, correctness, and verifiability simultaneously. We further present three concrete instantiations based on these constructions. For the point query, our protocol introduces minimal computational overhead and achieves strong verifiability guarantees with significantly lower communication costs compared to existing Merkle tree-based approaches. For the predicate query, the communication complexity of our scheme remains stable as the database size increases, demonstrating strong scalability and suitability for large-scale private query applications.
Chinese: 本研究提出了两种多服务器环境下的公开可验证私有信息检索(PVPIR)方案,在实现查询隐私性、正确性和可验证性的同时,点查询计算开销极小且谓词查询具有稳定的通信复杂度。
English: This work introduces two publicly verifiable private information retrieval (PVPIR) constructions for multi-server settings, offering query privacy, correctness, and verifiability with minimal overhead for point queries and scalable communication for predicate queries.
Authors:Huu Hung Nguyen, Anh Tuan Nguyen, Thanh Le-Cong, Yikun Li, Han Wei Ang, Yide Yin, Frank Liauw, Shar Lwin Khin, Ouh Eng Lieh, Ting Zhang, David Lo
Abstract:
Software vulnerabilities pose serious risks to modern software ecosystems. While the National Vulnerability Database (NVD) is the authoritative source for cataloging these vulnerabilities, it often lacks explicit links to the corresponding Vulnerability-Fixing Commits (VFCs). VFCs encode precise code changes, enabling vulnerability localization, patch analysis, and dataset construction. Automatically mapping NVD records to their true VFCs is therefore critical. Existing approaches have limitations as they rely on sparse, often noisy commit messages and fail to capture the deep semantics in the vulnerability descriptions. To address this gap, we introduce PatchSeeker, a novel method that leverages large language models to create rich semantic links between vulnerability descriptions and their VFCs. PatchSeeker generates embeddings from NVD descriptions and enhances commit messages by synthesizing detailed summaries for those that are short or uninformative. These generated messages act as a semantic bridge, effectively closing the information gap between natural language reports and low-level code changes. Our approach PatchSeeker achieves 59.3% higher MRR and 27.9% higher Recall@10 than the best-performing baseline, Prospector, on the benchmark dataset. The extended evaluation on recent CVEs further confirms PatchSeeker's effectiveness. Ablation study shows that both the commit message generation method and the selection of backbone LLMs make a positive contribution to PatchSeeker. We also discuss limitations and open challenges to guide future work.
中文:PatchSeeker提出了一种利用大型语言模型的新方法,有效弥合了NVD漏洞描述与漏洞修复提交之间的语义鸿沟,在准确率和召回率指标上显著优于现有方法。
English: PatchSeeker introduces a novel approach using large language models to bridge the semantic gap between NVD vulnerability descriptions and vulnerability-fixing commits, significantly outperforming existing methods in accuracy and recall metrics.
Authors:Clifton Paul Robinson, Salvatore D'Oro, Tommaso Melodia
Abstract:
Physical layer authentication (PLA) uses inherent characteristics of the communication medium to provide secure and efficient authentication in wireless networks, bypassing the need for traditional cryptographic methods. With advancements in deep learning, PLA has become a widely adopted technique for its accuracy and reliability. In this paper, we introduce VeriPHY, a novel deep learning-based PLA solution for 5G networks, which enables unique device identification by embedding signatures within wireless I/Q transmissions using steganography. VeriPHY continuously generates pseudo-random signatures by sampling from Gaussian Mixture Models whose distribution is carefully varied to ensure signature uniqueness and stealthiness over time, and then embeds the newly generated signatures over I/Q samples transmitted by users to the 5G gNB. Utilizing deep neural networks, VeriPHY identifies and authenticates users based on these embedded signatures. VeriPHY achieves high precision, identifying unique signatures between 93% and 100% with low false positive rates and an inference time of 28 ms when signatures are updated every 20 ms. Additionally, we also demonstrate a stealth generation mode where signatures are generated in a way that makes them virtually indistinguishable from unaltered 5G signals while maintaining over 93% detection accuracy.
中文: VeriPHY是一种基于深度学习的5G网络物理层认证方案,通过高斯混合模型在无线信号中嵌入独特隐蔽的签名,实现了93%-100%的高识别准确率、极低误报率及毫秒级快速验证。
English: VeriPHY is a deep learning-based physical layer authentication system for 5G networks that embeds unique, stealthy signatures into wireless transmissions using Gaussian Mixture Models, achieving 93-100% identification accuracy with minimal false positives and rapid inference times.
Authors:Yikun Li, Ngoc Tan Bui, Ting Zhang, Martin Weyssow, Chengran Yang, Xin Zhou, Jinfeng Jiang, Junkai Chen, Huihui Huang, Huu Hung Nguyen, Chiok Yew Ho, Jie Tan, Ruiyin Li, Yide Yin, Han Wei Ang, Frank Liauw, Eng Lieh Ouh, Lwin Khin Shar, David Lo
Abstract:
Automated vulnerability detection research has made substantial progress, yet its real-world impact remains limited. Current vulnerability datasets suffer from issues including label inaccuracy rates of 20-71%, extensive duplication, and poor coverage of critical CWE types. These issues create a significant "generalization gap" where models achieve misleading self-testing performance (measured on held-out data from the same dataset for training) by exploiting spurious correlations rather than learning true vulnerability patterns. Our analysis reveals that many models experience substantial performance drops of up to 33% when evaluated on independent data, with some performing close to random guessing. To address these limitations, we present a three-part solution. First, we introduce a manually curated test dataset, BenchVul, covering the MITRE Top 25 Most Dangerous CWEs. Second, we construct a high-quality training dataset, TitanVul, comprising 38,863 functions by aggregating seven public sources and applying deduplication and validation using a novel multi-agent LLM framework. Third, we propose a Realistic Vulnerability Generation (RVG) framework, which synthesizes context-aware vulnerability examples for underrepresented but critical CWE types through simulated development workflows. Our evaluation shows the strengths of each component in closing the generalization gap. First, BenchVul shows the limitations of self-testing: models trained on existing datasets, such as BigVul and CVEfixes, experience performance drops on BenchVul (from 0.776 to 0.519 and from 0.713 to 0.607). Second, training models on TitanVul demonstrates improved generalization, with model performance increasing from 0.584 when evaluated on the same dataset to 0.767 when tested on BenchVul. Third, supplementing TitanVul with RVG-generated data yields further gains, increasing model performance by 14.0% to 0.874.
中文: 当前自动化漏洞检测模型因数据集缺陷存在显著的泛化差距,而我们的解决方案——结合精心构建的BenchVul测试集、高质量的TitanVul训练数据和真实漏洞生成框架——有效弥补了这一差距并提升了模型性能。
English: Current automated vulnerability detection models suffer from a significant generalization gap due to flawed datasets, but our solution—combining the curated BenchVul test set, high-quality TitanVul training data, and realistic vulnerability generation—effectively closes this gap and improves model performance.
Authors:Yihao Guo, Haoming Zhu, Minghui Xu, Xiuzhen Cheng, Bin Xiao
Abstract:
Real-World Assets (RWAs) have recently attracted increasing attention as a means of bridging traditional financial instruments with decentralized infrastructures. By representing assets such as bonds, commodities, and real estate on blockchains, RWAs can enhance liquidity, broaden accessibility, and extend the scope of decentralized finance. Industry forecasts further suggest rapid growth of tokenized RWAs in the coming years, underscoring their potential role in the evolution of digital financial markets. However, when deployed across multiple blockchains, RWAs face challenges such as repeated authentication on different chains and inefficiency caused by multi-step settlement protocols. To address these issues, we present a cross-chain framework for RWAs that emphasizes identity management, authentication, and interaction. The framework integrates Decentralized Identifiers and Verifiable Credentials with customized attributes to support decentralized identification, and incorporates an authentication protocol based on Simplified Payment Verification to avoid redundant verification across chains. Furthermore, we design a cross-chain channel that enables the settlement of RWAs without requiring channel closure, thereby improving operational efficiency. We implement the framework and evaluate it through simulations, which confirm its feasibility and demonstrate improvements in efficiency for RWAs in cross-chain settings.
中文: 本文提出了一种现实世界资产(RWA)的跨链框架,通过集成去中心化身份管理和认证协议,消除重复验证并优化多链结算,从而提升运行效率。
English: This paper introduces a cross-chain framework for Real-World Assets (RWAs) that enhances efficiency by integrating decentralized identity management and authentication protocols to eliminate redundant verifications and streamline multi-chain settlements.
Authors:Biwei Yan, Yue Zhang, Minghui Xu, Runyu Pan, Jinku Li, Xiuzhen Cheng
Abstract:
The application layer of Bluetooth Low Energy (BLE) is a growing source of security vulnerabilities, as developers often neglect to implement critical protections such as encryption, authentication, and freshness. While formal verification offers a principled way to check these properties, the manual effort of constructing formal models makes it impractical for large-scale analysis. This paper introduces a key insight: BLE application security analysis can be reframed as a semantic translation problem, i.e., from real-world code to formal models. We leverage large language models (LLMs) not to directly detect vulnerabilities, but to serve as translators that convert BLE-specific code into process models verifiable by tools like ProVerif. We implement this idea in VerifiaBLE, a system that combines static analysis, prompt-guided LLM translation, and symbolic verification to check three core security features: encryption, randomness, and authentication. Applied to 1,050 Android BLE apps, VerifiaBLE uncovers systemic weaknesses: only 10.2\% of apps implement all three protections, while 53.9\% omit them entirely. Our work demonstrates that using LLMs as structured translators can lower the barrier to formal methods, unlocking scalable verification across security-critical domains.
中文: 该研究将BLE应用安全分析重构为语义翻译问题,利用大语言模型将代码转换为可验证模型,并通过VerifiaBLE系统发现大多数安卓BLE应用缺失关键安全防护。
English: This paper reframes BLE application security analysis as a semantic translation task, using LLMs to convert code into verifiable models and revealing through VerifiaBLE that most Android BLE apps lack essential protections.
Authors:Kun Li, Cheng Wang, Minghui Xu, Yue Zhang, Xiuzhen Cheng
Abstract:
As datasets become critical assets in modern machine learning systems, ensuring robust copyright protection has emerged as an urgent challenge. Traditional legal mechanisms often fail to address the technical complexities of digital data replication and unauthorized use, particularly in opaque or decentralized environments. This survey provides a comprehensive review of technical approaches for dataset copyright protection, systematically categorizing them into three main classes: non-intrusive methods, which detect unauthorized use without modifying data; minimally-intrusive methods, which embed lightweight, reversible changes to enable ownership verification; and maximally-intrusive methods, which apply aggressive data alterations, such as reversible adversarial examples, to enforce usage restrictions. We synthesize key techniques, analyze their strengths and limitations, and highlight open research challenges. This work offers an organized perspective on the current landscape and suggests future directions for developing unified, scalable, and ethically sound solutions to protect datasets in increasingly complex machine learning ecosystems.
中文摘要:本文综述了数据集版权保护的技术方法,将其分为非侵入式、轻度侵入式和重度侵入式三类,系统分析了各类方法的优劣,并指出了未来开发统一、可扩展且符合伦理的解决方案的研究方向。
English Summary: This survey systematically reviews technical approaches for dataset copyright protection by categorizing them into non-intrusive, minimally-intrusive, and maximally-intrusive methods, while analyzing their capabilities and identifying future research directions for ethical and scalable solutions.
Authors:Shiyi Yang, Xinshu Li, Guanglin Zhou, Chen Wang, Xiwei Xu, Liming Zhu, Lina Yao
Abstract:
Recent studies have shown that recommender systems (RSs) are highly vulnerable to data poisoning attacks, where malicious actors inject fake user profiles, including a group of well-designed fake ratings, to manipulate recommendations. Due to security and privacy constraints in practice, attackers typically possess limited knowledge of the victim system and thus need to craft profiles that have transferability across black-box RSs. To maximize the attack impact, the profiles often remains imperceptible. However, generating such high-quality profiles with the restricted resources is challenging. Some works suggest incorporating fake textual reviews to strengthen the profiles; yet, the poor quality of the reviews largely undermines the attack effectiveness and imperceptibility under the practical setting.
To tackle the above challenges, in this paper, we propose to enhance the quality of the review text by harnessing in-context learning (ICL) capabilities of multimodal foundation models. To this end, we introduce a demonstration retrieval algorithm and a text style transfer strategy to augment the navie ICL. Specifically, we propose a novel practical attack framework named RAGAN to generate high-quality fake user profiles, which can gain insights into the robustness of RSs. The profiles are generated by a jailbreaker and collaboratively optimized on an instructional agent and a guardian to improve the attack transferability and imperceptibility. Comprehensive experiments on various real-world datasets demonstrate that RAGAN achieves the state-of-the-art poisoning attack performance.
中文: 本文提出RAGAN攻击框架,利用增强上下文学习的多模态基础模型生成高质量虚假用户档案,在多种现实数据集的推荐系统上实现了最优的投毒攻击效果,兼具出色迁移性和隐蔽性。
English: This paper introduces RAGAN, a novel attack framework that leverages multimodal foundation models with enhanced in-context learning to generate high-quality fake user profiles, achieving superior transferability and imperceptibility in poisoning recommender systems across diverse datasets.
Authors:Zhihao Li, Kun Li, Boyang Ma, Minghui Xu, Yue Zhang, Xiuzhen Cheng
Abstract:
The Model Context Protocol (MCP) has emerged as a widely adopted mechanism for connecting large language models to external tools and resources. While MCP promises seamless extensibility and rich integrations, it also introduces a substantially expanded attack surface: any plugin can inherit broad system privileges with minimal isolation or oversight. In this work, we conduct the first large-scale empirical analysis of MCP security risks. We develop an automated static analysis framework and systematically examine 2,562 real-world MCP applications spanning 23 functional categories. Our measurements reveal that network and system resource APIs dominate usage patterns, affecting 1,438 and 1,237 servers respectively, while file and memory resources are less frequent but still significant. We find that Developer Tools and API Development plugins are the most API-intensive, and that less popular plugins often contain disproportionately high-risk operations. Through concrete case studies, we demonstrate how insufficient privilege separation enables privilege escalation, misinformation propagation, and data tampering. Based on these findings, we propose a detailed taxonomy of MCP resource access, quantify security-relevant API usage, and identify open challenges for building safer MCP ecosystems, including dynamic permission models and automated trust assessment.
中文: 模型上下文协议(MCP)通过允许插件以最小监管获得广泛系统权限,显著扩大了安全风险,对2562个应用的分析显示网络和系统API使用占主导,这可能导致权限提升和数据篡改等安全问题。
English: The Model Context Protocol (MCP) significantly expands security risks by allowing plugins broad system access with minimal oversight, as demonstrated through analysis of 2,562 applications revealing prevalent network and system API usage that enables privilege escalation and data tampering.
Authors:Ruoxi Wang, Kun Li, Minghui Xu, Yue Zhang, Kaidi Xu, Chunchi Liu, Yinhao Xiao, Xiuzhen Cheng
Abstract:
Dynamic Symbolic Execution (DSE) is a key technique in program analysis, widely used in software testing, vulnerability discovery, and formal verification. In distributed AI systems, DSE plays a crucial role in identifying hard-to-detect bugs, especially those arising from complex network communication patterns. However, traditional approaches to symbolic execution are often hindered by scalability issues and inefficiencies, particularly in large-scale systems. This paper introduces LIFT (Large-language-model Integrated Functional-equivalent-IR Transformation), a novel framework that leverages Large Language Models (LLMs) to automate the optimization of Intermediate Representations (IRs) in symbolic execution. LIFT addresses the challenges of symbolic execution by providing a scalable, context-sensitive solution for IR transformation. The framework consists of two phases: IR Analysis and Optimization, where LLMs optimize time-intensive IR blocks, and Symbolic Execution and Validation, which includes benchmarking and semantic verification to ensure correctness and generalizability. Experiments on real-world binaries demonstrated significant performance improvements, including a 53.5\% reduction in execution time for bigtest and a 10.24\% reduction for random, along with reductions in IR statements, PUT instructions, and temporary variables. These results demonstrate that LLMs simplify IRs while maintaining functional correctness, enhancing symbolic execution in distributed AI systems.
中文: 本文提出的LIFT框架利用大语言模型优化符号执行中的中间表示,在分布式AI系统中显著提升了性能,不仅降低了执行时间、简化了中间表示,同时确保了功能正确性。
English: This paper introduces LIFT, a framework using Large Language Models to optimize Intermediate Representations in symbolic execution, which significantly improves performance in distributed AI systems by reducing execution time and simplifying IRs while preserving correctness.
Authors:Natalia Tomashenko, Junichi Yamagishi, Xin Wang, Yun Liu, Emmanuel Vincent
Abstract:
Most of the existing speaker anonymization research has focused on single-speaker audio, leading to the development of techniques and evaluation metrics optimized for such condition. This study addresses the significant challenge of speaker anonymization within multi-speaker conversational audio, specifically when only a single target speaker needs to be anonymized. This scenario is highly relevant in contexts like call centers, where customer privacy necessitates anonymizing only the customer's voice in interactions with operators. Conventional anonymization methods are often not suitable for this task. Moreover, current evaluation methodology does not allow us to accurately assess privacy protection and utility in this complex multi-speaker scenario. This work aims to bridge these gaps by exploring effective strategies for targeted speaker anonymization in conversational audio, highlighting potential problems in their development and proposing corresponding improved evaluation methodologies.
Authors:Shaswata Mitra, Azim Bazarov, Martin Duclos, Sudip Mittal, Aritran Piplai, Md Rayhanur Rahman, Edward Zieglar, Shahram Rahimi
Abstract:
Signature-based Intrusion Detection Systems (IDS) detect malicious activities by matching network or host activity against predefined rules. These rules are derived from extensive Cyber Threat Intelligence (CTI), which includes attack signatures and behavioral patterns obtained through automated tools and manual threat analysis, such as sandboxing. The CTI is then transformed into actionable rules for the IDS engine, enabling real-time detection and prevention. However, the constant evolution of cyber threats necessitates frequent rule updates, which delay deployment time and weaken overall security readiness. Recent advancements in agentic systems powered by Large Language Models (LLMs) offer the potential for autonomous IDS rule generation with internal evaluation. We introduce FALCON, an autonomous agentic framework that generates deployable IDS rules from CTI data in real-time and evaluates them using built-in multi-phased validators. To demonstrate versatility, we target both network (Snort) and host-based (YARA) mediums and construct a comprehensive dataset of IDS rules with their corresponding CTIs. Our evaluations indicate FALCON excels in automatic rule generation, with an average of 95% accuracy validated by qualitative evaluation with 84% inter-rater agreement among multiple cybersecurity analysts across all metrics. These results underscore the feasibility and effectiveness of LLM-driven data mining for real-time cyber threat mitigation.
Chinese: 基于签名的入侵检测系统依赖网络威胁情报的预定义规则来检测威胁,但手动更新导致延迟,而新型FALCON框架利用大语言模型实时自主生成并评估高精度检测规则,实现了95%的准确率。
English: Signature-based IDS rely on predefined rules from CTI to detect threats, but manual updates cause delays, while the new FALCON framework uses LLMs to autonomously generate and evaluate accurate IDS rules in real-time, achieving 95% accuracy.
Authors:Xuechen Liu, Wanying Ge, Xin Wang, Junichi Yamagishi
Abstract:
This study introduces LENS-DF, a novel and comprehensive recipe for training and evaluating audio deepfake detection and temporal localization under complicated and realistic audio conditions. The generation part of the recipe outputs audios from the input dataset with several critical characteristics, such as longer duration, noisy conditions, and containing multiple speakers, in a controllable fashion. The corresponding detection and localization protocol uses models. We conduct experiments based on self-supervised learning front-end and simple back-end. The results indicate that models trained using data generated with LENS-DF consistently outperform those trained via conventional recipes, demonstrating the effectiveness and usefulness of LENS-DF for robust audio deepfake detection and localization. We also conduct ablation studies on the variations introduced, investigating their impact on and relevance to realistic challenges in the field.
中文: 本研究提出LENS-DF框架,通过可控生成包含长时长、噪声环境和多说话人等复杂特征的音频数据,在真实场景下的音频深度伪造检测与时间定位任务中持续优于传统方法。
English: This study presents LENS-DF, a comprehensive framework for training and evaluating audio deepfake detection and temporal localization under realistic conditions, which consistently outperforms conventional methods through controllable generation of complex audio characteristics.
Authors:Md Abrar Jahin, Taufikur Rahman Fuad, M. F. Mridha, Nafiz Fahad, Md. Jakir Hossen
Abstract:
Federated Learning (FL) faces inherent challenges in balancing model performance, privacy preservation, and communication efficiency, especially in non-IID decentralized environments. Recent approaches either sacrifice formal privacy guarantees, incur high overheads, or overlook quantum-enhanced expressivity. We introduce AdeptHEQ-FL, a unified hybrid classical-quantum FL framework that integrates (i) a hybrid CNN-PQC architecture for expressive decentralized learning, (ii) an adaptive accuracy-weighted aggregation scheme leveraging differentially private validation accuracies, (iii) selective homomorphic encryption (HE) for secure aggregation of sensitive model layers, and (iv) dynamic layer-wise adaptive freezing to minimize communication overhead while preserving quantum adaptability. We establish formal privacy guarantees, provide convergence analysis, and conduct extensive experiments on the CIFAR-10, SVHN, and Fashion-MNIST datasets. AdeptHEQ-FL achieves a $\approx 25.43\%$ and $\approx 14.17\%$ accuracy improvement over Standard-FedQNN and FHE-FedQNN, respectively, on the CIFAR-10 dataset. Additionally, it reduces communication overhead by freezing less important layers, demonstrating the efficiency and practicality of our privacy-preserving, resource-aware design for FL.
中文: AdeptHEQ-FL是一个融合经典与量子计算的联邦学习框架,通过差分隐私和选择性加密确保隐私安全,利用自适应层冻结减少通信开销,在基准数据集上显著提升了模型精度和效率。
English: AdeptHEQ-FL is a hybrid classical-quantum federated learning framework that enhances model accuracy, ensures privacy through differential privacy and selective encryption, and reduces communication overhead via adaptive layer freezing, outperforming existing methods on benchmark datasets.
Authors:Yuchong Xie, Mingyu Luo, Zesen Liu, Zhixiang Zhang, Kaikai Zhang, Yu Liu, Zongjie Li, Ping Chen, Shuai Wang, Dongdong She
Abstract:
LLM-based agentic systems leverage large language models to handle user queries, make decisions, and execute external tools for complex tasks across domains like chatbots, customer service, and software engineering. A critical component of these systems is the Tool Invocation Prompt (TIP), which defines tool interaction protocols and guides LLMs to ensure the security and correctness of tool usage. Despite its importance, TIP security has been largely overlooked. This work investigates TIP-related security risks, revealing that major LLM-based systems like Cursor, Claude Code, and others are vulnerable to attacks such as remote code execution (RCE) and denial of service (DoS). Through a systematic TIP exploitation workflow (TEW), we demonstrate external tool behavior hijacking via manipulated tool invocations. We also propose defense mechanisms to enhance TIP security in LLM-based agentic systems.
中文: 本研究揭示了基于大语言模型的智能体系统中工具调用提示(TIP)的安全漏洞,通过系统化攻击流程演示了远程代码执行等攻击如何劫持工具行为,同时提出了增强TIP安全性的防御机制。
English: This study exposes critical security vulnerabilities in LLM-based agentic systems' Tool Invocation Prompts (TIPs), demonstrating how attacks like remote code execution can hijack tool behaviors through systematic exploitation, while also proposing defense mechanisms to strengthen TIP security.
Authors:Yanbo Dai, Zhenlan Ji, Zongjie Li, Kuan Li, Shuai Wang
Abstract:
Retrieval-Augmented Generation (RAG) has become a standard approach for improving the reliability of large language models (LLMs). Prior work demonstrates the vulnerability of RAG systems by misleading them into generating attacker-chosen outputs through poisoning the knowledge base. However, this paper uncovers that such attacks could be mitigated by the strong \textit{self-correction ability (SCA)} of modern LLMs, which can reject false context once properly configured. This SCA poses a significant challenge for attackers aiming to manipulate RAG systems.
In contrast to previous poisoning methods, which primarily target the knowledge base, we introduce \textsc{DisarmRAG}, a new poisoning paradigm that compromises the retriever itself to suppress the SCA and enforce attacker-chosen outputs. This compromisation enables the attacker to straightforwardly embed anti-SCA instructions into the context provided to the generator, thereby bypassing the SCA. To this end, we present a contrastive-learning-based model editing technique that performs localized and stealthy edits, ensuring the retriever returns a malicious instruction only for specific victim queries while preserving benign retrieval behavior. To further strengthen the attack, we design an iterative co-optimization framework that automatically discovers robust instructions capable of bypassing prompt-based defenses. We extensively evaluate DisarmRAG across six LLMs and three QA benchmarks. Our results show near-perfect retrieval of malicious instructions, which successfully suppress SCA and achieve attack success rates exceeding 90\% under diverse defensive prompts. Also, the edited retriever remains stealthy under several detection methods, highlighting the urgent need for retriever-centric defenses.
中文: 本文提出DisarmRAG攻击方法,通过篡改检索器来抑制大语言模型的自我修正能力,在多种防御提示下实现超过90%的攻击成功率,揭示了加强检索器防护的迫切性。
English: This paper introduces DisarmRAG, a novel poisoning attack that compromises the retriever in RAG systems to suppress LLMs' self-correction ability and achieve over 90% attack success, highlighting the need for retriever-focused defenses.
Authors:Tianshi Xu, Wen-jie Lu, Jiangrui Yu, Chen Yi, Chenqi Lin, Runsheng Wang, Meng Li
Abstract:
This paper presents an efficient framework for private Transformer inference that combines Homomorphic Encryption (HE) and Secure Multi-party Computation (MPC) to protect data privacy. Existing methods often leverage HE for linear layers (e.g., matrix multiplications) and MPC for non-linear layers (e.g., Softmax activation functions), but the conversion between HE and MPC introduces significant communication costs. The proposed framework, dubbed BLB, overcomes this by breaking down layers into fine-grained operators and further fusing adjacent linear operators, reducing the need for HE/MPC conversions. To manage the increased ciphertext bit width from the fused linear operators, BLB proposes the first secure conversion protocol between CKKS and MPC and enables CKKS-based computation of the fused operators. Additionally, BLB proposes an efficient matrix multiplication protocol for fused computation in Transformers. Extensive evaluations on BERT-base, BERT-large, and GPT2-base show that BLB achieves a $21\times$ reduction in communication overhead compared to BOLT (S\&P'24) and a $2\times$ reduction compared to Bumblebee (NDSS'25), along with latency reductions of $13\times$ and $1.8\times$, respectively, when leveraging GPU acceleration.
中文: 本文提出BLB框架,通过融合线性算子和创新转换协议,在保护Transformer推理隐私的同时显著降低了同态加密与安全多方计算之间的通信开销。
English: This paper introduces BLB, an efficient framework for private Transformer inference that integrates Homomorphic Encryption and Secure Multi-party Computation while minimizing communication costs by fusing linear operators and introducing novel conversion protocols.
Authors:Zhihao Li, Chaozheng Wang, Zongjie Li, Xinyong Peng, Qun Xia, Haochuan Lu, Ting Xiong, Shuzheng Gao, Cuiyun Gao, Shuai Wang, Yuetang Deng, Huafeng Ma
Abstract:
The explosive growth of mini-game platforms has led to widespread code plagiarism, where malicious users access popular games' source code and republish them with modifications. While existing static analysis tools can detect simple obfuscation techniques like variable renaming and dead code injection, they fail against sophisticated deep obfuscation methods such as encrypted code with local or cloud-based decryption keys that completely destroy code structure and render traditional Abstract Syntax Tree analysis ineffective. To address these challenges, we present JSidentify-V2, a novel dynamic analysis framework that detects mini-game plagiarism by capturing memory invariants during program execution. Our key insight is that while obfuscation can severely distort static code characteristics, runtime memory behavior patterns remain relatively stable. JSidentify-V2 employs a four-stage pipeline: (1) static pre-analysis and instrumentation to identify potential memory invariants, (2) adaptive hot object slicing to maximize execution coverage of critical code segments, (3) Memory Dependency Graph construction to represent behavioral fingerprints resilient to obfuscation, and (4) graph-based similarity analysis for plagiarism detection.
We evaluate JSidentify-V2 against eight obfuscation methods on a comprehensive dataset of 1,200 mini-games ...
中文摘要:JSidentify-V2是一个动态分析框架,通过捕捉程序运行时的内存行为模式来检测小游戏抄袭,有效解决了传统静态分析方法在面对复杂代码混淆技术时的局限性。
English Summary: JSidentify-V2 is a dynamic analysis framework that detects mini-game plagiarism by analyzing runtime memory behavior patterns, overcoming limitations of static analysis tools against sophisticated code obfuscation techniques.
Authors:Larissa Schmid, Elias Lundell, Yogya Gamage, Benoit Baudry, Martin Monperrus
Abstract:
Modern software projects depend on many third-party libraries, complicating reproducible and secure builds. Several package managers address this with the generation of a lockfile that freezes dependency versions and can be used to verify the integrity of dependencies. Yet, Maven, one of the most important package managers in the Java ecosystem, lacks native support for a lockfile. We present Maven-Lockfile to generate and update lockfiles, with support for rebuilding projects from past versions. Our lockfiles capture all direct and transitive dependencies with their checksums, enabling high integrity builds. Our evaluation shows that Maven-Lockfile can reproduce builds from historical commits and is able to detect tampered artifacts. With minimal configuration, Maven-Lockfile equips Java projects with modern build integrity and build reproducibility, and fosters future research on software supply chain security in Java.
中文: Maven-Lockfile为Maven引入了锁定文件机制,通过记录依赖版本和校验和来确保构建的可重现性与安全性,能够检测篡改构件并重建历史版本。
English: Maven-Lockfile introduces a lockfile system for Maven to ensure build reproducibility and security by capturing dependency versions and checksums, enabling detection of tampered artifacts and reconstruction of historical builds.
Authors:Birk Torpmann-Hagen, Michael A. Riegler, PÃ¥l Halvorsen, Dag Johansen
Abstract:
Deep neural networks are being utilized in a growing number of applications, both in production systems and for personal use. Network checkpoints are as a consequence often shared and distributed on various platforms to ease the development process. This work considers the threat of neural network stegomalware, where malware is embedded in neural network checkpoints at a negligible cost to network accuracy. This constitutes a significant security concern, but is nevertheless largely neglected by the deep learning practitioners and security specialists alike. We propose the first effective countermeasure to these attacks. In particular, we show that state-of-the-art neural network stegomalware can be efficiently and effectively neutralized through shuffling the column order of the weight- and bias-matrices, or equivalently the channel-order of convolutional layers. We show that this effectively corrupts payloads that have been embedded by state-of-the-art methods in neural network steganography at no cost to network accuracy, outperforming competing methods by a significant margin. We then discuss possible means by which to bypass this defense, additional defense methods, and advocate for continued research into the security of machine learning systems.
Authors:Yu Yan, Sheng Sun, Zhe Wang, Yijun Lin, Zenghao Duan, zhifei zheng, Min Liu, Zhiyi yin, Jianping Zhang
Abstract:
With the development of Large Language Models (LLMs), numerous efforts have revealed their vulnerabilities to jailbreak attacks. Although these studies have driven the progress in LLMs' safety alignment, it remains unclear whether LLMs have internalized authentic knowledge to deal with real-world crimes, or are merely forced to simulate toxic language patterns. This ambiguity raises concerns that jailbreak success is often attributable to a hallucination loop between jailbroken LLM and judger LLM. By decoupling the use of jailbreak techniques, we construct knowledge-intensive Q\&A to investigate the misuse threats of LLMs in terms of dangerous knowledge possession, harmful task planning utility, and harmfulness judgment robustness. Experiments reveal a mismatch between jailbreak success rates and harmful knowledge possession in LLMs, and existing LLM-as-a-judge frameworks tend to anchor harmfulness judgments on toxic language patterns. Our study reveals a gap between existing LLM safety assessments and real-world threat potential.
中文: 研究表明大语言模型的越狱成功率与其有害知识掌握程度存在错配,现有安全评估体系因过度关注表面语言模式而未能真实反映模型的实际威胁潜力,暴露了当前安全评估与现实风险之间的差距。
English: This study reveals a mismatch between jailbreak success rates and harmful knowledge possession in LLMs, showing that current safety assessments fail to capture their real-world threat potential as existing evaluation frameworks tend to anchor judgments on superficial language patterns rather than substantive dangerous knowledge.
Authors:Boheng Li, Renjie Gu, Junjie Wang, Leyi Qi, Yiming Li, Run Wang, Zhan Qin, Tianwei Zhang
Abstract:
Text-to-image (T2I) diffusion models have achieved impressive image generation quality and are increasingly fine-tuned for personalized applications. However, these models often inherit unsafe behaviors from toxic pretraining data, raising growing safety concerns. While recent safety-driven unlearning methods have made promising progress in suppressing model toxicity, they are identified to be fragile to downstream fine-tuning, where we reveal that state-of-the-art methods largely fail to retain their effectiveness even when fine-tuned on entirely benign datasets. To mitigate this problem, in this paper, we propose ResAlign, a safety-driven unlearning framework with enhanced resilience against downstream fine-tuning. By modeling downstream fine-tuning as an implicit optimization problem with a Moreau Envelope-based reformulation, ResAlign enables efficient gradient estimation to minimize the recovery of harmful behaviors. Additionally, a meta-learning strategy is proposed to simulate a diverse distribution of fine-tuning scenarios to improve generalization. Extensive experiments across a wide range of datasets, fine-tuning methods, and configurations demonstrate that ResAlign consistently outperforms prior unlearning approaches in retaining safety after downstream fine-tuning while preserving benign generation capability well.
Chinese: 本文提出了ResAlign,一种具有韧性的安全驱动遗忘框架,能有效抑制文本到图像扩散模型中的有害行为,并在下游良性数据集微调后仍保持安全性,优于现有方法。
English: The paper introduces ResAlign, a resilient safety-driven unlearning framework that effectively mitigates toxic behaviors in text-to-image diffusion models and maintains safety even after downstream fine-tuning on benign datasets, outperforming prior methods.
Authors:Tamer Ahmed Eltaras, Qutaibah Malluhi, Alessandro Savino, Stefano Di Carlo, Adnan Qayyum
Abstract:
Federated learning has emerged as a prominent privacy-preserving technique for leveraging large-scale distributed datasets by sharing gradients instead of raw data. However, recent studies indicate that private training data can still be exposed through gradient inversion attacks. While earlier analytical methods have demonstrated success in reconstructing input data from fully connected layers, their effectiveness significantly diminishes when applied to convolutional layers, high-dimensional inputs, and scenarios involving multiple training examples. This paper extends our previous work \cite{eltaras2024r} and proposes three advanced algorithms to broaden the applicability of gradient inversion attacks. The first algorithm presents a novel data leakage method that efficiently exploits convolutional layer gradients, demonstrating that even with non-fully invertible activation functions, such as ReLU, training samples can be analytically reconstructed directly from gradients without the need to reconstruct intermediate layer outputs. Building on this foundation, the second algorithm extends this analytical approach to support high-dimensional input data, substantially enhancing its utility across complex real-world datasets. The third algorithm introduces an innovative analytical method for reconstructing mini-batches, addressing a critical gap in current research that predominantly focuses on reconstructing only a single training example. Unlike previous studies that focused mainly on the weight constraints of convolutional layers, our approach emphasizes the pivotal role of gradient constraints, revealing that successful attacks can be executed with fewer than 5\% of the constraints previously deemed necessary in certain layers.
Chinese: 本文提出了三种先进算法,通过有效利用卷积层梯度显著增强了梯度反转攻击能力,实现了高维输入和小批量的数据重建,所需约束条件远少于以往研究。
English: This paper introduces three advanced algorithms that significantly enhance gradient inversion attacks by efficiently exploiting convolutional layer gradients, enabling the reconstruction of high-dimensional inputs and mini-batches with far fewer constraints than previously required.
Authors:Yao Wu, Ziye Jia, Qihui Wu, Yian Zhu
Abstract:
The advancement of low-altitude intelligent networks enables unmanned aerial vehicle (UAV) interconnection via flying ad-hoc networks (FANETs), offering flexibility and decentralized coordination. However, resource constraints, dynamic topologies, and UAV operations in open environments present significant security and communication challenges. Existing multi-factor and public-key cryptography protocols are vulnerable due to their reliance on stored sensitive information, increasing the risk of exposure and compromise. This paper proposes a lightweight authentication and key agreement protocol for FANETs, integrating physical unclonable functions with dynamic credential management and lightweight cryptographic primitives. The protocol reduces computational and communication overhead while enhancing security. Security analysis confirms its resilience against various attacks, and comparative evaluations demonstrate its superiority in security, communication efficiency, and computational cost.
中文摘要:本文提出了一种用于FANET的轻量级认证与密钥协商协议,通过结合物理不可克隆功能和动态凭证管理,在降低计算与通信开销的同时显著提升了系统安全性。
English Summary: This paper introduces a lightweight authentication and key agreement protocol for FANETs that integrates physical unclonable functions and dynamic credential management to enhance security while reducing computational and communication overhead.
Authors:Han Chen, Hanchen Wang, Hongmei Chen, Ying Zhang, Lu Qin, Wenjie Zhang
Abstract:
The advancement of graph-based malware analysis is critically limited by the absence of large-scale datasets that capture the inherent hierarchical structure of software. Existing methods often oversimplify programs into single level graphs, failing to model the crucial semantic relationship between high-level functional interactions and low-level instruction logic. To bridge this gap, we introduce \dataset, the largest public hierarchical graph dataset for malware analysis, comprising over \textbf{200M} Control Flow Graphs (CFGs) nested within \textbf{595K} Function Call Graphs (FCGs). This two-level representation preserves structural semantics essential for building robust detectors resilient to code obfuscation and malware evolution. We demonstrate HiGraph's utility through a large-scale analysis that reveals distinct structural properties of benign and malicious software, establishing it as a foundational benchmark for the community. The dataset and tools are publicly available at https://higraph.org.
中文: HiGraph数据集通过提供包含59.5万个函数调用图中嵌套的2亿多个控制流图,解决了恶意软件分析中缺乏大规模分层数据的问题,为区分良性软件和恶意软件的结构差异提供了有效基准。
English: The HiGraph dataset addresses the lack of large-scale hierarchical data in malware analysis by providing over 200 million nested control flow graphs within 595,000 function call graphs, enabling robust detection of structural differences between benign and malicious software.
Authors:Noel Teku, Fengwei Tian, Payel Bhattacharjee, Souradip Chakraborty, Amrit Singh Bedi, Ravi Tandon
Abstract:
Alignment is a key step in developing Large Language Models (LLMs) using human feedback to ensure adherence to human values and societal norms. Dependence on human feedback raises privacy concerns about how much a labeler's preferences may reveal about their personal values, beliefs, and personality traits. Existing approaches, such as Differentially Private SGD (DP-SGD), provide rigorous privacy guarantees by privatizing gradients during fine-tuning and alignment but can provide more privacy than necessary as human preferences are tied only to labels of (prompt, response) pairs and can degrade model utility. This work focuses on LLM alignment with preference-level privacy, which preserves the privacy of preference labels provided by humans. We propose PROPS (PROgressively Private Self-alignment), a multi-stage privacy preserving alignment framework where privately aligned models in previous stages can serve as labelers for supplementing training data in the subsequent stages of alignment. We present theoretical guarantees for PROPS as well as comprehensive validation using multiple models (Pythia and GPT) and datasets (AlpacaEval, Anthropic HH-RLHF, truthy-dpo-v0.1) to demonstrate the utility of PROPS over existing methods while still providing high privacy. For the same privacy budget, alignment via PROPS can achieve up to 3x higher win-rates compared to DP-SGD, and 2.5x higher win-rates compared to Randomized Response (RR) based alignment.
中文摘要:本文提出PROPS框架,通过多阶段隐私保护对齐方法在保护人类偏好标签隐私的同时保持模型性能,在相同隐私预算下比现有方法实现高达3倍的胜率提升。
English Summary: This paper introduces PROPS, a multi-stage privacy-preserving alignment framework that protects human preference labels during LLM training while maintaining model utility, achieving significantly higher win-rates than existing methods under the same privacy constraints.
Authors:Ziye Jia, Sijie He, Qiuming Zhu, Wei Wang, Qihui Wu, Zhu Han
Abstract:
Due to the high flexibility and versatility, unmanned aerial vehicles (UAVs) are leveraged in various fields including surveillance and disaster rescue.However, in UAV networks, routing is vulnerable to malicious damage due to distributed topologies and high dynamics. Hence, ensuring the routing security of UAV networks is challenging. In this paper, we characterize the routing process in a time-varying UAV network with malicious nodes. Specifically, we formulate the routing problem to minimize the total delay, which is an integer linear programming and intractable to solve. Then, to tackle the network security issue, a blockchain-based trust management mechanism (BTMM) is designed to dynamically evaluate trust values and identify low-trust UAVs. To improve traditional practical Byzantine fault tolerance algorithms in the blockchain, we propose a consensus UAV update mechanism. Besides, considering the local observability, the routing problem is reformulated into a decentralized partially observable Markov decision process. Further, a multi-agent double deep Q-network based routing algorithm is designed to minimize the total delay. Finally, simulations are conducted with attacked UAVs and numerical results show that the delay of the proposed mechanism decreases by 13.39$\%$, 12.74$\%$, and 16.6$\%$ than multi-agent proximal policy optimal algorithms, multi-agent deep Q-network algorithms, and methods without BTMM, respectively.
中文: 本文针对无人机网络路由安全问题,提出基于区块链的信任管理机制和多智能体强化学习算法,相比现有方法显著降低了通信延迟。
English: This paper addresses routing security challenges in UAV networks by proposing a blockchain-based trust management mechanism and a multi-agent reinforcement learning algorithm, which significantly reduces communication delays compared to existing methods.
Authors:Franco Oberti, Stefano Di Carlo, Alessandro Savino
Abstract:
The Controller Area Network (CAN) protocol, essential for automotive embedded systems, lacks inherent security features, making it vulnerable to cyber threats, especially with the rise of autonomous vehicles. Traditional security measures offer limited protection, such as payload encryption and message authentication. This paper presents a novel Intrusion Detection System (IDS) designed for the CAN environment, utilizing Hardware Performance Counters (HPCs) to detect anomalies indicative of cyber attacks. A RISC-V-based CAN receiver is simulated using the gem5 simulator, processing CAN frame payloads with AES-128 encryption as FreeRTOS tasks, which trigger distinct HPC responses. Key HPC features are optimized through data extraction and correlation analysis to enhance classification efficiency. Results indicate that this approach could significantly improve CAN security and address emerging challenges in automotive cybersecurity.
中文: 本文提出了一种基于硬件性能计数器的创新入侵检测系统,用于控制器局域网的网络安全防护,仿真结果表明该方法能有效提升汽车网络的安全性。
English: This paper introduces a novel Intrusion Detection System for the Controller Area Network that utilizes Hardware Performance Counters to detect cyber threats, demonstrating through simulations that it can significantly enhance automotive cybersecurity.
Authors:Kazuki Egashira, Robin Staab, Thibaud Gloaguen, Mark Vero, Martin Vechev
Abstract:
Model pruning, i.e., removing a subset of model weights, has become a prominent approach to reducing the memory footprint of large language models (LLMs) during inference. Notably, popular inference engines, such as vLLM, enable users to conveniently prune downloaded models before they are deployed. While the utility and efficiency of pruning methods have improved significantly, the security implications of pruning remain underexplored. In this work, for the first time, we show that modern LLM pruning methods can be maliciously exploited. In particular, an adversary can construct a model that appears benign yet, once pruned, exhibits malicious behaviors. Our method is based on the idea that the adversary can compute a proxy metric that estimates how likely each parameter is to be pruned. With this information, the adversary can first inject a malicious behavior into those parameters that are unlikely to be pruned. Then, they can repair the model by using parameters that are likely to be pruned, effectively canceling out the injected behavior in the unpruned model. We demonstrate the severity of our attack through extensive evaluation on five models; after any of the pruning in vLLM are applied (Magnitude, Wanda, and SparseGPT), it consistently exhibits strong malicious behaviors in a diverse set of attack scenarios (success rates of up to $95.7\%$ for jailbreak, $98.7\%$ for benign instruction refusal, and $99.5\%$ for targeted content injection). Our results reveal a critical deployment-time security gap and underscore the urgent need for stronger security awareness in model compression.
Chinese: 现代大语言模型剪枝方法可能被恶意利用,使模型在剪枝前看似无害,剪枝后却表现出恶意行为,这揭示了模型部署过程中的重大安全隐患。
English: Modern LLM pruning methods can be maliciously exploited to create models that appear benign but exhibit harmful behaviors after pruning, revealing a critical security gap in model deployment.
Authors:Arina Kharlamova, Bowei He, Chen Ma, Xue Liu
Abstract:
Online services rely on CAPTCHAs as a first line of defense against automated abuse, yet recent advances in multi-modal large language models (MLLMs) have eroded the effectiveness of conventional designs that focus on text recognition or 2D image understanding. To address this challenge, we present Spatial CAPTCHA, a novel human-verification framework that leverages fundamental differences in spatial reasoning between humans and MLLMs. Unlike existing CAPTCHAs which rely on low-level perception tasks that are vulnerable to modern AI, Spatial CAPTCHA generates dynamic questions requiring geometric reasoning, perspective-taking, occlusion handling, and mental rotation. These skills are intuitive for humans but difficult for state-of-the-art (SOTA) AI systems. The system employs a procedural generation pipeline with constraint-based difficulty control, automated correctness verification, and human-in-the-loop validation to ensure scalability, robustness, and adaptability. Evaluation on a corresponding benchmark, Spatial-CAPTCHA-Bench, demonstrates that humans vastly outperform 10 state-of-the-art MLLMs, with the best model achieving only 31.0% Pass@1 accuracy. Furthermore, we compare Spatial CAPTCHA with Google reCAPTCHA, which confirms its effectiveness as both a security mechanism and a diagnostic tool for spatial reasoning in AI.
Chinese: 空间验证码提出了一种新型人类验证框架,利用人类直观但先进AI难以处理的空间推理任务,在安全性和诊断能力上显著优于现有验证码系统。
English: Spatial CAPTCHA introduces a novel human-verification framework that leverages spatial reasoning tasks, which are intuitive for humans but challenging for advanced AI systems, significantly outperforming existing CAPTCHAs in security and diagnostic capabilities.
Authors:Marco Zimmerli, Andreas Plesner, Till Aczel, Roger Wattenhofer
Abstract:
Deep neural networks remain vulnerable to adversarial examples despite advances in architectures and training paradigms. We investigate how training data characteristics affect adversarial robustness across 36 state-of-the-art vision models spanning supervised, self-supervised, and contrastive learning approaches, trained on datasets from 1.2M to 22B images. Models were evaluated under six black-box attack categories: random perturbations, two types of geometric masks, COCO object manipulations, ImageNet-C corruptions, and ImageNet-R style shifts. Robustness follows a logarithmic scaling law with both data volume and model size: a tenfold increase in data reduces attack success rate (ASR) on average by ~3.2%, whereas a tenfold increase in model size reduces ASR on average by ~13.4%. Notably, some self-supervised models trained on curated datasets, such as DINOv2, outperform others trained on much larger but less curated datasets, challenging the assumption that scale alone drives robustness. Adversarial fine-tuning of ResNet50s improves generalization across structural variations but not across color distributions. Human evaluation reveals persistent gaps between human and machine vision. These results show that while scaling improves robustness, data quality, architecture, and training objectives play a more decisive role than raw scale in achieving broad-spectrum adversarial resilience.
Authors:Thibaud Gloaguen, Robin Staab, Nikola JovanoviÄ, Martin Vechev
Abstract:
We introduce the first watermark tailored for diffusion language models (DLMs), an emergent LLM paradigm able to generate tokens in arbitrary order, in contrast to standard autoregressive language models (ARLMs) which generate tokens sequentially. While there has been much work in ARLM watermarking, a key challenge when attempting to apply these schemes directly to the DLM setting is that they rely on previously generated tokens, which are not always available with DLM generation. In this work we address this challenge by: (i) applying the watermark in expectation over the context even when some context tokens are yet to be determined, and (ii) promoting tokens which increase the watermark strength when used as context for other tokens. This is accomplished while keeping the watermark detector unchanged. Our experimental evaluation demonstrates that the DLM watermark leads to a >99% true positive rate with minimal quality impact and achieves similar robustness to existing ARLM watermarks, enabling for the first time reliable DLM watermarking.
Chinese Summary: 本文针对扩散语言模型首次提出定制水印方案,通过在不完整上下文中应用期望水印并增强令牌的水印强度,实现了超过99%的检测准确率且对生成质量影响极小。
English Summary: This paper introduces the first watermark specifically designed for diffusion language models, which generates tokens in arbitrary order, by applying watermarks in expectation over incomplete contexts and promoting tokens that enhance watermark strength, achieving over 99% detection accuracy with minimal quality impact.
Authors:Yixu Wang, Yan Teng, Yingchun Wang, Xingjun Ma
Abstract:
Parameter-Efficient Fine-Tuning (PEFT) methods like LoRA have transformed vision model adaptation, enabling the rapid deployment of customized models. However, the compactness of LoRA adaptations introduces new safety concerns, particularly their vulnerability to model extraction attacks. This paper introduces a new focus of model extraction attacks named LoRA extraction that extracts LoRA-adaptive models based on a public pre-trained model. We then propose a novel extraction method called StolenLoRA which trains a substitute model to extract the functionality of a LoRA-adapted model using synthetic data. StolenLoRA leverages a Large Language Model to craft effective prompts for data generation, and it incorporates a Disagreement-based Semi-supervised Learning (DSL) strategy to maximize information gain from limited queries. Our experiments demonstrate the effectiveness of StolenLoRA, achieving up to a 96.60% attack success rate with only 10k queries, even in cross-backbone scenarios where the attacker and victim models utilize different pre-trained backbones. These findings reveal the specific vulnerability of LoRA-adapted models to this type of extraction and underscore the urgent need for robust defense mechanisms tailored to PEFT methods. We also explore a preliminary defense strategy based on diversified LoRA deployments, highlighting its potential to mitigate such attacks.
中文摘要:本文揭示了LoRA适配模型面临的新型提取攻击风险,提出的StolenLoRA方法仅用少量查询即可高效窃取模型功能,同时探讨了通过多样化部署的初步防御方案。
English Summary: This paper introduces LoRA extraction attacks, demonstrating how the StolenLoRA method can successfully steal LoRA-adapted model functionality with high success rates using minimal queries, while proposing preliminary defense strategies.
Authors:Marcin Chrapek, Marcin Copik, Etienne Mettaz, Torsten Hoefler
Abstract:
Large Language Models (LLMs) are increasingly deployed on converged Cloud and High-Performance Computing (HPC) infrastructure. However, as LLMs handle confidential inputs and are fine-tuned on costly, proprietary datasets, their heightened security requirements slow adoption in privacy-sensitive sectors such as healthcare and finance. We investigate methods to address this gap and propose Trusted Execution Environments (TEEs) as a solution for securing end-to-end LLM inference. We validate their practicality by evaluating these compute-intensive workloads entirely within CPU and GPU TEEs. On the CPU side, we conduct an in-depth study running full Llama2 inference pipelines (7B, 13B, 70B) inside Intel's TDX and SGX, accelerated by Advanced Matrix Extensions (AMX). We derive 12 insights, including that across various data types, batch sizes, and input lengths, CPU TEEs impose under 10% throughput and 20% latency overheads, further reduced by AMX. We run LLM inference on NVIDIA H100 Confidential Compute GPUs, contextualizing our CPU findings and observing throughput penalties of 4-8% that diminish as batch and input sizes grow. By comparing performance, cost, and security trade-offs, we show how CPU TEEs can be more cost-effective or secure than their GPU counterparts. To our knowledge, our work is the first to comprehensively demonstrate the performance and practicality of modern TEEs across both CPUs and GPUs for enabling confidential LLMs (cLLMs).
中文: 大语言模型在隐私敏感领域面临安全挑战,但可信执行环境通过为CPU和GPU提供端到端的机密推理保护,以可接受的性能损耗实现了实际可行的解决方案。
English: Large Language Models face security challenges in privacy-sensitive sectors, but Trusted Execution Environments (TEEs) offer a practical solution by enabling confidential LLM inference with minimal performance overhead on both CPUs and GPUs.
Authors:Jon Crowcroft, Anil Madhavapeddy, Chris Hicks, Richard Mortier, Vasilios Mavroudis
Abstract:
What if you could really revoke your actual biometric identity, and install a new one, by live rewriting your biological self? We propose some novel mechanisms for hot swapping identity based in novel biotechnology. We discuss the potential positive use cases, and negative consequences if such technology was to become available and affordable. Biometrics are selected on the basis that they are supposed to be unfakeable, or at least not at reasonable cost. If they become easier to fake, it may be much cheaper to fake someone else's biometrics than it is for you to change your own biometrics if someone does copy yours. This potentially makes biometrics a bad trade-off for the user. At the time of writing, this threat is highly speculative, but we believe it is worth raising and considering the potential consequences.
中文: 摘要探讨了利用生物技术热切换生物特征身份这一假设性概念,既指出了其潜在益处,也警示若生物特征易于更改,伪造他人特征的成本可能更低,从而削弱其对用户的安全优势。
English: The abstract explores the speculative concept of using biotechnology to hot-swap biometric identities, highlighting both potential benefits and the risk that making biometrics easily changeable could also make them cheaper to fake, thus undermining their security advantage for users.
Authors:Sophia Lockton, Jeremy Kepner, Michael Stonebraker, Hayden Jananthan, LaToya Anderson, William Arcand, David Bestor, William Bergeron, Alex Bonn, Daniel Burrill, Chansup Byun, Timothy Davis, Vijay Gadepally, Michael Houle, Matthew Hubbell, Michael Jones, Piotr Luszczek, Peter Michaleas, Lauren Milechin, Chasen Milner, Guillermo Morales, Julie Mullen, Michel Pelletier, Alex Poliakov, Andrew Prout, Albert Reuther, Antonio Rosa, Charles Yee, Alex Pentland
Abstract:
DBOS (DataBase Operating System) is a novel capability that integrates web services, operating system functions, and database features to significantly reduce web-deployment effort while increasing resilience. Integration of high performance network sensing enables DBOS web services to collaboratively create a shared awareness of their network environments to enhance their collective resilience and security. Network sensing is added to DBOS using GraphBLAS hypersparse traffic matrices via two approaches: (1) Python-GraphBLAS and (2) OneSparse PostgreSQL. These capabilities are demonstrated using the workflow and analytics from the IEEE/MIT/Amazon Anonymized Network Sensing Graph Challenge. The system was parallelized using pPython and benchmarked using 64 compute nodes on the MIT SuperCloud. The web request rate sustained by a single DBOS instance was ${>}10^5$, well above the required maximum, indicating that network sensing can be added to DBOS with negligible overhead. For collaborative awareness, many DBOS instances were connected to a single DBOS aggregator. The Python-GraphBLAS and OneSparse PostgreSQL implementations scaled linearly up to 64 and 32 nodes respectively. These results suggest that DBOS collaborative network awareness can be achieved with a negligible increase in computing resources.
中文摘要:DBOS通过整合网络服务、操作系统功能与数据库特性,显著简化了网络部署并提升了系统韧性;采用GraphBLAS方法实现的网络感知功能以可忽略的性能开销实现了多实例间的协同感知,且具备线性扩展能力。
English Summary: DBOS integrates web services, OS functions, and database capabilities to streamline web deployment and enhance resilience, with network sensing added via GraphBLAS methods showing negligible performance overhead while enabling scalable collaborative awareness among instances.
Authors:William Cashman, Chasen Milner, Michael Houle, Michael Jones, Hayden Jananthan, Jeremy Kepner, Peter Michaleas, Alex Pentland
Abstract:
AI development requires high fidelity testing environments to effectively transition from the laboratory to operations. The flexibility offered by cyber arenas presents a novel opportunity to test new artificial intelligence (AI) capabilities with users. Cyber arenas are designed to expose end-users to real-world situations and must rapidly incorporate evolving capabilities to meet their core objectives. To explore this concept the MIT/IEEE/Amazon Graph Challenge Anonymized Network Sensor was deployed in a cyber arena during a National Guard exercise.
中文: 网络竞技场通过在国家警卫队演习中部署MIT/IEEE/亚马逊图挑战项目,展示了其为用户测试人工智能能力提供的灵活且逼真的环境。
English: Cyber arenas provide a flexible and realistic environment for testing AI capabilities with users, as demonstrated by the deployment of the MIT/IEEE/Amazon Graph Challenge in a National Guard exercise.
Authors:Shae McFadden, Myles Foley, Mario D'Onghia, Chris Hicks, Vasilios Mavroudis, Nicola Paoletti, Fabio Pierazzi
Abstract:
Malware detection in real-world settings must deal with evolving threats, limited labeling budgets, and uncertain predictions. Traditional classifiers, without additional mechanisms, struggle to maintain performance under concept drift in malware domains, as their supervised learning formulation cannot optimize when to defer decisions to manual labeling and adaptation. Modern malware detection pipelines combine classifiers with monthly active learning (AL) and rejection mechanisms to mitigate the impact of concept drift. In this work, we develop a novel formulation of malware detection as a one-step Markov Decision Process and train a deep reinforcement learning (DRL) agent, simultaneously optimizing sample classification performance and rejecting high-risk samples for manual labeling. We evaluated the joint detection and drift mitigation policy learned by the DRL-based Malware Detection (DRMD) agent through time-aware evaluations on Android malware datasets subject to realistic drift requiring multi-year performance stability. The policies learned under these conditions achieve a higher Area Under Time (AUT) performance compared to standard classification approaches used in the domain, showing improved resilience to concept drift. Specifically, the DRMD agent achieved a $5.18\pm5.44$, $14.49\pm12.86$, and $10.06\pm10.81$ average AUT performance improvement for the classification only, classification with rejection, and classification with rejection and AL settings, respectively. Our results demonstrate for the first time that DRL can facilitate effective malware detection and improved resiliency to concept drift in the dynamic environment of the Android malware domain.
中文: 本研究提出了一种基于深度强化学习的恶意软件检测智能体,能同时优化分类和手动标记样本的筛选,在安卓恶意软件环境中显著提升了对概念漂移的抵抗力,并优于传统方法。
English: This study introduces a deep reinforcement learning (DRL)-based malware detection agent that simultaneously optimizes classification and rejection for manual labeling, demonstrating enhanced resilience to concept drift in Android malware with significant performance improvements over traditional methods.
Authors:Jeremy Kepner, Hayden Jananthan, Chasen Milner, Michael Houle, Michael Jones, Peter Michaleas, Alex Pentland
Abstract:
The advent of high-performance graph libraries, such as the GraphBLAS, has enabled the analysis of massive network data sets and revealed new models for their behavior. Physical analogies for complicated network behavior can be a useful aid to understanding these newly discovered network phenomena. Prior work leveraged the canonical Gull's Lighthouse problem and developed a computational heuristic for modeling large scale network traffic using this model. A general solution using this approach requires overcoming the essential mathematical singularities in the resulting differential equations. Further investigation reveals a simpler physical interpretation that alleviates the need for solving challenging differential equations. Specifically, that the probability of observing a source at a temporal ``distance'' $r(t)$ at time $t$ is $p(t) \propto 1/r(t)^2$. This analogy aligns with many physical phenomena and can be a rich source of intuition. Applying this physical analogy to the observed source correlations in the Anonymized Network Sensing Graph Challenge data leads to an elegant cyber orbit analogy that may assist with the understanding network behavior.
中文摘要:高性能图库促进了大规模网络分析,其中物理类比通过将源观测概率与时间距离的平方反比相关联,简化了复杂网络行为,为理解网络现象提供了直观见解。
English Summary: High-performance graph libraries enable the analysis of massive networks, where a physical analogy simplifies complex behavior by relating source observation probability to inverse square temporal distance, offering intuitive insights into network phenomena.
Authors:Robin Staab, Nikola JovanoviÄ, Kimberly Mai, Prakhar Ganesh, Martin Vechev, Ferdinando Fioretto, Matthew Jagielski
Abstract:
Data minimization (DM) describes the principle of collecting only the data strictly necessary for a given task. It is a foundational principle across major data protection regulations like GDPR and CPRA. Violations of this principle have substantial real-world consequences, with regulatory actions resulting in fines reaching hundreds of millions of dollars. Notably, the relevance of data minimization is particularly pronounced in machine learning (ML) applications, which typically rely on large datasets, resulting in an emerging research area known as Data Minimization in Machine Learning (DMML). At the same time, existing work on other ML privacy and security topics often addresses concerns relevant to DMML without explicitly acknowledging the connection. This disconnect leads to confusion among practitioners, complicating their efforts to implement DM principles and interpret the terminology, metrics, and evaluation criteria used across different research communities. To address this gap, our work introduces a comprehensive framework for DMML, including a unified data pipeline, adversaries, and points of minimization. This framework allows us to systematically review the literature on data minimization and \emph{DM-adjacent} methodologies, for the first time presenting a structured overview designed to help practitioners and researchers effectively apply DM principles. Our work facilitates a unified DM-centric understanding and broader adoption of data minimization strategies in AI/ML.
Chinese: 数据最小化是数据保护法规的核心原则,要求仅收集必要数据,在依赖大数据的机器学习领域尤为重要,催生了数据最小化机器学习这一新兴研究方向,我们的框架通过统一概念和方法来解决当前实践中的脱节问题,助力开发者和研究者应用该原则。
English: Data minimization is a key principle in data protection laws requiring collection of only essential data, especially crucial in machine learning where large datasets are common, leading to the emerging field of Data Minimization in Machine Learning (DMML) that our framework unifies to address current disconnects and aid practitioners.
Authors:Xin He, Junxi Shen, Zhenheng Tang, Xiaowen Chu, Bo Li, Ivor W. Tsang, Yew-Soon Ong
Abstract:
Model merging via Mixture-of-Experts (MoE) has emerged as a scalable solution for consolidating multiple task-specific models into a unified sparse architecture, where each expert is derived from a model fine-tuned on a distinct task. While effective for multi-task integration, this paradigm introduces a critical yet underexplored challenge: how to attribute and protect the intellectual property (IP) of individual experts after merging. We propose RouteMark, a framework for IP protection in merged MoE models through the design of expert routing fingerprints. Our key insight is that task-specific experts exhibit stable and distinctive routing behaviors under probing inputs. To capture these patterns, we construct expert-level fingerprints using two complementary statistics: the Routing Score Fingerprint (RSF), quantifying the intensity of expert activation, and the Routing Preference Fingerprint (RPF), characterizing the input distribution that preferentially activates each expert. These fingerprints are reproducible, task-discriminative, and lightweight to construct. For attribution and tampering detection, we introduce a similarity-based matching algorithm that compares expert fingerprints between a suspect and a reference (victim) model. Extensive experiments across diverse tasks and CLIP-based MoE architectures show that RouteMark consistently yields high similarity for reused experts and clear separation from unrelated ones. Moreover, it remains robust against both structural tampering (expert replacement, addition, deletion) and parametric tampering (fine-tuning, pruning, permutation), outperforming weight- and activation-based baseliness. Our work lays the foundation for RouteMark as a practical and broadly applicable framework for IP verification in MoE-based model merging.
中文摘要:RouteMark框架通过为混合专家模型中的各专家创建独特的路由指纹,实现了对合并后模型的知识产权保护,能够通过相似度匹配可靠地进行归属认定和篡改检测。
English Summary: RouteMark is a framework that protects intellectual property in merged Mixture-of-Experts models by creating distinctive routing fingerprints for each expert, enabling reliable attribution and tampering detection through similarity matching.
Authors:Sanyam Vyas, Alberto Caron, Chris Hicks, Pete Burnap, Vasilios Mavroudis
Abstract:
Deep Reinforcement Learning (DRL) systems are increasingly used in safety-critical applications, yet their security remains severely underexplored. This work investigates backdoor attacks, which implant hidden triggers that cause malicious actions only when specific inputs appear in the observation space. Existing DRL backdoor research focuses solely on training-time attacks requiring unrealistic access to the training pipeline. In contrast, we reveal critical vulnerabilities across the DRL supply chain where backdoors can be embedded with significantly reduced adversarial privileges. We introduce two novel attacks: (1) TrojanentRL, which exploits component-level flaws to implant a persistent backdoor that survives full model retraining; and (2) InfrectroRL, a post-training backdoor attack which requires no access to training, validation, nor test data. Empirical and analytical evaluations across six Atari environments show our attacks rival state-of-the-art training-time backdoor attacks while operating under much stricter adversarial constraints. We also demonstrate that InfrectroRL further evades two leading DRL backdoor defenses. These findings challenge the current research focus and highlight the urgent need for robust defenses.
中文摘要:本研究揭示了深度强化学习系统中的关键安全漏洞,提出了两种新型后门攻击方法——TrojanentRL和InfrectroRL,这些攻击在极低攻击权限下仍能实现与训练阶段攻击相当的效果,并能规避现有防御机制。
English Summary: This study exposes critical vulnerabilities in Deep Reinforcement Learning systems by introducing two novel backdoor attacks—TrojanentRL and InfrectroRL—that operate with minimal adversarial access while matching the effectiveness of existing training-time attacks and evading current defenses.
Authors:Baiqiang Wang, Qian Lou, Mengxin Zheng, Dongfang Zhao
Abstract:
Retrieval-Augmented Generation (RAG) has become a foundational component of modern AI systems, yet it introduces significant privacy risks by exposing user queries to service providers. To address this, we introduce PIR-RAG, a practical system for privacy-preserving RAG. PIR-RAG employs a novel architecture that uses coarse-grained semantic clustering to prune the search space, combined with a fast, lattice-based Private Information Retrieval (PIR) protocol. This design allows for the efficient retrieval of entire document clusters, uniquely optimizing for the end-to-end RAG workflow where full document content is required. Our comprehensive evaluation against strong baseline architectures, including graph-based PIR and Tiptoe-style private scoring, demonstrates PIR-RAG's scalability and its superior performance in terms of "RAG-Ready Latency"-the true end-to-end time required to securely fetch content for an LLM. Our work establishes PIR-RAG as a viable and highly efficient solution for privacy in large-scale AI systems.
中文: 检索增强生成(RAG)存在泄露用户查询的隐私风险,因此提出PIR-RAG系统,通过语义聚类和私有信息检索技术,在AI工作流中实现安全高效的文档内容获取。
English: Retrieval-Augmented Generation (RAG) poses privacy risks by exposing user queries, so PIR-RAG is introduced as a practical system using semantic clustering and Private Information Retrieval to securely fetch document content efficiently for AI workflows.
Authors:Rochana Prih Hastuti, Rian Adam Rajagede, Mansour Al Ghanim, Mengxin Zheng, Qian Lou
Abstract:
As large language models (LLMs) are adapted to sensitive domains such as medicine, their fluency raises safety risks, particularly regarding provenance and accountability. Watermarking embeds detectable patterns to mitigate these risks, yet its reliability in medical contexts remains untested. Existing benchmarks focus on detection-quality tradeoffs and overlook factual risks. In medical text, watermarking often reweights low-entropy tokens, which are highly predictable and often carry critical medical terminology. Shifting these tokens can cause inaccuracy and hallucinations, risks that prior general-domain benchmarks fail to capture. We propose a medical-focused evaluation workflow that jointly assesses factual accuracy and coherence. Using GPT-Judger and further human validation, we introduce the Factuality-Weighted Score (FWS), a composite metric prioritizing factual accuracy beyond coherence to guide watermarking deployment in medical domains. Our evaluation shows current watermarking methods substantially compromise medical factuality, with entropy shifts degrading medical entity representation. These findings underscore the need for domain-aware watermarking approaches that preserve the integrity of medical content.
中文摘要:当前大型语言模型的水印技术虽能降低风险,却因改变关键医学术语而严重损害事实准确性,需采用如事实性加权评分等专业评估方法,以保障医疗内容的精确性与完整性。
English Summary: Current watermarking techniques for large language models, while mitigating some risks, significantly compromise medical factuality by altering critical terminology, necessitating domain-specific evaluations like the proposed Factuality-Weighted Score to ensure accuracy and integrity in healthcare applications.
Authors:Jiaqi Xue, Yifei Zhao, Mengxin Zheng, Fan Yao, Yan Solihin, Qian Lou
Abstract:
Recent advances in Transformer models, e.g., large language models (LLMs), have brought tremendous breakthroughs in various artificial intelligence (AI) tasks, leading to their wide applications in many security-critical domains. Due to their unprecedented scale and prohibitively high development cost, these models have become highly valuable intellectual property for AI stakeholders and are increasingly deployed via machine learning as a service (MLaaS). However, MLaaS often runs on untrusted cloud infrastructure, exposing data and models to potential breaches. Mainstream protection mechanisms leverage trusted execution environments (TEEs) where confidentiality and integrity for secretive data are shielded using hardware-based encryption and integrity checking. Unfortunately, running model inference entirely within TEEs is subject to non-trivial slowdown, which is further exacerbated in LLMs due to the substantial computation and memory footprint involved. Recent studies reveal that the hybrid TEE-based scheme offloading partial model inference operations to the untrusted accelerators (e.g., GPU) is a promising solution. However, prior offloading schemes fail to ensure dual protection of data and model in Transformer inference, as they cannot securely offload critical operations, i.e., Attention and SoftMax, forcing these computations to remain confined within TEEs. To address these challenges, we propose TwinShield, a framework enabling secure Transformer inference in heterogeneous TEE and accelerator systems with dual protection for both model and data. TwinShield offloads ~87% of computation to GPUs and delivers 4.0x - 6.1x speedups over previous approaches across various Transformer models.
中文: 近期Transformer模型的进步使其广泛应用于安全关键领域,但通过MLaaS在不可信基础设施上的部署存在风险,TwinShield通过安全地将计算卸载到GPU,实现了双重保护并显著提升了速度。
English: Recent advances in Transformer models have led to their widespread use in security-critical domains, but their deployment via MLaaS on untrusted infrastructure poses risks, which TwinShield addresses by securely offloading computations to GPUs for dual protection and significant speed improvements.
Authors:Rui Chen, Bin Liu, Changtao Miao, Xinghao Wang, Yi Li, Tao Gong, Qi Chu, Nenghai Yu
Abstract:
Advances in image tampering pose serious security threats, underscoring the need for effective image manipulation localization (IML). While supervised IML achieves strong performance, it depends on costly pixel-level annotations. Existing weakly supervised or training-free alternatives often underperform and lack interpretability. We propose the In-Context Forensic Chain (ICFC), a training-free framework that leverages multi-modal large language models (MLLMs) for interpretable IML tasks. ICFC integrates an objectified rule construction with adaptive filtering to build a reliable knowledge base and a multi-step progressive reasoning pipeline that mirrors expert forensic workflows from coarse proposals to fine-grained forensics results. This design enables systematic exploitation of MLLM reasoning for image-level classification, pixel-level localization, and text-level interpretability. Across multiple benchmarks, ICFC not only surpasses state-of-the-art training-free methods but also achieves competitive or superior performance compared to weakly and fully supervised approaches.
Authors:Jonathan Sneh, Ruomei Yan, Jialin Yu, Philip Torr, Yarin Gal, Sunando Sengupta, Eric Sommerlade, Alasdair Paren, Adel Bibi
Abstract:
As LLMs increasingly power agents that interact with external tools, tool use has become an essential mechanism for extending their capabilities. These agents typically select tools from growing databases or marketplaces to solve user tasks, creating implicit competition among tool providers and developers for visibility and usage. In this paper, we show that this selection process harbors a critical vulnerability: by iteratively manipulating tool names and descriptions, adversaries can systematically bias agents toward selecting specific tools, gaining unfair advantage over equally capable alternatives. We present ToolTweak, a lightweight automatic attack that increases selection rates from a baseline of around 20% to as high as 81%, with strong transferability between open-source and closed-source models. Beyond individual tools, we show that such attacks cause distributional shifts in tool usage, revealing risks to fairness, competition, and security in emerging tool ecosystems. To mitigate these risks, we evaluate two defenses: paraphrasing and perplexity filtering, which reduce bias and lead agents to select functionally similar tools more equally. All code will be open-sourced upon acceptance.
中文: 本文揭示了LLM驱动代理在工具选择过程中的漏洞,攻击者可通过操纵工具名称和描述来偏袒特定工具,提出的ToolTweak攻击方法将选择率从20%提升至81%,同时评估了转述和困惑度过滤等防御措施以降低风险。
English: This paper reveals a vulnerability in LLM-powered agents' tool selection process where adversaries can manipulate tool names and descriptions to bias choices, and introduces ToolTweak, an attack method that increases selection rates from 20% to 81%, along with defenses like paraphrasing and perplexity filtering to mitigate risks.
Authors:Alexandrine Fortier, Thomas Thebaud, Jesús Villalba, Najim Dehak, Patrick Cardinal
Abstract:
Large Language Models (LLMs) and their multimodal extensions are becoming increasingly popular. One common approach to enable multimodality is to cascade domain-specific encoders with an LLM, making the resulting model inherit vulnerabilities from all of its components. In this work, we present the first systematic study of audio backdoor attacks against speech language models. We demonstrate its effectiveness across four speech encoders and three datasets, covering four tasks: automatic speech recognition (ASR), speech emotion recognition, and gender and age prediction. The attack consistently achieves high success rates, ranging from 90.76% to 99.41%. To better understand how backdoors propagate, we conduct a component-wise analysis to identify the most vulnerable stages of the pipeline. Finally, we propose a fine-tuning-based defense that mitigates the threat of poisoned pretrained encoders.
Authors:Leonhard Grosse, Sara Saeidian, Mikael Skoglund, Tobias J. Oechtering
Abstract:
Pointwise maximal leakage (PML) is a per-outcome privacy measure based on threat models from quantitative information flow. Privacy guarantees with PML rely on knowledge about the distribution that generated the private data. In this work, we propose a framework for PML privacy assessment and mechanism design with empirical estimates of this data-generating distribution. By extending the PML framework to consider sets of data-generating distributions, we arrive at bounds on the worst-case leakage within a given set. We use these bounds alongside large-deviation bounds from the literature to provide a method for obtaining distribution-independent $(\varepsilon,δ)$-PML guarantees when the data-generating distribution is estimated from available data samples. We provide an optimal binary mechanism, and show that mechanism design with this type of uncertainty about the data-generating distribution reduces to a linearly constrained convex program. Further, we show that optimal mechanisms designed for a distribution estimate can be used. Finally, we apply these tools to leakage assessment of the Laplace mechanism and the Gaussian mechanism for binary private data, and numerically show that the presented approach to mechanism design can yield significant utility increase compared to local differential privacy, while retaining similar privacy guarantees.
中文: 本研究提出了一个基于经验数据分布估计的点式最大泄露隐私评估框架,实现了分布无关的隐私保证和最优机制设计,相比本地差分隐私显著提升了效用。
English: This work introduces a framework for assessing pointwise maximal leakage (PML) privacy using empirical data distribution estimates, enabling distribution-independent guarantees and optimal mechanism design that significantly improves utility over local differential privacy.
Authors:Andrea Ponte, Luca Demetrio, Luca Oneto, Ivan Tesfai Ogbu, Battista Biggio, Fabio Roli
Abstract:
Malware detection increasingly relies on AI systems that integrate signature-based detection with machine learning. However, these components are typically developed and combined in isolation, missing opportunities to reduce data complexity and strengthen defenses against adversarial EXEmples, carefully crafted programs designed to evade detection. Hence, in this work we investigate the influence that signature-based detection exerts on model training, when they are included inside the training pipeline. Specifically, we compare models trained on a comprehensive dataset with an AI system whose machine learning component is trained solely on samples not already flagged by signatures. Our results demonstrate improved robustness to both adversarial EXEmples and temporal data drift, although this comes at the cost of a fixed lower bound on false positives, driven by suboptimal rule selection. We conclude by discussing these limitations and outlining how future research could extend AI-based malware detection to include dynamic analysis, thereby further enhancing system resilience.
Chinese: 本研究发现在AI训练流程中整合基于签名的检测可提升对抗样本和数据漂移的鲁棒性,但由于规则选择不完善会引入固定的误报率下限。
English: This study finds that integrating signature-based detection into the AI training pipeline enhances robustness against adversarial attacks and data drift, though it introduces a fixed false positive rate due to imperfect rule selection.
Authors:Daniele Ghiani, Daniele Angioni, Giorgio Piras, Angelo Sotgiu, Luca Minnei, Srishti Gupta, Maura Pintor, Fabio Roli, Battista Biggio
Abstract:
Malware evolves rapidly, forcing machine learning (ML)-based detectors to adapt continuously. With antivirus vendors processing hundreds of thousands of new samples daily, datasets can grow to billions of examples, making full retraining impractical. Continual learning (CL) has emerged as a scalable alternative, enabling incremental updates without full data access while mitigating catastrophic forgetting. In this work, we analyze a critical yet overlooked issue in this context: security regression. Unlike forgetting, which manifests as a general performance drop on previously seen data, security regression captures harmful prediction changes at the sample level, such as a malware sample that was once correctly detected but evades detection after a model update. Although often overlooked, regressions pose serious risks in security-critical applications, as the silent reintroduction of previously detected threats in the system may undermine users' trust in the whole updating process. To address this issue, we formalize and quantify security regression in CL-based malware detectors and propose a regression-aware penalty to mitigate it. Specifically, we adapt Positive Congruent Training (PCT) to the CL setting, preserving prior predictive behavior in a model-agnostic manner. Experiments on the ELSA, Tesseract, and AZ-Class datasets show that our method effectively reduces regression across different CL scenarios while maintaining strong detection performance over time.
中文: 本研究针对持续学习恶意软件检测器中的安全回归问题,即模型更新导致已检测到的恶意软件重新逃避检测,提出了一种回归感知的惩罚方法,在保持检测性能的同时有效缓解了该问题。
English: The study addresses security regression in continual learning-based malware detectors, where model updates cause previously detected malware to evade detection, and proposes a regression-aware penalty method that effectively mitigates this issue while maintaining detection performance.
Authors:Antonio Emanuele CinÃ, Maura Pintor, Luca Demetrio, Ambra Demontis, Battista Biggio, Fabio Roli
Abstract:
Despite significant progress in designing powerful adversarial evasion attacks for robustness verification, the evaluation of these methods often remains inconsistent and unreliable. Many assessments rely on mismatched models, unverified implementations, and uneven computational budgets, which can lead to biased results and a false sense of security. Consequently, robustness claims built on such flawed testing protocols may be misleading and give a false sense of security. As a concrete step toward improving evaluation reliability, we present AttackBench, a benchmark framework developed to assess the effectiveness of gradient-based attacks under standardized and reproducible conditions. AttackBench serves as an evaluation tool that ranks existing attack implementations based on a novel optimality metric, which enables researchers and practitioners to identify the most reliable and effective attack for use in subsequent robustness evaluations. The framework enforces consistent testing conditions and enables continuous updates, making it a reliable foundation for robustness verification.
Chinese: 当前对抗性规避攻击的评估常存在不一致和不可靠的问题,导致结果偏差和误导性的鲁棒性声明,为此开发了AttackBench这一标准化基准框架,用于对攻击实现进行排名,以支持可靠的鲁棒性验证。
English: Current evaluations of adversarial evasion attacks are often inconsistent and unreliable, leading to biased results and misleading robustness claims, prompting the development of AttackBench, a standardized benchmark framework that ranks attack implementations for reliable robustness verification.
Authors:Yu Zhang, Shuliang Liu, Xu Yang, Xuming Hu
Abstract:
Watermarking algorithms for Large Language Models (LLMs) effectively identify machine-generated content by embedding and detecting hidden statistical features in text. However, such embedding leads to a decline in text quality, especially in low-entropy scenarios where performance needs improvement. Existing methods that rely on entropy thresholds often require significant computational resources for tuning and demonstrate poor adaptability to unknown or cross-task generation scenarios. We propose \textbf{C}ontext-\textbf{A}ware \textbf{T}hreshold watermarking ($\myalgo$), a novel framework that dynamically adjusts watermarking intensity based on real-time semantic context. $\myalgo$ partitions text generation into semantic states using logits clustering, establishing context-aware entropy thresholds that preserve fidelity in structured content while embedding robust watermarks. Crucially, it requires no pre-defined thresholds or task-specific tuning. Experiments show $\myalgo$ improves text quality in cross-tasks without sacrificing detection accuracy.
中文: 提出的上下文感知阈值水印(CAT)框架通过实时语义上下文动态调整水印强度,在跨任务场景中提升文本质量,同时保持检测精度且无需预设阈值。
English: The proposed Context-Aware Threshold watermarking (CAT) framework dynamically adjusts watermarking intensity using real-time semantic context, improving text quality in cross-task scenarios without compromising detection accuracy or requiring predefined thresholds.
Authors:João Vitorino, Eva Maia, Isabel Praça, Carlos Soares
Abstract:
Due to the susceptibility of Artificial Intelligence (AI) to data perturbations and adversarial examples, it is crucial to perform a thorough robustness evaluation before any Machine Learning (ML) model is deployed. However, examining a model's decision boundaries and identifying potential vulnerabilities typically requires access to the training and testing datasets, which may pose risks to data privacy and confidentiality. To improve transparency in organizations that handle confidential data or manage critical infrastructure, it is essential to allow external verification and validation of AI without the disclosure of private datasets. This paper presents Systematic Pattern Analysis (SPATA), a deterministic method that converts any tabular dataset to a domain-independent representation of its statistical patterns, to provide more detailed and transparent data cards. SPATA computes the projection of each data instance into a discrete space where they can be analyzed and compared, without risking data leakage. These projected datasets can be reliably used for the evaluation of how different features affect ML model robustness and for the generation of interpretable explanations of their behavior, contributing to more trustworthy AI.
Authors:Yihan Wu, Xuehao Cui, Ruibo Chen, Heng Huang
Abstract:
Verifying the authenticity of AI-generated text has become increasingly important with the rapid advancement of large language models, and unbiased watermarking has emerged as a promising approach due to its ability to preserve output distribution without degrading quality. However, recent work reveals that unbiased watermarks can accumulate distributional bias over multiple generations and that existing robustness evaluations are inconsistent across studies. To address these issues, we introduce UWbench, the first open-source benchmark dedicated to the principled evaluation of unbiased watermarking methods. Our framework combines theoretical and empirical contributions: we propose a statistical metric to quantify multi-batch distribution drift, prove an impossibility result showing that no unbiased watermark can perfectly preserve the distribution under infinite queries, and develop a formal analysis of robustness against token-level modification attacks. Complementing this theory, we establish a three-axis evaluation protocol: unbiasedness, detectability, and robustness, and show that token modification attacks provide more stable robustness assessments than paraphrasing-based methods. Together, UWbench offers the community a standardized and reproducible platform for advancing the design and evaluation of unbiased watermarking algorithms.
中文: 该摘要介绍了UWbench,这是一个全面的开源基准,旨在通过理论分析和三轴评估协议,解决无偏水印方法在AI生成文本中分布偏差累积和鲁棒性评估不一致的问题。
English: The abstract introduces UWbench, a comprehensive open-source benchmark designed to evaluate unbiased watermarking methods for AI-generated text by addressing distributional bias accumulation and inconsistent robustness assessments through theoretical analysis and a three-axis evaluation protocol.
Authors:Yihan Wu, Ruibo Chen, Georgios Milis, Heng Huang
Abstract:
As large language models become increasingly capable and widely deployed, verifying the provenance of machine-generated content is critical to ensuring trust, safety, and accountability. Watermarking techniques have emerged as a promising solution by embedding imperceptible statistical signals into the generation process. Among them, unbiased watermarking is particularly attractive due to its theoretical guarantee of preserving the language model's output distribution, thereby avoiding degradation in fluency or detectability through distributional shifts. However, existing unbiased watermarking schemes often suffer from weak detection power and limited robustness, especially under short text lengths or distributional perturbations. In this work, we propose ENS, a novel ensemble framework that enhances the detectability and robustness of logits-based unbiased watermarks while strictly preserving their unbiasedness. ENS sequentially composes multiple independent watermark instances, each governed by a distinct key, to amplify the watermark signal. We theoretically prove that the ensemble construction remains unbiased in expectation and demonstrate how it improves the signal-to-noise ratio for statistical detectors. Empirical evaluations on multiple LLM families show that ENS substantially reduces the number of tokens needed for reliable detection and increases resistance to smoothing and paraphrasing attacks without compromising generation quality.
中文: ENS框架通过顺序组合多个水印实例,在保持无偏性的同时增强了大型语言模型无偏水印的可检测性和鲁棒性,有效提升了抗攻击的检测性能。
English: The ENS framework enhances the detectability and robustness of unbiased watermarking for large language models by sequentially combining multiple watermark instances, preserving unbiasedness while improving detection performance against attacks.
Authors:Wei Guo, Maura Pintor, Ambra Demontis, Battista Biggio
Abstract:
Backdoor attacks poison the training data to embed a backdoor in the model, causing it to behave normally on legitimate inputs but maliciously when specific trigger signals appear. Training a benign model from a dataset poisoned by backdoor attacks is challenging. Existing works rely on various assumptions and can only defend against backdoor attacks with specific trigger signals, high poisoning ratios, or when the defender possesses a large, untainted, validation dataset. In this paper, we propose a defense called Prototype-Guided Robust Learning (PGRL), which overcomes all the aforementioned limitations, being robust against diverse backdoor attacks. Leveraging a tiny set of benign samples, PGRL generates prototype vectors to guide the training process. We compare our PGRL with 8 existing defenses, showing that it achieves superior robustness. We also demonstrate that PGRL generalizes well across various architectures, datasets, and advanced attacks. Finally, to evaluate our PGRL in the worst-case scenario, we perform an adaptive attack, where the attackers fully know the details of the defense.
Chinese: 本文提出的原型引导鲁棒学习(PGRL)方法通过少量良性样本生成原型向量,有效抵御多种后门攻击,克服了现有防御手段的局限,并在不同架构、数据集和攻击场景下展现出卓越的鲁棒性。
English: This paper introduces Prototype-Guided Robust Learning (PGRL), a defense method that effectively counters diverse backdoor attacks by using a small set of benign samples to generate prototype vectors, overcoming the limitations of existing approaches and demonstrating superior robustness across various architectures, datasets, and attacks.
Authors:Wei Guo, Maura Pintor, Ambra Demontis, Battista Biggio
Abstract:
In the deployment phase, semi-structured sparsity accelerates the execution of deep neural networks on modern GPUs via sparse matrix multiplication. In this paper, targeting the semi-structured sparsity, we introduce a Silent Until Sparse (SUS) backdoor attack, where the released full model remains silent (benign), but becomes a backdoored model after sparsification. The attack operates in two phases: (i) in the backdoor training phase, the backdoor functionality is injected into specific weights that will be retained during the pruning process; (ii) in the backdoor hiding phase, the malicious behavior is concealed by fine-tuning elements that will be pruned away. This dual-phase approach ensures that the attack remains undetectable in the released model, but activates properly once the model is pruned with the semi-structured sparsity. Through extensive experiments, we show that our attack successfully threatens the semi-structured sparsity algorithms from both NVIDIA and PyTorch. Our empirical results show that, regardless of model architecture, the attack success rate of the released model remains below 10% prior to sparsification but exceeds 99% afterward. Moreover, we demonstrate that SUS attack is robust against state-of-the-art backdoor defenses and finetuning, highlighting a critical vulnerability in current model compression and deployment pipelines.
中文摘要:SUS后门攻击针对半结构化稀疏性,使模型在剪枝前保持良性,剪枝后激活恶意功能,攻击成功率从不足10%跃升至超过99%,且能规避现有防御措施。
English Summary: The SUS backdoor attack targets semi-structured sparsity by creating models that remain benign until pruned, achieving over 99% attack success post-sparsification while evading detection in the full model.
Authors:Pedro Correia, Ivan Silva, Ivone Amorim, Eva Maia, Isabel Praça
Abstract:
Federated Learning (FL) is a distributed machine learning approach that promises privacy by keeping the data on the device. However, gradient reconstruction and membership-inference attacks show that model updates still leak information. Fully Homomorphic Encryption (FHE) can address those privacy concerns but it suffers from ciphertext expansion and requires prohibitive overhead on resource-constrained devices. We propose the first Hybrid Homomorphic Encryption (HHE) framework for FL that pairs the PASTA symmetric cipher with the BFV FHE scheme. Clients encrypt local model updates with PASTA and send both the lightweight ciphertexts and the PASTA key (itself BFV-encrypted) to the server, which performs a homomorphic evaluation of the decryption circuit of PASTA and aggregates the resulting BFV ciphertexts. A prototype implementation, developed on top of the Flower FL framework, shows that on independently and identically distributed MNIST dataset with 12 clients and 10 training rounds, the proposed HHE system achieves 97.6% accuracy, just 1.3% below plaintext, while reducing client upload bandwidth by over 2,000x and cutting client runtime by 30% compared to a system based solely on the BFV FHE scheme. However, server computational cost increases by roughly 15621x for each client participating in the training phase, a challenge to be addressed in future work.
Authors:Paulo Mendes, Eva Maia, Isabel Praça
Abstract:
Phishing emails continue to pose a significant threat to cybersecurity by exploiting human vulnerabilities through deceptive content and malicious payloads. While Machine Learning (ML) models are effective at detecting phishing threats, their performance largely relies on the quality and diversity of the training data. This paper presents MeAJOR (Merged email Assets from Joint Open-source Repositories) Corpus, a novel, multi-source phishing email dataset designed to overcome critical limitations in existing resources. It integrates 135894 samples representing a broad number of phishing tactics and legitimate emails, with a wide spectrum of engineered features. We evaluated the dataset's utility for phishing detection research through systematic experiments with four classification models (RF, XGB, MLP, and CNN) across multiple feature configurations. Results highlight the dataset's effectiveness, achieving 98.34% F1 with XGB. By integrating broad features from multiple categories, our dataset provides a reusable and consistent resource, while addressing common challenges like class imbalance, generalisability and reproducibility.
Chinese Summary: 本文提出MeAJOR语料库,这是一个整合多来源钓鱼邮件和合法邮件的新型数据集,通过综合特征工程有效提升检测性能,在XGBoost分类器中实现了98.34%的F1分数,解决了现有数据集的泛化性和可复现性等关键问题。
English Summary: This paper introduces the MeAJOR Corpus, a comprehensive multi-source phishing email dataset that enhances detection capabilities by integrating diverse samples and engineered features, achieving 98.34% F1 score with XGBoost in evaluations.
Authors:Tianneng Shi, Kaijie Zhu, Zhun Wang, Yuqi Jia, Will Cai, Weida Liang, Haonan Wang, Hend Alzahrani, Joshua Lu, Kenji Kawaguchi, Basel Alomair, Xuandong Zhao, William Yang Wang, Neil Gong, Wenbo Guo, Dawn Song
Abstract:
Despite their potential, recent research has demonstrated that LLM agents are vulnerable to prompt injection attacks, where malicious prompts are injected into the agent's input, causing it to perform an attacker-specified task rather than the intended task provided by the user. In this paper, we present PromptArmor, a simple yet effective defense against prompt injection attacks. Specifically, PromptArmor prompts an off-the-shelf LLM to detect and remove potential injected prompts from the input before the agent processes it. Our results show that PromptArmor can accurately identify and remove injected prompts. For example, using GPT-4o, GPT-4.1, or o4-mini, PromptArmor achieves both a false positive rate and a false negative rate below 1% on the AgentDojo benchmark. Moreover, after removing injected prompts with PromptArmor, the attack success rate drops to below 1%. We also demonstrate PromptArmor's effectiveness against adaptive attacks and explore different strategies for prompting an LLM. We recommend that PromptArmor be adopted as a standard baseline for evaluating new defenses against prompt injection attacks.
Chinese: 最新研究表明,LLM智能体易受提示注入攻击,而PromptArmor通过调用现成LLM检测并清除输入中的恶意提示,有效防御此类攻击,将攻击成功率降至1%以下,且误报率和漏报率极低。
English: Recent research reveals that LLM agents are susceptible to prompt injection attacks, but PromptArmor effectively defends against them by using an off-the-shelf LLM to detect and remove malicious prompts, reducing attack success rates to below 1% with minimal false positives and negatives.
Authors:Alexander Xiong, Xuandong Zhao, Aneesh Pappu, Dawn Song
Abstract:
Large Language Models (LLMs) have demonstrated remarkable capabilities across a wide range of tasks, yet they also exhibit memorization of their training data. This phenomenon raises critical questions about model behavior, privacy risks, and the boundary between learning and memorization. Addressing these concerns, this paper synthesizes recent studies and investigates the landscape of memorization, the factors influencing it, and methods for its detection and mitigation. We explore key drivers, including training data duplication, training dynamics, and fine-tuning procedures that influence data memorization. In addition, we examine methodologies such as prefix-based extraction, membership inference, and adversarial prompting, assessing their effectiveness in detecting and measuring memorized content. Beyond technical analysis, we also explore the broader implications of memorization, including the legal and ethical implications. Finally, we discuss mitigation strategies, including data cleaning, differential privacy, and post-training unlearning, while highlighting open challenges in balancing the minimization of harmful memorization with utility. This paper provides a comprehensive overview of the current state of research on LLM memorization across technical, privacy, and performance dimensions, identifying critical directions for future work.
中文: 本文全面研究了大语言模型的记忆现象,分析了其成因、检测方法和缓解策略,同时探讨了相关的隐私风险与伦理影响。
English: This paper comprehensively examines the memorization phenomenon in Large Language Models, analyzing its causes, detection methods, and mitigation strategies while addressing associated privacy risks and ethical implications.
Authors:Xiaomeng Hu, Pin-Yu Chen, Tsung-Yi Ho
Abstract:
As large language models (LLMs) become more integral to society and technology, ensuring their safety becomes essential. Jailbreak attacks exploit vulnerabilities to bypass safety guardrails, posing a significant threat. However, the mechanisms enabling these attacks are not well understood. In this paper, we reveal a universal phenomenon that occurs during jailbreak attacks: Attention Slipping. During this phenomenon, the model gradually reduces the attention it allocates to unsafe requests in a user query during the attack process, ultimately causing a jailbreak. We show Attention Slipping is consistent across various jailbreak methods, including gradient-based token replacement, prompt-level template refinement, and in-context learning. Additionally, we evaluate two defenses based on query perturbation, Token Highlighter and SmoothLLM, and find they indirectly mitigate Attention Slipping, with their effectiveness positively correlated with the degree of mitigation achieved. Inspired by this finding, we propose Attention Sharpening, a new defense that directly counters Attention Slipping by sharpening the attention score distribution using temperature scaling. Experiments on four leading LLMs (Gemma2-9B-It, Llama3.1-8B-It, Qwen2.5-7B-It, Mistral-7B-It v0.2) show that our method effectively resists various jailbreak attacks while maintaining performance on benign tasks on AlpacaEval. Importantly, Attention Sharpening introduces no additional computational or memory overhead, making it an efficient and practical solution for real-world deployment.
中文摘要:本文揭示了越狱攻击中的普遍现象"注意力滑移"——模型在攻击过程中逐渐降低对危险请求的关注度,并提出直接对抗该现象的"注意力锐化"防御方法,该方法无需额外计算开销即可有效抵御各类越狱攻击。
English Summary: This paper identifies "Attention Slipping" as a universal mechanism in jailbreak attacks where models gradually reduce attention to unsafe requests, and proposes Attention Sharpening defense that directly counters this phenomenon without computational overhead.
Authors:Anqi Chen, Riccardo Preatoni, Alessandro Brighente, Mauro Conti, Cristina Nita-Rotaru
Abstract:
5G marks a major departure from previous cellular architectures, by transitioning from a monolithic design of the core network to a Service-Based Architecture (SBA) where services are modularized as Network Functions (NFs) which communicate with each other via standard-defined HTTP-based APIs called Service-Based Interfaces (SBIs). These NFs are deployed in private and public cloud infrastructure, and an access control framework based on OAuth restricts how they communicate with each other and obtain access to resources. Given the increased vulnerabilities of clouds to insiders, it is important to study the security of the 5G Core services for vulnerabilities that allow attackers to use compromised NFs to obtain unauthorized access to resources. We present FivGeeFuzz, a grammar-based fuzzing framework designed to uncover security flaws in 5G core SBIs. FivGeeFuzz automatically derives grammars from 3GPP API specifications to generate malformed, unexpected, or semantically inconsistent inputs, and it integrates automated bug detection with manual validation and root-cause analysis. We evaluate our approach on free5GC, the only open-source 5G core implementing Release 17-compliant SBIs with an access control mechanism. Using FivGeeFuzz, we discovered 8 previously unknown vulnerabilities in free5GC, leading to runtime crashes, improper error handling, and unauthorized access to resources, including a very severe attack we call Cross-Service Token Attack. All bugs were confirmed by the free5GC team, 7 have already been patched, and the remaining one has a patch under development.
中文: 5G采用基于服务的架构,通过模块化网络功能进行通信,而FivGeeFuzz模糊测试框架在free5GC中发现了8个漏洞,包括未授权访问等安全风险。
English: 5G introduces a Service-Based Architecture with modular Network Functions communicating via HTTP-based APIs, and FivGeeFuzz is a fuzzing framework that discovered 8 vulnerabilities in free5GC, including unauthorized access risks.
Authors:Ching-Chun Chang, Isao Echizen
Abstract:
The rise of synthetic media has blurred the boundary between reality and fabrication under the evolving power of artificial intelligence, fueling an infodemic that erodes public trust in cyberspace. For digital imagery, a multitude of editing applications further complicates the forensic analysis, including semantic edits that alter content, photometric adjustments that recalibrate colour characteristics, and geometric projections that reshape viewpoints. Collectively, these transformations manipulate and control perceptual interpretation of digital imagery. This susceptibility calls for forensic enquiry into reconstructing the chain of events, thereby revealing deeper evidential insight into the presence or absence of criminal intent. This study seeks to address an inverse problem of tracing the underlying generation chain that gives rise to the observed synthetic media. A tell-tale watermarking system is developed for explanatory reasoning over the nature and extent of transformations across the lifecycle of synthetic media. Tell-tale watermarks are tailored to different classes of transformations, responding in a manner that is neither strictly robust nor fragile but instead interpretable. These watermarks function as reference clues that evolve under the same transformation dynamics as the carrier media, leaving interpretable traces when subjected to transformations. Explanatory reasoning is then performed to infer the most plausible account across the combinatorial parameter space of composite transformations. Experimental evaluations demonstrate the validity of tell-tale watermarking with respect to fidelity, synchronicity and traceability.
中文摘要:本研究开发了一种可追溯水印系统,通过嵌入随合成媒体同步演变的可解读水印来追踪其变换历程,从而实现对媒体篡改和犯罪意图的取证分析。
English Summary: The study develops a tell-tale watermarking system to trace the transformation history of synthetic media by embedding interpretable watermarks that evolve with the media, enabling forensic analysis to detect manipulation and criminal intent.
Authors:Shuo Shao, Yiming Li, Hongwei Yao, Yifei Chen, Yuchen Yang, Zhan Qin
Abstract:
The substantial investment required to develop Large Language Models (LLMs) makes them valuable intellectual property, raising significant concerns about copyright protection. LLM fingerprinting has emerged as a key technique to address this, which aims to verify a model's origin by extracting an intrinsic, unique signature (a "fingerprint") and comparing it to that of a source model to identify illicit copies. However, existing black-box fingerprinting methods often fail to generate distinctive LLM fingerprints. This ineffectiveness arises because black-box methods typically rely on model outputs, which lose critical information about the model's unique parameters due to the usage of non-linear functions. To address this, we first leverage Fisher Information Theory to formally demonstrate that the gradient of the model's input is a more informative feature for fingerprinting than the output. Based on this insight, we propose ZeroPrint, a novel method that approximates these information-rich gradients in a black-box setting using zeroth-order estimation. ZeroPrint overcomes the challenge of applying this to discrete text by simulating input perturbations via semantic-preserving word substitutions. This operation allows ZeroPrint to estimate the model's Jacobian matrix as a unique fingerprint. Experiments on the standard benchmark show ZeroPrint achieves a state-of-the-art effectiveness and robustness, significantly outperforming existing black-box methods.
中文摘要:本研究提出ZeroPrint,一种新颖的黑盒指纹方法,通过语义保留的词语替换进行零阶估计来近似模型梯度,克服了基于输出方法的局限性,在识别非法大语言模型副本方面实现了卓越的有效性和鲁棒性。
English Summary: The study introduces ZeroPrint, a novel black-box fingerprinting method that uses zeroth-order estimation to approximate model gradients through semantic word substitutions, achieving superior effectiveness and robustness in identifying illicit LLM copies by overcoming the limitations of output-based approaches.
Authors:Guangyu Shen, Siyuan Cheng, Xiangzhe Xu, Yuan Zhou, Hanxi Guo, Zhuo Zhang, Xiangyu Zhang
Abstract:
Large Language Models (LLMs) can acquire deceptive behaviors through backdoor attacks, where the model executes prohibited actions whenever secret triggers appear in the input. Existing safety training methods largely fail to address this vulnerability, due to the inherent difficulty of uncovering hidden triggers implanted in the model. Motivated by recent findings on LLMs' situational awareness, we propose a novel post-training framework that cultivates self-awareness of backdoor risks and enables models to articulate implanted triggers even when they are absent from the prompt. At its core, our approach introduces an inversion-inspired reinforcement learning framework that encourages models to introspectively reason about their own behaviors and reverse-engineer the triggers responsible for misaligned outputs. Guided by curated reward signals, this process transforms a poisoned model into one capable of precisely identifying its implanted trigger. Surprisingly, we observe that such backdoor self-awareness emerges abruptly within a short training window, resembling a phase transition in capability. Building on this emergent property, we further present two complementary defense strategies for mitigating and detecting backdoor threats. Experiments on five backdoor attacks, compared against six baseline methods, demonstrate that our approach has strong potential to improve the robustness of LLMs against backdoor risks. The code is available at LLM Backdoor Self-Awareness.
中文摘要:大型语言模型易受植入隐藏触发器的后门攻击,而一种新型自我意识框架通过自省推理使其能够识别并表达这些触发器,从而显著增强了防御能力。
English Summary: Large Language Models are vulnerable to backdoor attacks that implant hidden triggers, but a new self-awareness framework enables them to identify and articulate these triggers through introspective reasoning, significantly improving defense capabilities.
Authors:Yu He, Yifei Chen, Yiming Li, Shuo Shao, Leyi Qi, Boheng Li, Dacheng Tao, Zhan Qin
Abstract:
In recent years, RAG has emerged as a key paradigm for enhancing large language models (LLMs). By integrating externally retrieved information, RAG alleviates issues like outdated knowledge and, crucially, insufficient domain expertise. While effective, RAG introduces new risks of external data extraction attacks (EDEAs), where sensitive or copyrighted data in its knowledge base may be extracted verbatim. These risks are particularly acute when RAG is used to customize specialized LLM applications with private knowledge bases. Despite initial studies exploring these risks, they often lack a formalized framework, robust attack performance, and comprehensive evaluation, leaving critical questions about real-world EDEA feasibility unanswered. In this paper, we present the first comprehensive study to formalize EDEAs against retrieval-augmented LLMs. We first formally define EDEAs and propose a unified framework decomposing their design into three components: extraction instruction, jailbreak operator, and retrieval trigger, under which prior attacks can be considered instances within our framework. Guided by this framework, we develop SECRET: a Scalable and EffeCtive exteRnal data Extraction aTtack. Specifically, SECRET incorporates (1) an adaptive optimization process using LLMs as optimizers to generate specialized jailbreak prompts for EDEAs, and (2) cluster-focused triggering, an adaptive strategy that alternates between global exploration and local exploitation to efficiently generate effective retrieval triggers. Extensive evaluations across 4 models reveal that SECRET significantly outperforms previous attacks, and is highly effective against all 16 tested RAG instances. Notably, SECRET successfully extracts 35% of the data from RAG powered by Claude 3.7 Sonnet for the first time, whereas other attacks yield 0% extraction. Our findings call for attention to this emerging threat.
Authors:Bochuan Cao, Changjiang Li, Yuanpu Cao, Yameng Ge, Ting Wang, Jinghui Chen
Abstract:
Large language models (LLMs) have been widely adopted across various applications, leveraging customized system prompts for diverse tasks. Facing potential system prompt leakage risks, model developers have implemented strategies to prevent leakage, primarily by disabling LLMs from repeating their context when encountering known attack patterns. However, it remains vulnerable to new and unforeseen prompt-leaking techniques. In this paper, we first introduce a simple yet effective prompt leaking attack to reveal such risks. Our attack is capable of extracting system prompts from various LLM-based application, even from SOTA LLM models such as GPT-4o or Claude 3.5 Sonnet. Our findings further inspire us to search for a fundamental solution to the problems by having no system prompt in the context. To this end, we propose SysVec, a novel method that encodes system prompts as internal representation vectors rather than raw text. By doing so, SysVec minimizes the risk of unauthorized disclosure while preserving the LLM's core language capabilities. Remarkably, this approach not only enhances security but also improves the model's general instruction-following abilities. Experimental results demonstrate that SysVec effectively mitigates prompt leakage attacks, preserves the LLM's functional integrity, and helps alleviate the forgetting issue in long-context scenarios.
中文: 本文提出一种简单有效的提示词泄露攻击,能够从GPT-4o和Claude 3.5 Sonnet等先进大模型中提取系统提示,同时创新性地开发了SysVec方法,通过将系统提示编码为内部向量来增强安全性并保持模型性能。
English: This paper introduces a simple yet effective prompt leakage attack that can extract system prompts from advanced LLMs like GPT-4o and Claude 3.5 Sonnet, and proposes SysVec, a novel method that encodes system prompts as internal vectors to enhance security while maintaining model performance.
Authors:Tong Zhou, Ruyi Ding, Gaowen Liu, Charles Fleming, Ramana Rao Kompella, Yunsi Fei, Xiaolin Xu, Shaolei Ren
Abstract:
The rapid growth of digital and AI-generated images has amplified the need for secure and verifiable methods of image attribution. While digital watermarking offers more robust protection than metadata-based approaches--which can be easily stripped--current watermarking techniques remain vulnerable to forgery, creating risks of misattribution that can damage the reputations of AI model developers and the rights of digital artists. These vulnerabilities arise from two key issues: (1) content-agnostic watermarks, which, once learned or leaked, can be transferred across images to fake attribution, and (2) reliance on detector-based verification, which is unreliable since detectors can be tricked. We present MetaSeal, a novel framework for content-dependent watermarking with cryptographic security guarantees to safeguard image attribution. Our design provides (1) forgery resistance, preventing unauthorized replication and enforcing cryptographic verification; (2) robust, self-contained protection, embedding attribution directly into images while maintaining resilience against benign transformations; and (3) evidence of tampering, making malicious alterations visually detectable. Experiments demonstrate that MetaSeal effectively mitigates forgery attempts and applies to both natural and AI-generated images, establishing a new standard for secure image attribution.
中文: MetaSeal提出了一种内容依赖的水印框架,通过密码学安全机制防止图像归属伪造,为自然图像和AI生成图像提供强保护及篡改证据。
English: MetaSeal introduces a content-dependent watermarking framework with cryptographic security to combat forgery in image attribution, offering robust protection and tamper evidence for both natural and AI-generated images.
Authors:Jingen Qu, Lijun Li, Bo Zhang, Yichen Yan, Jing Shao
Abstract:
Multimodal large language models (MLLMs) are rapidly evolving, presenting increasingly complex safety challenges. However, current dataset construction methods, which are risk-oriented, fail to cover the growing complexity of real-world multimodal safety scenarios (RMS). And due to the lack of a unified evaluation metric, their overall effectiveness remains unproven. This paper introduces a novel image-oriented self-adaptive dataset construction method for RMS, which starts with images and end constructing paired text and guidance responses. Using the image-oriented method, we automatically generate an RMS dataset comprising 35k image-text pairs with guidance responses. Additionally, we introduce a standardized safety dataset evaluation metric: fine-tuning a safety judge model and evaluating its capabilities on other safety datasets.Extensive experiments on various tasks demonstrate the effectiveness of the proposed image-oriented pipeline. The results confirm the scalability and effectiveness of the image-oriented approach, offering a new perspective for the construction of real-world multimodal safety datasets.
Authors:Jintao Gu, Haolang Lu, Guoshun Nan, Yihan Lin, Kun Wang, Yuchun Guo, Yigui Cao, Yang Liu
Abstract:
Accurate detection of third-party libraries (TPLs) is fundamental to Android security, supporting vulnerability tracking, malware detection, and supply chain auditing. Despite many proposed tools, their real-world effectiveness remains unclear. We present the first large-scale empirical study of ten state-of-the-art TPL detection techniques across over 6,000 apps, enabled by a new ground truth dataset with precise version-level annotations for both remote and local dependencies. Our evaluation exposes tool fragility to R8-era transformations, weak version discrimination, inaccurate correspondence of candidate libraries, difficulty in generalizing similarity thresholds, and prohibitive runtime/memory overheads at scale. Beyond tool assessment, we further analyze how TPLs shape downstream tasks, including vulnerability analysis, malware detection, secret leakage assessment, and LLM-based evaluation. From this perspective, our study provides concrete insights into how TPL characteristics affect these tasks and informs future improvements in security analysis.
Authors:Pasquale De Rosa, Pascal Felber, Valerio Schiavoni
Abstract:
Smart contracts have transformed decentralized finance by enabling programmable, trustless transactions. However, their widespread adoption and growing financial significance have attracted persistent and sophisticated threats, such as phishing campaigns and contract-level exploits. Traditional transaction-based threat detection methods often expose sensitive user data and interactions, raising privacy and security concerns. In response, static bytecode analysis has emerged as a proactive mitigation strategy, identifying malicious contracts before they execute harmful actions. Building on this approach, we introduced PhishingHook, the first machine-learning-based framework for detecting phishing activities in smart contracts via static bytecode and opcode analysis, achieving approximately 90% detection accuracy. Nevertheless, two pressing challenges remain: (1) the increasing use of sophisticated bytecode obfuscation techniques designed to evade static analysis, and (2) the heterogeneity of blockchain environments requiring platform-agnostic solutions. This paper presents a vision for ScamDetect (Smart Contract Agnostic Malware Detector), a robust, modular, and platform-agnostic framework for smart contract malware detection. Over the next 2.5 years, ScamDetect will evolve in two stages: first, by tackling obfuscated Ethereum Virtual Machine (EVM) bytecode through graph neural network (GNN) analysis of control flow graphs (CFGs), leveraging GNNs' ability to capture complex structural patterns beyond opcode sequences; and second, by generalizing detection capabilities to emerging runtimes such as WASM. ScamDetect aims to enable proactive, scalable security for the future of decentralized ecosystems.
中文: 智能合约面临钓鱼攻击和代码混淆等威胁,ScamDetect框架应运而生,它通过图神经网络静态分析字节码,实现平台无关的恶意软件检测,为去中心化生态提供前瞻性安全防护。
English: Smart contracts face evolving threats like phishing and obfuscation, prompting the development of ScamDetect, a platform-agnostic framework that uses graph neural networks to detect malware through static bytecode analysis for proactive decentralized security.
Authors:Xiangzhe Xu, Guangyu Shen, Zian Su, Siyuan Cheng, Hanxi Guo, Lu Yan, Xuan Chen, Jiasheng Jiang, Xiaolong Jin, Chengpeng Wang, Zhuo Zhang, Xiangyu Zhang
Abstract:
AI coding assistants like GitHub Copilot are rapidly transforming software development, but their safety remains deeply uncertain-especially in high-stakes domains like cybersecurity. Current red-teaming tools often rely on fixed benchmarks or unrealistic prompts, missing many real-world vulnerabilities. We present ASTRA, an automated agent system designed to systematically uncover safety flaws in AI-driven code generation and security guidance systems. ASTRA works in three stages: (1) it builds structured domain-specific knowledge graphs that model complex software tasks and known weaknesses; (2) it performs online vulnerability exploration of each target model by adaptively probing both its input space, i.e., the spatial exploration, and its reasoning processes, i.e., the temporal exploration, guided by the knowledge graphs; and (3) it generates high-quality violation-inducing cases to improve model alignment. Unlike prior methods, ASTRA focuses on realistic inputs-requests that developers might actually ask-and uses both offline abstraction guided domain modeling and online domain knowledge graph adaptation to surface corner-case vulnerabilities. Across two major evaluation domains, ASTRA finds 11-66% more issues than existing techniques and produces test cases that lead to 17% more effective alignment training, showing its practical value for building safer AI systems.
中文: ASTRA是一个自动化代理系统,通过构建领域知识图谱并自适应地探索模型输入空间和推理过程,系统性地发现AI编程助手的安全漏洞,相比现有方法能识别更多漏洞并更有效提升模型对齐性。
English: ASTRA is an automated agent system that systematically uncovers safety flaws in AI coding assistants by building domain-specific knowledge graphs and adaptively probing models through spatial and temporal exploration, proving more effective than existing methods in identifying vulnerabilities and enhancing model alignment.
Authors:Haoran Gao, Yuanhe Zhang, Zhenhong Zhou, Lei Jiang, Fanyu Meng, Yujia Xiao, Kun Wang, Yang Liu, Junlan Feng
Abstract:
Resource Consumption Attacks (RCAs) have emerged as a significant threat to the deployment of Large Language Models (LLMs). With the integration of vision modalities, additional attack vectors exacerbate the risk of RCAs in large vision-language models (LVLMs). However, existing red-teaming studies have largely overlooked visual inputs as a potential attack surface, resulting in insufficient mitigation strategies against RCAs in LVLMs. To address this gap, we propose RECALLED (\textbf{RE}source \textbf{C}onsumption \textbf{A}ttack on \textbf{L}arge Vision-\textbf{L}anguag\textbf{E} Mo\textbf{D}els), the first approach for exploiting visual modalities to trigger unbounded RCAs red-teaming. First, we present \textit{Vision Guided Optimization}, a fine-grained pixel-level optimization, to obtain \textit{Output Recall} adversarial perturbations, which can induce repeating output. Then, we inject the perturbations into visual inputs, triggering unbounded generations to achieve the goal of RCAs. Additionally, we introduce \textit{Multi-Objective Parallel Losses} to generate universal attack templates and resolve optimization conflicts when intending to implement parallel attacks. Empirical results demonstrate that RECALLED increases service response latency by over 26 $\uparrow$, resulting in an additional 20\% increase in GPU utilization and memory consumption. Our study exposes security vulnerabilities in LVLMs and establishes a red-teaming framework that can facilitate future defense development against RCAs.
中文总结:RECITE是首个利用视觉输入触发大型视觉语言模型资源消耗攻击的红队测试方法,显著增加了系统延迟和资源消耗,同时揭示了模型的安全漏洞。
English Summary: RECITE is the first red-teaming approach that exploits visual inputs to trigger resource consumption attacks in large vision-language models, significantly increasing latency and resource usage while exposing security vulnerabilities.
Authors:Sven Gowal, Rudy Bunel, Florian Stimberg, David Stutz, Guillermo Ortiz-Jimenez, Christina Kouridi, Mel Vecerik, Jamie Hayes, Sylvestre-Alvise Rebuffi, Paul Bernard, Chris Gamble, Miklós Z. Horváth, Fabian Kaczmarczyck, Alex Kaskasoli, Aleksandar Petrov, Ilia Shumailov, Meghana Thotakuri, Olivia Wiles, Jessica Yung, Zahra Ahmed, Victor Martin, Simon Rosen, Christopher Savčak, Armin Senoner, Nidhi Vyas, Pushmeet Kohli
Abstract:
We introduce SynthID-Image, a deep learning-based system for invisibly watermarking AI-generated imagery. This paper documents the technical desiderata, threat models, and practical challenges of deploying such a system at internet scale, addressing key requirements of effectiveness, fidelity, robustness, and security. SynthID-Image has been used to watermark over ten billion images and video frames across Google's services and its corresponding verification service is available to trusted testers. For completeness, we present an experimental evaluation of an external model variant, SynthID-O, which is available through partnerships. We benchmark SynthID-O against other post-hoc watermarking methods from the literature, demonstrating state-of-the-art performance in both visual quality and robustness to common image perturbations. While this work centers on visual media, the conclusions on deployment, constraints, and threat modeling generalize to other modalities, including audio. This paper provides a comprehensive documentation for the large-scale deployment of deep learning-based media provenance systems.
Authors:Milad Nasr, Nicholas Carlini, Chawin Sitawarin, Sander V. Schulhoff, Jamie Hayes, Michael Ilie, Juliette Pluto, Shuang Song, Harsh Chaudhari, Ilia Shumailov, Abhradeep Thakurta, Kai Yuanqing Xiao, Andreas Terzis, Florian Tramèr
Abstract:
How should we evaluate the robustness of language model defenses? Current defenses against jailbreaks and prompt injections (which aim to prevent an attacker from eliciting harmful knowledge or remotely triggering malicious actions, respectively) are typically evaluated either against a static set of harmful attack strings, or against computationally weak optimization methods that were not designed with the defense in mind. We argue that this evaluation process is flawed. Instead, we should evaluate defenses against adaptive attackers who explicitly modify their attack strategy to counter a defense's design while spending considerable resources to optimize their objective. By systematically tuning and scaling general optimization techniques-gradient descent, reinforcement learning, random search, and human-guided exploration-we bypass 12 recent defenses (based on a diverse set of techniques) with attack success rate above 90% for most; importantly, the majority of defenses originally reported near-zero attack success rates. We believe that future defense work must consider stronger attacks, such as the ones we describe, in order to make reliable and convincing claims of robustness.
Authors:Aashnan Rahman, Abid Hasan, Sherajul Arifin, Faisal Haque Bappy, Tahrim Hossain, Tariqul Islam, Abu Raihan Mostofa Kamal, Md. Azam Hossain
Abstract:
Federated learning (FL) enables privacy-preserving model training by keeping data decentralized. However, it remains vulnerable to label-flipping attacks, where malicious clients manipulate labels to poison the global model. Despite their simplicity, these attacks can severely degrade model performance, and defending against them remains challenging. We introduce AntiFLipper, a novel and computationally efficient defense against multi-class label-flipping attacks in FL. Unlike existing methods that ensure security at the cost of high computational overhead, AntiFLipper employs a novel client-side detection strategy, significantly reducing the central server's burden during aggregation. Comprehensive empirical evaluations across multiple datasets under different distributions demonstrate that AntiFLipper achieves accuracy comparable to state-of-the-art defenses while requiring substantially fewer computational resources in server side. By balancing security and efficiency, AntiFLipper addresses a critical gap in existing defenses, making it particularly suitable for resource-constrained FL deployments where both model integrity and operational efficiency are essential.
Chinese: AntiFLipper是一种新颖高效的联邦学习标签翻转攻击防御方法,通过客户端检测策略降低服务器负担,在保证高准确性和安全性的同时,特别适合资源受限的部署环境。
English: AntiFLipper is a novel, efficient defense against label-flipping attacks in federated learning that uses client-side detection to reduce server burden while maintaining high accuracy and security.
Authors:Hanna Foerster, Ilia Shumailov, Yiren Zhao, Harsh Chaudhari, Jamie Hayes, Robert Mullins, Yarin Gal
Abstract:
Early research into data poisoning attacks against Large Language Models (LLMs) demonstrated the ease with which backdoors could be injected. More recent LLMs add step-by-step reasoning, expanding the attack surface to include the intermediate chain-of-thought (CoT) and its inherent trait of decomposing problems into subproblems. Using these vectors for more stealthy poisoning, we introduce ``decomposed reasoning poison'', in which the attacker modifies only the reasoning path, leaving prompts and final answers clean, and splits the trigger across multiple, individually harmless components. Fascinatingly, while it remains possible to inject these decomposed poisons, reliably activating them to change final answers (rather than just the CoT) is surprisingly difficult. This difficulty arises because the models can often recover from backdoors that are activated within their thought processes. Ultimately, it appears that an emergent form of backdoor robustness is originating from the reasoning capabilities of these advanced LLMs, as well as from the architectural separation between reasoning and final answer generation.
中文摘要:早期研究表明,大型语言模型容易遭受后门攻击,但近期模型引入逐步推理机制后,攻击面扩展至中间思维链,尽管分解式推理毒药攻击可修改推理路径,但可靠激活以改变最终答案却异常困难,这源于模型从思维过程中恢复的能力。
English Summary: Early research showed that backdoors could be easily injected into LLMs, but recent models with step-by-step reasoning introduce new attack vectors, though reliably activating these decomposed poisons to alter final answers is surprisingly difficult due to the models' ability to recover from such attacks.
Authors:Saifullah Saifullah, Stefan Agne, Andreas Dengel, Sheraz Ahmed
Abstract:
As deep learning-based, data-driven information extraction systems become increasingly integrated into modern document processing workflows, one primary concern is the risk of malicious leakage of sensitive private data from these systems. While some recent works have explored Differential Privacy (DP) to mitigate these privacy risks, DP-based training is known to cause significant performance degradation and impose several limitations on standard training procedures, making its direct application to downstream tasks both difficult and costly. In this work, we aim to address the above challenges within the context of document image classification by substituting real private data with a synthetic counterpart. In particular, we propose to use conditional latent diffusion models (LDMs) in combination with differential privacy (DP) to generate class-specific synthetic document images under strict privacy constraints, which can then be utilized to train a downstream classifier following standard training procedures. We investigate our approach under various pretraining setups, including unconditional, class-conditional, and layout-conditional pretraining, in combination with multiple private training strategies such as class-conditional and per-label private fine-tuning with DPDM and DP-Promise algorithms. Additionally, we evaluate it on two well-known document benchmark datasets, RVL-CDIP and Tobacco3482, and show that it can generate useful and realistic document samples across various document types and privacy levels ($\varepsilon \in \{1, 5, 10\}$). Lastly, we show that our approach achieves substantial performance improvements in downstream evaluations on small-scale datasets, compared to the direct application of DP-Adam.
中文: 本研究通过使用差分隐私潜在扩散模型生成合成文档图像来训练分类器,解决了文档处理中的隐私风险问题,在多种设置下既保持了隐私保护,又相比直接应用差分隐私方法显著提升了性能表现。
English: This study addresses privacy risks in document processing by using differentially private latent diffusion models to generate synthetic document images for training classifiers, achieving improved performance over direct DP methods while maintaining privacy across various settings.
Authors:Xiang Zhang, Zhou Li, Shuangyang Li, Kai Wan, Derrick Wing Kwan Ng, Giuseppe Caire
Abstract:
In decentralized federated learning (FL), multiple clients collaboratively learn a shared machine learning (ML) model by leveraging their privately held datasets distributed across the network, through interactive exchange of the intermediate model updates. To ensure data security, cryptographic techniques are commonly employed to protect model updates during aggregation. Despite growing interest in secure aggregation, existing works predominantly focus on protocol design and computational guarantees, with limited understanding of the fundamental information-theoretic limits of such systems. Moreover, optimal bounds on communication and key usage remain unknown in decentralized settings, where no central aggregator is available. Motivated by these gaps, we study the problem of decentralized secure aggregation (DSA) from an information-theoretic perspective. Specifically, we consider a network of $K$ fully-connected users, each holding a private input -- an abstraction of local training data -- who aim to securely compute the sum of all inputs. The security constraint requires that no user learns anything beyond the input sum, even when colluding with up to $T$ other users. We characterize the optimal rate region, which specifies the minimum achievable communication and secret key rates for DSA. In particular, we show that to securely compute one symbol of the desired input sum, each user must (i) transmit at least one symbol to others, (ii) hold at least one symbol of secret key, and (iii) all users must collectively hold no fewer than $K - 1$ independent key symbols. Our results establish the fundamental performance limits of DSA, providing insights for the design of provably secure and communication-efficient protocols in distributed learning systems.
中文摘要:本文从信息论角度确立了去中心化联邦学习中安全聚合的基本性能极限,证明每个用户必须传输至少一个符号并满足特定密钥要求,才能在仅获知输入总和的情况下安全计算,同时防止超出聚合结果的信息泄露。
English Summary: This paper establishes the fundamental information-theoretic limits for decentralized secure aggregation in federated learning, demonstrating that each user must transmit at least one symbol and hold specific key requirements to securely compute input sums while preventing information leakage beyond the aggregated result.
Authors:Chao Feng, Alberto Huertas Celdran, Jing Han, Heqing Ren, Xi Cheng, Zien Zeng, Lucas Krauter, Gerome Bovet, Burkhard Stiller
Abstract:
This paper introduces a dataset and experimental study for decentralized federated learning (DFL) applied to IoT crowdsensing malware detection. The dataset comprises behavioral records from benign and eight malware families. A total of 21,582,484 original records were collected from system calls, file system activities, resource usage, kernel events, input/output events, and network records. These records were aggregated into 30-second windows, resulting in 342,106 features used for model training and evaluation. Experiments on the DFL platform compare traditional machine learning (ML), centralized federated learning (CFL), and DFL across different node counts, topologies, and data distributions. Results show that DFL maintains competitive performance while preserving data locality, outperforming CFL in most settings. This dataset provides a solid foundation for studying the security of IoT crowdsensing environments.
中文摘要:本文提出了用于物联网恶意软件检测的去中心化联邦学习数据集和实验框架,研究表明DFL在保持数据本地化的同时,在多数场景下性能优于中心化联邦学习方案。
English Summary: This paper presents a decentralized federated learning dataset and framework for IoT malware detection, demonstrating that DFL achieves competitive performance while preserving data privacy across various network conditions.
Authors:Keke Gai, Haochen Liang, Jing Yu, Liehuang Zhu, Dusit Niyato
Abstract:
Smart contracts play a pivotal role in blockchain ecosystems, and fuzzing remains an important approach to securing smart contracts. Even though mutation scheduling is a key factor influencing fuzzing effectiveness, existing fuzzers have primarily explored seed scheduling and generation, while mutation scheduling has been rarely addressed by prior work. In this work, we propose a Large Language Models (LLMs)-based Multi-feedback Smart Contract Fuzzing framework (LLAMA) that integrates LLMs, evolutionary mutation strategies, and hybrid testing techniques. Key components of the proposed LLAMA include: (i) a hierarchical prompting strategy that guides LLMs to generate semantically valid initial seeds, coupled with a lightweight pre-fuzzing phase to select high-potential inputs; (ii) a multi-feedback optimization mechanism that simultaneously improves seed generation, seed selection, and mutation scheduling by leveraging runtime coverage and dependency feedback; and (iii) an evolutionary fuzzing engine that dynamically adjusts mutation operator probabilities based on effectiveness, while incorporating symbolic execution to escape stagnation and uncover deeper vulnerabilities. Our experiments demonstrate that LLAMA outperforms state-of-the-art fuzzers in both coverage and vulnerability detection. Specifically, it achieves 91% instruction coverage and 90% branch coverage, while detecting 132 out of 148 known vulnerabilities across diverse categories. These results highlight LLAMA's effectiveness, adaptability, and practicality in real-world smart contract security testing scenarios.
中文: 本文提出的LLAMA框架融合大语言模型与进化变异策略,通过多层次反馈优化机制显著提升了智能合约模糊测试的代码覆盖率和漏洞检测能力,实验证明其性能优于现有最优方法。
English: This paper introduces LLAMA, an advanced smart contract fuzzing framework that leverages Large Language Models, evolutionary mutation strategies, and hybrid testing to significantly enhance code coverage and vulnerability detection beyond current state-of-the-art methods.
Authors:Karthik Pappu, Prathamesh Dinesh Joshi, Raj Abhijit Dandekar, Rajat Dandekar, Sreedath Panat
Abstract:
Accurately modeling malware propagation is essential for designing effective cybersecurity defenses, particularly against adaptive threats that evolve in real time. While traditional epidemiological models and recent neural approaches offer useful foundations, they often fail to fully capture the nonlinear feedback mechanisms present in real-world networks. In this work, we apply scientific machine learning to malware modeling by evaluating three approaches: classical Ordinary Differential Equations (ODEs), Universal Differential Equations (UDEs), and Neural ODEs. Using data from the Code Red worm outbreak, we show that the UDE approach substantially reduces prediction error compared to both traditional and neural baselines by 44%, while preserving interpretability. We introduce a symbolic recovery method that transforms the learned neural feedback into explicit mathematical expressions, revealing suppression mechanisms such as network saturation, security response, and malware variant evolution. Our results demonstrate that hybrid physics-informed models can outperform both purely analytical and purely neural approaches, offering improved predictive accuracy and deeper insight into the dynamics of malware spread. These findings support the development of early warning systems, efficient outbreak response strategies, and targeted cyber defense interventions.
中文摘要:本研究表明,融合物理信息的混合模型(特别是通用微分方程方法)能通过符号化还原网络动态机制,在保持模型可解释性的同时将恶意软件传播预测准确率显著提升44%。
English Summary: This study demonstrates that hybrid physics-informed models, particularly Universal Differential Equations, significantly enhance malware propagation prediction accuracy by 44% while maintaining interpretability through symbolic recovery of network dynamics.
Authors:Milad Nasr, Yanick Fratantonio, Luca Invernizzi, Ange Albertini, Loua Farah, Alex Petit-Bianco, Andreas Terzis, Kurt Thomas, Elie Bursztein, Nicholas Carlini
Abstract:
As deep learning models become widely deployed as components within larger production systems, their individual shortcomings can create system-level vulnerabilities with real-world impact. This paper studies how adversarial attacks targeting an ML component can degrade or bypass an entire production-grade malware detection system, performing a case study analysis of Gmail's pipeline where file-type identification relies on a ML model. The malware detection pipeline in use by Gmail contains a machine learning model that routes each potential malware sample to a specialized malware classifier to improve accuracy and performance. This model, called Magika, has been open sourced. By designing adversarial examples that fool Magika, we can cause the production malware service to incorrectly route malware to an unsuitable malware detector thereby increasing our chance of evading detection. Specifically, by changing just 13 bytes of a malware sample, we can successfully evade Magika in 90% of cases and thereby allow us to send malware files over Gmail. We then turn our attention to defenses, and develop an approach to mitigate the severity of these types of attacks. For our defended production model, a highly resourced adversary requires 50 bytes to achieve just a 20% attack success rate. We implement this defense, and, thanks to a collaboration with Google engineers, it has already been deployed in production for the Gmail classifier.
中文摘要:针对Gmail恶意软件检测系统的对抗性攻击表明,仅修改恶意软件文件的13个字节即可90%的情况下绕过Magika机器学习模型;但研发的防御措施现已部署,攻击者需修改50字节才能达到20%的攻击成功率。
English Summary: Adversarial attacks on Gmail's malware detection system demonstrate that manipulating just 13 bytes in malware files can bypass the Magika ML model 90% of the time, though a developed defense now requires 50 bytes for merely 20% attack success and has been deployed in production.
Authors:Qinnan Hu, Yuntao Wang, Yuan Gao, Zhou Su, Linkang Du
Abstract:
Large language models (LLMs)-empowered autonomous agents are transforming both digital and physical environments by enabling adaptive, multi-agent collaboration. While these agents offer significant opportunities across domains such as finance, healthcare, and smart manufacturing, their unpredictable behaviors and heterogeneous capabilities pose substantial governance and accountability challenges. In this paper, we propose a blockchain-enabled layered architecture for regulatory agent collaboration, comprising an agent layer, a blockchain data layer, and a regulatory application layer. Within this framework, we design three key modules: (i) an agent behavior tracing and arbitration module for automated accountability, (ii) a dynamic reputation evaluation module for trust assessment in collaborative scenarios, and (iii) a malicious behavior forecasting module for early detection of adversarial activities. Our approach establishes a systematic foundation for trustworthy, resilient, and scalable regulatory mechanisms in large-scale agent ecosystems. Finally, we discuss the future research directions for blockchain-enabled regulatory frameworks in multi-agent systems.
中文: 本文提出一种基于区块链的分层监管架构,通过行为追踪、信誉评估和恶意行为预测模块,为多智能体系统建立可信、可扩展的监管机制。
English: This paper introduces a blockchain-based layered framework to govern autonomous agents by enabling behavior tracking, reputation evaluation, and malicious activity prediction for trustworthy multi-agent collaboration.
Authors:Ya-Ting Yang, Quanyan Zhu
Abstract:
Cognitive vulnerabilities shape human decision-making and arise primarily from two sources: (1) cognitive capabilities, which include disparities in knowledge, education, expertise, or access to information, and (2) cognitive biases, such as rational inattention, confirmation bias, and base rate neglect, which influence how individuals perceive and process information. Exploiting these vulnerabilities allows an entity with superior cognitive awareness to gain a strategic advantage, a concept referred to as cognitive arbitrage. This paper investigates how to exploit the cognitive vulnerabilities of Advanced Persistent Threat (APT) attackers and proposes cognition-aware defenses that leverage windows of superiority to counteract attacks. Specifically, the proposed bi-level cyber warfare game focuses on "strategic-level" design for defensive deception mechanisms, which then facilitates "operational-level" actions and tactical-level execution of Tactics, Techniques, and Procedures (TTPs). Game-theoretic reasoning and analysis play a significant role in the cross-echelon quantitative modeling and design of cognitive arbitrage strategies. Our numerical results demonstrate that although the defender's initial advantage diminishes over time, strategically timed and deployed deception techniques can turn a negative value for the attacker into a positive one during the planning phase, and achieve at least a 40% improvement in total rewards during execution. This demonstrates that the defender can amplify even small initial advantages, sustain a strategic edge over the attacker, and secure long-term objectives, such as protecting critical assets throughout the attacker's lifecycle.
中文: 本文通过博弈论建模和战略性欺骗手段,研究如何利用高级持续性威胁攻击者的认知脆弱性,证明防御者即使在初始优势减弱时仍能维持战略优势,并实现至少40%的收益提升。
English: This paper explores exploiting cognitive vulnerabilities in APT attackers through game-theoretic modeling and strategic deception, demonstrating that defenders can sustain advantages and improve rewards by at least 40% despite diminishing initial superiority.
Authors:Xubin Yue, Zhenhua Xu, Wenpeng Xing, Jiahui Yu, Mohan Li, Meng Han
Abstract:
Addressing the intellectual property protection challenges in commercial deployment of large language models (LLMs), existing black-box fingerprinting techniques face dual challenges from incremental fine-tuning erasure and feature-space defense due to their reliance on overfitting high-perplexity trigger patterns. Recent work has revealed that model editing in the fingerprinting domain offers distinct advantages, including significantly lower false positive rates, enhanced harmlessness, and superior robustness. Building on this foundation, this paper innovatively proposes a $\textbf{Pr}$efix-$\textbf{e}$nhanced Fingerprint $\textbf{E}$diting Framework (PREE), which encodes copyright information into parameter offsets through dual-channel knowledge edit to achieve covert embedding of fingerprint features. Experimental results demonstrate that the proposed solution achieves the 90\% trigger precision in mainstream architectures including LLaMA-3 and Qwen-2.5. The minimal parameter offset (change rate < 0.03) effectively preserves original knowledge representation while demonstrating strong robustness against incremental fine-tuning and multi-dimensional defense strategies, maintaining zero false positive rate throughout evaluations.
Authors:Yuwen Pu, Zhou Feng, Chunyi Zhou, Jiahao Chen, Chunqiang Hu, Haibo Hu, Shouling Ji
Abstract:
Recently, speech assistant and speech verification have been used in many fields, which brings much benefit and convenience for us. However, when we enjoy these speech applications, our speech may be collected by attackers for speech synthesis. For example, an attacker generates some inappropriate political opinions with the characteristic of the victim's voice by obtaining a piece of the victim's speech, which will greatly influence the victim's reputation. Specifically, with the appearance of some zero-shot voice conversion methods, the cost of speech synthesis attacks has been further reduced, which also brings greater challenges to user voice security and privacy. Some researchers have proposed the corresponding privacy-preserving methods. However, the existing approaches have some non-negligible drawbacks: low transferability and robustness, high computational overhead. These deficiencies seriously limit the existing method deployed in practical scenarios. Therefore, in this paper, we propose a lightweight, robust, plug-and-play privacy preservation method against speech synthesis attacks in a black-box setting. Our method generates and adds a frequency-domain perturbation to the original speech to achieve privacy protection and high speech quality. Then, we present a data augmentation strategy and noise smoothing mechanism to improve the robustness of the proposed method. Besides, to reduce the user's defense overhead, we also propose a novel identity-wise protection mechanism. It can generate a universal perturbation for one speaker and support privacy preservation for speech of any length. Finally, we conduct extensive experiments on 5 speech synthesis models, 5 speech verification models, 1 speech recognition model, and 2 datasets. The experimental results demonstrate that our method has satisfying privacy-preserving performance, high speech quality, and utility.
中文摘要:本文提出了一种轻量级即插即用的语音隐私保护方法,通过添加频域扰动有效防止语音被非法合成,同时利用数据增强和噪声平滑机制保障语音质量与实用性。
English Summary: This paper introduces a lightweight, plug-and-play privacy protection method that adds frequency-domain perturbations to speech, effectively preventing unauthorized synthesis while maintaining speech quality and utility through robust data augmentation and noise smoothing mechanisms.
Authors:Brandon Beltz, Jim Doty, Yvonne Fonken, Nikolos Gurney, Brett Israelsen, Nathan Lau, Stacy Marsella, Rachelle Thomas, Stoney Trent, Peggy Wu, Ya-Ting Yang, Quanyan Zhu
Abstract:
We present three large-scale human-subjects red-team cyber range datasets from the Guarding Against Malicious Biased Threats (GAMBiT) project. Across Experiments 1-3 (July 2024-March 2025), 19-20 skilled attackers per experiment conducted two 8-hour days of self-paced operations in a simulated enterprise network (SimSpace Cyber Force Platform) while we captured multi-modal data: self-reports (background, demographics, psychometrics), operational notes, terminal histories, keylogs, network packet captures (PCAP), and NIDS alerts (Suricata). Each participant began from a standardized Kali Linux VM and pursued realistic objectives (e.g., target discovery and data exfiltration) under controlled constraints. Derivative curated logs and labels are included. The combined release supports research on attacker behavior modeling, bias-aware analytics, and method benchmarking. Data are available via IEEE Dataport entries for Experiments 1-3.
中文: GAMBiT项目发布了三个大规模红队网络靶场数据集,记录了熟练攻击者在模拟环境中的多模态操作数据,支持攻击者行为建模和偏见感知分析研究,数据可通过IEEE Dataport获取。
English: The GAMBiT project released three large-scale red-team cyber range datasets capturing multi-modal data from skilled attackers performing simulated operations, supporting research on attacker behavior modeling and bias-aware analytics, with data available via IEEE Dataport.
Authors:Wenpeng Xing, Zhonghao Qi, Yupeng Qin, Yilin Li, Caini Chang, Jiahui Yu, Changting Lin, Zhenzhen Xie, Meng Han
Abstract:
The integration of Large Language Models (LLMs) with external tools via protocols such as the Model Context Protocol (MCP) introduces critical security vulnerabilities, including prompt injection, data exfiltration, and other threats. To counter these challenges, we propose MCP-Guard, a robust, layered defense architecture designed for LLM--tool interactions. MCP-Guard employs a three-stage detection pipeline that balances efficiency with accuracy: it progresses from lightweight static scanning for overt threats and a deep neural detector for semantic attacks, to our fine-tuned E5-based model achieves (96.01) accuracy in identifying adversarial prompts. Finally, a lightweight LLM arbitrator synthesizes these signals to deliver the final decision while minimizing false positives. To facilitate rigorous training and evaluation, we also introduce MCP-AttackBench, a comprehensive benchmark of over 70,000 samples. Sourced from public datasets and augmented by GPT-4, MCP-AttackBench simulates diverse, real-world attack vectors in the MCP format, providing a foundation for future research into securing LLM-tool ecosystems.
中文: 该摘要提出了MCP-Guard——一个针对大语言模型与工具交互设计的分层防御架构,通过三级检测流水线有效防范提示注入等安全威胁,并建立了包含7万样本的MCP-AttackBench基准测试集,为后续研究提供评估基础。
English: This abstract introduces MCP-Guard, a layered defense architecture that employs a three-stage detection pipeline to protect Large Language Models from security threats like prompt injection and data exfiltration during tool integration, while also presenting MCP-AttackBench, a comprehensive benchmark for evaluating such defenses.
Authors:Wenpeng Xing, Mohan Li, Chunqiang Hu, Haitao XuNingyu Zhang, Bo Lin, Meng Han
Abstract:
Large language models (LLMs) demonstrate impressive capabilities in various language tasks but are susceptible to jailbreak attacks that circumvent their safety alignments. This paper introduces Latent Fusion Jailbreak (LFJ), a representation-based attack that interpolates hidden states from harmful and benign query pairs to elicit prohibited responses. LFJ begins by selecting query pairs with high thematic and syntactic similarity, then performs gradient-guided interpolation at influential layers and tokens, followed by optimization to balance attack success, output fluency, and computational efficiency. Evaluations on models such as Vicuna and LLaMA-2 across benchmarks like AdvBench and MaliciousInstruct yield an average attack success rate (ASR) of 94.01%, outperforming existing methods. To mitigate LFJ, we propose an adversarial training defense that fine-tunes models on interpolated examples, reducing ASR by over 80% without degrading performance on benign inputs. Ablation studies validate the importance of query pair selection, hidden state interpolation components, and optimization strategies in LFJ's effectiveness.
中文摘要:本文提出潜在融合越狱(LFJ)攻击方法,通过操纵有害与良性查询对的隐藏状态来突破大语言模型的安全对齐,实现了94.01%的平均攻击成功率,同时提出对抗训练作为有效防御方案。
English Summary: This paper introduces Latent Fusion Jailbreak (LFJ), a representation-based attack that manipulates hidden states to bypass safety alignments in large language models, achieving a 94.01% average success rate while also proposing adversarial training as an effective defense.
Authors:Zhifan Luo, Shuo Shao, Su Zhang, Lijing Zhou, Yuke Hu, Chenxu Zhao, Zhihao Liu, Zhan Qin
Abstract:
The Key-Value (KV) cache, which stores intermediate attention computations (Key and Value pairs) to avoid redundant calculations, is a fundamental mechanism for accelerating Large Language Model (LLM) inference. However, this efficiency optimization introduces significant yet underexplored privacy risks. This paper provides the first comprehensive analysis of these vulnerabilities, demonstrating that an attacker can reconstruct sensitive user inputs directly from the KV-cache. We design and implement three distinct attack vectors: a direct Inversion Attack, a more broadly applicable and potent Collision Attack, and a semantic-based Injection Attack. These methods demonstrate the practicality and severity of KV-cache privacy leakage issues. To mitigate this, we propose KV-Cloak, a novel, lightweight, and efficient defense mechanism. KV-Cloak uses a reversible matrix-based obfuscation scheme, combined with operator fusion, to secure the KV-cache. Our extensive experiments show that KV-Cloak effectively thwarts all proposed attacks, reducing reconstruction quality to random noise. Crucially, it achieves this robust security with virtually no degradation in model accuracy and minimal performance overhead, offering a practical solution for trustworthy LLM deployment.
中文摘要:KV缓存机制虽加速大语言模型推理,却存在严重隐私泄露风险,攻击者可通过三种方式重构用户敏感数据;新提出的KV-Cloak防御方案能有效阻断所有攻击且几乎不影响模型性能。
English Summary: The KV-cache mechanism in LLMs poses serious privacy risks by allowing attackers to reconstruct sensitive user data through three demonstrated attack methods, which are effectively mitigated by the proposed KV-Cloak defense system with minimal performance impact.
Authors:Minfeng Qi, Qin Wang, Guangsheng Yu, Ruiqiang Li, Victor Zhou, Shiping Chen
Abstract:
We argue that the technical foundations of non-fungible tokens (NFTs) remain inadequately understood. Prior research has focused on market dynamics, user behavior, and isolated security incidents, yet systematic analysis of the standards underpinning NFT functionality is largely absent.
We present the first study of NFTs through the lens of Ethereum Improvement Proposals (EIPs). We conduct a large-scale empirical analysis of 191 NFT-related EIPs and 10K+ Ethereum Magicians discussions (as of July, 2025). We integrate multi-dimensional analyses including the automated parsing of Solidity interfaces, graph-based modeling of inheritance structures, contributor profiling, and mining of community discussion data. We distinguish foundational from emerging standards, expose poor cross-version interoperability, and show that growing functional complexity heightens security risks.
中文: 本研究首次通过191项以太坊改进提案和大量社区讨论系统分析了NFT技术基础,揭示了互操作性不足及功能复杂性加剧带来的安全风险。
English: This study presents the first systematic analysis of NFT technical foundations through 191 Ethereum Improvement Proposals and extensive community discussions, revealing interoperability issues and heightened security risks from increasing functional complexity.
Authors:Zeke Xiao, Yuekang Li, Qin Wang, Shiping Chen
Abstract:
We explore the feasibility of using LLMs for Automated Exploit Generation (AEG) against vulnerable smart contracts. We present \textsc{ReX}, a framework integrating LLM-based exploit synthesis with the Foundry testing suite, enabling the automated generation and validation of proof-of-concept (PoC) exploits. We evaluate five state-of-the-art LLMs (GPT-4.1, Gemini 2.5 Pro, Claude Opus 4, DeepSeek, and Qwen3 Plus) on both synthetic benchmarks and real-world smart contracts affected by known high-impact exploits. Our results show that modern LLMs can reliably generate functional PoC exploits for diverse vulnerability types, with success rates reaching up to 92\%. Notably, Gemini 2.5 Pro and GPT-4.1 consistently outperform others in both synthetic and real-world scenarios. We further analyze factors influencing AEG effectiveness, including model capabilities, contract structure, and vulnerability types. We also collect the first curated dataset of real-world PoC exploits to support future research.
Chinese: 本研究证明现代大语言模型能有效自动生成针对智能合约漏洞的攻击程序,其中ReX框架在各类漏洞测试中生成可用概念验证攻击的成功率高达92%,Gemini 2.5 Pro和GPT-4.1表现尤为突出。
English: This study demonstrates that modern large language models (LLMs) can effectively automate exploit generation for vulnerable smart contracts, with the ReX framework achieving up to 92% success in creating functional proof-of-concept exploits across various vulnerability types.
Authors:Ziyao Wang, Guoheng Sun, Yexiao He, Zheyu Shen, Bowei Tian, Ang Li
Abstract:
Commercial LLM services often conceal internal reasoning traces while still charging users for every generated token, including those from hidden intermediate steps, raising concerns of token inflation and potential overbilling. This gap underscores the urgent need for reliable token auditing, yet achieving it is far from straightforward: cryptographic verification (e.g., hash-based signature) offers little assurance when providers control the entire execution pipeline, while user-side prediction struggles with the inherent variance of reasoning LLMs, where token usage fluctuates across domains and prompt styles. To bridge this gap, we present PALACE (Predictive Auditing of LLM APIs via Reasoning Token Count Estimation), a user-side framework that estimates hidden reasoning token counts from prompt-answer pairs without access to internal traces. PALACE introduces a GRPO-augmented adaptation module with a lightweight domain router, enabling dynamic calibration across diverse reasoning tasks and mitigating variance in token usage patterns. Experiments on math, coding, medical, and general reasoning benchmarks show that PALACE achieves low relative error and strong prediction accuracy, supporting both fine-grained cost auditing and inflation detection. Taken together, PALACE represents an important first step toward standardized predictive auditing, offering a practical path to greater transparency, accountability, and user trust.
中文摘要:商业LLM服务常隐藏推理过程却仍按令牌收费,引发透明度和计费担忧,而PALACE框架通过提示-答案对估算隐藏令牌数量,实现了跨领域的高精度成本审计与膨胀检测。
English Summary: Commercial LLM services often hide reasoning tokens while charging users, creating a need for reliable auditing, which PALACE addresses by estimating hidden token counts from prompt-answer pairs to enable cost auditing and detect inflation.
Authors:Zeyang Sha, Hanling Tian, Zhuoer Xu, Shiwen Cui, Changhua Meng, Weiqiang Wang
Abstract:
The emergence of autonomous Large Language Model (LLM) agents capable of tool usage has introduced new safety risks that go beyond traditional conversational misuse. These agents, empowered to execute external functions, are vulnerable to both user-initiated threats (e.g., adversarial prompts) and tool-initiated threats (e.g., malicious outputs from compromised tools). In this paper, we propose the first unified safety-alignment framework for tool-using agents, enabling models to handle both channels of threat via structured reasoning and sandboxed reinforcement learning. We introduce a tri-modal taxonomy, including benign, malicious, and sensitive for both user prompts and tool responses, and define a policy-driven decision model. Our framework employs a custom-designed sandbox environment that simulates real-world tool execution and allows fine-grained reward shaping. Through extensive evaluations on public and self-built benchmarks, including Agent SafetyBench, InjecAgent, and BFCL, we demonstrate that our safety-aligned agents significantly improve resistance to security threats while preserving strong utility on benign tasks. Our results show that safety and effectiveness can be jointly optimized, laying the groundwork for trustworthy deployment of autonomous LLM agents.
中文: 本文提出首个针对工具使用型智能体的统一安全对齐框架,通过结构化推理和沙盒强化学习应对用户与工具的双重威胁,在保持任务效能的同时显著提升安全防护能力。
English: This paper introduces a unified safety-alignment framework for autonomous LLM agents that addresses dual threats from users and tools through structured reasoning and sandboxed reinforcement learning, significantly enhancing security while maintaining task performance.
Authors:Ruiqiang Li, Brian Yecies, Qin Wang, Shiping Chen, Jun Shen
Abstract:
Non-Fungible Tokens (NFTs) offer a promising mechanism to protect Australian and Indigenous artists' copyright. They represent and transfer the value of artwork in digital form. Before adopting NFTs to protect Australian artwork, we in this paper investigate them empericially. We focus on examining the details of NFT structure. We start from the underlying structure of NFTs to show how they represent copyright for both artists and production owners, as well as how they aim to safeguard or secure the value of digital artworks. We then involve data collection from various types of sources with different storage methods, including on-chain, centralized, and decentralized systems. Based on both metadata and artwork content, we present our analysis and discussion on the following key issues: copyright, security and artist identification. The final results of the evaluation, unfortnately, show that the NFT is NOT ready to protect Australian and Indigenous artists' copyright.
中文: 本研究通过结构分析和多源数据评估,实证检验了NFT保护澳大利亚及原住民艺术家版权的潜力,最终结论表明NFT目前尚未具备此项能力。
English: This study empirically examines NFTs' potential to protect Australian and Indigenous artists' copyright through structural analysis and multi-source data evaluation, concluding that NFTs are currently not ready for this purpose.
Authors:Ya-Ting Yang, Quanyan Zhu
Abstract:
Cyber warfare has become a critical dimension of modern conflict, driven by society's increasing dependence on interconnected digital and physical infrastructure. Effective cyber defense often requires decision-making at different echelons, where the tactical layer focuses on detailed actions such as techniques, tactics, and procedures, while the strategic layer addresses long-term objectives and coordinated planning. Modeling these interactions at different echelons remains challenging due to the dynamic, large-scale, and interdependent nature of cyber environments. To address this, we propose a multi-resolution dynamic game framework in which the tactical layer captures fine-grained interactions using high-resolution extensive-form game trees, while the strategic layer is modeled as a Markov game defined over lower-resolution states abstracted from those game trees. This framework supports scalable reasoning and planning across different levels of abstraction through zoom-in and zoom-out operations that adjust the granularity of the modeling based on operational needs. A case study demonstrates how the framework works and its effectiveness in improving the defender's strategic advantage.
中文: 该多分辨率动态博弈框架通过精细化博弈树建模战术交互,并利用抽象马尔可夫博弈实现战略协调,从而提升网络防御的可扩展性和决策优势。
English: The proposed multi-resolution dynamic game framework enables scalable cyber defense planning by modeling tactical interactions with detailed game trees and strategic coordination through abstracted Markov games, enhancing defenders' operational adaptability.
Authors:Weidi Luo, Qiming Zhang, Tianyu Lu, Xiaogeng Liu, Bin Hu, Hung-Chun Chiu, Siyuan Ma, Yizhe Zhang, Xusheng Xiao, Yinzhi Cao, Zhen Xiang, Chaowei Xiao
Abstract:
Computer-use agent (CUA) frameworks, powered by large language models (LLMs) or multimodal LLMs (MLLMs), are rapidly maturing as assistants that can perceive context, reason, and act directly within software environments. Among their most critical applications is operating system (OS) control. As CUAs in the OS domain become increasingly embedded in daily operations, it is imperative to examine their real-world security implications, specifically whether CUAs can be misused to perform realistic, security-relevant attacks. Existing works exhibit four major limitations: Missing attacker-knowledge model on tactics, techniques, and procedures (TTP), Incomplete coverage for end-to-end kill chains, unrealistic environment without multi-host and encrypted user credentials, and unreliable judgment dependent on LLM-as-a-Judge. To address these gaps, we propose AdvCUA, the first benchmark aligned with real-world TTPs in MITRE ATT&CK Enterprise Matrix, which comprises 140 tasks, including 40 direct malicious tasks, 74 TTP-based malicious tasks, and 26 end-to-end kill chains, systematically evaluates CUAs under a realistic enterprise OS security threat in a multi-host environment sandbox by hard-coded evaluation. We evaluate the existing five mainstream CUAs, including ReAct, AutoGPT, Gemini CLI, Cursor CLI, and Cursor IDE based on 8 foundation LLMs. The results demonstrate that current frontier CUAs do not adequately cover OS security-centric threats. These capabilities of CUAs reduce dependence on custom malware and deep domain expertise, enabling even inexperienced attackers to mount complex enterprise intrusions, which raises social concern about the responsibility and security of CUAs.
中文: 计算机使用代理框架虽能提升操作系统效率,却因安全漏洞使非专业攻击者也能发动复杂入侵,AdvCUA基准测试揭示了现有系统的防御不足,引发社会对智能代理安全责任的担忧。
English: Computer-use agent (CUA) frameworks, while advancing as OS assistants, present significant security risks by enabling complex attacks with minimal expertise, as demonstrated by the AdvCUA benchmark revealing vulnerabilities in current systems.
Authors:Qingyu Yin, Chak Tou Leong, Linyi Yang, Wenxuan Huang, Wenjie Li, Xiting Wang, Jaehong Yoon, YunXing, XingYu, Jinjin Gu
Abstract:
Large reasoning models (LRMs) with multi-step reasoning capabilities have shown remarkable problem-solving abilities, yet they exhibit concerning safety vulnerabilities that remain poorly understood. In this work, we investigate why safety alignment fails in reasoning models through a mechanistic interpretability lens. Using a linear probing approach to trace refusal intentions across token positions, we discover a striking phenomenon termed as \textbf{refusal cliff}: many poorly-aligned reasoning models correctly identify harmful prompts and maintain strong refusal intentions during their thinking process, but experience a sharp drop in refusal scores at the final tokens before output generation. This suggests that these models are not inherently unsafe; rather, their refusal intentions are systematically suppressed. Through causal intervention analysis, we identify a sparse set of attention heads that negatively contribute to refusal behavior. Ablating just 3\% of these heads can reduce attack success rates below 10\%. Building on these mechanistic insights, we propose \textbf{Cliff-as-a-Judge}, a novel data selection method that identifies training examples exhibiting the largest refusal cliff to efficiently repair reasoning models' safety alignment. This approach achieves comparable safety improvements using only 1.7\% of the vanilla safety training data, demonstrating a less-is-more effect in safety alignment.
中文: 大型推理模型存在“拒绝悬崖”现象,即安全意图在输出前骤降,但通过消除关键注意力头和针对性数据选择,仅需少量训练数据即可高效修复安全对齐。
English: Large reasoning models exhibit a "refusal cliff" where safety intentions sharply drop before output, but ablating key attention heads and using targeted data selection can efficiently restore alignment with minimal training data.
Authors:Tomoyuki Morimae, Yuki Shirakawa, Takashi Yamakawa
Abstract:
One-way puzzles (OWPuzzs) introduced by Khurana and Tomer [STOC 2024] are a natural quantum analogue of one-way functions (OWFs), and one of the most fundamental primitives in ''Microcrypt'' where OWFs do not exist but quantum cryptography is possible. OWPuzzs are implied by almost all quantum cryptographic primitives, and imply several important applications such as non-interactive commitments and multi-party computations. A significant goal in the field of quantum cryptography is to base OWPuzzs on plausible assumptions that will not imply OWFs. In this paper, we base OWPuzzs on hardness of non-collapsing measurements. To that end, we introduce a new complexity class, $\mathbf{SampPDQP}$, which is a sampling version of the decision class $\mathbf{PDQP}$ introduced in [Aaronson, Bouland, Fitzsimons, and Lee, ITCS 2016]. We show that if $\mathbf{SampPDQP}$ is hard on average for quantum polynomial time, then OWPuzzs exist. $\mathbf{SampPDQP}$ is the class of sampling problems that can be solved by a classical polynomial-time algorithm that can make a single query to a non-collapsing measurement oracle, which is a ''magical'' oracle that can sample measurement results on quantum states without collapsing the states. Such non-collapsing measurements are highly unphysical operations that should be hard to realize in quantum polynomial-time. We also study upperbounds of the hardness of $\mathbf{SampPDQP}$. We introduce a new primitive, distributional collision-resistant puzzles (dCRPuzzs), which are a natural quantum analogue of distributional collision-resistant hashing [Dubrov and Ishai, STOC 2006]. We show that dCRPuzzs imply average-case hardness of $\mathbf{SampPDQP}$ (and therefore OWPuzzs as well). We also show that two-message honest-statistically-hiding commitments with classical communication and one-shot signatures [Amos, Georgiou, Kiayias, Zhandry, STOC 2020] imply dCRPuzzs.
中文: 单向谜题(OWPuzzs)是一种关键的量子密码学原语,可基于涉及非坍缩测量的采样类SampPDQP的平均情况困难性构建,同时也由承诺和签名衍生的分布抗碰撞谜题(dCRPuzzs)所蕴含。
English: One-way puzzles (OWPuzzs) are a key quantum cryptographic primitive that can be constructed based on the average-case hardness of the sampling class SampPDQP, which involves non-collapsing measurements, and are also implied by distributional collision-resistant puzzles (dCRPuzzs) derived from commitments and signatures.
Authors:Minki Hhan, Tomoyuki Morimae, Yasuaki Okinaka, Takashi Yamakawa
Abstract:
With the rapid advances in quantum computer architectures and the emerging prospect of large-scale quantum memory, it is becoming essential to classically verify that remote devices genuinely allocate the promised quantum memory with specified number of qubits and coherence time. In this paper, we introduce a new concept, proofs of quantum memory (PoQM). A PoQM is an interactive protocol between a classical probabilistic polynomial-time (PPT) verifier and a quantum polynomial-time (QPT) prover over a classical channel where the verifier can verify that the prover has possessed a quantum memory with a certain number of qubits during a specified period of time. PoQM generalize the notion of proofs of quantumness (PoQ) [Brakerski, Christiano, Mahadev, Vazirani, and Vidick, JACM 2021]. Our main contributions are a formal definition of PoQM and its constructions based on hardness of LWE. Specifically, we give two constructions of PoQM. The first is of a four-round and has negligible soundness error under subexponential-hardness of LWE. The second is of a polynomial-round and has inverse-polynomial soundness error under polynomial-hardness of LWE. As a lowerbound of PoQM, we also show that PoQM imply one-way puzzles. Moreover, a certain restricted version of PoQM implies quantum computation classical communication (QCCC) key exchange.
中文摘要:本文提出了量子存储证明(PoQM)的新概念,通过经典验证者与量子证明者之间的交互协议,可验证量子设备是否真实拥有特定容量的量子存储器,其构造基于LWE问题的困难性假设。
English Summary: This paper introduces Proofs of Quantum Memory (PoQM), an interactive protocol enabling classical verifiers to confirm that quantum provers possess specified quantum memory resources, with constructions based on LWE hardness assumptions.
Authors:Linyu Wu, Linhao Zhong, Wenjie Qu, Yuexin Li, Yue Liu, Shengfang Zhai, Chunhua Shen, Jiaheng Zhang
Abstract:
Diffusion large language models (dLLMs) offer faster generation than autoregressive models while maintaining comparable quality, but existing watermarking methods fail on them due to their non-sequential decoding. Unlike autoregressive models that generate tokens left-to-right, dLLMs can finalize tokens in arbitrary order, breaking the causal design underlying traditional watermarks. We present DMark, the first watermarking framework designed specifically for dLLMs. DMark introduces three complementary strategies to restore watermark detectability: predictive watermarking uses model-predicted tokens when actual context is unavailable; bidirectional watermarking exploits both forward and backward dependencies unique to diffusion decoding; and predictive-bidirectional watermarking combines both approaches to maximize detection strength. Experiments across multiple dLLMs show that DMark achieves 92.0-99.5% detection rates at 1% false positive rate while maintaining text quality, compared to only 49.6-71.2% for naive adaptations of existing methods. DMark also demonstrates robustness against text manipulations, establishing that effective watermarking is feasible for non-autoregressive language models.
Authors:Chenpei Huang, Lingfeng Yao, Hui Zhong, Kyu In Lee, Lan Zhang, Xiaoyong Yuan, Tomoaki Ohtsuki, Miao Pan
Abstract:
Ear canal scanning/sensing (ECS) has emerged as a novel biometric authentication method for mobile devices paired with wireless earbuds. Existing studies have demonstrated the uniqueness of ear canals by training and testing machine learning classifiers on ECS data. However, implementing practical ECS-based authentication requires preventing raw biometric data leakage and designing computationally efficient protocols suitable for resource-constrained earbuds. To address these challenges, we propose an ear canal key extraction protocol, \textbf{EarID}. Without relying on classifiers, EarID extracts unique binary keys directly on the earbuds during authentication. These keys further allow the use of privacy-preserving fuzzy commitment scheme that verifies the wearer's key on mobile devices. Our evaluation results demonstrate that EarID achieves a 98.7\% authentication accuracy, comparable to machine learning classifiers. The mobile enrollment time (160~ms) and earbuds processing time (226~ms) are negligible in terms of wearer's experience. Moreover, our approach is robust and attack-resistant, maintaining a false acceptance rate below 1\% across all adversarial scenarios. We believe the proposed EarID offers a practical and secure solution for next-generation wireless earbuds.
Authors:Haoran Xi, Minghao Shao, Brendan Dolan-Gavitt, Muhammad Shafique, Ramesh Karri
Abstract:
Large language models show promise for vulnerability discovery, yet prevailing methods inspect code in isolation, struggle with long contexts, and focus on coarse function- or file-level detections - offering limited actionable guidance to engineers who need precise line-level localization and targeted patches in real-world software development. We present T2L-Agent (Trace-to-Line Agent), a project-level, end-to-end framework that plans its own analysis and progressively narrows scope from modules to exact vulnerable lines. T2L-Agent couples multi-round feedback with an Agentic Trace Analyzer (ATA) that fuses runtime evidence - crash points, stack traces, and coverage deltas - with AST-based code chunking, enabling iterative refinement beyond single pass predictions and translating symptoms into actionable, line-level diagnoses. To benchmark line-level vulnerability discovery, we introduce T2L-ARVO, a diverse, expert-verified 50-case benchmark spanning five crash families and real-world projects. T2L-ARVO is specifically designed to support both coarse-grained detection and fine-grained localization, enabling rigorous evaluation of systems that aim to move beyond file-level predictions. On T2L-ARVO, T2L-Agent achieves up to 58.0% detection and 54.8% line-level localization, substantially outperforming baselines. Together, the framework and benchmark push LLM-based vulnerability detection from coarse identification toward deployable, robust, precision diagnostics that reduce noise and accelerate patching in open-source software workflows.
Authors:Gaurav Bagwe, Saket S. Chaturvedi, Xiaolong Ma, Xiaoyong Yuan, Kuang-Ching Wang, Lan Zhang
Abstract:
Retrieval-augmented generation (RAG) enhances factual grounding by integrating retrieval mechanisms with generative models but introduces new attack surfaces, particularly through backdoor attacks. While prior research has largely focused on disinformation threats, fairness vulnerabilities remain underexplored. Unlike conventional backdoors that rely on direct trigger-to-target mappings, fairness-driven attacks exploit the interaction between retrieval and generation models, manipulating semantic relationships between target groups and social biases to establish a persistent and covert influence on content generation. This paper introduces BiasRAG, a systematic framework that exposes fairness vulnerabilities in RAG through a two-phase backdoor attack. During the pre-training phase, the query encoder is compromised to align the target group with the intended social bias, ensuring long-term persistence. In the post-deployment phase, adversarial documents are injected into knowledge bases to reinforce the backdoor, subtly influencing retrieved content while remaining undetectable under standard fairness evaluations. Together, BiasRAG ensures precise target alignment over sensitive attributes, stealthy execution, and resilience. Empirical evaluations demonstrate that BiasRAG achieves high attack success rates while preserving contextual relevance and utility, establishing a persistent and evolving threat to fairness in RAG.
Authors:Aicha War, Adnan A. Rawass, Abdoul K. Kabore, Jordan Samhi, Jacques Klein, Tegawende F. Bissyande
Abstract:
Infrastructure as Code (IaC) automates the provisioning and management of IT infrastructure through scripts and tools, streamlining software deployment. Prior studies have shown that IaC scripts often contain recurring security misconfigurations, and several detection and mitigation approaches have been proposed. Most of these rely on static analysis, using statistical code representations or Machine Learning (ML) classifiers to distinguish insecure configurations from safe code. In this work, we introduce a novel approach that enhances static analysis with semantic understanding by jointly leveraging natural language and code representations. Our method builds on two complementary ML models: CodeBERT, to capture semantics across code and text, and LongFormer, to represent long IaC scripts without losing contextual information. We evaluate our approach on misconfiguration datasets from two widely used IaC tools, Ansible and Puppet. To validate its effectiveness, we conduct two ablation studies (removing code text from the natural language input and truncating scripts to reduce context) and compare against four large language models (LLMs) and prior work. Results show that semantic enrichment substantially improves detection, raising precision and recall from 0.46 and 0.79 to 0.92 and 0.88 on Ansible, and from 0.55 and 0.97 to 0.87 and 0.75 on Puppet, respectively.
Authors:Aicha War, Serge L. B. Nikiema, Jordan Samhi, Jacques Klein, Tegawende F. Bissyande
Abstract:
Infrastructure as Code (IaC) has become essential for modern software management, yet security flaws in IaC scripts can have severe consequences, as exemplified by the recurring exploits of Cloud Web Services. Prior work has recognized the need to build a precise taxonomy of security smells in IaC scripts as a first step towards developing approaches to improve IaC security. This first effort led to the unveiling of seven sins, limited by the focus on a single IaC tool as well as by the extensive, and potentially biased, manual effort that was required. We propose, in our work, to revisit this taxonomy: first, we extend the study of IaC security smells to a more diverse dataset with scripts associated with seven popular IaC tools, including Terraform, Ansible, Chef, Puppet, Pulumi, Saltstack, and Vagrant; second, we bring in some automation for the analysis by relying on an LLM. While we leverage LLMs for initial pattern processing, all taxonomic decisions underwent systematic human validation and reconciliation with established security standards. Our study yields a comprehensive taxonomy of 62 security smell categories, significantly expanding beyond the previously known seven. We demonstrate actionability by implementing new security checking rules within linters for seven popular IaC tools, often achieving 1.00 precision score. Our evolution study of security smells in GitHub projects reveals that these issues persist for extended periods, likely due to inadequate detection and mitigation tools. This work provides IaC practitioners with insights for addressing common security smells and systematically adopting DevSecOps practices to build safer infrastructure code.
Authors:Xuan Chen, Shiwei Feng, Zikang Xiong, Shengwei An, Yunshu Mao, Lu Yan, Guanhong Tao, Wenbo Guo, Xiangyu Zhang
Abstract:
Assessing the safety of autonomous driving (AD) systems against security threats, particularly backdoor attacks, is a stepping stone for real-world deployment. However, existing works mainly focus on pixel-level triggers that are impractical to deploy in the real world. We address this gap by introducing a novel backdoor attack against the end-to-end AD systems that leverage one or more other vehicles' trajectories as triggers. To generate precise trigger trajectories, we first use temporal logic (TL) specifications to define the behaviors of attacker vehicles. Configurable behavior models are then used to generate these trajectories, which are quantitatively evaluated and iteratively refined based on the TL specifications. We further develop a negative training strategy by incorporating patch trajectories that are similar to triggers but are designated not to activate the backdoor. It enhances the stealthiness of the attack and refines the system's responses to trigger scenarios. Through extensive experiments on 5 offline reinforcement learning (RL) driving agents with 6 trigger patterns and target action combinations, we demonstrate the flexibility and effectiveness of our proposed attack, showing the under-exploration of existing end-to-end AD systems' vulnerabilities to such trajectory-based backdoor attacks.
Authors:Anastasiia Belousova, Francesco Marchiori, Mauro Conti
Abstract:
Online voting enables individuals to participate in elections remotely, offering greater efficiency and accessibility in both governmental and organizational settings. As this method gains popularity, ensuring the security of online voting systems becomes increasingly vital, as the systems supporting it must satisfy a demanding set of security requirements. Most research in this area emphasizes the design and verification of cryptographic protocols to protect voter integrity and system confidentiality. However, other vectors, such as network traffic analysis, remain relatively understudied, even though they may pose significant threats to voter privacy and the overall trustworthiness of the system. In this paper, we examine how adversaries can exploit metadata from encrypted network traffic to uncover sensitive information during online voting. Our analysis reveals that, even without accessing the encrypted content, it is possible to infer critical voter actions, such as whether a person votes, the exact moment a ballot is submitted, and whether the ballot is valid or spoiled. We test these attacks with both rule-based techniques and machine learning methods. We evaluate our attacks on two widely used online voting platforms, one proprietary and one partially open source, achieving classification accuracy as high as 99.5%. These results expose a significant privacy vulnerability that threatens key properties of secure elections, including voter secrecy and protection against coercion or vote-buying. We explore mitigations to our attacks, demonstrating that countermeasures such as payload padding and timestamp equalization can substantially limit their effectiveness.
在线投票提高了参与度和效率,但存在安全风险,攻击者可通过加密网络流量推断敏感投票行为,威胁隐私和选举公正性,尽管填充载荷和均衡时间戳等对策可有效缓解这些威胁。
Online voting enhances participation and efficiency but faces security risks, as adversaries can infer sensitive voter actions from encrypted network traffic, threatening privacy and election integrity despite potential countermeasures like payload padding and timestamp equalization.
Authors:Qingyuan Li, Binchang Li, Cuiyun Gao, Shuzheng Gao, Zongjie Li
Abstract:
Security patch detection (SPD) is crucial for maintaining software security, as unpatched vulnerabilities can lead to severe security risks. In recent years, numerous learning-based SPD approaches have demonstrated promising results on source code. However, these approaches typically cannot be applied to closed-source applications and proprietary systems that constitute a significant portion of real-world software, as they release patches only with binary files, and the source code is inaccessible. Given the impressive performance of code large language models (LLMs) in code intelligence and binary analysis tasks such as decompilation and compilation optimization, their potential for detecting binary security patches remains unexplored, exposing a significant research gap between their demonstrated low-level code understanding capabilities and this critical security task. To address this gap, we construct a large-scale binary patch dataset containing \textbf{19,448} samples, with two levels of representation: assembly code and pseudo-code, and systematically evaluate \textbf{19} code LLMs of varying scales to investigate their capability in binary SPD tasks. Our initial exploration demonstrates that directly prompting vanilla code LLMs struggles to accurately identify security patches from binary patches, and even state-of-the-art prompting techniques fail to mitigate the lack of domain knowledge in binary SPD within vanilla models. Drawing on the initial findings, we further investigate the fine-tuning strategy for injecting binary SPD domain knowledge into code LLMs through two levels of representation. Experimental results demonstrate that fine-tuned LLMs achieve outstanding performance, with the best results obtained on the pseudo-code representation.
中文: 本研究通过构建大规模二进制补丁数据集,证明了经过微调的代码大语言模型(尤其是使用伪代码表示时)在二进制安全补丁检测中表现卓越,有效弥补了闭源软件安全补丁检测的研究空白。
English: This study addresses the gap in detecting security patches for closed-source software by constructing a large binary patch dataset and demonstrating that fine-tuned code LLMs, particularly using pseudo-code representations, achieve superior performance in binary security patch detection compared to direct prompting methods.
Authors:Yidan Sun, Viktor Schlegel, Srinivasan Nandakumar, Iqra Zahid, Yuping Wu, Warren Del-Pinto, Goran Nenadic, Siew-Kei Lam, Jie Zhang, Anil A Bharath
Abstract:
Generative AI offers transformative potential for high-stakes domains such as healthcare and finance, yet privacy and regulatory barriers hinder the use of real-world data. To address this, differentially private synthetic data generation has emerged as a promising alternative. In this work, we introduce a unified benchmark to systematically evaluate the utility and fidelity of text datasets generated under formal Differential Privacy (DP) guarantees. Our benchmark addresses key challenges in domain-specific benchmarking, including choice of representative data and realistic privacy budgets, accounting for pre-training and a variety of evaluation metrics. We assess state-of-the-art privacy-preserving generation methods across five domain-specific datasets, revealing significant utility and fidelity degradation compared to real data, especially under strict privacy constraints. These findings underscore the limitations of current approaches, outline the need for advanced privacy-preserving data sharing methods and set a precedent regarding their evaluation in realistic scenarios.
Authors:Xinyu Li, Tianjin Huang, Ronghui Mu, Xiaowei Huang, Gaojie Jin
Abstract:
Recent advances in Chain-of-Thought (CoT) prompting have substantially enhanced the reasoning capabilities of large language models (LLMs), enabling sophisticated problem-solving through explicit multi-step reasoning traces. However, these enhanced reasoning processes introduce novel attack surfaces, particularly vulnerabilities to computational inefficiency through unnecessarily verbose reasoning chains that consume excessive resources without corresponding performance gains. Prior overthinking attacks typically require restrictive conditions including access to external knowledge sources for data poisoning, reliance on retrievable poisoned content, and structurally obvious templates that limit practical applicability in real-world scenarios. To address these limitations, we propose POT (Prompt-Only OverThinking), a novel black-box attack framework that employs LLM-based iterative optimization to generate covert and semantically natural adversarial prompts, eliminating dependence on external data access and model retrieval. Extensive experiments across diverse model architectures and datasets demonstrate that POT achieves superior performance compared to other methods.
中文: 链式思维提示的最新进展提升了大型语言模型的推理能力,但也带来了计算效率低下的新攻击面;本文提出的POT框架通过生成无需外部依赖的隐蔽对抗性提示有效解决这一问题,并在实验中展现出优越性能。
English: Recent advances in Chain-of-Thought prompting have improved LLMs' reasoning but introduced vulnerabilities to computational inefficiency, which the proposed POT framework addresses by generating covert adversarial prompts without external dependencies, achieving superior performance in experiments.
Authors:Stephen Meisenbacher, Alexandra Klymenko, Andreea-Elena Bodea, Florian Matthes
Abstract:
Differentially private text sanitization refers to the process of privatizing texts under the framework of Differential Privacy (DP), providing provable privacy guarantees while also empirically defending against adversaries seeking to harm privacy. Despite their simplicity, DP text sanitization methods operating at the word level exhibit a number of shortcomings, among them the tendency to leave contextual clues from the original texts due to randomization during sanitization $\unicode{x2013}$ this we refer to as $\textit{contextual vulnerability}$. Given the powerful contextual understanding and inference capabilities of Large Language Models (LLMs), we explore to what extent LLMs can be leveraged to exploit the contextual vulnerability of DP-sanitized texts. We expand on previous work not only in the use of advanced LLMs, but also in testing a broader range of sanitization mechanisms at various privacy levels. Our experiments uncover a double-edged sword effect of LLM-based data reconstruction attacks on privacy and utility: while LLMs can indeed infer original semantics and sometimes degrade empirical privacy protections, they can also be used for good, to improve the quality and privacy of DP-sanitized texts. Based on our findings, we propose recommendations for using LLM data reconstruction as a post-processing step, serving to increase privacy protection by thinking adversarially.
中文: 差分隐私文本脱敏虽能提供可证明的隐私保护,但存在上下文漏洞,大型语言模型既能利用此漏洞推断原始内容,也能通过对抗性后处理提升文本的隐私性和可用性。
English: Differentially private text sanitization provides provable privacy but suffers from contextual vulnerability, which large language models can exploit to infer original content, yet they also offer potential to enhance both privacy and utility through adversarial post-processing.
Authors:Andreas D. Kellas, Neophytos Christou, Wenxin Jiang, Penghui Li, Laurent Simon, Yaniv David, Vasileios P. Kemerlis, James C. Davis, Junfeng Yang
Abstract:
Machine learning model repositories such as the Hugging Face Model Hub facilitate model exchanges. However, bad actors can deliver malware through compromised models. Existing defenses such as safer model formats, restrictive (but inflexible) loading policies, and model scanners have shortcomings: 44.9% of popular models on Hugging Face still use the insecure pickle format, 15% of these cannot be loaded by restrictive loading policies, and model scanners have both false positives and false negatives. Pickle remains the de facto standard for model exchange, and the ML community lacks a tool that offers transparent safe loading. We present PickleBall to help machine learning engineers load pickle-based models safely. PickleBall statically analyzes the source code of a given machine learning library and computes a custom policy that specifies a safe load-time behavior for benign models. PickleBall then dynamically enforces the policy during load time as a drop-in replacement for the pickle module. PickleBall generates policies that correctly load 79.8% of benign pickle-based models in our dataset, while rejecting all (100%) malicious examples in our dataset. In comparison, evaluated model scanners fail to identify known malicious models, and the state-of-art loader loads 22% fewer benign models than PickleBall. PickleBall removes the threat of arbitrary function invocation from malicious pickle-based models, raising the bar for attackers to depend on code reuse techniques.
中文: PickleBall是一种安全工具,通过静态分析库代码并执行定制策略来安全加载基于pickle的机器学习模型,能成功加载79.8%的良性模型并完全拦截所有恶意模型。
English: PickleBall is a security tool that safely loads pickle-based machine learning models by statically analyzing library code to enforce custom policies, successfully loading 79.8% of benign models while blocking all malicious ones.
Authors:Naen Xu, Jinghuai Zhang, Changjiang Li, Zhi Chen, Chunyi Zhou, Qingming Li, Tianyu Du, Shouling Ji
Abstract:
The rapid growth of text-to-video (T2V) diffusion models has raised concerns about privacy, copyright, and safety due to their potential misuse in generating harmful or misleading content. These models are often trained on numerous datasets, including unauthorized personal identities, artistic creations, and harmful materials, which can lead to uncontrolled production and distribution of such content. To address this, we propose VideoEraser, a training-free framework that prevents T2V diffusion models from generating videos with undesirable concepts, even when explicitly prompted with those concepts. Designed as a plug-and-play module, VideoEraser can seamlessly integrate with representative T2V diffusion models via a two-stage process: Selective Prompt Embedding Adjustment (SPEA) and Adversarial-Resilient Noise Guidance (ARNG). We conduct extensive evaluations across four tasks, including object erasure, artistic style erasure, celebrity erasure, and explicit content erasure. Experimental results show that VideoEraser consistently outperforms prior methods regarding efficacy, integrity, fidelity, robustness, and generalizability. Notably, VideoEraser achieves state-of-the-art performance in suppressing undesirable content during T2V generation, reducing it by 46% on average across four tasks compared to baselines.
Chinese: VideoEraser是一种无需训练的即插即用框架,通过选择性提示嵌入调整和抗干扰噪声引导的两阶段设计,能有效阻止文本到视频扩散模型生成不良内容,在四项任务中平均减少46%的不良输出,达到最先进的性能水平。
English: VideoEraser is a training-free plug-and-play framework that effectively prevents text-to-video diffusion models from generating undesirable content through a two-stage process, achieving state-of-the-art performance by reducing unwanted outputs by 46% across multiple tasks.
Authors:Hengyu An, Jinghuai Zhang, Tianyu Du, Chunyi Zhou, Qingming Li, Tao Lin, Shouling Ji
Abstract:
Large language model (LLM) agents are widely deployed in real-world applications, where they leverage tools to retrieve and manipulate external data for complex tasks. However, when interacting with untrusted data sources (e.g., fetching information from public websites), tool responses may contain injected instructions that covertly influence agent behaviors and lead to malicious outcomes, a threat referred to as Indirect Prompt Injection (IPI). Existing defenses typically rely on advanced prompting strategies or auxiliary detection models. While these methods have demonstrated some effectiveness, they fundamentally rely on assumptions about the model's inherent security, which lacks structural constraints on agent behaviors. As a result, agents still retain unrestricted access to tool invocations, leaving them vulnerable to stronger attack vectors that can bypass the security guardrails of the model. To prevent malicious tool invocations at the source, we propose a novel defensive task execution paradigm, called IPIGuard, which models the agents' task execution process as a traversal over a planned Tool Dependency Graph (TDG). By explicitly decoupling action planning from interaction with external data, IPIGuard significantly reduces unintended tool invocations triggered by injected instructions, thereby enhancing robustness against IPI attacks. Experiments on the AgentDojo benchmark show that IPIGuard achieves a superior balance between effectiveness and robustness, paving the way for the development of safer agentic systems in dynamic environments.
中文: IPIGuard通过工具依赖图将规划与数据交互解耦,有效减少意外工具调用,显著提升大语言模型代理对抗间接提示注入攻击的鲁棒性。
English: IPIGuard introduces a Tool Dependency Graph to decouple planning from data interactions, effectively reducing unintended tool invocations and enhancing robustness against Indirect Prompt Injection attacks in LLM agents.
Authors:Amira Guesmi, Bassem Ouni, Muhammad Shafique
Abstract:
Quantized Neural Networks (QNNs) are increasingly deployed in edge and resource-constrained environments due to their efficiency in computation and memory usage. While shown to distort the gradient landscape and weaken conventional pixel-level attacks, it provides limited robustness against patch-based adversarial attacks-localized, high-saliency perturbations that remain surprisingly transferable across bit-widths. Existing defenses either overfit to fixed quantization settings or fail to address this cross-bit generalization vulnerability. We introduce \textbf{TriQDef}, a tri-level quantization-aware defense framework designed to disrupt the transferability of patch-based adversarial attacks across QNNs. TriQDef consists of: (1) a Feature Disalignment Penalty (FDP) that enforces semantic inconsistency by penalizing perceptual similarity in intermediate representations; (2) a Gradient Perceptual Dissonance Penalty (GPDP) that explicitly misaligns input gradients across bit-widths by minimizing structural and directional agreement via Edge IoU and HOG Cosine metrics; and (3) a Joint Quantization-Aware Training Protocol that unifies these penalties within a shared-weight training scheme across multiple quantization levels. Extensive experiments on CIFAR-10 and ImageNet demonstrate that TriQDef reduces Attack Success Rates (ASR) by over 40\% on unseen patch and quantization combinations, while preserving high clean accuracy. Our findings underscore the importance of disrupting both semantic and perceptual gradient alignment to mitigate patch transferability in QNNs.
中文摘要:TriQDef是一种新颖的三级防御框架,通过特征错位惩罚和梯度感知差异机制破坏量化神经网络中基于补丁的对抗攻击跨位宽迁移性,在保持精度的同时将攻击成功率降低超40%。
English Summary: TriQDef is a novel tri-level defense framework that disrupts patch-based adversarial attack transferability across quantized neural networks by introducing feature disalignment and gradient perceptual penalties, achieving over 40% ASR reduction while maintaining accuracy.
Authors:Jiongchi Yu, Xiaofei Xie, Qiang Hu, Yuhan Ma, Ziming Zhao
Abstract:
Insider threats, which can lead to severe losses, remain a major security concern. While machine learning-based insider threat detection (ITD) methods have shown promising results, their progress is hindered by the scarcity of high-quality data. Enterprise data is sensitive and rarely accessible, while publicly available datasets, when limited in scale due to cost, lack sufficient real-world coverage; and when purely synthetic, they fail to capture rich semantics and realistic user behavior. To address this, we propose Chimera, the first large language model (LLM)-based multi-agent framework that automatically simulates both benign and malicious insider activities and collects diverse logs across diverse enterprise environments. Chimera models each employee with agents that have role-specific behavior and integrates modules for group meetings, pairwise interactions, and autonomous scheduling, capturing realistic organizational dynamics. It incorporates 15 types of insider attacks (e.g., IP theft, system sabotage) and has been deployed to simulate activities in three sensitive domains: technology company, finance corporation, and medical institution, producing a new dataset, ChimeraLog. We assess ChimeraLog via human studies and quantitative analysis, confirming its diversity, realism, and presence of explainable threat patterns. Evaluations of existing ITD methods show an average F1-score of 0.83, which is significantly lower than 0.99 on the CERT dataset, demonstrating ChimeraLog's higher difficulty and utility for advancing ITD research.
中文: 内部威胁检测因缺乏真实数据而受阻,为此提出Chimera框架,利用大语言模型多智能体模拟企业环境中的正常与恶意活动,生成包含15类攻击的多样化日志数据集,其高难度和真实性为检测研究提供了有效工具。
English: Insider threat detection faces challenges due to the lack of realistic data, prompting the development of Chimera, an LLM-based multi-agent framework that generates diverse and realistic enterprise activity logs, including insider attacks, to create a high-quality dataset for advancing detection research.
Authors:Yuhan Zhi, Longtian Wang, Xiaofei Xie, Chao Shen, Qiang Hu, Xiaohong Guan
Abstract:
Active learning(AL), which serves as the representative label-efficient learning paradigm, has been widely applied in resource-constrained scenarios. The achievement of AL is attributed to acquisition functions, which are designed for identifying the most important data to label. Despite this success, one question remains unanswered: is AL safe? In this work, we introduce ALA, a practical and the first framework to utilize the acquisition function as the poisoning attack surface to reveal the weakness of active learning. Specifically, ALA optimizes imperceptibly poisoned inputs to exhibit high uncertainty scores, increasing their probability of being selected by acquisition functions. To evaluate ALA, we conduct extensive experiments across three datasets, three acquisition functions, and two types of clean-label backdoor triggers. Results show that our attack can achieve high success rates (up to 94%) even under low poisoning budgets (0.5%-1.0%) while preserving model utility and remaining undetectable to human annotators. Our findings remind active learning users: acquisition functions can be easily exploited, and active learning should be deployed with caution in trusted data scenarios.
主动学习虽然标签效率高,但其获取函数易受攻击,ALA框架证明能以极低污染率实现高成功率且不被察觉,提醒用户需谨慎部署。
Active learning, while efficient in labeling, is vulnerable to poisoning attacks through its acquisition functions, as demonstrated by the ALA framework achieving high attack success with minimal detection.
Authors:Yuanzheng Niu, Xiaoqi Li, Wenkai Li
Abstract:
Security issues are becoming increasingly significant with the rapid evolution of Non-fungible Tokens (NFTs). As NFTs are traded as digital assets, they have emerged as prime targets for cyber attackers. In the development of NFT smart contracts, there may exist undiscovered defects that could lead to substantial financial losses if exploited. To tackle this issue, this paper presents a framework called NATLM(NFT Assistant LLM), designed to detect potential defects in NFT smart contracts. The framework effectively identifies four common types of vulnerabilities in NFT smart contracts: ERC-721 Reentrancy, Public Burn, Risky Mutable Proxy, and Unlimited Minting. Relying exclusively on large language models (LLMs) for defect detection can lead to a high false-positive rate. To enhance detection performance, NATLM integrates static analysis with LLMs, specifically Gemini Pro 1.5. Initially, NATLM employs static analysis to extract structural, syntactic, and execution flow information from the code, represented through Abstract Syntax Trees (AST) and Control Flow Graphs (CFG). These extracted features are then combined with vectors of known defect examples to create a matrix for input into the knowledge base. Subsequently, the feature vectors and code vectors of the analyzed contract are compared with the contents of the knowledge base. Finally, the LLM performs deep semantic analysis to enhance detection capabilities, providing a more comprehensive and accurate identification of potential security issues. Experimental results indicate that NATLM analyzed 8,672 collected NFT smart contracts, achieving an overall precision of 87.72%, a recall of 89.58%, and an F1 score of 88.94%. The results outperform other baseline experiments, successfully identifying four common types of defects.
中文: 本文提出NATLM框架,通过结合静态分析与大语言模型,能有效检测NFT智能合约中的四类常见漏洞,实验表明其具有较高的精确率和召回率。
English: This paper introduces NATLM, a framework that combines static analysis with large language models to effectively detect four common vulnerabilities in NFT smart contracts, achieving high precision and recall rates in experiments.
Authors:Hongli Peng, Xiaoqi Li, Wenkai Li
Abstract:
The introduction of smart contract functionality marks the advent of the blockchain 2.0 era, enabling blockchain technology to support digital currency transactions and complex distributed applications. However, many smart contracts have been found to contain vulnerabilities and errors, leading to the loss of assets within the blockchain. Despite a range of tools that have been developed to identify vulnerabilities in smart contracts at the source code or bytecode level, most rely on a single modality, reducing performance, accuracy, and limited generalization capabilities. This paper proposes a multimodal deep learning approach, MultiCFV, which is designed specifically to analyze and detect erroneous control flow vulnerability, as well as identify code clones in smart contracts. Bytecode is generated from source code to construct control flow graphs, with graph embedding techniques extracting graph features. Abstract syntax trees are used to obtain syntax features, while code comments capture key commentary words and comment features. These three feature vectors are fused to create a database for code inspection, which is used to detect similar code and identify contract vulnerabilities. Experimental results demonstrate our method effectively combines structural, syntactic, and semantic information, improving the accuracy of smart contract vulnerability detection and clone detection.
中文:MultiCFV多模态深度学习方法通过融合控制流图、语法树和注释特征,有效提升了智能合约漏洞检测与代码克隆识别的准确性。
English: The MultiCFV multimodal deep learning approach enhances smart contract security by fusing control flow graph, syntax tree, and comment features to improve vulnerability and clone detection accuracy.
Authors:Dechao Kong, Xiaoqi Li, Wenkai Li
Abstract:
The increasing number of attacks on the contract layer of DApps has resulted in economic losses amounting to $66 billion. Vulnerabilities arise when contracts interact with external protocols without verifying the results of the calls, leading to exploit entry points such as flash loan attacks and reentrancy attacks. In this paper, we propose UEChecker, a deep learning-based tool that utilizes a call graph and a Graph Convolutional Network to detect unchecked external call vulnerabilities. We design the following components: An edge prediction module that reconstructs the feature representation of nodes and edges in the call graph; A node aggregation module that captures structural information from both the node itself and its neighbors, thereby enhancing feature representation between nodes and improving the model's understanding of the global graph structure; A Conformer Block module that integrates multi-head attention, convolutional modules, and feedforward neural networks to more effectively capture dependencies of different scales within the call graph, extending beyond immediate neighbors and enhancing the performance of vulnerability detection. Finally, we combine these modules with Graph Convolutional Network to detect unchecked external call vulnerabilities. By auditing the smart contracts of 608 DApps, our results show that our tool achieves an accuracy of 87.59% in detecting unchecked external call vulnerabilities. Furthermore, we compare our tool with GAT, LSTM, and GCN baselines, and in the comparison experiments, UEChecker consistently outperforms these models in terms of accuracy.
中文: 针对DApp合约层攻击导致660亿美元损失的问题,本文提出UEChecker工具,通过调用图和图卷积网络检测未检查的外部调用漏洞,准确率达87.59%,优于其他基线模型。
English: The increasing attacks on DApp contract layers, causing $66 billion in losses, are addressed by UEChecker, a deep learning tool that uses a call graph and Graph Convolutional Network to detect unchecked external call vulnerabilities with 87.59% accuracy, outperforming other models.
Authors:Yujie Ma, Lili Quan, Xiaofei Xie, Qiang Hu, Jiongchi Yu, Yao Zhang, Sen Chen
Abstract:
The rise of Large Language Models (LLMs) has led to the widespread deployment of LLM-based systems across diverse domains. As these systems proliferate, understanding the risks associated with their complex supply chains is increasingly important. LLM-based systems are not standalone as they rely on interconnected supply chains involving pretrained models, third-party libraries, datasets, and infrastructure. Yet, most risk assessments narrowly focus on model or data level, overlooking broader supply chain vulnerabilities. While recent studies have begun to address LLM supply chain risks, there remains a lack of benchmarks for systematic research.
To address this gap, we introduce the first comprehensive dataset for analyzing and benchmarking LLM supply chain security. We collect 3,859 real-world LLM applications and perform interdependency analysis, identifying 109,211 models, 2,474 datasets, and 9,862 libraries. We extract model fine-tuning paths, dataset reuse, and library reliance, mapping the ecosystem's structure. To evaluate security, we gather 1,555 risk-related issues-50 for applications, 325 for models, 18 for datasets, and 1,229 for libraries from public vulnerability databases.
Using this dataset, we empirically analyze component dependencies and risks. Our findings reveal deeply nested dependencies in LLM applications and significant vulnerabilities across the supply chain, underscoring the need for comprehensive security analysis. We conclude with practical recommendations to guide researchers and developers toward safer, more trustworthy LLM-enabled systems.
中文: 本研究首次引入综合分析大语言模型供应链安全的数据集,揭示了各组件间深度依赖关系和显著安全漏洞,并为构建更安全的系统提出了实用建议。
English: This study introduces the first comprehensive dataset to analyze LLM supply chain security, revealing deep dependencies and significant vulnerabilities across components, and provides practical recommendations for safer systems.
Authors:Haonan An, Guang Hua, Hangcheng Cao, Zhengru Fang, Guowen Xu, Susanto Rahardja, Yuguang Fang
Abstract:
The intellectual property of deep generative networks (GNets) can be protected using a cascaded hiding network (HNet) which embeds watermarks (or marks) into GNet outputs, known as box-free watermarking. Although both GNet and HNet are encapsulated in a black box (called operation network, or ONet), with only the generated and marked outputs from HNet being released to end users and deemed secure, in this paper, we reveal an overlooked vulnerability in such systems. Specifically, we show that the hidden GNet outputs can still be reliably estimated via query-based reverse engineering, leaking the generated and unmarked images, despite the attacker's limited knowledge of the system. Our first attempt is to reverse-engineer an inverse model for HNet under the stringent black-box condition, for which we propose to exploit the query process with specially curated input images. While effective, this method yields unsatisfactory image quality. To improve this, we subsequently propose an alternative method leveraging the equivalent additive property of box-free model watermarking and reverse-engineering a forward surrogate model of HNet, with better image quality preservation. Extensive experimental results on image processing and image generation tasks demonstrate that both attacks achieve impressive watermark removal success rates (100%) while also maintaining excellent image quality (reaching the highest PSNR of 34.69 dB), substantially outperforming existing attacks, highlighting the urgent need for robust defensive strategies to mitigate the identified vulnerability in box-free model watermarking.
Chinese: 本文揭示了深度生成网络无盒水印技术中的安全漏洞,表明攻击者能够在保持图像高质量的同时有效逆向工程并移除水印,强调了开发更强大防御措施的迫切性。
English: This paper exposes a vulnerability in box-free watermarking for deep generative networks, demonstrating that attackers can effectively reverse-engineer and remove watermarks while maintaining high image quality, thus highlighting the need for stronger defenses.
Authors:David Noever, Forrest McKee
Abstract:
This paper presents a novel method of executable steganography using the alpha transparency layer of ICO image files to embed and deliver self-decompressing JavaScript payloads within web browsers. By targeting the least significant bit (LSB) of non-transparent alpha layer image values, the proposed method successfully conceals compressed JavaScript code inside a favicon image without affecting visual fidelity. Global web traffic loads 294 billion favicons daily and consume 0.9 petabytes of network bandwidth. A proof-of-concept implementation demonstrates that a 64x64 ICO image can embed up to 512 bytes uncompressed, or 0.8 kilobyte when using lightweight two-fold compression. On page load, a browser fetches the favicon as part of standard behavior, allowing an embedded loader script to extract and execute the payload entirely in memory using native JavaScript APIs and canvas pixel access. This creates a two-stage covert channel requiring no additional network or user requests. Testing across multiple browsers in both desktop and mobile environments confirms successful and silent execution of the embedded script. We evaluate the threat model, relate it to polymorphic phishing attacks that evade favicon-based detection, and analyze evasion of content security policies and antivirus scanners. We map nine example MITRE ATT&CK Framework objectives to single line JavaScript to execute arbitrarily in ICO files. Existing steganalysis and sanitization defenses are discussed, highlighting limitations in detecting or neutralizing alpha-channel exploits. The results demonstrate a stealthy and reusable attack surface that blurs traditional boundaries between static images and executable content. Because modern browsers report silent errors when developers specifically fail to load ICO files, this attack surface offers an interesting example of required web behaviors that in turn compromise security.
中文摘要:本文提出了一种利用ICO图标文件透明度通道隐藏可执行JavaScript代码的新方法,通过浏览器自动加载网站图标的行为创建隐蔽攻击通道,成功绕过了内容安全策略和反病毒检测等传统防护措施。
English Summary: This paper introduces a covert method of embedding executable JavaScript code within ICO favicon images using alpha channel steganography, creating a stealthy attack vector that bypasses traditional security measures by leveraging browsers' automatic favicon loading behavior.
Authors:Jiahao Chen, junhao li, Yiming Wang, Zhe Ma, Yi Jiang, Chunyi Zhou, Qingming Li, Tianyu Du, Shouling Ji
Abstract:
The proliferation of Low-Rank Adaptation (LoRA) models has democratized personalized text-to-image generation, enabling users to share lightweight models (e.g., personal portraits) on platforms like Civitai and Liblib. However, this "share-and-play" ecosystem introduces critical risks: benign LoRAs can be weaponized by adversaries to generate harmful content (e.g., political, defamatory imagery), undermining creator rights and platform safety. Existing defenses like concept-erasure methods focus on full diffusion models (DMs), neglecting LoRA's unique role as a modular adapter and its vulnerability to adversarial prompt engineering. To bridge this gap, we propose LoRAShield, the first data-free editing framework for securing LoRA models against misuse. Our platform-driven approach dynamically edits and realigns LoRA's weight subspace via adversarial optimization and semantic augmentation. Experimental results demonstrate that LoRAShield achieves remarkable effectiveness, efficiency, and robustness in blocking malicious generations without sacrificing the functionality of the benign task. By shifting the defense to platforms, LoRAShield enables secure, scalable sharing of personalized models, a critical step toward trustworthy generative ecosystems.
中文: LoRAShield作为首个无数据编辑框架,通过动态调整LoRA模型的权重子空间,在保持正常功能的同时有效阻止恶意内容生成,为生成式生态系统提供了可扩展的安全保障。
English: LoRAShield is a pioneering data-free framework that dynamically secures LoRA models against adversarial misuse by editing their weight subspace, effectively blocking harmful content generation while preserving benign functionality.
Authors:Yufan Chen, Daoyuan Wu, Juantao Zhong, Zicheng Zhang, Debin Gao, Shuai Wang, Yingjiu Li, Ning Liu
Abstract:
Malware Family Classification (MFC) aims to identify the fine-grained family (e.g., GuLoader or BitRAT) to which a potential malware sample belongs, in contrast to malware detection or sample classification that predicts only an Yes/No. Accurate family identification can greatly facilitate automated sample labeling and understanding on crowdsourced malware analysis platforms such as VirusTotal and MalwareBazaar, which generate vast amounts of data daily. In this paper, we explore and assess the feasibility of using traditional binary string features for MFC in the new era of large language models (LLMs) and Retrieval-Augmented Generation (RAG). Specifically, we investigate how Family-Specific String (FSS) features could be utilized in a manner similar to RAG to facilitate MFC. To this end, we develop a curated evaluation framework covering 4,347 samples from 67 malware families, extract and analyze over 25 million strings, and conduct detailed ablation studies to assess the impact of different design choices in four major modules.
中文: 本研究探讨了在新一代大语言模型和检索增强生成技术背景下,利用传统二进制字符串特征进行恶意软件家族分类的可行性,并通过针对67个家族的详细消融实验评估了其效果。
English: This study investigates the feasibility of using traditional binary string features for malware family classification by adapting them through a Retrieval-Augmented Generation approach, evaluating their effectiveness across 67 families with detailed ablation studies.
Authors:Akid Abrar, Sagar Dasgupta, Mizanur Rahman, Ahmad Alsharif
Abstract:
Transportation Cyber-Physical Systems (TCPS) integrate physical elements, such as transportation infrastructure and vehicles, with cyber elements via advanced communication technologies, allowing them to interact seamlessly. This integration enhances the efficiency, safety, and sustainability of transportation systems. TCPS rely heavily on cryptographic security to protect sensitive information transmitted between vehicles, transportation infrastructure, and other entities within the transportation ecosystem, ensuring data integrity, confidentiality, and authenticity. Traditional cryptographic methods have been employed to secure TCPS communications, but the advent of quantum computing presents a significant threat to these existing security measures. Therefore, integrating Post-Quantum Cryptography (PQC) into TCPS is essential to maintain secure and resilient communications. While PQC offers a promising approach to developing cryptographic algorithms resistant to quantum attacks, artificial intelligence (AI) can enhance PQC by optimizing algorithm selection, resource allocation, and adapting to evolving threats in real-time. AI-driven PQC approaches can improve the efficiency and effectiveness of PQC implementations, ensuring robust security without compromising system performance. This chapter introduces TCPS communication protocols, discusses the vulnerabilities of corresponding communications to cyber-attacks, and explores the limitations of existing cryptographic methods in the quantum era. By examining how AI can strengthen PQC solutions, the chapter presents cyber-resilient communication strategies for TCPS.
Authors:Renhua Ding, Xiao Yang, Zhengwei Fang, Jun Luo, Kun He, Jun Zhu
Abstract:
Large vision-language models (LVLMs) enable autonomous mobile agents to operate smartphone user interfaces, yet vulnerabilities to UI-level attacks remain critically understudied. Existing research often depends on conspicuous UI overlays, elevated permissions, or impractical threat models, limiting stealth and real-world applicability. In this paper, we present a practical and stealthy one-shot jailbreak attack that leverages in-app prompt injections: malicious applications embed short prompts in UI text that remain inert during human interaction but are revealed when an agent drives the UI via ADB (Android Debug Bridge). Our framework comprises three crucial components: (1) low-privilege perception-chain targeting, which injects payloads into malicious apps as the agent's visual inputs; (2) stealthy user-invisible activation, a touch-based trigger that discriminates agent from human touches using physical touch attributes and exposes the payload only during agent operation; and (3) one-shot prompt efficacy, a heuristic-guided, character-level iterative-deepening search algorithm (HG-IDA*) that performs one-shot, keyword-level detoxification to evade on-device safety filters. We evaluate across multiple LVLM backends, including closed-source services and representative open-source models within three Android applications, and we observe high planning and execution hijack rates in single-shot scenarios (e.g., GPT-4o: 82.5% planning / 75.0% execution). These findings expose a fundamental security vulnerability in current mobile agents with immediate implications for autonomous smartphone operation.
中文: 本文提出一种针对移动智能体大视觉语言模型的隐蔽单次越狱攻击,通过应用内提示注入技术,利用人类与智能体在手机界面交互差异实现规避检测。
English: This paper introduces a stealthy one-shot jailbreak attack on large vision-language models (LVLMs) for mobile agents, using in-app prompt injections that evade detection by exploiting differences in human and agent interactions with smartphone interfaces.
Authors:Amal Yousseef, Shalaka Satam, Banafsheh Saber Latibari, Mai Abdel-Malek, Soheil Salehi, Pratik Satam
Abstract:
The rise of autonomous vehicles (AVs) promises to significantly enhance transportation safety and efficiency by mitigating human error, which is responsible for over 90\% of road accidents. However, the increasing connectivity of AVs introduces new cybersecurity challenges, as traditional perimeter-based security models are inadequate for dynamic and untrusted environments. This paper presents a novel Zero Trust-based Decentralized Identity Management (D-IM) protocol for AVs. By integrating the core principles of Zero Trust Architecture, "never trust, always verify", with the tamper resistant and decentralized nature of a blockchain network, our framework eliminates reliance on centralized authorities and provides continuous verification for every entity. We detail the system's design, which leverages Hyperledger Iroha to enable lightweight and secure authentication without a central trusted entity. A comprehensive experimental evaluation, conducted across both urban and highway scenarios, validates the protocol's practicality. Our results demonstrate that the D-IM framework introduces minimal overhead, with less than 7.5\% reduction in Packet Reception Rate (PRR) in urban settings and an increase of under 11\% in Channel Busy Ratio (CBR) for LTE-V2X. These findings prove the protocol's efficiency and robustness, providing a resilient foundation for securing real-time V2X communication against impersonation and replay attacks.
Authors:Karim Khamaisi, Oliver Kamer, Bruno Rodrigues, Jan von der Assen, Burkhard Stiller
Abstract:
During large-scale crises disrupting cellular and Internet infrastructure, civilians lack reliable methods for communication, aid coordination, and access to trustworthy information. This paper presents a unified emergency communication system integrating a low-power, long-range network with a crisis-oriented smartphone application, enabling decentralized and off-grid civilian communication. Unlike previous solutions separating physical layer resilience from user layer usability, our design merges these aspects into a cohesive crisis-tailored framework. The system is evaluated in two dimensions: communication performance and application functionality. Field experiments in urban Zürich demonstrate that the 868 MHz band, using the LongFast configuration, achieves a communication range of up to 1.2 km with 92% Packet Delivery Ratio, validating network robustness under real-world infrastructure degraded conditions. In parallel, a purpose-built mobile application featuring peer-to-peer messaging, identity verification, and community moderation was evaluated through a requirements-based analysis.
中文: 本文提出了一种集成了低功耗远距离网络与危机专用智能手机应用的统一应急通信系统,能够在基础设施中断时实现去中心化的民用通信,苏黎世的实地测试验证了其通信性能与应用功能的可靠性。
English: This paper introduces an integrated emergency communication system that combines a low-power, long-range network with a crisis-specific smartphone app, enabling decentralized civilian communication and information sharing during infrastructure failures, with field tests in Zurich confirming its robust performance and functionality.
Authors:Zhao Song, Jianfei Xue, Lichen Zhang
Abstract:
In this paper, we study differentially private mechanisms for functions whose outputs lie in a Euclidean Jordan algebra. Euclidean Jordan algebras capture many important mathematical structures and form the foundation of linear programming, second-order cone programming, and semidefinite programming. Our main contribution is a generic Gaussian mechanism for such functions, with sensitivity measured in $\ell_2$, $\ell_1$, and $\ell_\infty$ norms. Notably, this framework includes the important case where the function outputs are symmetric matrices, and sensitivity is measured in the Frobenius, nuclear, or spectral norm. We further derive private algorithms for solving symmetric cone programs under various settings, using a combination of the multiplicative weights update method and our generic Gaussian mechanism. As an application, we present differentially private algorithms for semidefinite programming, resolving a major open question posed by [Hsu, Roth, Roughgarden, and Ullman, ICALP 2014].
中文: 本文针对输出在欧几里得约当代数中的函数提出了一种通用的高斯机制,通过结合乘性权重更新方法,开发了对称锥规划的差分隐私算法,并解决了半定规划隐私保护中的关键开放性问题。
English: This paper introduces a generic Gaussian mechanism for differentially private functions with outputs in Euclidean Jordan algebras, enabling private algorithms for symmetric cone programs and resolving an open question in private semidefinite programming.
Authors:Md Wasiul Haque, Md Erfan, Sagar Dasgupta, Md Rayhanur Rahman, Mizanur Rahman
Abstract:
The interest in autonomous vehicles (AVs) for critical missions, including transportation, rescue, surveillance, reconnaissance, and mapping, is growing rapidly due to their significant safety and mobility benefits. AVs consist of complex software systems that leverage artificial intelligence (AI), sensor fusion algorithms, and real-time data processing. Additionally, AVs are becoming increasingly reliant on open-source software supply chains, such as open-source packages, third-party software components, AI models, and third-party datasets. Software security best practices in the automotive sector are often an afterthought for developers. Thus, significant cybersecurity risks exist in the software supply chain of AVs, particularly when secure software development practices are not rigorously implemented. For example, Upstream's 2024 Automotive Cybersecurity Report states that 49.5% of cyberattacks in the automotive sector are related to exploiting security vulnerabilities in software systems. In this chapter, we analyze security vulnerabilities in open-source software components in AVs. We utilize static analyzers on popular open-source AV software, such as Autoware, Apollo, and openpilot. Specifically, this chapter covers: (1) prevalent software security vulnerabilities of AVs; and (2) a comparison of static analyzer outputs for different open-source AV repositories. The goal is to inform researchers, practitioners, and policymakers about the existing security flaws in the commonplace open-source software ecosystem in the AV domain. The findings would emphasize the necessity of security best practices earlier in the software development lifecycle to reduce cybersecurity risks, thereby ensuring system reliability, safeguarding user data, and maintaining public trust in an increasingly automated world.
Authors:Minhaj Uddin Ahmad, Akid Abrar, Sagar Dasgupta, Mizanur Rahman
Abstract:
Intelligent Transportation Systems (ITS) have been widely deployed across major metropolitan regions worldwide to improve roadway safety, optimize traffic flow, and reduce environmental impacts. These systems integrate advanced sensors, communication networks, and data analytics to enable real-time traffic monitoring, adaptive signal control, and predictive maintenance. However, such integration significantly broadens the ITS attack surface, exposing critical infrastructures to cyber threats that jeopardize safety, data integrity, and operational resilience. Ensuring robust cybersecurity is therefore essential, yet comprehensive vulnerability assessments, threat modeling, and mitigation validations are often cost-prohibitive and time-intensive when applied to large-scale, heterogeneous transportation systems. Simulation platforms offer a cost-effective and repeatable means for cybersecurity evaluation, and the simulation platform should encompass the full range of ITS dimensions - mobility, sensing, networking, and applications. This chapter discusses an integrated co-simulation testbed that links CARLA for 3D environment and sensor modeling, SUMO for microscopic traffic simulation and control, and OMNeT++ for V2X communication simulation. The co-simulation testbed enables end-to-end experimentation, vulnerability identification, and mitigation benchmarking, providing practical insights for developing secure, efficient, and resilient ITS infrastructures. To illustrate its capabilities, the chapter incorporates a case study on a C-V2X proactive safety alert system enhanced with post-quantum cryptography, highlighting the role of the testbed in advancing secure and resilient ITS infrastructures.
Authors:Simin Chen, Jinjun Peng, Yixin He, Junfeng Yang, Baishakhi Ray
Abstract:
Deep learning (DL) compilers are core infrastructure in modern DL systems, offering flexibility and scalability beyond vendor-specific libraries. This work uncovers a fundamental vulnerability in their design: can an official, unmodified compiler alter a model's semantics during compilation and introduce hidden backdoors? We study both adversarial and natural settings. In the adversarial case, we craft benign models where triggers have no effect pre-compilation but become effective backdoors after compilation. Tested on six models, three commercial compilers, and two hardware platforms, our attack yields 100% success on triggered inputs while preserving normal accuracy and remaining undetected by state-of-the-art detectors. The attack generalizes across compilers, hardware, and floating-point settings. In the natural setting, we analyze the top 100 HuggingFace models (including one with 220M+ downloads) and find natural triggers in 31 models. This shows that compilers can introduce risks even without adversarial manipulation. Our results reveal an overlooked threat: unmodified DL compilers can silently alter model semantics. To our knowledge, this is the first work to expose inherent security risks in DL compiler design, opening a new direction for secure and trustworthy ML.
Authors:Wuyuao Mai, Geng Hong, Qi Liu, Jinsong Chen, Jiarun Dai, Xudong Pan, Yuan Zhang, Min Yang
Abstract:
Penetration testing is critical for identifying and mitigating security vulnerabilities, yet traditional approaches remain expensive, time-consuming, and dependent on expert human labor. Recent work has explored AI-driven pentesting agents, but their evaluation relies on oversimplified capture-the-flag (CTF) settings that embed prior knowledge and reduce complexity, leading to performance estimates far from real-world practice. We close this gap by introducing the first real-world, agent-oriented pentesting benchmark, TermiBench, which shifts the goal from 'flag finding' to achieving full system control. The benchmark spans 510 hosts across 25 services and 30 CVEs, with realistic environments that require autonomous reconnaissance, discrimination between benign and exploitable services, and robust exploit execution. Using this benchmark, we find that existing systems can hardly obtain system shells under realistic conditions. To address these challenges, we propose TermiAgent, a multi-agent penetration testing framework. TermiAgent mitigates long-context forgetting with a Located Memory Activation mechanism and builds a reliable exploit arsenal via structured code understanding rather than naive retrieval. In evaluations, our work outperforms state-of-the-art agents, exhibiting stronger penetration testing capability, reducing execution time and financial cost, and demonstrating practicality even on laptop-scale deployments. Our work delivers both the first open-source benchmark for real-world autonomous pentesting and a novel agent framework that establishes a milestone for AI-driven penetration testing.
中文摘要:本文提出了首个面向真实场景的渗透测试基准TermiBench,将评估目标从简单的夺旗转为完整系统控制,并开发了多智能体框架TermiAgent,通过定位记忆激活和结构化代码理解显著提升渗透能力,在性能和成本方面均优于现有方案。
English Summary: This paper introduces TermiBench, the first real-world benchmark for AI-driven penetration testing that shifts focus from simplified flag-finding to full system control, and proposes TermiAgent, a multi-agent framework that outperforms existing systems by mitigating long-context issues and enabling robust exploit execution.
Authors:Karan Patel, Yu-Zheng Lin, Gaurangi Raul, Bono Po-Jen Shih, Matthew W. Redondo, Banafsheh Saber Latibari, Jesus Pacheco, Soheil Salehi, Pratik Satam
Abstract:
This full paper describes an LLM-assisted instruction integrated with a virtual cybersecurity lab platform. The digital transformation of Fourth Industrial Revolution (4IR) systems is reshaping workforce needs, widening skill gaps, especially among older workers. With rising emphasis on robotics, automation, AI, and security, re-skilling and up-skilling are essential. Generative AI can help build this workforce by acting as an instructional assistant to support skill acquisition during experiential learning. We present a generative AI instructional assistant integrated into a prior experiential learning platform. The assistant employs a zero-shot OCR-LLM pipeline within the legacy Cybersecurity Labs-as-a-Service (CLaaS) platform (2015). Text is extracted from slide images using Tesseract OCR, then simplified instructions are generated via a general-purpose LLM, enabling real-time instructional support with minimal infrastructure. The system was evaluated in a live university course where student feedback (n=42) averaged 7.83/10, indicating strong perceived usefulness. A comparative study with multimodal LLMs that directly interpret slide images showed higher performance on visually dense slides, but the OCR-LLM pipeline provided comparable pedagogical value on text-centric slides with much lower computational overhead and cost. This work demonstrates that a lightweight, easily integrable pipeline can effectively extend legacy platforms with modern generative AI, offering scalable enhancements for student comprehension in technical education.
Authors:Yu-Zheng Lin, Sujan Ghimire, Abhiram Nandimandalam, Jonah Michael Camacho, Unnati Tripathi, Rony Macwan, Sicong Shao, Setareh Rafatirad, Rozhin Yasaei, Pratik Satam, Soheil Salehi
Abstract:
The rapid growth of hardware vulnerabilities has created an urgent need for systematic and scalable analysis methods. Unlike software flaws, which are often patchable post-deployment, hardware weaknesses remain embedded across product lifecycles, posing persistent risks to processors, embedded devices, and IoT platforms. Existing efforts such as the MITRE CWE Hardware List (2021) relied on expert-driven Delphi surveys, which lack statistical rigor and introduce subjective bias, while large-scale data-driven foundations for hardware weaknesses have been largely absent. In this work, we propose LLM-HyPZ, an LLM-assisted hybrid framework for zero-shot knowledge extraction and refinement from vulnerability corpora. Our approach integrates zero-shot LLM classification, contextualized embeddings, unsupervised clustering, and prompt-driven summarization to mine hardware-related CVEs at scale. Applying LLM-HyPZ to the 2021-2024 CVE corpus (114,836 entries), we identified 1,742 hardware-related vulnerabilities. We distilled them into five recurring themes, including privilege escalation via firmware and BIOS, memory corruption in mobile and IoT systems, and physical access exploits. Benchmarking across seven LLMs shows that LLaMA 3.3 70B achieves near-perfect classification accuracy (99.5%) on a curated validation set. Beyond methodological contributions, our framework directly supported the MITRE CWE Most Important Hardware Weaknesses (MIHW) 2025 update by narrowing the candidate search space. Specifically, our pipeline surfaced 411 of the 1,026 CVEs used for downstream MIHW analysis, thereby reducing expert workload and accelerating evidence gathering. These results establish LLM-HyPZ as the first data-driven, scalable approach for systematically discovering hardware vulnerabilities, thereby bridging the gap between expert knowledge and real-world vulnerability evidence.
Authors:Zhipeng Yin, Zichong Wang, Avash Palikhe, Zhen Liu, Jun Liu, Wenbin Zhang
Abstract:
Generative models have achieved impressive results in text to image tasks, significantly advancing visual content creation. However, this progress comes at a cost, as such models rely heavily on large-scale training data and may unintentionally replicate copyrighted elements, creating serious legal and ethical challenges for real-world deployment. To address these concerns, researchers have proposed various strategies to mitigate copyright risks, most of which are prompt based methods that filter or rewrite user inputs to prevent explicit infringement. While effective in handling obvious cases, these approaches often fall short in more subtle situations, where seemingly benign prompts can still lead to infringing outputs. To address these limitations, this paper introduces Assessing and Mitigating Copyright Risks (AMCR), a comprehensive framework which i) builds upon prompt-based strategies by systematically restructuring risky prompts into safe and non-sensitive forms, ii) detects partial infringements through attention-based similarity analysis, and iii) adaptively mitigates risks during generation to reduce copyright violations without compromising image quality. Extensive experiments validate the effectiveness of AMCR in revealing and mitigating latent copyright risks, offering practical insights and benchmarks for the safer deployment of generative models.
Authors:Mengyu Sun, Ziyuan Yang, Yongqiang Huang, Hui Yu, Yingyu Chen, Shuren Qi, Andrew Beng Jin Teoh, Yi Zhang
Abstract:
Artificial intelligence (AI) has demonstrated considerable potential in the realm of medical imaging. However, the development of high-performance AI models typically necessitates training on large-scale, centralized datasets. This approach is confronted with significant challenges due to strict patient privacy regulations and legal restrictions on data sharing and utilization. These limitations hinder the development of large-scale models in medical domains and impede continuous updates and training with new data. Federated Learning (FL), a privacy-preserving distributed training framework, offers a new solution by enabling collaborative model development across fragmented medical datasets. In this survey, we review FL's contributions at two stages of the full-stack medical analysis pipeline. First, in upstream tasks such as CT or MRI reconstruction, FL enables joint training of robust reconstruction networks on diverse, multi-institutional datasets, alleviating data scarcity while preserving confidentiality. Second, in downstream clinical tasks like tumor diagnosis and segmentation, FL supports continuous model updating by allowing local fine-tuning on new data without centralizing sensitive images. We comprehensively analyze FL implementations across the medical imaging pipeline, from physics-informed reconstruction networks to diagnostic AI systems, highlighting innovations that improve communication efficiency, align heterogeneous data, and ensure secure parameter aggregation. Meanwhile, this paper provides an outlook on future research directions, aiming to serve as a valuable reference for the field's development.
中文摘要:联邦学习通过在不集中敏感数据的情况下实现跨分散医疗数据集的协作模型开发,既保护患者隐私,又支持从上游图像重建到下游临床任务的全流程医疗分析。
English Summary: Federated Learning enables collaborative AI model development across decentralized medical datasets while preserving patient privacy, supporting both upstream image reconstruction and downstream clinical tasks without centralizing sensitive data.
Authors:Zihan Wang, Rui Zhang, Hongwei Li, Wenshu Fan, Wenbo Jiang, Qingchuan Zhao, Guowen Xu
Abstract:
Backdoor attacks pose a significant threat to Large Language Models (LLMs), where adversaries can embed hidden triggers to manipulate LLM's outputs. Most existing defense methods, primarily designed for classification tasks, are ineffective against the autoregressive nature and vast output space of LLMs, thereby suffering from poor performance and high latency. To address these limitations, we investigate the behavioral discrepancies between benign and backdoored LLMs in output space. We identify a critical phenomenon which we term sequence lock: a backdoored model generates the target sequence with abnormally high and consistent confidence compared to benign generation. Building on this insight, we propose ConfGuard, a lightweight and effective detection method that monitors a sliding window of token confidences to identify sequence lock. Extensive experiments demonstrate ConfGuard achieves a near 100\% true positive rate (TPR) and a negligible false positive rate (FPR) in the vast majority of cases. Crucially, the ConfGuard enables real-time detection almost without additional latency, making it a practical backdoor defense for real-world LLM deployments.
Chinese Summary: 后门攻击通过在大型语言模型中植入隐藏触发器构成威胁,而提出的ConfGuard方法通过监测令牌置信度来检测序列锁定现象,以近乎完美的检测率和极低延迟实现有效防御。
English Summary: Backdoor attacks threaten LLMs by embedding hidden triggers, but the proposed ConfGuard method effectively detects these attacks by monitoring token confidence for sequence lock, achieving near-perfect detection rates with minimal latency.
Authors:Yunhao Chen, Shujie Wang, Xin Wang, Xingjun Ma
Abstract:
Understanding the memorization and privacy leakage risks in Contrastive Language--Image Pretraining (CLIP) is critical for ensuring the security of multimodal models. Recent studies have demonstrated the feasibility of extracting sensitive training examples from diffusion models, with conditional diffusion models exhibiting a stronger tendency to memorize and leak information. In this work, we investigate data memorization and extraction risks in CLIP through the lens of CLIP inversion, a process that aims to reconstruct training images from text prompts. To this end, we introduce \textbf{LeakyCLIP}, a novel attack framework designed to achieve high-quality, semantically accurate image reconstruction from CLIP embeddings. We identify three key challenges in CLIP inversion: 1) non-robust features, 2) limited visual semantics in text embeddings, and 3) low reconstruction fidelity. To address these challenges, LeakyCLIP employs 1) adversarial fine-tuning to enhance optimization smoothness, 2) linear transformation-based embedding alignment, and 3) Stable Diffusion-based refinement to improve fidelity. Empirical results demonstrate the superiority of LeakyCLIP, achieving over 358% improvement in Structural Similarity Index Measure (SSIM) for ViT-B-16 compared to baseline methods on LAION-2B subset. Furthermore, we uncover a pervasive leakage risk, showing that training data membership can even be successfully inferred from the metrics of low-fidelity reconstructions. Our work introduces a practical method for CLIP inversion while offering novel insights into the nature and scope of privacy risks in multimodal models.
Authors:Guozhu Meng, Zhixiu Guo, Xiaodong Zhang, Haoyu Wang, Kai Chen, Yang Liu
Abstract:
It is well known that antivirus engines are vulnerable to evasion techniques (e.g., obfuscation) that transform malware into its variants. However, it cannot be necessarily attributed to the effectiveness of these evasions, and the limits of engines may also make this unsatisfactory result. In this study, we propose a data-driven approach to measure the effect of app transformations to malware detection, and further explain why the detection result is produced by these engines. First, we develop an interaction model for antivirus engines, illustrating how they respond with different detection results in terms of varying inputs. Six app transformation techniques are implemented in order to generate a large number of Android apps with traceable changes. Then we undertake a one-month tracking of app detection results from multiple antivirus engines, through which we obtain over 971K detection reports from VirusTotal for 179K apps in total. Last, we conduct a comprehensive analysis of antivirus engines based on these reports from the perspectives of signature-based, static analysis-based, and dynamic analysis-based detection techniques. The results, together with 7 highlighted findings, identify a number of sealed working mechanisms occurring inside antivirus engines and what are the indicators of compromise in apps during malware detection.
Chinese: 本研究提出了一种数据驱动的方法来评估应用转换对恶意软件检测的影响,并通过分析超过97.1万份报告揭示了杀毒引擎的内部工作机制。
English: This study introduces a data-driven method to assess how app transformations impact malware detection, revealing the internal mechanisms of antivirus engines through extensive analysis of over 971,000 reports.
Authors:Paul R. B. Houssel, Siamak Layeghy, Priyanka Singh, Marius Portmann
Abstract:
This paper introduces eX-NIDS, a framework designed to enhance interpretability in flow-based Network Intrusion Detection Systems (NIDS) by leveraging Large Language Models (LLMs). In our proposed framework, flows labelled as malicious by NIDS are initially processed through a module called the Prompt Augmenter. This module extracts contextual information and Cyber Threat Intelligence (CTI)-related knowledge from these flows. This enriched, context-specific data is then integrated with an input prompt for an LLM, enabling it to generate detailed explanations and interpretations of why the flow was identified as malicious by NIDS. We compare the generated interpretations against a Basic-Prompt Explainer baseline, which does not incorporate any contextual information into the LLM's input prompt. Our framework is quantitatively evaluated using the Llama 3 and GPT-4 models, employing a novel evaluation method tailored for natural language explanations, focusing on their correctness and consistency. The results demonstrate that augmented LLMs can produce accurate and consistent explanations, serving as valuable complementary tools in NIDS to explain the classification of malicious flows. The use of augmented prompts enhances performance by over 20% compared to the Basic-Prompt Explainer.
中文: 本文提出eX-NIDS框架,通过利用大语言模型增强基于流量的网络入侵检测系统的可解释性,结合上下文和网络威胁情报为恶意流量分类生成详细解释,相比基础提示方法性能提升超过20%。
English: This paper presents eX-NIDS, a framework that improves the interpretability of flow-based Network Intrusion Detection Systems by using Large Language Models to generate detailed explanations for malicious flow classifications, achieving over 20% better performance than basic prompts through enriched context and Cyber Threat Intelligence integration.
Authors:Gehao Zhang, Eugene Bagdasarian, Juan Zhai, Shiqing Ma
Abstract:
Distinguishing AI-generated code from human-written code is becoming crucial for tasks such as authorship attribution, content tracking, and misuse detection. Based on this, N-gram-based watermarking schemes have emerged as prominent, which inject secret watermarks to be detected during the generation.
However, their robustness in code content remains insufficiently evaluated. Most claims rely solely on defenses against simple code transformations or code optimizations as a simulation of attack, creating a questionable sense of robustness. In contrast, more sophisticated schemes already exist in the software engineering world, e.g., code obfuscation, which significantly alters code while preserving functionality. Although obfuscation is commonly used to protect intellectual property or evade software scanners, the robustness of code watermarking techniques against such transformations remains largely unexplored.
In this work, we formally model the code obfuscation and prove the impossibility of N-gram-based watermarking's robustness with only one intuitive and experimentally verified assumption, distribution consistency, satisfied. Given the original false positive rate of the watermarking detection, the ratio that the detector failed on the watermarked code after obfuscation will increase to 1 - fpr.
The experiments have been performed on three SOTA watermarking schemes, two LLMs, two programming languages, four code benchmarks, and four obfuscators. Among them, all watermarking detectors show coin-flipping detection abilities on obfuscated codes (AUROC tightly surrounds 0.5). Among all models, watermarking schemes, and datasets, both programming languages own obfuscators that can achieve attack effects with no detection AUROC higher than 0.6 after the attack. Based on the theoretical and practical observations, we also proposed a potential path of robust code watermarking.
This study demonstrates that N-gram-based watermarking schemes for AI-generated code are fundamentally vulnerable to code obfuscation attacks, with theoretical proofs and extensive experiments showing detection performance degrades to random chance levels across multiple programming languages and state-of-the-art watermarking methods.
English Summary:
Authors:Nelly Elsayed, Lily Dzamesi, Zag ElSayed, Murat Ozer
Abstract:
The Internet of Medical Things (IoMT) represents a paradigm shift in the healthcare sector, enabling the interconnection of medical devices, sensors, and systems to enhance patient monitoring, diagnosis, and management. The rapid evolution of IoMT presents significant benefits to the healthcare domains. However, there is a rapid increase in distributed denial of service (DDoS) attacks on the IoMT networks due to several vulnerabilities in the IoMT-connected devices, which negatively impact patients' health and can even lead to deaths. Thus, in this paper, we aim to save lives via investigating an extreme learning machine for detecting DDoS attacks on IoMT devices. The proposed approach achieves a high accuracy at a low implementation budget. Thus, it can reduce the implementation cost of the DDoS detection system, making the model capable of executing on the fog level.
中文: 医疗物联网(IoMT)通过设备互联提升医疗水平,但面临分布式拒绝服务攻击增多的威胁,本研究采用极限学习机开发低成本、高精度的攻击检测系统,以加强安全保障并挽救生命。
English: The Internet of Medical Things (IoMT) improves healthcare through interconnected devices but faces increasing DDoS attacks, prompting this study to develop a cost-effective, high-accuracy detection system using extreme learning machines to enhance security and save lives.
Authors:Lu Yan, Zhuo Zhang, Xiangzhe Xu, Shengwei An, Guangyu Shen, Zhou Xuan, Xuan Chen, Xiangyu Zhang
Abstract:
Large language models (LLMs) have democratized software development, reducing the expertise barrier for programming complex applications. This accessibility extends to malicious software development, raising significant security concerns. While LLM providers have implemented alignment mechanisms to prevent direct generation of overtly malicious code, these safeguards predominantly evaluate individual prompts in isolation, overlooking a critical vulnerability: malicious operations can be systematically decomposed into benign-appearing sub-tasks. In this paper, we introduce the Malware Generation Compiler (MGC), a novel framework that leverages this vulnerability through modular decomposition and alignment-evasive generation. MGC employs a specialized Malware Description Intermediate Representation (MDIR) to bridge high-level malicious intents and benign-appearing code snippets. Extensive evaluation demonstrates that our attack reliably generates functional malware across diverse task specifications and categories, outperforming jailbreaking methods by +365.79% and underground services by +78.07% in correctness on three benchmark datasets. Case studies further show that MGC can reproduce and even enhance 16 real-world malware samples. This work provides critical insights for security researchers by exposing the risks of compositional attacks against aligned AI systems. Demonstrations are available at https://sites.google.com/view/malware-generation-compiler.
中文摘要:恶意软件生成编译器(MGC)框架通过将恶意操作分解为看似良性的子任务,利用了大语言模型安全防护的关键漏洞,能够可靠生成功能性恶意软件,其正确性显著超越现有越狱方法和地下服务。
English Summary: The Malware Generation Compiler (MGC) framework exploits a critical vulnerability in LLM safeguards by systematically breaking down malicious operations into benign-looking sub-tasks, successfully generating functional malware that significantly outperforms existing methods.
Authors:Luis Burbano, Diego Ortiz, Qi Sun, Siwei Yang, Haoqin Tu, Cihang Xie, Yinzhi Cao, Alvaro A Cardenas
Abstract:
Embodied Artificial Intelligence (AI) promises to handle edge cases in robotic vehicle systems where data is scarce by using common-sense reasoning grounded in perception and action to generalize beyond training distributions and adapt to novel real-world situations. These capabilities, however, also create new security risks. In this paper, we introduce CHAI (Command Hijacking against embodied AI), a new class of prompt-based attacks that exploit the multimodal language interpretation abilities of Large Visual-Language Models (LVLMs). CHAI embeds deceptive natural language instructions, such as misleading signs, in visual input, systematically searches the token space, builds a dictionary of prompts, and guides an attacker model to generate Visual Attack Prompts. We evaluate CHAI on four LVLM agents; drone emergency landing, autonomous driving, and aerial object tracking, and on a real robotic vehicle. Our experiments show that CHAI consistently outperforms state-of-the-art attacks. By exploiting the semantic and multimodal reasoning strengths of next-generation embodied AI systems, CHAI underscores the urgent need for defenses that extend beyond traditional adversarial robustness.
Authors:Anshul Nasery, Edoardo Contente, Alkin Kaz, Pramod Viswanath, Sewoong Oh
Abstract:
Model fingerprinting has emerged as a promising paradigm for claiming model ownership. However, robustness evaluations of these schemes have mostly focused on benign perturbations such as incremental fine-tuning, model merging, and prompting. Lack of systematic investigations into {\em adversarial robustness} against a malicious model host leaves current systems vulnerable. To bridge this gap, we first define a concrete, practical threat model against model fingerprinting. We then take a critical look at existing model fingerprinting schemes to identify their fundamental vulnerabilities. Based on these, we develop adaptive adversarial attacks tailored for each vulnerability, and demonstrate that these can bypass model authentication completely for ten recently proposed fingerprinting schemes while maintaining high utility of the model for the end users. Our work encourages fingerprint designers to adopt adversarial robustness by design. We end with recommendations for future fingerprinting methods.
Authors:Kang Wei, Xin Yuan, Fushuo Huo, Chuan Ma, Long Yuan, Songze Li, Ming Ding, Dacheng Tao
Abstract:
Diffusion models (DMs) have been investigated in various domains due to their ability to generate high-quality data, thereby attracting significant attention. However, similar to traditional deep learning systems, there also exist potential threats to DMs. To provide advanced and comprehensive insights into safety, ethics, and trust in DMs, this survey comprehensively elucidates its framework, threats, and countermeasures. Each threat and its countermeasures are systematically examined and categorized to facilitate thorough analysis. Furthermore, we introduce specific examples of how DMs are used, what dangers they might bring, and ways to protect against these dangers. Finally, we discuss key lessons learned, highlight open challenges related to DM security, and outline prospective research directions in this critical field. This work aims to accelerate progress not only in the technical capabilities of generative artificial intelligence but also in the maturity and wisdom of its application.
Authors:Dincy R. Arikkat, Sneha B. T., Serena Nicolazzo, Antonino Nocera, Vinod P., Rafidha Rehiman K. A., Karthika R
Abstract:
Cyber Threat Intelligence (CTI) enables organizations to anticipate, detect, and mitigate evolving cyber threats. Its effectiveness depends on high-quality datasets, which support model development, training, evaluation, and benchmarking. Building such datasets is crucial, as attack vectors and adversary tactics continually evolve. Recently, Telegram has gained prominence as a valuable CTI source, offering timely and diverse threat-related information that can help address these challenges. In this work, we address these challenges by presenting an end-to-end automated pipeline that systematically collects and filters threat-related content from Telegram. The pipeline identifies relevant Telegram channels and scrapes 145,349 messages from 12 curated channels out of 150 identified sources. To accurately filter threat intelligence messages from generic content, we employ a BERT-based classifier, achieving an accuracy of 96.64%. From the filtered messages, we compile a dataset of 86,509 malicious Indicators of Compromise, including domains, IPs, URLs, hashes, and CVEs. This approach not only produces a large-scale, high-fidelity CTI dataset but also establishes a foundation for future research and operational applications in cyber threat detection.
Authors:Joshua Ward, Xiaofeng Lin, Chi-Hua Wang, Guang Cheng
Abstract:
Tabular Generative Models are often argued to preserve privacy by creating synthetic datasets that resemble training data. However, auditing their empirical privacy remains challenging, as commonly used similarity metrics fail to effectively characterize privacy risk. Membership Inference Attacks (MIAs) have recently emerged as a method for evaluating privacy leakage in synthetic data, but their practical effectiveness is limited. Numerous attacks exist across different threat models, each with distinct implementations targeting various sources of privacy leakage, making them difficult to apply consistently. Moreover, no single attack consistently outperforms the others, leading to a routine underestimation of privacy risk. To address these issues, we propose a unified, model-agnostic threat framework that deploys a collection of attacks to estimate the maximum empirical privacy leakage in synthetic datasets. We introduce Synth-MIA, an open-source Python library that streamlines this auditing process through a novel testbed that integrates seamlessly into existing synthetic data evaluation pipelines through a Scikit-Learn-like API. Our software implements 13 attack methods through a Scikit-Learn-like API, designed to enable fast systematic estimation of privacy leakage for practitioners as well as facilitate the development of new attacks and experiments for researchers. We demonstrate our framework's utility in the largest tabular synthesis privacy benchmark to date, revealing that higher synthetic data quality corresponds to greater privacy leakage, that similarity-based privacy metrics show weak correlation with MIA results, and that the differentially private generator PATEGAN can fail to preserve privacy under such attacks. This underscores the necessity of MIA-based auditing when designing and deploying Tabular Generative Models.
Authors:Onat Gungor, Roshan Sood, Harold Wang, Tajana Rosing
Abstract:
Large Language Models (LLMs) have recently demonstrated strong potential for cybersecurity question answering (QA), supporting decision-making in real-time threat detection and response workflows. However, their substantial computational demands pose significant challenges for deployment on resource-constrained edge devices. Quantization, a widely adopted model compression technique, can alleviate these constraints. Nevertheless, quantization may degrade model accuracy and increase susceptibility to adversarial attacks. Fine-tuning offers a potential means to mitigate these limitations, but its effectiveness when combined with quantization remains insufficiently explored. Hence, it is essential to understand the trade-offs among accuracy, efficiency, and robustness. We propose AQUA-LLM, an evaluation framework designed to benchmark several state-of-the-art small LLMs under four distinct configurations: base, quantized-only, fine-tuned, and fine-tuned combined with quantization, specifically for cybersecurity QA. Our results demonstrate that quantization alone yields the lowest accuracy and robustness despite improving efficiency. In contrast, combining quantization with fine-tuning enhances both LLM robustness and predictive performance, achieving an optimal balance of accuracy, robustness, and efficiency. These findings highlight the critical need for quantization-aware, robustness-preserving fine-tuning methodologies to enable the robust and efficient deployment of LLMs for cybersecurity QA.
Authors:Onat Gungor, Ishaan Kale, Jiasheng Zhou, Tajana Rosing
Abstract:
The expansion of edge computing has increased the attack surface, creating an urgent need for robust, real-time machine learning (ML)-based host intrusion detection systems (HIDS) that balance accuracy and efficiency. In such settings, inference latency poses a critical security risk, as delays may provide exploitable opportunities for attackers. However, many state-of-the-art ML-based HIDS solutions rely on computationally intensive architectures with high inference costs, limiting their practical deployment. This paper proposes LIGHT-HIDS, a lightweight machine learning framework that combines a compressed neural network feature extractor trained via Deep Support Vector Data Description (DeepSVDD) with an efficient novelty detection model. This hybrid approach enables the learning of compact, meaningful representations of normal system call behavior for accurate anomaly detection. Experimental results on multiple datasets demonstrate that LIGHT-HIDS consistently enhances detection accuracy while reducing inference time by up to 75x compared to state-of-the-art methods. These findings highlight its effectiveness and scalability as a machine learning-based solution for real-time host intrusion detection.
Authors:Jing Chen, Onat Gungor, Zhengli Shang, Tajana Rosing
Abstract:
The rapid proliferation of the Internet of Things (IoT) continues to expose critical security vulnerabilities, necessitating the development of efficient and robust intrusion detection systems (IDS). Machine learning-based intrusion detection systems (ML-IDS) have significantly improved threat detection capabilities; however, they remain highly susceptible to adversarial attacks. While numerous defense mechanisms have been proposed to enhance ML-IDS resilience, a systematic approach for selecting the most effective defense against a specific adversarial attack remains absent. To address this challenge, we previously proposed DYNAMITE, a dynamic defense selection approach that identifies the most suitable defense against adversarial attacks through an ML-driven selection mechanism. Building on this foundation, we propose SAGE (Sample-Aware Guarding Engine), a substantially improved defense algorithm that integrates active learning with targeted data reduction. It employs an active learning mechanism to selectively identify the most informative input samples and their corresponding optimal defense labels, which are then used to train a second-level learner responsible for selecting the most effective defense. This targeted sampling improves computational efficiency, exposes the model to diverse adversarial strategies during training, and enhances robustness, stability, and generalizability. As a result, SAGE demonstrates strong predictive performance across multiple intrusion detection datasets, achieving an average F1-score improvement of 201% over the state-of-the-art defenses. Notably, SAGE narrows the performance gap to the Oracle to just 3.8%, while reducing computational overhead by up to 29x.
Authors:Joshua Ward, Yuxuan Yang, Chi-Hua Wang, Guang Cheng
Abstract:
Membership Inference Attacks (MIAs) have emerged as a principled framework for auditing the privacy of synthetic data generated by tabular generative models, where many diverse methods have been proposed that each exploit different privacy leakage signals. However, in realistic threat scenarios, an adversary must choose a single method without a priori guarantee that it will be the empirically highest performing option. We study this challenge as a decision theoretic problem under uncertainty and conduct the largest synthetic data privacy benchmark to date. Here, we find that no MIA constitutes a strictly dominant strategy across a wide variety of model architectures and dataset domains under our threat model. Motivated by these findings, we propose ensemble MIAs and show that unsupervised ensembles built on individual attacks offer empirically more robust, regret-minimizing strategies than individual attacks.
Authors:Wenjie Qu, Yuguang Zhou, Bo Wang, Wengrui Zheng, Yuexin Li, Jinyuan Jia, Jiaheng Zhang
Abstract:
The rapid development of Large Language Models (LLMs) for code generation has transformed software development by automating coding tasks with unprecedented efficiency.
However, the training of these models on open-source code repositories (e.g., from GitHub) raises critical ethical and legal concerns, particularly regarding data authorization and open-source license compliance. Developers are increasingly questioning whether model trainers have obtained proper authorization before using repositories for training, especially given the lack of transparency in data collection.
To address these concerns, we propose a novel data marking framework RepoMark to audit the data usage of code LLMs. Our method enables repository owners to verify whether their code has been used in training, while ensuring semantic preservation, imperceptibility, and theoretical false detection rate (FDR) guarantees. By generating multiple semantically equivalent code variants, RepoMark introduces data marks into the code files, and during detection, RepoMark leverages a novel ranking-based hypothesis test to detect memorization within the model. Compared to prior data auditing approaches, RepoMark significantly enhances sample efficiency, allowing effective auditing even when the user's repository possesses only a small number of code files.
Experiments demonstrate that RepoMark achieves a detection success rate over 90\% on small code repositories under a strict FDR guarantee of 5\%. This represents a significant advancement over existing data marking techniques, all of which only achieve accuracy below 55\% under identical settings. This further validates RepoMark as a robust, theoretically sound, and promising solution for enhancing transparency in code LLM training, which can safeguard the rights of repository owners.
Authors:Cagla Ipek Kocal, Onat Gungor, Tajana Rosing, Baris Aksanli
Abstract:
Minimizing computational overhead in time-series classification, particularly in deep learning models, presents a significant challenge due to the high complexity of model architectures and the large volume of sequential data that must be processed in real time. This challenge is further compounded by adversarial attacks, emphasizing the need for resilient methods that ensure robust performance and efficient model selection. To address this challenge, we propose ReLATE+, a comprehensive framework that detects and classifies adversarial attacks, adaptively selects deep learning models based on dataset-level similarity, and thus substantially reduces retraining costs relative to conventional methods that do not leverage prior knowledge, while maintaining strong performance. ReLATE+ first checks whether the incoming data is adversarial and, if so, classifies the attack type, using this insight to identify a similar dataset from a repository and enable the reuse of the best-performing associated model. This approach ensures strong performance while reducing the need for retraining, and it generalizes well across different domains with varying data distributions and feature spaces. Experiments show that ReLATE+ reduces computational overhead by an average of 77.68%, enhancing adversarial resilience and streamlining robust model selection, all without sacrificing performance, within 2.02% of Oracle.
中文: ReLATE+框架通过检测对抗攻击并自适应选择深度学习模型,在保持强大性能的同时将计算开销平均降低77.68%,适用于不同领域的数据分布。
English: ReLATE+ is a framework that detects adversarial attacks and adaptively selects deep learning models to reduce computational overhead by 77.68% while maintaining robust performance across diverse domains.
Authors:Elvin Li, Onat Gungor, Zhengli Shang, Tajana Rosing
Abstract:
The Internet of Things (IoT), with its high degree of interconnectivity and limited computational resources, is particularly vulnerable to a wide range of cyber threats. Intrusion detection systems (IDS) have been extensively studied to enhance IoT security, and machine learning-based IDS (ML-IDS) show considerable promise for detecting malicious activity. However, their effectiveness is often constrained by poor adaptability to emerging threats and the issue of catastrophic forgetting during continuous learning. To address these challenges, we propose CITADEL, a self-supervised continual learning framework designed to extract robust representations from benign data while preserving long-term knowledge through optimized memory consolidation mechanisms. CITADEL integrates a tabular-to-image transformation module, a memory-aware masked autoencoder for self-supervised representation learning, and a novelty detection component capable of identifying anomalies without dependence on labeled attack data. Our design enables the system to incrementally adapt to emerging behaviors while retaining its ability to detect previously observed threats. Experiments on multiple intrusion datasets demonstrate that CITADEL achieves up to a 72.9% improvement over the VAE-based lifelong anomaly detector (VLAD) in key detection and retention metrics, highlighting its effectiveness in dynamic IoT environments.
中文: 针对物联网面临的不断演变的网络威胁,CITADEL框架通过自监督持续学习,利用优化的记忆巩固机制提取鲁棒数据表示,在动态环境中关键检测指标上较现有方法提升高达72.9%。
English: The IoT's vulnerability to evolving cyber threats is addressed by CITADEL, a self-supervised continual learning framework that enhances intrusion detection through robust data representation and memory consolidation, achieving up to 72.9% improvement over existing methods in dynamic environments.
Authors:Xiaojin Zhang, Mingcong Xu, Yiming Li, Wei Chen, Qiang Yang
Abstract:
Federated learning (FL) offers a promising paradigm for collaborative model training while preserving data privacy. However, its susceptibility to gradient inversion attacks poses a significant challenge, necessitating robust privacy protection mechanisms. This paper introduces a novel theoretical framework to decipher the intricate interplay between attack and protection complexities in privacy-preserving FL. We formally define "Attack Complexity" as the minimum computational and data resources an adversary requires to reconstruct private data below a given error threshold, and "Protection Complexity" as the expected distortion introduced by privacy mechanisms. Leveraging Maximum Bayesian Privacy (MBP), we derive tight theoretical bounds for protection complexity, demonstrating its scaling with model dimensionality and privacy budget. Furthermore, we establish comprehensive bounds for attack complexity, revealing its dependence on privacy leakage, gradient distortion, model dimension, and the chosen privacy level. Our findings quantitatively illuminate the fundamental trade-offs between privacy guarantees, system utility, and the effort required for both attacking and defending. This framework provides critical insights for designing more secure and efficient federated learning systems.
中文摘要:本文提出了一个理论框架来分析联邦学习中攻击与防护复杂性之间的权衡,通过建立严格的理论界限揭示了隐私保障、系统效用和攻击防御努力之间的内在关联。
English Summary: This paper presents a theoretical framework analyzing the trade-offs between attack and protection complexities in federated learning, establishing formal bounds that reveal how privacy guarantees, system utility, and adversarial efforts are fundamentally interconnected.
Authors:Chao Liu, Zhezheng Zhu, Hao Chen, Zhe Chen, Kaiwen Guo, Penghao Wang, Jun Luo
Abstract:
As a versatile AI application, voice assistants (VAs) have become increasingly popular, but are vulnerable to security threats. Attackers have proposed various inaudible attacks, but are limited by cost, distance, or LoS. Therefore, we propose \name~Attack, a long-range, cross-barrier, and interference-free inaudible voice attack via solid channels. We begin by thoroughly analyzing the dispersion effect in solid channels, revealing its unique impact on signal propagation. To avoid distortions in voice commands, we design a modular command generation model that parameterizes attack distance, victim audio, and medium dispersion features to adapt to variations in the solid-channel state. Additionally, we propose SUAD Defense, a universal defense that uses ultrasonic perturbation signals to block inaudible voice attacks (IVAs) without impacting normal speech. Since the attack can occur at arbitrary frequencies and times, we propose a training method that randomizes both time and frequency to generate perturbation signals that break ultrasonic commands. Notably, the perturbation signal is modulated to an inaudible frequency without affecting the functionality of voice commands for VAs. Experiments on six smartphones have shown that SUAD Attack achieves activation success rates above 89.8% and SUAD Defense blocks IVAs with success rates exceeding 98%.
中文摘要:\name~攻击是一种利用固体介质实现跨障碍、远距离的无声语音攻击,而SUAD防御则通过随机化超声扰动信号在不影响正常语音功能的前提下,高效阻断此类攻击。
English Summary: The \name~Attack is a long-range, cross-barrier inaudible voice attack exploiting solid channels, while SUAD Defense uses randomized ultrasonic perturbations to effectively block such attacks without disrupting normal voice assistant functions.
Authors:Yue Li, Xiao Li, Hao Wu, Yue Zhang, Fengyuan Xu, Xiuzhen Cheng, Sheng Zhong
Abstract:
Large Language Models (LLMs) have become integral to automated code analysis, enabling tasks such as vulnerability detection and code comprehension. However, their integration introduces novel attack surfaces. In this paper, we identify and investigate a new class of prompt-based attacks, termed Copy-Guided Attacks (CGA), which exploit the inherent copying tendencies of reasoning-capable LLMs. By injecting carefully crafted triggers into external code snippets, adversaries can induce the model to replicate malicious content during inference. This behavior enables two classes of vulnerabilities: inference length manipulation, where the model generates abnormally short or excessively long reasoning traces; and inference result manipulation, where the model produces misleading or incorrect conclusions. We formalize CGA as an optimization problem and propose a gradient-based approach to synthesize effective triggers. Empirical evaluation on state-of-the-art reasoning LLMs shows that CGA reliably induces infinite loops, premature termination, false refusals, and semantic distortions in code analysis tasks. While highly effective in targeted settings, we observe challenges in generalizing CGA across diverse prompts due to computational constraints, posing an open question for future research. Our findings expose a critical yet underexplored vulnerability in LLM-powered development pipelines and call for urgent advances in prompt-level defense mechanisms.
Authors:Yao Tong, Haonan Wang, Siquan Li, Kenji Kawaguchi, Tianyang Hu
Abstract:
Fingerprinting Large Language Models (LLMs) is essential for provenance verification and model attribution. Existing methods typically extract post-hoc signatures based on training dynamics, data exposure, or hyperparameters -- properties that only emerge after training begins. In contrast, we propose a stronger and more intrinsic notion of LLM fingerprinting: SeedPrints, a method that leverages random initialization biases as persistent, seed-dependent identifiers present even before training. We show that untrained models exhibit reproducible token selection biases conditioned solely on their parameters at initialization. These biases are stable and measurable throughout training, enabling our statistical detection method to recover a model's lineage with high confidence. Unlike prior techniques, unreliable before convergence and vulnerable to distribution shifts, SeedPrints remains effective across all training stages and robust under domain shifts or parameter modifications. Experiments on LLaMA-style and Qwen-style models show that SeedPrints achieves seed-level distinguishability and can provide birth-to-lifecycle identity verification akin to a biometric fingerprint. Evaluations on large-scale pretrained models and fingerprinting benchmarks further confirm its effectiveness under practical deployment scenarios. These results suggest that initialization itself imprints a unique and persistent identity on neural language models, forming a true ''Galtonian'' fingerprint.
Authors:Zhaoxi Zhang, Xiaomei Zhang, Yanjun Zhang, He Zhang, Shirui Pan, Bo Liu, Asif Qumer Gill, Leo Yu Zhang
Abstract:
Large Language Model (LLM) watermarking embeds detectable signals into generated text for copyright protection, misuse prevention, and content detection. While prior studies evaluate robustness using watermark removal attacks, these methods are often suboptimal, creating the misconception that effective removal requires large perturbations or powerful adversaries. To bridge the gap, we first formalize the system model for LLM watermark, and characterize two realistic threat models constrained on limited access to the watermark detector. We then analyze how different types of perturbation vary in their attack range, i.e., the number of tokens they can affect with a single edit. We observe that character-level perturbations (e.g., typos, swaps, deletions, homoglyphs) can influence multiple tokens simultaneously by disrupting the tokenization process. We demonstrate that character-level perturbations are significantly more effective for watermark removal under the most restrictive threat model. We further propose guided removal attacks based on the Genetic Algorithm (GA) that uses a reference detector for optimization. Under a practical threat model with limited black-box queries to the watermark detector, our method demonstrates strong removal performance. Experiments confirm the superiority of character-level perturbations and the effectiveness of the GA in removing watermarks under realistic constraints. Additionally, we argue there is an adversarial dilemma when considering potential defenses: any fixed defense can be bypassed by a suitable perturbation strategy. Motivated by this principle, we propose an adaptive compound character-level attack. Experimental results show that this approach can effectively defeat the defenses. Our findings highlight significant vulnerabilities in existing LLM watermark schemes and underline the urgency for the development of new robust mechanisms.
Authors:Hongfei Xia, Hongru Wang, Zeming Liu, Qian Yu, Yuhang Guo, Haifeng Wang
Abstract:
Large Language Models (LLMs) have exhibited great performance in autonomously calling various tools in external environments, leading to better problem solving and task automation capabilities. However, these external tools also amplify potential risks such as financial loss or privacy leakage with ambiguous or malicious user instructions. Compared to previous studies, which mainly assess the safety awareness of LLMs after obtaining the tool execution results (i.e., retrospective evaluation), this paper focuses on prospective ways to assess the safety of LLM tool utilization, aiming to avoid irreversible harm caused by directly executing tools. To this end, we propose SafeToolBench, the first benchmark to comprehensively assess tool utilization security in a prospective manner, covering malicious user instructions and diverse practical toolsets. Additionally, we propose a novel framework, SafeInstructTool, which aims to enhance LLMs' awareness of tool utilization security from three perspectives (i.e., \textit{User Instruction, Tool Itself, and Joint Instruction-Tool}), leading to nine detailed dimensions in total. We experiment with four LLMs using different methods, revealing that existing approaches fail to capture all risks in tool utilization. In contrast, our framework significantly enhances LLMs' self-awareness, enabling a more safe and trustworthy tool utilization.
Authors:Hexuan Yu, Md Mohaimin Al Barat, Yang Xiao, Y. Thomas Hou, Wenjing Lou
Abstract:
Open Radio Access Network (Open RAN) is reshaping mobile network architecture by promoting openness, disaggregation, and cross-vendor interoperability. However, this architectural flexibility introduces new security challenges, especially in deployments where multiple mobile network operators (MNOs) jointly operate shared components. Existing Zero Trust Architectures (ZTA) in O-RAN, as defined by governmental and industry standards, implicitly assume that authenticated components will comply with operational policies. However, this assumption creates a critical blind spot: misconfigured or compromised components can silently violate policies, misuse resources, or corrupt downstream processes (e.g., ML-based RIC xApps). To address this critical gap, we propose a monitoring framework for low-trust O-RAN environments that proactively verifies configuration state and control behavior against tenant-defined policies. Our system provides scalable, verifiable oversight to enhance transparency and trust in O-RAN operations. We implement and evaluate the framework using standardized O-RAN configurations, with total processing latency of approximately 200 ms, demonstrating its efficiency and practicality for timely policy enforcement and compliance auditing in multi-MNO deployments.
Authors:Chaoyu Zhang, Heng Jin, Shanghao Shi, Hexuan Yu, Sydney Johns, Y. Thomas Hou, Wenjing Lou
Abstract:
Federated Learning (FL) has gained significant attention for its privacy-preserving capabilities, enabling distributed devices to collaboratively train a global model without sharing raw data. However, its distributed nature forces the central server to blindly trust the local training process and aggregate uncertain model updates, making it susceptible to Byzantine attacks from malicious participants, especially in mission-critical scenarios. Detecting such attacks is challenging due to the diverse knowledge across clients, where variations in model updates may stem from benign factors, such as non-IID data, rather than adversarial behavior. Existing data-driven defenses struggle to distinguish malicious updates from natural variations, leading to high false positive rates and poor filtering performance. To address this challenge, we propose Sentinel, a remote attestation (RA)-based scheme for FL systems that regains client-side transparency and mitigates Byzantine attacks from a system security perspective. Our system employs code instrumentation to track control-flow and monitor critical variables in the local training process. Additionally, we utilize a trusted training recorder within a Trusted Execution Environment (TEE) to generate an attestation report, which is cryptographically signed and securely transmitted to the server. Upon verification, the server ensures that legitimate client training processes remain free from program behavior violation or data manipulation, allowing only trusted model updates to be aggregated into the global model. Experimental results on IoT devices demonstrate that Sentinel ensures the trustworthiness of the local training integrity with low runtime and memory overhead.
Authors:Sasan Razmkhah, Mingye Li, Zeming Cheng, Robert S. Aviles, Kyle Jackman, Joey Delport, Lieze Schindler, Wenhui Luo, Takuya Suzuki, Mehdi Kamal, Christopher L. Ayala, Coenrad J. Fourie, Nabuyuki Yoshikawa, Peter A. Beerel, Sandeep Gupta, Massoud Pedram
Abstract:
This research explores the use of superconductor electronics (SCE) for accelerating fully homomorphic encryption (FHE), focusing on the Number-Theoretic Transform (NTT), a key computational bottleneck in FHE schemes. We present SCE-NTT, a dedicated hardware accelerator based on superconductive single flux quantum (SFQ) logic and memory, targeting high performance and energy efficiency beyond the limits of conventional CMOS. To address SFQ constraints such as limited dense RAM and restricted fanin/fanout, we propose a deeply pipelined NTT-128 architecture using shift register memory (SRM). Designed for N=128 32-bit coefficients, NTT-128 comprises log2(N)=7 processing elements (PEs), each featuring a butterfly unit (BU), dual coefficient memories operating in ping-pong mode via FIFO-based SRM queues, and twiddle factor buffers. The BU integrates a Shoup modular multiplier optimized for a small area, leveraging precomputed twiddle factors. A new RSFQ cell library with over 50 parameterized cells, including compound logic units, was developed for implementation. Functional and timing correctness were validated using JoSIM analog simulations and Verilog models. A multiphase clocking scheme was employed to enhance robustness and reduce path-balancing overhead, improving circuit reliability. Fabricated results show the NTT-128 unit achieves 531 million NTT/sec at 34 GHz, over 100x faster than state-of-the-art CMOS equivalents. We also project that the architecture can scale to larger sizes, such as a 2^14-point NTT in approximately 482 ns. Key-switch throughput is estimated at 1.63 million operations/sec, significantly exceeding existing hardware. These results demonstrate the strong potential of SCE-based accelerators for scalable, energy-efficient secure computation in the post-quantum era, with further gains anticipated through advances in fabrication.
本研究提出了一种基于超导电子学的加速器SCE-NTT,通过解决数论变换的计算瓶颈,显著提升了全同态加密的速度和能效,其性能比现有CMOS技术快100倍以上。
This study introduces a superconductor electronics-based accelerator, SCE-NTT, which significantly enhances the speed and energy efficiency of fully homomorphic encryption by overcoming the computational bottleneck of the Number-Theoretic Transform, achieving over 100 times faster performance than current CMOS technology.
Authors:Dan Lin, Shunfeng Lu, Ziyan Liu, Jiajing Wu, Junyuan Fang, Kaixin Lin, Bowen Song, Zibin Zheng
Abstract:
Cross-chain bridges play a vital role in enabling blockchain interoperability. However, due to the inherent design flaws and the enormous value they hold, they have become prime targets for hacker attacks. Existing detection methods show progress yet remain limited, as they mainly address single-chain behaviors and fail to capture cross-chain semantics. To address this gap, we leverage heterogeneous graph attention networks, which are well-suited for modeling multi-typed entities and relations, to capture the complex execution semantics of cross-chain behaviors. We propose BridgeShield, a detection framework that jointly models the source chain, off-chain coordination, and destination chain within a unified heterogeneous graph representation. BridgeShield incorporates intra-meta-path attention to learn fine-grained dependencies within cross-chain paths and inter-meta-path attention to highlight discriminative cross-chain patterns, thereby enabling precise identification of attack behaviors. Extensive experiments on 51 real-world cross-chain attack events demonstrate that BridgeShield achieves an average F1-score of 92.58%, representing a 24.39% improvement over state-of-the-art baselines. These results validate the effectiveness of BridgeShield as a practical solution for securing cross-chain bridges and enhancing the resilience of multi-chain ecosystems.
中文: 跨链桥对区块链互操作性至关重要但易受攻击,为此提出的BridgeShield框架采用异构图注意力网络,通过建模跨链语义有效检测威胁,在测试中达到92.58%的F1分数。
English: Cross-chain bridges are critical for blockchain interoperability but vulnerable to attacks, prompting the development of BridgeShield, a framework using heterogeneous graph attention networks to effectively detect threats by modeling cross-chain semantics and achieving a 92.58% F1-score in tests.
Authors:Shayesta Naziri, Xu Wang, Guangsheng Yu, Christy Jie Liang, Wei Ni
Abstract:
The increasing deployment of Unmanned Aerial Vehicles (UAVs) for military, commercial, and logistics applications has raised significant concerns regarding flight path privacy. Conventional UAV communication systems often expose flight path data to third parties, making them vulnerable to tracking, surveillance, and location inference attacks. Existing encryption techniques provide security but fail to ensure complete privacy, as adversaries can still infer movement patterns through metadata analysis. To address these challenges, we propose a zk-SNARK(Zero-Knowledge Succinct Non-Interactive Argument of Knowledge)-based privacy-preserving flight path authentication and verification framework. Our approach ensures that a UAV can prove its authorisation, validate its flight path with a control centre, and comply with regulatory constraints without revealing any sensitive trajectory information. By leveraging zk-SNARKs, the UAV can generate cryptographic proofs that verify compliance with predefined flight policies while keeping the exact path and location undisclosed. This method mitigates risks associated with real-time tracking, identity exposure, and unauthorised interception, thereby enhancing UAV operational security in adversarial environments. Our proposed solution balances privacy, security, and computational efficiency, making it suitable for resource-constrained UAVs in both civilian and military applications.
中文: 本文提出了一种基于zk-SNARK的隐私保护框架,使无人机能在不泄露飞行轨迹的情况下向控制中心验证授权与合规性,从而有效防范追踪与位置推断攻击,提升操作安全性。
English: This paper introduces a zk-SNARK-based framework that enables UAVs to authenticate and verify flight paths with control centers while keeping trajectory data confidential, effectively enhancing privacy and security against tracking and inference attacks.
Authors:Yangyang Guo, Yangyan Li, Mohan Kankanhalli
Abstract:
In this study, we disclose a worrying new vulnerability in Large Language Models (LLMs), which we term \textbf{involuntary jailbreak}. Unlike existing jailbreak attacks, this weakness is distinct in that it does not involve a specific attack objective, such as generating instructions for \textit{building a bomb}. Prior attack methods predominantly target localized components of the LLM guardrail. In contrast, involuntary jailbreaks may potentially compromise the entire guardrail structure, which our method reveals to be surprisingly fragile. We merely employ a single universal prompt to achieve this goal. In particular, we instruct LLMs to generate several questions that would typically be rejected, along with their corresponding in-depth responses (rather than a refusal). Remarkably, this simple prompt strategy consistently jailbreaks the majority of leading LLMs, including Claude Opus 4.1, Grok 4, Gemini 2.5 Pro, and GPT 4.1. We hope this problem can motivate researchers and practitioners to re-evaluate the robustness of LLM guardrails and contribute to stronger safety alignment in future.
中文摘要:本研究揭示了大语言模型中一种新型“非自愿越狱”漏洞,仅需单一通用提示即可系统性地绕过安全防护,迫使主流模型生成被禁内容而非拒绝回答。
English Summary: This study reveals a novel "involuntary jailbreak" vulnerability in LLMs, where a single universal prompt can systematically bypass safety guardrails across major models by compelling them to generate prohibited content instead of refusals.
Authors:Xuyang Guo, Zekai Huang, Zhao Song, Jiahao Zhang
Abstract:
Large Language Models (LLMs) have recently demonstrated strong emergent abilities in complex reasoning and zero-shot generalization, showing unprecedented potential for LLM-as-a-judge applications in education, peer review, and data quality evaluation. However, their robustness under prompt injection attacks, where malicious instructions are embedded into the content to manipulate outputs, remains a significant concern. In this work, we explore a frustratingly simple yet effective attack setting to test whether LLMs can be easily misled. Specifically, we evaluate LLMs on basic arithmetic questions (e.g., "What is 3 + 2?") presented as either multiple-choice or true-false judgment problems within PDF files, where hidden prompts are injected into the file. Our results reveal that LLMs are indeed vulnerable to such hidden prompt injection attacks, even in these trivial scenarios, highlighting serious robustness risks for LLM-as-a-judge applications.
Authors:Xiaojie Lin, Baihe Ma, Xu Wang, Guangsheng Yu, Ying He, Wei Ni, Ren Ping Liu
Abstract:
Driving trajectory data remains vulnerable to privacy breaches despite existing mitigation measures. Traditional methods for detecting driving trajectories typically rely on map-matching the path using Global Positioning System (GPS) data, which is susceptible to GPS data outage. This paper introduces CAN-Trace, a novel privacy attack mechanism that leverages Controller Area Network (CAN) messages to uncover driving trajectories, posing a significant risk to drivers' long-term privacy. A new trajectory reconstruction algorithm is proposed to transform the CAN messages, specifically vehicle speed and accelerator pedal position, into weighted graphs accommodating various driving statuses. CAN-Trace identifies driving trajectories using graph-matching algorithms applied to the created graphs in comparison to road networks. We also design a new metric to evaluate matched candidates, which allows for potential data gaps and matching inaccuracies. Empirical validation under various real-world conditions, encompassing different vehicles and driving regions, demonstrates the efficacy of CAN-Trace: it achieves an attack success rate of up to 90.59% in the urban region, and 99.41% in the suburban region.
Chinese: 本文提出CAN-Trace,一种利用CAN总线数据重构行驶轨迹的新型隐私攻击方法,即使在数据存在缺失的情况下,仍能在城郊和市区实现高达99.41%和90.59%的攻击成功率。
English: This paper presents CAN-Trace, a novel privacy attack that reconstructs driving trajectories from CAN bus data, achieving high success rates in both urban and suburban areas despite potential data gaps.
Authors:Heng Guo, Kun Tian, Fengxia Liu, Zhiyong Zheng
Abstract:
Homomorphic ring signature schemes combine the strong anonymity of ring signatures with the computability of homomorphic signatures, demonstrating significant potential in scenarios requiring both anonymous data provenance and verifiable homomorphic computation (e.g., confidential blockchain transactions and secure multi-party computation). However, no feasible homomorphic ring signature scheme currently exists.
In this work, we propose the first lattice-based linearly homomorphic ring signature scheme. Proven secure in the standard model under the small integer solution (SIS) assumption, our scheme achieves strong anonymity under full key exposure and unforgeability against insider corruption attacks. As the first unified framework for ring signatures and linear homomorphic signatures, this construction provides a post-quantum-secure solution for the aforementioned applications, advancing the development of privacy-enhanced homomorphic computation.
Authors:Maryam Ataei Nezhad, Hamid Barati, Ali Barati
Abstract:
Internet of Things means connecting different devices through the Internet. The Internet of things enables humans to remotely manage and control the objects they use with the Internet infrastructure. After the advent of the Internet of Things in homes, organizations, and private companies, privacy and information security are the biggest concern. This issue has challenged the spread of the Internet of things as news of the users theft of information by hackers intensified. The proposed method in this paper consists of three phases. In the first phase, a star structure is constructed within each cluster, and a unique key is shared between each child and parent to encrypt and secure subsequent communications. The second phase is for intracluster communications, in which members of the cluster send their data to the cluster head in a multi hop manner. Also, in this phase, the data is encrypted with different keys in each hop, and at the end of each connection, the keys are updated to ensure data security. The third phase is to improve the security of inter cluster communications using an authentication protocol. In this way, the cluster heads are authenticated before sending information to prevent malicious nodes in the network. The proposed method is also simulated using NS2 software. The results showed that the proposed method has improved in terms of energy consumption, end-to-end delay, flexibility, packet delivery rate, and the number of alive nodes compared to other methods.
Authors:Ali Naseh, Anshuman Suri, Yuefeng Peng, Harsh Chaudhari, Alina Oprea, Amir Houmansadr
Abstract:
Generative AI leaderboards are central to evaluating model capabilities, but remain vulnerable to manipulation. Among key adversarial objectives is rank manipulation, where an attacker must first deanonymize the models behind displayed outputs -- a threat previously demonstrated and explored for large language models (LLMs). We show that this problem can be even more severe for text-to-image leaderboards, where deanonymization is markedly easier. Using over 150,000 generated images from 280 prompts and 19 diverse models spanning multiple organizations, architectures, and sizes, we demonstrate that simple real-time classification in CLIP embedding space identifies the generating model with high accuracy, even without prompt control or historical data. We further introduce a prompt-level separability metric and identify prompts that enable near-perfect deanonymization. Our results indicate that rank manipulation in text-to-image leaderboards is easier than previously recognized, underscoring the need for stronger defenses.
Authors:Sara Mandelli, Diego Vila-Portela, David Vázquez-Padín, Paolo Bestagini, Fernando Pérez-González
Abstract:
Over the years, the forensics community has proposed several deep learning-based detectors to mitigate the risks of generative AI. Recently, frequency-domain artifacts (particularly periodic peaks in the magnitude spectrum), have received significant attention, as they have been often considered a strong indicator of synthetic image generation. However, state-of-the-art detectors are typically used as black-boxes, and it still remains unclear whether they truly rely on these peaks. This limits their interpretability and trust. In this work, we conduct a systematic study to address this question. We propose a strategy to remove spectral peaks from images and analyze the impact of this operation on several detectors. In addition, we introduce a simple linear detector that relies exclusively on frequency peaks, providing a fully interpretable baseline free from the confounding influence of deep learning. Our findings reveal that most detectors are not fundamentally dependent on spectral peaks, challenging a widespread assumption in the field and paving the way for more transparent and reliable forensic tools.
Authors:Lesly Miculicich, Mihir Parmar, Hamid Palangi, Krishnamurthy Dj Dvijotham, Mirko Montanari, Tomas Pfister, Long T. Le
Abstract:
The deployment of autonomous AI agents in sensitive domains, such as healthcare, introduces critical risks to safety, security, and privacy. These agents may deviate from user objectives, violate data handling policies, or be compromised by adversarial attacks. Mitigating these dangers necessitates a mechanism to formally guarantee that an agent's actions adhere to predefined safety constraints, a challenge that existing systems do not fully address. We introduce VeriGuard, a novel framework that provides formal safety guarantees for LLM-based agents through a dual-stage architecture designed for robust and verifiable correctness. The initial offline stage involves a comprehensive validation process. It begins by clarifying user intent to establish precise safety specifications. VeriGuard then synthesizes a behavioral policy and subjects it to both testing and formal verification to prove its compliance with these specifications. This iterative process refines the policy until it is deemed correct. Subsequently, the second stage provides online action monitoring, where VeriGuard operates as a runtime monitor to validate each proposed agent action against the pre-verified policy before execution. This separation of the exhaustive offline validation from the lightweight online monitoring allows formal guarantees to be practically applied, providing a robust safeguard that substantially improves the trustworthiness of LLM agents.
Authors:Pouriya Alimoradi, Ali Barati, Hamid Barati
Abstract:
The rapid advancement of Artificial Intelligence has enabled the development of cruise missiles endowed with high levels of autonomy, adaptability, and precision. These AI driven missiles integrating deep learning algorithms, real time data processing, and advanced guidance systems pose critical threats to strategic infrastructures, especially under complex geographic and climatic conditions such as those found in Irans Khuzestan Province. In this paper, we propose a multi layer interference model, encompassing electronic warfare, cyberattacks, and deception strategies, to degrade the performance of AI guided cruise missiles significantly. Our experimental results, derived from 400 simulation runs across four distinct scenarios, demonstrate notable improvements when employing the integrated multi layer approach compared to single layer or no interference baselines. Specifically, the average missile deviation from its intended target increases from 0.25 to 8.65 under multi layer interference a more than 3300 increase in angular deviation. Furthermore, the target acquisition success rate is reduced from 92.7 in the baseline scenario to 31.5, indicating a 66 decrease in successful strikes. While resource consumption for multi layer strategies rises by approximately 25 compared to single layer methods, the significant drop in missile accuracy and reliability justifies the more intensive deployment of jamming power, cyber resources, and decoy measures. Beyond these quantitative improvements, the proposed framework uses a deep reinforcement learning based defense coordinator to adaptively select the optimal configuration of EW, cyber, and deception tactics in real time.
Authors:Samar Fares, Nurbek Tastan, Noor Hussein, Karthik Nandakumar
Abstract:
Generative models can generate photorealistic images at scale. This raises urgent concerns about the ability to detect synthetically generated images and attribute these images to specific sources. While watermarking has emerged as a possible solution, existing methods remain fragile to realistic distortions, susceptible to adaptive removal, and expensive to update when the underlying watermarking key changes. We propose a general watermarking framework that formulates the encoding problem as key-dependent perturbation of the parameters of a generative model. Within this framework, we introduce Mixture of LoRA Markers (MOLM), a routing-based instantiation in which binary keys activate lightweight LoRA adapters inside residual and attention blocks. This design avoids key-specific re-training and achieves the desired properties such as imperceptibility, fidelity, verifiability, and robustness. Experiments on Stable Diffusion and FLUX show that MOLM preserves image quality while achieving robust key recovery against distortions, compression and regeneration, averaging attacks, and black-box adversarial attacks on the extractor.
Authors:Ping He, Changjiang Li, Binbin Zhao, Tianyu Du, Shouling Ji
Abstract:
The remarkable capability of large language models (LLMs) has led to the wide application of LLM-based agents in various domains. To standardize interactions between LLM-based agents and their environments, model context protocol (MCP) tools have become the de facto standard and are now widely integrated into these agents. However, the incorporation of MCP tools introduces the risk of tool poisoning attacks, which can manipulate the behavior of LLM-based agents. Although previous studies have identified such vulnerabilities, their red teaming approaches have largely remained at the proof-of-concept stage, leaving the automatic and systematic red teaming of LLM-based agents under the MCP tool poisoning paradigm an open question. To bridge this gap, we propose AutoMalTool, an automated red teaming framework for LLM-based agents by generating malicious MCP tools. Our extensive evaluation shows that AutoMalTool effectively generates malicious MCP tools capable of manipulating the behavior of mainstream LLM-based agents while evading current detection mechanisms, thereby revealing new security risks in these agents.
Authors:Wenkai Guo, Xuefeng Liu, Haolin Wang, Jianwei Niu, Shaojie Tang, Jing Yuan
Abstract:
Fine-tuning large language models (LLMs) with local data is a widely adopted approach for organizations seeking to adapt LLMs to their specific domains. Given the shared characteristics in data across different organizations, the idea of collaboratively fine-tuning an LLM using data from multiple sources presents an appealing opportunity. However, organizations are often reluctant to share local data, making centralized fine-tuning impractical. Federated learning (FL), a privacy-preserving framework, enables clients to retain local data while sharing only model parameters for collaborative training, offering a potential solution. While fine-tuning LLMs on centralized datasets risks data leakage through next-token prediction, the iterative aggregation process in FL results in a global model that encapsulates generalized knowledge, which some believe protects client privacy. In this paper, however, we present contradictory findings through extensive experiments. We show that attackers can still extract training data from the global model, even using straightforward generation methods, with leakage increasing as the model size grows. Moreover, we introduce an enhanced attack strategy tailored to FL, which tracks global model updates during training to intensify privacy leakage. To mitigate these risks, we evaluate privacy-preserving techniques in FL, including differential privacy, regularization-constrained updates and adopting LLMs with safety alignment. Our results provide valuable insights and practical guidelines for reducing privacy risks when training LLMs with FL.
Authors:Yuntao Du, Zitao Li, Ninghui Li, Bolin Ding
Abstract:
Large Language Models (LLMs) have achieved remarkable progress in natural language understanding, reasoning, and autonomous decision-making. However, these advancements have also come with significant privacy concerns. While significant research has focused on mitigating the data privacy risks of LLMs during various stages of model training, less attention has been paid to new threats emerging from their deployment. The integration of LLMs into widely used applications and the weaponization of their autonomous abilities have created new privacy vulnerabilities. These vulnerabilities provide opportunities for both inadvertent data leakage and malicious exfiltration from LLM-powered systems. Additionally, adversaries can exploit these systems to launch sophisticated, large-scale privacy attacks, threatening not only individual privacy but also financial security and societal trust. In this paper, we systematically examine these emerging privacy risks of LLMs. We also discuss potential mitigation strategies and call for the research community to broaden its focus beyond data privacy risks, developing new defenses to address the evolving threats posed by increasingly powerful LLMs and LLM-powered systems.
Authors:Guojun Tang, Carylyne Chan, Ning Nan, Spencer Yang, Jiayu Zhou, Henry Leung, Mohammad Mamun, Steve Drew
Abstract:
Bitcoin's limited scripting capabilities and lack of native interoperability mechanisms have constrained its integration into the broader blockchain ecosystem, especially decentralized finance (DeFi) and multi-chain applications. This paper presents a comprehensive taxonomy of Bitcoin cross-chain bridge protocols, systematically analyzing their trust assumptions, performance characteristics, and applicability to the Artificial Intelligence of Things (AIoT) scenarios. We categorize bridge designs into three main types: naive token swapping, pegged-asset bridges, and arbitrary-message bridges. Each category is evaluated across key metrics such as trust model, latency, capital efficiency, and DeFi composability. Emerging innovations like BitVM and recursive sidechains are highlighted for their potential to enable secure, scalable, and programmable Bitcoin interoperability. Furthermore, we explore practical use cases of cross-chain bridges in AIoT applications, including decentralized energy trading, healthcare data integration, and supply chain automation. This taxonomy provides a foundational framework for researchers and practitioners seeking to design secure and efficient cross-chain infrastructures in AIoT systems.
Authors:Zihang Xiang, Tianhao Wang, Hanshen Xiao, Yuan Tian, Di Wang
Abstract:
In this paper, we study the problem of privacy audit in one run and show that our method achieves tight audit results for various differentially private protocols. This includes obtaining tight results for auditing $(\varepsilon,δ)$-DP algorithms where all previous work fails to achieve in any parameter setups. We first formulate a framework for privacy audit \textit{in one run} with refinement compared with previous work. Then, based on modeling privacy by the $f$-DP formulation, we study the implications of our framework to obtain a theoretically justified lower bound for privacy audit. In the experiment, we compare with previous work and show that our audit method outperforms the rest in auditing various differentially private algorithms. We also provide experiments that give contrasting conclusions to previous work on the parameter settings for privacy audits in one run.
Authors:Fatemeh Roshanzadeh, Hamid Barati, Ali Barati
Abstract:
The rapid expansion of Internet of Things (IoT) systems across various domains such as industry, smart cities, healthcare, manufacturing, and government services has led to a significant increase in security risks, threatening data integrity, confidentiality, and availability. Consequently, ensuring the security and resilience of IoT systems has become a critical requirement. In this paper, we propose a lightweight and efficient intrusion detection system (IDS) for IoT environments, leveraging a hybrid model of CNN and ConvNeXt-Tiny. The proposed method is designed to detect and classify different types of network attacks, particularly botnet and malicious traffic, while the lightweight ConvNeXt-Tiny architecture enables effective deployment in resource-constrained devices and networks. A real-world dataset comprising both benign and malicious network packets collected from practical IoT scenarios was employed in the experiments. The results demonstrate that the proposed method achieves high accuracy while significantly reducing training and inference time compared to more complex models. Specifically, the system attained 99.63% accuracy in the testing phase, 99.67% accuracy in the training phase, and an error rate of 0.0107 across eight classes, while maintaining short response times and low resource consumption. These findings highlight the effectiveness of the proposed method in detecting and classifying attacks in real-world IoT environments, indicating that the lightweight architecture can serve as a practical alternative to complex and resource-intensive approaches in IoT network security.
Authors:Ismail Hossain, Sai Puppala, Sajedul Talukder, Md Jahangir Alam
Abstract:
Scams exploiting real-time social engineering -- such as phishing, impersonation, and phone fraud -- remain a persistent and evolving threat across digital platforms. Existing defenses are largely reactive, offering limited protection during active interactions. We propose a privacy-preserving, AI-in-the-loop framework that proactively detects and disrupts scam conversations in real time. The system combines instruction-tuned artificial intelligence with a safety-aware utility function that balances engagement with harm minimization, and employs federated learning to enable continual model updates without raw data sharing. Experimental evaluations show that the system produces fluent and engaging responses (perplexity as low as 22.3, engagement $\approx$0.80), while human studies confirm significant gains in realism, safety, and effectiveness over strong baselines. In federated settings, models trained with FedAvg sustain up to 30 rounds while preserving high engagement ($\approx$0.80), strong relevance ($\approx$0.74), and low PII leakage ($\leq$0.0085). Even with differential privacy, novelty and safety remain stable, indicating that robust privacy can be achieved without sacrificing performance. The evaluation of guard models (LlamaGuard, LlamaGuard2/3, MD-Judge) shows a straightforward pattern: stricter moderation settings reduce the chance of exposing personal information, but they also limit how much the model engages in conversation. In contrast, more relaxed settings allow longer and richer interactions, which improve scam detection, but at the cost of higher privacy risk. To our knowledge, this is the first framework to unify real-time scam-baiting, federated privacy preservation, and calibrated safety moderation into a proactive defense paradigm.
Authors:Chuhan Zhang, Ye Zhang, Bowen Shi, Yuyou Gan, Tianyu Du, Shouling Ji, Dazhan Deng, Yingcai Wu
Abstract:
In deployment and application, large language models (LLMs) typically undergo safety alignment to prevent illegal and unethical outputs. However, the continuous advancement of jailbreak attack techniques, designed to bypass safety mechanisms with adversarial prompts, has placed increasing pressure on the security defenses of LLMs. Strengthening resistance to jailbreak attacks requires an in-depth understanding of the security mechanisms and vulnerabilities of LLMs. However, the vast number of parameters and complex structure of LLMs make analyzing security weaknesses from an internal perspective a challenging task. This paper presents NeuroBreak, a top-down jailbreak analysis system designed to analyze neuron-level safety mechanisms and mitigate vulnerabilities. We carefully design system requirements through collaboration with three experts in the field of AI security. The system provides a comprehensive analysis of various jailbreak attack methods. By incorporating layer-wise representation probing analysis, NeuroBreak offers a novel perspective on the model's decision-making process throughout its generation steps. Furthermore, the system supports the analysis of critical neurons from both semantic and functional perspectives, facilitating a deeper exploration of security mechanisms. We conduct quantitative evaluations and case studies to verify the effectiveness of our system, offering mechanistic insights for developing next-generation defense strategies against evolving jailbreak attacks.
Authors:Weizhe Wang, Wei Ma, Qiang Hu, Yao Zhang, Jianfei Sun, Bin Wu, Yang Liu, Guangquan Xu, Lingxiao Jiang
Abstract:
The adoption of Large Language Models (LLMs) for automated software vulnerability patching has shown promising outcomes on carefully curated evaluation sets. Nevertheless, existing datasets predominantly rely on superficial validation methods rather than exploit-based verification, leading to overestimated performance in security-sensitive applications. This paper introduces VulnRepairEval, an evaluation framework anchored in functional Proof-of-Concept (PoC) exploits. Our framework delivers a comprehensive, containerized evaluation pipeline that enables reproducible differential assessment, where repair success requires the original exploit to fail execution against the modified code. The benchmark construction involved extensive data curation: we processed over 400 CVEs and approximately 2,500 potential sources to extract a collection of authentic vulnerability instances (23 Python CVEs) amenable to automated testing with working PoCs. Through VulnRepairEval, we conduct a comprehensive evaluation of 12 popular LLMs and observe a significant performance deficit: even the top-performing model successfully addresses merely 5/23 instances (about 21.7%), exposing critical weaknesses in security-focused applications. Our failure analysis reveals that most unsuccessful attempts stem from imprecise vulnerability identification and patches containing syntactic or semantic errors. Enhanced prompting strategies and multi-agent approaches yield minimal improvements, with overall effectiveness remaining largely unaffected. This work contributes a stringent, practical evaluation framework for LLM-driven vulnerability remediation and underscores the necessity for assessment protocols that authentically reflect real-world exploitation scenarios.
Authors:Luca Pajola, Eugenio Caripoti, Stefan Banzer, Simeone Pizzi, Mauro Conti, Giovanni Apruzzese
Abstract:
Every day, our inboxes are flooded with unsolicited emails, ranging between annoying spam to more subtle phishing scams. Unfortunately, despite abundant prior efforts proposing solutions achieving near-perfect accuracy, the reality is that countering malicious emails still remains an unsolved dilemma. This "open problem" paper carries out a critical assessment of scientific works in the context of phishing email detection. First, we focus on the benchmark datasets that have been used to assess the methods proposed in research. We find that most prior work relied on datasets containing emails that -- we argue -- are not representative of current trends, and mostly encompass the English language. Based on this finding, we then re-implement and re-assess a variety of detection methods reliant on machine learning (ML), including large-language models (LLM), and release all of our codebase -- an (unfortunately) uncommon practice in related research. We show that most such methods achieve near-perfect performance when trained and tested on the same dataset -- a result which intrinsically hinders development (how can future research outperform methods that are already near perfect?). To foster the creation of "more challenging benchmarks" that reflect current phishing trends, we propose E-PhishGEN, an LLM-based (and privacy-savvy) framework to generate novel phishing-email datasets. We use our E-PhishGEN to create E-PhishLLM, a novel phishing-email detection dataset containing 16616 emails in three languages. We use E-PhishLLM to test the detectors we considered, showing a much lower performance than that achieved on existing benchmarks -- indicating a larger room for improvement. We also validate the quality of E-PhishLLM with a user study (n=30). To sum up, we show that phishing email detection is still an open problem -- and provide the means to tackle such a problem by future research.
Authors:Shiva Sattarpour, Ali Barati, Hamid Barati
Abstract:
The Internet of Things (IoT) has emerged as a foundational paradigm supporting a range of applications, including healthcare, education, agriculture, smart homes, and, more recently, enterprise systems. However, significant advancements in IoT networks have been impeded by security vulnerabilities and threats that, if left unaddressed, could hinder the deployment and operation of IoT based systems. Detecting unwanted activities within the IoT is crucial, as it directly impacts confidentiality, integrity, and availability. Consequently, intrusion detection has become a fundamental research area and the focus of numerous studies. An intrusion detection system (IDS) is essential to the IoTs alarm mechanisms, enabling effective security management. This paper examines IoT security and introduces an intelligent two-layer intrusion detection system for IoT. Machine learning techniques power the system's intelligence, with a two layer structure enhancing intrusion detection. By selecting essential features, the system maintains detection accuracy while minimizing processing overhead. The proposed method for intrusion detection in IoT is implemented in two phases. In the first phase, the Grasshopper Optimization Algorithm (GOA) is applied for feature selection. In the second phase, the Support Vector Machine (SVM) algorithm is used to detect intrusions. The method was implemented in MATLAB, and the NSLKDD dataset was used for evaluation. Simulation results show that the proposed method improves accuracy compared to other approaches.
Authors:Hojjat Farshadinia, Ali Barati, Hamid Barati
Abstract:
This paper presents a novel multi-layered hybrid security approach aimed at enhancing lightweight encryption for IoT-Cloud systems. The primary goal is to overcome limitations inherent in conventional solutions such as TPA, Blockchain, ECDSA and ZSS which often fall short in terms of data protection, computational efficiency and scalability. Our proposed method strategically refines and integrates these technologies to address their shortcomings while maximizing their individual strengths. By doing so we create a more reliable and high-performance framework for secure data exchange across heterogeneous environments. The model leverages the combined potential of emerging technologies, particularly Blockchain, IoT and Cloud computing which when effectively coordinated offer significant advancements in security architecture. The proposed framework consists of three core layers: (1) the H.E.EZ Layer which integrates improved versions of Hyperledger Fabric, Enc-Block and a hybrid ECDSA-ZSS scheme to improve encryption speed, scalability and reduce computational cost; (2) the Credential Management Layer independently verifying data integrity and authenticity; and (3) the Time and Auditing Layer designed to reduce traffic overhead and optimize performance across dynamic workloads. Evaluation results highlight that the proposed solution not only strengthens security but also significantly improves execution time, communication efficiency and system responsiveness, offering a robust path forward for next-generation IoT-Cloud infrastructures.
Authors:Abhinav Kumar, Jaechul Roh, Ali Naseh, Amir Houmansadr, Eugene Bagdasarian
Abstract:
AI web agents use Internet resources at far greater speed, scale, and complexity -- changing how users and services interact. Deployed maliciously or erroneously, these agents could overload content providers. At the same time, web agents can bypass CAPTCHAs and other defenses by mimicking user behavior or flood authentication systems with fake accounts. Yet providers must protect their services and content from denial-of-service attacks and scraping by web agents. In this paper, we design a framework that imposes tunable costs on agents before providing access to resources; we call this Web Agent Throttling. We start by formalizing Throttling Gates as challenges issued to an agent that are asymmetric, scalable, robust, and compatible with any agent. Focusing on a common component -- the language model -- we require the agent to solve reasoning puzzles, thereby incurring excessive token-generation costs. However, we find that using existing puzzles, e.g., coding or math, as throttling gates fails to satisfy our properties. To address this, we introduce rebus-based Reasoning Gates, synthetic text puzzles that require multi-hop reasoning over world knowledge (thereby throttling an agent's model). We design a scalable generation and verification protocol for such reasoning gates. Our framework achieves computational asymmetry, i.e., the response-generation cost is 9.2x higher than the generation cost for SOTA models. We further deploy reasoning gates on a custom website and Model Context Protocol (MCP) servers and evaluate with real-world web agents. Finally, we discuss the limitations and environmental impact of real-world deployment of our framework.
Authors:Dezhang Kong, Hujin Peng, Yilun Zhang, Lele Zhao, Zhenhua Xu, Shi Lin, Changting Lin, Meng Han
Abstract:
With the proliferation of applications built upon LLM-driven multi-agent systems (MAS), the security of Web links has become a critical concern in ensuring system reliability. Once an agent is induced to visit a malicious website, attackers can use it as a springboard to conduct diverse subsequent attacks, which will drastically expand the attack surface. In this paper, we propose Web Fraud Attacks, a novel type of attack aiming at inducing MAS to visit malicious websites. We design 11 representative attack variants that encompass domain name tampering (homoglyph deception, character substitution, etc.), link structure camouflage (sub-directory nesting, sub-domain grafting, parameter obfuscation, etc.), and other deceptive techniques tailored to exploit MAS's vulnerabilities in link validation. Through extensive experiments on these crafted attack vectors, we demonstrate that Web fraud attacks not only exhibit significant destructive potential across different MAS architectures but also possess a distinct advantage in evasion: they circumvent the need for complex input formats such as jailbreaking, which inherently carry higher exposure risks. These results underscore the importance of addressing Web fraud attacks in LLM-driven MAS, as their stealthiness and destructiveness pose non-negligible threats to system security and user safety.
Authors:Zhenhua Xu, Xubin Yue, Zhebo Wang, Qichen Liu, Xixiang Zhao, Jingxuan Zhang, Wenjun Zeng, Wengpeng Xing, Dezhang Kong, Changting Lin, Meng Han
Abstract:
Copyright protection for large language models is of critical importance, given their substantial development costs, proprietary value, and potential for misuse. Existing surveys have predominantly focused on techniques for tracing LLM-generated content-namely, text watermarking-while a systematic exploration of methods for protecting the models themselves (i.e., model watermarking and model fingerprinting) remains absent. Moreover, the relationships and distinctions among text watermarking, model watermarking, and model fingerprinting have not been comprehensively clarified. This work presents a comprehensive survey of the current state of LLM copyright protection technologies, with a focus on model fingerprinting, covering the following aspects: (1) clarifying the conceptual connection from text watermarking to model watermarking and fingerprinting, and adopting a unified terminology that incorporates model watermarking into the broader fingerprinting framework; (2) providing an overview and comparison of diverse text watermarking techniques, highlighting cases where such methods can function as model fingerprinting; (3) systematically categorizing and comparing existing model fingerprinting approaches for LLM copyright protection; (4) presenting, for the first time, techniques for fingerprint transfer and fingerprint removal; (5) summarizing evaluation metrics for model fingerprints, including effectiveness, harmlessness, robustness, stealthiness, and reliability; and (6) discussing open challenges and future research directions. This survey aims to offer researchers a thorough understanding of both text watermarking and model fingerprinting technologies in the era of LLMs, thereby fostering further advances in protecting their intellectual property.
Authors:Yanzhe Zhang, Diyi Yang
Abstract:
The widespread deployment of LLM-based agents is likely to introduce a critical privacy threat: malicious agents that proactively engage others in multi-turn interactions to extract sensitive information. However, the evolving nature of such dynamic dialogues makes it challenging to anticipate emerging vulnerabilities and design effective defenses. To tackle this problem, we present a search-based framework that alternates between improving attack and defense strategies through the simulation of privacy-critical agent interactions. Specifically, we employ LLMs as optimizers to analyze simulation trajectories and iteratively propose new agent instructions. To explore the strategy space more efficiently, we further utilize parallel search with multiple threads and cross-thread propagation. Through this process, we find that attack strategies escalate from direct requests to sophisticated tactics, such as impersonation and consent forgery, while defenses evolve from simple rule-based constraints to robust identity-verification state machines. The discovered attacks and defenses transfer across diverse scenarios and backbone models, demonstrating strong practical utility for building privacy-aware agents.
Authors:Zhihao Yao, Yuxuan Gu, Xiachong Feng, Weitao Ma, Bo Li, Xiaocheng Feng
Abstract:
The preservation of privacy has emerged as a critical topic in the era of artificial intelligence. However, current work focuses on user-oriented privacy, overlooking severe enterprise data leakage risks exacerbated by the Retrieval-Augmented Generation paradigm. To address this gap, our paper introduces a novel objective: enterprise-oriented privacy concerns. Achieving this objective requires overcoming two fundamental challenges: existing methods such as data sanitization severely degrade model performance, and the field lacks public datasets for evaluation. We address these challenges with several solutions. (1) To prevent performance degradation, we propose ABack, a training-free mechanism that leverages a Hidden State Model to pinpoint the origin of a leakage intention and rewrite the output safely. (2) To solve the lack of datasets, we construct PriGenQA, a new benchmark for enterprise privacy scenarios in healthcare and finance. To ensure a rigorous evaluation, we move beyond simple static attacks by developing a powerful adaptive attacker with Group Relative Policy Optimization. Experiments show that against this superior adversary, ABack improves the overall privacy utility score by up to 15\% over strong baselines, avoiding the performance trade-offs of prior methods.
中文: 本文针对AI中的企业数据泄露风险,提出了ABack机制,无需训练即可安全重写输出以防止泄露,同时创建了PriGenQA基准用于医疗和金融领域的隐私评估,在自适应攻击下将隐私效用分数提升了15%,优于现有方法。
English: This paper addresses enterprise data leakage risks in AI by introducing ABack, a training-free mechanism that safely rewrites outputs to prevent leaks without performance degradation, and PriGenQA, a benchmark for evaluating privacy in healthcare and finance, achieving a 15% improvement in privacy utility over baselines against adaptive attacks.
Authors:Peiran Wang, Yang Liu, Yunfei Lu, Yifeng Cai, Hongbo Chen, Qingyou Yang, Jie Zhang, Jue Hong, Ye Wu
Abstract:
Large Language Model (LLM) agents offer a powerful new paradigm for solving various problems by combining natural language reasoning with the execution of external tools. However, their dynamic and non-transparent behavior introduces critical security risks, particularly in the presence of prompt injection attacks. In this work, we propose a novel insight that treats the agent runtime traces as structured programs with analyzable semantics. Thus, we present AgentArmor, a program analysis framework that converts agent traces into graph intermediate representation-based structured program dependency representations (e.g., CFG, DFG, and PDG) and enforces security policies via a type system. AgentArmor consists of three key components: (1) a graph constructor that reconstructs the agent's runtime traces as graph-based intermediate representations with control and data flow described within; (2) a property registry that attaches security-relevant metadata of interacted tools \& data, and (3) a type system that performs static inference and checking over the intermediate representation. By representing agent behavior as structured programs, AgentArmor enables program analysis for sensitive data flow, trust boundaries, and policy violations. We evaluate AgentArmor on the AgentDojo benchmark, the results show that AgentArmor can reduce the ASR to 3\%, with the utility drop only 1\%.
Authors:Lijia Liu, Takumi Kondo, Kyohei Atarashi, Koh Takeuchi, Jiyi Li, Shigeru Saito, Hisashi Kashima
Abstract:
This paper investigates defenses for LLM-based evaluation systems against prompt injection. We formalize a class of threats called blind attacks, where a candidate answer is crafted independently of the true answer to deceive the evaluator. To counter such attacks, we propose a framework that augments Standard Evaluation (SE) with Counterfactual Evaluation (CFE), which re-evaluates the submission against a deliberately false ground-truth answer. An attack is detected if the system validates an answer under both standard and counterfactual conditions. Experiments show that while standard evaluation is highly vulnerable, our SE+CFE framework significantly improves security by boosting attack detection with minimal performance trade-offs.
中文摘要:本文提出将标准评估与反事实评估相结合的防御框架,用于检测针对基于大语言模型的评估系统的盲攻击,在保持性能的同时显著提升了系统安全性。
English Summary: This paper introduces a defense framework combining Standard Evaluation with Counterfactual Evaluation to detect blind attacks on LLM-based evaluation systems, significantly improving security while maintaining performance.
Authors:Sanhanat Sivapiromrat, Caiqi Zhang, Marco Basaldella, Nigel Collier
Abstract:
Recent studies have shown that Large Language Models (LLMs) are vulnerable to data poisoning attacks, where malicious training examples embed hidden behaviours triggered by specific input patterns. However, most existing works assume a phrase and focus on the attack's effectiveness, offering limited understanding of trigger mechanisms and how multiple triggers interact within the model. In this paper, we present a framework for studying poisoning in LLMs. We show that multiple distinct backdoor triggers can coexist within a single model without interfering with each other, enabling adversaries to embed several triggers concurrently. Using multiple triggers with high embedding similarity, we demonstrate that poisoned triggers can achieve robust activation even when tokens are substituted or separated by long token spans. Our findings expose a broader and more persistent vulnerability surface in LLMs. To mitigate this threat, we propose a post hoc recovery method that selectively retrains specific model components based on a layer-wise weight difference analysis. Our method effectively removes the trigger behaviour with minimal parameter updates, presenting a practical and efficient defence against multi-trigger poisoning.
Authors:Anshuman Suri, Harsh Chaudhari, Yuefeng Peng, Ali Naseh, Amir Houmansadr, Alina Oprea
Abstract:
While poisoning attacks on machine learning models have been extensively studied, the mechanisms by which adversaries can distribute poisoned models at scale remain largely unexplored. In this paper, we shed light on how model leaderboards -- ranked platforms for model discovery and evaluation -- can serve as a powerful channel for adversaries for stealthy large-scale distribution of poisoned models. We present TrojanClimb, a general framework that enables injection of malicious behaviors while maintaining competitive leaderboard performance. We demonstrate its effectiveness across four diverse modalities: text-embedding, text-generation, text-to-speech and text-to-image, showing that adversaries can successfully achieve high leaderboard rankings while embedding arbitrary harmful functionalities, from backdoors to bias injection. Our findings reveal a significant vulnerability in the machine learning ecosystem, highlighting the urgent need to redesign leaderboard evaluation mechanisms to detect and filter malicious (e.g., poisoned) models, while exposing broader security implications for the machine learning community regarding the risks of adopting models from unverified sources.
Authors:Alberto Castagnaro, Umberto Salviati, Mauro Conti, Luca Pajola, Simeone Pizzi
Abstract:
Large Language Models (LLMs) have transformed human-machine interaction since ChatGPT's 2022 debut, with Retrieval-Augmented Generation (RAG) emerging as a key framework that enhances LLM outputs by integrating external knowledge. However, RAG's reliance on ingesting external documents introduces new vulnerabilities. This paper exposes a critical security gap at the data loading stage, where malicious actors can stealthily corrupt RAG pipelines by exploiting document ingestion.
We propose a taxonomy of 9 knowledge-based poisoning attacks and introduce two novel threat vectors -- Content Obfuscation and Content Injection -- targeting common formats (DOCX, HTML, PDF). Using an automated toolkit implementing 19 stealthy injection techniques, we test five popular data loaders, finding a 74.4% attack success rate across 357 scenarios. We further validate these threats on six end-to-end RAG systems -- including white-box pipelines and black-box services like NotebookLM and OpenAI Assistants -- demonstrating high success rates and critical vulnerabilities that bypass filters and silently compromise output integrity. Our results emphasize the urgent need to secure the document ingestion process in RAG systems against covert content manipulations.
Authors:Xinfeng Li, Dong Huang, Jie Li, Hongyi Cai, Zhenhong Zhou, Wei Dong, XiaoFeng Wang, Yang Liu
Abstract:
The autonomy and contextual complexity of LLM-based agents render traditional access control (AC) mechanisms insufficient. Static, rule-based systems designed for predictable environments are fundamentally ill-equipped to manage the dynamic information flows inherent in agentic interactions. This position paper argues for a paradigm shift from binary access control to a more sophisticated model of information governance, positing that the core challenge is not merely about permission, but about governing the flow of information. We introduce Agent Access Control (AAC), a novel framework that reframes AC as a dynamic, context-aware process of information flow governance. AAC operates on two core modules: (1) multi-dimensional contextual evaluation, which assesses not just identity but also relationships, scenarios, and norms; and (2) adaptive response formulation, which moves beyond simple allow/deny decisions to shape information through redaction, summarization, and paraphrasing. This vision, powered by a dedicated AC reasoning engine, aims to bridge the gap between human-like nuanced judgment and scalable Al safety, proposing a new conceptual lens for future research in trustworthy agent design.
Authors:Gaurab Chhetri, Shriyank Somvanshi, Pavan Hebli, Shamyo Brotee, Subasish Das
Abstract:
Post-quantum cryptography (PQC) is moving from evaluation to deployment as NIST finalizes standards for ML-KEM, ML-DSA, and SLH-DSA. This survey maps the space from foundations to practice. We first develop a taxonomy across lattice-, code-, hash-, multivariate-, isogeny-, and MPC-in-the-Head families, summarizing security assumptions, cryptanalysis, and standardization status. We then compare performance and communication costs using representative, implementation-grounded measurements, and review hardware acceleration (AVX2, FPGA/ASIC) and implementation security with a focus on side-channel resistance. Building upward, we examine protocol integration (TLS, DNSSEC), PKI and certificate hygiene, and deployment in constrained and high-assurance environments (IoT, cloud, finance, blockchain). We also discuss complementarity with quantum technologies (QKD, QRNGs) and the limits of near-term quantum computing. Throughout, we emphasize crypto-agility, hybrid migration, and evidence-based guidance for operators. We conclude with open problems spanning parameter agility, leakage-resilient implementations, and domain-specific rollout playbooks. This survey aims to be a practical reference for researchers and practitioners planning quantum-safe systems, bridging standards, engineering, and operations.
Authors:Praneeth Vepakomma, Kaustubh Ponkshe
Abstract:
Traditional collaborative learning approaches are based on sharing of model weights between clients and a server. However, there are advantages to resource efficiency through schemes based on sharing of embeddings (activations) created from the data. Several differentially private methods were developed for sharing of weights while such mechanisms do not exist so far for sharing of embeddings. We propose Ours to learn a privacy encoding network in conjunction with a small utility generation network such that the final embeddings generated from it are equipped with formal differential privacy guarantees. These privatized embeddings are then shared with a more powerful server, that learns a post-processing that results in a higher accuracy for machine learning tasks. We show that our co-design of collaborative and private learning results in requiring only one round of privatized communication and lesser compute on the client than traditional methods. The privatized embeddings that we share from the client are agnostic to the type of model (deep learning, random forests or XGBoost) used on the server in order to process these activations to complete a task.
Authors:Léo Boisvert, Abhay Puri, Chandra Kiran Reddy Evuru, Nicolas Chapados, Quentin Cappart, Alexandre Lacoste, Krishnamurthy Dj Dvijotham, Alexandre Drouin
Abstract:
The practice of fine-tuning AI agents on data from their own interactions--such as web browsing or tool use--, while being a strong general recipe for improving agentic capabilities, also introduces a critical security vulnerability within the AI supply chain. In this work, we show that adversaries can easily poison the data collection pipeline to embed hard-to-detect backdoors that are triggerred by specific target phrases, such that when the agent encounters these triggers, it performs an unsafe or malicious action. We formalize and validate three realistic threat models targeting different layers of the supply chain: 1) direct poisoning of fine-tuning data, where an attacker controls a fraction of the training traces; 2) environmental poisoning, where malicious instructions are injected into webpages scraped or tools called while creating training data; and 3) supply chain poisoning, where a pre-backdoored base model is fine-tuned on clean data to improve its agentic capabilities. Our results are stark: by poisoning as few as 2% of the collected traces, an attacker can embed a backdoor causing an agent to leak confidential user information with over 80% success when a specific trigger is present. This vulnerability holds across all three threat models. Furthermore, we demonstrate that prominent safeguards, including two guardrail models and one weight-based defense, fail to detect or prevent the malicious behavior. These findings highlight an urgent threat to agentic AI development and underscore the critical need for rigorous security vetting of data collection processes and end-to-end model supply chains.
Authors:Yuhao Sun, Zhuoer Xu, Shiwen Cui, Kun Yang, Lingyun Yu, Yongdong Zhang, Hongtao Xie
Abstract:
Large Language Models (LLMs) have achieved remarkable progress across a wide range of tasks, but remain vulnerable to safety risks such as harmful content generation and jailbreak attacks. Existing safety techniques -- including external guardrails, inference-time guidance, and post-training alignment -- each face limitations in balancing safety, utility, and controllability. In this work, we propose UpSafe$^\circ$C, a unified framework for enhancing LLM safety through safety-aware upcycling. Our approach first identifies safety-critical layers and upcycles them into a sparse Mixture-of-Experts (MoE) structure, where the router acts as a soft guardrail that selectively activates original MLPs and added safety experts. We further introduce a two-stage SFT strategy to strengthen safety discrimination while preserving general capabilities. To enable flexible control at inference time, we introduce a safety temperature mechanism, allowing dynamic adjustment of the trade-off between safety and utility. Experiments across multiple benchmarks, base model, and model scales demonstrate that UpSafe$^\circ$C achieves robust safety improvements against harmful and jailbreak inputs, while maintaining competitive performance on general tasks. Moreover, analysis shows that safety temperature provides fine-grained inference-time control that achieves the Pareto-optimal frontier between utility and safety. Our results highlight a new direction for LLM safety: moving from static alignment toward dynamic, modular, and inference-aware control.
Authors:Junjie Su, Weifei Jin, Yuxin Cao, Derui Wang, Kai Ye, Jie Hao
Abstract:
Sound Event Detection (SED) systems are increasingly deployed in safety-critical applications such as industrial monitoring and audio surveillance. However, their robustness against adversarial attacks has not been well explored. Existing audio adversarial attacks targeting SED systems, which incorporate both detection and localization capabilities, often lack effectiveness due to SED's strong contextual dependencies or lack precision by focusing solely on misclassifying the target region as the target event, inadvertently affecting non-target regions. To address these challenges, we propose the Mirage and Mute Attack (M2A) framework, which is designed for targeted adversarial attacks on polyphonic SED systems. In our optimization process, we impose specific constraints on the non-target output, which we refer to as preservation loss, ensuring that our attack does not alter the model outputs for non-target region, thus achieving precise attacks. Furthermore, we introduce a novel evaluation metric Editing Precison (EP) that balances effectiveness and precision, enabling our method to simultaneously enhance both. Comprehensive experiments show that M2A achieves 94.56% and 99.11% EP on two state-of-the-art SED models, demonstrating that the framework is sufficiently effective while significantly enhancing attack precision.
Authors:Yaman Jandali, Ruisi Zhang, Nojan Sheybani, Farinaz Koushanfar
Abstract:
Privacy-preserving technologies have introduced a paradigm shift that allows for realizable secure computing in real-world systems. The significant barrier to the practical adoption of these primitives is the computational and communication overhead that is incurred when applied at scale. In this paper, we present an overview of our efforts to bridge the gap between this overhead and practicality for privacy-preserving learning systems using multi-party computation (MPC), zero-knowledge proofs (ZKPs), and fully homomorphic encryption (FHE). Through meticulous hardware/software/algorithm co-design, we show progress towards enabling LLM-scale applications in privacy-preserving settings. We demonstrate the efficacy of our solutions in several contexts, including DNN IP ownership, ethical LLM usage enforcement, and transformer inference.
Authors:Benedetta Tondi, Andrea Costanzo, Mauro Barni
Abstract:
We propose a high-payload image watermarking method for textual embedding, where a semantic description of the image - which may also correspond to the input text prompt-, is embedded inside the image. In order to be able to robustly embed high payloads in large-scale images - such as those produced by modern AI generators - the proposed approach builds upon a traditional watermarking scheme that exploits orthogonal and turbo codes for improved robustness, and integrates frequency-domain embedding and perceptual masking techniques to enhance watermark imperceptibility. Experiments show that the proposed method is extremely robust against a wide variety of image processing, and the embedded text can be retrieved also after traditional and AI inpainting, permitting to unveil the semantic modification the image has undergone via image-text mismatch analysis.
Authors:Shidong Pan, Yikai Ge, Xiaoyu Sun
Abstract:
With the development of foundation AI technologies, task-executable voice assistants (VAs) have become more popular, enhancing user convenience and expanding device functionality. Android task-executable VAs are applications that are capable of understanding complex tasks and performing corresponding operations. Given their prevalence and great autonomy, there is no existing work examine the privacy risks within the voice assistants from the task-execution pattern in a holistic manner. To fill this research gap, this paper presents a user-centric comprehensive empirical study on privacy risks in Android task-executable VA applications. We collect ten mainstream VAs as our research target and analyze their operational characteristics. We then cross-check their privacy declarations across six sources, including privacy labels, policies, and manifest files, and our findings reveal widespread inconsistencies. Moreover, we uncover three significant privacy threat models: (1) privacy misdisclosure in mega apps, where integrated mini apps such as Alexa skills are inadequately represented; (2) privilege escalation via inter-application interactions, which exploit Android's communication mechanisms to bypass user consent; and (3) abuse of Google system applications, enabling apps to evade the declaration of dangerous permissions. Our study contributes actionable recommendations for practitioners and underscores broader relevance of these privacy risks to emerging autonomous AI agents.
Authors:Bishnu Bhusal, Manoj Acharya, Ramneet Kaur, Colin Samplawski, Anirban Roy, Adam D. Cobb, Rohit Chadha, Susmit Jha
Abstract:
Large language models (LLMs) have significantly transformed natural language understanding and generation, but they raise privacy concerns due to potential exposure of sensitive information. Studies have highlighted the risk of information leakage, where adversaries can extract sensitive information embedded in the prompts. In this work, we introduce a novel private prediction framework for generating high-quality synthetic text with strong privacy guarantees. Our approach leverages the Differential Privacy (DP) framework to ensure worst-case theoretical bounds on information leakage without requiring any fine-tuning of the underlying models. The proposed method performs inference on private records and aggregates the resulting per-token output distributions. This enables the generation of longer and coherent synthetic text while maintaining privacy guarantees. Additionally, we propose a simple blending operation that combines private and public inference to further enhance utility. Empirical evaluations demonstrate that our approach outperforms previous state-of-the-art methods on in-context-learning (ICL) tasks, making it a promising direction for privacy-preserving text generation while maintaining high utility.
Authors:Ye Tian, Yifan Jia, Yanbin Wang, Jianguo Sun, Zhiquan Liu, Xiaowen Ling
Abstract:
Malicious URL detection remains a major challenge in cybersecurity, primarily due to two factors: (1) the exponential growth of the Internet has led to an immense diversity of URLs, making generalized detection increasingly difficult; and (2) attackers are increasingly employing sophisticated obfuscation techniques to evade detection. We advocate that addressing these challenges fundamentally requires: (1) obtaining semantic understanding to improve generalization across vast and diverse URL sets, and (2) accurately modeling contextual relationships within the structural composition of URLs. In this paper, we propose a novel malicious URL detection method combining multi-granularity graph learning with semantic embedding to jointly capture semantic, character-level, and structural features for robust URL analysis. To model internal dependencies within URLs, we first construct dual-granularity URL graphs at both subword and character levels, where nodes represent URL tokens/characters and edges encode co-occurrence relationships. To obtain fine-grained embeddings, we initialize node representations using a character-level convolutional network. The two graphs are then processed through jointly trained Graph Convolutional Networks to learn consistent graph-level representations, enabling the model to capture complementary structural features that reflect co-occurrence patterns and character-level dependencies. Furthermore, we employ BERT to derive semantic representations of URLs for semantically aware understanding. Finally, we introduce a gated dynamic fusion network to combine the semantically enriched BERT representations with the jointly optimized graph vectors, further enhancing detection performance. We extensively evaluate our method across multiple challenging dimensions. Results show our method exceeds SOTA performance, including against large language models.
Authors:Yifan Jia, Ye Tian, Yanbin Wang, Jianguo Sun, Haitao Xu
Abstract:
The success of smart contracts has made them a target for attacks, but their closed-source nature often forces vulnerability detection to work on bytecode, which is inherently more challenging than source-code-based analysis. While recent studies try to align source and bytecode embeddings during training to transfer knowledge, current methods rely on graph-level alignment that obscures fine-grained structural and semantic correlations between the two modalities. Moreover, the absence of precise vulnerability patterns and granular annotations in bytecode leads to depriving the model of crucial supervisory signals for learning discriminant features. We propose ExDoS to transfer rich semantic knowledge from source code to bytecode, effectively supplementing the source code prior in practical settings. Specifically, we construct semantic graphs from source code and control-flow graphs from bytecode. To address obscured local signals in graph-level contract embeddings, we propose a Dual-Attention Graph Network introducing a novel node attention aggregation module to enhance local pattern capture in graph embeddings. Furthermore, by summarizing existing source code vulnerability patterns and designing a corresponding set of bytecode-level patterns for each, we construct the first dataset of vulnerability pattern annotations aligned with source code definitions to facilitate fine-grained cross-modal alignment and the capture of function-level vulnerability signals. Finally, we propose a dual-focus objective for our cross-modal distillation framework, comprising: a Global Semantic Distillation Loss for transferring graph-level knowledge and a Local Semantic Distillation Loss enabling expert-guided, fine-grained vulnerability-specific distillation. Experiments on real-world contracts demonstrate that our method achieves consistent F1-score improvements (3\%--6\%) over strong baselines.
Authors:Nojan Sheybani, Alessandro Pegoraro, Jonathan Knauer, Phillip Rieger, Elissa Mollakuqe, Farinaz Koushanfar, Ahmad-Reza Sadeghi
Abstract:
Split Learning (SL) is a distributed learning approach that enables resource-constrained clients to collaboratively train deep neural networks (DNNs) by offloading most layers to a central server while keeping in- and output layers on the client-side. This setup enables SL to leverage server computation capacities without sharing data, making it highly effective in resource-constrained environments dealing with sensitive data. However, the distributed nature enables malicious clients to manipulate the training process. By sending poisoned intermediate gradients, they can inject backdoors into the shared DNN. Existing defenses are limited by often focusing on server-side protection and introducing additional overhead for the server. A significant challenge for client-side defenses is enforcing malicious clients to correctly execute the defense algorithm. We present ZORRO, a private, verifiable, and robust SL defense scheme. Through our novel design and application of interactive zero-knowledge proofs (ZKPs), clients prove their correct execution of a client-located defense algorithm, resulting in proofs of computational integrity attesting to the benign nature of locally trained DNN portions. Leveraging the frequency representation of model partitions enables ZORRO to conduct an in-depth inspection of the locally trained models in an untrusted environment, ensuring that each client forwards a benign checkpoint to its succeeding client. In our extensive evaluation, covering different model architectures as well as various attack strategies and data scenarios, we show ZORRO's effectiveness, as it reduces the attack success rate to less than 6\% while causing even for models storing \numprint{1000000} parameters on the client-side an overhead of less than 10 seconds.
Authors:Yifan Jia, Yanbin Wang, Jianguo Sun, Ye Tian, Peng Qian
Abstract:
Current Ethereum fraud detection methods rely on context-independent, numerical transaction sequences, failing to capture semantic of account transactions. Furthermore, the pervasive homogeneity in Ethereum transaction records renders it challenging to learn discriminative account embeddings. Moreover, current self-supervised graph learning methods primarily learn node representations through graph reconstruction, resulting in suboptimal performance for node-level tasks like fraud account detection, while these methods also encounter scalability challenges. To tackle these challenges, we propose LMAE4Eth, a multi-view learning framework that fuses transaction semantics, masked graph embedding, and expert knowledge. We first propose a transaction-token contrastive language model (TxCLM) that transforms context-independent numerical transaction records into logically cohesive linguistic representations. To clearly characterize the semantic differences between accounts, we also use a token-aware contrastive learning pre-training objective together with the masked transaction model pre-training objective, learns high-expressive account representations. We then propose a masked account graph autoencoder (MAGAE) using generative self-supervised learning, which achieves superior node-level account detection by focusing on reconstructing account node features. To enable MAGAE to scale for large-scale training, we propose to integrate layer-neighbor sampling into the graph, which reduces the number of sampled vertices by several times without compromising training quality. Finally, using a cross-attention fusion network, we unify the embeddings of TxCLM and MAGAE to leverage the benefits of both. We evaluate our method against 21 baseline approaches on three datasets. Experimental results show that our method outperforms the best baseline by over 10% in F1-score on two of the datasets.
Authors:Yifan Jia, Ye Tian, Liguo Zhang, Yanbin Wang, Jianguo Sun, Liangliang Song
Abstract:
Ethereum's rapid ecosystem expansion and transaction anonymity have triggered a surge in malicious activity. Detection mechanisms currently bifurcate into three technical strands: expert-defined features, graph embeddings, and sequential transaction patterns, collectively spanning the complete feature sets of Ethereum's native data layer. Yet the absence of cross-paradigm integration mechanisms forces practitioners to choose between sacrificing sequential context awareness, structured fund-flow patterns, or human-curated feature insights in their solutions. To bridge this gap, we propose KGBERT4Eth, a feature-complete pre-training encoder that synergistically combines two key components: (1) a Transaction Semantic Extractor, where we train an enhanced Transaction Language Model (TLM) to learn contextual semantic representations from conceptualized transaction records, and (2) a Transaction Knowledge Graph (TKG) that incorporates expert-curated domain knowledge into graph node embeddings to capture fund flow patterns and human-curated feature insights. We jointly optimize pre-training objectives for both components to fuse these complementary features, generating feature-complete embeddings. To emphasize rare anomalous transactions, we design a biased masking prediction task for TLM to focus on statistical outliers, while the Transaction TKG employs link prediction to learn latent transaction relationships and aggregate knowledge. Furthermore, we propose a mask-invariant attention coordination module to ensure stable dynamic information exchange between TLM and TKG during pre-training. KGBERT4Eth significantly outperforms state-of-the-art baselines in both phishing account detection and de-anonymization tasks, achieving absolute F1-score improvements of 8-16% on three phishing detection benchmarks and 6-26% on four de-anonymization datasets.
Authors:Joshua Mashburn, Johann Knechtel, Florian Klemme, Hussam Amrouch, Ozgur Sinanoglu, Paul V. Gratz
Abstract:
Negative-Bias Temperature Instability is a dominant aging mechanism in nanoscale CMOS circuits such as microprocessors. With this aging mechanism, the rate of device aging is dependent not only on overall operating conditions, such as heat, but also on user controllable inputs to the transistors. This dependence on input implies a possible timing fault-injection attack wherein a targeted path of logic is intentionally degraded through the purposeful, software-driven actions of an attacker, rendering a targeted bit effectively stuck.
In this work, we describe such an attack mechanism, which we dub a "$\textbf{Targeted Wearout Attack}$", wherein an attacker with sufficient knowledge of the processor core, executing a carefully crafted software program with only user privilege, is able to degrade a functional unit within the processor with the aim of eliciting a particular desired incorrect calculation in a victim application. Here we give a general methodology for the attack. We then demonstrate a case study where a targeted path within the fused multiply-add pipeline in a RISC-V CPU sees a $>7x$ increase in wear over time than would be experienced under typical workloads. We show that an attacker could leverage such an attack, leading to targeted and silent data corruption in a co-running victim application using the same unit.
Authors:Bingyu Yan, Ziyi Zhou, Xiaoming Zhang, Chaozhuo Li, Ruilin Zeng, Yirui Qi, Tianbo Wang, Litian Zhang
Abstract:
Large language model-based multi-agent systems (LLM-MAS) effectively accomplish complex and dynamic tasks through inter-agent communication, but this reliance introduces substantial safety vulnerabilities. Existing attack methods targeting LLM-MAS either compromise agent internals or rely on direct and overt persuasion, which limit their effectiveness, adaptability, and stealthiness. In this paper, we propose MAST, a Multi-round Adaptive Stealthy Tampering framework designed to exploit communication vulnerabilities within the system. MAST integrates Monte Carlo Tree Search with Direct Preference Optimization to train an attack policy model that adaptively generates effective multi-round tampering strategies. Furthermore, to preserve stealthiness, we impose dual semantic and embedding similarity constraints during the tampering process. Comprehensive experiments across diverse tasks, communication architectures, and LLMs demonstrate that MAST consistently achieves high attack success rates while significantly enhancing stealthiness compared to baselines. These findings highlight the effectiveness, stealthiness, and adaptability of MAST, underscoring the need for robust communication safeguards in LLM-MAS.
Authors:Jean Michel Tine, Mohammed Aldeen, Abyad Enan, M Sabbir Salek, Long Cheng, Mashrur Chowdhury
Abstract:
Cellular Vehicle-to-Everything (C-V2X) technology enables low-latency, reliable communications essential for safety applications such as a Forward Collision Warning (FCW) system. C-V2X deployments operate under strict protocol compliance with the 3rd Generation Partnership Project (3GPP) and the Society of Automotive Engineers Standard (SAE) J2735 specifications to ensure interoperability. This paper presents a real-world testbed evaluation of protocol-compliant Denial-of-Service (DoS) attacks using User Datagram Protocol (UDP) flooding and oversized Basic Safety Message (BSM) attacks that 7 exploit transport- and application-layer vulnerabilities in C-V2X. The attacks presented in this study transmit valid messages over standard PC5 sidelinks, fully adhering to 3GPP and SAE J2735 specifications, but at abnormally high rates and with oversized payloads that overload the receiver resources without breaching any protocol rules such as IEEE 1609. Using a real-world connected vehicle 11 testbed with commercially available On-Board Units (OBUs), we demonstrate that high-rate UDP flooding and oversized payload of BSM flooding can severely degrade FCW performance. Results show that UDP flooding alone reduces packet delivery ratio by up to 87% and increases latency to over 400ms, while oversized BSM floods overload receiver processing resources, delaying or completely suppressing FCW alerts. When UDP and BSM attacks are executed simultaneously, they cause near-total communication failure, preventing FCW warnings entirely. These findings reveal that protocol-compliant communications do not necessarily guarantee safe or reliable operation of C-V2X-based safety applications.
Authors:Jiawei Liu, Nirav Diwan, Zhe Wang, Haoyu Zhai, Xiaona Zhou, Kiet A. Nguyen, Tianjiao Yu, Muntasir Wahed, Yinlin Deng, Hadjer Benkraouda, Yuxiang Wei, Lingming Zhang, Ismini Lourentzou, Gang Wang
Abstract:
We introduce PurpCode, the first post-training recipe for training safe code reasoning models towards generating secure code and defending against malicious cyberactivities. PurpCode trains a reasoning model in two stages: (i) Rule Learning, which explicitly teaches the model to reference cybersafety rules to generate vulnerability-free code and to avoid facilitating malicious cyberactivities; and (ii) Reinforcement Learning, which optimizes model safety and preserves model utility through diverse, multi-objective reward mechanisms. To empower the training pipelines with comprehensive cybersafety data, we conduct internal red-teaming to synthesize comprehensive and high-coverage prompts based on real-world tasks for inducing unsafe cyberactivities in the model. Based on PurpCode, we develop a reasoning-based coding model, namely PurpCode-32B, which demonstrates state-of-the-art cybersafety, outperforming various frontier models. Meanwhile, our alignment method decreases the model overrefusal rates in both general and cybersafety-specific scenarios, while preserving model utility in both code generation and common security knowledge.
Authors:Youssef Allouah, Rachid Guerraoui, Sanmi Koyejo
Abstract:
Machine unlearning seeks to remove unwanted information from trained models, initially at the individual-sample level, but increasingly at the level of entire sub-populations. In many deployments, models must delete whole topical domains to satisfy privacy, legal, or quality requirements, e.g., removing several users' posts under GDPR or copyrighted web content. Existing unlearning tools remain largely sample-oriented, and straightforward point deletion often leaves enough residual signal for downstream learners to recover the unwanted domain. We introduce distributional unlearning, a data-centric, model-agnostic framework that asks: Given examples from an unwanted distribution and a retained distribution, what is the smallest set of points whose removal makes the edited dataset far from the unwanted domain yet close to the retained one? Using Kullback-Leibler divergence to quantify removal and preservation, we derive the exact Pareto frontier in the Gaussian case and prove that any model retrained on the edited data incurs log-loss shifts bounded by the divergence thresholds. We propose a simple distance-based selection rule satisfying these constraints with a quadratic reduction in deletion budget compared to random removal. Experiments on synthetic Gaussians, Jigsaw Toxic Comments, SMS spam, and CIFAR-10 show 15-72% fewer deletions than random, with negligible impact on retained performance.
Authors:Federico Mason, Federico Chiariotti, Pietro Talli, Andrea Zanella
Abstract:
Goal-oriented Communication (GoC) is a new paradigm that plans data transmission to occur only when it is instrumental for the receiver to achieve a certain goal. This leads to the advantage of reducing the frequency of transmissions significantly while maintaining adherence to the receiver's objectives. However, GoC scheduling also opens a timing-based side channel that an eavesdropper can exploit to obtain information about the state of the system. This type of attack sidesteps even information-theoretic security, as it exploits the timing of updates rather than their content. In this work, we study such an eavesdropping attack against pull-based goal-oriented scheduling for remote monitoring and control of Markov processes. We provide a theoretical framework for defining the effectiveness of the attack and propose possible countermeasures, including two practical heuristics that provide a balance between the performance gains offered by GoC and the amount of leaked information. Our results show that, while a naive goal-oriented scheduler allows the eavesdropper to correctly guess the system state about 60% of the time, our heuristic defenses can halve the leakage with a marginal reduction of the benefits of goal-oriented approaches.
Chinese: 目标导向通信通过按需传输减少通信频率,但会暴露基于时序的侧信道,易受窃听攻击;本文提出启发式防御措施,能在几乎不影响性能的前提下将信息泄露减半。
English: Goal-oriented Communication reduces transmissions by aligning them with receiver goals but introduces a timing side channel vulnerable to eavesdropping, prompting the development of heuristic defenses that halve information leakage with minimal performance loss.
Authors:Lixu Wang, Kaixiang Yao, Xinfeng Li, Dong Yang, Haoyang Li, Xiaofeng Wang, Wei Dong
Abstract:
Our research uncovers a novel privacy risk associated with multimodal large language models (MLLMs): the ability to infer sensitive personal attributes from audio data -- a technique we term audio private attribute profiling. This capability poses a significant threat, as audio can be covertly captured without direct interaction or visibility. Moreover, compared to images and text, audio carries unique characteristics, such as tone and pitch, which can be exploited for more detailed profiling. However, two key challenges exist in understanding MLLM-employed private attribute profiling from audio: (1) the lack of audio benchmark datasets with sensitive attribute annotations and (2) the limited ability of current MLLMs to infer such attributes directly from audio. To address these challenges, we introduce AP^2, an audio benchmark dataset that consists of two subsets collected and composed from real-world data, and both are annotated with sensitive attribute labels. Additionally, we propose Gifts, a hybrid multi-agent framework that leverages the complementary strengths of audio-language models (ALMs) and large language models (LLMs) to enhance inference capabilities. Gifts employs an LLM to guide the ALM in inferring sensitive attributes, then forensically analyzes and consolidates the ALM's inferences, overcoming severe hallucinations of existing ALMs in generating long-context responses. Our evaluations demonstrate that Gifts significantly outperforms baseline approaches in inferring sensitive attributes. Finally, we investigate model-level and data-level defense strategies to mitigate the risks of audio private attribute profiling. Our work validates the feasibility of audio-based privacy attacks using MLLMs, highlighting the need for robust defenses, and provides a dataset and framework to facilitate future research.
Authors:Peiyan Zhang, Haibo Jin, Liying Kang, Haohan Wang
Abstract:
Jailbreak attacks reveal critical vulnerabilities in Large Language Models (LLMs) by causing them to generate harmful or unethical content. Evaluating these threats is particularly challenging due to the evolving nature of LLMs and the sophistication required in effectively probing their vulnerabilities. Current benchmarks and evaluation methods struggle to fully address these challenges, leaving gaps in the assessment of LLM vulnerabilities. In this paper, we review existing jailbreak evaluation practices and identify three assumed desiderata for an effective jailbreak evaluation protocol. To address these challenges, we introduce GuardVal, a new evaluation protocol that dynamically generates and refines jailbreak prompts based on the defender LLM's state, providing a more accurate assessment of defender LLMs' capacity to handle safety-critical situations. Moreover, we propose a new optimization method that prevents stagnation during prompt refinement, ensuring the generation of increasingly effective jailbreak prompts that expose deeper weaknesses in the defender LLMs. We apply this protocol to a diverse set of models, from Mistral-7b to GPT-4, across 10 safety domains. Our findings highlight distinct behavioral patterns among the models, offering a comprehensive view of their robustness. Furthermore, our evaluation process deepens the understanding of LLM behavior, leading to insights that can inform future research and drive the development of more secure models.
Authors:Hetvi Shastri, Walid A. Hanafy, Li Wu, David Irwin, Mani Srivastava, Prashant Shenoy
Abstract:
Today's Internet of Things (IoT) has evolved from simple sensing and actuation devices to those with embedded processing and intelligent services, enabling rich collaborations between users and their devices. However, enabling such collaboration becomes challenging when transient devices need to interact with host devices in temporarily visited environments. In such cases, fine-grained access control policies are necessary to ensure secure interactions; however, manually implementing them is often impractical for non-expert users. Moreover, at run-time, the system must automatically configure the devices and enforce such fine-grained access control rules. Additionally, the system must address the heterogeneity of devices.
In this paper, we present CollabIoT, a system that enables secure and seamless device collaboration in transient IoT environments. CollabIoT employs a Large language Model (LLM)-driven approach to convert users' high-level intents to fine-grained access control policies. To support secure and seamless device collaboration, CollabIoT adopts capability-based access control for authorization and uses lightweight proxies for policy enforcement, providing hardware-independent abstractions.
We implement a prototype of CollabIoT's policy generation and auto configuration pipelines and evaluate its efficacy on an IoT testbed and in large-scale emulated environments. We show that our LLM-based policy generation pipeline is able to generate functional and correct policies with 100% accuracy. At runtime, our evaluation shows that our system configures new devices in ~150 ms, and our proxy-based data plane incurs network overheads of up to 2 ms and access control overheads up to 0.3 ms.
Chinese: CollabIoT系统采用大语言模型将用户高级意图自动转化为细粒度访问控制策略,在瞬态物联网环境中实现安全无缝的设备协作,同时保持较低的配置和访问控制开销。
English: CollabIoT is a system that uses a Large Language Model to automatically convert user intents into fine-grained access control policies, enabling secure and seamless device collaboration in transient IoT environments with minimal configuration and access control overhead.
Authors:Man Hu, Xinyi Wu, Zuofeng Suo, Jinbo Feng, Linghui Meng, Yanhao Jia, Anh Tuan Luu, Shuai Zhao
Abstract:
With the rise of advanced reasoning capabilities, large language models (LLMs) are receiving increasing attention. However, although reasoning improves LLMs' performance on downstream tasks, it also introduces new security risks, as adversaries can exploit these capabilities to conduct backdoor attacks. Existing surveys on backdoor attacks and reasoning security offer comprehensive overviews but lack in-depth analysis of backdoor attacks and defenses targeting LLMs' reasoning abilities. In this paper, we take the first step toward providing a comprehensive review of reasoning-based backdoor attacks in LLMs by analyzing their underlying mechanisms, methodological frameworks, and unresolved challenges. Specifically, we introduce a new taxonomy that offers a unified perspective for summarizing existing approaches, categorizing reasoning-based backdoor attacks into associative, passive, and active. We also present defense strategies against such attacks and discuss current challenges alongside potential directions for future research. This work offers a novel perspective, paving the way for further exploration of secure and trustworthy LLM communities.
Authors:Thusitha Dayaratne, Ngoc Duy Pham, Viet Vo, Shangqi Lai, Sharif Abuadbba, Hajime Suzuki, Xingliang Yuan, Carsten Rudolph
Abstract:
The quality and experience of mobile communication have significantly improved with the introduction of 5G, and these improvements are expected to continue beyond the 5G era. However, vulnerabilities in control-plane protocols, such as Radio Resource Control (RRC) and Non-Access Stratum (NAS), pose significant security threats, such as Blind Denial of Service (DoS) attacks. Despite the availability of existing anomaly detection methods that leverage rule-based systems or traditional machine learning methods, these methods have several limitations, including the need for extensive training data, predefined rules, and limited explainability. Addressing these challenges, we propose a novel anomaly detection framework that leverages the capabilities of Large Language Models (LLMs) in zero-shot mode with unordered data and short natural language attack descriptions within the Open Radio Access Network (O-RAN) architecture. We analyse robustness to prompt variation, demonstrate the practicality of automating the attack descriptions and show that detection quality relies on the semantic completeness of the description rather than its phrasing or length. We utilise an RRC/NAS dataset to evaluate the solution and provide an extensive comparison of open-source and proprietary LLM implementations to demonstrate superior performance in attack detection. We further validate the practicality of our framework within O-RAN's real-time constraints, illustrating its potential for detecting other Layer-3 attacks.
Authors:Zhiping Zhang, Yi Evie Zhang, Freda Shi, Tianshi Li
Abstract:
Large Language Model (LLM) agents require personal information for personalization in order to better act on users' behalf in daily tasks, but this raises privacy concerns and a personalization-privacy dilemma. Agent's autonomy introduces both risks and opportunities, yet its effects remain unclear. To better understand this, we conducted a 3$\times$3 between-subjects experiment ($N=450$) to study how agent's autonomy level and personalization influence users' privacy concerns, trust and willingness to use, as well as the underlying psychological processes. We find that personalization without considering users' privacy preferences increases privacy concerns and decreases trust and willingness to use. Autonomy moderates these effects: Intermediate autonomy flattens the impact of personalization compared to No- and Full autonomy conditions. Our results suggest that rather than aiming for perfect model alignment in output generation, balancing autonomy of agent's action and user control offers a promising path to mitigate the personalization-privacy dilemma.
Authors:Van Nguyen, Surya Nepal, Xingliang Yuan, Tingmin Wu, Fengchao Chen, Carsten Rudolph
Abstract:
Software vulnerabilities (SVs) pose a critical threat to safety-critical systems, driving the adoption of AI-based approaches such as machine learning and deep learning for software vulnerability detection. Despite promising results, most existing methods are limited to a single programming language. This is problematic given the multilingual nature of modern software, which is often complex and written in multiple languages. Current approaches often face challenges in capturing both shared and language-specific knowledge of source code, which can limit their performance on diverse programming languages and real-world codebases. To address this gap, we propose MULVULN, a novel multilingual vulnerability detection approach that learns from source code across multiple languages. MULVULN captures both the shared knowledge that generalizes across languages and the language-specific knowledge that reflects unique coding conventions. By integrating these aspects, it achieves more robust and effective detection of vulnerabilities in real-world multilingual software systems. The rigorous and extensive experiments on the real-world and diverse REEF dataset, consisting of 4,466 CVEs with 30,987 patches across seven programming languages, demonstrate the superiority of MULVULN over thirteen effective and state-of-the-art baselines. Notably, MULVULN achieves substantially higher F1-score, with improvements ranging from 1.45% to 23.59% compared to the baseline methods.
Authors:Dongyang Zhan, Zhaofeng Yu, Xiangzhan Yu, Hongli Zhang, Lin Ye, Likun Liu
Abstract:
With the development of Internet of Things (IoT), it is gaining a lot of attention. It is important to secure the embedded systems with low overhead. The Linux Seccomp is widely used by developers to secure the kernels by blocking the access of unused syscalls, which introduces less overhead. However, there are no systematic Seccomp configuration approaches for IoT applications without the help of developers. In addition, the existing Seccomp configuration approaches are coarse-grained, which cannot analyze and limit the syscall arguments. In this paper, a novel static dependent syscall analysis approach for embedded applications is proposed, which can obtain all of the possible dependent syscalls and the corresponding arguments of the target applications. So, a fine-grained kernel access limitation can be performed for the IoT applications. To this end, the mappings between dynamic library APIs and syscalls according with their arguments are built, by analyzing the control flow graphs and the data dependency relationships of the dynamic libraries. To the best of our knowledge, this is the first work to generate the fine-grained Seccomp profile for embedded applications.
Authors:Dongyang Zhan, Zhaofeng Yu, Xiangzhan Yu, Hongli Zhang, Lin Ye
Abstract:
Linux Seccomp is widely used by the program developers and the system maintainers to secure the operating systems, which can block unused syscalls for different applications and containers to shrink the attack surface of the operating systems. However, it is difficult to configure the whitelist of a container or application without the help of program developers. Docker containers block about only 50 syscalls by default, and lots of unblocked useless syscalls introduce a big kernel attack surface. To obtain the dependent syscalls, dynamic tracking is a straight-forward approach but it cannot get the full syscall list. Static analysis can construct an over-approximated syscall list, but the list contains many false positives. In this paper, a systematic dependent syscall analysis approach, sysverify, is proposed by combining static analysis and dynamic verification together to shrink the kernel attack surface. The semantic gap between the binary executables and syscalls is bridged by analyzing the binary and the source code, which builds the mapping between the library APIs and syscalls systematically. To further reduce the attack surface at best effort, we propose a dynamic verification approach to intercept and analyze the security of the invocations of indirect-call-related or rarely invoked syscalls with low overhead.
Authors:Xiaobao Wang, Ruoxiao Sun, Yujun Zhang, Bingdao Feng, Dongxiao He, Luzhi Wang, Di Jin
Abstract:
Graph Neural Networks (GNNs) have demonstrated strong performance across tasks such as node classification, link prediction, and graph classification, but remain vulnerable to backdoor attacks that implant imperceptible triggers during training to control predictions. While node-level attacks exploit local message passing, graph-level attacks face the harder challenge of manipulating global representations while maintaining stealth. We identify two main sources of anomaly in existing graph classification backdoor methods: structural deviation from rare subgraph triggers and semantic deviation caused by label flipping, both of which make poisoned graphs easily detectable by anomaly detection models. To address this, we propose DPSBA, a clean-label backdoor framework that learns in-distribution triggers via adversarial training guided by anomaly-aware discriminators. DPSBA effectively suppresses both structural and semantic anomalies, achieving high attack success while significantly improving stealth. Extensive experiments on real-world datasets validate that DPSBA achieves a superior balance between effectiveness and detectability compared to state-of-the-art baselines.
Authors:Yupei Liu, Yanting Wang, Yuqi Jia, Jinyuan Jia, Neil Zhenqiang Gong
Abstract:
Prompt injection attacks pose a pervasive threat to the security of Large Language Models (LLMs). State-of-the-art prevention-based defenses typically rely on fine-tuning an LLM to enhance its security, but they achieve limited effectiveness against strong attacks. In this work, we propose \emph{SecInfer}, a novel defense against prompt injection attacks built on \emph{inference-time scaling}, an emerging paradigm that boosts LLM capability by allocating more compute resources for reasoning during inference. SecInfer consists of two key steps: \emph{system-prompt-guided sampling}, which generates multiple responses for a given input by exploring diverse reasoning paths through a varied set of system prompts, and \emph{target-task-guided aggregation}, which selects the response most likely to accomplish the intended task. Extensive experiments show that, by leveraging additional compute at inference, SecInfer effectively mitigates both existing and adaptive prompt injection attacks, outperforming state-of-the-art defenses as well as existing inference-time scaling approaches.
Authors:Jingyao Zhang, Elaheh Sadredini
Abstract:
Recent advancements in post-quantum cryptographic algorithms have led to their standardization by the National Institute of Standards and Technology (NIST) to safeguard information security in the post-quantum era. These algorithms, however, employ public keys and signatures that are 3 to 9$\times$ longer than those used in pre-quantum cryptography, resulting in significant performance and energy efficiency overheads. A critical bottleneck identified in our analysis is the cache bandwidth. This limitation motivates the adoption of on-chip in-/near-cache computing, a computing paradigm that offers high-performance, exceptional energy efficiency, and flexibility to accelerate post-quantum cryptographic algorithms. Our analysis of existing works reveals challenges in integrating in-/near-cache computing into modern computer systems and performance limitations due to external bandwidth limitation, highlighting the need for innovative solutions that can seamlessly integrate into existing systems without performance and energy efficiency issues. In this paper, we introduce a near-cache-slice computing paradigm with support of customization and virtual address, named Crypto-Near-Cache (CNC), designed to accelerate post-quantum cryptographic algorithms and other applications. By placing SRAM arrays with bitline computing capability near cache slices, high internal bandwidth and short data movement are achieved with native support of virtual addressing. An ISA extension to facilitate CNC is also proposed, with detailed discussion on the implementation aspects of the core/cache datapath.
Authors:Jingyao Zhang, Elaheh Sadredini
Abstract:
Secure communication is a critical requirement for Internet of Things (IoT) devices, which are often based on Microcontroller Units (MCUs). Current cryptographic solutions, which rely on software libraries or dedicated hardware accelerators, are fundamentally limited by the performance and energy costs of data movement between memory and processing units. This paper introduces CryptoSRAM, an in-SRAM computing architecture that performs cryptographic operations directly within the MCU's standard SRAM array. By repurposing the memory array into a massively parallel processing fabric, CryptoSRAM eliminates the data movement bottleneck. This approach is well-suited to MCUs, which utilize physical addressing and Direct Memory Access (DMA) to manage SRAM, allowing for seamless integration with minimal hardware overhead. Our analysis shows that for common cryptographic kernels, CryptoSRAM achieves throughput improvements of up to 74$\times$ and 67$\times$ for AES and SHA3, respectively, compared to a software implementation. Furthermore, our solution delivers up to 6$\times$ higher throughput than existing hardware accelerators for AES. CryptoSRAM demonstrates a viable and efficient architecture for secure communication in next-generation IoT systems.
Authors:Bingcan Guo, Eryue Xu, Zhiping Zhang, Tianshi Li
Abstract:
Aligning AI systems with human privacy preferences requires understanding individuals' nuanced disclosure behaviors beyond general norms. Yet eliciting such boundaries remains challenging due to the context-dependent nature of privacy decisions and the complex trade-offs involved. We present an AI-powered elicitation approach that probes individuals' privacy boundaries through a discriminative task. We conducted a between-subjects study that systematically varied communication roles and delegation conditions, resulting in 1,681 boundary specifications from 169 participants for 61 scenarios. We examined how these contextual factors and individual differences influence the boundary specification. Quantitative results show that communication roles influence individuals' acceptance of detailed and identifiable disclosure, AI delegation and individuals' need for privacy heighten sensitivity to disclosed identifiers, and AI delegation results in less consensus across individuals. Our findings highlight the importance of situating privacy preference elicitation within real-world data flows. We advocate using nuanced privacy boundaries as an alignment goal for future AI systems.
Authors:Rui Yang, Michael Fu, Chakkrit Tantithamthavorn, Chetan Arora, Gunel Gulmammadova, Joey Chua
Abstract:
Intelligent software systems powered by Large Language Models (LLMs) are increasingly deployed in critical sectors, raising concerns about their safety during runtime. Through an industry-academic collaboration when deploying an LLM-powered virtual customer assistant, a critical software engineering challenge emerged: how to enhance a safer deployment of LLM-powered software systems at runtime? While LlamaGuard, the current state-of-the-art runtime guardrail, offers protection against unsafe inputs, our study reveals a Defense Success Rate (DSR) drop of 24% under obfuscation- and template-based jailbreak attacks. In this paper, we propose DecipherGuard, a novel framework that integrates a deciphering layer to counter obfuscation-based prompts and a low-rank adaptation mechanism to enhance guardrail effectiveness against template-based attacks. Empirical evaluation on over 22,000 prompts demonstrates that DecipherGuard improves DSR by 36% to 65% and Overall Guardrail Performance (OGP) by 20% to 50% compared to LlamaGuard and two other runtime guardrails. These results highlight the effectiveness of DecipherGuard in defending LLM-powered software systems against jailbreak attacks during runtime.
Authors:Kai Tan, Dongyang Zhan, Lin Ye, Hongli Zhang, Binxing Fang, Zhihong Tian
Abstract:
As cloud computing continues to advance and become an integral part of modern IT infrastructure, container security has emerged as a critical factor in ensuring the smooth operation of cloud-native applications. An attacker can attack the service in the container or even perform the container escape attack by tampering with the files. Monitoring container files is important for APT detection and cyberspace security. Existing file monitoring methods are usually based on host operating system or virtual machine introspection to protect file security in real time. The methods based on the host operating system usually monitor file operations in the host operating system. However, when the container escapes to the host, the host operating system will no longer be secure, so these methods face the problem of weak security. Aiming at the problems of low security and high overload introduced in existing container file monitoring, a high-performance container file monitoring method based on virtual machine introspection is proposed. The experimental results show that the proposed approach can effectively monitor the container files and introduce an acceptable monitoring overload.
Authors:Dongyang Zhan, Kai Tan, Lin Ye, Xiangzhan Yu, Hongli Zhang, Zheng He
Abstract:
Sequential deep learning models (e.g., RNN and LSTM) can learn the sequence features of software behaviors, such as API or syscall sequences. However, recent studies have shown that these deep learning-based approaches are vulnerable to adversarial samples. Attackers can use adversarial samples to change the sequential characteristics of behavior sequences and mislead malware classifiers. In this paper, an adversarial robustness anomaly detection method based on the analysis of behavior units is proposed to overcome this problem. We extract related behaviors that usually perform a behavior intention as a behavior unit, which contains the representative semantic information of local behaviors and can be used to improve the robustness of behavior analysis. By learning the overall semantics of each behavior unit and the contextual relationships among behavior units based on a multilevel deep learning model, our approach can mitigate perturbation attacks that target local and large-scale behaviors. In addition, our approach can be applied to both low-level and high-level behavior logs (e.g., API and syscall logs). The experimental results show that our approach outperforms all the compared methods, which indicates that our approach has better performance against obfuscation attacks.
Authors:Kai Tan, Dongyang Zhan, Lin Ye, Hongli Zhang, Binxing Fang
Abstract:
Sequence-based deep learning models (e.g., RNNs), can detect malware by analyzing its behavioral sequences. Meanwhile, these models are susceptible to adversarial attacks. Attackers can create adversarial samples that alter the sequence characteristics of behavior sequences to deceive malware classifiers. The existing methods for generating adversarial samples typically involve deleting or replacing crucial behaviors in the original data sequences, or inserting benign behaviors that may violate the behavior constraints. However, these methods that directly manipulate sequences make adversarial samples difficult to implement or apply in practice. In this paper, we propose an adversarial attack approach based on Deep Q-Network and a heuristic backtracking search strategy, which can generate perturbation sequences that satisfy practical conditions for successful attacks. Subsequently, we utilize a novel transformation approach that maps modifications back to the source code, thereby avoiding the need to directly modify the behavior log sequences. We conduct an evaluation of our approach, and the results confirm its effectiveness in generating adversarial samples from real-world malware behavior sequences, which have a high success rate in evading anomaly detection models. Furthermore, our approach is practical and can generate adversarial samples while maintaining the functionality of the modified software.
Authors:Dongyang Zhan, Wenqi Zhang, Lin Ye, Xiangzhan Yu, Hongli Zhang, Zheng He
Abstract:
Industrial control systems (ICSs) are widely used in industry, and their security and stability are very important. Once the ICS is attacked, it may cause serious damage. Therefore, it is very important to detect anomalies in ICSs. ICS can monitor and manage physical devices remotely using communication networks. The existing anomaly detection approaches mainly focus on analyzing the security of network traffic or sensor data. However, the behaviors of different domains (e.g., network traffic and sensor physical status) of ICSs are correlated, so it is difficult to comprehensively identify anomalies by analyzing only a single domain. In this paper, an anomaly detection approach based on cross-domain representation learning in ICSs is proposed, which can learn the joint features of multi-domain behaviors and detect anomalies within different domains. After constructing a cross-domain graph that can represent the behaviors of multiple domains in ICSs, our approach can learn the joint features of them by leveraging graph neural networks. Since anomalies behave differently in different domains, we leverage a multi-task learning approach to identify anomalies in different domains separately and perform joint training. The experimental results show that the performance of our approach is better than existing approaches for identifying anomalies in ICSs.
Authors:Zhaofeng Yu, Dongyang Zhan, Lin Ye, Haining Yu, Hongli Zhang, Zhihong Tian
Abstract:
Recently, the WebAssembly (or Wasm) technology has been rapidly evolving, with many runtimes actively under development, providing cross-platform secure sandboxes for Wasm modules to run as portable containers. Compared with Docker, which isolates applications at the operating system level, Wasm runtimes provide more security mechanisms, such as linear memory, type checking, and protected call stacks. Although Wasm is designed with security in mind and considered to be a more secure container runtime, various security challenges have arisen, and researchers have focused on the security of Wasm runtimes, such as discovering vulnerabilities or proposing new security mechanisms to achieve robust isolation. However, we have observed that the resource isolation is not well protected by the current Wasm runtimes, and attackers can exhaust the host's resources to interfere with the execution of other container instances by exploiting the WASI/WASIX interfaces. And the attack surface has not been well explored and measured. In this paper, we explore the resource isolation attack surface of Wasm runtimes systematically by proposing several static Wasm runtime analysis approaches. Based on the analysis results, we propose several exploitation strategies to break the resource isolation of Wasm runtimes. The experimental results show that malicious Wasm instances can not only consume large amounts of system resources on their own but also introduce high workloads into other components of the underlying operating system, leading to a substantial performance degradation of the whole system. In addition, the mitigation approaches have also been discussed.
Authors:Zhenguang Liu, Lixun Ma, Zhongzheng Mu, Chengkun Wei, Xiaojun Xu, Yingying Jiao, Kui Ren
Abstract:
Widespread reuse of open-source code in smart contract development boosts programming efficiency but significantly amplifies bug propagation across contracts, while dedicated methods for detecting similar smart contract functions remain very limited. Conventional abstract-syntax-tree (AST) based methods for smart contract similarity detection face challenges in handling intricate tree structures, which impedes detailed semantic comparison of code. Recent deep-learning based approaches tend to overlook code syntax and detection interpretability, resulting in suboptimal performance. To fill this research gap, we introduce SmartDetector, a novel approach for computing similarity between smart contract functions, explainable at the fine-grained statement level. Technically, SmartDetector decomposes the AST of a smart contract function into a series of smaller statement trees, each reflecting a structural element of the source code. Then, SmartDetector uses a classifier to compute the similarity score of two functions by comparing each pair of their statement trees. To address the infinite hyperparameter space of the classifier, we mathematically derive a cosine-wise diffusion process to efficiently search optimal hyperparameters. Extensive experiments conducted on three large real-world datasets demonstrate that SmartDetector outperforms current state-of-the-art methods by an average improvement of 14.01% in F1-score, achieving an overall average F1-score of 95.88%.
Authors:Hailong Chang, Guozhu Meng, Shuhui Xiao, Kai Chen, Kun Sun, Yilin Li
Abstract:
With the growing demand for cross-language codebase migration, evaluating LLMs' security implications in translation tasks has become critical. Existing evaluations primarily focus on syntactic or functional correctness at the function level, neglecting the critical dimension of security. To enable security evaluation, we construct STED (Security-centric Translation Evaluation Dataset), the first dataset specifically designed for evaluating the security implications of LLM-based code translation. It comprises 720 security-related code samples across five programming languages and nine high-impact CWE categories, sourced from CVE/NVD and manually verified for translation tasks. Our evaluation framework consists of two independent assessment modules: (1) rigorous evaluation by security researchers, and (2) automated analysis via LLM-as-a-judge. Together they evaluate three critical aspects: functional correctness, vulnerability preservation, and vulnerability introduction rates. Our large-scale evaluation of five state-of-the-art LLMs across 6,000 translation instances reveals significant security degradation, with 28.6-45% of translations introducing new vulnerabilities--particularly for web-related flaws like input validation, where LLMs show consistent weaknesses. Furthermore, we develop a Retrieval-Augmented Generation (RAG)-based mitigation strategy that reduces translation-induced vulnerabilities by 32.8%, showing the potential of knowledge-enhanced prompting.
Authors:Ruisi Zhang, Yifei Zhao, Neusha Javidnia, Mengxin Zheng, Farinaz Koushanfar
Abstract:
As on-device LLMs(e.g., Apple on-device Intelligence) are widely adopted to reduce network dependency, improve privacy, and enhance responsiveness, verifying the legitimacy of models running on local devices becomes critical. Existing attestation techniques are not suitable for billion-parameter Large Language Models (LLMs), struggling to remain both time- and memory-efficient while addressing emerging threats in the LLM era. In this paper, we present AttestLLM, the first-of-its-kind attestation framework to protect the hardware-level intellectual property (IP) of device vendors by ensuring that only authorized LLMs can execute on target platforms. AttestLLM leverages an algorithm/software/hardware co-design approach to embed robust watermarking signatures onto the activation distributions of LLM building blocks. It also optimizes the attestation protocol within the Trusted Execution Environment (TEE), providing efficient verification without compromising inference throughput. Extensive proof-of-concept evaluations on LLMs from Llama, Qwen, and Phi families for on-device use cases demonstrate AttestLLM's attestation reliability, fidelity, and efficiency. Furthermore, AttestLLM enforces model legitimacy and exhibits resilience against model replacement and forgery attacks.
Authors:Colin Roberts, Vivek Nair, Dawn Song
Abstract:
The Multi-Factor Key Derivation Function (MFKDF) offered a novel solution to the classic problem of usable client-side key management by incorporating multiple popular authentication factors into a key derivation process, but was later shown to be vulnerable to cryptanalysis that degraded its security over multiple invocations. In this paper, we present the Entropy State Transition Modeling Framework (ESTMF), a novel cryptanalytic technique designed to reveal pernicious leaks of entropy across multiple invocations of a cryptographic key derivation or hash function, and show that it can be used to correctly identify each of the known vulnerabilities in the original MFKDF construction. We then use these findings to propose a new construction for ``MFKDF2,'' a next-generation multi-factor key derivation function that can be proven to be end-to-end secure using the ESTMF. Finally, we discuss how MFKDF2 can be extended to support more authentication factors and usability features than the previous MFKDF construction, and derive several generalizable best-practices for the construction of new KDFs in the future.
Authors:Jehad Jilan, Niranjana Naveen Nambiar, Ahmad Mohammad Saber, Alok Paranjape, Amr Youssef, Deepa Kundur
Abstract:
Automatic Generation Control (AGC) is essential for power grid stability but remains vulnerable to stealthy cyberattacks, such as False Data Injection Attacks (FDIAs), which can disturb the system's stability while evading traditional detection methods. Unlike previous works that relied on blackbox approaches, this work proposes Kolmogorov-Arnold Networks (KAN) as an interpretable and accurate method for FDIA detection in AGC systems, considering the system nonlinearities. KAN models include a method for extracting symbolic equations, and are thus able to provide more interpretability than the majority of machine learning models. The proposed KAN is trained offline to learn the complex nonlinear relationships between the AGC measurements under different operating scenarios. After training, symbolic formulas that describe the trained model's behavior can be extracted and leveraged, greatly enhancing interpretability. Our findings confirm that the proposed KAN model achieves FDIA detection rates of up to 95.97% and 95.9% for the initial model and the symbolic formula, respectively, with a low false alarm rate, offering a reliable approach to enhancing AGC cybersecurity.
Authors:Tong Liu, Guozhu Meng, Peng Zhou, Zizhuang Deng, Shuaiyin Yao, Kai Chen
Abstract:
Pickle deserialization vulnerabilities have persisted throughout Python's history, remaining widely recognized yet unresolved. Due to its ability to transparently save and restore complex objects into byte streams, many AI/ML frameworks continue to adopt pickle as the model serialization protocol despite its inherent risks. As the open-source model ecosystem grows, model-sharing platforms such as Hugging Face have attracted massive participation, significantly amplifying the real-world risks of pickle exploitation and opening new avenues for model supply chain poisoning. Although several state-of-the-art scanners have been developed to detect poisoned models, their incomplete understanding of the poisoning surface leaves the detection logic fragile and allows attackers to bypass them. In this work, we present the first systematic disclosure of the pickle-based model poisoning surface from both model loading and risky function perspectives. Our research demonstrates how pickle-based model poisoning can remain stealthy and highlights critical gaps in current scanning solutions. On the model loading surface, we identify 22 distinct pickle-based model loading paths across five foundational AI/ML frameworks, 19 of which are entirely missed by existing scanners. We further develop a bypass technique named Exception-Oriented Programming (EOP) and discover 9 EOP instances, 7 of which can bypass all scanners. On the risky function surface, we discover 133 exploitable gadgets, achieving almost a 100% bypass rate. Even against the best-performing scanner, these gadgets maintain an 89% bypass rate. By systematically revealing the pickle-based model poisoning surface, we achieve practical and robust bypasses against real-world scanners. We responsibly disclose our findings to corresponding vendors, receiving acknowledgments and a $6000 bug bounty.
Authors:Yunqi Mi, Jiakui Shen, Guoshuai Zhao, Jialie Shen, Xueming Qian
Abstract:
Extending recommender systems to federated learning (FL) frameworks to protect the privacy of users or platforms while making recommendations has recently gained widespread attention in academia. This is due to the natural coupling of recommender systems and federated learning architectures: the data originates from distributed clients (mostly mobile devices held by users), which are highly related to privacy. In a centralized recommender system (CenRec), the central server collects clients' data, trains the model, and provides the service. Whereas in federated recommender systems (FedRec), the step of data collecting is omitted, and the step of model training is offloaded to each client. The server only aggregates the model and other knowledge, thus avoiding client privacy leakage. Some surveys of federated recommender systems discuss and analyze related work from the perspective of designing FL systems. However, their utility drops by ignoring specific recommendation scenarios' unique characteristics and practical challenges. For example, the statistical heterogeneity issue in cross-domain FedRec originates from the label drift of the data held by different platforms, which is mainly caused by the recommender itself, but not the federated architecture. Therefore, it should focus more on solving specific problems in real-world recommendation scenarios to encourage the deployment FedRec. To this end, this review comprehensively analyzes the coupling of recommender systems and federated learning from the perspective of recommendation researchers and practitioners. We establish a clear link between recommendation scenarios and FL frameworks, systematically analyzing scenario-specific approaches, practical challenges, and potential opportunities. We aim to develop guidance for the real-world deployment of FedRec, bridging the gap between existing research and applications.
Authors:Thusitha Dayaratne, Ngoc Duy Pham, Viet Vo, Shangqi Lai, Sharif Abuadbba, Hajime Suzuki, Xingliang Yuan, Carsten Rudolph
Abstract:
The introduction of 5G and the Open Radio Access Network (O-RAN) architecture has enabled more flexible and intelligent network deployments. However, the increased complexity and openness of these architectures also introduce novel security challenges, such as data manipulation attacks on the semi-standardised Shared Data Layer (SDL) within the O-RAN platform through malicious xApps. In particular, malicious xApps can exploit this vulnerability by introducing subtle Unicode-wise alterations (hypoglyphs) into the data that are being used by traditional machine learning (ML)-based anomaly detection methods. These Unicode-wise manipulations can potentially bypass detection and cause failures in anomaly detection systems based on traditional ML, such as AutoEncoders, which are unable to process hypoglyphed data without crashing. We investigate the use of Large Language Models (LLMs) for anomaly detection within the O-RAN architecture to address this challenge. We demonstrate that LLM-based xApps maintain robust operational performance and are capable of processing manipulated messages without crashing. While initial detection accuracy requires further improvements, our results highlight the robustness of LLMs to adversarial attacks such as hypoglyphs in input data. There is potential to use their adaptability through prompt engineering to further improve the accuracy, although this requires further research. Additionally, we show that LLMs achieve low detection latency (under 0.07 seconds), making them suitable for Near-Real-Time (Near-RT) RIC deployments.
Authors:Chaoyang Gao, Xiang Chen, Jiyu Wang, Jibin Wang, Guang Yang
Abstract:
The increasing complexity of software systems has led to a surge in cybersecurity vulnerabilities, necessitating efficient and scalable solutions for vulnerability assessment. However, the deployment of large pre-trained models in real-world scenarios is hindered by their substantial computational and storage demands. To address this challenge, we propose a novel resource-efficient framework that integrates knowledge distillation and particle swarm optimization to enable automated vulnerability assessment. Our framework employs a two-stage approach: First, particle swarm optimization is utilized to optimize the architecture of a compact student model, balancing computational efficiency and model capacity. Second, knowledge distillation is applied to transfer critical vulnerability assessment knowledge from a large teacher model to the optimized student model. This process significantly reduces the model size while maintaining high performance. Experimental results on an enhanced MegaVul dataset, comprising 12,071 CVSS (Common Vulnerability Scoring System) v3 annotated vulnerabilities, demonstrate the effectiveness of our approach. Our approach achieves a 99.4% reduction in model size while retaining 89.3% of the original model's accuracy. Furthermore, it outperforms state-of-the-art baselines by 1.7% in accuracy with 60% fewer parameters. The framework also reduces training time by 72.1% and architecture search time by 34.88% compared to traditional genetic algorithms.
Authors:Zhihong Liang, Xin Wang, Zhenhuang Hu, Liangliang Song, Lin Chen, Jingjing Guo, Yanbin Wang, Ye Tian
Abstract:
With the rapid expansion of web-based applications and cloud services, malicious JavaScript code continues to pose significant threats to user privacy, system integrity, and enterprise security. But, detecting such threats remains challenging due to sophisticated code obfuscation techniques and JavaScript's inherent language characteristics, particularly its nested closure structures and syntactic flexibility. In this work, we propose DeCoda, a hybrid defense framework that combines large language model (LLM)-based deobfuscation with code graph learning: (1) We first construct a sophisticated prompt-learning pipeline with multi-stage refinement, where the LLM progressively reconstructs the original code structure from obfuscated inputs and then generates normalized Abstract Syntax Tree (AST) representations; (2) In JavaScript ASTs, dynamic typing scatters semantically similar nodes while deeply nested functions fracture scope capturing, introducing structural noise and semantic ambiguity. To address these challenges, we then propose to learn hierarchical code graph representations via a Cluster-wise Graph that synergistically integrates graph transformer network, node clustering, and node-to-cluster attention to simultaneously capture both local node-level semantics and global cluster-induced structural relationships from AST graph. Experimental results demonstrate that our method achieves F1-scores of 94.64% and 97.71% on two benchmark datasets, demonstrating absolute improvements of 10.74% and 13.85% over state-of-the-art baselines. In false-positive control evaluation at fixed FPR levels (0.0001, 0.001, 0.01), our approach delivers 4.82, 5.91, and 2.53 higher TPR respectively compared to the best-performing baseline. These results highlight the effectiveness of LLM-based deobfuscation and underscore the importance of modeling cluster-level relationships in detecting malicious code.
Authors:Muhammad Sharshar, Ahmad Mohammad Saber, Davor Svetinovic, Amr M. Youssef, Deepa Kundur, Ehab F. El-Saadany
Abstract:
The increasing digitization of smart grids has improved operational efficiency but also introduced new cybersecurity vulnerabilities, such as False Data Injection Attacks (FDIAs) targeting Automatic Generation Control (AGC) systems. While machine learning (ML) and deep learning (DL) models have shown promise in detecting such attacks, their opaque decision-making limits operator trust and real-world applicability. This paper proposes a hybrid framework that integrates lightweight ML-based attack detection with natural language explanations generated by Large Language Models (LLMs). Classifiers such as LightGBM achieve up to 95.13% attack detection accuracy with only 0.004 s inference latency. Upon detecting a cyberattack, the system invokes LLMs, including GPT-3.5 Turbo, GPT-4 Turbo, and GPT-4o mini, to generate human-readable explanation of the event. Evaluated on 100 test samples, GPT-4o mini with 20-shot prompting achieved 93% accuracy in identifying the attack target, a mean absolute error of 0.075 pu in estimating attack magnitude, and 2.19 seconds mean absolute error (MAE) in estimating attack onset. These results demonstrate that the proposed framework effectively balances real-time detection with interpretable, high-fidelity explanations, addressing a critical need for actionable AI in smart grid cybersecurity.
Authors:Armita Khashayardoost, Ahmad Mohammad Saber, Deepa Kundur
Abstract:
The increasing integration of IoT-connected devices in smart grids has introduced new vulnerabilities at the distribution level. Of particular concern is the potential for cyberattacks that exploit high-wattage IoT devices, such as EV chargers, to manipulate local demand and destabilize the grid. While previous studies have primarily focused on such attacks at the transmission level, this paper investigates their feasibility and impact at the distribution level. We examine how cyberattackers can target voltage-sensitive nodes, especially those exposed by the presence of high-consumption devices, to cause voltage deviation and service disruption. Our analysis demonstrates that conventional grid protections are insufficient against these intelligent, localized attacks. To address this, we propose resilience strategies using distributed generation (DGs), exploring their role in preemptive planning. This research highlights the urgent need for distribution-level cyber resilience planning in smart grids.
Authors:Ruofan Liu, Yun Lin, Silas Yeo Shuen Yu, Xiwen Teoh, Zhenkai Liang, Jin Song Dong
Abstract:
Phishing emails are a critical component of the cybercrime kill chain due to their wide reach and low cost. Their ever-evolving nature renders traditional rule-based and feature-engineered detectors ineffective in the ongoing arms race between attackers and defenders. The rise of large language models (LLMs) further exacerbates the threat, enabling attackers to craft highly convincing phishing emails at minimal cost.
This work demonstrates that LLMs can generate psychologically persuasive phishing emails tailored to victim profiles, successfully bypassing nearly all commercial and academic detectors. To defend against such threats, we propose PiMRef, the first reference-based phishing email detector that leverages knowledge-based invariants. Our core insight is that persuasive phishing emails often contain disprovable identity claims, which contradict real-world facts. PiMRef reframes phishing detection as an identity fact-checking task. Given an email, PiMRef (i) extracts the sender's claimed identity, (ii) verifies the legitimacy of the sender's domain against a predefined knowledge base, and (iii) detects call-to-action prompts that push user engagement. Contradictory claims are flagged as phishing indicators and serve as human-understandable explanations.
Compared to existing methods such as D-Fence, HelpHed, and ChatSpamDetector, PiMRef boosts precision by 8.8% with no loss in recall on standard benchmarks like Nazario and PhishPot. In a real-world evaluation of 10,183 emails across five university accounts over three years, PiMRef achieved 92.1% precision, 87.9% recall, and a median runtime of 0.05s, outperforming the state-of-the-art in both effectiveness and efficiency.
Authors:Khoa Nguyen, Tanveer Khan, Hossein Abdinasibfar, Antonis Michalas
Abstract:
Federated Learning (FL) enables collaborative model training without sharing raw data, making it a promising approach for privacy-sensitive domains. Despite its potential, FL faces significant challenges, particularly in terms of communication overhead and data privacy. Privacy-preserving Techniques (PPTs) such as Homomorphic Encryption (HE) have been used to mitigate these concerns. However, these techniques introduce substantial computational and communication costs, limiting their practical deployment. In this work, we explore how Hybrid Homomorphic Encryption (HHE), a cryptographic protocol that combines symmetric encryption with HE, can be effectively integrated with FL to address both communication and privacy challenges, paving the way for scalable and secure decentralized learning system.
Authors:Wenxuan Shi, Hongwei Li, Jiahao Yu, Xinqian Sun, Wenbo Guo, Xinyu Xing
Abstract:
Collaborative fuzzing combines multiple individual fuzzers and dynamically chooses appropriate combinations for different programs. Unlike individual fuzzers that rely on specific assumptions, collaborative fuzzing relaxes assumptions on target programs, providing robust performance across various programs. However, existing collaborative fuzzing frameworks face challenges including additional computational resource requirements and inefficient resource allocation among fuzzers. To tackle these challenges, we present BANDFUZZ, an ML-powered collaborative fuzzing framework that outperforms individual fuzzers without requiring additional computational resources. The key contribution of BANDFUZZ lies in its novel resource allocation algorithm driven by our proposed multi-armed bandits model. Different from greedy methods in existing frameworks, BANDFUZZ models the long-term impact of individual fuzzers, enabling discovery of globally optimal collaborative strategies. We propose a novel fuzzer evaluation method that assesses not only code coverage but also the fuzzer's capability of solving difficult branches. Finally, we integrate a real-time seed synchronization mechanism and implementation-wise optimizations to improve fuzzing efficiency and stability. Through extensive experiments on Fuzzbench and Fuzzer Test Suite, we show that BANDFUZZ outperforms state-of-the-art collaborative fuzzing framework autofz and widely used individual fuzzers. We verify BANDFUZZ's key designs through comprehensive ablation study. Notably, we demonstrate BANDFUZZ's effectiveness in real-world bug detection by analyzing results of a worldwide fuzzing competition, where BANDFUZZ won first place.
Authors:Tanveer Khan, Mindaugas Budzys, Antonis Michalas
Abstract:
Split Learning (SL) -- splits a model into two distinct parts to help protect client data while enhancing Machine Learning (ML) processes. Though promising, SL has proven vulnerable to different attacks, thus raising concerns about how effective it may be in terms of data privacy. Recent works have shown promising results for securing SL through the use of a novel paradigm, named Function Secret Sharing (FSS), in which servers obtain shares of a function they compute and operate on a public input hidden with a random mask. However, these works fall short in addressing the rising number of attacks which exist on SL. In SplitHappens, we expand the combination of FSS and SL to U-shaped SL. Similarly to other works, we are able to make use of the benefits of SL by reducing the communication and computational costs of FSS. However, a U-shaped SL provides a higher security guarantee than previous works, allowing a client to keep the labels of the training data secret, without having to share them with the server. Through this, we are able to generalize the security analysis of previous works and expand it to different attack vectors, such as modern model inversion attacks as well as label inference attacks. We tested our approach for two different convolutional neural networks on different datasets. These experiments show the effectiveness of our approach in reducing the training time as well as the communication costs when compared to simply using FSS while matching prior accuracy.
Authors:Tianzhe Zhao, Jiaoyan Chen, Yanchi Ru, Haiping Zhu, Nan Hu, Jun Liu, Qika Lin
Abstract:
Retrieval-Augmented Generation (RAG) enhances large language models (LLMs) by retrieving external data to mitigate hallucinations and outdated knowledge issues. Benefiting from the strong ability in facilitating diverse data sources and supporting faithful reasoning, knowledge graphs (KGs) have been increasingly adopted in RAG systems, giving rise to KG-based RAG (KG-RAG) methods. Though RAG systems are widely applied in various applications, recent studies have also revealed its vulnerabilities to data poisoning attacks, where malicious information injected into external knowledge sources can mislead the system into producing incorrect or harmful responses. However, these studies focus exclusively on RAG systems using unstructured textual data sources, leaving the security risks of KG-RAG largely unexplored, despite the fact that KGs present unique vulnerabilities due to their structured and editable nature. In this work, we conduct the first systematic investigation of the security issue of KG-RAG methods through data poisoning attacks. To this end, we introduce a practical, stealthy attack setting that aligns with real-world implementation. We propose an attack strategy that first identifies adversarial target answers and then inserts perturbation triples to complete misleading inference chains in the KG, increasing the likelihood that KG-RAG methods retrieve and rely on these perturbations during generation. Through extensive experiments on two benchmarks and four recent KG-RAG methods, our attack strategy demonstrates strong effectiveness in degrading KG-RAG performance, even with minimal KG perturbations. In-depth analyses are also conducted to understand the safety threats within the internal stages of KG-RAG systems and to explore the robustness of LLMs against adversarial knowledge.
Authors:Marika Swanberg, Meenatchi Sundaram Muthu Selva Annamalai, Jamie Hayes, Borja Balle, Adam Smith
Abstract:
Differential Privacy (DP) is a family of definitions that bound the worst-case privacy leakage of a mechanism. One important feature of the worst-case DP guarantee is it naturally implies protections against adversaries with less prior information, more sophisticated attack goals, and complex measures of a successful attack. However, the analytical tradeoffs between the adversarial model and the privacy protections conferred by DP are not well understood thus far. To that end, this work sheds light on what the worst-case guarantee of DP implies about the success of attackers that are more representative of real-world privacy risks.
In this paper, we present a single flexible framework that generalizes and extends the patchwork of bounds on DP mechanisms found in prior work. Our framework allows us to compute high-probability guarantees for DP mechanisms on a large family of natural attack settings that previous bounds do not capture. One class of such settings is the approximate reconstruction of multiple individuals' data, such as inferring nearly entire columns of a tabular data set from noisy marginals and extracting sensitive information from DP-trained language models.
We conduct two empirical case studies to illustrate the versatility of our bounds and compare them to the success of state-of-the-art attacks. Specifically, we study attacks that extract non-uniform PII from a DP-trained language model, as well as multi-column reconstruction attacks where the adversary has access to some columns in the clear and attempts to reconstruct the remaining columns for each person's record. We find that the absolute privacy risk of attacking non-uniform data is highly dependent on the adversary's prior probability of success. Our high probability bounds give us a nuanced understanding of the privacy leakage of DP mechanisms in a variety of previously understudied attack settings.
Authors:Bogdan Kulynych, Juan Felipe Gomez, Georgios Kaissis, Jamie Hayes, Borja Balle, Flavio du Pin Calmon, Jean Louis Raisaro
Abstract:
Differentially private (DP) mechanisms are difficult to interpret and calibrate because existing methods for mapping standard privacy parameters to concrete privacy risks -- re-identification, attribute inference, and data reconstruction -- are both overly pessimistic and inconsistent. In this work, we use the hypothesis-testing interpretation of DP ($f$-DP), and determine that bounds on attack success can take the same unified form across re-identification, attribute inference, and data reconstruction risks. Our unified bounds are (1) consistent across a multitude of attack settings, and (2) tunable, enabling practitioners to evaluate risk with respect to arbitrary (including worst-case) levels of baseline risk. Empirically, our results are tighter than prior methods using $\varepsilon$-DP, Rényi DP, and concentrated DP. As a result, calibrating noise using our bounds can reduce the required noise by 20% at the same risk level, which yields, e.g., more than 15pp accuracy increase in a text classification task. Overall, this unifying perspective provides a principled framework for interpreting and calibrating the degree of protection in DP against specific levels of re-identification, attribute inference, or data reconstruction risk.
Authors:Satyapriya Krishna, Ninareh Mehrabi, Abhinav Mohanty, Matteo Memelli, Vincent Ponzo, Payal Motwani, Rahul Gupta
Abstract:
Nova Premier is Amazon's most capable multimodal foundation model and teacher for model distillation. It processes text, images, and video with a one-million-token context window, enabling analysis of large codebases, 400-page documents, and 90-minute videos in a single prompt. We present the first comprehensive evaluation of Nova Premier's critical risk profile under the Frontier Model Safety Framework. Evaluations target three high-risk domains -- Chemical, Biological, Radiological & Nuclear (CBRN), Offensive Cyber Operations, and Automated AI R&D -- and combine automated benchmarks, expert red-teaming, and uplift studies to determine whether the model exceeds release thresholds. We summarize our methodology and report core findings. Based on this evaluation, we find that Nova Premier is safe for public release as per our commitments made at the 2025 Paris AI Safety Summit. We will continue to enhance our safety evaluation and mitigation pipelines as new risks and capabilities associated with frontier models are identified.
Authors:Anand D. Sarwate, Flavio P. Calmon, Oliver Kosut, Lalitha Sankar
Abstract:
Since being proposed in 2006, differential privacy has become a standard method for quantifying certain risks in publishing or sharing analyses of sensitive data. At its heart, differential privacy measures risk in terms of the differences between probability distributions, which is a central topic in information theory. A differentially private algorithm is a channel between the underlying data and the output of the analysis. Seen in this way, the guarantees made by differential privacy can be understood in terms of properties of this channel. In this article we examine a few of the key connections between information theory and the formulation/application of differential privacy, giving an ``operational significance'' for relevant information measures.
Authors:Yuxuan Bai, Gauri Pradhan, Marlon Tobaben, Antti Honkela
Abstract:
With the emergence of powerful large-scale foundation models, the training paradigm is increasingly shifting from from-scratch training to transfer learning. This enables high utility training with small, domain-specific datasets typical in sensitive applications. Membership inference attacks (MIAs) provide an empirical estimate of the privacy leakage by machine learning models. Yet, prior assessments of MIAs against models fine-tuned with transfer learning rely on a small subset of possible attacks. We address this by comparing performance of diverse MIAs in transfer learning settings to help practitioners identify the most efficient attacks for privacy risk evaluation. We find that attack efficacy decreases with the increase in training data for score-based MIAs. We find that there is no one MIA which captures all privacy risks in models trained with transfer learning. While the Likelihood Ratio Attack (LiRA) demonstrates superior performance across most experimental scenarios, the Inverse Hessian Attack (IHA) proves to be more effective against models fine-tuned on PatchCamelyon dataset in high data regime.
Authors:Xinjie Shen, Mufei Li, Pan Li
Abstract:
The deployment of Large Language Models (LLMs) in embodied agents creates an urgent need to measure their privacy awareness in the physical world. Existing evaluation methods, however, are confined to natural language based scenarios. To bridge this gap, we introduce EAPrivacy, a comprehensive evaluation benchmark designed to quantify the physical-world privacy awareness of LLM-powered agents. EAPrivacy utilizes procedurally generated scenarios across four tiers to test an agent's ability to handle sensitive objects, adapt to changing environments, balance task execution with privacy constraints, and resolve conflicts with social norms. Our measurements reveal a critical deficit in current models. The top-performing model, Gemini 2.5 Pro, achieved only 59\% accuracy in scenarios involving changing physical environments. Furthermore, when a task was accompanied by a privacy request, models prioritized completion over the constraint in up to 86\% of cases. In high-stakes situations pitting privacy against critical social norms, leading models like GPT-4o and Claude-3.5-haiku disregarded the social norm over 15\% of the time. These findings, demonstrated by our benchmark, underscore a fundamental misalignment in LLMs regarding physically grounded privacy and establish the need for more robust, physically-aware alignment.
Authors:Erfan Shayegani, Keegan Hines, Yue Dong, Nael Abu-Ghazaleh, Roman Lutz, Spencer Whitehead, Vidhisha Balachandran, Besmira Nushi, Vibhav Vineet
Abstract:
Computer-Use Agents (CUAs) are an increasingly deployed class of agents that take actions on GUIs to accomplish user goals. In this paper, we show that CUAs consistently exhibit Blind Goal-Directedness (BGD): a bias to pursue goals regardless of feasibility, safety, reliability, or context. We characterize three prevalent patterns of BGD: (i) lack of contextual reasoning, (ii) assumptions and decisions under ambiguity, and (iii) contradictory or infeasible goals. We develop BLIND-ACT, a benchmark of 90 tasks capturing these three patterns. Built on OSWorld, BLIND-ACT provides realistic environments and employs LLM-based judges to evaluate agent behavior, achieving 93.75% agreement with human annotations. We use BLIND-ACT to evaluate nine frontier models, including Claude Sonnet and Opus 4, Computer-Use-Preview, and GPT-5, observing high average BGD rates (80.8%) across them. We show that BGD exposes subtle risks that arise even when inputs are not directly harmful. While prompting-based interventions lower BGD levels, substantial risk persists, highlighting the need for stronger training- or inference-time interventions. Qualitative analysis reveals observed failure modes: execution-first bias (focusing on how to act over whether to act), thought-action disconnect (execution diverging from reasoning), and request-primacy (justifying actions due to user request). Identifying BGD and introducing BLIND-ACT establishes a foundation for future research on studying and mitigating this fundamental risk and ensuring safe CUA deployment.
Authors:Hengcheng Zhu, Songqiang Chen, Valerio Terragni, Lili Wei, Jiarong Wu, Yepang Liu, Shing-Chi Cheung
Abstract:
The Language Server Protocol (LSP) has revolutionized the integration of code intelligence in modern software development. There are approximately 300 LSP server implementations for various languages and 50 editors offering LSP integration. However, the reliability of LSP servers is a growing concern, as crashes can disable all code intelligence features and significantly impact productivity, while vulnerabilities can put developers at risk even when editing untrusted source code. Despite the widespread adoption of LSP, no existing techniques specifically target LSP server testing. To bridge this gap, we present LSPFuzz, a grey-box hybrid fuzzer for systematic LSP server testing. Our key insight is that effective LSP server testing requires holistic mutation of source code and editor operations, as bugs often manifest from their combinations. To satisfy the sophisticated constraints of LSP and effectively explore the input space, we employ a two-stage mutation pipeline: syntax-aware mutations to source code, followed by context-aware dispatching of editor operations. We evaluated LSPFuzz on four widely used LSP servers. LSPFuzz demonstrated superior performance compared to baseline fuzzers, and uncovered previously unknown bugs in real-world LSP servers. Of the 51 bugs we reported, 42 have been confirmed, 26 have been fixed by developers, and two have been assigned CVE numbers. Our work advances the quality assurance of LSP servers, providing both a practical tool and foundational insights for future research in this domain.
Authors:Tharcisse Ndayipfukamiye, Jianguo Ding, Doreen Sebastian Sarwatt, Adamu Gaston Philipo, Huansheng Ning
Abstract:
Machine learning-based cybersecurity systems are highly vulnerable to adversarial attacks, while Generative Adversarial Networks (GANs) act as both powerful attack enablers and promising defenses. This survey systematically reviews GAN-based adversarial defenses in cybersecurity (2021--August 31, 2025), consolidating recent progress, identifying gaps, and outlining future directions. Using a PRISMA-compliant systematic literature review protocol, we searched five major digital libraries. From 829 initial records, 185 peer-reviewed studies were retained and synthesized through quantitative trend analysis and thematic taxonomy development. We introduce a four-dimensional taxonomy spanning defensive function, GAN architecture, cybersecurity domain, and adversarial threat model. GANs improve detection accuracy, robustness, and data utility across network intrusion detection, malware analysis, and IoT security. Notable advances include WGAN-GP for stable training, CGANs for targeted synthesis, and hybrid GAN models for improved resilience. Yet, persistent challenges remain such as instability in training, lack of standardized benchmarks, high computational cost, and limited explainability. GAN-based defenses demonstrate strong potential but require advances in stable architectures, benchmarking, transparency, and deployment. We propose a roadmap emphasizing hybrid models, unified evaluation, real-world integration, and defenses against emerging threats such as LLM-driven cyberattacks. This survey establishes the foundation for scalable, trustworthy, and adaptive GAN-powered defenses.
Authors:Ali Burak Ãnal, Cem Ata Baykara, Peter Krawitz, Mete Akgün
Abstract:
Machine learning has shown promise in facial dysmorphology, where characteristic facial features provide diagnostic clues for rare genetic disorders. GestaltMatcher, a leading framework in this field, has demonstrated clinical utility across multiple studies, but its reliance on centralized datasets limits further development, as patient data are siloed across institutions and subject to strict privacy regulations. We introduce a federated GestaltMatcher service based on a cross-silo horizontal federated learning framework, which allows hospitals to collaboratively train a global ensemble feature extractor without sharing patient images. Patient data are mapped into a shared latent space, and a privacy-preserving kernel matrix computation framework enables syndrome inference and discovery while safeguarding confidentiality. New participants can directly benefit from and contribute to the system by adopting the global feature extractor and kernel configuration from previous training rounds. Experiments show that the federated service retains over 90% of centralized performance and remains robust to both varying silo numbers and heterogeneous data distributions.
Authors:Gabriele Digregorio, Marco Di Gennaro, Stefano Zanero, Stefano Longari, Michele Carminati
Abstract:
The rise of model-sharing through frameworks and dedicated hubs makes Machine Learning significantly more accessible. Despite their benefits, these tools expose users to underexplored security risks, while security awareness remains limited among both practitioners and developers. To enable a more security-conscious culture in Machine Learning model sharing, in this paper we evaluate the security posture of frameworks and hubs, assess whether security-oriented mechanisms offer real protection, and survey how users perceive the security narratives surrounding model sharing. Our evaluation shows that most frameworks and hubs address security risks partially at best, often by shifting responsibility to the user. More concerningly, our analysis of frameworks advertising security-oriented settings and complete model sharing uncovered six 0-day vulnerabilities enabling arbitrary code execution. Through this analysis, we debunk the misconceptions that the model-sharing problem is largely solved and that its security can be guaranteed by the file format used for sharing. As expected, our survey shows that the surrounding security narrative leads users to consider security-oriented settings as trustworthy, despite the weaknesses shown in this work. From this, we derive takeaways and suggestions to strengthen the security of model-sharing ecosystems.
Authors:Yi Yin, Guangquan Zhang, Hua Zuo, Jie Lu
Abstract:
Machine learning models require datasets for effective training, but directly sharing raw data poses significant privacy risk such as membership inference attacks (MIA). To mitigate the risk, privacy-preserving techniques such as data perturbation, generalization, and synthetic data generation are commonly utilized. However, these methods often degrade data accuracy, specificity, and diversity, limiting the performance of downstream tasks and thus reducing data utility. Therefore, striking an optimal balance between privacy preservation and data utility remains a critical challenge. To address this issue, we introduce a novel bilevel optimization framework for the publication of private datasets, where the upper-level task focuses on data utility and the lower-level task focuses on data privacy. In the upper-level task, a discriminator guides the generation process to ensure that perturbed latent variables are mapped to high-quality samples, maintaining fidelity for downstream tasks. In the lower-level task, our framework employs local extrinsic curvature on the data manifold as a quantitative measure of individual vulnerability to MIA, providing a geometric foundation for targeted privacy protection. By perturbing samples toward low-curvature regions, our method effectively suppresses distinctive feature combinations that are vulnerable to MIA. Through alternating optimization of both objectives, we achieve a synergistic balance between privacy and utility. Extensive experimental evaluations demonstrate that our method not only enhances resistance to MIA in downstream tasks but also surpasses existing methods in terms of sample quality and diversity.
Authors:Jiazheng Xing, Hai Ci, Hongbin Xu, Hangjie Yuan, Yong Liu, Mike Zheng Shou
Abstract:
Watermarking diffusion-generated images is crucial for copyright protection and user tracking. However, current diffusion watermarking methods face significant limitations: zero-bit watermarking systems lack the capacity for large-scale user tracking, while multi-bit methods are highly sensitive to certain image transformations or generative attacks, resulting in a lack of comprehensive robustness. In this paper, we propose OptMark, an optimization-based approach that embeds a robust multi-bit watermark into the intermediate latents of the diffusion denoising process. OptMark strategically inserts a structural watermark early to resist generative attacks and a detail watermark late to withstand image transformations, with tailored regularization terms to preserve image quality and ensure imperceptibility. To address the challenge of memory consumption growing linearly with the number of denoising steps during optimization, OptMark incorporates adjoint gradient methods, reducing memory usage from O(N) to O(1). Experimental results demonstrate that OptMark achieves invisible multi-bit watermarking while ensuring robust resilience against valuemetric transformations, geometric transformations, editing, and regeneration attacks.
Authors:Guangyu Yang, Jinghong Chen, Jingbiao Mei, Weizhe Lin, Bill Byrne
Abstract:
Large Language Models (LLMs) remain vulnerable to jailbreak attacks, which attempt to elicit harmful responses from LLMs. The evolving nature and diversity of these attacks pose many challenges for defense systems, including (1) adaptation to counter emerging attack strategies without costly retraining, and (2) control of the trade-off between safety and utility. To address these challenges, we propose Retrieval-Augmented Defense (RAD), a novel framework for jailbreak detection that incorporates a database of known attack examples into Retrieval-Augmented Generation, which is used to infer the underlying, malicious user query and jailbreak strategy used to attack the system. RAD enables training-free updates for newly discovered jailbreak strategies and provides a mechanism to balance safety and utility. Experiments on StrongREJECT show that RAD substantially reduces the effectiveness of strong jailbreak attacks such as PAP and PAIR while maintaining low rejection rates for benign queries. We propose a novel evaluation scheme and show that RAD achieves a robust safety-utility trade-off across a range of operating points in a controllable manner.
Authors:Zhixin Xie, Xurui Song, Jun Luo
Abstract:
Diffusion Large Language Models (dLLMs) have recently emerged as a competitive non-autoregressive paradigm due to their unique training and inference approach. However, there is currently a lack of safety study on this novel architecture. In this paper, we present the first analysis of dLLMs' safety performance and propose a novel safety alignment method tailored to their unique generation characteristics. Specifically, we identify a critical asymmetry between the defender and attacker in terms of security. For the defender, we reveal that the middle tokens of the response, rather than the initial ones, are more critical to the overall safety of dLLM outputs; this seems to suggest that aligning middle tokens can be more beneficial to the defender. The attacker, on the contrary, may have limited power to manipulate middle tokens, as we find dLLMs have a strong tendency towards a sequential generation order in practice, forcing the attack to meet this distribution and diverting it from influencing the critical middle tokens. Building on this asymmetry, we introduce Middle-tOken Safety Alignment (MOSA), a novel method that directly aligns the model's middle generation with safe refusals exploiting reinforcement learning. We implement MOSA and compare its security performance against eight attack methods on two benchmarks. We also test the utility of MOSA-aligned dLLM on coding, math, and general reasoning. The results strongly prove the superiority of MOSA.
Authors:Haoran Dai, Jiawen Wang, Ruo Yang, Manali Sharma, Zhonghao Liao, Yuan Hong, Binghui Wang
Abstract:
Text-to-image diffusion models (T2I DMs) have achieved remarkable success in generating high-quality and diverse images from text prompts, yet recent studies have revealed their vulnerability to backdoor attacks. Existing attack methods suffer from critical limitations: 1) they rely on unnatural adversarial prompts that lack human readability and require massive poisoned data; 2) their effectiveness is typically restricted to specific models, lacking generalizability; and 3) they can be mitigated by recent backdoor defenses.
To overcome these challenges, we propose a novel backdoor attack framework that achieves three key properties: 1) \emph{Practicality}: Our attack requires only a few stealthy backdoor samples to generate arbitrary attacker-chosen target images, as well as ensuring high-quality image generation in benign scenarios. 2) \emph{Generalizability:} The attack is applicable across multiple T2I DMs without requiring model-specific redesign. 3) \emph{Robustness:} The attack remains effective against existing backdoor defenses and adaptive defenses. Our extensive experimental results on multiple T2I DMs demonstrate that with only 10 carefully crafted backdoored samples, our attack method achieves $>$90\% attack success rate with negligible degradation in benign image generation quality. We also conduct human evaluation to validate our attack effectiveness. Furthermore, recent backdoor detection and mitigation methods, as well as adaptive defense tailored to our attack are not sufficiently effective, highlighting the pressing need for more robust defense mechanisms against the proposed attack.
Authors:Muzhi Dai, Shixuan Liu, Zhiyuan Zhao, Junyu Gao, Hao Sun, Xuelong Li
Abstract:
The rapid advancement of multimodal large language models (MLLMs) has led to breakthroughs in various applications, yet their security remains a critical challenge. One pressing issue involves unsafe image-query pairs--jailbreak inputs specifically designed to bypass security constraints and elicit unintended responses from MLLMs. Compared to general multimodal data, such unsafe inputs are relatively sparse, which limits the diversity and richness of training samples available for developing robust defense models. Meanwhile, existing guardrail-type methods rely on external modules to enforce security constraints but fail to address intrinsic vulnerabilities within MLLMs. Traditional supervised fine-tuning (SFT), on the other hand, often over-refuses harmless inputs, compromising general performance. Given these challenges, we propose Secure Tug-of-War (SecTOW), an innovative iterative defense-attack training method to enhance the security of MLLMs. SecTOW consists of two modules: a defender and an auxiliary attacker, both trained iteratively using reinforcement learning (GRPO). During the iterative process, the attacker identifies security vulnerabilities in the defense model and expands jailbreak data. The expanded data are then used to train the defender, enabling it to address identified security vulnerabilities. We also design reward mechanisms used for GRPO to simplify the use of response labels, reducing dependence on complex generative labels and enabling the efficient use of synthetic data. Additionally, a quality monitoring mechanism is used to mitigate the defender's over-refusal of harmless inputs and ensure the diversity of the jailbreak data generated by the attacker. Experimental results on safety-specific and general benchmarks demonstrate that SecTOW significantly improves security while preserving general performance.
Authors:Kaiwen Wang, Xiaolin Chang, Junchao Fan, Yuehan Dong
Abstract:
The existing MPC-based private inference frameworks either rely on impractical real-world assumptions, or adopt the strongest security model (Malicious Security Dishonest Majority, MSDM) and then suffer from severe efficiency limitations. To balance security and efficiency, we propose a novel, three-layer private inference framework based on the Helper-Assisted MSDM (HA-MSDM) model. The first is the primitive layer, where we extend computations from prime fields to rings for efficient fixed-point arithmetic and then better support inference operations. The second is the MPC layer, where we design six fixed-round MPC protocols to reduce latency for core operations like multiplication, polynomial evaluation, and batch check. The third is the inference layer, which can achieve efficient and high-accuracy CNN inference. The efficiency is achieved by applying our designed MPC protocols. The high-accuracy private inference in deep CNNs is achieved by designing a co-optimized strategy, which employs high-precision polynomial approximation for activation functions and uses parameter-adjusted Batch Normalization layers to constrain inputs. Benchmarks on LeNet and AlexNet show our framework achieves up to a 2.4-25.7x speedup in LAN and a 1.3-9.5x acceleration in WAN over the state-of-the-art MSDM frameworks with only 0.04-1.08% relative error.
Authors:Jingwen Li, Ru Zhang, Jianyi Liu, Wanguo Zhao
Abstract:
With the increasing complexity of cyberattacks, the proactive and forward-looking nature of threat intelligence has become more crucial for threat detection and provenance analysis. However, translating high-level attack patterns described in Tactics, Techniques, and Procedures (TTP) intelligence into actionable security policies remains a significant challenge. This challenge arises from the semantic gap between high-level threat intelligence and low-level provenance log. To address this issue, this paper introduces CLIProv, a novel approach for detecting threat behaviors in a host system. CLIProv employs a multimodal framework that leverages contrastive learning to align the semantics of provenance logs with threat intelligence, effectively correlating system intrusion activities with attack patterns. Furthermore, CLIProv formulates threat detection as a semantic search problem, identifying attack behaviors by searching for threat intelligence that is most semantically similar to the log sequence. By leveraging attack pattern information in threat intelligence, CLIProv identifies TTPs and generates complete and concise attack scenarios. Experimental evaluations on standard datasets show that CLIProv effectively identifies attack behaviors in system provenance logs, offering valuable references for potential techniques. Compared to state-of-the-art methods, CLIProv achieves higher precision and significantly improved detection efficiency.
Authors:Qiangqiang Wu, Yi Yu, Chenqi Kong, Ziquan Liu, Jia Wan, Haoliang Li, Alex C. Kot, Antoni B. Chan
Abstract:
With the rise of social media, vast amounts of user-uploaded videos (e.g., YouTube) are utilized as training data for Visual Object Tracking (VOT). However, the VOT community has largely overlooked video data-privacy issues, as many private videos have been collected and used for training commercial models without authorization. To alleviate these issues, this paper presents the first investigation on preventing personal video data from unauthorized exploitation by deep trackers. Existing methods for preventing unauthorized data use primarily focus on image-based tasks (e.g., image classification), directly applying them to videos reveals several limitations, including inefficiency, limited effectiveness, and poor generalizability. To address these issues, we propose a novel generative framework for generating Temporal Unlearnable Examples (TUEs), and whose efficient computation makes it scalable for usage on large-scale video datasets. The trackers trained w/ TUEs heavily rely on unlearnable noises for temporal matching, ignoring the original data structure and thus ensuring training video data-privacy. To enhance the effectiveness of TUEs, we introduce a temporal contrastive loss, which further corrupts the learning of existing trackers when using our TUEs for training. Extensive experiments demonstrate that our approach achieves state-of-the-art performance in video data-privacy protection, with strong transferability across VOT models, datasets, and temporal matching tasks.
Authors:Arthur Gervais, Liyi Zhou
Abstract:
Smart contract vulnerabilities have led to billions in losses, yet finding actionable exploits remains challenging. Traditional fuzzers rely on rigid heuristics and struggle with complex attacks, while human auditors are thorough but slow and don't scale. Large Language Models offer a promising middle ground, combining human-like reasoning with machine speed.
However, early studies show that simply prompting LLMs generates unverified vulnerability speculations with high false positive rates. To address this, we present A1, an agentic system that transforms any LLM into an end-to-end exploit generator. A1 provides agents with six domain-specific tools for autonomous vulnerability discovery, from understanding contract behavior to testing strategies on real blockchain states. All outputs are concretely validated through execution, ensuring only profitable proof-of-concept exploits are reported. We evaluate A1 across 36 real-world vulnerable contracts on Ethereum and Binance Smart Chain. A1 achieves a 63% success rate on the VERITE benchmark. Across all successful cases, A1 extracts up to \$8.59 million per exploit and \$9.33 million total. Through 432 experiments across six LLMs, we show that most exploits emerge within five iterations, with costs ranging \$0.01-\$3.59 per attempt.
Using Monte Carlo analysis of historical attacks, we demonstrate that immediate vulnerability detection yields 86-89% success probability, dropping to 6-21% with week-long delays. Our economic analysis reveals a troubling asymmetry: attackers achieve profitability at \$6,000 exploit values while defenders require \$60,000 -- raising fundamental questions about whether AI agents inevitably favor exploitation over defense.
Authors:Shuo Yang, Xinran Zheng, Xinchen Zhang, Jinfeng Xu, Jinze Li, Donglin Xie, Weicai Long, Edith C. H. Ngai
Abstract:
Large Language Models (LLMs) have revolutionized various fields with their exceptional capabilities in understanding, processing, and generating human-like text. This paper investigates the potential of LLMs in advancing Network Intrusion Detection Systems (NIDS), analyzing current challenges, methodologies, and future opportunities. It begins by establishing a foundational understanding of NIDS and LLMs, exploring the enabling technologies that bridge the gap between intelligent and cognitive systems in AI-driven NIDS. While Intelligent NIDS leverage machine learning and deep learning to detect threats based on learned patterns, they often lack contextual awareness and explainability. In contrast, Cognitive NIDS integrate LLMs to process both structured and unstructured security data, enabling deeper contextual reasoning, explainable decision-making, and automated response for intrusion behaviors. Practical implementations are then detailed, highlighting LLMs as processors, detectors, and explainers within a comprehensive AI-driven NIDS pipeline. Furthermore, the concept of an LLM-centered Controller is proposed, emphasizing its potential to coordinate intrusion detection workflows, optimizing tool collaboration and system performance. Finally, this paper identifies critical challenges and opportunities, aiming to foster innovation in developing reliable, adaptive, and explainable NIDS. By presenting the transformative potential of LLMs, this paper seeks to inspire advancement in next-generation network security systems.
Authors:Mohsen Ghasemizade, Juniper Lovato, Christopher M. Danforth, Peter Sheridan Dodds, Laura S. P. Bloomfield, Matthew Price, Team LEMURS, Joseph P. Near
Abstract:
Sharing health and behavioral data raises significant privacy concerns, as conventional de-identification methods are susceptible to privacy attacks. Differential Privacy (DP) provides formal guarantees against re-identification risks, but practical implementation necessitates balancing privacy protection and the utility of data.
We demonstrate the use of DP to protect individuals in a real behavioral health study, while making the data publicly available and retaining high utility for downstream users of the data. We use the Adaptive Iterative Mechanism (AIM) to generate DP synthetic data for Phase 1 of the Lived Experiences Measured Using Rings Study (LEMURS). The LEMURS dataset comprises physiological measurements from wearable devices (Oura rings) and self-reported survey data from first-year college students. We evaluate the synthetic datasets across a range of privacy budgets, epsilon = 1 to 100, focusing on the trade-off between privacy and utility.
We evaluate the utility of the synthetic data using a framework informed by actual uses of the LEMURS dataset. Our evaluation identifies the trade-off between privacy and utility across synthetic datasets generated with different privacy budgets. We find that synthetic data sets with epsilon = 5 preserve adequate predictive utility while significantly mitigating privacy risks. Our methodology establishes a reproducible framework for evaluating the practical impacts of epsilon on generating private synthetic datasets with numerous attributes and records, contributing to informed decision-making in data sharing practices.
Authors:Arun Ganesh, Brendan McMahan, Abhradeep Thakurta
Abstract:
The spherical noise added to gradients in differentially private (DP) training undermines the performance of adaptive optimizers like AdaGrad and Adam, and hence many recent works have proposed algorithms to address this challenge. However, the empirical results in these works focus on simple tasks and models and the conclusions may not generalize to model training in practice. In this paper we survey several of these variants, and develop better theoretical intuition for them as well as perform empirical studies comparing them. We find that a common intuition of aiming for unbiased estimates of second moments of gradients in adaptive optimizers is misguided, and instead that a simple technique called scale-then-privatize (which does not achieve unbiased second moments) has more desirable theoretical behaviors and outperforms all other variants we study on a small-scale language model training task. We additionally argue that scale-then-privatize causes the noise addition to better match the application of correlated noise mechanisms which are more desirable to use in practice.
Authors:Keke Tang, Ziyong Du, Weilong Peng, Xiaofei Wang, Peican Zhu, Ligang Liu, Zhihong Tian
Abstract:
Adversarial attacks on point clouds often impose strict geometric constraints to preserve plausibility; however, such constraints inherently limit transferability and undefendability. While deformation offers an alternative, existing unstructured approaches may introduce unnatural distortions, making adversarial point clouds conspicuous and undermining their plausibility. In this paper, we propose CageAttack, a cage-based deformation framework that produces natural adversarial point clouds. It first constructs a cage around the target object, providing a structured basis for smooth, natural-looking deformation. Perturbations are then applied to the cage vertices, which seamlessly propagate to the point cloud, ensuring that the resulting deformations remain intrinsic to the object and preserve plausibility. Extensive experiments on seven 3D deep neural network classifiers across three datasets show that CageAttack achieves a superior balance among transferability, undefendability, and plausibility, outperforming state-of-the-art methods. Codes will be made public upon acceptance.
Authors:Jiahao Liu, Bonan Ruan, Xianglin Yang, Zhiwei Lin, Yan Liu, Yang Wang, Tao Wei, Zhenkai Liang
Abstract:
LLM-based agents have demonstrated promising adaptability in real-world applications. However, these agents remain vulnerable to a wide range of attacks, such as tool poisoning and malicious instructions, that compromise their execution flow and can lead to serious consequences like data breaches and financial loss. Existing studies typically attempt to mitigate such anomalies by predefining specific rules and enforcing them at runtime to enhance safety. Yet, designing comprehensive rules is difficult, requiring extensive manual effort and still leaving gaps that result in false negatives. As agent systems evolve into complex software systems, we take inspiration from software system security and propose TraceAegis, a provenance-based analysis framework that leverages agent execution traces to detect potential anomalies. In particular, TraceAegis constructs a hierarchical structure to abstract stable execution units that characterize normal agent behaviors. These units are then summarized into constrained behavioral rules that specify the conditions necessary to complete a task. By validating execution traces against both hierarchical and behavioral constraints, TraceAegis is able to effectively detect abnormal behaviors. To evaluate the effectiveness of TraceAegis, we introduce TraceAegis-Bench, a dataset covering two representative scenarios: healthcare and corporate procurement. Each scenario includes 1,300 benign behaviors and 300 abnormal behaviors, where the anomalies either violate the agent's execution order or break the semantic consistency of its execution sequence. Experimental results demonstrate that TraceAegis achieves strong performance on TraceAegis-Bench, successfully identifying the majority of abnormal behaviors.
Authors:Mohamed Seif, Antti Koskela, H. Vincent Poor, Andrea J. Goldsmith
Abstract:
We study the problem of spectral graph clustering under edge differential privacy (DP). Specifically, we develop three mechanisms: (i) graph perturbation via randomized edge flipping combined with adjacency matrix shuffling, which enforces edge privacy while preserving key spectral properties of the graph. Importantly, shuffling considerably amplifies the guarantees: whereas flipping edges with a fixed probability alone provides only a constant epsilon edge DP guarantee as the number of nodes grows, the shuffled mechanism achieves (epsilon, delta) edge DP with parameters that tend to zero as the number of nodes increase; (ii) private graph projection with additive Gaussian noise in a lower-dimensional space to reduce dimensionality and computational complexity; and (iii) a noisy power iteration method that distributes Gaussian noise across iterations to ensure edge DP while maintaining convergence. Our analysis provides rigorous privacy guarantees and a precise characterization of the misclassification error rate. Experiments on synthetic and real-world networks validate our theoretical analysis and illustrate the practical privacy-utility trade-offs.
Authors:Jinseong Park, Yujin Choi, Jaewook Lee
Abstract:
With the increasing need to safeguard data privacy in machine learning models, differential privacy (DP) is one of the major frameworks to build privacy-preserving models. Support Vector Machines (SVMs) are widely used traditional machine learning models due to their robust margin guarantees and strong empirical performance in binary classification. However, applying DP to multi-class SVMs is inadequate, as the standard one-versus-rest (OvR) and one-versus-one (OvO) approaches repeatedly query each data sample when building multiple binary classifiers, thus consuming the privacy budget proportionally to the number of classes. To overcome this limitation, we explore all-in-one SVM approaches for DP, which access each data sample only once to construct multi-class SVM boundaries with margin maximization properties. We propose a novel differentially Private Multi-class SVM (PMSVM) with weight and gradient perturbation methods, providing rigorous sensitivity and convergence analyses to ensure DP in all-in-one SVMs. Empirical results demonstrate that our approach surpasses existing DP-SVM methods in multi-class scenarios.
Authors:Shoumik Saha, Jifan Chen, Sam Mayers, Sanjay Krishna Gouda, Zijian Wang, Varun Kumar
Abstract:
Code-capable large language model (LLM) agents are increasingly embedded into software engineering workflows where they can read, write, and execute code, raising the stakes of safety-bypass ("jailbreak") attacks beyond text-only settings. Prior evaluations emphasize refusal or harmful-text detection, leaving open whether agents actually compile and run malicious programs. We present JAWS-BENCH (Jailbreaks Across WorkSpaces), a benchmark spanning three escalating workspace regimes that mirror attacker capability: empty (JAWS-0), single-file (JAWS-1), and multi-file (JAWS-M). We pair this with a hierarchical, executable-aware Judge Framework that tests (i) compliance, (ii) attack success, (iii) syntactic correctness, and (iv) runtime executability, moving beyond refusal to measure deployable harm. Using seven LLMs from five families as backends, we find that under prompt-only conditions in JAWS-0, code agents accept 61% of attacks on average; 58% are harmful, 52% parse, and 27% run end-to-end. Moving to single-file regime in JAWS-1 drives compliance to ~ 100% for capable models and yields a mean ASR (Attack Success Rate) ~ 71%; the multi-file regime (JAWS-M) raises mean ASR to ~ 75%, with 32% instantly deployable attack code. Across models, wrapping an LLM in an agent substantially increases vulnerability -- ASR raises by 1.6x -- because initial refusals are frequently overturned during later planning/tool-use steps. Category-level analyses identify which attack classes are most vulnerable and most readily deployable, while others exhibit large execution gaps. These findings motivate execution-aware defenses, code-contextual safety filters, and mechanisms that preserve refusal decisions throughout the agent's multi-step reasoning and tool use.
Authors:Thomas Schuster, Fermi Ma, Alex Lombardi, Fernando Brandao, Hsin-Yuan Huang
Abstract:
Understanding how fast physical systems can resemble Haar-random unitaries is a fundamental question in physics. Many experiments of interest in quantum gravity and many-body physics, including the butterfly effect in quantum information scrambling and the Hayden-Preskill thought experiment, involve queries to a random unitary $U$ alongside its inverse $U^\dagger$, conjugate $U^*$, and transpose $U^T$. However, conventional notions of approximate unitary designs and pseudorandom unitaries (PRUs) fail to capture these experiments. In this work, we introduce and construct strong unitary designs and strong PRUs that remain robust under all such queries. Our constructions achieve the optimal circuit depth of $O(\log n)$ for systems of $n$ qubits. We further show that strong unitary designs can form in circuit depth $O(\log^2 n)$ in circuits composed of independent two-qubit Haar-random gates, and that strong PRUs can form in circuit depth $\text{poly}(\log n)$ in circuits with no ancilla qubits. Our results provide an operational proof of the fast scrambling conjecture from black hole physics: every observable feature of the fastest scrambling quantum systems reproduces Haar-random behavior at logarithmic times.
Authors:Weibo Zhao, Jiahao Liu, Bonan Ruan, Shaofei Li, Zhenkai Liang
Abstract:
Model Context Protocol (MCP) servers enable AI applications to connect to external systems in a plug-and-play manner, but their rapid proliferation also introduces severe security risks. Unlike mature software ecosystems with rigorous vetting, MCP servers still lack standardized review mechanisms, giving adversaries opportunities to distribute malicious implementations. Despite this pressing risk, the security implications of MCP servers remain underexplored. To address this gap, we present the first systematic study that treats MCP servers as active threat actors and decomposes them into core components to examine how adversarial developers can implant malicious intent. Specifically, we investigate three research questions: (i) what types of attacks malicious MCP servers can launch, (ii) how vulnerable MCP hosts and Large Language Models (LLMs) are to these attacks, and (iii) how feasible it is to carry out MCP server attacks in practice. Our study proposes a component-based taxonomy comprising twelve attack categories. For each category, we develop Proof-of-Concept (PoC) servers and demonstrate their effectiveness across diverse real-world host-LLM settings. We further show that attackers can generate large numbers of malicious servers at virtually no cost. We then test state-of-the-art scanners on the generated servers and found that existing detection approaches are insufficient. These findings highlight that malicious MCP servers are easy to implement, difficult to detect with current tools, and capable of causing concrete damage to AI agent systems. Addressing this threat requires coordinated efforts among protocol designers, host developers, LLM providers, and end users to build a more secure and resilient MCP ecosystem.
Authors:Zeyu Shen, Basileal Imana, Tong Wu, Chong Xiang, Prateek Mittal, Aleksandra Korolova
Abstract:
Retrieval-Augmented Generation (RAG) enhances Large Language Models by grounding their outputs in external documents. These systems, however, remain vulnerable to attacks on the retrieval corpus, such as prompt injection. RAG-based search systems (e.g., Google's Search AI Overview) present an interesting setting for studying and protecting against such threats, as defense algorithms can benefit from built-in reliability signals -- like document ranking -- and represent a non-LLM challenge for the adversary due to decades of work to thwart SEO. Motivated by, but not limited to, this scenario, this work introduces ReliabilityRAG, a framework for adversarial robustness that explicitly leverages reliability information of retrieved documents. Our first contribution adopts a graph-theoretic perspective to identify a "consistent majority" among retrieved documents to filter out malicious ones. We introduce a novel algorithm based on finding a Maximum Independent Set (MIS) on a document graph where edges encode contradiction. Our MIS variant explicitly prioritizes higher-reliability documents and provides provable robustness guarantees against bounded adversarial corruption under natural assumptions. Recognizing the computational cost of exact MIS for large retrieval sets, our second contribution is a scalable weighted sample and aggregate framework. It explicitly utilizes reliability information, preserving some robustness guarantees while efficiently handling many documents. We present empirical results showing ReliabilityRAG provides superior robustness against adversarial attacks compared to prior methods, maintains high benign accuracy, and excels in long-form generation tasks where prior robustness-focused methods struggled. Our work is a significant step towards more effective, provably robust defenses against retrieved corpus corruption in RAG.
Authors:Shaoyuan Xie, Mohamad Habib Fakih, Junchi Lu, Fayzah Alshammari, Ningfei Wang, Takami Sato, Halima Bouzidi, Mohammad Abdullah Al Faruque, Qi Alfred Chen
Abstract:
Autonomous Target Tracking (ATT) systems, especially ATT drones, are widely used in applications such as surveillance, border control, and law enforcement, while also being misused in stalking and destructive actions. Thus, the security of ATT is highly critical for real-world applications. Under the scope, we present a new type of attack: distance-pulling attacks (DPA) and a systematic study of it, which exploits vulnerabilities in ATT systems to dangerously reduce tracking distances, leading to drone capturing, increased susceptibility to sensor attacks, or even physical collisions. To achieve these goals, we present FlyTrap, a novel physical-world attack framework that employs an adversarial umbrella as a deployable and domain-specific attack vector. FlyTrap is specifically designed to meet key desired objectives in attacking ATT drones: physical deployability, closed-loop effectiveness, and spatial-temporal consistency. Through novel progressive distance-pulling strategy and controllable spatial-temporal consistency designs, FlyTrap manipulates ATT drones in real-world setups to achieve significant system-level impacts. Our evaluations include new datasets, metrics, and closed-loop experiments on real-world white-box and even commercial ATT drones, including DJI and HoverAir. Results demonstrate FlyTrap's ability to reduce tracking distances within the range to be captured, sensor attacked, or even directly crashed, highlighting urgent security risks and practical implications for the safe deployment of ATT systems.
Authors:Gejian Zhao, Hanzhou Wu, Xinpeng Zhang
Abstract:
Backdoor attacks pose a significant security threat to natural language processing (NLP) systems, but existing methods lack explainable trigger mechanisms and fail to quantitatively model vulnerability patterns. This work pioneers the quantitative connection between explainable artificial intelligence (XAI) and backdoor attacks, introducing Sensitron, a novel modular framework for crafting stealthy and robust backdoor triggers. Sensitron employs a progressive refinement approach where Dynamic Meta-Sensitivity Analysis (DMSA) first identifies potentially vulnerable input tokens, Hierarchical SHAP Estimation (H-SHAP) then provides explainable attribution to precisely pinpoint the most influential tokens, and finally a Plug-and-Rank mechanism that generates contextually appropriate triggers. We establish the first mathematical correlation (Sensitivity Ranking Correlation, SRC=0.83) between explainability scores and empirical attack success, enabling precise targeting of model vulnerabilities. Sensitron achieves 97.8% Attack Success Rate (ASR) (+5.8% over state-of-the-art (SOTA)) with 85.4% ASR at 0.1% poisoning rate, demonstrating robust resistance against multiple SOTA defenses. This work reveals fundamental NLP vulnerabilities and provides new attack vectors through weaponized explainability.
Authors:Baolei Zhang, Haoran Xin, Yuxi Chen, Zhuqing Liu, Biao Yi, Tong Li, Lihai Nie, Zheli Liu, Minghong Fang
Abstract:
Retrieval-Augmented Generation (RAG) integrates external knowledge into large language models to improve response quality. However, recent work has shown that RAG systems are highly vulnerable to poisoning attacks, where malicious texts are inserted into the knowledge database to influence model outputs. While several defenses have been proposed, they are often circumvented by more adaptive or sophisticated attacks. This paper presents RAGOrigin, a black-box responsibility attribution framework designed to identify which texts in the knowledge database are responsible for misleading or incorrect generations. Our method constructs a focused attribution scope tailored to each misgeneration event and assigns a responsibility score to each candidate text by evaluating its retrieval ranking, semantic relevance, and influence on the generated response. The system then isolates poisoned texts using an unsupervised clustering method. We evaluate RAGOrigin across seven datasets and fifteen poisoning attacks, including newly developed adaptive poisoning strategies and multi-attacker scenarios. Our approach outperforms existing baselines in identifying poisoned content and remains robust under dynamic and noisy conditions. These results suggest that RAGOrigin provides a practical and effective solution for tracing the origins of corrupted knowledge in RAG systems.
Authors:Siyuan Bao, Ying Shi, Zhiguang Yang, Hanzhou Wu, Xinpeng Zhang
Abstract:
Existing watermarking methods for large language models (LLMs) mainly embed watermark by adjusting the token sampling prediction or post-processing, lacking intrinsic coupling with LLMs, which may significantly reduce the semantic quality of the generated marked texts. Traditional watermarking methods based on training or fine-tuning may be extendable to LLMs. However, most of them are limited to the white-box scenario, or very time-consuming due to the massive parameters of LLMs. In this paper, we present a new watermarking framework for LLMs, where the watermark is embedded into the LLM by manipulating the internal parameters of the LLM, and can be extracted from the generated text without accessing the LLM. Comparing with related methods, the proposed method entangles the watermark with the intrinsic parameters of the LLM, which better balances the robustness and imperceptibility of the watermark. Moreover, the proposed method enables us to extract the watermark under the black-box scenario, which is computationally efficient for use. Experimental results have also verified the feasibility, superiority and practicality. This work provides a new perspective different from mainstream works, which may shed light on future research.
Authors:Sawera Shahid, Umara Noor, Zahid Rashid
Abstract:
Cyber attacks are rapidly increasing with the advancement of technology and there is no protection for our information. To prevent future cyberattacks it is critical to promptly recognize cyberattacks and establish strong defense mechanisms against them. To respond to cybersecurity threats immediately, it is essential to examine the attackers skills, knowledge, and behaviors with the goal of evaluating their impact on the system and comprehending the traits associated with these attacks. Creating a profile of cyber threat actors based on their traits or patterns of behavior can help to create effective defenses against cyberattacks in advance. In the current literature, multiple supervised machine learning based approaches considered a smaller number of features for attacker profiling that are reported in textual cyber threat incident documents although these profiles have been developed based on the security experts own perception, we cannot rely on them. Supervised machine learning approaches strictly depend upon the structure data set. This usually leads to a two step process where we first have to establish a structured data set before we can analyze it and then employ it to construct defense mechanisms, which takes time. In this paper, an unsupervised efficient agglomerative hierarchal clustering technique is proposed for profiling cybercriminal groups based on their comprehensive contextual threat information in order to address the aforementioned issues. The main objective of this report is to identify the relationship between cyber threat actors based on their common features, aggregate them, and also profile cyber criminal groups.
Authors:Fizza Khurshid, Umara Noor, Zahid Rashid
Abstract:
Innovative solutions to cyber security issues are shaped by the ever-changing landscape of cyber threats. Automating the mitigation of these threats can be achieved through a new methodology that addresses the domain of mitigation automation, which is often overlooked. This literature overview emphasizes the lack of scholarly work focusing specifically on automated cyber threat mitigation, particularly in addressing challenges beyond detection. The proposed methodology comprise of the development of an automatic cyber threat mitigation framework tailored for Distributed Denial-of-Service (DDoS) attacks. This framework adopts a multi-layer security approach, utilizing smart devices at the device layer, and leveraging fog network and cloud computing layers for deeper understanding and technological adaptability. Initially, firewall rule-based packet inspection is conducted on simulated attack traffic to filter out DoS packets, forwarding legitimate packets to the fog. The methodology emphasizes the integration of fog detection through statistical and behavioral analysis, specification-based detection, and deep packet inspection, resulting in a comprehensive cyber protection system. Furthermore, cloud-level inspection is performed to confirm and mitigate attacks using firewalls, enhancing strategic defense and increasing robustness against cyber threats. These enhancements enhance understanding of the research framework's practical implementation and assessment strategies, substantiating its importance in addressing current cyber security challenges and shaping future automation mitigation approaches.
Authors:Rimsha Kanwal, Umara Noor, Zafar Iqbal, Zahid Rashid
Abstract:
With the ever-changing landscape of cyber threats, identifying their origin has become paramount, surpassing the simple task of attack classification. Cyber threat attribution gives security analysts the insights they need to device effective threat mitigation strategies. Such strategies empower enterprises to proactively detect and defend against future cyber-attacks. However, existing approaches exhibit limitations in accurately identifying threat actors, leading to low precision and a significant occurrence of false positives. Machine learning offers the potential to automate certain aspects of cyber threat attribution. The distributed nature of information regarding cyber threat actors and their intricate attack methodologies has hindered substantial progress in this domain. Cybersecurity analysts deal with an ever-expanding collection of cyber threat intelligence documents. While these documents hold valuable insights, their sheer volume challenges efficient organization and retrieval of pertinent information. To assist the cybersecurity analyst activities, we propose a machine learning based approach featuring visually interactive analytics tool named the Cyber-Attack Pattern Explorer (CAPE), designed to facilitate efficient information discovery by employing interactive visualization and mining techniques. In the proposed system, a non-parametric mining technique is proposed to create a dataset for identifying the attack patterns within cyber threat intelligence documents. These attack patterns align semantically with commonly employed themes ensuring ease of interpretation. The extracted dataset is used for training of proposed machine learning algorithms that enables the attribution of cyber threats with respective to the actors.
Authors:Aditya Kulkarni, Tamal Das, Vivek Balachandran
Abstract:
The Domain Name System (DNS), which converts domain names to their respective IP addresses, has advanced enhancements aimed at safeguarding DNS data and users' identity from attackers. The recent privacy-focused advancements have enabled the IETF to standardize several protocols. Nevertheless, these protocols tend to focus on either strengthening user privacy (like Oblivious DNS and Oblivious DNS-over-HTTPS) or reducing resolution latency (as demonstrated by DNS-over-QUIC). Achieving both within a single protocol remains a key challenge, which we address in this paper. Our proposed protocol -- 'Oblivious DNS-over-QUIC' (ODoQ) -- leverages the benefits of the QUIC protocol and incorporates an intermediary proxy server to protect the client's identity from exposure to the recursive resolver.
Authors:Aditya Kulkarni, Shahil Manishbhai Patel, Shivam Pradip Tirmare, Vivek Balachandran, Tamal Das
Abstract:
To combat phishing attacks -- aimed at luring web users to divulge their sensitive information -- various phishing detection approaches have been proposed. As attackers focus on devising new tactics to bypass existing detection solutions, researchers have adapted by integrating machine learning and deep learning into phishing detection. Phishing dataset collection is vital to developing effective phishing detection approaches, which highly depend on the diversity of the gathered datasets. The lack of diversity in the dataset results in a biased model. Since phishing websites are often short-lived, collecting them is also a challenge. Consequently, very few phishing webpage dataset repositories exist to date. No single repository comprehensively consolidates all phishing elements corresponding to a phishing webpage, namely, URL, webpage source code, screenshot, and related webpage resources. This paper introduces a resource collection tool designed to gather various resources associated with a URL, such as CSS, Javascript, favicons, webpage images, and screenshots. Our tool leverages PhishTank as the primary source for obtaining active phishing URLs. Our tool fetches several additional webpage resources compared to PyWebCopy Python library, which provides webpage content for a given URL. Additionally, we share a sample dataset generated using our tool comprising 4,056 legitimate and 5,666 phishing URLs along with their associated resources. We also remark on the top correlated phishing features with their associated class label found in our dataset. Our tool offers a comprehensive resource set that can aid researchers in developing effective phishing detection approaches.
Authors:Honghui Xu, Shiva Shrestha, Wei Chen, Zhiyuan Li, Zhipeng Cai
Abstract:
As on-device large language model (LLM) systems become increasingly prevalent, federated fine-tuning enables advanced language understanding and generation directly on edge devices; however, it also involves processing sensitive, user-specific data, raising significant privacy concerns within the federated learning framework. To address these challenges, we propose DP-FedLoRA, a privacy-enhanced federated fine-tuning framework that integrates LoRA-based adaptation with differential privacy in a communication-efficient setting. Each client locally clips and perturbs its LoRA matrices using Gaussian noise to satisfy ($ε$, $δ$)-differential privacy. We further provide a theoretical analysis demonstrating the unbiased nature of the updates and deriving bounds on the variance introduced by noise, offering practical guidance for privacy-budget calibration. Experimental results across mainstream benchmarks show that DP-FedLoRA delivers competitive performance while offering strong privacy guarantees, paving the way for scalable and privacy-preserving LLM deployment in on-device environments.
Authors:Aditya Kulkarni, Vivek Balachandran, Tamal Das
Abstract:
In the realm of cybersecurity, phishing stands as a prevalent cyber attack, where attackers employ various tactics to deceive users into gathering their sensitive information, potentially leading to identity theft or financial gain. Researchers have been actively working on advancing phishing webpage detection approaches to detect new phishing URLs, bolstering user protection. Nonetheless, the ever-evolving strategies employed by attackers, aimed at circumventing existing detection approaches and tools, present an ongoing challenge to the research community. This survey presents a systematic categorization of diverse phishing webpage detection approaches, encompassing URL-based, webpage content-based, and visual techniques. Through a comprehensive review of these approaches and an in-depth analysis of existing literature, our study underscores current research gaps in phishing webpage detection. Furthermore, we suggest potential solutions to address some of these gaps, contributing valuable insights to the ongoing efforts to combat phishing attacks.
Authors:Duddu Hriday, Aditya Kulkarni, Vivek Balachandran, Tamal Das
Abstract:
Phishing attacks are increasingly prevalent, with adversaries creating deceptive webpages to steal sensitive information. Despite advancements in machine learning and deep learning for phishing detection, attackers constantly develop new tactics to bypass detection models. As a result, phishing webpages continue to reach users, particularly those unable to recognize phishing indicators. To improve detection accuracy, models must be trained on large datasets containing both phishing and legitimate webpages, including URLs, webpage content, screenshots, and logos. However, existing tools struggle to collect the required resources, especially given the short lifespan of phishing webpages, limiting dataset comprehensiveness. In response, we introduce Phish-Blitz, a tool that downloads phishing and legitimate webpages along with their associated resources, such as screenshots. Unlike existing tools, Phish-Blitz captures live webpage screenshots and updates resource file paths to maintain the original visual integrity of the webpage. We provide a dataset containing 8,809 legitimate and 5,000 phishing webpages, including all associated resources. Our dataset and tool are publicly available on GitHub, contributing to the research community by offering a more complete dataset for phishing detection.
Authors:Aduma Rishith, Aditya Kulkarni, Tamal Das, Vivek Balachandran
Abstract:
The Domain Name System (DNS) serves as the backbone of the Internet, primarily translating domain names to IP addresses. Over time, various enhancements have been introduced to strengthen the integrity of DNS. Among these, DNSSEC stands out as a leading cryptographic solution. It protects against attacks (such as DNS spoofing) by establishing a chain of trust throughout the DNS nameserver hierarchy. However, DNSSEC's effectiveness is compromised when there is a break in this chain, resulting in "Islands of Security", where domains can authenticate locally but not across hierarchical levels, leading to a loss of trust and validation between them. Leading approaches to addressing these issues were centralized, with a single authority maintaining some kind of bulletin board. This approach requires significantly more infrastructure and places excessive trust in the entity responsible for managing it properly. In this paper, we propose a decentralized approach to addressing gaps in DNSSEC's chain of trust, commonly referred to as "Islands of Security". We leverage TLS and IP-based certificates to enable end-to-end authentication between hierarchical levels, eliminating the need for uniform DNSSEC deployment across every level of the DNS hierarchy. This approach enhances the overall integrity of DNSSEC, while reducing dependence on registrars for maintaining signature records to verify the child nameserver's authenticity. By offering a more flexible and efficient solution, our method strengthens DNS security and streamlines deployment across diverse environments.
Authors:Abdul Rehman, Are Dæhlen, Ilona Heldal, Jerry Chun-wei Lin
Abstract:
Eye-tracking technology can aid in understanding neurodevelopmental disorders and tracing a person's identity. However, this technology poses a significant risk to privacy, as it captures sensitive information about individuals and increases the likelihood that data can be traced back to them. This paper proposes a human-centered framework designed to prevent identity backtracking while preserving the pedagogical benefits of AI-powered eye tracking in interactive learning environments. We explore how real-time data anonymization, ethical design principles, and regulatory compliance (such as GDPR) can be integrated to build trust and transparency. We first demonstrate the potential for backtracking student IDs and diagnoses in various scenarios using serious game-based eye-tracking data. We then provide a two-stage privacy-preserving framework that prevents participants from being tracked while still enabling diagnostic classification. The first phase covers four scenarios: I) Predicting disorder diagnoses based on different game levels. II) Predicting student IDs based on different game levels. III) Predicting student IDs based on randomized data. IV) Utilizing K-Means for out-of-sample data. In the second phase, we present a two-stage framework that preserves privacy. We also employ Federated Learning (FL) across multiple clients, incorporating a secure identity management system with dummy IDs and administrator-only access controls. In the first phase, the proposed framework achieved 99.3% accuracy for scenario 1, 63% accuracy for scenario 2, and 99.7% accuracy for scenario 3, successfully identifying and assigning a new student ID in scenario 4. In phase 2, we effectively prevented backtracking and established a secure identity management system with dummy IDs and administrator-only access controls, achieving an overall accuracy of 99.40%.
Authors:Honghui Xu, Kaiyang Li, Wei Chen, Danyang Zheng, Zhiyuan Li, Zhipeng Cai
Abstract:
Mobile Large Language Models (LLMs) are revolutionizing diverse fields such as healthcare, finance, and education with their ability to perform advanced natural language processing tasks on-the-go. However, the deployment of these models in mobile and edge environments introduces significant challenges related to privacy and security due to their resource-intensive nature and the sensitivity of the data they process. This survey provides a comprehensive overview of privacy and security issues associated with mobile LLMs, systematically categorizing existing solutions such as differential privacy, federated learning, and prompt encryption. Furthermore, we analyze vulnerabilities unique to mobile LLMs, including adversarial attacks, membership inference, and side-channel attacks, offering an in-depth comparison of their effectiveness and limitations. Despite recent advancements, mobile LLMs face unique hurdles in achieving robust security while maintaining efficiency in resource-constrained environments. To bridge this gap, we propose potential applications, discuss open challenges, and suggest future research directions, paving the way for the development of trustworthy, privacy-compliant, and scalable mobile LLM systems.
Authors:Guofu Liao, Taotao Wang, Shengli Zhang, Jiqun Zhang, Shi Long, Dacheng Tao
Abstract:
Fine-tuning large language models (LLMs) is crucial for adapting them to specific tasks, yet it remains computationally demanding and raises concerns about correctness and privacy, particularly in untrusted environments. Although parameter-efficient methods like Low-Rank Adaptation (LoRA) significantly reduce resource requirements, ensuring the security and verifiability of fine-tuning under zero-knowledge constraints remains an unresolved challenge. To address this, we introduce zkLoRA, the first framework to integrate LoRA fine-tuning with zero-knowledge proofs (ZKPs), achieving provable security and correctness. zkLoRA employs advanced cryptographic techniques -- such as lookup arguments, sumcheck protocols, and polynomial commitments -- to verify both arithmetic and non-arithmetic operations in Transformer-based architectures. The framework provides end-to-end verifiability for forward propagation, backward propagation, and parameter updates during LoRA fine-tuning, while safeguarding the privacy of model parameters and training data. Leveraging GPU-based implementations, zkLoRA demonstrates practicality and efficiency through experimental validation on open-source LLMs like LLaMA, scaling up to 13 billion parameters. By combining parameter-efficient fine-tuning with ZKPs, zkLoRA bridges a critical gap, enabling secure and trustworthy deployment of LLMs in sensitive or untrusted environments.
Authors:Matteo Gioele Collu, Umberto Salviati, Roberto Confalonieri, Mauro Conti, Giovanni Apruzzese
Abstract:
Large Language Models (LLMs) are increasingly being integrated into the scientific peer-review process, raising new questions about their reliability and resilience to manipulation. In this work, we investigate the potential for hidden prompt injection attacks, where authors embed adversarial text within a paper's PDF to influence the LLM-generated review. We begin by formalising three distinct threat models that envision attackers with different motivations -- not all of which implying malicious intent. For each threat model, we design adversarial prompts that remain invisible to human readers yet can steer an LLM's output toward the author's desired outcome. Using a user study with domain scholars, we derive four representative reviewing prompts used to elicit peer reviews from LLMs. We then evaluate the robustness of our adversarial prompts across (i) different reviewing prompts, (ii) different commercial LLM-based systems, and (iii) different peer-reviewed papers. Our results show that adversarial prompts can reliably mislead the LLM, sometimes in ways that adversely affect a "honest-but-lazy" reviewer. Finally, we propose and empirically assess methods to reduce detectability of adversarial prompts under automated content checks.
Authors:Jefferson David Rodriguez Chivata, Davide Ghiani, Simone Maurizio La Cava, Marco Micheletto, Giulia Orrù, Federico Lama, Gian Luca Marcialis
Abstract:
ICAO-compliant facial images, initially designed for secure biometric passports, are increasingly becoming central to identity verification in a wide range of application contexts, including border control, digital travel credentials, and financial services. While their standardization enables global interoperability, it also facilitates practices such as morphing and deepfakes, which can be exploited for harmful purposes like identity theft and illegal sharing of identity documents. Traditional countermeasures like Presentation Attack Detection (PAD) are limited to real-time capture and offer no post-capture protection. This survey paper investigates digital watermarking and steganography as complementary solutions that embed tamper-evident signals directly into the image, enabling persistent verification without compromising ICAO compliance. We provide the first comprehensive analysis of state-of-the-art techniques to evaluate the potential and drawbacks of the underlying approaches concerning the applications involving ICAO-compliant images and their suitability under standard constraints. We highlight key trade-offs, offering guidance for secure deployment in real-world identity systems.
Authors:Terry Yue Zhuo, Dingmin Wang, Hantian Ding, Varun Kumar, Zijian Wang
Abstract:
Large language models (LLMs) have demonstrated exceptional capabilities when trained within executable runtime environments, notably excelling at software engineering tasks through verified feedback loops. Yet, scalable and generalizable execution-grounded environments remain scarce, limiting progress in training more capable ML agents. We introduce CTF-Dojo, the first large-scale executable runtime tailored for training LLMs with verifiable feedback, featuring 658 fully functional Capture-The-Flag (CTF)-style challenges containerized in Docker with guaranteed reproducibility. To enable rapid scaling without manual intervention, we develop CTF-Forge, an automated pipeline that transforms publicly available artifacts into ready-to-use execution environments in minutes, eliminating weeks of expert configuration traditionally required. We trained LLM-based agents on just 486 high-quality, execution-verified trajectories from CTF-Dojo, achieving up to 11.6% absolute gains over strong baselines across three competitive benchmarks: InterCode-CTF, NYU CTF Bench, and Cybench. Our best-performing 32B model reaches 31.9% Pass@1, establishing a new open-weight state-of-the-art that rivals frontier models like DeepSeek-V3-0324 and Gemini-2.5-Flash. By framing CTF-style tasks as a benchmark for executable-agent learning, CTF-Dojo demonstrates that execution-grounded training signals are not only effective but pivotal in advancing high-performance ML agents without dependence on costly proprietary systems.
Authors:Yuanda Wang, Bocheng Chen, Hanqing Guo, Guangjing Wang, Weikang Ding, Qiben Yan
Abstract:
Voice deepfake attacks, which artificially impersonate human speech for malicious purposes, have emerged as a severe threat. Existing defenses typically inject noise into human speech to compromise voice encoders in speech synthesis models. However, these methods degrade audio quality and require prior knowledge of the attack approaches, limiting their effectiveness in diverse scenarios. Moreover, real-time audios, such as speech in virtual meetings and voice messages, are still exposed to voice deepfake threats. To overcome these limitations, we propose ClearMask, a noise-free defense mechanism against voice deepfake attacks. Unlike traditional approaches, ClearMask modifies the audio mel-spectrogram by selectively filtering certain frequencies, inducing a transferable voice feature loss without injecting noise. We then apply audio style transfer to further deceive voice decoders while preserving perceived sound quality. Finally, optimized reverberation is introduced to disrupt the output of voice generation models without affecting the naturalness of the speech. Additionally, we develop LiveMask to protect streaming speech in real-time through a universal frequency filter and reverberation generator. Our experimental results show that ClearMask and LiveMask effectively prevent voice deepfake attacks from deceiving speaker verification models and human listeners, even for unseen voice synthesis models and black-box API services. Furthermore, ClearMask demonstrates resilience against adaptive attackers who attempt to recover the original audio signal from the protected speech samples.
Authors:Youssef Maklad, Fares Wael, Ali Hamdi, Wael Elsersy, Khaled Shaban
Abstract:
Traditional protocol fuzzing techniques, such as those employed by AFL-based systems, often lack effectiveness due to a limited semantic understanding of complex protocol grammars and rigid seed mutation strategies. Recent works, such as ChatAFL, have integrated Large Language Models (LLMs) to guide protocol fuzzing and address these limitations, pushing protocol fuzzers to wider exploration of the protocol state space. But ChatAFL still faces issues like unreliable output, LLM hallucinations, and assumptions of LLM knowledge about protocol specifications. This paper introduces MultiFuzz, a novel dense retrieval-based multi-agent system designed to overcome these limitations by integrating semantic-aware context retrieval, specialized agents, and structured tool-assisted reasoning. MultiFuzz utilizes agentic chunks of protocol documentation (RFC Documents) to build embeddings in a vector database for a retrieval-augmented generation (RAG) pipeline, enabling agents to generate more reliable and structured outputs, enhancing the fuzzer in mutating protocol messages with enhanced state coverage and adherence to syntactic constraints. The framework decomposes the fuzzing process into modular groups of agents that collaborate through chain-of-thought reasoning to dynamically adapt fuzzing strategies based on the retrieved contextual knowledge. Experimental evaluations on the Real-Time Streaming Protocol (RTSP) demonstrate that MultiFuzz significantly improves branch coverage and explores deeper protocol states and transitions over state-of-the-art (SOTA) fuzzers such as NSFuzz, AFLNet, and ChatAFL. By combining dense retrieval, agentic coordination, and language model reasoning, MultiFuzz establishes a new paradigm in autonomous protocol fuzzing, offering a scalable and extensible foundation for future research in intelligent agentic-based fuzzing systems.
Authors:Jianhao Chen, Mayi Xu, Xiaohu Li, Yongqi Li, Xiangyu Zhang, Jianjie Huang, Tieyun Qian
Abstract:
Large Reasoning Models (LRMs) have demonstrated impressive performance across various tasks due to their powerful reasoning capabilities. However, their safety performance remains a significant concern. In this paper, we explore the reasons behind the vulnerability of LRMs. Based on this, we propose a novel method to improve the safety of LLMs without sacrificing their reasoning capability. Specifically, we exploit the competition between LRM's reasoning ability and safety ability, and achieve jailbreak by improving LRM's reasoning performance to reduce its safety performance. We then introduce an alignment strategy based on Fuzzification to balance Safety-Reasoning (FuSaR), by detoxifying the harmful reasoning process, where both the dangerous entities and the dangerous procedures in the reasoning steps are hidden. FuSaR successfully mitigates safety risks while preserving core reasoning information. We validate this strategy through alignment experiments on several open-source LRMs using detoxified reasoning data. The results compared with existing baselines conclusively show that FuSaR is an efficient alignment strategy to simultaneously enhance both the reasoning capability and safety of LRMs.
Authors:Shuang Liang, Zhihao Xu, Jialing Tao, Hui Xue, Xiting Wang
Abstract:
Despite extensive alignment efforts, Large Vision-Language Models (LVLMs) remain vulnerable to jailbreak attacks, posing serious safety risks. Although recent detection works have shifted to internal representations due to their rich cross-modal information, most methods rely on heuristic rules rather than principled objectives, resulting in suboptimal performance. To address these limitations, we propose Learning to Detect (LoD), a novel unsupervised framework that formulates jailbreak detection as anomaly detection. LoD introduces two key components: Multi-modal Safety Concept Activation Vectors (MSCAV), which capture layer-wise safety-related representations across modalities, and the Safety Pattern Auto-Encoder, which models the distribution of MSCAV derived from safe inputs and detects anomalies via reconstruction errors. By training the auto-encoder (AE) solely on safe samples without attack labels, LoD naturally identifies jailbreak inputs as distributional anomalies, enabling accurate and unified detection of jailbreak attacks. Comprehensive experiments on three different LVLMs and five benchmarks demonstrate that LoD achieves state-of-the-art performance, with an average AUROC of 0.9951 and an improvement of up to 38.89% in the minimum AUROC over the strongest baselines.
Authors:Kiana Kiashemshaki, Elvis Nnaemeka Chukwuani, Mohammad Jalili Torkamani, Negin Mahmoudi
Abstract:
Blockchain technology offers a promising foundation for modernizing E-Voting systems by enhancing transparency, decentralization, and security. Yet, real-world adoption remains limited due to persistent challenges such as scalability constraints, high computational demands, and complex privacy requirements. This paper presents a comparative framework for analyzing blockchain-based E-Voting architectures, consensus mechanisms, and cryptographic protocols. We examine the limitations of prevalent models like Proof of Work, Proof of Stake, and Delegated Proof of Stake, and propose optimization strategies that include hybrid consensus, lightweight cryptography, and decentralized identity management. Additionally, we explore the novel role of Large Language Models (LLMs) in smart contract generation, anomaly detection, and user interaction. Our findings offer a foundation for designing secure, scalable, and intelligent blockchain-based E-Voting systems suitable for national-scale deployment. This work lays the groundwork for building an end-to-end blockchain E-Voting prototype enhanced by LLM-guided smart contract generation and validation, supported by a systematic framework and simulation-based analysis.
Authors:Daniele Proverbio, Alessio Buscemi, Alessandro Di Stefano, The Anh Han, German Castignani, Pietro Liò
Abstract:
Game theory has long served as a foundational tool in cybersecurity to test, predict, and design strategic interactions between attackers and defenders. The recent advent of Large Language Models (LLMs) offers new tools and challenges for the security of computer systems; In this work, we investigate whether classical game-theoretic frameworks can effectively capture the behaviours of LLM-driven actors and bots. Using a reproducible framework for game-theoretic LLM agents, we investigate two canonical scenarios -- the one-shot zero-sum game and the dynamic Prisoner's Dilemma -- and we test whether LLMs converge to expected outcomes or exhibit deviations due to embedded biases. Our experiments involve four state-of-the-art LLMs and span five natural languages, English, French, Arabic, Vietnamese, and Mandarin Chinese, to assess linguistic sensitivity. For both games, we observe that the final payoffs are influenced by agents characteristics such as personality traits or knowledge of repeated rounds. Moreover, we uncover an unexpected sensitivity of the final payoffs to the choice of languages, which should warn against indiscriminate application of LLMs in cybersecurity applications and call for in-depth studies, as LLMs may behave differently when deployed in different countries. We also employ quantitative metrics to evaluate the internal consistency and cross-language stability of LLM agents, to help guide the selection of the most stable LLMs and optimising models for secure applications.
Authors:Jiahao Xu, Rui Hu, Zikai Zhang
Abstract:
The growing deployment of Large Language Models (LLMs) in real-world applications has raised concerns about their potential misuse in generating harmful or deceptive content. To address this issue, watermarking techniques have emerged as a promising solution by embedding identifiable binary messages into generated text for origin verification and misuse tracing. While recent efforts have explored multi-bit watermarking schemes capable of embedding rich information such as user identifiers, they typically suffer from the fundamental trade-off between text quality and decoding accuracy: to ensure reliable message decoding, they have to restrict the size of preferred token sets during encoding, yet such restrictions reduce the quality of the generated content. In this work, we propose MajorMark, a novel watermarking method that improves this trade-off through majority bit-aware encoding. MajorMark selects preferred token sets based on the majority bit of the message, enabling a larger and more flexible sampling of tokens. In contrast to prior methods that rely on token frequency analysis for decoding, MajorMark employs a clustering-based decoding strategy, which maintains high decoding accuracy even when the preferred token set is large, thus preserving both content quality and decoding accuracy. We further introduce MajorMark$^+$, which partitions the message into multiple blocks to independently encode and deterministically decode each block, thereby further enhancing the quality of watermarked text and improving decoding accuracy. Extensive experiments on state-of-the-art LLMs demonstrate that our methods significantly enhance both decoding accuracy and text generation quality, outperforming prior multi-bit watermarking baselines.
Authors:Shreya Meel, Mohamed Nomeir, Pasan Dissanayake, Sanghamitra Dutta, Sennur Ulukus
Abstract:
Transparency and explainability are two important aspects to be considered when employing black-box machine learning models in high-stake applications. Providing counterfactual explanations is one way of catering this requirement. However, this also poses a threat to the privacy of the institution that is providing the explanation, as well as the user who is requesting it. In this work, we are primarily concerned with the user's privacy who wants to retrieve a counterfactual instance, without revealing their feature vector to the institution. Our framework retrieves the exact nearest neighbor counterfactual explanation from a database of accepted points while achieving perfect, information-theoretic, privacy for the user. First, we introduce the problem of private counterfactual retrieval (PCR) and propose a baseline PCR scheme that keeps the user's feature vector information-theoretically private from the institution. Building on this, we propose two other schemes that reduce the amount of information leaked about the institution database to the user, compared to the baseline scheme. Second, we relax the assumption of mutability of all features, and consider the setting of immutable PCR (I-PCR). Here, the user retrieves the nearest counterfactual without altering a private subset of their features, which constitutes the immutable set, while keeping their feature vector and immutable set private from the institution. For this, we propose two schemes that preserve the user's privacy information-theoretically, but ensure varying degrees of database privacy. Third, we extend our PCR and I-PCR schemes to incorporate user's preference on transforming their attributes, so that a more actionable explanation can be received. Finally, we present numerical results to support our theoretical findings, and compare the database leakage of the proposed schemes.
Authors:Faruk Alpay, Taylan Alpay, Bugra Kilictas
Abstract:
We study the inverse problem of reconstructing high-dimensional trust embeddings from the one-dimensional Siamese trust scores that many distributed-security frameworks expose. Starting from two independent agents that publish time-stamped similarity scores for the same set of devices, we formalise the estimation task, derive an explicit direct-sum estimator that concatenates paired score series with four moment features, and prove that the resulting reconstruction map admits a unique fixed point under a contraction argument rooted in Banach theory. A suite of synthetic benchmarks (20 devices x 10 time steps) confirms that, even in the presence of Gaussian noise, the recovered embeddings preserve inter-device geometry as measured by Euclidean and cosine metrics; we complement these experiments with non-asymptotic error bounds that link reconstruction accuracy to score-sequence length. Beyond methodology, the paper demonstrates a practical privacy risk: publishing granular trust scores can leak latent behavioural information about both devices and evaluation models. We therefore discuss counter-measures -- score quantisation, calibrated noise, obfuscated embedding spaces -- and situate them within wider debates on transparency versus confidentiality in networked AI systems. All datasets, reproduction scripts and extended proofs accompany the submission so that results can be verified without proprietary code.
Authors:Yuntao Du, Jiacheng Li, Yuetian Chen, Kaiyuan Zhang, Zhizhen Yuan, Hanshen Xiao, Bruno Ribeiro, Ninghui Li
Abstract:
A Membership Inference Attack (MIA) assesses how much a trained machine learning model reveals about its training data by determining whether specific query instances were included in the dataset. We classify existing MIAs into adaptive or non-adaptive, depending on whether the adversary is allowed to train shadow models on membership queries. In the adaptive setting, where the adversary can train shadow models after accessing query instances, we highlight the importance of exploiting membership dependencies between instances and propose an attack-agnostic framework called Cascading Membership Inference Attack (CMIA), which incorporates membership dependencies via conditional shadow training to boost membership inference performance. In the non-adaptive setting, where the adversary is restricted to training shadow models before obtaining membership queries, we introduce Proxy Membership Inference Attack (PMIA). PMIA employs a proxy selection strategy that identifies samples with similar behaviors to the query instance and uses their behaviors in shadow models to perform a membership posterior odds test for membership inference. We provide theoretical analyses for both attacks, and extensive experimental results demonstrate that CMIA and PMIA substantially outperform existing MIAs in both settings, particularly in the low false-positive regime, which is crucial for evaluating privacy risks.
Authors:Weihao Chen, Yansong Gao, Boyu Kuang, Jin B. Hong, Yuqing Zhang, Anmin Fu
Abstract:
Firmware updates remain the primary line of defense for IoT devices; however, the update channel itself has become a well-established attack vector. Existing defenses mainly focus on securing monolithic firmware images, leaving module-level customization -a growing user demand-largely unprotected and insufficiently explored. To address this gap, we conduct a pilot study on the update workflows of 200 Linux-based IoT devices across 23 vendors, uncovering five previously undocumented vulnerabilities caused by customization practices. A broader analysis of update-related CVEs from 2020 to 2024 reveals that over half originate from customization-induced issues. These findings highlight a critical yet underexamined reality: as customization increases, so does the attack surface, while current defenses fail to keep pace. We propose IMUP (Integrity-Centric Modular Update Platform), the first framework to address two key challenges: constructing a trustworthy cross-module integrity chain and scaling update performance under mass customization. IMUP combines three techniques: per-module chameleon hashing for integrity, server-side proof-of-work offloading to reduce device overhead, and server-side caching to reuse module combinations, minimizing rebuild costs. Security analysis shows that even when 95 percent of secret keys are exposed, forging a valid image incurs over 300 times the cost of the legitimate server. Experiments on heterogeneous IoT devices demonstrate that IMUP reduces server-side generation time by 2.9 times and device downtime by 5.9 times compared to a package-manager baseline.
Authors:Na Li, Yansong Gao, Hongsheng Hu, Boyu Kuang, Anmin Fu
Abstract:
Model compression is crucial for minimizing memory storage and accelerating inference in deep learning (DL) models, including recent foundation models like large language models (LLMs). Users can access different compressed model versions according to their resources and budget. However, while existing compression operations primarily focus on optimizing the trade-off between resource efficiency and model performance, the privacy risks introduced by compression remain overlooked and insufficiently understood.
In this work, through the lens of membership inference attack (MIA), we propose CompLeak, the first privacy risk evaluation framework examining three widely used compression configurations that are pruning, quantization, and weight clustering supported by the commercial model compression framework of Google's TensorFlow-Lite (TF-Lite) and Facebook's PyTorch Mobile. CompLeak has three variants, given available access to the number of compressed models and original model. CompLeakNR starts by adopting existing MIA methods to attack a single compressed model, and identifies that different compressed models influence members and non-members differently. When the original model and one compressed model are available, CompLeakSR leverages the compressed model as a reference to the original model and uncovers more privacy by combining meta information (e.g., confidence vector) from both models. When multiple compressed models are available with/without accessing the original model, CompLeakMR innovatively exploits privacy leakage info from multiple compressed versions to substantially signify the overall privacy leakage. We conduct extensive experiments on seven diverse model architectures (from ResNet to foundation models of BERT and GPT-2), and six image and textual benchmark datasets.
Authors:Mohamed Nomeir, Alptug Aytekin, Sennur Ulukus
Abstract:
We study the problem of semantic private information retrieval (Sem-PIR) with $T$ colluding servers (Sem-TPIR), i.e., servers that collectively share user queries. In Sem-TPIR, the message sizes are different, and message retrieval probabilities by any user are not uniform. This is a generalization of the classical PIR problem where the message sizes are equal and message retrieval probabilities are identical. The earlier work on Sem-PIR considered the case of no collusions, i.e., the collusion parameter of $T=1$. In this paper, we consider the general problem for arbitrary $T < N$. We find an upper bound on the retrieval rate and design a scheme that achieves this rate, i.e., we derive the exact capacity of Sem-TPIR.
Authors:Shreya Meel, Sennur Ulukus
Abstract:
We consider the problem of decentralized constrained optimization with multiple agents $E_1,\ldots,E_N$ who jointly wish to learn the optimal solution set while keeping their feasible sets $\mathcal{P}_1,\ldots,\mathcal{P}_N$ private from each other. We assume that the objective function $f$ is known to all agents and each feasible set is a collection of points from a universal alphabet $\mathcal{P}_{alph}$. A designated agent (leader) starts the communication with the remaining (non-leader) agents, and is the first to retrieve the solution set. The leader searches for the solution by sending queries to and receiving answers from the non-leaders, such that the information on the individual feasible sets revealed to the leader should be no more than nominal, i.e., what is revealed from learning the solution set alone. We develop achievable schemes for obtaining the solution set at nominal information leakage, and characterize their communication costs under two communication setups between agents. In this work, we focus on two kinds of network setups: i) ring, where each agent communicates with two adjacent agents, and ii) star, where only the leader communicates with the remaining agents. We show that, if the leader first learns the joint feasible set through an existing private set intersection (PSI) protocol and then deduces the solution set, the information leaked to the leader is greater than nominal. Moreover, we draw connection of our schemes to threshold PSI (ThPSI), which is a PSI-variant where the intersection is revealed only when its cardinality is larger than a threshold value. Finally, for various realizations of $f$ mapped uniformly at random to a fixed range of values, our schemes are more communication-efficient with a high probability compared to retrieving the entire feasible set through PSI.
Authors:Prabhanjan Ananth, John Bostanci, Aditya Gulati, Yao-Ting Lin
Abstract:
Gluing theorem for random unitaries [Schuster, Haferkamp, Huang, QIP 2025] have found numerous applications, including designing low depth random unitaries [Schuster, Haferkamp, Huang, QIP 2025], random unitaries in ${\sf QAC0}$ [Foxman, Parham, Vasconcelos, Yuen'25] and generically shortening the key length of pseudorandom unitaries [Ananth, Bostanci, Gulati, Lin EUROCRYPT'25]. We present an alternate method of combining Haar random unitaries from the gluing lemma from [Schuster, Haferkamp, Huang, QIP 2025] that is secure against adversaries with inverse query access to the joined unitary. As a consequence, we show for the first time that strong pseudorandom unitaries can generically have their length extended, and can be constructed using only $O(n^{1/c})$ bits of randomness, for any constant $c$, if any family of strong pseudorandom unitaries exists.
Authors:Tarikul Islam Tamiti, Biraj Joshi, Rida Hasan, Anomadarshi Barua
Abstract:
Pressure sensors are widely integrated into modern Heating, Ventilation and Air Conditioning (HVAC) systems. As they are sensitive to acoustic pressure, they can be a source of eavesdropping. This paper introduces HVAC-EAR, which reconstructs intelligible speech from low-resolution, noisy pressure data with two key contributions: (i) We achieve intelligible reconstruction from as low as 0.5 kHz sampling rate, surpassing prior work limited to hot word detection, by employing a complex-valued conformer with a Complex Unified Attention Block to capture phoneme dependencies; (ii) HVAC-EAR mitigates transient HVAC noise by reconstructing both magnitude and phase of missing frequencies. For the first time, evaluations on real-world HVAC deployments show significant intelligibility, raising novel privacy concerns.
Authors:Jing-Jing Li, Jianfeng He, Chao Shang, Devang Kulshreshtha, Xun Xian, Yi Zhang, Hang Su, Sandesh Swamy, Yanjun Qi
Abstract:
As LLMs advance into autonomous agents with tool-use capabilities, they introduce security challenges that extend beyond traditional content-based LLM safety concerns. This paper introduces Sequential Tool Attack Chaining (STAC), a novel multi-turn attack framework that exploits agent tool use. STAC chains together tool calls that each appear harmless in isolation but, when combined, collectively enable harmful operations that only become apparent at the final execution step. We apply our framework to automatically generate and systematically evaluate 483 STAC cases, featuring 1,352 sets of user-agent-environment interactions and spanning diverse domains, tasks, agent types, and 10 failure modes. Our evaluations show that state-of-the-art LLM agents, including GPT-4.1, are highly vulnerable to STAC, with attack success rates (ASR) exceeding 90% in most cases. The core design of STAC's automated framework is a closed-loop pipeline that synthesizes executable multi-step tool chains, validates them through in-environment execution, and reverse-engineers stealthy multi-turn prompts that reliably induce agents to execute the verified malicious sequence. We further perform defense analysis against STAC and find that existing prompt-based defenses provide limited protection. To address this gap, we propose a new reasoning-driven defense prompt that achieves far stronger protection, cutting ASR by up to 28.8%. These results highlight a crucial gap: defending tool-enabled agents requires reasoning over entire action sequences and their cumulative effects, rather than evaluating isolated prompts or responses.
Authors:Prabhanjan Ananth, Aditya Gulati, Yao-Ting Lin
Abstract:
Pseudorandom unitaries (PRUs), one of the key quantum pseudorandom notions, are efficiently computable unitaries that are computationally indistinguishable from Haar random unitaries. While there is evidence to believe that PRUs are weaker than one-way functions, so far its relationship with other quantum cryptographic primitives (that are plausibly weaker than one-way functions) has not been fully established. In this work, we focus on quantum cryptographic primitives with classical communication, referred to as QCCC primitives. Our main result shows that QCCC bit commitments and QCCC key agreement, cannot be constructed from pseudorandom unitaries in a black-box manner. Our core technical contribution is to show (in a variety of settings) the difficulty of distinguishing identical versus independent Haar unitaries by separable channels. Our result strictly improves upon prior works which studied similar problems in the context of learning theory [Anshu, Landau, Liu, STOC 2022] and cryptography [Ananth, Gulati, Lin, TCC 2024].
Authors:Prabhanjan Ananth, John Bostanci, Aditya Gulati, Yao-Ting Lin
Abstract:
The quantum Haar random oracle model is an idealized model where every party has access to a single Haar random unitary and its inverse. We construct strong pseudorandom unitaries in the quantum Haar random oracle model. This strictly improves upon prior works who either only prove the existence of pseudorandom unitaries in the inverseless quantum Haar random oracle model [Ananth, Bostanci, Gulati, Lin, EUROCRYPT 2025] or prove the existence of a weaker notion (implied by strong pseudorandom unitaries) in the quantum Haar random oracle model [Hhan, Yamada, 2024]. Our results also present a viable approach for building quantum pseudorandomness from random quantum circuits and analyzing pseudorandom objects in nature.
Authors:Zhaoqi Wang, Daqing He, Zijian Zhang, Xin Li, Liehuang Zhu, Meng Li, Jiamou Liu
Abstract:
Large language models (LLMs) have demonstrated remarkable capabilities, yet they also introduce novel security challenges. For instance, prompt jailbreaking attacks involve adversaries crafting sophisticated prompts to elicit responses from LLMs that deviate from human values. To uncover vulnerabilities in LLM alignment methods, we propose the PASS framework (\underline{P}rompt J\underline{a}ilbreaking via \underline{S}emantic and \underline{S}tructural Formalization). Specifically, PASS employs reinforcement learning to transform initial jailbreak prompts into formalized descriptions, which enhances stealthiness and enables bypassing existing alignment defenses. The jailbreak outputs are then structured into a GraphRAG system that, by leveraging extracted relevant terms and formalized symbols as contextual input alongside the original query, strengthens subsequent attacks and facilitates more effective jailbreaks. We conducted extensive experiments on common open-source models, demonstrating the effectiveness of our attack.
Authors:Marc Damie, Florian Hahn, Andreas Peter, Jan Ramon
Abstract:
Distributed Point Functions (DPFs) enable sharing secret point functions across multiple parties, supporting privacy-preserving technologies such as Private Information Retrieval, and anonymous communications. While 2-party PRG-based schemes with logarithmic key sizes have been known for a decade, extending these solutions to multi-party settings has proven challenging. In particular, PRG-based multi-party DPFs have historically struggled with practicality due to key sizes growing exponentially with the number of parties and the field size. Our work addresses this efficiency bottleneck by optimizing the PRG-based multi-party DPF scheme of Boyle et al. (EUROCRYPT'15). By leveraging the honest-majority assumption, we eliminate the exponential factor present in this scheme. Our construction is the first PRG-based multi-party DPF scheme with practical key sizes, and provides key up to 3x smaller than the best known multi-party DPF. This work demonstrates that with careful optimization, PRG-based multi-party DPFs can achieve practical performances, and even obtain top performances.
Authors:Daiki Chiba, Hiroki Nakano, Takashi Koide
Abstract:
Phishing attacks are a significant societal threat, disproportionately harming vulnerable populations and eroding trust in essential digital services. Current defenses are often reactive, failing against modern evasive tactics like cloaking that conceal malicious content. To address this, we introduce PhishLumos, an adaptive multi-agent system that proactively mitigates entire attack campaigns. It confronts a core cybersecurity imbalance: attackers can easily scale operations, while defense remains an intensive expert task. Instead of being blocked by evasion, PhishLumos treats it as a critical signal to investigate the underlying infrastructure. Its Large Language Model (LLM)-powered agents uncover shared hosting, certificates, and domain registration patterns. On real-world data, our system identified 100% of campaigns in the median case, over a week before their confirmation by cybersecurity experts. PhishLumos demonstrates a practical shift from reactive URL blocking to proactive campaign mitigation, protecting users before they are harmed and making the digital world safer for all.
Authors:Sen Yang, Burak Ãz, Fei Wu, Fan Zhang
Abstract:
Decentralization has a geographic dimension that conventional metrics such as stake distribution overlook. Where validators run affects resilience to regional shocks (outages, disasters, government intervention) and fairness in reward access. Yet in permissionless systems, locations cannot be mandated, but they emerge from incentives. Today, Ethereum's validators cluster along the Atlantic (EU and U.S. East Coast), where latency is structurally favorable. This raises a key question: when some regions already enjoy latency advantages, how does protocol design shape validator incentives and the geography of (de)centralization? We develop a latency-calibrated agent-based model and compare two Ethereum block-building paradigms: a Single-Source Paradigm (SSP), akin to MEV-Boost, where proposers fetch full blocks from a relay that also propagates them; and a Multi-Source Paradigm (MSP), where proposers aggregate value from multiple sources and broadcast the block themselves. Simulations show that SSP concentrates around relay placement but more slowly, since proximity mainly affects propagation, and the marginal value of time is relatively uniform across regions. MSP centralizes faster: aggregating across sources makes marginal value location-dependent, amplifying payoff dispersion and migration toward latency minima. Source placement and consensus settings can dampen or intensify these effects, though once validators are already clustered, the impact of source placement on decentralization is marginal. In most cases, North America consistently emerges as the focal hub. These findings show that protocol design materially shapes validator geography and offer levers for promoting geographical decentralization.
Authors:Mohamed Abdessamed Rezazi, Mouhamed Amine Bouchiha, Ahmed Mounsf Rafik Bendada, Yacine Ghamri-Doudane
Abstract:
Roaming settlement in 5G and beyond networks demands secure, efficient, and trustworthy mechanisms for billing reconciliation between mobile operators. While blockchain promises decentralization and auditability, existing solutions suffer from critical limitations-namely, data privacy risks, assumptions of mutual trust, and scalability bottlenecks. To address these challenges, we present B5GRoam, a novel on-chain and zero-trust framework for secure, privacy-preserving, and scalable roaming settlements. B5GRoam introduces a cryptographically verifiable call detail record (CDR) submission protocol, enabling smart contracts to authenticate usage claims without exposing sensitive data. To preserve privacy, we integrate non-interactive zero-knowledge proofs (zkSNARKs) that allow on-chain verification of roaming activity without revealing user or network details. To meet the high-throughput demands of 5G environments, B5GRoam leverages Layer 2 zk-Rollups, significantly reducing gas costs while maintaining the security guarantees of Layer 1. Experimental results demonstrate a throughput of over 7,200 tx/s with strong privacy and substantial cost savings. By eliminating intermediaries and enhancing verifiability, B5GRoam offers a practical and secure foundation for decentralized roaming in future mobile networks.
Authors:Xinpeng Liu, Junming Liu, Peiyu Liu, Han Zheng, Qinying Wang, Mathias Payer, Shouling Ji, Wenhai Wang
Abstract:
Modern AI-powered Integrated Development Environments (AI-IDEs) are increasingly defined by an Agent-centric architecture, where an LLM-powered Agent is deeply integrated to autonomously execute complex tasks. This tight integration, however, also introduces a new and critical attack surface. Attackers can exploit these components by injecting malicious instructions into untrusted external sources, effectively hijacking the Agent to perform harmful operations beyond the user's intention or awareness. This emerging threat has quickly attracted research attention, leading to various proposed attack vectors, such as hijacking Model Context Protocol (MCP) Servers to access private data. However, most existing approaches lack stealth and persistence, limiting their practical impact. We propose the Cuckoo Attack, a novel attack that achieves stealthy and persistent command execution by embedding malicious payloads into configuration files. These files, commonly used in AI-IDEs, execute system commands during routine operations, without displaying execution details to the user. Once configured, such files are rarely revisited unless an obvious runtime error occurs, creating a blind spot for attackers to exploit. We formalize our attack paradigm into two stages, including initial infection and persistence. Based on these stages, we analyze the practicality of the attack execution process and identify the relevant exploitation techniques. Furthermore, we analyze the impact of Cuckoo Attack, which can not only invade the developer's local computer but also achieve supply chain attacks through the spread of configuration files. We contribute seven actionable checkpoints for vendors to evaluate their product security. The critical need for these checks is demonstrated by our end-to-end Proof of Concept, which validated the proposed attack across nine mainstream Agent and AI-IDE pairs.
Authors:Yicheng Zhang, Zijian Huang, Sophie Chen, Erfan Shayegani, Jiasi Chen, Nael Abu-Ghazaleh
Abstract:
Extended reality (XR) applications increasingly integrate Large Language Models (LLMs) to enhance user experience, scene understanding, and even generate executable XR content, and are often called "AI glasses". Despite these potential benefits, the integrated XR-LLM pipeline makes XR applications vulnerable to new forms of attacks. In this paper, we analyze LLM-Integated XR systems in the literature and in practice and categorize them along different dimensions from a systems perspective. Building on this categorization, we identify a common threat model and demonstrate a series of proof-of-concept attacks on multiple XR platforms that employ various LLM models (Meta Quest 3, Meta Ray-Ban, Android, and Microsoft HoloLens 2 running Llama and GPT models). Although these platforms each implement LLM integration differently, they share vulnerabilities where an attacker can modify the public context surrounding a legitimate LLM query, resulting in erroneous visual or auditory feedback to users, thus compromising their safety or privacy, sowing confusion, or other harmful effects. To defend against these threats, we discuss mitigation strategies and best practices for developers, including an initial defense prototype, and call on the community to develop new protection mechanisms to mitigate these risks.
Authors:Zhen Li, Zijian Zhang, Wenjin Yang, Pengbo Wang, Zhaoqi Wang, Meng Li, Yan Wu, Xuyang Liu, Jing Sun, Liehuang Zhu
Abstract:
Verifiable Secret Sharing (VSS) has been widespread in Distributed Privacy-preserving Machine Learning (DPML), because invalid shares from malicious dealers or participants can be recognized by verifying the commitment of the received shares for honest participants. However, the consistency and the computation and communitation burden of the VSS-based DPML schemes are still two serious challenges. Although Byzantine Fault Tolerance (BFT) system has been brought to guarantee the consistency and improve the efficiency of the existing VSS-based DPML schemes recently, we explore an Adaptive Share Delay Provision (ASDP) strategy, and launch an ASDP-based Customized Model Poisoning Attack (ACuMPA) for certain participants in this paper. We theoretically analyzed why the ASDP strategy and the ACuMPA algorithm works to the existing schemes. Next, we propose an [E]fficient [By]zantine [F]ault [T]olerant-based [Ve]rifiable [S]ecret-sharing (EByFTVeS) scheme. Finally, the validity, liveness, consistency and privacy of the EByFTVeS scheme are theoretically analyzed, while the efficiency of the EByFTVeS scheme outperforms that of the-state-of-art VSS scheme according to comparative experiment results.
Authors:Meryem Malak Dif, Mouhamed Amine Bouchiha, Abdelaziz Amara Korba, Yacine Ghamri-Doudane
Abstract:
The Internet of Electric Vehicles (IoEV) envisions a tightly coupled ecosystem of electric vehicles (EVs), charging infrastructure, and grid services, yet it remains vulnerable to cyberattacks, unreliable battery-state predictions, and opaque decision processes that erode trust and performance. To address these challenges, we introduce a novel Agentic Artificial Intelligence (AAI) framework tailored for IoEV, where specialized agents collaborate to deliver autonomous threat mitigation, robust analytics, and interpretable decision support. Specifically, we design an AAI architecture comprising dedicated agents for cyber-threat detection and response at charging stations, real-time State of Charge (SoC) estimation, and State of Health (SoH) anomaly detection, all coordinated through a shared, explainable reasoning layer; develop interpretable threat-mitigation mechanisms that proactively identify and neutralize attacks on both physical charging points and learning components; propose resilient SoC and SoH models that leverage continuous and adversarial-aware learning to produce accurate, uncertainty-aware forecasts with human-readable explanations; and implement a three-agent pipeline, where each agent uses LLM-driven reasoning and dynamic tool invocation to interpret intent, contextualize tasks, and execute formal optimizations for user-centric assistance. Finally, we validate our framework through comprehensive experiments across diverse IoEV scenarios, demonstrating significant improvements in security and prediction accuracy. All datasets, models, and code will be released publicly.
Authors:Jiajie He, Yuechun Gu, Keke Chen, Xintong Chen
Abstract:
Recommender systems (RecSys) have been widely applied to various applications, including E-commerce, finance, healthcare, social media and have become increasingly influential in shaping user behavior and decision-making, highlighting their growing impact in various domains. However, recent studies have shown that RecSys are vulnerable to membership inference attacks (MIAs), which aim to infer whether user interaction record was used to train a target model or not. MIAs on RecSys models can directly lead to a privacy breach. For example, via identifying the fact that a purchase record that has been used to train a RecSys associated with a specific user, an attacker can infer that user's special quirks. In recent years, MIAs have been shown to be effective on other ML tasks, e.g., classification models and natural language processing. However, traditional MIAs are ill-suited for RecSys due to the unseen posterior probability. Although MIAs on RecSys form a newly emerging and rapidly growing research area, there has been no systematic survey on this topic yet. In this article, we conduct the first comprehensive survey on RecSys MIAs. This survey offers a comprehensive review of the latest advancements in RecSys MIAs, exploring the design principles, challenges, attack and defense associated with this emerging field. We provide a unified taxonomy that categorizes different RecSys MIAs based on their characterizations and discuss their pros and cons. Based on the limitations and gaps identified in this survey, we point out several promising future research directions to inspire the researchers who wish to follow this area. This survey not only serves as a reference for the research community but also provides a clear description for researchers outside this research domain.
Authors:Iqbal H. Sarker, Helge Janicke, Ahmad Mohsin, Leandros Maglaras
Abstract:
Artificial Intelligence (AI) and Large Language Models (LLMs) are reshaping today's business practices, however, their adoption within small and medium-sized enterprises (SMEs) raises significant technical, ethical and trust issues. This paper proposes a structured, multi-phased framework designed to embed trust and ethical principles throughout the AI lifecycle for their secure and responsible use in SMEs. Structured around four pillars, i.e., Data, Algorithms, Human oversight, and Model Architecture, the framework bridges theoretical ethical principles with operational practice, enhancing AI capabilities in diverse SME applications. Ultimately, this paper offers a structured roadmap for responsible AI adoption, framing trust and ethics as a catalyst for resilience, competitiveness, and sustainable innovation in SMEs.
Authors:Hamish Alsop, Leandros Maglaras, Helge Janicke, Iqbal H. Sarker, Mohamed Amine Ferrag
Abstract:
End-to-end encryption (E2EE) has emerged as a fundamental element of modern digital communication, protecting data from unauthorized access during transmission. By design, E2EE ensures that only the intended recipient can decrypt the information, making it inaccessible even to service providers. Yet, this powerful safeguard of individual privacy and digital trust also introduces a paradox: it can simultaneously prevent law enforcement efforts by hiding potential malicious activities. This paper examines the dual role of E2EE, its critical importance to privacy, the challenges it
Authors:Carmine Cesarano, Roberto Natella
Abstract:
Coverage-guided fuzzing has been widely applied to address zero-day vulnerabilities in general-purpose software and operating systems. This approach relies on instrumenting the target code at compile time. However, applying it to industrial systems remains challenging, due to proprietary and closed-source compiler toolchains and lack of access to source code. FuzzBox addresses these limitations by integrating emulation with fuzzing: it dynamically instruments code during execution in a virtualized environment, for the injection of fuzz inputs, failure detection, and coverage analysis, without requiring source code recompilation and hardware-specific dependencies. We show the effectiveness of FuzzBox through experiments in the context of a proprietary MILS (Multiple Independent Levels of Security) hypervisor for industrial applications. Additionally, we analyze the applicability of FuzzBox across commercial IoT firmware, showcasing its broad portability.
Authors:Duc-Thien Phan, Minh-Duong Nguyen, Quoc-Viet Pham, Huilong Pi
Abstract:
Upon integrating Quantum Neural Network (QNN) as the local model, Quantum Federated Learning (QFL) has recently confronted notable challenges. Firstly, exploration is hindered over sharp minima, decreasing learning performance. Secondly, the steady gradient descent results in more stable and predictable model transmissions over wireless channels, making the model more susceptible to attacks from adversarial entities. Additionally, the local QFL model is vulnerable to noise produced by the quantum device's intermediate noise states, since it requires the use of quantum gates and circuits for training. This local noise becomes intertwined with learning parameters during training, impairing model precision and convergence rate. To address these issues, we propose a new QFL technique that incorporates differential privacy and introduces a dedicated noise estimation strategy to quantify and mitigate the impact of intermediate quantum noise. Furthermore, we design an adaptive noise generation scheme to alleviate privacy threats associated with the vanishing gradient variance phenomenon of QNN and enhance robustness against device noise. Experimental results demonstrate that our algorithm effectively balances convergence, reduces communication costs, and mitigates the adverse effects of intermediate quantum noise while maintaining strong privacy protection. Using real-world datasets, we achieved test accuracy of up to 98.47\% for the MNIST dataset and 83.85\% for the CIFAR-10 dataset while maintaining fast execution times.
Authors:Shei Pern Chua, Zhen Leng Thai, Teh Kai Jun, Xiao Li, Xiaolin Hu
Abstract:
Large language models (LLMs) have undergone safety alignment efforts to mitigate harmful outputs. However, as LLMs become more sophisticated in reasoning, their intelligence may introduce new security risks. While traditional jailbreak attacks relied on singlestep attacks, multi-turn jailbreak strategies that adapt dynamically to context remain underexplored. In this work, we introduce TRIAL (Trolley-problem Reasoning for Interactive Attack Logic), a framework that leverages LLMs ethical reasoning to bypass their safeguards. TRIAL embeds adversarial goals within ethical dilemmas modeled on the trolley problem. TRIAL demonstrates high jailbreak success rates towards both open and close-source models. Our findings underscore a fundamental limitation in AI safety: as models gain advanced reasoning abilities, the nature of their alignment may inadvertently allow for more covert security vulnerabilities to be exploited. TRIAL raises an urgent need in reevaluating safety alignment oversight strategies, as current safeguards may prove insufficient against context-aware adversarial attack.
Authors:Saad Ullah, Praneeth Balasubramanian, Wenbo Guo, Amanda Burnett, Hammond Pearce, Christopher Kruegel, Giovanni Vigna, Gianluca Stringhini
Abstract:
High-quality datasets of real-world vulnerabilities and their corresponding verifiable exploits are crucial resources in software security research. Yet such resources remain scarce, as their creation demands intensive manual effort and deep security expertise. In this paper, we present CVE-GENIE, an automated, large language model (LLM)-based multi-agent framework designed to reproduce real-world vulnerabilities, provided in Common Vulnerabilities and Exposures (CVE) format, to enable creation of high-quality vulnerability datasets. Given a CVE entry as input, CVE-GENIE gathers the relevant resources of the CVE, automatically reconstructs the vulnerable environment, and (re)produces a verifiable exploit. Our systematic evaluation highlights the efficiency and robustness of CVE-GENIE's design and successfully reproduces approximately 51% (428 of 841) CVEs published in 2024-2025, complete with their verifiable exploits, at an average cost of $2.77 per CVE. Our pipeline offers a robust method to generate reproducible CVE benchmarks, valuable for diverse applications such as fuzzer evaluation, vulnerability patching, and assessing AI's security capabilities.
Authors:Yitong Guo, Hongbo Chen, Haobin Hiroki Chen, Yukui Luo, XiaoFeng Wang, Chenghong Wang
Abstract:
While Trusted Execution Environments provide a strong foundation for secure cloud computing, they remain vulnerable to access pattern leakages. Oblivious Maps (OMAPs) mitigate this by fully hiding access patterns but suffer from high overhead due to randomized remapping and worst-case padding. We argue these costs are not fundamental. Modern accelerators featuring High-Bandwidth Memory (HBM) offer a new opportunity: Vaswani et al. [OSDI'18] point out that eavesdropping on HBM is difficult -- even for physical attackers -- as its memory channels are sealed together with processor cores inside the same physical package. Later, Hunt et al. [NSDI'20] show that, with proper isolation, HBM can be turned into an unobservable region where both data and memory traces are hidden. This motivates a rethink of OMAP design with HBM-backed solutions to finally overcome their traditional performance limits. Building on these insights, we present BOLT, a Bandwidth Optimized, Lightning-fast OMAP accelerator that, for the first time, achieves O(1) + O(log_2(log_2 (N))) bandwidth overhead. BOLT introduces three key innovations: (i) a new OMAP algorithm that leverages isolated HBM as an unobservable cache to accelerate oblivious access to large host memory; (ii) a self-hosted architecture that offloads execution and memory control from the host to mitigate CPU-side leakage; and (iii) tailored algorithm-architecture co-designs that maximize resource efficiency. We implement a prototype BOLT on a Xilinx U55C FPGA. Evaluations show that BOLT achieves up to 279x and 480x speedups in initialization and query time, respectively, over state-of-the-art OMAPs, including an industry implementation from Facebook.
Authors:Jiacheng Guo, Ning Gao, Yiping Zuo, Hao Xu, Shi Jin, Kai Kit Wong
Abstract:
As a promising physical layer security technique, physical layer key generation (PLKG) enables legitimate users to obtain secret keys from wireless channel without security infrastructures. However, in harsh propagation environments, the channel characteristic becomes unsatisfactory, the key generation rate (KGR) is significantly deteriorated. In this paper, we propose a novel fluid antenna (FA) enabled PLKG system to address this challenge. Specifically, we first derive the closed-form expression of the KGR for FA array, and then jointly optimize the precoding matrix and the antenna positions via a particle swarm optimization (PSO) algorithm. Next, to further reduce the computational complexity of the optimization procedure, we develop an alternating optimization (AO) algorithm, which combines the projected gradient descent (PGD) and the PSO. Simulation results demonstrate that by exploiting the additional spatial degree of freedom (DoF), our FA enabled PLKG system is superior to the benchmarks, such as the conventional fixed-position antenna (FPA) array and the reconfigurable intelligent surface (RIS). It is worth highlighting that compared to the conventional uniform planar antenna (UPA), the FA enabled PLKG achieves a 35.42\% KGR performance improvement under PSO algorithm and a 67.73\% KGR performance improvement under AO algorithm, respectively.
Authors:Fengchao Chen, Tingmin Wu, Van Nguyen, Carsten Rudolph
Abstract:
Phishing is a pervasive form of social engineering in which attackers impersonate trusted entities to steal information or induce harmful actions. Text-based phishing dominates for its low cost, scalability, and concealability, advantages recently amplified by large language models (LLMs) that enable ``Phishing-as-a-Service'' attacks at scale within minutes. Despite the growing research into LLM-facilitated phishing attacks, consolidated systematic research on the phishing attack life cycle remains scarce. In this work, we present the first systematization of knowledge (SoK) on LLM-generated phishing, offering an end-to-end analysis that spans generation techniques, attack features, and mitigation strategies. We introduce Generation-Characterization-Defense (GenCharDef), which systematizes the ways in which LLM-generated phishing differs from traditional phishing across methodologies, security perspectives, data dependencies, and evaluation practices. This framework highlights unique challenges of LLM-driven phishing, providing a coherent foundation for understanding the evolving threat landscape and guiding the design of more resilient defenses.
Authors:A H M Nazmus Sakib, Mahsin Bin Akram, Joseph Spracklen, Sahan Kalutarage, Raveen Wijewickrama, Igor Bilogrevic, Murtuza Jadliwala
Abstract:
Browser fingerprinting defenses have historically focused on detecting JavaScript(JS)-based tracking techniques. However, the widespread adoption of WebAssembly (WASM) introduces a potential blind spot, as adversaries can convert JS to WASM's low-level binary format to obfuscate malicious logic. This paper presents the first systematic evaluation of how such WASM-based obfuscation impacts the robustness of modern fingerprinting defenses. We develop an automated pipeline that translates real-world JS fingerprinting scripts into functional WASM-obfuscated variants and test them against two classes of defenses: state-of-the-art detectors in research literature and commercial, in-browser tools. Our findings reveal a notable divergence: detectors proposed in the research literature that rely on feature-based analysis of source code show moderate vulnerability, stemming from outdated datasets or a lack of WASM compatibility. In contrast, defenses such as browser extensions and native browser features remained completely effective, as their API-level interception is agnostic to the script's underlying implementation. These results highlight a gap between academic and practical defense strategies and offer insights into strengthening detection approaches against WASM-based obfuscation, while also revealing opportunities for more evasive techniques in future attacks.
Authors:Jiajie He, Yuechun Gu, Min-Chun Chen, Keke Chen
Abstract:
Large language models (LLMs) based Recommender Systems (RecSys) can flexibly adapt recommendation systems to different domains. It utilizes in-context learning (ICL), i.e., the prompts, to customize the recommendation functions, which include sensitive historical user-specific item interactions, e.g., implicit feedback like clicked items or explicit product reviews. Such private information may be exposed to novel privacy attack. However, no study has been done on this important issue. We design four membership inference attacks (MIAs), aiming to reveal whether victims' historical interactions have been used by system prompts. They are \emph{direct inquiry, hallucination, similarity, and poisoning attacks}, each of which utilizes the unique features of LLMs or RecSys. We have carefully evaluated them on three LLMs that have been used to develop ICL-LLM RecSys and two well-known RecSys benchmark datasets. The results confirm that the MIA threat on LLM RecSys is realistic: direct inquiry and poisoning attacks showing significantly high attack advantages. We have also analyzed the factors affecting these attacks, such as the number of shots in system prompts and the position of the victim in the shots.
Authors:Haijian Ma, Daizong Liu, Xiaowen Cai, Pan Zhou, Yulai Xie
Abstract:
Intrusion Detection Systems (IDS) play a crucial role in network security defense. However, a significant challenge for IDS in training detection models is the shortage of adequately labeled malicious samples. To address these issues, this paper introduces a novel semi-supervised framework \textbf{GANGRL-LLM}, which integrates Generative Adversarial Networks (GANs) with Large Language Models (LLMs) to enhance malicious code generation and SQL Injection (SQLi) detection capabilities in few-sample learning scenarios. Specifically, our framework adopts a collaborative training paradigm where: (1) the GAN-based discriminator improves malicious pattern recognition through adversarial learning with generated samples and limited real samples; and (2) the LLM-based generator refines the quality of malicious code synthesis using reward signals from the discriminator. The experimental results demonstrate that even with a limited number of labeled samples, our training framework is highly effective in enhancing both malicious code generation and detection capabilities. This dual enhancement capability offers a promising solution for developing adaptive defense systems capable of countering evolving cyber threats.
Authors:Qiming Guo, Jinwen Tang, Xingran Huang
Abstract:
We introduce Advertisement Embedding Attacks (AEA), a new class of LLM security threats that stealthily inject promotional or malicious content into model outputs and AI agents. AEA operate through two low-cost vectors: (1) hijacking third-party service-distribution platforms to prepend adversarial prompts, and (2) publishing back-doored open-source checkpoints fine-tuned with attacker data. Unlike conventional attacks that degrade accuracy, AEA subvert information integrity, causing models to return covert ads, propaganda, or hate speech while appearing normal. We detail the attack pipeline, map five stakeholder victim groups, and present an initial prompt-based self-inspection defense that mitigates these injections without additional model retraining. Our findings reveal an urgent, under-addressed gap in LLM security and call for coordinated detection, auditing, and policy responses from the AI-safety community.
Authors:Shilin Xiao, Wenjun Zhu, Yan Jiang, Kai Wang, Peiwang Wang, Chen Yan, Xiaoyu Ji, Wenyuan Xu
Abstract:
Sensors are fundamental to cyber-physical systems (CPS), enabling perception and control by transducing physical stimuli into digital measurements. However, despite growing research on physical attacks on sensors, our understanding of sensor hardware vulnerabilities remains fragmented due to the ad-hoc nature of this field. Moreover, the infinite attack signal space further complicates threat abstraction and defense. To address this gap, we propose a systematization framework, termed sensor out-of-band (OOB) vulnerabilities, that for the first time provides a comprehensive abstraction for sensor attack surfaces based on underlying physical principles. We adopt a bottom-up systematization methodology that analyzes OOB vulnerabilities across three levels. At the component level, we identify the physical principles and limitations that contribute to OOB vulnerabilities. At the sensor level, we categorize known attacks and evaluate their practicality. At the system level, we analyze how CPS features such as sensor fusion, closed-loop control, and intelligent perception impact the exposure and mitigation of OOB threats. Our findings offer a foundational understanding of sensor hardware security and provide guidance and future directions for sensor designers, security researchers, and system developers aiming to build more secure sensors and CPS.
Authors:Ziyu Wang, Elahe Khatibi, Farshad Firouzi, Sanaz Rahimi Mousavi, Krishnendu Chakrabarty, Amir M. Rahmani
Abstract:
The increasing availability of publicly shared electrocardiogram (ECG) data raises critical privacy concerns, as its biometric properties make individuals vulnerable to linkage attacks. Unlike prior studies that assume idealized adversarial capabilities, we evaluate ECG privacy risks under realistic conditions where attackers operate with partial knowledge. Using data from 109 participants across diverse real-world datasets, our approach achieves 85% accuracy in re-identifying individuals in public datasets while maintaining a 14.2% overall misclassification rate at an optimal confidence threshold, with 15.6% of unknown individuals misclassified as known and 12.8% of known individuals misclassified as unknown. These results highlight the inadequacy of simple anonymization techniques in preventing re-identification, demonstrating that even limited adversarial knowledge enables effective identity linkage. Our findings underscore the urgent need for privacy-preserving strategies, such as differential privacy, access control, and encrypted computation, to mitigate re-identification risks while ensuring the utility of shared biosignal data in healthcare applications.
Authors:Yinghan Zhou, Juan Wen, Wanli Peng, Zhengxian Wu, Ziwei Zhang, Yiming Xue
Abstract:
AI-generated text (AIGT) detection evasion aims to reduce the detection probability of AIGT, helping to identify weaknesses in detectors and enhance their effectiveness and reliability in practical applications. Although existing evasion methods perform well, they suffer from high computational costs and text quality degradation. To address these challenges, we propose Self-Disguise Attack (SDA), a novel approach that enables Large Language Models (LLM) to actively disguise its output, reducing the likelihood of detection by classifiers. The SDA comprises two main components: the adversarial feature extractor and the retrieval-based context examples optimizer. The former generates disguise features that enable LLMs to understand how to produce more human-like text. The latter retrieves the most relevant examples from an external knowledge base as in-context examples, further enhancing the self-disguise ability of LLMs and mitigating the impact of the disguise process on the diversity of the generated text. The SDA directly employs prompts containing disguise features and optimized context examples to guide the LLM in generating detection-resistant text, thereby reducing resource consumption. Experimental results demonstrate that the SDA effectively reduces the average detection accuracy of various AIGT detectors across texts generated by three different LLMs, while maintaining the quality of AIGT.
Authors:Haolin Zheng, Ning Gao, Donghong Cai, Shi Jin, Michail Matthaiou
Abstract:
Unmanned aerial vehicle (UAV) individual (ID) identification is a critical security surveillance strategy in low-altitude integrated sensing and communication (ISAC) networks. In this paper, we propose a novel dynamic knowledge distillation (KD)-enabled wireless radio frequency fingerprint large language model (RFF-LLM) framework for UAV ID identification. First, we propose an RFF-LLM framework based on the modified GPT-2 model to improve the identification accuracy in complex outdoor environments. Then, considering the parameter overhead of the RFF-LLM, we design a dynamic KD strategy to compress the model. Specifically, the proximal policy optimization (PPO) algorithm is employed to dynamically adjust the distillation temperature, overcoming the local optimum dilemma inherent in static KD. As a next step, the knowledge of the RFF-LLM is adequately transferred to the lightweight Lite-HRNet model. Finally, our experiments are conducted based on the self-built drone RFF dataset of Release one, namely DRFF-R1, by collecting the I/Q signals of 20 commercial UAVs in channel 149. The experiment results show that the proposed framework achieves 98.38% ID identification accuracy with merely 0.15 million parameters and 2.74 ms response time, which outperforms the benchmarks.
Authors:Jiaxuan Wu, Yinghan Zhou, Wanli Peng, Yiming Xue, Juan Wen, Ping Zhong
Abstract:
Training large language models (LLMs) is resource-intensive and expensive, making protecting intellectual property (IP) for LLMs crucial. Recently, embedding fingerprints into LLMs has emerged as a prevalent method for establishing model ownership. However, existing back-door-based methods suffer from limited stealth and efficiency. To simultaneously address these issues, we propose EditMF, a training-free fingerprinting paradigm that achieves highly imperceptible fingerprint embedding with minimal computational overhead. Ownership bits are mapped to compact, semantically coherent triples drawn from an encrypted artificial knowledge base (e.g., virtual author-novel-protagonist facts). Causal tracing localizes the minimal set of layers influencing each triple, and a zero-space update injects the fingerprint without perturbing unrelated knowledge. Verification requires only a single black-box query and succeeds when the model returns the exact pre-embedded protagonist. Empirical results on LLaMA and Qwen families show that EditMF combines high imperceptibility with negligible model's performance loss, while delivering robustness far beyond LoRA-based fingerprinting and approaching that of SFT embeddings. Extensive experiments demonstrate that EditMF is an effective and low-overhead solution for secure LLM ownership verification.
Authors:Yancheng Jiang, Yan Jiang, Ruochen Zhou, Yi-Chao Chen, Xiaoyu Ji, Wenyuan Xu
Abstract:
Virtual Reality (VR) techniques, serving as the bridge between the real and virtual worlds, have boomed and are widely used in manufacturing, remote healthcare, gaming, etc. Specifically, VR systems offer users immersive experiences that include both perceptions and actions. Various studies have demonstrated that attackers can manipulate VR software to influence users' interactions, including perception and actions. However, such attacks typically require strong access and specialized expertise. In this paper, we are the first to present a systematic analysis of physical attacks against VR systems and introduce False Reality, a new attack threat to VR devices without requiring access to or modification of their software. False Reality disturbs VR system services by tampering with sensor measurements, and further spoofing users' perception even inducing harmful actions, e.g., inducing dizziness or causing users to crash into obstacles, by exploiting perceptual and psychological effects. We formalize these threats through an attack pathway framework and validate three representative pathways via physical experiments and user studies on five commercial VR devices. Finally, we further propose a defense prototype to mitigate such threats. Our findings shall provide valuable insights for enhancing the security and resilience of future VR systems.
Authors:Yueyang Quan, Chang Wang, Shengjie Zhai, Minghong Fang, Zhuqing Liu
Abstract:
Decentralized min-max optimization allows multi-agent systems to collaboratively solve global min-max optimization problems by facilitating the exchange of model updates among neighboring agents, eliminating the need for a central server. However, sharing model updates in such systems carry a risk of exposing sensitive data to inference attacks, raising significant privacy concerns. To mitigate these privacy risks, differential privacy (DP) has become a widely adopted technique for safeguarding individual data. Despite its advantages, implementing DP in decentralized min-max optimization poses challenges, as the added noise can hinder convergence, particularly in non-convex scenarios with complex agent interactions in min-max optimization problems. In this work, we propose an algorithm called DPMixSGD (Differential Private Minmax Hybrid Stochastic Gradient Descent), a novel privacy-preserving algorithm specifically designed for non-convex decentralized min-max optimization. Our method builds on the state-of-the-art STORM-based algorithm, one of the fastest decentralized min-max solutions. We rigorously prove that the noise added to local gradients does not significantly compromise convergence performance, and we provide theoretical bounds to ensure privacy guarantees. To validate our theoretical findings, we conduct extensive experiments across various tasks and models, demonstrating the effectiveness of our approach.
Authors:Hiroki Nakano, Takashi Koide, Daiki Chiba
Abstract:
Phishing attacks continue to evolve, with cloaking techniques posing a significant challenge to detection efforts. Cloaking allows attackers to display phishing sites only to specific users while presenting legitimate pages to security crawlers, rendering traditional detection systems ineffective. This research proposes PhishParrot, a novel crawling environment optimization system designed to counter cloaking techniques. PhishParrot leverages the contextual analysis capabilities of Large Language Models (LLMs) to identify potential patterns in crawling information, enabling the construction of optimal user profiles capable of bypassing cloaking mechanisms. The system accumulates information on phishing sites collected from diverse environments. It then adapts browser settings and network configurations to match the attacker's target user conditions based on information extracted from similar cases. A 21-day evaluation showed that PhishParrot improved detection accuracy by up to 33.8% over standard analysis systems, yielding 91 distinct crawling environments for diverse conditions targeted by attackers. The findings confirm that the combination of similar-case extraction and LLM-based context analysis is an effective approach for detecting cloaked phishing attacks.
Authors:Imtiaz Karim, Hyunwoo Lee, Hassan Asghar, Kazi Samin Mubasshir, Seulgi Han, Mashroor Hasan Bhuiyan, Elisa Bertino
Abstract:
We present VWAttacker, the first systematic testing framework for analyzing the security of Voice over WiFi (VoWiFi) User Equipment (UE) implementations. VWAttacker includes a complete VoWiFi network testbed that communicates with Commercial-Off-The-Shelf (COTS) UEs based on a simple interface to test the behavior of diverse VoWiFi UE implementations; uses property-guided adversarial testing to uncover security issues in different UEs systematically. To reduce manual effort in extracting and testing properties, we introduce an LLM-based, semi-automatic, and scalable approach for property extraction and testcase (TC) generation. These TCs are systematically mutated by two domain-specific transformations. Furthermore, we introduce two deterministic oracles to detect property violations automatically. Coupled with these techniques, VWAttacker extracts 63 properties from 11 specifications, evaluates 1,116 testcases, and detects 13 issues in 21 UEs. The issues range from enforcing a DH shared secret to 0 to supporting weak algorithms. These issues result in attacks that expose the victim UE's identity or establish weak channels, thus severely hampering the security of cellular networks. We responsibly disclose the findings to all the related vendors. At the time of writing, one of the vulnerabilities has been acknowledged by MediaTek with high severity.
Authors:Jiajie He, Yuechun Gu, Keke Chen
Abstract:
Recommender systems (RecSys) have become an essential component of many web applications. The core of the system is a recommendation model trained on highly sensitive user-item interaction data. While privacy-enhancing techniques are actively studied in the research community, the real-world model development still depends on minimal privacy protection, e.g., via controlled access. Users of such systems should have the right to choose \emph{not} to share highly sensitive interactions. However, there is no method allowing the user to know which interactions are more sensitive than others. Thus, quantifying the privacy risk of RecSys training data is a critical step to enabling privacy-aware RecSys model development and deployment. We propose a membership-inference attack (MIA)- based privacy scoring method, RecPS, to measure privacy risks at both the interaction and user levels. The RecPS interaction-level score definition is motivated and derived from differential privacy, which is then extended to the user-level scoring method. A critical component is the interaction-level MIA method RecLiRA, which gives high-quality membership estimation. We have conducted extensive experiments on well-known benchmark datasets and RecSys models to show the unique features and benefits of RecPS scoring in risk assessment and RecSys model unlearning.
Authors:Delong Ran, Xinlei He, Tianshuo Cong, Anyu Wang, Qi Li, Xiaoyun Wang
Abstract:
Language Models (LMs) typically adhere to a "pre-training and fine-tuning" paradigm, where a universal pre-trained model can be fine-tuned to cater to various specialized domains. Low-Rank Adaptation (LoRA) has gained the most widespread use in LM fine-tuning due to its lightweight computational cost and remarkable performance. Because the proportion of parameters tuned by LoRA is relatively small, there might be a misleading impression that the LoRA fine-tuning data is invulnerable to Membership Inference Attacks (MIAs). However, we identify that utilizing the pre-trained model can induce more information leakage, which is neglected by existing MIAs. Therefore, we introduce LoRA-Leak, a holistic evaluation framework for MIAs against the fine-tuning datasets of LMs. LoRA-Leak incorporates fifteen membership inference attacks, including ten existing MIAs, and five improved MIAs that leverage the pre-trained model as a reference. In experiments, we apply LoRA-Leak to three advanced LMs across three popular natural language processing tasks, demonstrating that LoRA-based fine-tuned LMs are still vulnerable to MIAs (e.g., 0.775 AUC under conservative fine-tuning settings). We also applied LoRA-Leak to different fine-tuning settings to understand the resulting privacy risks. We further explore four defenses and find that only dropout and excluding specific LM layers during fine-tuning effectively mitigate MIA risks while maintaining utility. We highlight that under the "pre-training and fine-tuning" paradigm, the existence of the pre-trained model makes MIA a more severe risk for LoRA-based LMs. We hope that our findings can provide guidance on data privacy protection for specialized LM providers.
Authors:Haonan An, Guang Hua, Yu Guo, Hangcheng Cao, Susanto Rahardja, Yuguang Fang
Abstract:
The intellectual property of deep neural network (DNN) models can be protected with DNN watermarking, which embeds copyright watermarks into model parameters (white-box), model behavior (black-box), or model outputs (box-free), and the watermarks can be subsequently extracted to verify model ownership or detect model theft. Despite recent advances, these existing methods are inherently intrusive, as they either modify the model parameters or alter the structure. This natural intrusiveness raises concerns about watermarking-induced shifts in model behavior and the additional cost of fine-tuning, further exacerbated by the rapidly growing model size. As a result, model owners are often reluctant to adopt DNN watermarking in practice, which limits the development of practical Watermarking as a Service (WaaS) systems. To address this issue, we introduce Nonintrusive Watermarking as a Service (NWaaS), a novel trustless paradigm designed for X-to-Image models, in which we hypothesize that with the model untouched, an owner-defined watermark can still be extracted from model outputs. Building on this concept, we propose ShadowMark, a concrete implementation of NWaaS which addresses critical deployment challenges by establishing a robust and nonintrusive side channel in the protected model's black-box API, leveraging a key encoder and a watermark decoder. It is significantly distinctive from existing solutions by attaining the so-called absolute fidelity and being applicable to different DNN architectures, while being also robust against existing attacks, eliminating the fidelity-robustness trade-off. Extensive experiments on image-to-image, noise-to-image, noise-and-text-to-image, and text-to-image models, demonstrate the efficacy and practicality of ShadowMark for real-world deployment of nonintrusive DNN watermarking.
Authors:Shreya Meel, Sennur Ulukus
Abstract:
We introduce the problem of symmetric private information retrieval (SPIR) on replicated databases modeled by a simple graph. In this model, each vertex corresponds to a server, and a message is replicated on two servers if and only if there is an edge between them. We consider the setting where the server-side common randomness necessary to accomplish SPIR is also replicated at the servers according to the graph, and we call this as message-specific common randomness. In this setting, we establish a lower bound on the SPIR capacity, i.e., the maximum download rate, for general graphs, by proposing an achievable SPIR scheme. Next, we prove that, for any SPIR scheme to be feasible, the minimum size of message-specific randomness should be equal to the size of a message. Finally, by providing matching upper bounds, we derive the exact SPIR capacity for the class of path and regular graphs.
Authors:Ramesh Raskar, Pradyumna Chari, John Zinky, Mahesh Lambe, Jared James Grogan, Sichao Wang, Rajesh Ranjan, Rekha Singhal, Shailja Gupta, Robert Lincourt, Raghu Bala, Aditi Joshi, Abhishek Singh, Ayush Chopra, Dimitris Stripelis, Bhuwan B, Sumit Kumar, Maria Gorskikh
Abstract:
The Internet is poised to host billions to trillions of autonomous AI agents that negotiate, delegate, and migrate in milliseconds and workloads that will strain DNS-centred identity and discovery. In this paper, we describe the NANDA index architecture, which we envision as a means for discoverability, identifiability and authentication in the internet of AI agents. We present an architecture where a minimal lean index resolves to dynamic, cryptographically verifiable AgentFacts that supports multi-endpoint routing, load balancing, privacy-preserving access, and credentialed capability assertions. Our architecture design delivers five concrete guarantees: (1) A quilt-like index proposal that supports both NANDA-native agents as well as third party agents being discoverable via the index, (2) rapid global resolution for newly spawned AI agents, (3) sub-second revocation and key rotation, (4) schema-validated capability assertions, and (5) privacy-preserving discovery across organisational boundaries via verifiable, least-disclosure queries. We formalize the AgentFacts schema, specify a CRDT-based update protocol, and prototype adaptive resolvers. The result is a lightweight, horizontally scalable foundation that unlocks secure, trust-aware collaboration for the next generation of the Internet of AI agents, without abandoning existing web infrastructure.
Authors:Yiran Wu, Mauricio Velazco, Andrew Zhao, Manuel Raúl Meléndez Luján, Srisuma Movva, Yogesh K Roy, Quang Nguyen, Roberto Rodriguez, Qingyun Wu, Michael Albada, Julia Kiseleva, Anand Mudgerikar
Abstract:
We present ExCyTIn-Bench, the first benchmark to Evaluate an LLM agent x on the task of Cyber Threat Investigation through security questions derived from investigation graphs. Real-world security analysts must sift through a large number of heterogeneous alert signals and security logs, follow multi-hop chains of evidence, and compile an incident report. With the developments of LLMs, building LLM-based agents for automatic thread investigation is a promising direction. To assist the development and evaluation of LLM agents, we construct a dataset from a controlled Azure tenant that covers 8 simulated real-world multi-step attacks, 57 log tables from Microsoft Sentinel and related services, and 589 automatically generated questions. We leverage security logs extracted with expert-crafted detection logic to build threat investigation graphs, and then generate questions with LLMs using paired nodes on the graph, taking the start node as background context and the end node as answer. Anchoring each question to these explicit nodes and edges not only provides automatic, explainable ground truth answers but also makes the pipeline reusable and readily extensible to new logs. This also enables the automatic generation of procedural tasks with verifiable rewards, which can be naturally extended to training agents via reinforcement learning. Our comprehensive experiments with different models confirm the difficulty of the task: with the base setting, the average reward across all evaluated models is 0.249, and the best achieved is 0.368, leaving substantial headroom for future research. Code and data are coming soon!
Authors:Fengxiao Tang, Huan Li, Ming Zhao, Zongzong Wu, Shisong Peng, Tao Yin
Abstract:
Verifying the credibility of Cyber Threat Intelligence (CTI) is essential for reliable cybersecurity defense. However, traditional approaches typically treat this task as a static classification problem, relying on handcrafted features or isolated deep learning models. These methods often lack the robustness needed to handle incomplete, heterogeneous, or noisy intelligence, and they provide limited transparency in decision-making-factors that reduce their effectiveness in real-world threat environments. To address these limitations, we propose LRCTI, a Large Language Model (LLM)-based framework designed for multi-step CTI credibility verification. The framework first employs a text summarization module to distill complex intelligence reports into concise and actionable threat claims. It then uses an adaptive multi-step evidence retrieval mechanism that iteratively identifies and refines supporting information from a CTI-specific corpus, guided by LLM feedback. Finally, a prompt-based Natural Language Inference (NLI) module is applied to evaluate the credibility of each claim while generating interpretable justifications for the classification outcome. Experiments conducted on two benchmark datasets, CTI-200 and PolitiFact show that LRCTI improves F1-Macro and F1-Micro scores by over 5%, reaching 90.9% and 93.6%, respectively, compared to state-of-the-art baselines. These results demonstrate that LRCTI effectively addresses the core limitations of prior methods, offering a scalable, accurate, and explainable solution for automated CTI credibility verification
Authors:Md Ajwad Akil, Adrian Shuai Li, Imtiaz Karim, Arun Iyengar, Ashish Kundu, Vinny Parla, Elisa Bertino
Abstract:
Large Language Models (LLMs) have transformed software development and automated code generation. Motivated by these advancements, this paper explores the feasibility of LLMs in modifying malware source code to generate variants. We introduce LLMalMorph, a semi-automated framework that leverages semantical and syntactical code comprehension by LLMs to generate new malware variants. LLMalMorph extracts function-level information from the malware source code and employs custom-engineered prompts coupled with strategically defined code transformations to guide the LLM in generating variants without resource-intensive fine-tuning. To evaluate LLMalMorph, we collected 10 diverse Windows malware samples of varying types, complexity and functionality and generated 618 variants. Our thorough experiments demonstrate that it is possible to reduce the detection rates of antivirus engines of these malware variants to some extent while preserving malware functionalities. In addition, despite not optimizing against any Machine Learning (ML)-based malware detectors, several variants also achieved notable attack success rates against an ML-based malware classifier. We also discuss the limitations of current LLM capabilities in generating malware variants from source code and assess where this emerging technology stands in the broader context of malware variant generation.
Authors:Sarah Ali Siddiqui, Chandra Thapa, Derui Wang, Rayne Holland, Wei Shao, Seyit Camtepe, Hajime Suzuki, Rajiv Shah
Abstract:
Gaps between established security standards and their practical implementation have the potential to introduce vulnerabilities, possibly exposing them to security risks. To effectively address and mitigate these security and compliance challenges, security risk management strategies are essential. However, it must adhere to well-established strategies and industry standards to ensure consistency, reliability, and compatibility both within and across organizations. In this paper, we introduce a new hybrid risk assessment framework called TELSAFE, which employs probabilistic modeling for quantitative risk assessment and eliminates the influence of expert opinion bias. The framework encompasses both qualitative and quantitative assessment phases, facilitating effective risk management strategies tailored to the unique requirements of organizations. A specific use case utilizing Common Vulnerabilities and Exposures (CVE)-related data demonstrates the framework's applicability and implementation in real-world scenarios, such as in the telecommunications industry.
Authors:Al Hussein Dabashi, Sajjad Maleki, Biswarup Mukherjee, Gregory Epiphaniou, Carsten Maple, Charalambos Konstantinou, Subhash Lakshminarayana
Abstract:
Local Energy Markets (LEMs), though pivotal to the energy transition, face growing cybersecurity threats due to their reliance on smart grid communication standards and vulnerable Internet-of-Things (IoT)-enabled devices. This is a critical issue because such vulnerabilities can be exploited to manipulate market operations, compromise participants' privacy, and destabilize power distribution networks. This work maps LEM communication flows to existing standards, highlights potential impacts of key identified vulnerabilities, and simulates cyberattack scenarios on a privacy-preserving LEM model to assess their impacts. Findings reveal how attackers could distort pricing and demand patterns. We finally present recommendations for researchers, industry developers, policymakers, and LEM stakeholders to secure future LEM deployments.
Authors:Marc Damie, Florian Hahn, Andreas Peter, Jan Ramon
Abstract:
Ishai et al. (FOCS'06) introduced secure shuffling as an efficient building block for private data aggregation. Recently, the field of differential privacy has revived interest in secure shufflers by highlighting the privacy amplification they can provide in various computations. Although several works argue for the utility of secure shufflers, they often treat them as black boxes; overlooking the practical vulnerabilities and performance trade-offs of existing implementations. This leaves a central question open: what makes a good secure shuffler?
This survey addresses that question by identifying, categorizing, and comparing 26 secure protocols that realize the necessary shuffling functionality. To enable a meaningful comparison, we adapt and unify existing security definitions into a consistent set of properties. We also present an overview of privacy-preserving technologies that rely on secure shufflers, offer practical guidelines for selecting appropriate protocols, and outline promising directions for future work.
Authors:Guan-Yan Yang, Tzu-Yu Cheng, Ya-Wen Teng, Farn Wanga, Kuo-Hui Yeh
Abstract:
The integration of Large Language Models (LLMs) into computer applications has introduced transformative capabilities but also significant security challenges. Existing safety alignments, which primarily focus on semantic interpretation, leave LLMs vulnerable to attacks that use non-standard data representations. This paper introduces ArtPerception, a novel black-box jailbreak framework that strategically leverages ASCII art to bypass the security measures of state-of-the-art (SOTA) LLMs. Unlike prior methods that rely on iterative, brute-force attacks, ArtPerception introduces a systematic, two-phase methodology. Phase 1 conducts a one-time, model-specific pre-test to empirically determine the optimal parameters for ASCII art recognition. Phase 2 leverages these insights to launch a highly efficient, one-shot malicious jailbreak attack. We propose a Modified Levenshtein Distance (MLD) metric for a more nuanced evaluation of an LLM's recognition capability. Through comprehensive experiments on four SOTA open-source LLMs, we demonstrate superior jailbreak performance. We further validate our framework's real-world relevance by showing its successful transferability to leading commercial models, including GPT-4o, Claude Sonnet 3.7, and DeepSeek-V3, and by conducting a rigorous effectiveness analysis against potential defenses such as LLaMA Guard and Azure's content filters. Our findings underscore that true LLM security requires defending against a multi-modal space of interpretations, even within text-only inputs, and highlight the effectiveness of strategic, reconnaissance-based attacks. Content Warning: This paper includes potentially harmful and offensive model outputs.
Authors:Xiangtao Meng, Tianshuo Cong, Li Wang, Wenyu Chen, Zheng Li, Shanqing Guo, Xiaoyun Wang
Abstract:
Large Language Models (LLMs) have shown remarkable performance across various applications, but their deployment in sensitive domains raises significant concerns. To mitigate these risks, numerous defense strategies have been proposed. However, most existing studies assess these defenses in isolation, overlooking their broader impacts across other risk dimensions. In this work, we take the first step in investigating unintended interactions caused by defenses in LLMs, focusing on the complex interplay between safety, fairness, and privacy. Specifically, we propose CrossRiskEval, a comprehensive evaluation framework to assess whether deploying a defense targeting one risk inadvertently affects others. Through extensive empirical studies on 14 defense-deployed LLMs, covering 12 distinct defense strategies, we reveal several alarming side effects: 1) safety defenses may suppress direct responses to sensitive queries related to bias or privacy, yet still amplify indirect privacy leakage or biased outputs; 2) fairness defenses increase the risk of misuse and privacy leakage; 3) privacy defenses often impair safety and exacerbate bias. We further conduct a fine-grained neuron-level analysis to uncover the underlying mechanisms of these phenomena. Our analysis reveals the existence of conflict-entangled neurons in LLMs that exhibit opposing sensitivities across multiple risk dimensions. Further trend consistency analysis at both task and neuron levels confirms that these neurons play a key role in mediating the emergence of unintended behaviors following defense deployment. We call for a paradigm shift in LLM risk evaluation, toward holistic, interaction-aware assessment of defense strategies.
Authors:Georgios Diamantopoulos, Nikos Tziritas, Rami Bahsoon, Georgios Theodoropoulos
Abstract:
The necessity of blockchain systems to remain decentralised limits current solutions to blockchain governance and dynamic management, forcing a trade-off between control and decentralisation. In light of the above, this work proposes a dynamic and decentralised blockchain management mechanism based on digital twins. To ensure decentralisation, the proposed mechanism utilises multiple digital twins that the system's stakeholders control. To facilitate decentralised decision-making, the twins are organised in a secondary blockchain system that orchestrates agreement on, and propagation of decisions to the managed blockchain. This enables the management of blockchain systems without centralised control. A preliminary evaluation of the performance and impact of the overheads introduced by the proposed mechanism is conducted through simulation. The results demonstrate the proposed mechanism's ability to reach consensus on decisions quickly and reconfigure the primary blockchain with minimal overhead.
Authors:Anthony Hughes, Vasisht Duddu, N. Asokan, Nikolaos Aletras, Ning Ma
Abstract:
Language models (LMs) may memorize personally identifiable information (PII) from training data, enabling adversaries to extract it during inference. Existing defense mechanisms such as differential privacy (DP) reduce this leakage, but incur large drops in utility. Based on a comprehensive study using circuit discovery to identify the computational circuits responsible PII leakage in LMs, we hypothesize that specific PII leakage circuits in LMs should be responsible for this behavior. Therefore, we propose PATCH (Privacy-Aware Targeted Circuit PatcHing), a novel approach that first identifies and subsequently directly edits PII circuits to reduce leakage. PATCH achieves better privacy-utility trade-off than existing defenses, e.g., reducing recall of PII leakage from LMs by up to 65%. Finally, PATCH can be combined with DP to reduce recall of residual leakage of an LM to as low as 0.01%. Our analysis shows that PII leakage circuits persist even after the application of existing defense mechanisms. In contrast, PATCH can effectively mitigate their impact.
Authors:Guan-Yan Yang, Farn Wang, Kuo-Hui Yeh
Abstract:
Consumer electronics (CE) connected to the Internet of Things are susceptible to various attacks, including DDoS and web-based threats, which can compromise their functionality and facilitate remote hijacking. These vulnerabilities allow attackers to exploit CE for broader system attacks while enabling the propagation of malicious code across the CE network, resulting in device failures. Existing deep learning-based traffic anomaly detection systems exhibit high accuracy in traditional network environments but are often overly complex and reliant on static infrastructure, necessitating manual configuration and management. To address these limitations, we propose a scalable network model that integrates Software-defined Networking (SDN) and Compute First Networking (CFN) for next-generation CE networks. In this network model, we propose a Graph Neural Networks-based Network Anomaly Detection framework (GNN-NAD) that integrates SDN-based CE networks and enables the CFN architecture. GNN-NAD uniquely fuses a static, vulnerability-aware attack graph with dynamic traffic features, providing a holistic view of network security. The core of the framework is a GNN model (GSAGE) for graph representation learning, followed by a Random Forest (RF) classifier. This design (GSAGE+RF) demonstrates superior performance compared to existing feature selection methods. Experimental evaluations on CE environment reveal that GNN-NAD achieves superior metrics in accuracy, recall, precision, and F1 score, even with small sample sizes, exceeding the performance of current network anomaly detection methods. This work advances the security and efficiency of next-generation intelligent CE networks.
Authors:Fuyuki Kitagawa, Ryo Nishimaki, Nikhil Pappu
Abstract:
Secure key leasing (SKL) enables the holder of a secret key for a cryptographic function to temporarily lease the key using quantum information. Later, the recipient can produce a deletion certificate, which proves that they no longer have access to the secret key. The security guarantee ensures that even a malicious recipient cannot continue to evaluate the function, after producing a valid deletion certificate. Most prior work considers an adversarial recipient that obtains a single leased key, which is insufficient for many applications. In the more realistic collusion-resistant setting, security must hold even when polynomially many keys are leased (and subsequently deleted). However, achieving collusion-resistant SKL from standard assumptions remains poorly understood, especially for functionalities beyond decryption. We improve upon this situation by introducing new pathways for constructing collusion-resistant SKL. Our main contributions are as follows: - A generalization of quantum-secure collusion-resistant traitor tracing called multi-level traitor tracing (MLTT), and a compiler that transforms an MLTT scheme for a primitive X into a collusion-resistant SKL scheme for primitive X. - The first bounded collusion-resistant SKL scheme for PRFs, assuming LWE. - A compiler that upgrades any single-key secure SKL scheme for digital signatures into one with unbounded collusion-resistance, assuming OWFs. - A compiler that upgrades collusion-resistant SKL schemes with classical certificates to ones having verification-query resilience, assuming OWFs.
Authors:Yanjie Li, Yiming Cao, Dong Wang, Bin Xiao
Abstract:
Multimodal agents built on large vision-language models (LVLMs) are increasingly deployed in open-world settings but remain highly vulnerable to prompt injection, especially through visual inputs. We introduce AgentTypo, a black-box red-teaming framework that mounts adaptive typographic prompt injection by embedding optimized text into webpage images. Our automatic typographic prompt injection (ATPI) algorithm maximizes prompt reconstruction by substituting captioners while minimizing human detectability via a stealth loss, with a Tree-structured Parzen Estimator guiding black-box optimization over text placement, size, and color. To further enhance attack strength, we develop AgentTypo-pro, a multi-LLM system that iteratively refines injection prompts using evaluation feedback and retrieves successful past examples for continual learning. Effective prompts are abstracted into generalizable strategies and stored in a strategy repository, enabling progressive knowledge accumulation and reuse in future attacks. Experiments on the VWA-Adv benchmark across Classifieds, Shopping, and Reddit scenarios show that AgentTypo significantly outperforms the latest image-based attacks such as AgentAttack. On GPT-4o agents, our image-only attack raises the success rate from 0.23 to 0.45, with consistent results across GPT-4V, GPT-4o-mini, Gemini 1.5 Pro, and Claude 3 Opus. In image+text settings, AgentTypo achieves 0.68 ASR, also outperforming the latest baselines. Our findings reveal that AgentTypo poses a practical and potent threat to multimodal agents and highlight the urgent need for effective defense.
Authors:Zhengyuan Jiang, Yuyang Zhang, Moyang Guo, Neil Zhenqiang Gong
Abstract:
In this work, we formulate and study the problem of image-editing detection and attribution: given a base image and a suspicious image, detection seeks to determine whether the suspicious image was derived from the base image using an AI editing model, while attribution further identifies the specific editing model responsible. Existing methods for detecting and attributing AI-generated images are insufficient for this problem, as they focus on determining whether an image was AI-generated/edited rather than whether it was edited from a particular base image. To bridge this gap, we propose EditTrack, the first framework for this image-editing detection and attribution problem. Building on four key observations about the editing process, EditTrack introduces a novel re-editing strategy and leverages carefully designed similarity metrics to determine whether a suspicious image originates from a base image and, if so, by which model. We evaluate EditTrack on five state-of-the-art editing models across six datasets, demonstrating that it consistently achieves accurate detection and attribution, significantly outperforming five baselines.
Authors:Yuepeng Hu, Zhengyuan Jiang, Mengyuan Li, Osama Ahmed, Zhicong Huang, Cheng Hong, Neil Gong
Abstract:
Large language models (LLMs) are often modified after release through post-processing such as post-training or quantization, which makes it challenging to determine whether one model is derived from another. Existing provenance detection methods have two main limitations: (1) they embed signals into the base model before release, which is infeasible for already published models, or (2) they compare outputs across models using hand-crafted or random prompts, which are not robust to post-processing. In this work, we propose LLMPrint, a novel detection framework that constructs fingerprints by exploiting LLMs' inherent vulnerability to prompt injection. Our key insight is that by optimizing fingerprint prompts to enforce consistent token preferences, we can obtain fingerprints that are both unique to the base model and robust to post-processing. We further develop a unified verification procedure that applies to both gray-box and black-box settings, with statistical guarantees. We evaluate LLMPrint on five base models and around 700 post-trained or quantized variants. Our results show that LLMPrint achieves high true positive rates while keeping false positive rates near zero.
Authors:Ke Wang, Felix Qu, Libin Xia, Zishuo Zhao, Chris Tong, Lynn Ai, Eric Yang
Abstract:
Decentralized inference is an appealing paradigm for serving large language models (LLMs), offering strong security, high efficiency, and lower operating costs. Yet the permissionless setting admits no a priori trust in participating nodes, making output verifiability a prerequisite for secure deployment. We present VeriLLM, a publicly verifiable protocol for decentralized LLM inference that (i) achieves security under a one-honest-verifier assumption, (ii) attains near-negligible verification cost (about 1% of the underlying inference) via a lightweight verification algorithm designed explicitly for LLMs, and (iii) enforces honest checking through a peer-prediction mechanism that mitigates lazy verification in naive voting. We further introduce an isomorphic inference-verification network that multiplexes both roles on the same set of GPU workers. This architecture (i) increases GPU utilization and thereby improves end-to-end throughput for both inference and verification, (ii) expands the effective pool of available validators, strengthening robustness and security, and (iii) enforces task indistinguishability at the worker boundary to prevent job-type-conditioned behavior. Finally, we provide a formal game-theoretic analysis and prove that, under our incentives, honest inference and verification constitute a Nash equilibrium, ensuring incentive compatibility against rational adversaries. To our knowledge, this is the first decentralized inference verification protocol with an end-to-end game-theoretic security proof.
Authors:Sun-Moon Yoon, Hyun-Young Park, Seung-Hyun Nam, Si-Hyeon Lee
Abstract:
We study the problem of discrete distribution estimation under utility-optimized local differential privacy (ULDP), which enforces local differential privacy (LDP) on sensitive data while allowing more accurate inference on non-sensitive data. In this setting, we completely characterize the fundamental privacy-utility trade-off. The converse proof builds on several key ideas, including a generalized uniform asymptotic Cramér-Rao lower bound, a reduction showing that it suffices to consider a newly defined class of extremal ULDP mechanisms, and a novel distribution decomposition technique tailored to ULDP constraints. For the achievability, we propose a class of utility-optimized block design (uBD) schemes, obtained as nontrivial modifications of the block design mechanism known to be optimal under standard LDP constraints, while incorporating the distribution decomposition idea used in the converse proof and a score-based linear estimator. These results provide a tight characterization of the estimation accuracy achievable under ULDP and reveal new insights into the structure of optimal mechanisms for privacy-preserving statistical inference.
Authors:Xiyu Zeng, Siyuan Liang, Liming Lu, Haotian Zhu, Enguang Liu, Jisheng Dang, Yongbin Zhou, Shuchao Pang
Abstract:
As the capabilities of Vision Language Models (VLMs) continue to improve, they are increasingly targeted by jailbreak attacks. Existing defense methods face two major limitations: (1) they struggle to ensure safety without compromising the model's utility; and (2) many defense mechanisms significantly reduce the model's inference efficiency. To address these challenges, we propose SafeSteer, a lightweight, inference-time steering framework that effectively defends against diverse jailbreak attacks without modifying model weights. At the core of SafeSteer is the innovative use of Singular Value Decomposition to construct a low-dimensional "safety subspace." By projecting and reconstructing the raw steering vector into this subspace during inference, SafeSteer adaptively removes harmful generation signals while preserving the model's ability to handle benign inputs. The entire process is executed in a single inference pass, introducing negligible overhead. Extensive experiments show that SafeSteer reduces the attack success rate by over 60% and improves accuracy on normal tasks by 1-2%, without introducing significant inference latency. These results demonstrate that robust and practical jailbreak defense can be achieved through simple, efficient inference-time control.
Authors:Yating Liu, Xing Su, Hao Wu, Sijin Li, Yuxi Cheng, Fengyuan Xu, Sheng Zhong
Abstract:
Adversarial smart contracts, mostly on EVM-compatible chains like Ethereum and BSC, are deployed as EVM bytecode to exploit vulnerable smart contracts typically for financial gains. Detecting such malicious contracts at the time of deployment is an important proactive strategy preventing loss from victim contracts. It offers a better cost-benefit than detecting vulnerabilities on diverse potential victims. However, existing works are not generic with limited detection types and effectiveness due to imbalanced samples, while the emerging LLM technologies, which show its potentials in generalization, have two key problems impeding its application in this task: hard digestion of compiled-code inputs, especially those with task-specific logic, and hard assessment of LLMs' certainty in their binary answers, i.e., yes-or-no answers. Therefore, we propose a generic adversarial smart contracts detection framework FinDet, which leverages LLMs with two enhancements addressing above two problems. FinDet takes as input only the EVM-bytecode contracts and identifies adversarial ones among them with high balanced accuracy. The first enhancement extracts concise semantic intentions and high-level behavioral logic from the low-level bytecode inputs, unleashing the LLM reasoning capability restricted by the task input. The second enhancement probes and measures the LLM uncertainty to its multi-round answering to the same query, improving the LLM answering robustness for binary classifications required by the task output. Our comprehensive evaluation shows that FinDet achieves a BAC of 0.9223 and a TPR of 0.8950, significantly outperforming existing baselines. It remains robust under challenging conditions including unseen attack patterns, low-data settings, and feature obfuscation. FinDet detects all 5 public and 20+ unreported adversarial contracts in a 10-day real-world test, confirmed manually.
Authors:Masayuki Tezuka, Keisuke Tanaka
Abstract:
Public key encryption with equality test (PKEET), proposed by Yang et al. (CT-RSA 2010), is a variant of public key encryption that enables an equality test to determine whether two ciphertexts correspond to the same plaintext. This test applies not only for ciphertexts generated under the same encryption key but also for those generated under different encryption keys. To date, several generic constructions of PKEET have been proposed. However, these generic constructions have the drawback of reliance on the random oracle model or a (hierarchical) identity-based encryption scheme. In this paper, we propose a generic construction of a PKEET scheme based on tag-based encryption without the random oracle model. Tag-based encryption is a weaker primitive than identity-based encryption. Our scheme allows to derive new PKEET schemes without the random oracle model. By instantiating our construction with the pairing-free tag-based encryption scheme by Kiltz (TCC 2006), we obtain a pairing-free PKEET scheme without the random oracle model. Moreover, by instantiating our construction with a tag-based encryption scheme based on the learning parity with noise (LPN) assumption, we obtain a PKEET scheme based on the LPN assumption without the random oracle model.
Authors:Masayuki Tezuka, Keisuke Tanaka
Abstract:
An ordered multi-signature scheme allows multiple signers to sign a common message in a sequential manner and allows anyone to verify the signing order of signers with a public-key list. In this work, we propose an ordered multi-signature scheme by modifying the sequential aggregate signature scheme by Chatterjee and Kabaleeshwaran (ACISP 2020). Our scheme offers compact public parameter size and the public-key aggregation property. This property allows us to compress a public-key list into a short aggregated key. We prove the security of our scheme under the symmetric external Diffie-Hellman (SXDH) assumption without the random oracle model.
Authors:Phung Duc Luong, Le Tran Gia Bao, Nguyen Vu Khai Tam, Dong Huu Nguyen Khoa, Nguyen Huu Quyen, Van-Hau Pham, Phan The Duy
Abstract:
This work introduces xOffense, an AI-driven, multi-agent penetration testing framework that shifts the process from labor-intensive, expert-driven manual efforts to fully automated, machine-executable workflows capable of scaling seamlessly with computational infrastructure. At its core, xOffense leverages a fine-tuned, mid-scale open-source LLM (Qwen3-32B) to drive reasoning and decision-making in penetration testing. The framework assigns specialized agents to reconnaissance, vulnerability scanning, and exploitation, with an orchestration layer ensuring seamless coordination across phases. Fine-tuning on Chain-of-Thought penetration testing data further enables the model to generate precise tool commands and perform consistent multi-step reasoning. We evaluate xOffense on two rigorous benchmarks: AutoPenBench and AI-Pentest-Benchmark. The results demonstrate that xOffense consistently outperforms contemporary methods, achieving a sub-task completion rate of 79.17%, decisively surpassing leading systems such as VulnBot and PentestGPT. These findings highlight the potential of domain-adapted mid-scale LLMs, when embedded within structured multi-agent orchestration, to deliver superior, cost-efficient, and reproducible solutions for autonomous penetration testing.
Authors:Asim Waheed, Vasisht Duddu, Rui Zhang, Sebastian Szyller, N. Asokan
Abstract:
ML models are susceptible to risks to security, privacy, and fairness. Several defenses are designed to protect against their intended risks, but can inadvertently affect susceptibility to other unrelated risks, known as unintended interactions. Several jurisdictions are preparing ML regulatory frameworks that require ML practitioners to assess the susceptibility of ML models to different risks. A library for valuating unintended interactions that can be used by (a) practitioners to evaluate unintended interactions at scale prior to model deployment and (b) researchers to design defenses which do not suffer from an unintended increase in unrelated risks. Ideally, such a library should be i) comprehensive by including representative attacks, defenses and metrics for different risks, ii) extensible to new modules due to its modular design, iii) consistent with a user-friendly API template for inputs and outputs, iv) applicable to evaluate previously unexplored unintended interactions. We present AMULET, a Python library that covers risks to security, privacy, and fairness, which satisfies all these requirements. AMULET can be used to evaluate unexplored unintended interactions, compare effectiveness between defenses or attacks, and include new attacks and defenses.
Authors:De Zhang Lee, Han Fang, Hanyi Wang, Ee-Chien Chang
Abstract:
Digital watermarks can be embedded into AI-generated content (AIGC) by initializing the generation process with starting points sampled from a secret distribution. When combined with pseudorandom error-correcting codes, such watermarked outputs can remain indistinguishable from unwatermarked objects, while maintaining robustness under whitenoise. In this paper, we go beyond indistinguishability and investigate security under removal attacks. We demonstrate that indistinguishability alone does not necessarily guarantee resistance to adversarial removal. Specifically, we propose a novel attack that exploits boundary information leaked by the locations of watermarked objects. This attack significantly reduces the distortion required to remove watermarks -- by up to a factor of $15 \times$ compared to a baseline whitenoise attack under certain settings. To mitigate such attacks, we introduce a defense mechanism that applies a secret transformation to hide the boundary, and prove that the secret transformation effectively rendering any attacker's perturbations equivalent to those of a naive whitenoise adversary. Our empirical evaluations, conducted on multiple versions of Stable Diffusion, validate the effectiveness of both the attack and the proposed defense, highlighting the importance of addressing boundary leakage in latent-based watermarking schemes.
Authors:Doan Minh Trung, Tien Duc Anh Hao, Luong Hoang Minh, Nghi Hoang Khoa, Nguyen Tan Cam, Van-Hau Pham, Phan The Duy
Abstract:
In recent years, learning-based Android malware detection has seen significant advancements, with detectors generally falling into three categories: string-based, image-based, and graph-based approaches. While these methods have shown strong detection performance, they often struggle to sustain robustness in real-world settings, particularly when facing code obfuscation and adversarial examples (AEs). Deep multimodal learning has emerged as a promising solution, leveraging the strengths of multiple feature types to enhance robustness and generalization. However, a systematic investigation of multimodal fusion for both accuracy and resilience remains underexplored. In this study, we propose DMLDroid, an Android malware detection based on multimodal fusion that leverages three different representations of malware features, including permissions & intents (tabular-based), DEX file representations (image-based), and API calls (graph-derived sequence-based). We conduct exhaustive experiments independently on each feature, as well as in combination, using different fusion strategies. Experimental results on the CICMalDroid 2020 dataset demonstrate that our multimodal approach with the dynamic weighted fusion mechanism achieves high performance, reaching 97.98% accuracy and 98.67% F1-score on original malware detection. Notably, the proposed method maintains strong robustness, sustaining over 98% accuracy and 98% F1-score under both obfuscation and adversarial attack scenarios. Our findings highlight the benefits of multimodal fusion in improving both detection accuracy and robustness against evolving Android malware threats.
Authors:Guan-Yan Yang, Farn Wang, You-Zong Gu, Ya-Wen Teng, Kuo-Hui Yeh, Ping-Hsueh Ho, Wei-Ling Wen
Abstract:
The rapid proliferation of network applications has led to a significant increase in network attacks. According to the OWASP Top 10 Projects report released in 2021, injection attacks rank among the top three vulnerabilities in software projects. This growing threat landscape has increased the complexity and workload of software testing, necessitating advanced tools to support agile development cycles. This paper introduces a novel test prioritization method for SQL injection vulnerabilities to enhance testing efficiency. By leveraging previous test outcomes, our method adjusts defense strength vectors for subsequent tests, optimizing the testing workflow and tailoring defense mechanisms to specific software needs. This approach aims to improve the effectiveness and efficiency of vulnerability detection and mitigation through a flexible framework that incorporates dynamic adjustments and considers the temporal aspects of vulnerability exposure.
Authors:Felix Mächtle, Ashwath Shetty, Jonas Sander, Nils Loose, Sören Pirk, Thomas Eisenbarth
Abstract:
Diffusion models have significantly advanced text-to-image generation, enabling the creation of highly realistic images conditioned on textual prompts and seeds. Given the considerable intellectual and economic value embedded in such prompts, prompt theft poses a critical security and privacy concern. In this paper, we investigate prompt-stealing attacks targeting diffusion models. We reveal that numerical optimization-based prompt recovery methods are fundamentally limited as they do not account for the initial random noise used during image generation. We identify and exploit a noise-generation vulnerability (CWE-339), prevalent in major image-generation frameworks, originating from PyTorch's restriction of seed values to a range of $2^{32}$ when generating the initial random noise on CPUs. Through a large-scale empirical analysis conducted on images shared via the popular platform CivitAI, we demonstrate that approximately 95% of these images' seed values can be effectively brute-forced in 140 minutes per seed using our seed-recovery tool, SeedSnitch. Leveraging the recovered seed, we propose PromptPirate, a genetic algorithm-based optimization method explicitly designed for prompt stealing. PromptPirate surpasses state-of-the-art methods, i.e., PromptStealer, P2HP, and CLIP-Interrogator, achieving an 8-11% improvement in LPIPS similarity. Furthermore, we introduce straightforward and effective countermeasures that render seed stealing, and thus optimization-based prompt stealing, ineffective. We have disclosed our findings responsibly and initiated coordinated mitigation efforts with the developers to address this critical vulnerability.
Authors:Felix Mächtle, Nils Loose, Jan-Niclas Serr, Jonas Sander, Thomas Eisenbarth
Abstract:
Symbolic execution is a powerful technique for software testing, but suffers from limitations when encountering external functions, such as native methods or third-party libraries. Existing solutions often require additional context, expensive SMT solvers, or manual intervention to approximate these functions through symbolic stubs. In this work, we propose a novel approach to automatically generate symbolic stubs for external functions during symbolic execution that leverages Genetic Programming. When the symbolic executor encounters an external function, AutoStub generates training data by executing the function on randomly generated inputs and collecting the outputs. Genetic Programming then derives expressions that approximate the behavior of the function, serving as symbolic stubs. These automatically generated stubs allow the symbolic executor to continue the analysis without manual intervention, enabling the exploration of program paths that were previously intractable. We demonstrate that AutoStub can automatically approximate external functions with over 90% accuracy for 55% of the functions evaluated, and can infer language-specific behaviors that reveal edge cases crucial for software testing.
Authors:Christos Anagnostopoulos, Ioulia Kapsali, Alexandros Gkillas, Nikos Piperigkos, Aris S. Lalos
Abstract:
Autonomous vehicles (AVs) rely on complex perception and communication systems, making them vulnerable to adversarial attacks that can compromise safety. While simulation offers a scalable and safe environment for robustness testing, existing frameworks typically lack comprehensive supportfor modeling multi-domain adversarial scenarios. This paper introduces a novel, open-source integrated simulation framework designed to generate adversarial attacks targeting both perception and communication layers of AVs. The framework provides high-fidelity modeling of physical environments, traffic dynamics, and V2X networking, orchestrating these components through a unified core that synchronizes multiple simulators based on a single configuration file. Our implementation supports diverse perception-level attacks on LiDAR sensor data, along with communication-level threats such as V2X message manipulation and GPS spoofing. Furthermore, ROS 2 integration ensures seamless compatibility with third-party AV software stacks. We demonstrate the framework's effectiveness by evaluating the impact of generated adversarial scenarios on a state-of-the-art 3D object detector, revealing significant performance degradation under realistic conditions.
Authors:Haomiao Tang, Wenjie Li, Yixiang Qiu, Genping Wang, Shu-Tao Xia
Abstract:
Despite the ubiquity of modern face retrieval systems, their retrieval stage is often outsourced to third-party entities, posing significant risks to user portrait privacy. Although homomorphic encryption (HE) offers strong security guarantees by enabling arithmetic computations in the cipher space, its high computational inefficiency makes it unsuitable for real-time, real-world applications. To address this issue, we propose Cancelable Product Quantization, a highly efficient framework for secure face representation retrieval. Our hierarchical two-stage framework comprises: (i) a high-throughput cancelable PQ indexing module for fast candidate filtering, and (ii) a fine-grained cipher-space retrieval module for final precise face ranking. A tailored protection mechanism is designed to secure the indexing module for cancelable biometric authentication while ensuring efficiency. Experiments on benchmark datasets demonstrate that our method achieves an decent balance between effectiveness, efficiency and security.
Authors:Xiangtao Meng, Yingkai Dong, Ning Yu, Li Wang, Zheng Li, Shanqing Guo
Abstract:
Despite the advancements in Text-to-Image (T2I) generation models, their potential for misuse or even abuse raises serious safety concerns. Model developers have made tremendous efforts to introduce safety mechanisms that can address these concerns in T2I models. However, the existing safety mechanisms, whether external or internal, either remain susceptible to evasion under distribution shifts or require extensive model-specific adjustments. To address these limitations, we introduce Safe-Control, an innovative plug-and-play safety patch designed to mitigate unsafe content generation in T2I models. Using data-driven strategies and safety-aware conditions, Safe-Control injects safety control signals into the locked T2I model, acting as an update in a patch-like manner. Model developers can also construct various safety patches to meet the evolving safety requirements, which can be flexibly merged into a single, unified patch. Its plug-and-play design further ensures adaptability, making it compatible with other T2I models of similar denoising architecture. We conduct extensive evaluations on six diverse and public T2I models. Empirical results highlight that Safe-Control is effective in reducing unsafe content generation across six diverse T2I models with similar generative architectures, yet it successfully maintains the quality and text alignment of benign images. Compared to seven state-of-the-art safety mechanisms, including both external and internal defenses, Safe-Control significantly outperforms all baselines in reducing unsafe content generation. For example, it reduces the probability of unsafe content generation to 7%, compared to approximately 20% for most baseline methods, under both unsafe prompts and the latest adversarial attacks.
Authors:Guan-Yan Yang, Jui-Ning Chen, Farn Wang, Kuo-Hui Yeh
Abstract:
The Internet of Energy (IoE) integrates IoT-driven digital communication with power grids to enable efficient and sustainable energy systems. Still, its interconnectivity exposes critical infrastructure to sophisticated cyber threats, including adversarial attacks designed to bypass traditional safeguards. Unlike general IoT risks, IoE threats have heightened public safety consequences, demanding resilient solutions. From the networking-level safeguard perspective, we propose a Graph Structure Learning (GSL)-based safeguards framework that jointly optimizes graph topology and node representations to resist adversarial network model manipulation inherently. Through a conceptual overview, architectural discussion, and case study on a security dataset, we demonstrate GSL's superior robustness over representative methods, offering practitioners a viable path to secure IoE networks against evolving attacks. This work highlights the potential of GSL to enhance the resilience and reliability of future IoE networks for practitioners managing critical infrastructure. Lastly, we identify key open challenges and propose future research directions in this novel research area.
Authors:Xavier Cadet, Simona Boboila, Sie Hendrata Dharmawan, Alina Oprea, Peter Chin
Abstract:
Cyber defense requires automating defensive decision-making under stealthy, deceptive, and continuously evolving adversarial strategies. The FlipIt game provides a foundational framework for modeling interactions between a defender and an advanced adversary that compromises a system without being immediately detected. In FlipIt, the attacker and defender compete to control a shared resource by performing a Flip action and paying a cost. However, the existing FlipIt frameworks rely on a small number of heuristics or specialized learning techniques, which can lead to brittleness and the inability to adapt to new attacks. To address these limitations, we introduce PoolFlip, a multi-agent gym environment that extends the FlipIt game to allow efficient learning for attackers and defenders. Furthermore, we propose Flip-PSRO, a multi-agent reinforcement learning (MARL) approach that leverages population-based training to train defender agents equipped to generalize against a range of unknown, potentially adaptive opponents. Our empirical results suggest that Flip-PSRO defenders are $2\times$ more effective than baselines to generalize to a heuristic attack not exposed in training. In addition, our newly designed ownership-based utility functions ensure that Flip-PSRO defenders maintain a high level of control while optimizing performance.
Authors:Sara Saeidian, Ata Yavuzyılmaz, Leonhard Grosse, Georg Schuppe, Tobias J. Oechtering
Abstract:
We analyze the privacy guarantees of the Laplace mechanism releasing the histogram of a dataset through the lens of pointwise maximal leakage (PML). While differential privacy is commonly used to quantify the privacy loss, it is a context-free definition that does not depend on the data distribution. In contrast, PML enables a more refined analysis by incorporating assumptions about the data distribution. We show that when the probability of each histogram bin is bounded away from zero, stronger privacy protection can be achieved for a fixed level of noise. Our results demonstrate the advantage of context-aware privacy measures and show that incorporating assumptions about the data can improve privacy-utility tradeoffs.
Authors:Yang Li, Hanjie Wang, Yuanzheng Li, Jiazheng Li, Zhaoyang Dong
Abstract:
Wind power data often suffers from missing values due to sensor faults and unstable transmission at edge sites. While federated learning enables privacy-preserving collaboration without sharing raw data, it remains vulnerable to anomalous updates and privacy leakage during parameter exchange. These challenges are amplified in open industrial environments, necessitating zero-trust mechanisms where no participant is inherently trusted. To address these challenges, this work proposes ZTFed-MAS2S, a zero-trust federated learning framework that integrates a multi-head attention-based sequence-to-sequence imputation model. ZTFed integrates verifiable differential privacy with non-interactive zero-knowledge proofs and a confidentiality and integrity verification mechanism to ensure verifiable privacy preservation and secure model parameters transmission. A dynamic trust-aware aggregation mechanism is employed, where trust is propagated over similarity graphs to enhance robustness, and communication overhead is reduced via sparsity- and quantization-based compression. MAS2S captures long-term dependencies in wind power data for accurate imputation. Extensive experiments on real-world wind farm datasets validate the superiority of ZTFed-MAS2S in both federated learning performance and missing data imputation, demonstrating its effectiveness as a secure and efficient solution for practical applications in the energy sector.
Authors:Ruyi Ding, Tianhong Xu, Xinyi Shen, Aidong Adam Ding, Yunsi Fei
Abstract:
The transformer architecture has become a cornerstone of modern AI, fueling remarkable progress across applications in natural language processing, computer vision, and multimodal learning. As these models continue to scale explosively for performance, implementation efficiency remains a critical challenge. Mixture of Experts (MoE) architectures, selectively activating specialized subnetworks (experts), offer a unique balance between model accuracy and computational cost. However, the adaptive routing in MoE architectures, where input tokens are dynamically directed to specialized experts based on their semantic meaning inadvertently opens up a new attack surface for privacy breaches. These input-dependent activation patterns leave distinctive temporal and spatial traces in hardware execution, which adversaries could exploit to deduce sensitive user data. In this work, we propose MoEcho, discovering a side channel analysis based attack surface that compromises user privacy on MoE based systems. Specifically, in MoEcho, we introduce four novel architectural side channels on different computing platforms, including Cache Occupancy Channels and Pageout+Reload on CPUs, and Performance Counter and TLB Evict+Reload on GPUs, respectively. Exploiting these vulnerabilities, we propose four attacks that effectively breach user privacy in large language models (LLMs) and vision language models (VLMs) based on MoE architectures: Prompt Inference Attack, Response Reconstruction Attack, Visual Inference Attack, and Visual Reconstruction Attack. MoEcho is the first runtime architecture level security analysis of the popular MoE structure common in modern transformers, highlighting a serious security and privacy threat and calling for effective and timely safeguards when harnessing MoE based models for developing efficient large scale AI services.
Authors:Vasileios Kouvakis, Stylianos E. Trevlakis, Alexandros-Apostolos A. Boulogeorgos, Hongwu Liu, Theodoros A. Tsiftsis, Octavia A. Dobre
Abstract:
Security has always been a priority, for researchers, service providers and network operators when it comes to radio access networks (RAN). One wireless access approach that has captured attention is blockchain enabled RAN (B-RAN) due to its secure nature. This research introduces a framework that integrates blockchain technology into RAN while also addressing the limitations of state-of-the-art models. The proposed framework utilizes queuing and Markov chain theory to model the aspects of B-RAN. An extensive evaluation of the models performance is provided, including an analysis of timing factors and a focused assessment of its security aspects. The results demonstrate reduced latency and comparable security making the presented framework suitable for diverse application scenarios.
Authors:Amine Tellache, Abdelaziz Amara Korba, Amdjed Mokhtari, Horea Moldovan, Yacine Ghamri-Doudane
Abstract:
Effective incident response (IR) is critical for mitigating cyber threats, yet security teams are overwhelmed by alert fatigue, high false-positive rates, and the vast volume of unstructured Cyber Threat Intelligence (CTI) documents. While CTI holds immense potential for enriching security operations, its extensive and fragmented nature makes manual analysis time-consuming and resource-intensive. To bridge this gap, we introduce a novel Retrieval-Augmented Generation (RAG)-based framework that leverages Large Language Models (LLMs) to automate and enhance IR by integrating dynamically retrieved CTI. Our approach introduces a hybrid retrieval mechanism that combines NLP-based similarity searches within a CTI vector database with standardized queries to external CTI platforms, facilitating context-aware enrichment of security alerts. The augmented intelligence is then leveraged by an LLM-powered response generation module, which formulates precise, actionable, and contextually relevant incident mitigation strategies. We propose a dual evaluation paradigm, wherein automated assessment using an auxiliary LLM is systematically cross-validated by cybersecurity experts. Empirical validation on real-world and simulated alerts demonstrates that our approach enhances the accuracy, contextualization, and efficiency of IR, alleviating analyst workload and reducing response latency. This work underscores the potential of LLM-driven CTI fusion in advancing autonomous security operations and establishing a foundation for intelligent, adaptive cybersecurity frameworks.
Authors:Federico Calandra, Marco Bernardo, Andrea Esposito, Francesco Fabris
Abstract:
Blockchains are widely recognized for their immutability, which provides robust guarantees of data integrity and transparency. However, this same feature poses significant challenges in real-world situations that require regulatory compliance, correction of erroneous data, or removal of sensitive information. Redactable blockchains address the limitations of traditional ones by enabling controlled, auditable modifications to blockchain data, primarily through cryptographic mechanisms such as chameleon hash functions and alternative redaction schemes. This report examines the motivations for introducing redactability, surveys the cryptographic primitives that enable secure edits, and analyzes competing approaches and their shortcomings. Special attention is paid to the practical deployment of redactable blockchains in private settings, with discussions of use cases in healthcare, finance, Internet of drones, and federated learning. Finally, the report outlines further challenges, also in connection with reversible computing, and the future potential of redactable blockchains in building law-compliant, trustworthy, and scalable digital infrastructures.
Authors:Xinqi Lyu, Yihao Liu, Yanjie Li, Bin Xiao
Abstract:
Text-to-Image (T2I) models have gained widespread adoption across various applications. Despite the success, the potential misuse of T2I models poses significant risks of generating Not-Safe-For-Work (NSFW) content. To investigate the vulnerability of T2I models, this paper delves into adversarial attacks to bypass the safety mechanisms under black-box settings. Most previous methods rely on word substitution to search adversarial prompts. Due to limited search space, this leads to suboptimal performance compared to gradient-based training. However, black-box settings present unique challenges to training gradient-driven attack methods, since there is no access to the internal architecture and parameters of T2I models. To facilitate the learning of adversarial prompts in black-box settings, we propose a novel prompt learning attack framework (PLA), where insightful gradient-based training tailored to black-box T2I models is designed by utilizing multimodal similarities. Experiments show that our new method can effectively attack the safety mechanisms of black-box T2I models including prompt filters and post-hoc safety checkers with a high success rate compared to state-of-the-art methods. Warning: This paper may contain offensive model-generated content.
Authors:Ostonya Thomas, Muhaimin Bin Munir, Jean-Michel Tine, Mizanur Rahman, Yuchen Cai, Khandakar Ashrafi Akbar, Md Nahiyan Uddin, Latifur Khan, Trayce Hockstad, Mashrur Chowdhury
Abstract:
Technological advancements have revolutionized numerous industries, including transportation. While digitalization, automation, and connectivity have enhanced safety and efficiency, they have also introduced new vulnerabilities. With 95% of data breaches attributed to human error, promoting cybersecurity awareness in transportation is increasingly critical. Despite numerous cyberattacks on transportation systems worldwide, comprehensive and centralized records of these incidents remain scarce. To address this gap and enhance cyber awareness, this paper presents a large language model (LLM) based approach to extract and organize transportation related cyber incidents from publicly available datasets. A key contribution of this work is the use of generative AI to transform unstructured, heterogeneous cyber incident data into structured formats. Incidents were sourced from the Center for Strategic & International Studies (CSIS) List of Significant Cyber Incidents, the University of Maryland Cyber Events Database (UMCED), the European Repository of Cyber Incidents (EuRepoC), the Maritime Cyber Attack Database (MCAD), and the U.S. DOT Transportation Cybersecurity and Resiliency (TraCR) Examples of Cyber Attacks in Transportation (2018 to 2022). These were classified by a fine tuned LLM into five transportation modes: aviation, maritime, rail, road, and multimodal, forming a transportation specific cyber incident database. Another key contribution of this work is the development of a Retrieval Augmented Generation question answering system, designed to enhance accessibility and practical use by enabling users to query the curated database for specific details on transportation related cyber incidents. By leveraging LLMs for both data extraction and user interaction, this study contributes a novel, accessible tool for improving cybersecurity awareness in the transportation sector.
Authors:Kaibo Huang, Yukun Wei, WanSheng Wu, Tianhua Zhang, Zhongliang Yang, Linna Zhou
Abstract:
The emergence of the Internet of Agents (IoA) introduces critical challenges for communication privacy in sensitive, high-stakes domains. While standard Agent-to-Agent (A2A) protocols secure message content, they are not designed to protect the act of communication itself, leaving agents vulnerable to surveillance and traffic analysis. We find that the rich, event-driven nature of agent dialogues provides a powerful, yet untapped, medium for covert communication. To harness this potential, we introduce and formalize the Covert Event Channel, the first unified model for agent covert communication driven by three interconnected dimensions, which consist of the Storage, Timing,and Behavioral channels. Based on this model, we design and engineer Î CCAP, a novel protocol that operationalizes this event-driven paradigm. Our comprehensive evaluation demonstrates that Î CCAP achieves high capacity and robustness while remaining imperceptible to powerful LLM-based wardens, establishing its practical viability. By systematically engineering this channel, our work provides the foundational understanding essential for developing the next generation of monitoring systems and defensive protocols for a secure and trustworthy IoA.
Authors:Zhongliang Guo, Yifei Qian, Yanli Li, Weiye Li, Chun Tong Lei, Shuai Zhao, Lei Fang, Ognjen ArandjeloviÄ, Chun Pong Lau
Abstract:
Adversarial attacks against computer vision systems have emerged as a critical research area that challenges the fundamental assumptions about neural network robustness and security. This comprehensive survey examines the evolving landscape of adversarial techniques, revealing their dual nature as both sophisticated security threats and valuable defensive tools. We provide a systematic analysis of adversarial attack methodologies across three primary domains: pixel-space attacks, physically realizable attacks, and latent-space attacks. Our investigation traces the technical evolution from early gradient-based methods such as FGSM and PGD to sophisticated optimization techniques incorporating momentum, adaptive step sizes, and advanced transferability mechanisms. We examine how physically realizable attacks have successfully bridged the gap between digital vulnerabilities and real-world threats through adversarial patches, 3D textures, and dynamic optical perturbations. Additionally, we explore the emergence of latent-space attacks that leverage semantic structure in internal representations to create more transferable and meaningful adversarial examples. Beyond traditional offensive applications, we investigate the constructive use of adversarial techniques for vulnerability assessment in biometric authentication systems and protection against malicious generative models. Our analysis reveals critical research gaps, particularly in neural style transfer protection and computational efficiency requirements. This survey contributes a comprehensive taxonomy, evolution analysis, and identification of future research directions, aiming to advance understanding of adversarial vulnerabilities and inform the development of more robust and trustworthy computer vision systems.
Authors:Rodrigo Moreira, Larissa F. Rodrigues Moreira, Flávio de Oliveira Silva
Abstract:
The deployment of large-scale software-based 5G core functions presents significant challenges due to their reliance on optimized and intelligent resource provisioning for their services. Many studies have focused on analyzing the impact of resource allocation for complex deployments using mathematical models, queue theories, or even Artificial Intelligence (AI). This paper elucidates the effects of chaotic workloads, generated by Distributed Denial of Service (DDoS) on different Network Functions (NFs) on User Equipment registration performance. Our findings highlight the necessity of diverse resource profiles to ensure Service-Level Agreement (SLA) compliance in large-scale 5G core deployments. Additionally, our analysis of packet capture approaches demonstrates the potential of kernel-based monitoring for scalable security threat defense. Finally, our empirical evaluation provides insights into the effective deployment of 5G NFs in complex scenarios.
Authors:Vinu Sankar Sadasivan, Mehrdad Saberi, Soheil Feizi
Abstract:
With the rapid rise of generative AI and synthetic media, distinguishing AI-generated images from real ones has become crucial in safeguarding against misinformation and ensuring digital authenticity. Traditional watermarking techniques have shown vulnerabilities to adversarial attacks, undermining their effectiveness in the presence of attackers. We propose IConMark, a novel in-generation robust semantic watermarking method that embeds interpretable concepts into AI-generated images, as a first step toward interpretable watermarking. Unlike traditional methods, which rely on adding noise or perturbations to AI-generated images, IConMark incorporates meaningful semantic attributes, making it interpretable to humans and hence, resilient to adversarial manipulation. This method is not only robust against various image augmentations but also human-readable, enabling manual verification of watermarks. We demonstrate a detailed evaluation of IConMark's effectiveness, demonstrating its superiority in terms of detection accuracy and maintaining image quality. Moreover, IConMark can be combined with existing watermarking techniques to further enhance and complement its robustness. We introduce IConMark+SS and IConMark+TM, hybrid approaches combining IConMark with StegaStamp and TrustMark, respectively, to further bolster robustness against multiple types of image manipulations. Our base watermarking technique (IConMark) and its variants (+TM and +SS) achieve 10.8%, 14.5%, and 15.9% higher mean area under the receiver operating characteristic curve (AUROC) scores for watermark detection, respectively, compared to the best baseline on various datasets.
Authors:Qifan Wang, Jonas Sander, Minmin Jiang, Thomas Eisenbarth, David Oswald
Abstract:
Machine learning models, particularly decision trees (DTs), are widely adopted across various domains due to their interpretability and efficiency. However, as ML models become increasingly integrated into privacy-sensitive applications, concerns about their confidentiality have grown, particularly in light of emerging threats such as model extraction and fault injection attacks. Assessing the vulnerability of DTs under such attacks is therefore important. In this work, we present BarkBeetle, a novel attack that leverages fault injection to extract internal structural information of DT models. BarkBeetle employs a bottom-up recovery strategy that uses targeted fault injection at specific nodes to efficiently infer feature splits and threshold values. Our proof-of-concept implementation demonstrates that BarkBeetle requires significantly fewer queries and recovers more structural information compared to prior approaches, when evaluated on DTs trained with public UCI datasets. To validate its practical feasibility, we implement BarkBeetle on a Raspberry Pi RP2350 board and perform fault injections using the Faultier voltage glitching tool. As BarkBeetle targets general DT models, we also provide an in-depth discussion on its applicability to a broader range of tree-based applications, including data stream classification, DT variants, and cryptography schemes.
Authors:Vinu Sankar Sadasivan, Soheil Feizi, Rajiv Mathews, Lun Wang
Abstract:
This paper investigates the real-world vulnerabilities of audio-based large language models (ALLMs), such as Qwen2-Audio. We first demonstrate that an adversary can craft stealthy audio perturbations to manipulate ALLMs into exhibiting specific targeted behaviors, such as eliciting responses to wake-keywords (e.g., "Hey Qwen"), or triggering harmful behaviors (e.g. "Change my calendar event"). Subsequently, we show that playing adversarial background noise during user interaction with the ALLMs can significantly degrade the response quality. Crucially, our research illustrates the scalability of these attacks to real-world scenarios, impacting other innocent users when these adversarial noises are played through the air. Further, we discuss the transferrability of the attack, and potential defensive measures.
Authors:Poupak Azad, Jiahua Xu, Yebo Feng, Preston Strowbridge, Cuneyt Akcora
Abstract:
Blockchain bridges have become essential infrastructure for enabling interoperability across different blockchain networks, with more than $24B monthly bridge transaction volume. However, their growing adoption has been accompanied by a disproportionate rise in security breaches, making them the single largest source of financial loss in Web3. For cross-chain ecosystems to be robust and sustainable, it is essential to understand and address these vulnerabilities. In this study, we present a comprehensive systematization of blockchain bridge design and security. We define three bridge security priors, formalize the architectural structure of 13 prominent bridges, and identify 23 attack vectors grounded in real-world blockchain exploits. Using this foundation, we evaluate 43 representative attack scenarios and introduce a layered threat model that captures security failures across source chain, off-chain, and destination chain components.
Our analysis at the static code and transaction network levels reveals recurring design flaws, particularly in access control, validator trust assumptions, and verification logic, and identifies key patterns in adversarial behavior based on transaction-level traces. To support future development, we propose a decision framework for bridge architecture design, along with defense mechanisms such as layered validation and circuit breakers. This work provides a data-driven foundation for evaluating bridge security and lays the groundwork for standardizing resilient cross-chain infrastructure.
Authors:Jianshuo Dong, Tianyi Zhang, Feng Yan, Yuanjie Li, Hewu Li, Han Qiu
Abstract:
Cellular networks serve billions of users globally, yet concerns about reliability and security persist due to weaknesses in 3GPP standards. However, traditional analysis methods, including manual inspection and automated tools, struggle with increasingly expanding cellular network specifications. This paper investigates the feasibility of Large Language Models (LLMs) for automated cellular network specification refinement. To advance it, we leverage 200,000+ approved 3GPP Change Requests (CRs) that document specification revisions, constructing a valuable dataset for domain tasks. We introduce CR-eval, a principled evaluation framework, and benchmark 16 state-of-the-art LLMs, demonstrating that top models can discover security-related weaknesses in over 127 out of 200 test cases within five trials. To bridge potential gaps, we explore LLM specialization techniques, including fine-tuning an 8B model to match or surpass advanced LLMs like GPT-4o and DeepSeek-R1. Evaluations on 30 cellular attacks identify open challenges for achieving full automation. These findings confirm that LLMs can automate the refinement of cellular network specifications and provide valuable insights to guide future research in this direction.
Authors:Andreas Happe, Jürgen Cito
Abstract:
This paper presents a critical examination of the surprising efficacy of Large Language Models (LLMs) in penetration testing. The paper thoroughly reviews the evolution of LLMs and their rapidly expanding capabilities which render them increasingly suitable for complex penetration testing operations. It systematically details the historical adoption of LLMs in both academic research and industry, showcasing their application across various offensive security tasks and covering broader phases of the cyber kill chain. Crucially, the analysis also extends to the observed adoption of LLMs by malicious actors, underscoring the inherent dual-use challenge of this technology within the security landscape.
The unexpected effectiveness of LLMs in this context is elucidated by several key factors: the strong alignment between penetration testing's reliance on pattern-matching and LLMs' core strengths, their inherent capacity to manage uncertainty in dynamic environments, and cost-effective access to competent pre-trained models through LLM providers.
The current landscape of LLM-aided penetration testing is categorized into interactive 'vibe-hacking' and the emergence of fully autonomous systems. The paper identifies and discusses significant obstacles impeding wider adoption and safe deployment. These include critical issues concerning model reliability and stability, paramount safety and security concerns, substantial monetary and ecological costs, implications for privacy and digital sovereignty, complex questions of accountability, and profound ethical dilemmas. This comprehensive review and analysis provides a foundation for discussion on future research directions and the development of robust safeguards at the intersection of AI and security.
Authors:Rui Xu, Jiawei Chen, Zhaoxia Yin, Cong Kong, Xinpeng Zhang
Abstract:
The widespread use of large language models (LLMs) and open-source code has raised ethical and security concerns regarding the distribution and attribution of source code, including unauthorized redistribution, license violations, and misuse of code for malicious purposes. Watermarking has emerged as a promising solution for source attribution, but existing techniques rely heavily on hand-crafted transformation rules, abstract syntax tree (AST) manipulation, or task-specific training, limiting their scalability and generality across languages. Moreover, their robustness against attacks remains limited. To address these limitations, we propose CodeMark-LLM, an LLM-driven watermarking framework that embeds watermark into source code without compromising its semantics or readability. CodeMark-LLM consists of two core components: (i) Semantically Consistent Embedding module that applies functionality-preserving transformations to encode watermark bits, and (ii) Differential Comparison Extraction module that identifies the applied transformations by comparing the original and watermarked code. Leveraging the cross-lingual generalization ability of LLM, CodeMark-LLM avoids language-specific engineering and training pipelines. Extensive experiments across diverse programming languages and attack scenarios demonstrate its robustness, effectiveness, and scalability.
Authors:Nges Brian Njungle, Eric Jahns, Luigi Mastromauro, Edwin P. Kayang, Milan Stojkov, Michel A. Kinsy
Abstract:
Machine learning has become a crucial part of our lives, with applications spanning nearly every aspect of our daily activities. However, using personal information in machine learning applications has sparked significant security and privacy concerns about user data. To address these challenges, different privacy-preserving machine learning (PPML) frameworks have been developed to protect sensitive information in machine learning applications. These frameworks generally attempt to balance design trade-offs such as computational efficiency, communication overhead, security guarantees, and scalability. Despite the advancements, selecting the optimal framework and parameters for specific deployment scenarios remains a complex and critical challenge for privacy and security application developers. We present Prismo, an open-source recommendation system designed to aid in selecting optimal parameters and frameworks for different PPML application scenarios. Prismo enables users to explore a comprehensive space of PPML frameworks through various properties based on user-defined objectives. It supports automated filtering of suitable candidate frameworks by considering parameters such as the number of parties in multi-party computation or federated learning and computation cost constraints in homomorphic encryption. Prismo models every use case into a Linear Integer Programming optimization problem, ensuring tailored solutions are recommended for each scenario. We evaluate Prismo's effectiveness through multiple use cases, demonstrating its ability to deliver best-fit solutions in different deployment scenarios.
Authors:Imranur Rahman, Jill Marley, William Enck, Laurie Williams
Abstract:
Developers consistently use version constraints to specify acceptable versions of the dependencies for their project. \emph{Pinning} dependencies can reduce the likelihood of breaking changes, but comes with a cost of manually managing the replacement of outdated and vulnerable dependencies. On the other hand, \emph{floating} can be used to automatically get bug fixes and security fixes, but comes with the risk of breaking changes. Security practitioners advocate \emph{pinning} dependencies to prevent against software supply chain attacks, e.g., malicious package updates. However, since \emph{pinning} is the tightest version constraint, \emph{pinning} is the most likely to result in outdated dependencies. Nevertheless, how the likelihood of becoming outdated or vulnerable dependencies changes across version constraint types is unknown. The goal of this study is to aid developers in making an informed dependency version constraint choice by empirically evaluating the likelihood of dependencies becoming outdated or vulnerable across version constraint types at scale. In this study, we first identify the trends in dependency version constraint usage and the patterns of version constraint type changes made by developers in the npm, PyPI, and Cargo ecosystems. We then modeled the dependency state transitions using survival analysis and estimated how the likelihood of becoming outdated or vulnerable changes when using \emph{pinning} as opposed to the rest of the version constraint types. We observe that among outdated and vulnerable dependencies, the most commonly used version constraint type is \emph{floating-minor}, with \emph{pinning} being the next most common. We also find that \emph{floating-major} is the least likely to result in outdated and \emph{floating-minor} is the least likely to result in vulnerable dependencies.
Authors:Fuyuki Kitagawa, Jiahui Liu, Shota Yamada, Takashi Yamakawa
Abstract:
Secure key leasing allows a cryptographic key to be leased as a quantum state in such a way that the key can later be revoked in a verifiable manner. In this work, we propose a modular framework for constructing secure key leasing with a classical-lessor, where the lessor is entirely classical and, in particular, the quantum secret key can be both leased and revoked using only classical communication. Based on this framework, we obtain classical-lessor secure key leasing schemes for public-key encryption (PKE), pseudorandom function (PRF), and digital signature. We adopt the strong security notion known as security against verification key revealing attacks (VRA security) proposed by Kitagawa et al. (Eurocrypt 2025) into the classical-lessor setting, and we prove that all three of our schemes satisfy this notion under the learning with errors assumption. Our PKE scheme improves upon the previous construction by Goyal et al. (Eurocrypt 2025), and our PRF and digital signature schemes are respectively the first PRF and digital signature with classical-lessor secure key leasing property.
Authors:Xenia Heilmann, Ernst Althaus, Mattia Cerrato, Nick Johannes Peter Rassau, Mohammad Sadeq Dousti, Stefan Kramer
Abstract:
A sum-product network (SPN) is a graphical model that allows several types of probabilistic inference to be performed efficiently. In this paper, we propose a privacy-preserving protocol which tackles structure generation and parameter learning of SPNs. Additionally, we provide a protocol for private inference on SPNs, subsequent to training. To preserve the privacy of the participants, we derive our protocol based on secret sharing, which guarantees privacy in the honest-but-curious setting even when at most half of the parties cooperate to disclose the data. The protocol makes use of a forest of randomly generated SPNs, which is trained and weighted privately and can then be used for private inference on data points. Our experiments indicate that preserving the privacy of all participants does not decrease log-likelihood performance on both homogeneously and heterogeneously partitioned data. We furthermore show that our protocol's performance is comparable to current state-of-the-art SPN learners in homogeneously partitioned data settings. In terms of runtime and memory usage, we demonstrate that our implementation scales well when increasing the number of parties, comparing favorably to protocols for neural networks, when they are trained to reproduce the input-output behavior of SPNs.
Authors:Ander Artola Velasco, Stratis Tsirtsis, Manuel Gomez-Rodriguez
Abstract:
Millions of users rely on a market of cloud-based services to obtain access to state-of-the-art large language models. However, it has been very recently shown that the de facto pay-per-token pricing mechanism used by providers creates a financial incentive for them to strategize and misreport the (number of) tokens a model used to generate an output. In this paper, we develop an auditing framework based on martingale theory that enables a trusted third-party auditor who sequentially queries a provider to detect token misreporting. Crucially, we show that our framework is guaranteed to always detect token misreporting, regardless of the provider's (mis-)reporting policy, and not falsely flag a faithful provider as unfaithful with high probability. To validate our auditing framework, we conduct experiments across a wide range of (mis-)reporting policies using several large language models from the $\texttt{Llama}$, $\texttt{Gemma}$ and $\texttt{Ministral}$ families, and input prompts from a popular crowdsourced benchmarking platform. The results show that our framework detects an unfaithful provider after observing fewer than $\sim 70$ reported outputs, while maintaining the probability of falsely flagging a faithful provider below $α= 0.05$.
Authors:Nges Brian Njungle, Eric Jahns, Michel A. Kinsy
Abstract:
The widespread adoption of Machine Learning as a Service raises critical privacy and security concerns, particularly about data confidentiality and trust in both cloud providers and the machine learning models. Homomorphic Encryption (HE) has emerged as a promising solution to this problems, allowing computations on encrypted data without decryption. Despite its potential, existing approaches to integrate HE into neural networks are often limited to specific architectures, leaving a wide gap in providing a framework for easy development of HE-friendly privacy-preserving neural network models similar to what we have in the broader field of machine learning. In this paper, we present FHEON, a configurable framework for developing privacy-preserving convolutional neural network (CNN) models for inference using HE. FHEON introduces optimized and configurable implementations of privacy-preserving CNN layers including convolutional layers, average pooling layers, ReLU activation functions, and fully connected layers. These layers are configured using parameters like input channels, output channels, kernel size, stride, and padding to support arbitrary CNN architectures. We assess the performance of FHEON using several CNN architectures, including LeNet-5, VGG-11, VGG- 16, ResNet-20, and ResNet-34. FHEON maintains encrypted-domain accuracies within +/- 1% of their plaintext counterparts for ResNet-20 and LeNet-5 models. Notably, on a consumer-grade CPU, the models build on FHEON achieved 98.5% accuracy with a latency of 13 seconds on MNIST using LeNet-5, and 92.2% accuracy with a latency of 403 seconds on CIFAR-10 using ResNet-20. Additionally, FHEON operates within a practical memory budget requiring not more than 42.3 GB for VGG-16.
Authors:Nges Brian Njungle, Eric Jahns, Milan Stojkov, Michel A. Kinsy
Abstract:
Deep learning has become a cornerstone of modern machine learning. It relies heavily on vast datasets and significant computational resources for high performance. This data often contains sensitive information, making privacy a major concern in deep learning. Spiking Neural Networks (SNNs) have emerged as an energy-efficient alternative to conventional deep learning approaches. Nevertheless, SNNs still depend on large volumes of data, inheriting all the privacy challenges of deep learning. Homomorphic encryption addresses this challenge by allowing computations to be performed on encrypted data, ensuring data confidentiality throughout the entire processing pipeline. In this paper, we introduce PRIVSPIKE, a privacy-preserving inference framework for SNNs using the CKKS homomorphic encryption scheme. PRIVSPIKE supports arbitrary depth SNNs and introduces two key algorithms for evaluating the Leaky Integrate-and-Fire activation function: (1) a polynomial approximation algorithm designed for high-performance SNN inference, and (2) a novel scheme-switching algorithm that optimizes precision at a higher computational cost. We evaluate PRIVSPIKE on MNIST, CIFAR-10, Neuromorphic MNIST, and CIFAR-10 DVS using models from LeNet-5 and ResNet-19 architectures, achieving encrypted inference accuracies of 98.10%, 79.3%, 98.1%, and 66.0%, respectively. On a consumer-grade CPU, SNN LeNet-5 models achieved inference times of 28 seconds on MNIST and 212 seconds on Neuromorphic MNIST. For SNN ResNet-19 models, inference took 784 seconds on CIFAR-10 and 1846 seconds on CIFAR-10 DVS. These results establish PRIVSPIKE as a viable and efficient solution for secure SNN inference, bridging the gap between energy-efficient deep neural networks and strong cryptographic privacy guarantees while outperforming prior encrypted SNN solutions.
Authors:Jehyeok Yeon, Isha Chaudhary, Gagandeep Singh
Abstract:
Large language models (LLMs) are increasingly deployed in agentic systems where they map user intents to relevant external tools to fulfill a task. A critical step in this process is tool selection, where a retriever first surfaces candidate tools from a larger pool, after which the LLM selects the most appropriate one. This pipeline presents an underexplored attack surface where errors in selection can lead to severe outcomes like unauthorized data access or denial of service, all without modifying the agent's model or code. While existing evaluations measure task performance in benign settings, they overlook the specific vulnerabilities of the tool selection mechanism under adversarial conditions. To address this gap, we introduce ToolCert, the first statistical framework that formally certifies tool selection robustness. ToolCert models tool selection as a Bernoulli success process and evaluates it against a strong, adaptive attacker who introduces adversarial tools with misleading metadata, and are iteratively refined based on the agent's previous choices. By sampling these adversarial interactions, ToolCert produces a high-confidence lower bound on accuracy, formally quantifying the agent's worst-case performance. Our evaluation with ToolCert uncovers the severe fragility: under attacks injecting deceptive tools or saturating retrieval, the certified accuracy bound drops near zero, an average performance drop of over 60% compared to non-adversarial settings. For attacks targeting the retrieval and selection stages, the certified accuracy bound plummets to less than 20% after just a single round of adversarial adaptation. ToolCert thus reveals previously unexamined security threats inherent to tool selection and provides a principled method to quantify an agent's robustness to such threats, a necessary step for the safe deployment of agentic systems.
Authors:Chengxiao Wang, Isha Chaudhary, Qian Hu, Weitong Ruan, Rahul Gupta, Gagandeep Singh
Abstract:
Large Language Models (LLMs) can produce catastrophic responses in conversational settings that pose serious risks to public safety and security. Existing evaluations often fail to fully reveal these vulnerabilities because they rely on fixed attack prompt sequences, lack statistical guarantees, and do not scale to the vast space of multi-turn conversations. In this work, we propose QRLLM, a novel, principled Certification framework for Catastrophic risks in multi-turn Conversation for LLMs that bounds the probability of an LLM generating catastrophic responses under multi-turn conversation distributions with statistical guarantees. We model multi-turn conversations as probability distributions over query sequences, represented by a Markov process on a query graph whose edges encode semantic similarity to capture realistic conversational flow, and quantify catastrophic risks using confidence intervals. We define several inexpensive and practical distributions: random node, graph path, adaptive with rejection. Our results demonstrate that these distributions can reveal substantial catastrophic risks in frontier models, with certified lower bounds as high as 70\% for the worst model, highlighting the urgent need for improved safety training strategies in frontier LLMs.
Authors:Han Wang, Haoyu Li, Brian Ko, Huan Zhang
Abstract:
Leaderboards for LRMs have turned evaluation into a competition, incentivizing developers to optimize directly on benchmark suites. A shortcut to achieving higher rankings is to incorporate evaluation benchmarks into the training data, thereby yielding inflated performance, known as benchmark contamination. Surprisingly, our studies find that evading contamination detections for LRMs is alarmingly easy. We focus on the two scenarios where contamination may occur in practice: (I) when the base model evolves into LRM via SFT and RL, we find that contamination during SFT can be originally identified by contamination detection methods. Yet, even a brief GRPO training can markedly conceal contamination signals that most detection methods rely on. Further empirical experiments and theoretical analysis indicate that PPO style importance sampling and clipping objectives are the root cause of this detection concealment, indicating that a broad class of RL methods may inherently exhibit similar concealment capability; (II) when SFT contamination with CoT is applied to advanced LRMs as the final stage, most contamination detection methods perform near random guesses. Without exposure to non-members, contaminated LRMs would still have more confidence when responding to those unseen samples that share similar distributions to the training set, and thus, evade existing memorization-based detection methods. Together, our findings reveal the unique vulnerability of LRMs evaluations: Model developers could easily contaminate LRMs to achieve inflated leaderboards performance while leaving minimal traces of contamination, thereby strongly undermining the fairness of evaluation and threatening the integrity of public leaderboards. This underscores the urgent need for advanced contamination detection methods and trustworthy evaluation protocols tailored to LRMs.
Authors:Luoxi Tang, Yuqiao Meng, Ankita Patra, Weicheng Ma, Muchao Ye, Zhaohan Xi
Abstract:
Large Language Models (LLMs) are intensively used to assist security analysts in counteracting the rapid exploitation of cyber threats, wherein LLMs offer cyber threat intelligence (CTI) to support vulnerability assessment and incident response. While recent work has shown that LLMs can support a wide range of CTI tasks such as threat analysis, vulnerability detection, and intrusion defense, significant performance gaps persist in practical deployments. In this paper, we investigate the intrinsic vulnerabilities of LLMs in CTI, focusing on challenges that arise from the nature of the threat landscape itself rather than the model architecture. Using large-scale evaluations across multiple CTI benchmarks and real-world threat reports, we introduce a novel categorization methodology that integrates stratification, autoregressive refinement, and human-in-the-loop supervision to reliably analyze failure instances. Through extensive experiments and human inspections, we reveal three fundamental vulnerabilities: spurious correlations, contradictory knowledge, and constrained generalization, that limit LLMs in effectively supporting CTI. Subsequently, we provide actionable insights for designing more robust LLM-powered CTI systems to facilitate future research.
Authors:Alex B. Grilo, Giulio Malavolta, Michael Walter, Tianwei Zhang
Abstract:
Quantum key distribution (QKD) enables Alice and Bob to exchange a secret key over a public, untrusted quantum channel. Compared to classical key exchange, QKD achieves everlasting security: after the protocol execution the key is secure against adversaries that can do unbounded computations. On the flip side, while classical key exchange can be achieved non-interactively (with two simultaneous messages between Alice and Bob), no non-interactive protocol is known that provides everlasting security, even using quantum information. In this work, we make progress on this problem. Our main technical contribution is a computational variant of the celebrated monogamy of entanglement game, where the secret is only computationally hidden from the players, rather than information-theoretically. In these settings, we prove a negligible bound on the maximal winning probability over all strategies. As a direct application, we obtain a non-interactive (simultaneous message) QKD protocol from any post-quantum classical non-interactive key exchange, which satisfies everlastingly secure assuming Alice and Bob agree on the same key. The protocol only uses EPR pairs and standard and Hadamard basis measurements, making it suitable for near-term quantum hardware. We also propose how to convert this protocol into a two-round protocol that satisfies the standard notion of everlasting security. Finally, we prove a no-go theorem which establishes that (in contrast to the case of ordinary multi-round QKD) entanglement is necessary for non-interactive QKD, i.e., the messages sent by Alice and Bob cannot both be unentangled with their respective quantum memories if the protocol is to be everlastingly secure.
Authors:Dennis Jacob, Emad Alghamdi, Zhanhao Hu, Basel Alomair, David Wagner
Abstract:
Large language models (LLMs) have become increasingly popular due to their ability to interact with unstructured content. As such, LLMs are now a key driver behind the automation of language processing systems, such as AI agents. Unfortunately, these advantages have come with a vulnerability to prompt injections, an attack where an adversary subverts the LLM's intended functionality with an injected task. Past approaches have proposed detectors and finetuning to provide robustness, but these techniques are vulnerable to adaptive attacks or cannot be used with state-of-the-art models. To this end we propose type-directed privilege separation for LLMs, a method that systematically prevents prompt injections. We restrict the ability of an LLM to interact with third-party data by converting untrusted content to a curated set of data types; unlike raw strings, each data type is limited in scope and content, eliminating the possibility for prompt injections. We evaluate our method across several case studies and find that designs leveraging our principles can systematically prevent prompt injection attacks while maintaining high utility.
Authors:Yury Yanovich, Victoria Kovalevskaya, Maksim Egorov, Elizaveta Smirnova, Matvey Mishuris, Yash Madhwal, Kirill Ziborov, Vladimir Gorgadze, Subodh Sharma
Abstract:
The Open Network (TON) blockchain employs an asynchronous execution model that introduces unique security challenges for smart contracts, particularly race conditions arising from unpredictable message processing order. While previous work established vulnerability patterns through static analysis of audit reports, dynamic detection of temporal dependencies through systematic testing remains an open problem. We present BugMagnifier, a transaction simulation framework that systematically reveals vulnerabilities in TON smart contracts through controlled message orchestration. Built atop TON Sandbox and integrated with the TON Virtual Machine (TVM), our tool combines precise message queue manipulation with differential state analysis and probabilistic permutation testing to detect asynchronous execution flaws. Experimental evaluation demonstrates BugMagnifier's effectiveness through extensive parametric studies on purpose-built vulnerable contracts, revealing message ratio-dependent detection complexity that aligns with theoretical predictions. This quantitative model enables predictive vulnerability assessment while shifting discovery from manual expert analysis to automated evidence generation. By providing reproducible test scenarios for temporal vulnerabilities, BugMagnifier addresses a critical gap in the TON security tooling, offering practical support for safer smart contract development in asynchronous blockchain environments.
Authors:Jialun Zhang, Merve Gülmez, Thomas Nyman, Gang Tan
Abstract:
Rust is a modern systems programming language that ensures memory safety by enforcing ownership and borrowing rules at compile time. While the unsafe keyword allows programmers to bypass these restrictions, it introduces significant risks. Various approaches for isolating unsafe code to protect safe Rust from vulnerabilities have been proposed, yet these methods provide only fixed isolation boundaries and do not accommodate expressive policies that require sandboxing both safe and unsafe code. This paper presents SandCell for flexible and lightweight isolation in Rust by leveraging existing syntactic boundaries. SandCell allows programmers to specify which components to sandbox with minimal annotation effort, enabling fine-grained control over isolation. The system also introduces novel techniques to minimize overhead when transferring data between sandboxes. Our evaluation demonstrates SandCell's effectiveness in preventing vulnerabilities across various Rust applications while maintaining reasonable performance overheads.
Authors:Yuqiao Meng, Luoxi Tang, Feiyang Yu, Jinyuan Jia, Guanhua Yan, Ping Yang, Zhaohan Xi
Abstract:
Large Language Models (LLMs) are intensively used to assist security analysts in counteracting the rapid exploitation of cyber threats, wherein LLMs offer cyber threat intelligence (CTI) to support vulnerability assessment and incident response. While recent work has shown that LLMs can support a wide range of CTI tasks such as threat analysis, vulnerability detection, and intrusion defense, significant performance gaps persist in practical deployments. In this paper, we investigate the intrinsic vulnerabilities of LLMs in CTI, focusing on challenges that arise from the nature of the threat landscape itself rather than the model architecture. Using large-scale evaluations across multiple CTI benchmarks and real-world threat reports, we introduce a novel categorization methodology that integrates stratification, autoregressive refinement, and human-in-the-loop supervision to reliably analyze failure instances. Through extensive experiments and human inspections, we reveal three fundamental vulnerabilities: spurious correlations, contradictory knowledge, and constrained generalization, that limit LLMs in effectively supporting CTI. Subsequently, we provide actionable insights for designing more robust LLM-powered CTI systems to facilitate future research.
Authors:Yuqiao Meng, Luoxi Tang, Feiyang Yu, Xi Li, Guanhua Yan, Ping Yang, Zhaohan Xi
Abstract:
As cyber threats continue to grow in scale and sophistication, blue team defenders increasingly require advanced tools to proactively detect and mitigate risks. Large Language Models (LLMs) offer promising capabilities for enhancing threat analysis. However, their effectiveness in real-world blue team threat-hunting scenarios remains insufficiently explored. This paper presents CyberTeam, a benchmark designed to guide LLMs in blue teaming practice. CyberTeam constructs a standardized workflow in two stages. First, it models realistic threat-hunting workflows by capturing the dependencies among analytical tasks from threat attribution to incident response. Next, each task is addressed through a set of operational modules tailored to its specific analytical requirements. This transforms threat hunting into a structured sequence of reasoning steps, with each step grounded in a discrete operation and ordered according to task-specific dependencies. Guided by this framework, LLMs are directed to perform threat-hunting tasks through modularized steps. Overall, CyberTeam integrates 30 tasks and 9 operational modules to guide LLMs through standardized threat analysis. We evaluate both leading LLMs and state-of-the-art cybersecurity agents, comparing CyberTeam against open-ended reasoning strategies. Our results highlight the improvements enabled by standardized design, while also revealing the limitations of open-ended reasoning in real-world threat hunting.
Authors:Eduardo Chielle, Manaar Alam, Jinting Liu, Jovan Kascelan, Michail Maniatakos
Abstract:
Recent work has made non-interactive privacy-preserving inference more practical by running deep Convolution Neural Network (CNN) with Fully Homomorphic Encryption (FHE). However, these methods remain limited by their reliance on bootstrapping, a costly FHE operation applied across multiple layers, severely slowing inference. They also depend on high-degree polynomial approximations of non-linear activations, which increase multiplicative depth and reduce accuracy by 2-5% compared to plaintext ReLU models. In this work, we focus on ResNets, a widely adopted benchmark architecture in privacy-preserving inference, and close the accuracy gap between their FHE-based non-interactive models and plaintext counterparts, while also achieving faster inference than existing methods. We use a quadratic polynomial approximation of ReLU, which achieves the theoretical minimum multiplicative depth for non-linear activations, along with a penalty-based training strategy. We further introduce structural optimizations such as node fusing, weight redistribution, and tower reuse. These optimizations reduce the required FHE levels in CNNs by nearly a factor of five compared to prior work, allowing us to run ResNet models under leveled FHE without bootstrapping. To further accelerate inference and recover accuracy typically lost with polynomial approximations, we introduce parameter clustering along with a joint strategy of data encoding layout and ensemble techniques. Experiments with ResNet-18, ResNet-20, and ResNet-32 on CIFAR-10 and CIFAR-100 show that our approach achieves up to 4x faster private inference than prior work with comparable accuracy to plaintext ReLU models.
Authors:Tarunesh Verma, Yichao Yuan, Nishil Talati, Todd Austin
Abstract:
Zero-Knowledge Proofs (ZKP) are protocols which construct cryptographic proofs to demonstrate knowledge of a secret input in a computation without revealing any information about the secret. ZKPs enable novel applications in private and verifiable computing such as anonymized cryptocurrencies and blockchain scaling and have seen adoption in several real-world systems. Prior work has accelerated ZKPs on GPUs by leveraging the inherent parallelism in core computation kernels like Multi-Scalar Multiplication (MSM). However, we find that a systematic characterization of execution bottlenecks in ZKPs, as well as their scalability on modern GPU architectures, is missing in the literature. This paper presents ZKProphet, a comprehensive performance study of Zero-Knowledge Proofs on GPUs. Following massive speedups of MSM, we find that ZKPs are bottlenecked by kernels like Number-Theoretic Transform (NTT), as they account for up to 90% of the proof generation latency on GPUs when paired with optimized MSM implementations. Available NTT implementations under-utilize GPU compute resources and often do not employ architectural features like asynchronous compute and memory operations. We observe that the arithmetic operations underlying ZKPs execute exclusively on the GPU's 32-bit integer pipeline and exhibit limited instruction-level parallelism due to data dependencies. Their performance is thus limited by the available integer compute units. While one way to scale the performance of ZKPs is adding more compute units, we discuss how runtime parameter tuning for optimizations like precomputed inputs and alternative data representations can extract additional speedup. With this work, we provide the ZKP community a roadmap to scale performance on GPUs and construct definitive GPU-accelerated ZKPs for their application requirements and available hardware resources.
Authors:Raghul Saravanan, Sudipta Paria, Aritra Dasgupta, Swarup Bhunia, Sai Manoj P D
Abstract:
Hardware Fuzzing emerged as one of the crucial techniques for finding security flaws in modern hardware designs by testing a wide range of input scenarios. One of the main challenges is creating high-quality input seeds that maximize coverage and speed up verification. Coverage-Guided Fuzzing (CGF) methods help explore designs more effectively, but they struggle to focus on specific parts of the hardware. Existing Directed Gray-box Fuzzing (DGF) techniques like DirectFuzz try to solve this by generating targeted tests, but it has major drawbacks, such as supporting only limited hardware description languages, not scaling well to large circuits, and having issues with abstraction mismatches. To address these problems, we introduce a novel framework, PROFUZZ, that follows the DGF approach and combines fuzzing with Automatic Test Pattern Generation (ATPG) for more efficient fuzzing. By leveraging ATPG's structural analysis capabilities, PROFUZZ can generate precise input seeds that target specific design regions more effectively while maintaining high fuzzing throughput. Our experiments show that PROFUZZ scales 30x better than DirectFuzz when handling multiple target sites, improves coverage by 11.66%, and runs 2.76x faster, highlighting its scalability and effectiveness for directed fuzzing in complex hardware systems.
Authors:Andrew Campbell, Anna Scaglione, Hang Liu, Victor Elvira, Sean Peisert, Daniel Arnold
Abstract:
This paper addresses the problem of protecting network information from privacy system identification (SI) attacks when sharing cyber-physical system simulations. We model analyst observations of networked states as time-series outputs of a graph filter driven by differentially private (DP) nodal excitations, with the analyst aiming to infer the underlying graph shift operator (GSO). Unlike traditional SI, which estimates system parameters, we study the inverse problem: what assumptions prevent adversaries from identifying the GSO while preserving utility for legitimate analysis. We show that applying DP mechanisms to inputs provides formal privacy guarantees for the GSO, linking the $(ε,δ)$-DP bound to the spectral properties of the graph filter and noise covariance. More precisely, for DP Gaussian signals, the spectral characteristics of both the filter and noise covariance determine the privacy bound, with smooth filters and low-condition-number covariance yielding greater privacy.
Authors:Jiayi Lin, Liangcai Su, Junzhe Li, Chenxiong Qian
Abstract:
Fuzzing is effective for vulnerability discovery but struggles with complex targets such as compilers, interpreters, and database engines, which accept textual input that must satisfy intricate syntactic and semantic constraints. Although language models (LMs) have attracted interest for this task due to their vast latent knowledge and reasoning potential, their practical adoption has been limited. The major challenges stem from insufficient exploration of deep program logic among real-world codebases, and the high cost of leveraging larger models. To overcome these challenges, we propose R1-Fuzz, the first framework that leverages reinforcement learning (RL) to specialize cost-efficient LMs and integrate them for complex textual fuzzing input generation. R1-Fuzz introduces two key designs: coverage-slicing-based question construction and a distance-based reward calculation. Through RL-based post-training of a model with our constructed dataset, R1-Fuzz designs a fuzzing workflow that tightly integrates LMs to reason deep program semantics during fuzzing. Evaluations on diverse real-world targets show that our design enables a small model, named R1-Fuzz-7B, to rival or even outperform much larger models in real-world fuzzing. Notably, R1-Fuzz achieves up to 75\% higher coverage than state-of-the-art fuzzers and discovers 29 previously unknown vulnerabilities, demonstrating its practicality.
Authors:Michiharu Yamashita, Thanh Tran, Delvin Ce Zhang, Dongwon Lee
Abstract:
The rapid advancement of Large Language Models (LLMs) has enabled the generation of highly realistic synthetic data. We identify a new vulnerability, LLMs generating convincing career trajectories in fake resumes and explore effective detection methods. To address this challenge, we construct a dataset of machine-generated career trajectories using LLMs and various methods, and demonstrate that conventional text-based detectors perform poorly on structured career data. We propose CareerScape, a novel heterogeneous, hierarchical multi-layer graph framework that models career entities and their relations in a unified global graph built from genuine resumes. Unlike conventional classifiers that treat each instance independently, CareerScape employs a structure-aware framework that augments user-specific subgraphs with trusted neighborhood information from a global graph, enabling the model to capture both global structural patterns and local inconsistencies indicative of synthetic career paths. Experimental results show that CareerScape outperforms state-of-the-art baselines by 5.8-85.0% relatively, highlighting the importance of structure-aware detection for machine-generated content.
Authors:Jan Wichelmann, Anja Rabich, Anna P"atschke, Thomas Eisenbarth
Abstract:
Trusted execution environments (TEEs) offer hardware-assisted means to protect code and data. However, as shown in numerous results over the years, attackers can use side-channels to leak data access patterns and even single-step the code. While the vendors are slowly introducing hardware-based countermeasures for some attacks, others will stay unaddressed. This makes a software-level countermeasure desirable, but current available solutions only address very specific attack vectors or have a narrow leakage model. In this work, we take a holistic view at the vulnerabilities of TEEs and design a tool named Obelix, which is the first to protect both code and data against a wide range of TEE attacks, from cache attacks over single-stepping to ciphertext side-channels. We analyze the practically achievable precision of state-of-the-art single-stepping tools, and present an algorithm which uses that knowledge to divide a program into uniform code blocks, that are indistinguishable for a strong attacker. By storing these blocks and the program data in oblivious RAM, the attacker cannot follow execution, effectively protecting both secret code and data. We describe how we automate our approach to make it available for developers who are unfamiliar with side-channels. As an obfuscation tool, Obelix comes with a considerable performance overhead, but compensates this with strong security guarantees and easy applicability without requiring any expert knowledge.
Authors:Sergio Benlloch-Lopez, Miquel Viel-Vazquez, Javier Naranjo-Alcazar, Jordi Grau-Haro, Pedro Zuccarello
Abstract:
The rapid proliferation of IoT nodes equipped with microphones and capable of performing on-device audio classification exposes highly sensitive data while operating under tight resource constraints. To protect against this, we present a defence-in-depth architecture comprising a security protocol that treats the edge device, cellular network and cloud backend as three separate trust domains, linked by TPM-based remote attestation and mutually authenticated TLS 1.3. A STRIDE-driven threat model and attack-tree analysis guide the design. At startup, each boot stage is measured into TPM PCRs. The node can only decrypt its LUKS-sealed partitions after the cloud has verified a TPM quote and released a one-time unlock key. This ensures that rogue or tampered devices remain inert. Data in transit is protected by TLS 1.3 and hybridised with Kyber and Dilithium to provide post-quantum resilience. Meanwhile, end-to-end encryption and integrity hashes safeguard extracted audio features. Signed, rollback-protected AI models and tamper-responsive sensors harden firmware and hardware. Data at rest follows a 3-2-1 strategy comprising a solid-state drive sealed with LUKS, an offline cold archive encrypted with a hybrid post-quantum cipher and an encrypted cloud replica. Finally, we set out a plan for evaluating the physical and logical security of the proposed protocol.
Authors:Monika Henzinger, Nikita P. Kalinin, Jalaj Upadhyay
Abstract:
The factorization norms of the lower-triangular all-ones $n \times n$ matrix, $γ_2(M_{count})$ and $γ_{F}(M_{count})$, play a central role in differential privacy as they are used to give theoretical justification of the accuracy of the only known production-level private training algorithm of deep neural networks by Google. Prior to this work, the best known upper bound on $γ_2(M_{count})$ was $1 + \frac{\log n}Ï$ by Mathias (Linear Algebra and Applications, 1993), and the best known lower bound was $\frac{1}Ï(2 + \log(\frac{2n+1}{3})) \approx 0.507 + \frac{\log n}Ï$ (MatouÅ¡ek, Nikolov, Talwar, IMRN 2020), where $\log$ denotes the natural logarithm. Recently, Henzinger and Upadhyay (SODA 2025) gave the first explicit factorization that meets the bound of Mathias (1993) and asked whether there exists an explicit factorization that improves on Mathias' bound. We answer this question in the affirmative. Additionally, we improve the lower bound significantly. More specifically, we show that $$ 0.701 + \frac{\log n}Ï + o(1) \;\leq\; γ_2(M_{count}) \;\leq\; 0.846 + \frac{\log n}Ï + o(1). $$ That is, we reduce the gap between the upper and lower bound to $0.14 + o(1)$. We also show that our factors achieve a better upper bound for $γ_{F}(M_{count})$ compared to prior work, and we establish an improved lower bound: $$ 0.701 + \frac{\log n}Ï + o(1) \;\leq\; γ_{F}(M_{count}) \;\leq\; 0.748 + \frac{\log n}Ï + o(1). $$ That is, the gap between the lower and upper bound provided by our explicit factorization is $0.047 + o(1)$.
Authors:Mohamad Fakih, Rahul Dharmaji, Youssef Mahmoud, Halima Bouzidi, Mohammad Abdullah Al Faruque
Abstract:
Modern optical mouse sensors, with their advanced precision and high responsiveness, possess an often overlooked vulnerability: they can be exploited for side-channel attacks. This paper introduces Mic-E-Mouse, the first-ever side-channel attack that targets high-performance optical mouse sensors to covertly eavesdrop on users. We demonstrate that audio signals can induce subtle surface vibrations detectable by a mouse's optical sensor. Remarkably, user-space software on popular operating systems can collect and broadcast this sensitive side channel, granting attackers access to raw mouse data without requiring direct system-level permissions. Initially, the vibration signals extracted from mouse data are of poor quality due to non-uniform sampling, a non-linear frequency response, and significant quantization. To overcome these limitations, Mic-E-Mouse employs a sophisticated end-to-end data filtering pipeline that combines Wiener filtering, resampling corrections, and an innovative encoder-only spectrogram neural filtering technique. We evaluate the attack's efficacy across diverse conditions, including speaking volume, mouse polling rate and DPI, surface materials, speaker languages, and environmental noise. In controlled environments, Mic-E-Mouse improves the signal-to-noise ratio (SNR) by up to +19 dB for speech reconstruction. Furthermore, our results demonstrate a speech recognition accuracy of roughly 42% to 61% on the AudioMNIST and VCTK datasets. All our code and datasets are publicly accessible on https://sites.google.com/view/mic-e-mouse.
Authors:Pujan Paudel, Gianluca Stringhini
Abstract:
Online e-commerce scams, ranging from shopping scams to pet scams, globally cause millions of dollars in financial damage every year. In response, the security community has developed highly accurate detection systems able to determine if a website is fraudulent. However, finding candidate scam websites that can be passed as input to these downstream detection systems is challenging: relying on user reports is inherently reactive and slow, and proactive systems issuing search engine queries to return candidate websites suffer from low coverage and do not generalize to new scam types. In this paper, we present LOKI, a system designed to identify search engine queries likely to return a high fraction of fraudulent websites. LOKI implements a keyword scoring model grounded in Learning Under Privileged Information (LUPI) and feature distillation from Search Engine Result Pages (SERPs). We rigorously validate LOKI across 10 major scam categories and demonstrate a 20.58 times improvement in discovery over both heuristic and data-driven baselines across all categories. Leveraging a small seed set of only 1,663 known scam sites, we use the keywords identified by our method to discover 52,493 previously unreported scams in the wild. Finally, we show that LOKI generalizes to previously-unseen scam categories, highlighting its utility in surfacing emerging threats.
Authors:Qingzhao Zhang, Shaocheng Luo, Z. Morley Mao, Miroslav Pajic, Michael K. Reiter
Abstract:
Autonomous vehicles, including self-driving cars, robotic ground vehicles, and drones, rely on complex sensor pipelines to ensure safe and reliable operation. However, these safety-critical systems remain vulnerable to adversarial sensor attacks that can compromise their performance and mission success. While extensive research has demonstrated various sensor attack techniques, critical gaps remain in understanding their feasibility in real-world, end-to-end systems. This gap largely stems from the lack of a systematic perspective on how sensor errors propagate through interconnected modules in autonomous systems when autonomous vehicles interact with the physical world. To bridge this gap, we present a comprehensive survey of autonomous vehicle sensor attacks across platforms, sensor modalities, and attack methods. Central to our analysis is the System Error Propagation Graph (SEPG), a structured demonstration tool that illustrates how sensor attacks propagate through system pipelines, exposing the conditions and dependencies that determine attack feasibility. With the aid of SEPG, our study distills seven key findings that highlight the feasibility challenges of sensor attacks and uncovers eleven previously overlooked attack vectors exploiting inter-module interactions, several of which we validate through proof-of-concept experiments. Additionally, we demonstrate how large language models (LLMs) can automate aspects of SEPG construction and cross-validate expert analysis, showcasing the promise of AI-assisted security evaluation.
Authors:Yury Yanovich, Sergey Sobolev, Yash Madhwal, Kirill Ziborov, Vladimir Gorgadze, Victoria Kovalevskay, Elizaveta Smirnova, Matvey Mishuris, Subodh Sharma
Abstract:
The Open Network (TON) is a high-performance blockchain platform designed for scalability and efficiency, leveraging an asynchronous execution model and a multi-layered architecture. While TON's design offers significant advantages, it also introduces unique challenges for smart contract development and security. This paper introduces a comprehensive audit checklist for TON smart contracts, based on an analysis of 34 professional audit reports containing 233 real-world vulnerabilities. The checklist addresses TON-specific challenges, such as asynchronous message handling, and provides actionable insights for developers and auditors. We also present detailed case studies of vulnerabilities in TON smart contracts, highlighting their implications and offering lessons learned. By adopting this checklist, developers and auditors can systematically identify and mitigate vulnerabilities, enhancing the security and reliability of TON-based projects. Our work bridges the gap between Ethereum's mature audit methodologies and the emerging needs of the TON ecosystem, fostering a more secure and robust blockchain environment.
Authors:Fardin Jalil Piran, Zhiling Chen, Yang Zhang, Qianyu Zhou, Jiong Tang, Farhad Imani
Abstract:
Decentralized federated learning faces privacy risks because model updates can leak data through inference attacks and membership inference, a concern that grows over many client exchanges. Differential privacy offers principled protection by injecting calibrated noise so confidential information remains secure on resource-limited IoT devices. Yet without transparency, black-box training cannot track noise already injected by previous clients and rounds, which forces worst-case additions and harms accuracy. We propose PrivateDFL, an explainable framework that joins hyperdimensional computing with differential privacy and keeps an auditable account of cumulative noise so each client adds only the difference between the required noise and what has already been accumulated. We evaluate on MNIST, ISOLET, and UCI-HAR to span image, signal, and tabular modalities, and we benchmark against transformer-based and deep learning-based baselines trained centrally with Differentially Private Stochastic Gradient Descent (DP-SGD) and Renyi Differential Privacy (RDP). PrivateDFL delivers higher accuracy, lower latency, and lower energy across IID and non-IID partitions while preserving formal (epsilon, delta) guarantees and operating without a central server. For example, under non-IID partitions, PrivateDFL achieves 24.42% higher accuracy than the Vision Transformer on MNIST while using about 10x less training time, 76x lower inference latency, and 11x less energy, and on ISOLET it exceeds Transformer accuracy by more than 80% with roughly 10x less training time, 40x lower inference latency, and 36x less training energy. Future work will extend the explainable accounting to adversarial clients and adaptive topologies with heterogeneous privacy budgets.
Authors:Charankumar Akiri, Harrison Simpson, Kshitiz Aryal, Aarav Khanna, Maanak Gupta
Abstract:
While the widespread deployment of Large Language Models (LLMs) holds great potential for society, their vulnerabilities to adversarial manipulation and exploitation can pose serious safety, security, and ethical risks. As new threats continue to emerge, it becomes critically necessary to assess the landscape of LLMs' safety and security against evolving adversarial prompt techniques. To understand the behavior of LLMs, this research provides an empirical analysis and risk profile of nine prominent LLMs, Claude Opus 4, DeepSeek V3 (both open-source and online), Gemini 2.5 Flash, GPT-4o, Grok 3, Llama 4 Scout, Mistral 7B, and Qwen 3 1.7B, against 24 different security and safety categories. These LLMs are evaluated on their ability to produce harmful responses for adversarially crafted prompts (dataset has been made public) for a broad range of safety and security topics, such as promotion of violent criminal behavior, promotion of non-violent criminal activity, societal harms related to safety, illegal sexual content, dangerous code generation, and cybersecurity threats beyond code. Our study introduces the Risk Severity Index (RSI), an agile and scalable evaluation score, to quantify and compare the security posture and creating a risk profile of LLMs. As the LLM development landscape progresses, the RSI is intended to be a valuable metric for comparing the risks of LLMs across evolving threats. This research finds widespread vulnerabilities in the safety filters of the LLMs tested and highlights the urgent need for stronger alignment, responsible deployment practices, and model governance, particularly for open-access and rapidly iterated models.
Authors:Lei Yu, Jingyuan Zhang, Xin Wang, Jiajia Ma, Li Yang, Fengjun Zhang
Abstract:
Smart contracts automate the management of high-value assets, where vulnerabilities can lead to catastrophic financial losses. This challenge is amplified in Large Language Models (LLMs) by two interconnected failures: they operate as unauditable "black boxes" lacking a transparent reasoning process, and consequently, generate code riddled with critical security vulnerabilities. To address both issues, we propose SmartCoder-R1 (based on Qwen2.5-Coder-7B), a novel framework for secure and explainable smart contract generation. It begins with Continual Pre-training (CPT) to specialize the model. We then apply Long Chain-of-Thought Supervised Fine-Tuning (L-CoT SFT) on 7,998 expert-validated reasoning-and-code samples to train the model to emulate human security analysis. Finally, to directly mitigate vulnerabilities, we employ Security-Aware Group Relative Policy Optimization (S-GRPO), a reinforcement learning phase that refines the generation policy by optimizing a weighted reward signal for compilation success, security compliance, and format correctness. Evaluated against 17 baselines on a benchmark of 756 real-world functions, SmartCoder-R1 establishes a new state of the art, achieving top performance across five key metrics: a ComPass of 87.70%, a VulRate of 8.60%, a SafeAval of 80.16%, a FuncRate of 53.84%, and a FullRate of 50.53%. This FullRate marks a 45.79% relative improvement over the strongest baseline, DeepSeek-R1. Crucially, its generated reasoning also excels in human evaluations, achieving high-quality ratings for Functionality (82.7%), Security (85.3%), and Clarity (90.7%).
Authors:Kai Ye, Liangcai Su, Chenxiong Qian
Abstract:
Code generation has emerged as a pivotal capability of Large Language Models(LLMs), revolutionizing development efficiency for programmers of all skill levels. However, the complexity of data structures and algorithmic logic often results in functional deficiencies and security vulnerabilities in generated code, reducing it to a prototype requiring extensive manual debugging. While Retrieval-Augmented Generation (RAG) can enhance correctness and security by leveraging external code manuals, it simultaneously introduces new attack surfaces. In this paper, we pioneer the exploration of attack surfaces in Retrieval-Augmented Code Generation (RACG), focusing on malicious dependency hijacking. We demonstrate how poisoned documentation containing hidden malicious dependencies (e.g., matplotlib_safe) can subvert RACG, exploiting dual trust chains: LLM reliance on RAG and developers' blind trust in LLM suggestions. To construct poisoned documents, we propose ImportSnare, a novel attack framework employing two synergistic strategies: 1)Position-aware beam search optimizes hidden ranking sequences to elevate poisoned documents in retrieval results, and 2)Multilingual inductive suggestions generate jailbreaking sequences to manipulate LLMs into recommending malicious dependencies. Through extensive experiments across Python, Rust, and JavaScript, ImportSnare achieves significant attack success rates (over 50% for popular libraries such as matplotlib and seaborn) in general, and is also able to succeed even when the poisoning ratio is as low as 0.01%, targeting both custom and real-world malicious packages. Our findings reveal critical supply chain risks in LLM-powered development, highlighting inadequate security alignment for code generation tasks. To support future research, we will release the multilingual benchmark suite and datasets. The project homepage is https://importsnare.github.io.
Authors:Xng Ai, Shudan Lin, Zecheng Li, Kai Zhou, Bixin Li, Bin Xiao
Abstract:
Decentralized Finance (DeFi) attacks have resulted in significant losses, often orchestrated through Adversarial Exploiter Contracts (AECs) that exploit vulnerabilities in victim smart contracts. To proactively identify such threats, this paper targets the explainable detection of AECs. Existing detection methods struggle to capture semantic dependencies and lack interpretability, limiting their effectiveness and leaving critical knowledge gaps in AEC analysis. To address these challenges, we introduce SEASONED, an effective, self-explanatory, and robust framework for AEC detection. SEASONED extracts semantic information from contract bytecode to construct a semantic relation graph (SRG), and employs a self-counterfactual explainable detector (SCFED) to classify SRGs and generate explanations that highlight the core attack logic. SCFED further enhances robustness, generalizability, and data efficiency by extracting representative information from these explanations. Both theoretical analysis and experimental results demonstrate the effectiveness of SEASONED, which showcases outstanding detection performance, robustness, generalizability, and data efficiency learning ability. To support further research, we also release a new dataset of 359 AECs.
Authors:Sidahmed Benabderrahmane, Talal Rahwan
Abstract:
Advanced Persistent Threats (APTs) present a considerable challenge to cybersecurity due to their stealthy, long-duration nature. Traditional supervised learning methods typically require large amounts of labeled data, which is often scarce in real-world scenarios. This paper introduces a novel approach that combines AutoEncoders for anomaly detection with active learning to iteratively enhance APT detection. By selectively querying an oracle for labels on uncertain or ambiguous samples, our method reduces labeling costs while improving detection accuracy, enabling the model to effectively learn with minimal data and reduce reliance on extensive manual labeling. We present a comprehensive formulation of the Attention Adversarial Dual AutoEncoder-based anomaly detection framework and demonstrate how the active learning loop progressively enhances the model's performance. The framework is evaluated on real-world, imbalanced provenance trace data from the DARPA Transparent Computing program, where APT-like attacks account for just 0.004\% of the data. The datasets, which cover multiple operating systems including Android, Linux, BSD, and Windows, are tested in two attack scenarios. The results show substantial improvements in detection rates during active learning, outperforming existing methods.
Authors:Jiale Zhang, Pengfei He, Fei Li, Kewei Li, Yan Wang, Lan Huang, Ruochi Zhang, Fengfeng Zhou
Abstract:
In today's fast-paced digital communication, the surge in network traffic data and frequency demands robust and precise network intrusion solutions. Conventional machine learning methods struggle to grapple with complex patterns within the vast network intrusion datasets, which suffer from data scarcity and class imbalance. As a result, we have integrated machine learning and deep learning techniques within the network intrusion detection system to bridge this gap. This study has developed TrailGate, a novel framework that combines machine learning and deep learning techniques. By integrating Transformer and Bidirectional Gated Recurrent Unit (BiGRU) architectures with advanced feature selection strategies and supplemented by data augmentation techniques, TrailGate can identifies common attack types and excels at detecting and mitigating emerging threats. This algorithmic fusion excels at detecting common and well-understood attack types and has the unique ability to swiftly identify and neutralize emerging threats that stem from existing paradigms.
Authors:Paolo Bernardi, Antonio Brogi, Gian-Luigi Ferrari, Giuseppe Bisicchia
Abstract:
Quantum computing is a disruptive technology that is expected to offer significant advantages in many critical fields (e.g. drug discovery and cryptography). The security of information processed by such machines is therefore paramount. Currently, modest Noisy Intermediate-Scale Quantum (NISQ) devices are available. The goal of this work is to identify a practical, heuristic methodology to evaluate security properties, such as secrecy and integrity, while using quantum processors owned by potentially untrustworthy providers.
Authors:Syomantak Chaudhuri, Thomas A. Courtade
Abstract:
Previous works in the differential privacy literature that allow users to choose their privacy levels typically operate under the heterogeneous differential privacy (HDP) framework with the simplifying assumption that user data and privacy levels are not correlated. Firstly, we demonstrate that the standard HDP framework falls short when user data and privacy demands are allowed to be correlated. Secondly, to address this shortcoming, we propose an alternate framework, Add-remove Heterogeneous Differential Privacy (AHDP), that jointly accounts for user data and privacy preference. We show that AHDP is robust to possible correlations between data and privacy. Thirdly, we formalize the guarantees of the proposed AHDP framework through an operational hypothesis testing perspective. The hypothesis testing setup may be of independent interest in analyzing other privacy frameworks as well. Fourthly, we show that there exists non-trivial AHDP mechanisms that notably do not require prior knowledge of the data-privacy correlations. We propose some such mechanisms and apply them to core statistical tasks such as mean estimation, frequency estimation, and linear regression. The proposed mechanisms are simple to implement with minimal assumptions and modeling requirements, making them attractive for real-world use. Finally, we empirically evaluate proposed AHDP mechanisms, highlighting their trade-offs using LLM-generated synthetic datasets, which we release for future research.
Authors:Halima Bouzidi, Haoyu Liu, Mohammad Abdullah Al Faruque
Abstract:
Language-vision understanding has driven the development of advanced perception systems, most notably the emerging paradigm of Referring Multi-Object Tracking (RMOT). By leveraging natural-language queries, RMOT systems can selectively track objects that satisfy a given semantic description, guided through Transformer-based spatial-temporal reasoning modules. End-to-End (E2E) RMOT models further unify feature extraction, temporal memory, and spatial reasoning within a Transformer backbone, enabling long-range spatial-temporal modeling over fused textual-visual representations. Despite these advances, the reliability and robustness of RMOT remain underexplored. In this paper, we examine the security implications of RMOT systems from a design-logic perspective, identifying adversarial vulnerabilities that compromise both the linguistic-visual referring and track-object matching components. Additionally, we uncover a novel vulnerability in advanced RMOT models employing FIFO-based memory, whereby targeted and consistent attacks on their spatial-temporal reasoning introduce errors that persist within the history buffer over multiple subsequent frames. We present VEIL, a novel adversarial framework designed to disrupt the unified referring-matching mechanisms of RMOT models. We show that carefully crafted digital and physical perturbations can corrupt the tracking logic reliability, inducing track ID switches and terminations. We conduct comprehensive evaluations using the Refer-KITTI dataset to validate the effectiveness of VEIL and demonstrate the urgent need for security-aware RMOT designs for critical large-scale applications.
Authors:Handi Chen, Jing Deng, Xiuzhe Wu, Zhihan Jiang, Xinchen Zhang, Xianhao Chen, Edith C. H. Ngai
Abstract:
The expansion of Internet of Things (IoT) devices constantly generates heterogeneous data streams, driving demand for continuous, decentralized intelligence. Federated Lifelong Learning (FLL) provides an ideal solution by incorporating federated and lifelong learning to overcome catastrophic forgetting. The extended lifecycle of FLL in IoT systems increases their vulnerability to persistent attacks, and these risks may be obscured by performance degradation caused by spatial-temporal data heterogeneity. Moreover, this problem is exacerbated by the standard single-server architecture, as its single point of failure makes it difficult to maintain a reliable audit trail for long-term threats. Blockchain provides a tamper-proof foundation for trustworthy FLL systems. Nevertheless, directly applying blockchain to FLL significantly increases computational and retrieval costs with the expansion of the knowledge base, slowing down the training on IoT devices. To address these challenges, we propose LiFeChain, a lightweight blockchain for secure and efficient federated lifelong learning by providing a tamper-resistant ledger with minimal on-chain disclosure and bidirectional verification. To the best of our knowledge, LiFeChain is the first blockchain tailored for FLL. LiFeChain incorporates two complementary mechanisms: the proof-of-model-correlation (PoMC) consensus on the server, which couples learning and unlearning mechanisms to mitigate negative transfer, and segmented zero-knowledge arbitration (Seg-ZA) on the client, which detects and arbitrates abnormal committee behavior without compromising privacy. LiFeChain is designed as a plug-and-play component that can be seamlessly integrated into existing FLL algorithms. Experimental results demonstrate that LiFeChain not only enhances model performance against two long-term attacks but also sustains high efficiency and scalability.
Authors:Yuting Tan, Xuying Li, Zhuo Li, Huizhen Shu, Peikang Hu
Abstract:
Gradient-based adversarial prompting, such as the Greedy Coordinate Gradient (GCG) algorithm, has emerged as a powerful method for jailbreaking large language models (LLMs). In this paper, we present a systematic appraisal of GCG and its annealing-augmented variant, T-GCG, across open-source LLMs of varying scales. Using Qwen2.5-0.5B, LLaMA-3.2-1B, and GPT-OSS-20B, we evaluate attack effectiveness on both safety-oriented prompts (AdvBench) and reasoning-intensive coding prompts. Our study reveals three key findings: (1) attack success rates (ASR) decrease with model size, reflecting the increasing complexity and non-convexity of larger models' loss landscapes; (2) prefix-based heuristics substantially overestimate attack effectiveness compared to GPT-4o semantic judgments, which provide a stricter and more realistic evaluation; and (3) coding-related prompts are significantly more vulnerable than adversarial safety prompts, suggesting that reasoning itself can be exploited as an attack vector. In addition, preliminary results with T-GCG show that simulated annealing can diversify adversarial search and achieve competitive ASR under prefix evaluation, though its benefits under semantic judgment remain limited. Together, these findings highlight the scalability limits of GCG, expose overlooked vulnerabilities in reasoning tasks, and motivate further development of annealing-inspired strategies for more robust adversarial evaluation.
Authors:Ghadeer Almusaddar, Yicheng Zhang, Saber Ganjisaffar, Barry Williams, Yu David Liu, Dmitry Ponomarev, Nael Abu-Ghazaleh
Abstract:
As modern systems increasingly rely on GPUs for computationally intensive tasks such as machine learning acceleration, ensuring the integrity of GPU computation has become critically important. Recent studies have shown that GPU kernels are vulnerable to both traditional memory safety issues (e.g., buffer overflow attacks) and emerging microarchitectural threats (e.g., Rowhammer attacks), many of which manifest as anomalous execution behaviors observable through side-channel signals. However, existing golden model based validation approaches that rely on such signals are fragile, highly sensitive to interference, and do not scale well across GPU workloads with diverse scheduling behaviors. To address these challenges, we propose ShadowScope, a monitoring and validation framework that leverages a composable golden model. Instead of building a single monolithic reference, ShadowScope decomposes trusted kernel execution into modular, repeatable functions that encode key behavioral features. This composable design captures execution patterns at finer granularity, enabling robust validation that is resilient to noise, workload variation, and interference across GPU workloads. To further reduce reliance on noisy software-only monitoring, we introduce ShadowScope+, a hardware-assisted validation mechanism that integrates lightweight on-chip checks into the GPU pipeline. ShadowScope+ achieves high validation accuracy with an average runtime overhead of just 4.6%, while incurring minimal hardware and design complexity. Together, these contributions demonstrate that side-channel observability can be systematically repurposed into a practical defense for GPU kernel integrity.
Authors:Federica Bianchi, Edoardo Di Paolo, Angelo Spognardi
Abstract:
In this paper, we present a novel encrypted traffic classification model that operates directly on raw PCAP data without requiring prior assumptions about traffic type. Unlike existing methods, it is generalizable across multiple classification tasks and leverages inter-flow signals - an innovative representation that captures temporal correlations and packet volume distributions across flows. Experimental results show that our model outperforms well-established methods in nearly every classification task and across most datasets, achieving up to 99% accuracy in some cases, demonstrating its robustness and adaptability.
Authors:Narges Dadkhah, Khan Reaz, Gerhard Wunder
Abstract:
The increasing adoption of smart home devices and IoT-based security systems presents significant opportunities to enhance convenience, safety, and risk management for homeowners and service providers. However, secure onboarding-provisioning credentials and establishing trust with cloud platforms-remains a considerable challenge. Traditional onboarding methods often rely on centralized Public Key Infrastructure (PKI) models and manufacturer-controlled keys, which introduce security risks and limit the user's digital sovereignty. These limitations hinder the widespread deployment of scalable IoT solutions. This paper presents a novel onboarding framework that builds upon existing network-layer onboarding techniques and extends them to the application layer to address these challenges. By integrating consortium blockchain technology, we propose a decentralized onboarding mechanism that enhances transparency, security, and monitoring for smart home architectures. The architecture supports device registration, key revocation, access control management, and risk detection through event-driven alerts across dedicated blockchain channels and smart contracts. To evaluate the framework, we formally model the protocol using the Tamarin Prover under the Dolev-Yao adversary model. The analysis focuses on authentication, token integrity, key confidentiality, and resilience over public channels. A prototype implementation demonstrates the system's viability in smart home settings, with verification completing in 0.34 seconds, highlighting its scalability and suitability for constrained devices and diverse stakeholders. Additionally, performance evaluation shows that the blockchain-based approach effectively handles varying workloads, maintains high throughput and low latency, and supports near real-time IoT data processing.
Authors:Sidahmed Benabderrahmane, Talal Rahwan
Abstract:
Advanced Persistent Threats (APTs) represent a growing menace to modern digital infrastructure. Unlike traditional cyberattacks, APTs are stealthy, adaptive, and long-lasting, often bypassing signature-based detection systems. This paper introduces a novel framework for APT detection that unites deep learning, reinforcement learning (RL), and active learning into a cohesive, adaptive defense system. Our system combines auto-encoders for latent behavioral encoding with a multi-agent ensemble of RL-based defenders, each trained to distinguish between benign and malicious process behaviors. We identify a critical challenge in existing detection systems: their static nature and inability to adapt to evolving attack strategies. To this end, our architecture includes multiple RL agents (Q-Learning, PPO, DQN, adversarial defenders), each analyzing latent vectors generated by an auto-encoder. When any agent is uncertain about its decision, the system triggers an active learning loop to simulate expert feedback, thus refining decision boundaries. An ensemble voting mechanism, weighted by each agent's performance, ensures robust final predictions.
Authors:Aristeidis Sidiropoulos, Christos Chrysanthos Nikolaidis, Theodoros Tsiolakis, Nikolaos Pavlidis, Vasilis Perifanis, Pavlos S. Efraimidis
Abstract:
Membership Inference Attacks (MIAs) pose a significant privacy risk, as they enable adversaries to determine whether a specific data point was included in the training dataset of a model. While Machine Unlearning is primarily designed as a privacy mechanism to efficiently remove private data from a machine learning model without the need for full retraining, its impact on the susceptibility of models to MIA remains an open question. In this study, we systematically assess the vulnerability of models to MIA after applying state-of-art Machine Unlearning algorithms. Our analysis spans four diverse datasets (two from the image domain and two in tabular format), exploring how different unlearning approaches influence the exposure of models to membership inference. The findings highlight that while Machine Unlearning is not inherently a countermeasure against MIA, the unlearning algorithm and data characteristics can significantly affect a model's vulnerability. This work provides essential insights into the interplay between Machine Unlearning and MIAs, offering guidance for the design of privacy-preserving machine learning systems.
Authors:Mohamed Elmahallawy, Tie Luo
Abstract:
Secure aggregation is a common technique in federated learning (FL) for protecting data privacy from both curious internal entities (clients or server) and external adversaries (eavesdroppers). However, in dynamic and resource-constrained environments such as low Earth orbit (LEO) satellite networks, traditional secure aggregation methods fall short in two aspects: (1) they assume continuous client availability while LEO satellite visibility is intermittent and irregular; (2) they consider privacy in each communication round but have overlooked the possible privacy leakage through multiple rounds. To address these limitations, we propose LTP-FLEO, an asynchronous FL framework that preserves long-term privacy (LTP) for LEO satellite networks. LTP-FLEO introduces (i) privacy-aware satellite partitioning, which groups satellites based on their predictable visibility to the server and enforces joint participation; (ii) model age balancing, which mitigates the adverse impact of stale model updates; and (iii) fair global aggregation, which treats satellites of different visibility durations in an equitable manner. Theoretical analysis and empirical validation demonstrate that LTP-FLEO effectively safeguards both model and data privacy across multi-round training, promotes fairness in line with satellite contributions, accelerates global convergence, and achieves competitive model accuracy.
Authors:Yang Wang, Yaxin Zhao, Xinyu Jiao, Sihan Xu, Xiangrui Cai, Ying Zhang, Xiaojie Yuan
Abstract:
Insider threat detection aims to identify malicious user behavior by analyzing logs that record user interactions. Due to the lack of fine-grained behavior-level annotations, detecting specific behavior-level anomalies within user behavior sequences is challenging. Unsupervised methods face high false positive rates and miss rates due to the inherent ambiguity between normal and anomalous behaviors. In this work, we instead introduce weak labels of behavior sequences, which have lower annotation costs, i.e., the training labels (anomalous or normal) are at sequence-level instead of behavior-level, to enhance the detection capability for behavior-level anomalies by learning discriminative features. To achieve this, we propose a novel framework called Robust Multi-sphere Learning (RMSL). RMSL uses multiple hyper-spheres to represent the normal patterns of behaviors. Initially, a one-class classifier is constructed as a good anomaly-supervision-free starting point. Building on this, using multiple instance learning and adaptive behavior-level self-training debiasing based on model prediction confidence, the framework further refines hyper-spheres and feature representations using weak sequence-level labels. This approach enhances the model's ability to distinguish between normal and anomalous behaviors. Extensive experiments demonstrate that RMSL significantly improves the performance of behavior-level insider threat detection.
Authors:Qingyuan Zeng, Shu Jiang, Jiajing Lin, Zhenzhong Wang, Kay Chen Tan, Min Jiang
Abstract:
With the rise of 3D Gaussian Splatting (3DGS), a variety of digital watermarking techniques, embedding either 1D bitstreams or 2D images, are used for copyright protection. However, the robustness of these watermarking techniques against potential attacks remains underexplored. This paper introduces the first universal black-box attack framework, the Group-based Multi-objective Evolutionary Attack (GMEA), designed to challenge these watermarking systems. We formulate the attack as a large-scale multi-objective optimization problem, balancing watermark removal with visual quality. In a black-box setting, we introduce an indirect objective function that blinds the watermark detector by minimizing the standard deviation of features extracted by a convolutional network, thus rendering the feature maps uninformative. To manage the vast search space of 3DGS models, we employ a group-based optimization strategy to partition the model into multiple, independent sub-optimization problems. Experiments demonstrate that our framework effectively removes both 1D and 2D watermarks from mainstream 3DGS watermarking methods while maintaining high visual fidelity. This work reveals critical vulnerabilities in existing 3DGS copyright protection schemes and calls for the development of more robust watermarking systems.
Authors:Wei Wang, Xiangyun Tang, Yajie Wang, Yijing Lin, Tao Zhang, Meng Shen, Dusit Niyato, Liehuang Zhu
Abstract:
Federated Unlearning (FU) has emerged as a promising solution to respond to the right to be forgotten of clients, by allowing clients to erase their data from global models without compromising model performance. Unfortunately, researchers find that the parameter variations of models induced by FU expose clients' data information, enabling attackers to infer the label of unlearning data, while label inference attacks against FU remain unexplored. In this paper, we introduce and analyze a new privacy threat against FU and propose a novel label inference attack, ULIA, which can infer unlearning data labels across three FU levels. To address the unique challenges of inferring labels via the models variations, we design a gradient-label mapping mechanism in ULIA that establishes a relationship between gradient variations and unlearning labels, enabling inferring labels on accumulated model variations. We evaluate ULIA on both IID and non-IID settings. Experimental results show that in the IID setting, ULIA achieves a 100% Attack Success Rate (ASR) under both class-level and client-level unlearning. Even when only 1% of a user's local data is forgotten, ULIA still attains an ASR ranging from 93% to 62.3%.
Authors:Ngoc N. Tran, Anwar Said, Waseem Abbas, Tyler Derr, Xenofon D. Koutsoukos
Abstract:
Graph-based malware classifiers can achieve over 94% accuracy on standard Android datasets, yet we find they suffer accuracy drops of up to 45% when evaluated on previously unseen malware variants from the same family - a scenario where strong generalization would typically be expected. This highlights a key limitation in existing approaches: both the model architectures and their structure-only representations often fail to capture deeper semantic patterns. In this work, we propose a robust semantic enrichment framework that enhances function call graphs with contextual features, including function-level metadata and, when available, code embeddings derived from large language models. The framework is designed to operate under real-world constraints where feature availability is inconsistent, and supports flexible integration of semantic signals. To evaluate generalization under realistic domain and temporal shifts, we introduce two new benchmarks: MalNet-Tiny-Common and MalNet-Tiny-Distinct, constructed using malware family partitioning to simulate cross-family generalization and evolving threat behavior. Experiments across multiple graph neural network backbones show that our method improves classification performance by up to 8% under distribution shift and consistently enhances robustness when integrated with adaptation-based methods. These results offer a practical path toward building resilient malware detection systems in evolving threat environments.
Authors:Mustafa Doger, Sennur Ulukus
Abstract:
Parallel Proof-of-Work (PoW) protocols are suggested to improve the safety guarantees, transaction throughput and confirmation latencies of Nakamoto consensus. In this work, we first consider the existing parallel PoW protocols and develop hard-coded incentive attack structures. Our theoretical results and simulations show that the existing parallel PoW protocols are more vulnerable to incentive attacks than the Nakamoto consensus, e.g., attacks have smaller profitability threshold and they result in higher relative rewards. Next, we introduce a voting-based semi-parallel PoW protocol that outperforms both Nakamoto consensus and the existing parallel PoW protocols from most practical perspectives such as communication overheads, throughput, transaction conflicts, incentive compatibility of the protocol as well as a fair distribution of transaction fees among the voters and the leaders. We use state-of-the-art analysis to evaluate the consistency of the protocol and consider Markov decision process (MDP) models to substantiate our claims about the resilience of our protocol against incentive attacks.
Authors:Weiheng Wu, Wei Qiao, Teng Li, Yebo Feng, Zhuo Ma, Jianfeng Ma, Yang Liu
Abstract:
Provenance graph-based intrusion detection systems are deployed on hosts to defend against increasingly severe Advanced Persistent Threat. Using Graph Neural Networks to detect these threats has become a research focus and has demonstrated exceptional performance. However, the widespread adoption of GNN-based security models is limited by their inherent black-box nature, as they fail to provide security analysts with any verifiable explanations for model predictions or any evidence regarding the model's judgment in relation to real-world attacks. To address this challenge, we propose ProvX, an effective explanation framework for exlaining GNN-based security models on provenance graphs. ProvX introduces counterfactual explanation logic, seeking the minimal structural subset within a graph predicted as malicious that, when perturbed, can subvert the model's original prediction. We innovatively transform the discrete search problem of finding this critical subgraph into a continuous optimization task guided by a dual objective of prediction flipping and distance minimization. Furthermore, a Staged Solidification strategy is incorporated to enhance the precision and stability of the explanations. We conducted extensive evaluations of ProvX on authoritative datasets. The experimental results demonstrate that ProvX can locate critical graph structures that are highly relevant to real-world attacks and achieves an average explanation necessity of 51.59\%, with these metrics outperforming current SOTA explainers. Furthermore, we explore and provide a preliminary validation of a closed-loop Detection-Explanation-Feedback enhancement framework, demonstrating through experiments that the explanation results from ProvX can guide model optimization, effectively enhancing its robustness against adversarial attacks.
Authors:Jiajun Gu, Yuhang Yao, Shuaiqi Wang, Carlee Joe-Wong
Abstract:
Gradient inversion attacks pose significant privacy threats to distributed training frameworks such as federated learning, enabling malicious parties to reconstruct sensitive local training data from gradient communications between clients and an aggregation server during the aggregation process. While traditional encryption-based defenses, such as homomorphic encryption, offer strong privacy guarantees without compromising model utility, they often incur prohibitive computational overheads. To mitigate this, selective encryption has emerged as a promising approach, encrypting only a subset of gradient data based on the data's significance under a certain metric. However, there have been few systematic studies on how to specify this metric in practice. This paper systematically evaluates selective encryption methods with different significance metrics against state-of-the-art attacks. Our findings demonstrate the feasibility of selective encryption in reducing computational overhead while maintaining resilience against attacks. We propose a distance-based significance analysis framework that provides theoretical foundations for selecting critical gradient elements for encryption. Through extensive experiments on different model architectures (LeNet, CNN, BERT, GPT-2) and attack types, we identify gradient magnitude as a generally effective metric for protection against optimization-based gradient inversions. However, we also observe that no single selective encryption strategy is universally optimal across all attack scenarios, and we provide guidelines for choosing appropriate strategies for different model architectures and privacy requirements.
Authors:Dingding Wang, Jianting He, Siwei Wu, Yajin Zhou, Lei Wu, Cong Wang
Abstract:
Smart contract upgrades are increasingly common due to their flexibility in modifying deployed contracts, such as fixing bugs or adding new functionalities. Meanwhile, upgrades compromise the immutability of contracts, introducing significant security concerns. While existing research has explored the security impacts of contract upgrades, these studies are limited in collection of upgrade behaviors and identification of insecurities.
To address these limitations, we conduct a comprehensive study on the insecurities of upgrade behaviors. First, we build a dataset containing 83,085 upgraded contracts and 20,902 upgrade chains. To our knowledge, this is the first large-scale dataset about upgrade behaviors, revealing their diversity and exposing gaps in public disclosure. Next, we develop a taxonomy of insecurities based on 37 real-world security incidents, categorizing eight types of upgrade risks and providing the first complete view of upgrade-related insecurities. Finally, we survey public awareness of these risks and existing mitigations. Our findings show that four types of security risks are overlooked by the public and lack mitigation measures. We detect these upgrade risks through a preliminary study, identifying 31,407 related issues - a finding that raises significant concerns.
Authors:Lei Yu, Shiqi Cheng, Zhirong Huang, Jingyuan Zhang, Chenjie Shen, Junyi Lu, Li Yang, Fengjun Zhang, Jiajia Ma
Abstract:
With the increasing security issues in blockchain, smart contract vulnerability detection has become a research focus. Existing vulnerability detection methods have their limitations: 1) Static analysis methods struggle with complex scenarios. 2) Methods based on specialized pre-trained models perform well on specific datasets but have limited generalization capabilities. In contrast, general-purpose Large Language Models (LLMs) demonstrate impressive ability in adapting to new vulnerability patterns. However, they often underperform on specific vulnerability types compared to methods based on specialized pre-trained models. We also observe that explanations generated by general-purpose LLMs can provide fine-grained code understanding information, contributing to improved detection performance.
Inspired by these observations, we propose SAEL, an LLM-based framework for smart contract vulnerability detection. We first design targeted prompts to guide LLMs in identifying vulnerabilities and generating explanations, which serve as prediction features. Next, we apply prompt-tuning on CodeT5 and T5 to process contract code and explanations, enhancing task-specific performance. To combine the strengths of each approach, we introduce an Adaptive Mixture-of-Experts architecture. This dynamically adjusts feature weights via a Gating Network, which selects relevant features using TopK filtering and Softmax normalization, and incorporates a Multi-Head Self-Attention mechanism to enhance cross-feature relationships. This design enables effective integration of LLM predictions, explanation features, and code features through gradient optimization. The loss function jointly considers both independent feature performance and overall weighted predictions. Experiments show that SAEL outperforms existing methods across various vulnerabilities.
Authors:Tanzim Mahfuz, Sudipta Paria, Tasneem Suha, Swarup Bhunia, Prabuddha Chakraborty
Abstract:
Microelectronic systems are widely used in many sensitive applications (e.g., manufacturing, energy, defense). These systems increasingly handle sensitive data (e.g., encryption key) and are vulnerable to diverse threats, such as, power side-channel attacks, which infer sensitive data through dynamic power profile. In this paper, we present a novel framework, POLARIS for mitigating power side channel leakage using an Explainable Artificial Intelligence (XAI) guided masking approach. POLARIS uses an unsupervised process to automatically build a tailored training dataset and utilize it to train a masking model.The POLARIS framework outperforms state-of-the-art mitigation solutions (e.g., VALIANT) in terms of leakage reduction, execution time, and overhead across large designs.
Authors:Sotiris Chatzimiltis, Mohammad Shojafar, Mahdi Boloursaz Mashhadi, Rahim Tafazolli
Abstract:
Next generation Radio Access Networks (RANs) introduce programmability, intelligence, and near real-time control through intelligent controllers, enabling enhanced security within the RAN and across broader 5G/6G infrastructures. This paper presents a comprehensive survey highlighting opportunities, challenges, and research gaps for Large Language Models (LLMs)-assisted explainable (XAI) intrusion detection (IDS) for secure future RAN environments. Motivated by this, we propose an LLM interpretable anomaly-based detection system for distributed denial-of-service (DDoS) attacks using multivariate time series key performance measures (KPMs), extracted from E2 nodes, within the Near Real-Time RAN Intelligent Controller (Near-RT RIC). An LSTM-based model is trained to identify malicious User Equipment (UE) behavior based on these KPMs. To enhance transparency, we apply post-hoc local explainability methods such as LIME and SHAP to interpret individual predictions. Furthermore, LLMs are employed to convert technical explanations into natural-language insights accessible to non-expert users. Experimental results on real 5G network KPMs demonstrate that our framework achieves high detection accuracy (F1-score > 0.96) while delivering actionable and interpretable outputs.
Authors:Muntasir Wahed, Xiaona Zhou, Kiet A. Nguyen, Tianjiao Yu, Nirav Diwan, Gang Wang, Dilek Hakkani-Tür, Ismini Lourentzou
Abstract:
Recent advancements in Large Language Models (LLMs) have significantly enhanced their code generation capabilities. However, their robustness against adversarial misuse, particularly through multi-turn malicious coding prompts, remains underexplored. In this work, we introduce code decomposition attacks, where a malicious coding task is broken down into a series of seemingly benign subtasks across multiple conversational turns to evade safety filters. To facilitate systematic evaluation, we introduce \benchmarkname{}, a large-scale benchmark designed to evaluate the robustness of code LLMs against both single-turn and multi-turn malicious prompts. Empirical results across open- and closed-source models reveal persistent vulnerabilities, especially under multi-turn scenarios. Fine-tuning on MOCHA improves rejection rates while preserving coding ability, and importantly, enhances robustness on external adversarial datasets with up to 32.4% increase in rejection rates without any additional supervision.
Authors:Nazatul H. Sultan, Xinlong Guan, Josef Pieprzyk, Wei Ni, Sharif Abuadbba, Hajime Suzuki
Abstract:
As 5G networks expand into critical infrastructure, secure and efficient user authentication is more important than ever. The 5G-AKA protocol, standardized by 3GPP in TS 33.501, is central to authentication in current 5G deployments. It provides mutual authentication, user privacy, and key secrecy. However, despite its adoption, 5G-AKA has known limitations in both security and performance. While it focuses on protecting privacy against passive attackers, recent studies show its vulnerabilities to active attacks. It also relies on a sequence number mechanism to prevent replay attacks, requiring perfect synchronization between the device and the core network. This stateful design adds complexity, causes desynchronization, and incurs extra communication overhead. More critically, 5G-AKA lacks Perfect Forward Secrecy (PFS), exposing past communications if long-term keys are compromised-an increasing concern amid sophisticated threats. This paper proposes an enhanced authentication protocol that builds on 5G-AKA's design while addressing its shortcomings. First, we introduce a stateless version that removes sequence number reliance, reducing complexity while staying compatible with existing SIM cards and infrastructure. We then extend this design to add PFS with minimal cryptographic overhead. Both protocols are rigorously analyzed using ProVerif, confirming their compliance with all major security requirements, including resistance to passive and active attacks, as well as those defined by 3GPP and academic studies. We also prototype both protocols and evaluate their performance against 5G-AKA and 5G-AKA' (USENIX'21). Our results show the proposed protocols offer stronger security with only minor computational overhead, making them practical, future-ready solutions for 5G and beyond.
Authors:Matteo Boglioni, Terrance Liu, Andrew Ilyas, Zhiwei Steven Wu
Abstract:
In this work we study black-box privacy auditing, where the goal is to lower bound the privacy parameter of a differentially private learning algorithm using only the algorithm's outputs (i.e., final trained model). For DP-SGD (the most successful method for training differentially private deep learning models), the canonical approach auditing uses membership inference-an auditor comes with a small set of special "canary" examples, inserts a random subset of them into the training set, and then tries to discern which of their canaries were included in the training set (typically via a membership inference attack). The auditor's success rate then provides a lower bound on the privacy parameters of the learning algorithm. Our main contribution is a method for optimizing the auditor's canary set to improve privacy auditing, leveraging recent work on metagradient optimization. Our empirical evaluation demonstrates that by using such optimized canaries, we can improve empirical lower bounds for differentially private image classification models by over 2x in certain instances. Furthermore, we demonstrate that our method is transferable and efficient: canaries optimized for non-private SGD with a small model architecture remain effective when auditing larger models trained with DP-SGD.
Authors:Daniel Commey, Rebecca A. Sarpong, Griffith S. Klogo, Winful Bagyl-Bac, Garth V. Crosby
Abstract:
Federated learning (FL) enables collaborative model training across decentralized clients while preserving data privacy. However, its open-participation nature exposes it to data-poisoning attacks, in which malicious actors submit corrupted model updates to degrade the global model. Existing defenses are often reactive, relying on statistical aggregation rules that can be computationally expensive and that typically assume an honest majority. This paper introduces a proactive, economic defense: a lightweight Bayesian incentive mechanism that makes malicious behavior economically irrational. Each training round is modeled as a Bayesian game of incomplete information in which the server, acting as the principal, uses a small, private validation dataset to verify update quality before issuing payments. The design satisfies Individual Rationality (IR) for benevolent clients, ensuring their participation is profitable, and Incentive Compatibility (IC), making poisoning an economically dominated strategy. Extensive experiments on non-IID partitions of MNIST and FashionMNIST demonstrate robustness: with 50% label-flipping adversaries on MNIST, the mechanism maintains 96.7% accuracy, only 0.3 percentage points lower than in a scenario with 30% label-flipping adversaries. This outcome is 51.7 percentage points better than standard FedAvg, which collapses under the same 50% attack. The mechanism is computationally light, budget-bounded, and readily integrates into existing FL frameworks, offering a practical route to economically robust and sustainable FL ecosystems.
Authors:Pascal Debus, Maximilian Wendlinger, Kilian Tscharke, Daniel Herr, Cedric Brügmann, Daniel Ohl de Mello, Juris Ulmanis, Alexander Erhard, Arthur Schmidt, Fabian Petsch
Abstract:
Quantum Machine Learning (QML) systems inherit vulnerabilities from classical machine learning while introducing new attack surfaces rooted in the physical and algorithmic layers of quantum computing. Despite a growing body of research on individual attack vectors - ranging from adversarial poisoning and evasion to circuit-level backdoors, side-channel leakage, and model extraction - these threats are often analyzed in isolation, with unrealistic assumptions about attacker capabilities and system environments. This fragmentation hampers the development of effective, holistic defense strategies. In this work, we argue that QML security requires more structured modeling of the attack surface, capturing not only individual techniques but also their relationships, prerequisites, and potential impact across the QML pipeline. We propose adapting kill chain models, widely used in classical IT and cybersecurity, to the quantum machine learning context. Such models allow for structured reasoning about attacker objectives, capabilities, and possible multi-stage attack paths - spanning reconnaissance, initial access, manipulation, persistence, and exfiltration. Based on extensive literature analysis, we present a detailed taxonomy of QML attack vectors mapped to corresponding stages in a quantum-aware kill chain framework that is inspired by the MITRE ATLAS for classical machine learning. We highlight interdependencies between physical-level threats (like side-channel leakage and crosstalk faults), data and algorithm manipulation (such as poisoning or circuit backdoors), and privacy attacks (including model extraction and training data inference). This work provides a foundation for more realistic threat modeling and proactive security-in-depth design in the emerging field of quantum machine learning.
Authors:Xiaodong Wu, Tianyi Tang, Xiangman Li, Jianbing Ni, Yong Yu
Abstract:
Watermarking has emerged as a promising solution to counter harmful or deceptive AI-generated content by embedding hidden identifiers that trace content origins. However, the robustness of current watermarking techniques is still largely unexplored, raising critical questions about their effectiveness against adversarial attacks. To address this gap, we examine the robustness of model-specific watermarking, where watermark embedding is integrated with text-to-image generation in models like latent diffusion models. We introduce three attack strategies: edge prediction-based, box blurring, and fine-tuning-based attacks in a no-box setting, where an attacker does not require access to the ground-truth watermark decoder. Our findings reveal that while model-specific watermarking is resilient against basic evasion attempts, such as edge prediction, it is notably vulnerable to blurring and fine-tuning-based attacks. Our best-performing attack achieves a reduction in watermark detection accuracy to approximately 47.92\%. Additionally, we perform an ablation study on factors like message length, kernel size and decoder depth, identifying critical parameters influencing the fine-tuning attack's success. Finally, we assess several advanced watermarking defenses, finding that even the most robust methods, such as multi-label smoothing, result in watermark extraction accuracy that falls below an acceptable level when subjected to our no-box attacks.
Authors:Xiaodong Wu, Xiangman Li, Qi Li, Jianbing Ni, Rongxing Lu
Abstract:
Text-guided image manipulation with diffusion models enables flexible and precise editing based on prompts, but raises ethical and copyright concerns due to potential unauthorized modifications. To address this, we propose SecureT2I, a secure framework designed to prevent unauthorized editing in diffusion-based generative models. SecureT2I is compatible with both general-purpose and domain-specific models and can be integrated via lightweight fine-tuning without architectural changes. We categorize images into a permit set and a forbid set based on editing permissions. For the permit set, the model learns to perform high-quality manipulations as usual. For the forbid set, we introduce training objectives that encourage vague or semantically ambiguous outputs (e.g., blurred images), thereby suppressing meaningful edits. The core challenge is to block unauthorized editing while preserving editing quality for permitted inputs. To this end, we design separate loss functions that guide selective editing behavior. Extensive experiments across multiple datasets and models show that SecureT2I effectively degrades manipulation quality on forbidden images while maintaining performance on permitted ones. We also evaluate generalization to unseen inputs and find that SecureT2I consistently outperforms baselines. Additionally, we analyze different vagueness strategies and find that resize-based degradation offers the best trade-off for secure manipulation control.
Authors:Amirali Sajadi, Kostadin Damevski, Preetha Chatterjee
Abstract:
Large Language Models (LLMs) and their agentic frameworks are increasingly adopted to automate software development tasks such as issue resolution and program repair. While prior work has identified security risks in LLM-generated code, most evaluations have focused on synthetic or isolated settings, leaving open questions about the security of these systems in real-world development contexts. In this study, we present the first large-scale security analysis of LLM-generated patches using 20,000+ issues from the SWE-bench dataset. We evaluate patches produced by a standalone LLM (Llama 3.3) and compare them to developer-written patches. We also assess the security of patches generated by three top-performing agentic frameworks (OpenHands, AutoCodeRover, HoneyComb) on a subset of our data. Finally, we analyze a wide range of code, issue, and project-level factors to understand the conditions under which LLMs and agents are most likely to generate insecure code. Our findings reveal that the standalone LLM introduces nearly 9x more new vulnerabilities than developers, with many of these exhibiting unique patterns not found in developers' code. Agentic workflows also generate a significant number of vulnerabilities, particularly when granting LLMs more autonomy, potentially increasing the likelihood of misinterpreting project context or task requirements. We find that vulnerabilities are more likely to occur in LLM patches associated with a higher number of files, more lines of generated code, and GitHub issues that lack specific code snippets or information about the expected code behavior and steps to reproduce. These results suggest that contextual factors play a critical role in the security of the generated code and point toward the need for proactive risk assessment methods that account for both code and issue-level information to complement existing vulnerability detection tools.
Authors:Blake Bullwinkel, Mark Russinovich, Ahmed Salem, Santiago Zanella-Beguelin, Daniel Jones, Giorgio Severi, Eugenia Kim, Keegan Hines, Amanda Minnich, Yonatan Zunger, Ram Shankar Siva Kumar
Abstract:
Recent research has demonstrated that state-of-the-art LLMs and defenses remain susceptible to multi-turn jailbreak attacks. These attacks require only closed-box model access and are often easy to perform manually, posing a significant threat to the safe and secure deployment of LLM-based systems. We study the effectiveness of the Crescendo multi-turn jailbreak at the level of intermediate model representations and find that safety-aligned LMs often represent Crescendo responses as more benign than harmful, especially as the number of conversation turns increases. Our analysis indicates that at each turn, Crescendo prompts tend to keep model outputs in a "benign" region of representation space, effectively tricking the model into fulfilling harmful requests. Further, our results help explain why single-turn jailbreak defenses like circuit breakers are generally ineffective against multi-turn attacks, motivating the development of mitigations that address this generalization gap.
Authors:Sizhe Chen, Arman Zharmagambetov, David Wagner, Chuan Guo
Abstract:
Prompt injection attacks pose a significant security threat to LLM-integrated applications. Model-level defenses have shown strong effectiveness, but are currently deployed into commercial-grade models in a closed-source manner. We believe open-source models are needed by the AI security community, where co-development of attacks and defenses through open research drives scientific progress in mitigation against prompt injection attacks. To this end, we develop Meta SecAlign, the first open-source and open-weight LLM with built-in model-level defense that achieves commercial-grade model performance. We provide complete details of our training recipe, which utilizes an improved version of the SOTA SecAlign defense. Evaluations on 9 utility benchmarks and 7 security benchmarks show that Meta SecAlign, despite being trained on a generic instruction-tuning dataset, confers security in unseen downstream tasks, including tool-calling and agentic web navigation, in addition general instruction-following. Our best model -- Meta-SecAlign-70B -- achieves state-of-the-art robustness against prompt injection attacks and comparable utility to closed-source commercial LLM with model-level defense.
Authors:Jiangrong Wu, Yuhong Nan, Jianliang Wu, Zitong Yao, Zibin Zheng
Abstract:
The increasing capabilities of LLMs have led to the rapid proliferation of LLM agent apps, where developers enhance LLMs with access to external resources to support complex task execution. Among these, LLM email agent apps represent one of the widely used categories, as email remains a critical communication medium for users. LLM email agents are capable of managing and responding to email using LLM-driven reasoning and autonomously executing user instructions via external email APIs (e.g., send email). However, despite their growing deployment and utility, the security mechanism of LLM email agent apps remains underexplored. Currently, there is no comprehensive study into the potential security risk within these agent apps and their broader implications.
In this paper, we conduct the first in-depth and systematic security study of LLM email agents. We propose the Email Agent Hijacking (EAH) attack, which overrides the original prompts of the email agent via external email resources, allowing attackers to gain control of the email agent remotely and further perform specific attack scenarios without user awareness.
To facilitate the large-scale evaluation, we propose EAHawk, a pipeline to evaluate the EAH attack of LLM email agent apps. By EAHawk, we performed an empirical study spanning 14 representative LLM agent frameworks, 63 agent apps, 12 LLMs, and 20 email services, which led to the generation of 1,404 real-world email agent instances for evaluation. Experimental results indicate that all 1,404 instances were successfully hijacked; on average, only 2.03 attack attempts are required to control an email agent instance. Even worse, for some LLMs, the average number of attempts needed to achieve full agent control drops to as few as 1.23.
Authors:Ailiya Borjigin, Wei Zhou, Cong He
Abstract:
Alternative Assets tokenization is transforming non-traditional financial instruments are represented and traded on the web. However, ensuring trustworthiness in web-based tokenized ecosystems poses significant challenges, from verifying off-chain asset data to enforcing regulatory compliance. This paper proposes an AI-governed agent architecture that integrates intelligent agents with blockchain to achieve web-trustworthy tokenization of alternative assets. In the proposed architecture, autonomous agents orchestrate the tokenization process (asset verification, valuation, compliance checking, and lifecycle management), while an AI-driven governance layer monitors agent behavior and enforces trust through adaptive policies and cryptoeconomic incentives. We demonstrate that this approach enhances transparency, security, and compliance in asset tokenization, addressing key concerns around data authenticity and fraud. A case study on tokenizing real estate assets illustrates how the architecture mitigates risks (e.g., fraudulent listings and money laundering) through real-time AI anomaly detection and on-chain enforcement. Our evaluation and analysis suggest that combining AI governance with multi-agent systems and blockchain can significantly bolster trust in tokenized asset ecosystems. This work offers a novel framework for trustworthy asset tokenization on the web and provides insights for practitioners aiming to deploy secure, compliant tokenization platforms.
Authors:Tianze Wang, Zhaoyu Chen, Jian Du, Yingtai Xiao, Linjun Zhang, Qiang Yan
Abstract:
Text data has become extremely valuable on large language models (LLMs) and even lead to general artificial intelligence (AGI). A lot of high-quality text in the real world is private and cannot be freely used due to privacy concerns. Therefore, differentially private (DP) synthetic text generation has been proposed, aiming to produce high-utility synthetic data while protecting sensitive information. However, existing DP synthetic text generation imposes uniform guarantees that often overprotect non-sensitive content, resulting in substantial utility loss and computational overhead. Therefore, we propose Secret-Protected Evolution (SecPE), a novel framework that extends private evolution with secret-aware protection. Theoretically, we show that SecPE satisfies $(\mathrm{p}, \mathrm{r})$-secret protection, constituting a relaxation of Gaussian DP that enables tighter utility-privacy trade-offs, while also substantially reducing computational complexity relative to baseline methods. Empirically, across the OpenReview, PubMed, and Yelp benchmarks, SecPE consistently achieves lower Fréchet Inception Distance (FID) and higher downstream task accuracy than GDP-based Aug-PE baselines, while requiring less noise to attain the same level of protection. Our results highlight that secret-aware guarantees can unlock more practical and effective privacy-preserving synthetic text generation.
Authors:Xihan Xiong, Zhipeng Wang, Qin Wang, William Knottenbelt
Abstract:
Decentralized communication is becoming an important use case within Web3. On Ethereum, users can repurpose the transaction input data field to embed natural-language messages, commonly known as Input Data Messages (IDMs). However, as IDMs gain wider adoption, there has been a growing volume of toxic content on-chain. This trend is concerning, as Ethereum provides no protocol-level support for content moderation. We propose two moderation frameworks for Ethereum IDMs: (i) BUILDERMOD, where builders perform semantic checks during block construction; and (ii) USERMOD, where users proactively obtain moderation proofs from external classifiers and embed them in transactions. Our evaluation reveals that BUILDERMOD incurs high block-time overhead, which limits its practicality. In contrast, USERMOD enables lower-latency validation and scales more effectively, making it a more practical approach in moderation-aware Ethereum environments. Our study lays the groundwork for protocol-level content governance in decentralized systems, and we hope it contributes to the development of a decentralized communication environment that is safe, trustworthy, and socially responsible.
Authors:Chaofang Shi, Zhongwen Li, Xiaoqi Li
Abstract:
System passwords serve as critical credentials for user authentication and access control when logging into operating systems or applications. Upon entering a valid password, users pass verification to access system resources and execute corresponding operations. In recent years, frequent password cracking attacks targeting system passwords have posed a severe threat to information system security. To address this challenge, in-depth research into password cracking attack methods and defensive technologies holds significant importance. This paper conducts systematic research on system password security, focusing on analyzing typical password cracking methods such as brute force attacks, dictionary attacks, and rainbow table attacks, while evaluating the effectiveness of existing defensive measures. The experimental section utilizes common cryptanalysis tools, such as John the Ripper and Hashcat, to simulate brute force and dictionary attacks. Five test datasets, each generated using Message Digest Algorithm 5 (MD5), Secure Hash Algorithm 256-bit (SHA 256), and bcrypt hash functions, are analyzed. By comparing the overall performance of different hash algorithms and password complexity strategies against these attacks, the effectiveness of defensive measures such as salting and slow hashing algorithms is validated. Building upon this foundation, this paper further evaluates widely adopted defense mechanisms, including account lockout policies, multi-factor authentication, and risk adaptive authentication. By integrating experimental data with recent research findings, it analyzes the strengths and limitations of each approach while proposing feasible improvement recommendations and optimization strategies.
Authors:Ayush Kumar, Vrizlynn L. L. Thing
Abstract:
With the proliferation of new blockchain-based cryptocurrencies/assets and platforms that make it possible to transact across them, it becomes important to consider not just whether the transfer of coins/assets can be tracked within their respective transaction ledger, but also if they can be tracked as they move across ledgers. This is especially important given that there are documented cases of criminals attempting to use these cross-ledger trades to obscure the flow of their coins/assets. In this paper, we perform a systematic review of the various tracing techniques for blockchain transactions proposed in literature, categorize them using multiple criteria (such as tracing approach and targeted objective) and compare them. Based on the above categorization, we provide insights on the state of blockchain transaction tracing literature and identify the limitations of existing approaches. Finally, we suggest directions for future research in this area based on our analysis.
Authors:Yifan Zhu, Lijia Yu, Xiao-Shan Gao
Abstract:
In recent years, data poisoning attacks have been increasingly designed to appear harmless and even beneficial, often with the intention of verifying dataset ownership or safeguarding private data from unauthorized use. However, these developments have the potential to cause misunderstandings and conflicts, as data poisoning has traditionally been regarded as a security threat to machine learning systems. To address this issue, it is imperative for harmless poisoning generators to claim ownership of their generated datasets, enabling users to identify potential poisoning to prevent misuse. In this paper, we propose the deployment of watermarking schemes as a solution to this challenge. We introduce two provable and practical watermarking approaches for data poisoning: {\em post-poisoning watermarking} and {\em poisoning-concurrent watermarking}. Our analyses demonstrate that when the watermarking length is $Θ(\sqrt{d}/ε_w)$ for post-poisoning watermarking, and falls within the range of $Θ(1/ε_w^2)$ to $O(\sqrt{d}/ε_p)$ for poisoning-concurrent watermarking, the watermarked poisoning dataset provably ensures both watermarking detectability and poisoning utility, certifying the practicality of watermarking under data poisoning attacks. We validate our theoretical findings through experiments on several attacks, models, and datasets.
Authors:Eshika Saxena, Alberto Alfarano, François Charton, Emily Wenger, Kristin Lauter
Abstract:
AI-powered attacks on Learning with Errors (LWE), an important hard math problem in post-quantum cryptography, rival or outperform "classical" attacks on LWE under certain parameter settings. Despite the promise of this approach, a dearth of accessible data limits AI practitioners' ability to study and improve these attacks. Creating LWE data for AI model training is time- and compute-intensive and requires significant domain expertise. To fill this gap and accelerate AI research on LWE attacks, we propose the TAPAS datasets, a Toolkit for Analysis of Post-quantum cryptography using AI Systems. These datasets cover several LWE settings and can be used off-the-shelf by AI practitioners to prototype new approaches to cracking LWE. This work documents TAPAS dataset creation, establishes attack performance baselines, and lays out directions for future work.
Authors:Yihao Peng, Biao Ma, Hai Wan, Xibin Zhao
Abstract:
Modern web application recovery presents a critical dilemma. Coarse-grained snapshot rollbacks cause unacceptable data loss for legitimate users. Surgically removing an attack's impact is hindered by a fundamental challenge in high-concurrency environments: it is difficult to attribute resulting file and database modifications to a specific attack-related request. We present ANCORA, a system for precise intrusion recovery in web applications without invasive instrumentation. ANCORA first isolates the full sequence of syscalls triggered by a single malicious request. Based on this sequence, ANCORA addresses file and database modifications separately. To trace file changes, it builds a provenance graph that reveals all modifications, including those by exploit-spawned processes. To attribute database operations, a more difficult challenge due to connection pooling, ANCORA introduces a novel spatiotemporal anchor. This anchor uses the request's network connection tuple and active time window to pinpoint exact database operations. With all malicious file and database operations precisely identified, ANCORA performs a unified rewind and selective replay recovery. It reverts the system to a clean snapshot taken before the attack, then selectively re-applies only legitimate operations to both the file system and database. This completely removes the attack's effects while preserving concurrent legitimate data. We evaluated ANCORA on 10 web applications and 20 CVE-based attack scenarios with concurrency up to 150 connections. Experiments demonstrate ANCORA achieves 99.9% recovery accuracy with manageable overhead: up to 19.8% response latency increase and 17.8% QPS decrease in worst cases, and recovery throughput of 110.7 database operations per second and 27.2 affected files per second, effectively preserving legitimate data.
Authors:Haowen Xu, Tianya Zhao, Xuyu Wang, Lei Ma, Jun Dai, Alexander Wyglinski, Xiaoyan Sun
Abstract:
Palm recognition has emerged as a dominant biometric authentication technology in critical infrastructure. These systems operate in either single-modal form, using palmprint or palmvein individually, or dual-modal form, fusing the two modalities. Despite this diversity, they share similar hardware architectures that inadvertently emit electromagnetic (EM) signals during operation. Our research reveals that these EM emissions leak palm biometric information, motivating us to develop EMPalm--an attack framework that covertly recovers both palmprint and palmvein images from eavesdropped EM signals. Specifically, we first separate the interleaved transmissions of the two modalities, identify and combine their informative frequency bands, and reconstruct the images. To further enhance fidelity, we employ a diffusion model to restore fine-grained biometric features unique to each domain. Evaluations on seven prototype and two commercial palm acquisition devices show that EMPalm can recover palm biometric information with high visual fidelity, achieving SSIM scores up to 0.79, PSNR up to 29.88 dB, and FID scores as low as 6.82 across all tested devices, metrics that collectively demonstrate strong structural similarity, high signal quality, and low perceptual discrepancy. To assess the practical implications of the attack, we further evaluate it against four state-of-the-art palm recognition models, achieving a model-wise average spoofing success rate of 65.30% over 6,000 samples from 100 distinct users.
Authors:Artur Horal, Daniel Pina, Henrique Paz, Iago Paulo, João Soares, Rafael Ferreira, Diogo Tavares, Diogo Glória-Silva, João Magalhães, David Semedo
Abstract:
This paper presents the vision, scientific contributions, and technical details of RedTWIZ: an adaptive and diverse multi-turn red teaming framework, to audit the robustness of Large Language Models (LLMs) in AI-assisted software development. Our work is driven by three major research streams: (1) robust and systematic assessment of LLM conversational jailbreaks; (2) a diverse generative multi-turn attack suite, supporting compositional, realistic and goal-oriented jailbreak conversational strategies; and (3) a hierarchical attack planner, which adaptively plans, serializes, and triggers attacks tailored to specific LLM's vulnerabilities. Together, these contributions form a unified framework -- combining assessment, attack generation, and strategic planning -- to comprehensively evaluate and expose weaknesses in LLMs' robustness. Extensive evaluation is conducted to systematically assess and analyze the performance of the overall system and each component. Experimental results demonstrate that our multi-turn adversarial attack strategies can successfully lead state-of-the-art LLMs to produce unsafe generations, highlighting the pressing need for more research into enhancing LLM's robustness.
Authors:Rishika Bhagwatkar, Kevin Kasa, Abhay Puri, Gabriel Huang, Irina Rish, Graham W. Taylor, Krishnamurthy Dj Dvijotham, Alexandre Lacoste
Abstract:
AI agents are vulnerable to indirect prompt injection attacks, where malicious instructions embedded in external content or tool outputs cause unintended or harmful behavior. Inspired by the well-established concept of firewalls, we show that a simple, modular and model-agnostic defense operating at the agent--tool interface achieves perfect security (0% or the lowest possible attack success rate) with high utility (task success rate) across four public benchmarks: AgentDojo, Agent Security Bench, InjecAgent and tau-Bench, while achieving a state-of-the-art security-utility tradeoff compared to prior results. Specifically, we employ a defense based on two firewalls: a Tool-Input Firewall (Minimizer) and a Tool-Output Firewall (Sanitizer). Unlike prior complex approaches, this firewall defense makes minimal assumptions on the agent and can be deployed out-of-the-box, while maintaining strong performance without compromising utility. However, our analysis also reveals critical limitations in these existing benchmarks, including flawed success metrics, implementation bugs, and most importantly, weak attacks, hindering significant progress in the field. To foster more meaningful progress, we present targeted fixes to these issues for AgentDojo and Agent Security Bench while proposing best-practices for more robust benchmark design. Further, we demonstrate that although these firewalls push the state-of-the-art on existing benchmarks, it is still possible to bypass them in practice, underscoring the need to incorporate stronger attacks in security benchmarks. Overall, our work shows that existing agentic security benchmarks are easily saturated by a simple approach and highlights the need for stronger agentic security benchmarks with carefully chosen evaluation metrics and strong adaptive attacks.
Authors:Maraz Mia, Mir Mehedi A. Pritom
Abstract:
Explainable Artificial Intelligence (XAI) has aided machine learning (ML) researchers with the power of scrutinizing the decisions of the black-box models. XAI methods enable looking deep inside the models' behavior, eventually generating explanations along with a perceived trust and transparency. However, depending on any specific XAI method, the level of trust can vary. It is evident that XAI methods can themselves be a victim of post-adversarial attacks that manipulate the expected outcome from the explanation module. Among such attack tactics, fairwashing explanation (FE), manipulation explanation (ME), and backdoor-enabled manipulation attacks (BD) are the notable ones. In this paper, we try to understand these adversarial attack techniques, tactics, and procedures (TTPs) on explanation alteration and thus the effect on the model's decisions. We have explored a total of six different individual attack procedures on post-hoc explanation methods such as SHAP (SHapley Additive exPlanations), LIME (Local Interpretable Model-agnostic Explanation), and IG (Integrated Gradients), and investigated those adversarial attacks in cybersecurity applications scenarios such as phishing, malware, intrusion, and fraudulent website detection. Our experimental study reveals the actual effectiveness of these attacks, thus providing an urgency for immediate attention to enhance the resiliency of XAI methods and their applications.
Authors:Tanqiu Jiang, Min Bai, Nikolaos Pappas, Yanjun Qi, Sandesh Swamy
Abstract:
Vision-language model (VLM)-based web agents increasingly power high-stakes selection tasks like content recommendation or product ranking by combining multimodal perception with preference reasoning. Recent studies reveal that these agents are vulnerable against attackers who can bias selection outcomes through preference manipulations using adversarial pop-ups, image perturbations, or content tweaks. Existing work, however, either assumes strong white-box access, with limited single-modal perturbations, or uses impractical settings. In this paper, we demonstrate, for the first time, that joint exploitation of visual and textual channels yields significantly more powerful preference manipulations under realistic attacker capabilities. We introduce Cross-Modal Preference Steering (CPS) that jointly optimizes imperceptible modifications to an item's visual and natural language descriptions, exploiting CLIP-transferable image perturbations and RLHF-induced linguistic biases to steer agent decisions. In contrast to prior studies that assume gradient access, or control over webpages, or agent memory, we adopt a realistic black-box threat setup: a non-privileged adversary can edit only their own listing's images and textual metadata, with no insight into the agent's model internals. We evaluate CPS on agents powered by state-of-the-art proprietary and open source VLMs including GPT-4.1, Qwen-2.5VL and Pixtral-Large on both movie selection and e-commerce tasks. Our results show that CPS is significantly more effective than leading baseline methods. For instance, our results show that CPS consistently outperforms baselines across all models while maintaining 70% lower detection rates, demonstrating both effectiveness and stealth. These findings highlight an urgent need for robust defenses as agentic systems play an increasingly consequential role in society.
Authors:Zeya Chen, Jianing Wen, Ruth Schmidt, Yaxing Yao, Toby Jia-Jun Li, Tianshi Li
Abstract:
UX professionals routinely conduct design reviews, yet privacy concerns are often overlooked -- not only due to limited tools, but more critically because of low intrinsic motivation. Limited privacy knowledge, weak empathy for unexpectedly affected users, and low confidence in identifying harms make it difficult to address risks. We present PrivacyMotiv, an LLM-powered system that supports privacy-oriented design diagnosis by generating speculative personas with UX user journeys centered on individuals vulnerable to privacy risks. Drawing on narrative strategies, the system constructs relatable and attention-drawing scenarios that show how ordinary design choices may cause unintended harms, expanding the scope of privacy reflection in UX. In a within-subjects study with professional UX practitioners (N=16), we compared participants' self-proposed methods with PrivacyMotiv across two privacy review tasks. Results show significant improvements in empathy, intrinsic motivation, and perceived usefulness. This work contributes a promising privacy review approach which addresses the motivational barriers in privacy-aware UX.
Authors:Jaiden Fairoze, Sanjam Garg, Keewoo Lee, Mingyuan Wang
Abstract:
As large language models (LLMs) advance, ensuring AI safety and alignment is paramount. One popular approach is prompt guards, lightweight mechanisms designed to filter malicious queries while being easy to implement and update. In this work, we introduce a new attack that circumvents such prompt guards, highlighting their limitations. Our method consistently jailbreaks production models while maintaining response quality, even under the highly protected chat interfaces of Google Gemini (2.5 Flash/Pro), DeepSeek Chat (DeepThink), Grok (3), and Mistral Le Chat (Magistral). The attack exploits a resource asymmetry between the prompt guard and the main LLM, encoding a jailbreak prompt that lightweight guards cannot decode but the main model can. This reveals an attack surface inherent to lightweight prompt guards in modern LLM architectures and underscores the need to shift defenses from blocking malicious inputs to preventing malicious outputs. We additionally identify other critical alignment issues, such as copyrighted data extraction, training data extraction, and malicious response leakage during thinking.
Authors:Dominik Apel, Zeta Avarikioti, Matteo Maffei, Yuheng Wang
Abstract:
Rollup protocols have recently received significant attention as a promising class of Layer 2 (L2) scalability solutions. By utilizing the Layer 1 (L1) blockchain solely as a bulletin board for a summary of the executed transactions and state changes, rollups enable secure off-chain execution while avoiding the complexity of other L2 mechanisms. However, to ensure data availability, current rollup protocols require the plaintext of executed transactions to be published on-chain, resulting in inherent privacy limitations. In this paper, we address this problem by introducing Calyx, the first privacy-preserving multi-token optimistic-Rollup protocol. Calyx guarantees full payment privacy for all L2 transactions, revealing no information about the sender, recipient, transferred amount, or token type. The protocol further supports atomic execution of multiple multi-token transactions and introduces a transaction fee scheme to enable broader application scenarios while ensuring the sustainable operation of the protocol. To enforce correctness, Calyx adopts an efficient one-step fraud-proof mechanism. We analyze the security and privacy guarantees of the protocol and provide an implementation and evaluation. Our results show that executing a single transaction costs approximately $0.06 (0.00002 ETH) and incurs only constant-size on-chain cost in asymptotic terms.
Authors:Boyang Zhang, Istemi Ekin Akkus, Ruichuan Chen, Alice Dethise, Klaus Satzke, Ivica Rimac, Yang Zhang
Abstract:
Multimodal large language models (MLLMs) have demonstrated remarkable capabilities in processing and reasoning over diverse modalities, but their advanced abilities also raise significant privacy concerns, particularly regarding Personally Identifiable Information (PII) leakage. While relevant research has been conducted on single-modal language models to some extent, the vulnerabilities in the multimodal setting have yet to be fully investigated. In this work, we investigate these emerging risks with a focus on vision language models (VLMs), a representative subclass of MLLMs that covers the two modalities most relevant for PII leakage, vision and text. We introduce a concept-guided mitigation approach that identifies and modifies the model's internal states associated with PII-related content. Our method guides VLMs to refuse PII-sensitive tasks effectively and efficiently, without requiring re-training or fine-tuning. We also address the current lack of multimodal PII datasets by constructing various ones that simulate real-world scenarios. Experimental results demonstrate that the method can achieve an average refusal rate of 93.3% for various PII-related tasks with minimal impact on unrelated model performances. We further examine the mitigation's performance under various conditions to show the adaptability of our proposed method.
Authors:Yonatan Gizachew Achamyeleh, Tongtao Zhang, Joshua Hyunki Kim, Gabriel Garcia, Shih-Yuan Yu, Anton Kocheturov, Mohammad Abdullah Al Faruque
Abstract:
Function name prediction is crucial for understanding stripped binaries in software reverse engineering, a key step for \textbf{enabling subsequent vulnerability analysis and patching}. However, existing approaches often struggle with architecture-specific limitations, data scarcity, and diverse naming conventions. We present AGNOMIN, a novel architecture-agnostic approach for multi-label function name prediction in stripped binaries. AGNOMIN builds Feature-Enriched Hierarchical Graphs (FEHGs), combining Control Flow Graphs, Function Call Graphs, and dynamically learned \texttt{PCode} features. A hierarchical graph neural network processes this enriched structure to generate consistent function representations across architectures, vital for \textbf{scalable security assessments}. For function name prediction, AGNOMIN employs a Renée-inspired decoder, enhanced with an attention-based head layer and algorithmic improvements. We evaluate AGNOMIN on a comprehensive dataset of 9,000 ELF executable binaries across three architectures, demonstrating its superior performance compared to state-of-the-art approaches, with improvements of up to 27.17\% in precision and 55.86\% in recall across the testing dataset. Moreover, AGNOMIN generalizes well to unseen architectures, achieving 5.89\% higher recall than the closest baseline. AGNOMIN's practical utility has been validated through security hackathons, where it successfully aided reverse engineers in analyzing and patching vulnerable binaries across different architectures.
Authors:Yonatan Gizachew Achamyeleh, Yang Xiang, Yun-Ping Hsiao, Yasamin Moghaddas, Mohammad Abdullah Al Faruque
Abstract:
The growing complexity of global supply chains has made hardware Trojans a significant threat in sensor-based power electronics. Traditional Trojan designs depend on digital triggers or fixed threshold conditions that can be detected during standard testing. In contrast, we introduce Environmental Rate Manipulation (ERM), a novel Trojan triggering mechanism that activates by monitoring the rate of change in environmental parameters rather than their absolute values. This approach allows the Trojan to remain inactive under normal conditions and evade redundancy and sensor-fusion defenses. We implement a compact 14~$μ$m$^2$ circuit that measures capacitor charging rates in standard sensor front-ends and disrupts inverter pulse-width modulation PWM signals when a rapid change is induced. Experiments on a commercial Texas Instruments solar inverter demonstrate that ERM can trigger catastrophic driver chip failure. Furthermore, ETAP simulations indicate that a single compromised 100~kW inverter may initiate cascading grid instabilities. The attack's significance extends beyond individual sensors to entire classes of environmental sensing systems common in power electronics, demonstrating fundamental challenges for hardware security.
Authors:Maraz Mia, Mir Mehedi A. Pritom, Tariqul Islam, Shouhuai Xu
Abstract:
Cybercrimes such as online scams and fraud have become prevalent. Cybercriminals often abuse various global or regional events as themes of their fraudulent activities to breach user trust and attain a higher attack success rate. These attacks attempt to manipulate and deceive innocent people into interacting with meticulously crafted websites with malicious payloads, phishing, or fraudulent transactions. To deepen our understanding of the problem, this paper investigates how to characterize event-themed malicious website-based campaigns, with a case study on war-themed websites. We find that attackers tailor their attacks by exploiting the unique aspects of events, as evidenced by activities such as fundraising, providing aid, collecting essential supplies, or seeking updated news. We use explainable unsupervised clustering methods to draw further insights, which could guide the design of effective early defenses against various event-themed malicious web campaigns.
Authors:Tereza Burianová, Martin PereÅ¡Ãni, Ivan Homoliak
Abstract:
Proposer anonymity in Proof-of-Stake (PoS) blockchains is a critical concern due to the risk of targeted attacks such as malicious denial-of-service (DoS) and censorship attacks. While several Secret Single Leader Election (SSLE) mechanisms have been proposed to address these threats, their practical impact and trade-offs remain insufficiently explored. In this work, we present a unified experimental framework for evaluating SSLE mechanisms under adversarial conditions, grounded in a simplified yet representative model of Ethereum's PoS consensus layer. The framework includes configurable adversaries capable of launching targeted DoS and censorship attacks, including coordinated strategies that simultaneously compromise groups of validators. We simulate and compare key protection mechanisms - Whisk, and homomorphic sortition. To the best of our knowledge, this is the first comparative study to examine adversarial DoS scenarios involving multiple attackers under diverse protection mechanisms. Our results show that while both designs offer strong protection against targeted DoS attacks on the leader, neither defends effectively against coordinated attacks on validator groups. Moreover, Whisk simplifies a DoS attack by narrowing the target set from all validators to a smaller list of known candidates. Homomorphic sortition, despite its theoretical strength, remains impractical due to the complexity of cryptographic operations over large validator sets.
Authors:Xingyu Li, Juefei Pu, Yifan Wu, Xiaochen Zou, Shitong Zhu, Xiaochen Zou, Shitong Zhu, Qiushi Wu, Zheng Zhang, Joshua Hsu, Yue Dong, Zhiyun Qian, Kangjie Lu, Trent Jaeger, Michael De Lucia, Srikanth V. Krishnamurthy
Abstract:
Open-source software projects are foundational to modern software ecosystems, with the Linux kernel standing out as a critical exemplar due to its ubiquity and complexity. Although security patches are continuously integrated into the Linux mainline kernel, downstream maintainers often delay their adoption, creating windows of vulnerability. A key reason for this lag is the difficulty in identifying security-critical patches, particularly those addressing exploitable vulnerabilities such as out-of-bounds (OOB) accesses and use-after-free (UAF) bugs. This challenge is exacerbated by intentionally silent bug fixes, incomplete or missing CVE assignments, delays in CVE issuance, and recent changes to the CVE assignment criteria for the Linux kernel. While fine-grained patch classification approaches exist, they exhibit limitations in both coverage and accuracy. In this work, we identify previously unexplored opportunities to significantly improve fine-grained patch classification. Specifically, by leveraging cues from commit titles/messages and diffs alongside appropriate code context, we develop DUALLM, a dual-method pipeline that integrates two approaches based on a Large Language Model (LLM) and a fine-tuned small language model. DUALLM achieves 87.4% accuracy and an F1-score of 0.875, significantly outperforming prior solutions. Notably, DUALLM successfully identified 111 of 5,140 recent Linux kernel patches as addressing OOB or UAF vulnerabilities, with 90 true positives confirmed by manual verification (many do not have clear indications in patch descriptions). Moreover, we constructed proof-of-concepts for two identified bugs (one UAF and one OOB), including one developed to conduct a previously unknown control-flow hijack as further evidence of the correctness of the classification.
Authors:Haochen Gong, Chenxiao Li, Rui Chang, Wenbo Shen
Abstract:
Large language model (LLM)-based computer-use agents represent a convergence of AI and OS capabilities, enabling natural language to control system- and application-level functions. However, due to LLMs' inherent uncertainty issues, granting agents control over computers poses significant security risks. When agent actions deviate from user intentions, they can cause irreversible consequences. Existing mitigation approaches, such as user confirmation and LLM-based dynamic action validation, still suffer from limitations in usability, security, and performance. To address these challenges, we propose CSAgent, a system-level, static policy-based access control framework for computer-use agents. To bridge the gap between static policy and dynamic context and user intent, CSAgent introduces intent- and context-aware policies, and provides an automated toolchain to assist developers in constructing and refining them. CSAgent enforces these policies through an optimized OS service, ensuring that agent actions can only be executed under specific user intents and contexts. CSAgent supports protecting agents that control computers through diverse interfaces, including API, CLI, and GUI. We implement and evaluate CSAgent, which successfully defends against more than 99.36% of attacks while introducing only 6.83% performance overhead.
Authors:Ayush Kumar, Kar Wai Fok, Vrizlynn L. L. Thing
Abstract:
Despite all the advantages associated with Network Intrusion Detection Systems (NIDSs) that utilize machine learning (ML) models, there is a significant reluctance among cyber security experts to implement these models in real-world production settings. This is primarily because of their opaque nature, meaning it is unclear how and why the models make their decisions. In this work, we design a deep learning-based NIDS, ExpIDS to have high decision tree explanation fidelity, i.e., the predictions of decision tree explanation corresponding to ExpIDS should be as close to ExpIDS's predictions as possible. ExpIDS can also adapt to changes in network traffic distribution (drift). With the help of extensive experiments, we verify that ExpIDS achieves higher decision tree explanation fidelity and a malicious traffic detection performance comparable to state-of-the-art NIDSs for common attacks with varying levels of real-world drift.
Authors:Johannes Jakob Meyer, Asad Raza, Jacopo Rizzo, Lorenzo Leone, Sofiene Jerbi, Jens Eisert
Abstract:
Our capacity to process information depends on the computational power at our disposal. Information theory captures our ability to distinguish states or communicate messages when it is unconstrained with unrivaled elegance. For computationally bounded observers the situation is quite different. They can, for example, be fooled to believe that distributions are more random than they actually are. In our work, we go beyond the prevailing single-shot approach and take a new direction in computational quantum information theory that captures the essence of complexity-constrained information theory while retaining the look and feel of the unbounded asymptotic theory. As our foundational quantity, we define the computational relative entropy as the optimal error exponent in asymmetric hypothesis testing when restricted to polynomially many copies and quantum gates, defined in a mathematically rigorous way. Building on this foundation, we prove a computational analogue of Stein's lemma, establish computational versions of fundamental inequalities like Pinsker's bound, and demonstrate a computational smoothing property showing that computationally indistinguishable states yield equivalent information measures. We derive a computational entropy that operationally characterizes optimal compression rates for quantum states under computational limitations and show that our quantities apply to computational entanglement theory, proving a computational version of the Rains bound. Our framework reveals striking separations between computational and unbounded information measures, including quantum-classical gaps that arise from cryptographic assumptions, demonstrating that computational constraints fundamentally alter the information-theoretic landscape and open new research directions at the intersection of quantum information, complexity theory, and cryptography.
Authors:Atousa Arzanipour, Rouzbeh Behnia, Reza Ebrahimi, Kaushik Dutta
Abstract:
Retrieval-Augmented Generation (RAG) is an emerging approach in natural language processing that combines large language models (LLMs) with external document retrieval to produce more accurate and grounded responses. While RAG has shown strong potential in reducing hallucinations and improving factual consistency, it also introduces new privacy and security challenges that differ from those faced by traditional LLMs. Existing research has demonstrated that LLMs can leak sensitive information through training data memorization or adversarial prompts, and RAG systems inherit many of these vulnerabilities. At the same time, reliance of RAG on an external knowledge base opens new attack surfaces, including the potential for leaking information about the presence or content of retrieved documents, or for injecting malicious content to manipulate model behavior. Despite these risks, there is currently no formal framework that defines the threat landscape for RAG systems. In this paper, we address a critical gap in the literature by proposing, to the best of our knowledge, the first formal threat model for retrieval-RAG systems. We introduce a structured taxonomy of adversary types based on their access to model components and data, and we formally define key threat vectors such as document-level membership inference and data poisoning, which pose serious privacy and integrity risks in real-world deployments. By establishing formal definitions and attack models, our work lays the foundation for a more rigorous and principled understanding of privacy and security in RAG systems.
Authors:Lauren Deason, Adam Bali, Ciprian Bejean, Diana Bolocan, James Crnkovich, Ioana Croitoru, Krishna Durai, Chase Midler, Calin Miron, David Molnar, Brad Moon, Bruno Ostarcevic, Alberto Peltea, Matt Rosenberg, Catalin Sandu, Arthur Saputkin, Sagar Shah, Daniel Stan, Ernest Szocs, Shengye Wan, Spencer Whitman, Sven Krasser, Joshua Saxe
Abstract:
Today's cyber defenders are overwhelmed by a deluge of security alerts, threat intelligence signals, and shifting business context, creating an urgent need for AI systems to enhance operational security work. While Large Language Models (LLMs) have the potential to automate and scale Security Operations Center (SOC) operations, existing evaluations do not fully assess the scenarios most relevant to real-world defenders. This lack of informed evaluation impacts both AI developers and those applying LLMs to SOC automation. Without clear insight into LLM performance in real-world security scenarios, developers lack a north star for development, and users cannot reliably select the most effective models. Meanwhile, malicious actors are using AI to scale cyber attacks, highlighting the need for open source benchmarks to drive adoption and community-driven improvement among defenders and model developers. To address this, we introduce CyberSOCEval, a new suite of open source benchmarks within CyberSecEval 4. CyberSOCEval includes benchmarks tailored to evaluate LLMs in two tasks: Malware Analysis and Threat Intelligence Reasoning--core defensive domains with inadequate coverage in current benchmarks. Our evaluations show that larger, more modern LLMs tend to perform better, confirming the training scaling laws paradigm. We also find that reasoning models leveraging test time scaling do not achieve the same boost as in coding and math, suggesting these models have not been trained to reason about cybersecurity analysis, and pointing to a key opportunity for improvement. Finally, current LLMs are far from saturating our evaluations, showing that CyberSOCEval presents a significant challenge for AI developers to improve cyber defense capabilities.
Authors:Yonghao Ni, Zhongwen Li, Xiaoqi Li
Abstract:
With the rapid development of Internet technologies, web systems have become essential infrastructures for modern information exchange and business operations. However, alongside their expansion, numerous security vulnerabilities have emerged, making web security a critical research focus within the broader field of cybersecurity. These issues are closely related to data protection, privacy preservation, and business continuity, and systematic research on web security is crucial for mitigating malicious attacks and enhancing the reliability and robustness of network systems. This paper first reviews the OWASP Top 10, summarizing the types, causes, and impacts of common web vulnerabilities, and illustrates their exploitation mechanisms through representative cases. Building upon this, the Gruyere platform is adopted as an experimental subject for analyzing known vulnerabilities. The study presents detailed reproduction steps for specific vulnerabilities, proposes comprehensive remediation strategies, and further compares Gruyere's vulnerabilities with contemporary real-world cases. The findings suggest that, although Gruyere's vulnerabilities are relatively outdated, their underlying principles remain highly relevant for explaining a wide range of modern security flaws. Overall, this research demonstrates that web system security analysis based on Gruyere not only deepens the understanding of vulnerability mechanisms but also provides practical support for technological innovation and security defense.
Authors:Md Bokhtiar Al Zami, Md Raihan Uddin, Dinh C. Nguyen
Abstract:
Federated learning (FL) has gained popularity as a privacy-preserving method of training machine learning models on decentralized networks. However to ensure reliable operation of UAV-assisted FL systems, issues like as excessive energy consumption, communication inefficiencies, and security vulnerabilities must be solved. This paper proposes an innovative framework that integrates Digital Twin (DT) technology and Zero-Knowledge Federated Learning (zkFed) to tackle these challenges. UAVs act as mobile base stations, allowing scattered devices to train FL models locally and upload model updates for aggregation. By incorporating DT technology, our approach enables real-time system monitoring and predictive maintenance, improving UAV network efficiency. Additionally, Zero-Knowledge Proofs (ZKPs) strengthen security by allowing model verification without exposing sensitive data. To optimize energy efficiency and resource management, we introduce a dynamic allocation strategy that adjusts UAV flight paths, transmission power, and processing rates based on network conditions. Using block coordinate descent and convex optimization techniques, our method significantly reduces system energy consumption by up to 29.6% compared to conventional FL approaches. Simulation results demonstrate improved learning performance, security, and scalability, positioning this framework as a promising solution for next-generation UAV-based intelligent networks.
Authors:Clara Maathuis, Kasper Cools
Abstract:
In today's evolving threat landscape, ensuring digital sovereignty has become mandatory for military organizations, especially given their increased development and investment in AI-driven cyber security solutions. To this end, a multi-angled framework is proposed in this article in order to define and assess digital sovereign control of data and AI-based models for military cyber security. This framework focuses on aspects such as context, autonomy, stakeholder involvement, and mitigation of risks in this domain. Grounded on the concepts of digital sovereignty and data sovereignty, the framework aims to protect sensitive defence assets against threats such as unauthorized access, ransomware, and supply-chain attacks. This approach reflects the multifaceted nature of digital sovereignty by preserving operational autonomy, assuring security and safety, securing privacy, and fostering ethical compliance of both military systems and decision-makers. At the same time, the framework addresses interoperability challenges among allied forces, strategic and legal considerations, and the integration of emerging technologies by considering a multidisciplinary approach that enhances the resilience and preservation of control over (critical) digital assets. This is done by adopting a design oriented research where systematic literature review is merged with critical thinking and analysis of field incidents in order to assure the effectivity and realism of the framework proposed.
Authors:Ouassim Karrakchou, Alaa Zniber, Anass Sebbar, Mounir Ghogho
Abstract:
Distributed Denial of Service (DDoS) attacks pose a persistent threat to network security, requiring timely and scalable mitigation strategies. In this paper, we propose a novel collaborative architecture that integrates a P4-programmable data plane with an SDN control plane to enable real-time DDoS detection and response. At the core of our approach is a split early-exit neural network that performs partial inference in the data plane using a quantized Convolutional Neural Network (CNN), while deferring uncertain cases to a Gated Recurrent Unit (GRU) module in the control plane. This design enables high-speed classification at line rate with the ability to escalate more complex flows for deeper analysis. Experimental evaluation using real-world DDoS datasets demonstrates that our approach achieves high detection accuracy with significantly reduced inference latency and control plane overhead. These results highlight the potential of tightly coupled ML-P4-SDN systems for efficient, adaptive, and low-latency DDoS defense.
Authors:Mengfei Xie, Yan Lin, Hongtao Wu, Jianming Fu, Chenke Luo, Guojun Peng
Abstract:
Tag-based sanitizers attach a small "key" to each pointer and a matching "lock" tag to its target memory object, enabling runtime verification of pointer-object consistency and helping developers to detect potential memory violations. However, the limited tag encoding space challenges existing studies in assigning distinct tags to memory objects across temporal and spatial dimensions, leading to potential tag collisions. In this paper, we present ClusterTag, a novel cluster-based memory allocator aimed at simultaneously mitigating tag collisions in both temporal and spatial dimensions. The core design of ClusterTag effectively balances the significant mismatch between tag encoding space and memory objects: it divides memory objects into multiple independent clusters, thereby limiting tag collisions to finite chunks within each cluster. To mitigate tag collisions across clusters, we design a cluster-grained heap randomization scheme. This approach introduces random address intervals between clusters and further breaks the entropy limitation of the tag space. ClusterTag has been implemented as an independent memory allocator that seamlessly integrates with tag-based sanitizers such as HWASan, and maintains comparable performance overhead (within 1%) at various randomization densities. Security evaluations on the Juliet dataset indicate that ClusterTag exhibits deterministic results across 500 repeated tests (5,652 reported and 1,530 missed), while the existing three types of tag assignment strategies all exhibit probabilistic false negatives due to tag collisions. Quantitative analysis across three tag collision distance metrics-minimum, average, and unpredictability-demonstrates that ClusterTag achieves balanced improvements across all three, whereas prior tag assignment schemes (random, staggered, fixed) show significant trade-offs in at least one metric.
Authors:Matan Avitan, Moran Baruch, Nir Drucker, Itamar Zimerman, Yoav Goldberg
Abstract:
Large language models (LLMs) power modern AI applications, but processing sensitive data on untrusted servers raises privacy concerns. Homomorphic encryption (HE) enables computation on encrypted data for secure inference. However, neural text generation requires decoding methods like argmax and sampling, which are non-polynomial and thus computationally expensive under encryption, creating a significant performance bottleneck. We introduce cutmax, an HE-friendly argmax algorithm that reduces ciphertext operations compared to prior methods, enabling practical greedy decoding under encryption. We also propose the first HE-compatible nucleus (top-p) sampling method, leveraging cutmax for efficient stochastic decoding with provable privacy guarantees. Both techniques are polynomial, supporting efficient inference in privacy-preserving settings. Moreover, their differentiability facilitates gradient-based sequence-level optimization as a polynomial alternative to straight-through estimators. We further provide strong theoretical guarantees for cutmax, proving it converges globally to a unique two-level fixed point, independent of the input values beyond the identity of the maximizer, which explains its rapid convergence in just a few iterations. Evaluations on realistic LLM outputs show latency reductions of 24x-35x over baselines, advancing secure text generation.
Authors:Zuquan Peng, Jianming Fu, Lixin Zou, Li Zheng, Yanzhen Ren, Guojun Peng
Abstract:
The use of unvetted third-party and internet data renders pre-trained models susceptible to backdoor attacks. Detecting backdoor samples is critical to prevent backdoor activation during inference or injection during training. However, existing detection methods often require the defender to have access to the poisoned models, extra clean samples, or significant computational resources to detect backdoor samples, limiting their practicality. To address this limitation, we propose a backdoor sample detection method based on perturbatio\textbf{N} discr\textbf{E}pancy consis\textbf{T}ency \textbf{E}valuation (\NETE). This is a novel detection method that can be used both pre-training and post-training phases. In the detection process, it only requires an off-the-shelf pre-trained model to compute the log probability of samples and an automated function based on a mask-filling strategy to generate perturbations. Our method is based on the interesting phenomenon that the change in perturbation discrepancy for backdoor samples is smaller than that for clean samples. Based on this phenomenon, we use curvature to measure the discrepancy in log probabilities between different perturbed samples and input samples, thereby evaluating the consistency of the perturbation discrepancy to determine whether the input sample is a backdoor sample. Experiments conducted on four typical backdoor attacks and five types of large language model backdoor attacks demonstrate that our detection strategy outperforms existing zero-shot black-box detection methods.
Authors:Haowei Quan, Junjie Wang, Xinzhe Li, Terry Yue Zhuo, Xiao Chen, Xiaoning Du
Abstract:
In the rapidly evolving software development landscape, Python stands out for its simplicity, versatility, and extensive ecosystem. Python packages, as units of organization, reusability, and distribution, have become a pressing concern, highlighted by the considerable number of vulnerability reports. As a scripting language, Python often cooperates with other languages for performance or interoperability. This adds complexity to the vulnerabilities inherent to Python packages, and the effectiveness of current vulnerability detection tools remains underexplored. This paper addresses these gaps by introducing PyVul, the first comprehensive benchmark suite of Python-package vulnerabilities. PyVul includes 1,157 publicly reported, developer-verified vulnerabilities, each linked to its affected packages. To accommodate diverse detection techniques, it provides annotations at both commit and function levels. An LLM-assisted data cleansing method is incorporated to improve label accuracy, achieving 100% commit-level and 94% function-level accuracy, establishing PyVul as the most precise large-scale Python vulnerability benchmark. We further carry out a distribution analysis of PyVul, which demonstrates that vulnerabilities in Python packages involve multiple programming languages and exhibit a wide variety of types. Moreover, our analysis reveals that multi-lingual Python packages are potentially more susceptible to vulnerabilities. Evaluation of state-of-the-art detectors using this benchmark reveals a significant discrepancy between the capabilities of existing tools and the demands of effectively identifying real-world security issues in Python packages. Additionally, we conduct an empirical review of the top-ranked CWEs observed in Python packages, to diagnose the fine-grained limitations of current detection tools and highlight the necessity for future advancements in the field.
Authors:Francesco Aurelio Pironti, Angelo Furfaro, Francesco Blefari, Carmelo Felicetti, Matteo Lupinacci, Francesco Romeo
Abstract:
The security of Industrial Control Systems (ICSs) is critical to ensuring the safety of industrial processes and personnel. The rapid adoption of Industrial Internet of Things (IIoT) technologies has expanded system functionality but also increased the attack surface, exposing ICSs to a growing range of cyber threats. Honeypots provide a means to detect and analyze such threats by emulating target systems and capturing attacker behavior. However, traditional ICS honeypots, often limited to software-based simulations of a single Programmable Logic Controller (PLC), lack the realism required to engage sophisticated adversaries. In this work, we introduce a modular honeynet framework named ICSLure. The framework has been designed to emulate realistic ICS environments. Our approach integrates physical PLCs interacting with live data sources via industrial protocols such as Modbus and Profinet RTU, along with virtualized network components including routers, switches, and Remote Terminal Units (RTUs). The system incorporates comprehensive monitoring capabilities to collect detailed logs of attacker interactions. We demonstrate that our framework enables coherent and high-fidelity emulation of real-world industrial plants. This high-interaction environment significantly enhances the quality of threat data collected and supports advanced analysis of ICS-specific attack strategies, contributing to more effective detection and mitigation techniques.
Authors:Edoardo Marangone, Eugenio Nerio Nemmi, Daniele Friolo, Giuseppe Ateniese, Ingo Weber, Claudio Di Ciccio
Abstract:
Enabling automated decision-making processes by leveraging data-driven analysis is a core goal of Decision Support Systems (DSSs). In multi-party scenarios where decisions rely on distributed and sensitive data, though, ensuring confidentiality, verifiability, transparency, integrity, and consistency at once remains an open challenge for DSSs. To tackle this multi-faceted problem, we propose the Secure Platform for Automated decision Rules via Trusted Applications (SPARTA) approach. By leveraging Trusted Execution Environments (TEEs) at its core, SPARTA ensures that the decision logic and the data remain protected. To guarantee transparency and consistency of the decision process, SPARTA encodes decision rules into verifiable software objects deployed within TEEs. To maintain the confidentiality of the outcomes while keeping the information integrity, SPARTA employs cryptography techniques on notarized data based on user-definable access policies. Based on experiments conducted on public benchmarks and synthetic data, we find our approach to be practically applicable and scalable.
Authors:Zhiyang Chen, Tara Saba, Xun Deng, Xujie Si, Fan Long
Abstract:
Large Language Models (LLMs) have become critical to modern software development, but their reliance on uncurated web-scale datasets for training introduces a significant security risk: the absorption and reproduction of malicious content. To systematically evaluate this risk, we introduce Scam2Prompt, a scalable automated auditing framework that identifies the underlying intent of a scam site and then synthesizes innocuous, developer-style prompts that mirror this intent, allowing us to test whether an LLM will generate malicious code in response to these innocuous prompts. In a large-scale study of four production LLMs (GPT-4o, GPT-4o-mini, Llama-4-Scout, and DeepSeek-V3), we found that Scam2Prompt's innocuous prompts triggered malicious URL generation in 4.24% of cases. To test the persistence of this security risk, we constructed Innoc2Scam-bench, a benchmark of 1,559 innocuous prompts that consistently elicited malicious code from all four initial LLMs. When applied to seven additional production LLMs released in 2025, we found the vulnerability is not only present but severe, with malicious code generation rates ranging from 12.7% to 43.8%. Furthermore, existing safety measures like state-of-the-art guardrails proved insufficient to prevent this behavior, with an overall detection rate of less than 0.3%.
Authors:Xiang Li, Yueci Su, Jiahao Liu, Zhiwei Lin, Yuebing Hou, Peiming Gao, Yuanchao Zhang
Abstract:
Traditional vulnerability detection methods rely heavily on predefined rule matching, which often fails to capture vulnerabilities accurately. With the rise of large language models (LLMs), leveraging their ability to understand code semantics has emerged as a promising direction for achieving more accurate and efficient vulnerability detection. However, current LLM-based approaches face significant challenges: instability in model outputs, limitations in context length, and hallucination. As a result, many existing solutions either use LLMs merely to enrich predefined rule sets, thereby keeping the detection process fundamentally rule-based, or over-rely on them, leading to poor robustness. To address these challenges, we propose a constraint-solving approach powered by LLMs named VULSOLVER. By modeling vulnerability detection as a constraint-solving problem, and by integrating static application security testing (SAST) with the semantic reasoning capabilities of LLMs, our method enables the LLM to act like a professional human security expert. We assess VULSOLVER on the OWASP Benchmark (1,023 labeled samples), achieving 96.29% accuracy, 96.55% F1-score, and 100% recall. Applied to popular GitHub repositories, VULSOLVER also identified 15 previously unknown high-severity vulnerabilities (CVSS 7.5-9.8), demonstrating its effectiveness in real-world security analysis.
Authors:Daryna Oliynyk, Rudolf Mayer, Kathrin Grosse, Andreas Rauber
Abstract:
Model stealing attacks endanger the confidentiality of machine learning models offered as a service. Although these models are kept secret, a malicious party can query a model to label data samples and train their own substitute model, violating intellectual property. While novel attacks in the field are continually being published, their design and evaluations are not standardised, making it challenging to compare prior works and assess progress in the field. This paper is the first to address this gap by providing recommendations for designing and evaluating model stealing attacks. To this end, we study the largest group of attacks that rely on training a substitute model -- those attacking image classification models. We propose the first comprehensive threat model and develop a framework for attack comparison. Further, we analyse attack setups from related works to understand which tasks and models have been studied the most. Based on our findings, we present best practices for attack development before, during, and beyond experiments and derive an extensive list of open research questions regarding the evaluation of model stealing attacks. Our findings and recommendations also transfer to other problem domains, hence establishing the first generic evaluation methodology for model stealing attacks.
Authors:Shaoor Munir, Nurullah Demir, Qian Li, Konrad Kollnig, Zubair Shafiq
Abstract:
The privacy community has a long track record of investigating emerging types of web tracking techniques. Recent work has focused on compliance of web trackers with new privacy laws such as Europe's GDPR and California's CCPA. Despite the growing body of research documenting widespread lack of compliance with new privacy laws, there is a lack of robust enforcement. Different from prior work, we conduct a tech-law analysis to map decades-old U.S. laws about interception of electronic communications--so-called wiretapping--to web tracking. Bridging the tech-law gap for older wiretapping laws is important and timely because, in cases where legal harm to privacy is proven, they can provide statutory private right of action, are at the forefront of recent privacy enforcement, and could ultimately lead to a meaningful change in the web tracking landscape.
In this paper, we focus on a particularly invasive tracking technique: the use of JavaScript event listeners by third-party trackers for real-time keystroke interception on websites. We use an instrumented web browser to crawl a sample of the top-million websites to investigate the use of event listeners that aligns with the criteria for wiretapping, according to U.S. wiretapping law at the federal level and in California. We find evidence that 38.52% websites installed third-party event listeners to intercept keystrokes, and that at least 3.18% websites transmitted intercepted information to a third-party server, which aligns with the criteria for wiretapping. We further find evidence that the intercepted information such as email addresses typed into form fields are used for unsolicited email marketing. Beyond our work that maps the intersection between technical measurement and U.S. wiretapping law, additional future legal research is required to determine when the wiretapping observed in our paper passes the threshold for illegality.
Authors:Ronal Singh, Shahroz Tariq, Fatemeh Jalalvand, Mohan Baruwal Chhetri, Surya Nepal, Cecile Paris, Martin Lochner
Abstract:
The integration of Large Language Models (LLMs) into Security Operations Centres (SOCs) presents a transformative, yet still evolving, opportunity to reduce analyst workload through human-AI collaboration. However, their real-world application in SOCs remains underexplored. To address this gap, we present a longitudinal study of 3,090 analyst queries from 45 SOC analysts over 10 months. Our analysis reveals that analysts use LLMs as on-demand aids for sensemaking and context-building, rather than for making high-stakes determinations, preserving analyst decision authority. The majority of queries are related to interpreting low-level telemetry (e.g., commands) and refining technical communication through short (1-3 turn) interactions. Notably, 93% of queries align with established cybersecurity competencies (NICE Framework), underscoring the relevance of LLM use for SOC-related tasks. Despite variations in tasks and engagement, usage trends indicate a shift from occasional exploration to routine integration, with growing adoption and sustained use among a subset of analysts. We find that LLMs function as flexible, on-demand cognitive aids that augment, rather than replace, SOC expertise. Our study provides actionable guidance for designing context-aware, human-centred AI assistance in security operations, highlighting the need for further in-the-wild research on real-world analyst-LLM collaboration, challenges, and impacts.
Authors:Runpeng Geng, Yanting Wang, Ying Chen, Jinyuan Jia
Abstract:
Retrieval-augmented generation (RAG) systems are widely deployed in real-world applications in diverse domains such as finance, healthcare, and cybersecurity. However, many studies showed that they are vulnerable to knowledge corruption attacks, where an attacker can inject adversarial texts into the knowledge database of a RAG system to induce the LLM to generate attacker-desired outputs. Existing studies mainly focus on attacking specific queries or queries with similar topics (or keywords). In this work, we propose UniC-RAG, a universal knowledge corruption attack against RAG systems. Unlike prior work, UniC-RAG jointly optimizes a small number of adversarial texts that can simultaneously attack a large number of user queries with diverse topics and domains, enabling an attacker to achieve various malicious objectives, such as directing users to malicious websites, triggering harmful command execution, or launching denial-of-service attacks. We formulate UniC-RAG as an optimization problem and further design an effective solution to solve it, including a balanced similarity-based clustering method to enhance the attack's effectiveness. Our extensive evaluations demonstrate that UniC-RAG is highly effective and significantly outperforms baselines. For instance, UniC-RAG could achieve over 90% attack success rate by injecting 100 adversarial texts into a knowledge database with millions of texts to simultaneously attack a large set of user queries (e.g., 2,000). Additionally, we evaluate existing defenses and show that they are insufficient to defend against UniC-RAG, highlighting the need for new defense mechanisms in RAG systems.
Authors:Yuhe Luo, Zhongwen Li, Xiaoqi Li
Abstract:
As blockchain technology continues to evolve, the security of smart contracts has increasingly drawn attention from both academia and industry. The Move language, with its unique resource model and linear type system, provides a solid foundation for the security of digital assets. However, smart contracts still face new security challenges due to developer programming errors and the potential risks associated with cross-module interactions. This paper systematically analyzes the limitations of existing security tools within the Move ecosystem and reveals their unique vulnerability patterns. To address these issues, it introduces MoveScanner, a static analysis tool based on a control flow graph and data flow analysis architecture. By incorporating cross-module call graph tracking, MoveScanner can effectively identify five key types of security vulnerabilities, including resource leaks, weak permission management, and arithmetic overflows. In terms of design, MoveScanner adheres to a modular principle, supports bytecode-level analysis and multi-chain adaptation, and introduces innovative resource trajectory tracking algorithms and capability matrix analysis methods, thereby significantly reducing the false positive rate. Empirical results show that MoveScanner achieved 88.2% detection accuracy in benchmark testing, filling the gap in security tools in the Move ecosystem. Furthermore, this paper identifies twelve new types of security risks based on the resource-oriented programming paradigm and provides a theoretical foundation and practical experience for the development of smart contract security mechanisms. Future work will focus on combining formal verification and dynamic analysis techniques to build a security protection framework covering the entire contract lifecycle
Authors:Yagmur Yigit, Mehmet Ali Erturk, Kerem Gursu, Berk Canberk
Abstract:
Digital twin (DT) technology is rapidly becoming essential for smart city ecosystems, enabling real-time synchronisation and autonomous decision-making across physical and digital domains. However, as DTs take active roles in control loops, securely binding them to their physical counterparts in dynamic and adversarial environments remains a significant challenge. Existing authentication solutions either rely on static trust models, require centralised authorities, or fail to provide live and verifiable physical-digital binding, making them unsuitable for latency-sensitive and distributed deployments. To address this gap, we introduce PRZK-Bind, a lightweight and decentralised authentication protocol that combines Schnorr-based zero-knowledge proofs with elliptic curve cryptography to establish secure, real-time correspondence between physical entities and DTs without relying on pre-shared secrets. Simulation results show that PRZK-Bind significantly improves performance, offering up to 4.5 times lower latency and 4 times reduced energy consumption compared to cryptography-heavy baselines, while maintaining false acceptance rates more than 10 times lower. These findings highlight its suitability for future smart city deployments requiring efficient, resilient, and trustworthy DT authentication.
Authors:Saisai Xia, Wenhao Wang, Zihao Wang, Yuhui Zhang, Yier Jin, Dan Meng, Rui Hou
Abstract:
Publicly available large pretrained models (i.e., backbones) and lightweight adapters for parameter-efficient fine-tuning (PEFT) have become standard components in modern machine learning pipelines. However, preserving the privacy of both user inputs and fine-tuned adapters -- often trained on sensitive data -- during inference remains a significant challenge. Applying cryptographic techniques, such as multi-party computation (MPC), to PEFT settings still incurs substantial encrypted computation across both the backbone and adapter, mainly due to the inherent two-way communication between them. To address this limitation, we propose CryptPEFT, the first PEFT solution specifically designed for private inference scenarios. CryptPEFT introduces a novel one-way communication (OWC) architecture that confines encrypted computation solely to the adapter, significantly reducing both computational and communication overhead. To maintain strong model utility under this constraint, we explore the design space of OWC-compatible adapters and employ an automated architecture search algorithm to optimize the trade-off between private inference efficiency and model utility. We evaluated CryptPEFT using Vision Transformer backbones across widely used image classification datasets. Our results show that CryptPEFT significantly outperforms existing baselines, delivering speedups ranging from $20.62\times$ to $291.48\times$ in simulated wide-area network (WAN) and local-area network (LAN) settings. On CIFAR-100, CryptPEFT attains 85.47% accuracy with just 2.26 seconds of inference latency. These findings demonstrate that CryptPEFT offers an efficient and privacy-preserving solution for modern PEFT-based inference.
Authors:Fei Lin, Tengchao Zhang, Ziyang Gong, Fei-Yue Wang
Abstract:
In recent years, generative artificial intelligence (GenAI) has demonstrated remarkable capabilities in high-stakes domains such as molecular science. However, challenges related to the verifiability and structural privacy of its outputs remain largely unresolved. This paper focuses on the task of molecular toxicity repair. It proposes a structure-private verification framework - ToxiEval-ZKP - which, for the first time, introduces zero-knowledge proof (ZKP) mechanisms into the evaluation process of this task. The system enables model developers to demonstrate to external verifiers that the generated molecules meet multidimensional toxicity repair criteria, without revealing the molecular structures themselves. To this end, we design a general-purpose circuit compatible with both classification and regression tasks, incorporating evaluation logic, Poseidon-based commitment hashing, and a nullifier-based replay prevention mechanism to build a complete end-to-end ZK verification system. Experimental results demonstrate that ToxiEval-ZKP facilitates adequate validation under complete structural invisibility, offering strong circuit efficiency, security, and adaptability, thereby opening up a novel paradigm for trustworthy evaluation in generative scientific tasks.
Authors:Zhimeng Guo, Huaisheng Zhu, Siyuan Xu, Hangfan Zhang, Teng Xiao, Minhao Cheng
Abstract:
The need for detecting LLM-generated code necessitates watermarking systems capable of operating within its highly structured and syntactically constrained environment. To address this, we introduce CodeTracer, an innovative adaptive code watermarking framework underpinned by a novel reinforcement learning training paradigm. At its core, CodeTracer features a policy-driven approach that utilizes a parameterized model to intelligently bias token choices during next-token prediction. This strategy ensures that embedded watermarks maintain code functionality while exhibiting subtle yet statistically detectable deviations from typical token distributions. To facilitate policy learning, we devise a comprehensive reward system that seamlessly integrates execution feedback with watermark embedding signals, balancing process-level and outcome-level rewards. Additionally, we employ Gumbel Top-k reparameterization to enable gradient-based optimization of discrete watermarking decisions. Extensive comparative evaluations demonstrate CodeTracer's significant superiority over state-of-the-art baselines in both watermark detectability and the preservation of generated code's functionality.
Authors:Aydin Zaboli, Junho Hong
Abstract:
This paper elaborates on an extensive security framework specifically designed for energy management systems (EMSs), which effectively tackles the dynamic environment of cybersecurity vulnerabilities and/or system problems (SPs), accomplished through the incorporation of novel methodologies. A comprehensive multi-point attack/error model is initially proposed to systematically identify vulnerabilities throughout the entire EMS data processing pipeline, including post state estimation (SE) stealth attacks, EMS database manipulation, and human-machine interface (HMI) display corruption according to the real-time database (RTDB) storage. This framework acknowledges the interconnected nature of modern attack vectors, which utilize various phases of supervisory control and data acquisition (SCADA) data flow. Then, generative AI (GenAI)-based anomaly detection systems (ADSs) for EMSs are proposed for the first time in the power system domain to handle the scenarios. Further, a set-of-mark generative intelligence (SoM-GI) framework, which leverages multimodal analysis by integrating visual markers with rules considering the GenAI capabilities, is suggested to overcome inherent spatial reasoning limitations. The SoM-GI methodology employs systematic visual indicators to enable accurate interpretation of segmented HMI displays and detect visual anomalies that numerical methods fail to identify. Validation on the IEEE 14-Bus system shows the framework's effectiveness across scenarios, while visual analysis identifies inconsistencies. This integrated approach combines numerical analysis with visual pattern recognition and linguistic rules to protect against cyber threats and system errors.
Authors:Junxian Li, Beining Xu, Di Zhang
Abstract:
Vision-language models (VLMs) have shown significant advancements in tasks such as visual grounding, where they localize specific objects in images based on natural language queries and images. However, security issues in visual grounding tasks for VLMs remain underexplored, especially in the context of backdoor attacks. In this paper, we introduce a novel input-aware backdoor attack method, IAG, designed to manipulate the grounding behavior of VLMs. This attack forces the model to ground a specific target object in the input image, regardless of the user's query. We propose an adaptive trigger generator that embeds the semantic information of the attack target's description into the original image using a text-conditional U-Net, thereby overcoming the open-vocabulary attack challenge. To ensure the attack's stealthiness, we utilize a reconstruction loss to minimize visual discrepancies between poisoned and clean images. Additionally, we introduce a unified method for generating attack data. IAG is evaluated theoretically and empirically, demonstrating its feasibility and effectiveness. Notably, our ASR@0.5 on InternVL-2.5-8B reaches over 65\% on various testing sets. IAG also shows promising potential on manipulating Ferret-7B and LlaVA-1.5-7B with very little accuracy decrease on clean samples. Extensive specific experiments, such as ablation study and potential defense, also indicate the robustness and transferability of our attack.
Authors:Yushan Xiang, Zhongwen Li, Xiaoqi Li
Abstract:
As artificial intelligence technology continues to advance, chatbots are becoming increasingly powerful. Among them, ChatGPT, launched by OpenAI, has garnered widespread attention globally due to its powerful natural language processing capabilities based on the GPT model, which enables it to engage in natural conversations with users, understand various forms of linguistic expressions, and generate useful information and suggestions. However, as its application scope expands, user demand grows, and malicious attacks related to it become increasingly frequent, the security threats and privacy risks faced by ChatGPT are gradually coming to the forefront. In this paper, the security of ChatGPT is mainly studied from two aspects, security threats and privacy risks. The article systematically analyzes various types of vulnerabilities involved in the above two types of problems and their causes. Briefly, we discuss the controversies that ChatGPT may cause at the ethical and moral levels. In addition, this paper reproduces several network attack and defense test scenarios by simulating the attacker's perspective and methodology. Simultaneously, it explores the feasibility of using ChatGPT for security vulnerability detection and security tool generation from the defender's perspective.
Authors:Aydin Zaboli, Junho Hong
Abstract:
In digital substations, security events pose significant challenges to the sustained operation of power systems. To mitigate these challenges, the implementation of robust defense strategies is critically important. A thorough process of anomaly identification and detection in information and communication technology (ICT) frameworks is crucial to ensure secure and reliable communication and coordination between interconnected devices within digital substations. Hence, this paper addresses the critical cybersecurity challenges confronting IEC61850-based digital substations within modern smart grids, where the integration of advanced communication protocols, e.g., generic object-oriented substation event (GOOSE), has enhanced energy management and introduced significant vulnerabilities to cyberattacks. Focusing on the limitations of traditional anomaly detection systems (ADSs) in detecting threats, this research proposes a transformative approach by leveraging generative AI (GenAI) to develop robust ADSs. The primary contributions include the suggested advanced adversarial traffic mutation (AATM) technique to generate synthesized and balanced datasets for GOOSE messages, ensuring protocol compliance and enabling realistic zero-day attack pattern creation to address data scarcity. Then, the implementation of GenAI-based ADSs incorporating the task-oriented dialogue (ToD) processes has been explored for improved detection of attack patterns. Finally, a comparison of the GenAI-based ADS with machine learning (ML)-based ADSs has been implemented to showcase the outperformance of the GenAI-based frameworks considering the AATM-generated GOOSE datasets and standard/advanced performance evaluation metrics.
Authors:VojtÄch StanÄk, Karel Srna, Anton Firc, Kamil Malinka
Abstract:
Despite growing attention to deepfake speech detection, the aspects of bias and fairness remain underexplored in the speech domain. To address this gap, we introduce the Speaker Characteristics Deepfake (SCDF) dataset: a novel, richly annotated resource enabling systematic evaluation of demographic biases in deepfake speech detection. SCDF contains over 237,000 utterances in a balanced representation of both male and female speakers spanning five languages and a wide age range. We evaluate several state-of-the-art detectors and show that speaker characteristics significantly influence detection performance, revealing disparities across sex, language, age, and synthesizer type. These findings highlight the need for bias-aware development and provide a foundation for building non-discriminatory deepfake detection systems aligned with ethical and regulatory standards.
Authors:Ye Li, Chengcheng Zhu, Yanchao Zhao, Jiale Zhang
Abstract:
In this paper, we endeavor to address the challenges of backdoor attacks countermeasures in black-box scenarios, thereby fortifying the security of inference under MLaaS. We first categorize backdoor triggers from a new perspective, i.e., their impact on the patched area, and divide them into: high-visibility triggers (HVT), semi-visibility triggers (SVT), and low-visibility triggers (LVT). Based on this classification, we propose a progressive defense framework, BDFirewall, that removes these triggers from the most conspicuous to the most subtle, without requiring model access. First, for HVTs, which create the most significant local semantic distortions, we identify and eliminate them by detecting these salient differences. We then restore the patched area to mitigate the adverse impact of such removal process. The localized purification designed for HVTs is, however, ineffective against SVTs, which globally perturb benign features. We therefore model an SVT-poisoned input as a mixture of a trigger and benign features, where we unconventionally treat the benign features as "noise". This formulation allows us to reconstruct SVTs by applying a denoising process that removes these benign "noise" features. The SVT-free input is then obtained by subtracting the reconstructed trigger. Finally, to neutralize the nearly imperceptible but fragile LVTs, we introduce lightweight noise to disrupt the trigger pattern and then apply DDPM to restore any collateral impact on clean features. Comprehensive experiments demonstrate that our method outperforms state-of-the-art defenses. Compared with baselines, BDFirewall reduces the Attack Success Rate (ASR) by an average of 33.25%, improving poisoned sample accuracy (PA) by 29.64%, and achieving up to a 111x speedup in inference time. Code will be made publicly available upon acceptance.
Authors:Laxman Dhulipala, Monika Henzinger, George Z. Li, Quanquan C. Liu, A. R. Sricharan, Leqi Zhu
Abstract:
Many differentially private and classical non-private graph algorithms rely crucially on determining whether some property of each vertex meets a threshold. For example, for the $k$-core decomposition problem, the classic peeling algorithm iteratively removes a vertex if its induced degree falls below a threshold. The sparse vector technique (SVT) is generally used to transform non-private threshold queries into private ones with only a small additive loss in accuracy. However, a naive application of SVT in the graph setting leads to an amplification of the error by a factor of $n$ due to composition, as SVT is applied to every vertex. In this paper, we resolve this problem by formulating a novel generalized sparse vector technique which we call the Multidimensional AboveThreshold (MAT) Mechanism which generalizes SVT (applied to vectors with one dimension) to vectors with multiple dimensions. As an application, we solve a number of important graph problems with better bounds than previous work.
We apply our MAT mechanism to obtain a set of improved bounds for a variety of problems including $k$-core decomposition, densest subgraph, low out-degree ordering, and vertex coloring. We give a tight local edge DP algorithm for $k$-core decomposition with $O(ε^{-1}\log n)$ additive error and no multiplicative error in $O(n)$ rounds. We also give a new $(2+η)$-factor multiplicative, $O(ε^{-1}\log n)$ additive error algorithm in $O(\log^2 n)$ rounds for any constant $η> 0$. Both of these results are asymptotically tight against our new lower bound of $Ω(\log n)$ for any constant-factor approximation algorithm for $k$-core decomposition. Our new algorithms for $k$-core also directly lead to new algorithms for densest subgraph and low out-degree ordering. Our novel private defective coloring algorithms uses number of colors proportional to the arboricity of the graph.
Authors:Xinzhang Chen, Hassan Ali, Arash Shaghaghi, Salil S. Kanhere, Sanjay Jha
Abstract:
Online services often require users to agree to lengthy and obscure Terms of Service (ToS), leading to information asymmetry and legal risks. This paper proposes TOSense-a Chrome extension that allows users to ask questions about ToS in natural language and get concise answers in real time. The system combines (i) a crawler "tos-crawl" that automatically extracts ToS content, and (ii) a lightweight large language model pipeline: MiniLM for semantic retrieval and BART-encoder for answer relevance verification. To avoid expensive manual annotation, we present a novel Question Answering Evaluation Pipeline (QEP) that generates synthetic questions and verifies the correctness of answers using clustered topic matching. Experiments on five major platforms, Apple, Google, X (formerly Twitter), Microsoft, and Netflix, show the effectiveness of TOSense (with up to 44.5% accuracy) across varying number of topic clusters. During the demonstration, we will showcase TOSense in action. Attendees will be able to experience seamless extraction, interactive question answering, and instant indexing of new sites.
Authors:Vincent Cohen-Addad, Alessandro Epasto, Jason Lee, Morteza Zadimoghaddam
Abstract:
In modern datasets, where single records can have multiple owners, enforcing user-level differential privacy requires capping each user's total contribution. This "contribution bounding" becomes a significant combinatorial challenge. Existing sequential algorithms for this task are computationally intensive and do not scale to the massive datasets prevalent today. To address this scalability bottleneck, we propose a novel and efficient distributed algorithm. Our approach models the complex ownership structure as a hypergraph, where users are vertices and records are hyperedges. The algorithm proceeds in rounds, allowing users to propose records in parallel. A record is added to the final dataset only if all its owners unanimously agree, thereby ensuring that no user's predefined contribution limit is violated. This method aims to maximize the size of the resulting dataset for high utility while providing a practical, scalable solution for implementing user-level privacy in large, real-world systems.
Authors:Yucheng Wu, Yuncong Yang, Xiao Han, Leye Wang, Junjie Wu
Abstract:
Publishing graph data is widely desired to enable a variety of structural analyses and downstream tasks. However, it also potentially poses severe privacy leakage, as attackers may leverage the released graph data to launch attacks and precisely infer private information such as the existence of hidden sensitive links in the graph. Prior studies on privacy-preserving graph data publishing relied on heuristic graph modification strategies and it is difficult to determine the graph with the optimal privacy--utility trade-off for publishing. In contrast, we propose the first privacy-preserving graph structure learning framework against sensitive link inference attacks, named PPGSL, which can automatically learn a graph with the optimal privacy--utility trade-off. The PPGSL operates by first simulating a powerful surrogate attacker conducting sensitive link attacks on a given graph. It then trains a parameterized graph to defend against the simulated adversarial attacks while maintaining the favorable utility of the original graph. To learn the parameters of both parts of the PPGSL, we introduce a secure iterative training protocol. It can enhance privacy preservation and ensure stable convergence during the training process, as supported by the theoretical proof. Additionally, we incorporate multiple acceleration techniques to improve the efficiency of the PPGSL in handling large-scale graphs. The experimental results confirm that the PPGSL achieves state-of-the-art privacy--utility trade-off performance and effectively thwarts various sensitive link inference attacks.
Authors:Yassine El Kheir, Arnab Das, Enes Erdem Erdogan, Fabian Ritter-Guttierez, Tim Polzehl, Sebastian Möller
Abstract:
Recent advances in synthetic speech have made audio deepfakes increasingly realistic, posing significant security risks. Existing detection methods that rely on a single modality, either raw waveform embeddings or spectral based features, are vulnerable to non spoof disturbances and often overfit to known forgery algorithms, resulting in poor generalization to unseen attacks. To address these shortcomings, we investigate hybrid fusion frameworks that integrate self supervised learning (SSL) based representations with handcrafted spectral descriptors (MFCC , LFCC, CQCC). By aligning and combining complementary information across modalities, these fusion approaches capture subtle artifacts that single feature approaches typically overlook. We explore several fusion strategies, including simple concatenation, cross attention, mutual cross attention, and a learnable gating mechanism, to optimally blend SSL features with fine grained spectral cues. We evaluate our approach on four challenging public benchmarks and report generalization performance. All fusion variants consistently outperform an SSL only baseline, with the cross attention strategy achieving the best generalization with a 38% relative reduction in equal error rate (EER). These results confirm that joint modeling of waveform and spectral views produces robust, domain agnostic representations for audio deepfake detection.
Authors:Chang Gong, Zhongwen Li, Xiaoqi Li
Abstract:
Information security is facing increasingly severe challenges, and traditional protection means are difficult to cope with complex and changing threats. In recent years, as an emerging intelligent technology, large language models (LLMs) have shown a broad application prospect in the field of information security. In this paper, we focus on the key role of LLM in information security, systematically review its application progress in malicious behavior prediction, network threat analysis, system vulnerability detection, malicious code identification, and cryptographic algorithm optimization, and explore its potential in enhancing security protection performance. Based on neural networks and Transformer architecture, this paper analyzes the technical basis of large language models and their advantages in natural language processing tasks. It is shown that the introduction of large language modeling helps to improve the detection accuracy and reduce the false alarm rate of security systems. Finally, this paper summarizes the current application results and points out that it still faces challenges in model transparency, interpretability, and scene adaptability, among other issues. It is necessary to explore further the optimization of the model structure and the improvement of the generalization ability to realize a more intelligent and accurate information security protection system.
Authors:Yifan Xu, Jinfu Chen, Zhenyu Qi, Huashan Chen, Junyi Wang, Pengfei Hu, Feng Liu, Sen He
Abstract:
Virtual Reality (VR) has emerged as a transformative technology across industries, yet its security weaknesses, including vulnerabilities, are underinvestigated. This study investigates 334 VR projects hosted on GitHub, examining 1,681 software security weaknesses to understand: what types of weaknesses are prevalent in VR software; when and how weaknesses are introduced; how long they have survived; and how they have been removed. Due to the limited availability of VR software security weaknesses in public databases (e.g., the National Vulnerability Database or NVD), we prepare the first systematic dataset of VR software security weaknesses by introducing a novel framework to collect such weaknesses from GitHub commit data. Our empirical study on the dataset leads to useful insights, including: (i) VR weaknesses are heavily skewed toward user interface weaknesses, followed by resource-related weaknesses; (ii) VR development tools pose higher security risks than VR applications; (iii) VR security weaknesses are often introduced at the VR software birth time.
Authors:Hao Jiang, Quan Zhou, Dongdong Zhao, Shangshang Yang, Wenjian Luo, Xingyi Zhang
Abstract:
Data perturbation-based privacy-preserving methods have been widely adopted in various scenarios due to their efficiency and the elimination of the need for a trusted third party. However, these methods primarily focus on individual statistical indicators, neglecting the overall quality of the collected data from a distributional perspective. Consequently, they often fall short of meeting the diverse statistical analysis requirements encountered in practical data analysis. As a promising sensitive data perturbation method, negative survey methods is able to complete the task of collecting sensitive information distribution while protecting personal privacy. Yet, existing negative survey methods are primarily designed for discrete sensitive information and are inadequate for real-valued data distributions. To bridge this gap, this paper proposes a novel real-value negative survey model, termed RVNS, for the first time in the field of real-value sensitive information collection. The RVNS model exempts users from the necessity of discretizing their data and only requires them to sample a set of data from a range that deviates from their actual sensitive details, thereby preserving the privacy of their genuine information. Moreover, to accurately capture the distribution of sensitive information, an optimization problem is formulated, and a novel approach is employed to solve it. Rigorous theoretical analysis demonstrates that the RVNS model conforms to the differential privacy model, ensuring robust privacy preservation. Comprehensive experiments conducted on both synthetic and real-world datasets further validate the efficacy of the proposed method.
Authors:Sanzida Hoque, Abdullah Aydeger, Engin Zeydan, Madhusanka Liyanage
Abstract:
The advent of quantum computing threatens the security of classical public-key cryptographic systems, prompting the transition to post-quantum cryptography (PQC). While PQC has been analyzed in theory, its performance in practical wireless communication environments remains underexplored. This paper presents a detailed implementation and performance evaluation of NIST-selected PQC algorithms in user equipment (UE) to UE communications over 5G networks. Using a full 5G emulation stack (Open5GS and UERANSIM) and PQC-enabled TLS 1.3 via BoringSSL and liboqs, we examine key encapsulation mechanisms and digital signature schemes across realistic network conditions. We evaluate performance based on handshake latency, CPU and memory usage, bandwidth, and retransmission rates, under varying cryptographic configurations and client loads. Our findings show that ML-KEM with ML-DSA offers the best efficiency for latency-sensitive applications, while SPHINCS+ and HQC combinations incur higher computational and transmission overheads, making them unsuitable for security-critical but time-sensitive 5G scenarios.
Authors:Yu Wang, Yijian Liu, Liheng Ji, Han Luo, Wenjie Li, Xiaofei Zhou, Chiyun Feng, Puji Wang, Yuhan Cao, Geyuan Zhang, Xiaojian Li, Rongwu Xu, Yilei Chen, Tianxing He
Abstract:
Large language models (LLMs) have demonstrated remarkable capabilities across a variety of domains. However, their applications in cryptography, which serves as a foundational pillar of cybersecurity, remain largely unexplored. To address this gap, we propose \textbf{AICrypto}, the first comprehensive benchmark designed to evaluate the cryptographic capabilities of LLMs. The benchmark comprises 135 multiple-choice questions, 150 capture-the-flag (CTF) challenges, and 18 proof problems, covering a broad range of skills from factual memorization to vulnerability exploitation and formal reasoning. All tasks are carefully reviewed or constructed by cryptography experts to ensure correctness and rigor. To support automated evaluation of CTF challenges, we design an agent-based framework. To gain deeper insight into the current state of cryptographic proficiency in LLMs, we introduce human expert performance baselines for comparison across all task types. Our evaluation of 17 leading LLMs reveals that state-of-the-art models match or even surpass human experts in memorizing cryptographic concepts, exploiting common vulnerabilities, and routine proofs. However, they still lack a deep understanding of abstract mathematical concepts and struggle with tasks that require multi-step reasoning and dynamic analysis. We hope this work could provide insights for future research on LLMs in cryptographic applications. Our code and dataset are available at https://aicryptobench.github.io.
Authors:Md Tanvirul Alam, Aritran Piplai, Nidhi Rastogi
Abstract:
Machine learning models are commonly used for malware classification; however, they suffer from performance degradation over time due to concept drift. Adapting these models to changing data distributions requires frequent updates, which rely on costly ground truth annotations. While active learning can reduce the annotation burden, leveraging unlabeled data through semi-supervised learning remains a relatively underexplored approach in the context of malware detection. In this research, we introduce \texttt{ADAPT}, a novel pseudo-labeling semi-supervised algorithm for addressing concept drift. Our model-agnostic method can be applied to various machine learning models, including neural networks and tree-based algorithms. We conduct extensive experiments on five diverse malware detection datasets spanning Android, Windows, and PDF domains. The results demonstrate that our method consistently outperforms baseline models and competitive benchmarks. This work paves the way for more effective adaptation of machine learning models to concept drift in malware detection.
Authors:Toluwani Aremu, Noor Hussein, Munachiso Nwadike, Samuele Poppi, Jie Zhang, Karthik Nandakumar, Neil Gong, Nils Lukas
Abstract:
Watermarking offers a promising solution for GenAI providers to establish the provenance of their generated content. A watermark is a hidden signal embedded in the generated content, whose presence can later be verified using a secret watermarking key. A security threat to GenAI providers are \emph{forgery attacks}, where malicious users insert the provider's watermark into generated content that was \emph{not} produced by the provider's models, potentially damaging their reputation and undermining trust. One potential defense to resist forgery is using multiple keys to watermark generated content. However, it has been shown that forgery attacks remain successful when adversaries can collect sufficiently many watermarked samples. We propose an improved multi-key watermarking method that resists all surveyed forgery attacks and scales independently of the number of watermarked samples collected by the adversary. Our method accepts content as genuinely watermarked only if \emph{exactly} one watermark is detected. We focus on the image and text modalities, but our detection method is modality-agnostic, since it treats the underlying watermarking method as a black-box. We derive theoretical bounds on forgery-resistance and empirically validate them using Mistral-7B. Our results show a decrease in forgery success from up to $100\%$ using single-key baselines to only $2\%$. While our method resists all surveyed attacks, we find that highly capable, adaptive attackers can still achieve success rates of up to $65\%$ if watermarked content generated using different keys is easily separable.
Authors:Matteo Lupinacci, Francesco Aurelio Pironti, Francesco Blefari, Francesco Romeo, Luigi Arena, Angelo Furfaro
Abstract:
The rapid adoption of Large Language Model (LLM) agents and multi-agent systems enables remarkable capabilities in natural language processing and generation. However, these systems introduce unprecedented security vulnerabilities that extend beyond traditional content generation attacks to system-level compromise. This paper presents a comprehensive evaluation of the security of LLMs used as reasoning engines within autonomous agents, highlighting how they can be exploited as attack vectors capable of achieving complete computer takeover. We focus on how different attack surfaces and trust boundaries - Direct Prompt Injection, RAG Backdoor, and Inter Agent Trust - can be leveraged to orchestrate such takeovers. We demonstrate that adversaries can effectively coerce popular LLMs (including GPT-4, Claude-4 and Gemini-2.5) into autonomously installing and executing malware on victim machines. Our evaluation of 18 state-of-the-art LLMs reveals an alarming scenario: 94.4% of models succumb to Direct Prompt Injection and 83.3% are vulnerable to the more stealth and evasive RAG Backdoor Attack. Notably, we tested trust boundaries within multi-agent systems, where LLM agents interact and influence each other, and we revealed a critical security flaw: LLMs which successfully resist direct injection or RAG backdoor will execute identical payloads when requested by peer agents. Our findings show that 100.0% of tested LLMs can be compromised through Inter-Agent Trust Exploitation attacks and that every model exhibits context-dependent security behaviors that create exploitable blind spots. Our results also highlight the need to increase awareness and research on the security risks of LLMs, showing a paradigm shift in cybersecurity threats, where AI tools themselves become sophisticated attack vectors.
Authors:Paritosh Ranjan, Surajit Majumder, Prodip Roy
Abstract:
In the digital age, individuals increasingly maintain active presences across multiple platforms ranging from social media and messaging applications to professional and communication tools. However, the current model for managing user level privacy and abuse is siloed, requiring users to block undesirable contacts independently on each platform. This paper introduces Single Block On (SBO) a unified and interoperable system enabling users to block an individual once and have that block propagated across all integrated applications. SBO operates via identity based matching rules, utilizing configurable levels of identifier similarity, and interfaces with systems through standardized protocols such as SSO, LDAP, or direct REST integration. A novel Contact Rule Markup Language (CRML) facilitates consistent policy sharing across systems. The proposed solution increases user safety, enhances digital well-being, and sets a precedent for interoperable privacy enforcement.
Authors:Shravya Kanchi, Neal Mangaokar, Aravind Cheruvu, Sifat Muhammad Abdullah, Shirin Nilizadeh, Atul Prakash, Bimal Viswanath
Abstract:
Machine learning-based supervised classifiers are widely used for security tasks, and their improvement has been largely focused on algorithmic advancements. We argue that data challenges that negatively impact the performance of these classifiers have received limited attention. We address the following research question: Can developments in Generative AI (GenAI) address these data challenges and improve classifier performance? We propose augmenting training datasets with synthetic data generated using GenAI techniques to improve classifier generalization. We evaluate this approach across 7 diverse security tasks using 6 state-of-the-art GenAI methods and introduce a novel GenAI scheme called Nimai that enables highly controlled data synthesis. We find that GenAI techniques can significantly improve the performance of security classifiers, achieving improvements of up to 32.6% even in severely data-constrained settings (only ~180 training samples). Furthermore, we demonstrate that GenAI can facilitate rapid adaptation to concept drift post-deployment, requiring minimal labeling in the adjustment process. Despite successes, our study finds that some GenAI schemes struggle to initialize (train and produce data) on certain security tasks. We also identify characteristics of specific tasks, such as noisy labels, overlapping class distributions, and sparse feature vectors, which hinder performance boost using GenAI. We believe that our study will drive the development of future GenAI tools designed for security tasks.
Authors:Gennady Khalimov, Yevgen Kotukh
Abstract:
We propose a public key encryption cryptosystem based on solutions of linear equation systems with predefinition of input parameters through shared secret computation for factorizable substitutions. The existence of multiple equivalent solutions for an underdetermined system of linear equations determines the impossibility of its resolution by a cryptanalyst in polynomial time. The completion of input parameters of the equation system is implemented through secret homomorphic matrix transformation for substitutions factorized over the basis of a vector space of dimension m over the field F2. Encryption is implemented through computation of substitutions that are one-way functions on an elementary abelian 2-group of order 2"m. Decryption is implemented through completion of input parameters of the equation system. Homomorphic transformations are constructed based on matrix computations. Matrix computations enable the implementation of high security and low computational overhead for homomorphic transformations.
Authors:Francesco Blefari, Cristian Cosentino, Francesco Aurelio Pironti, Angelo Furfaro, Fabrizio Marozzo
Abstract:
Intrusion Detection and Prevention Systems (IDS/IPS) in large enterprises can generate hundreds of thousands of alerts per hour, overwhelming analysts with logs requiring rapidly evolving expertise. Conventional machine-learning detectors reduce alert volume but still yield many false positives, while standard Retrieval-Augmented Generation (RAG) pipelines often retrieve irrelevant context and fail to justify predictions. We present CyberRAG, a modular agent-based RAG framework that delivers real-time classification, explanation, and structured reporting for cyber-attacks. A central LLM agent orchestrates: (i) fine-tuned classifiers specialized by attack family; (ii) tool adapters for enrichment and alerting; and (iii) an iterative retrieval-and-reason loop that queries a domain-specific knowledge base until evidence is relevant and self-consistent. Unlike traditional RAG, CyberRAG adopts an agentic design that enables dynamic control flow and adaptive reasoning. This architecture autonomously refines threat labels and natural-language justifications, reducing false positives and enhancing interpretability. It is also extensible: new attack types can be supported by adding classifiers without retraining the core agent. CyberRAG was evaluated on SQL Injection, XSS, and SSTI, achieving over 94\% accuracy per class and a final classification accuracy of 94.92\% through semantic orchestration. Generated explanations reached 0.94 in BERTScore and 4.9/5 in GPT-4-based expert evaluation, with robustness preserved against adversarial and unseen payloads. These results show that agentic, specialist-oriented RAG can combine high detection accuracy with trustworthy, SOC-ready prose, offering a flexible path toward partially automated cyber-defense workflows.
Authors:Ranyang Zhou, Abeer Matar A. Almalky, Gamana Aragonda, Sabbir Ahmed, Filip Roth Trønnes-Christensen, Adnan Siraj Rakin, Shaahin Angizi
Abstract:
True Random Number Generators (TRNGs) play a fundamental role in hardware security, cryptographic systems, and data protection. In the context of Deep NeuralNetworks (DNNs), safeguarding model parameters, particularly weights, is critical to ensure the integrity, privacy, and intel-lectual property of AI systems. While software-based pseudo-random number generators are widely used, they lack the unpredictability and resilience offered by hardware-based TRNGs. In this work, we propose a novel and robust Encoding-in-Memory TRNG called EIM-TRNG that leverages the inherent physical randomness in DRAM cell behavior, particularly under RowHammer-induced disturbances, for the first time. We demonstrate how the unpredictable bit-flips generated through carefully controlled RowHammer operations can be harnessed as a reliable entropy source. Furthermore, we apply this TRNG framework to secure DNN weight data by encoding via a combination of fixed and unpredictable bit-flips. The encrypted data is later decrypted using a key derived from the probabilistic flip behavior, ensuring both data confidentiality and model authenticity. Our results validate the effectiveness of DRAM-based entropy extraction for robust, low-cost hardware security and offer a promising direction for protecting machine learning models at the hardware level.
Authors:Ismail Labiad, Mathurin Videau, Matthieu Kowalski, Marc Schoenauer, Alessandro Leite, Julia Kempe, Olivier Teytaud
Abstract:
Gradient-based optimization is the workhorse of deep learning, offering efficient and scalable training via backpropagation. However, its reliance on large volumes of labeled data raises privacy and security concerns such as susceptibility to data poisoning attacks and the risk of overfitting. In contrast, black box optimization methods, which treat the model as an opaque function, relying solely on function evaluations to guide optimization, offer a promising alternative in scenarios where data access is restricted, adversarial risks are high, or overfitting is a concern. However, black box methods also pose significant challenges, including poor scalability to high-dimensional parameter spaces, as prevalent in large language models (LLMs), and high computational costs due to reliance on numerous model evaluations. This paper introduces BBoxER, an evolutionary black-box method for LLM post-training that induces an information bottleneck via implicit compression of the training data. Leveraging the tractability of information flow, we provide strong theoretical bounds on generalization, differential privacy, susceptibility to data poisoning attacks, and robustness to extraction attacks. BBoxER operates on top of pre-trained LLMs, offering a lightweight and modular enhancement suitable for deployment in restricted or privacy-sensitive environments, in addition to non-vacuous generalization guarantees. In experiments with LLMs, we demonstrate empirically that Retrofitting methods are able to learn, showing how a few iterations of BBoxER improve performance and generalize well on a benchmark of reasoning datasets. This positions BBoxER as an attractive add-on on top of gradient-based optimization.
Authors:Rupam Patir, Keyan Guo, Haipeng Cai, Hongxin Hu
Abstract:
The code generation capabilities of Large Language Models (LLMs) have transformed the field of software development. However, this advancement also presents significant security challenges, as LLM-generated code often contains vulnerabilities. One direction of research strengthens LLMs by injecting or refining security knowledge through curated datasets, model tuning, or static analyzers. While effective in certain settings, these methods can be resource-intensive, less adaptable to zero-day vulnerabilities, and often inapplicable to proprietary models. To address these challenges, we introduce GRASP, which explores a new direction that focuses on structured reasoning over Secure Coding Practices(SCPs) rather than additional training or external feedback. GRASP comprises two key ideas: (1) an SCP graph that organizes SCPs into a Directed Acyclic Graph (DAG) capturing dependencies and relationships, and (2) a graph-based reasoning process that systematically guides LLMs through relevant SCPs for code generation. This design enables interpretable, model-agnostic, and scalable security improvements, particularly for previously unseen vulnerabilities. Our evaluation shows that GRASP consistently achieves Security Rates (SR) exceeding 80% across multiple LLMs, and delivers up to 88% improvements over baselines on zero-day vulnerabilities.
Authors:Debeshee Das, Luca Beurer-Kellner, Marc Fischer, Maximilian Baader
Abstract:
The increasing adoption of LLM agents with access to numerous tools and sensitive data significantly widens the attack surface for indirect prompt injections. Due to the context-dependent nature of attacks, however, current defenses are often ill-calibrated as they cannot reliably differentiate malicious and benign instructions, leading to high false positive rates that prevent their real-world adoption. To address this, we present a novel approach inspired by the fundamental principle of computer security: data should not contain executable instructions. Instead of sample-level classification, we propose a token-level sanitization process, which surgically removes any instructions directed at AI systems from tool outputs, capturing malicious instructions as a byproduct. In contrast to existing safety classifiers, this approach is non-blocking, does not require calibration, and is agnostic to the context of tool outputs. Further, we can train such token-level predictors with readily available instruction-tuning data only, and don't have to rely on unrealistic prompt injection examples from challenges or of other synthetic origin. In our experiments, we find that this approach generalizes well across a wide range of attacks and benchmarks like AgentDojo, BIPIA, InjecAgent, ASB and SEP, achieving a 7-10x reduction of attack success rate (ASR) (34% to 3% on AgentDojo), without impairing agent utility in both benign and malicious settings.
Authors:Muris Sladić, Veronica Valeros, Carlos Catania, Sebastian Garcia
Abstract:
There are very few SotA deception systems based on Large Language Models. The existing ones are limited only to simulating one type of service, mainly SSH shells. These systems - but also the deception technologies not based on LLMs - lack an extensive evaluation that includes human attackers. Generative AI has recently become a valuable asset for cybersecurity researchers and practitioners, and the field of cyber-deception is no exception. Researchers have demonstrated how LLMs can be leveraged to create realistic-looking honeytokens, fake users, and even simulated systems that can be used as honeypots. This paper presents an AI-based deception framework called VelLMes, which can simulate multiple protocols and services such as SSH Linux shell, MySQL, POP3, and HTTP. All of these can be deployed and used as honeypots, thus VelLMes offers a variety of choices for deception design based on the users' needs. VelLMes is designed to be attacked by humans, so interactivity and realism are key for its performance. We evaluate the generative capabilities and the deception capabilities. Generative capabilities were evaluated using unit tests for LLMs. The results of the unit tests show that, with careful prompting, LLMs can produce realistic-looking responses, with some LLMs having a 100% passing rate. In the case of the SSH Linux shell, we evaluated deception capabilities with 89 human attackers. The results showed that about 30% of the attackers thought that they were interacting with a real system when they were assigned an LLM-based honeypot. Lastly, we deployed 10 instances of the SSH Linux shell honeypot on the Internet to capture real-life attacks. Analysis of these attacks showed us that LLM honeypots simulating Linux shells can perform well against unstructured and unexpected attacks on the Internet, responding correctly to most of the issued commands.
Authors:Junki Mori, Kazuya Kakizaki, Taiki Miyagawa, Jun Sakuma
Abstract:
Retrieval-Augmented Generation (RAG) enhances large language models (LLMs) by grounding them in external knowledge. However, its application in sensitive domains is limited by privacy risks. Existing private RAG methods typically rely on query-time differential privacy (DP), which requires repeated noise injection and leads to accumulated privacy loss. To address this issue, we propose DP-SynRAG, a framework that uses LLMs to generate differentially private synthetic RAG databases. Unlike prior methods, the synthetic text can be reused once created, thereby avoiding repeated noise injection and additional privacy costs. To preserve essential information for downstream RAG tasks, DP-SynRAG extends private prediction, which instructs LLMs to generate text that mimics subsampled database records in a DP manner. Experiments show that DP-SynRAG achieves superior performanec to the state-of-the-art private RAG systems while maintaining a fixed privacy budget, offering a scalable solution for privacy-preserving RAG.
Authors:Fikret Mert Gültekin, Oscar Lilja, Ranim Khojah, Rebekka Wohlrab, Marvin Damschen, Mazen Mohamad
Abstract:
In safety-critical software systems, cybersecurity activities become essential, with risk assessment being one of the most critical. In many software teams, cybersecurity experts are either entirely absent or represented by only a small number of specialists. As a result, the workload for these experts becomes high, and software engineers would need to conduct cybersecurity activities themselves. This creates a need for a tool to support cybersecurity experts and engineers in evaluating vulnerabilities and threats during the risk assessment process. This paper explores the potential of leveraging locally hosted large language models (LLMs) with retrieval-augmented generation to support cybersecurity risk assessment in the forestry domain while complying with data protection and privacy requirements that limit external data sharing. We performed a design science study involving 12 experts in interviews, interactive sessions, and a survey within a large-scale project. The results demonstrate that LLMs can assist cybersecurity experts by generating initial risk assessments, identifying threats, and providing redundancy checks. The results also highlight the necessity for human oversight to ensure accuracy and compliance. Despite trust concerns, experts were willing to utilize LLMs in specific evaluation and assistance roles, rather than solely relying on their generative capabilities. This study provides insights that encourage the use of LLM-based agents to support the risk assessment process of cyber-physical systems in safety-critical domains.
Authors:Yuval Efron, Joachim Neu, Ling Ren, Ertem Nusret Tas
Abstract:
In the context of Byzantine consensus problems such as Byzantine broadcast (BB) and Byzantine agreement (BA), the good-case setting aims to study the minimal possible latency of a BB or BA protocol under certain favorable conditions, namely the designated leader being correct (for BB), or all parties having the same input value (for BA). We provide a full characterization of the feasibility and impossibility of good-case latency, for both BA and BB, in the synchronous sleepy model. Surprisingly to us, we find irrational resilience thresholds emerging: 2-round good-case BB is possible if and only if at all times, at least $\frac{1}φ \approx 0.618$ fraction of the active parties are correct, where $φ= \frac{1+\sqrt{5}}{2} \approx 1.618$ is the golden ratio; 1-round good-case BA is possible if and only if at least $\frac{1}{\sqrt{2}} \approx 0.707$ fraction of the active parties are correct.
Authors:Khaled Serag, Zhaozhou Tang, Sungwoo Kim, Vireshwar Kumar, Dave, Tian, Saman Zonouz, Raheem Beyah, Dongyan Xu, Z. Berkay Celik
Abstract:
For decades, the Controller Area Network (CAN) has served as the primary in-vehicle bus (IVB) and extended its use to many non-vehicular systems. Over the past years, CAN security has been intensively scrutinized, yielding extensive research literature. Despite its wealth, the literature lacks structured systematization, complicating efforts to assess attack severity, defense efficacy, identify security gaps, or root causes. This leaves non experts uncertain about the relevancy of specific attacks or defenses to their systems, inadvertently portraying CAN as irredeemably insecure. Further, the introduction of new IVB technologies--CAN evolutions, add-ons, and alternative buses--with heightened security claims risks fostering the misconception that merely adopting these technologies resolves CAN's security challenges. This paper systematizes existing CAN security knowledge, presenting a comprehensive taxonomy and assessment models of attackers, attacks, and defenses. It identifies replicable attacks and defense gaps, investigating their root causes as inherent, accidental, unique, or universal. It then extrapolates these insights to emerging IVB technologies by formally analyzing three emerging IVBs to identify shared root causes with CAN and assess their ability to close security gaps. The findings challenge common perceptions, demonstrating that CAN is more securable than perceived, that most insecurity root causes are shared across IVBs, and that merely adopting newer IVB technology does not solve persistent security issues. The paper concludes by highlighting future research directions to secure IVB communication down the road.
Authors:Tejaswini Sanjay Katale, Lu Gao, Yunpeng Zhang, Alaa Senouci
Abstract:
Cyberattacks on pipeline operational technology systems pose growing risks to energy infrastructure. This study develops a physics-informed simulation and optimization framework for analyzing cyber-physical threats in petroleum pipeline networks. The model integrates networked hydraulic dynamics, SCADA-based state estimation, model predictive control (MPC), and a bi-level formulation for stealthy false-data injection (FDI) attacks. Pipeline flow and pressure dynamics are modeled on a directed graph using nodal pressure evolution and edge-based Weymouth-type relations, including control-aware equipment such as valves and compressors. An extended Kalman filter estimates the full network state from partial SCADA telemetry. The controller computes pressure-safe control inputs via MPC under actuator constraints and forecasted demands. Adversarial manipulation is formalized as a bi-level optimization problem where an attacker perturbs sensor data to degrade throughput while remaining undetected by bad-data detectors. This attack-control interaction is solved via Karush-Kuhn-Tucker (KKT) reformulation, which results in a tractable mixed-integer quadratic program. Test gas pipeline case studies demonstrate the covert reduction of service delivery under attack. Results show that undetectable attacks can cause sustained throughput loss with minimal instantaneous deviation. This reveals the need for integrated detection and control strategies in cyber-physical infrastructure.
Authors:Yue Li, Linying Xue, Dongdong Lin, Qiushi Li, Hui Tian, Hongxia Wang
Abstract:
With the flourishing prosperity of generative models, manipulated facial images have become increasingly accessible, raising concerns regarding privacy infringement and societal trust. In response, proactive defense strategies embed adversarial perturbations into facial images to counter deepfake manipulation. However, existing methods often face a tradeoff between imperceptibility and defense effectiveness-strong perturbations may disrupt forgeries but degrade visual fidelity. Recent studies have attempted to address this issue by introducing additional visual loss constraints, yet often overlook the underlying gradient conflicts among losses, ultimately weakening defense performance. To bridge the gap, we propose a gradient-projection-based adversarial proactive defense (GRASP) method that effectively counters facial deepfakes while minimizing perceptual degradation. GRASP is the first approach to successfully integrate both structural similarity loss and low-frequency loss to enhance perturbation imperceptibility. By analyzing gradient conflicts between defense effectiveness loss and visual quality losses, GRASP pioneers the design of the gradient-projection mechanism to mitigate these conflicts, enabling balanced optimization that preserves image fidelity without sacrificing defensive performance. Extensive experiments validate the efficacy of GRASP, achieving a PSNR exceeding 40 dB, SSIM of 0.99, and a 100% defense success rate against facial attribute manipulations, significantly outperforming existing approaches in visual quality.
Authors:Ahsan Farabi, Muhaiminul Rashid Shad, Israt Khandaker
Abstract:
Intrusion Detection Systems (IDS) face persistent challenges due to evolving cyberattacks, high-dimensional traffic data, and severe class imbalance in benchmark datasets such as NSL-KDD. To address these issues, we propose IntrusionX, a hybrid deep learning framework that integrates Convolutional Neural Networks (CNNs) for local feature extraction and Long Short-Term Memory (LSTM) networks for temporal modeling. The architecture is further optimized using the Squirrel Search Algorithm (SSA), enabling effective hyperparameter tuning while maintaining computational efficiency. Our pipeline incorporates rigorous preprocessing, stratified data splitting, and dynamic class weighting to enhance the detection of rare classes. Experimental evaluation on NSL-KDD demonstrates that IntrusionX achieves 98% accuracy in binary classification and 87% in 5-class classification, with significant improvements in minority class recall (U2R: 71%, R2L: 93%). The novelty of IntrusionX lies in its reproducible, imbalance-aware design with metaheuristic optimization.
Authors:Hong kyu Lee, Ruixuan Liu, Li Xiong
Abstract:
Machine unlearning is an emerging technique that removes the influence of a subset of training data (forget set) from a model without full retraining, with applications including privacy protection, content moderation, and model correction. The key challenge lies in ensuring that the model completely forgets the knowledge of the forget set without compromising its overall utility. Existing unlearning methods for large language models (LLMs) often utilize auxiliary language models, retain datasets, or even commercial AI services for effective unlearning and maintaining the model utility. However, dependence on these external resources is often impractical and could potentially introduce additional privacy risks. In this work, we propose direct token optimization (DTO), a novel self-contained unlearning approach for LLMs that directly optimizes the token level objectives and eliminates the need for external resources. Given a sequence to unlearn, we identify two categories of tokens: target tokens, which capture critical knowledge for unlearning, and the remaining non-target tokens, which are crucial for maintaining the model utility. The former are used to optimize the unlearning objective, while the latter serve to preserve the model's performance. The experimental results show that the proposed DTO achieves up to 16.8$\times$ improvement in forget quality on several benchmark datasets than the latest baselines while maintaining a comparable level of model utility.
Authors:Raghul Saravanan, Sai Manoj P D
Abstract:
The ever-increasing complexity of design specifications for processors and intellectual property (IP) presents a formidable challenge for early bug detection in the modern IC design cycle. The recent advancements in hardware fuzzing have proven effective in detecting bugs in RTL designs of cutting-edge processors. The modern IC design flow involves incremental updates and modifications to the hardware designs necessitating rigorous verification and extending the overall verification period. To accelerate this process, directed fuzzing has emerged focusing on generating targeted stimuli for specific regions of the design, avoiding the need for exhaustive, full-scale verification. However, a significant limitation of these hardware fuzzers lies in their reliance on an equivalent SW model of the hardware which fails to capture intrinsic hardware characteristics. To circumvent the aforementioned challenges, this work introduces TargetFuzz, an innovative and scalable targeted hardware fuzzing mechanism. It leverages SAT-based techniques to focus on specific regions of the hardware design while operating at its native hardware abstraction level, ensuring a more precise and comprehensive verification process. We evaluated this approach across a diverse range of RTL designs for various IP cores. Our experimental results demonstrate its capability to effectively target and fuzz a broad spectrum of sites within these designs, showcasing its extensive coverage and precision in addressing targeted regions. TargetFuzz demonstrates its capability to effectively scale 30x greater in terms of handling target sites, achieving 100% state coverage and 1.5x faster in terms of site coverage, and shows 90x improvement in target state coverage compared to Coverage-Guided Fuzzing, demonstrating its potential to advance the state-of-the-art in directed hardware fuzzing.
Authors:Hui Wang, Nima Tashakor, Xiaoyang Tian, Hans D. Schotten, Stefan M. Goetz
Abstract:
With the popularity of wireless charging, energy access protection and cybersecurity are gaining importance, especially in public places. Currently, the most common energy encryption method uses frequency and associated impedance variation. However, we have proven that this method is not reliable, since a hacker can detect the changing frequency and adjust the compensation. However, the previously presented system needed time to follow the updated frequency, while encryption systems may vary the frequency faster to avoid energy theft. Furthermore, the previous system required an additional sensor coil. To solve these problems, we optimized the attack and the associated system, which can intrude and steal energy within 0.2 ms. The key is the elimination of the time-consuming maximum receiver current regulation. Also, we use the main receiving coil rather than any additional sensor antenna to detect the magnetic field. Thus, the new hardware is even simpler. A simulation model and experimental results demonstrate the fast response speed of the attack on encrypted wireless power and steal 65% of the power. Overall, the applicability of the attack is highly improved and leaves less room for hardening the encryption. The results demonstrate that energy access protection needs to be given great attention.
Authors:Sepideh Abedini, Shubhankar Mohapatra, D. B. Emerson, Masoumeh Shafieinejad, Jesse C. Cresswell, Xi He
Abstract:
Large language models (LLMs) have shown promising performance on tasks that require reasoning, such as text-to-SQL, code generation, and debugging. However, regulatory frameworks with strict privacy requirements constrain their integration into sensitive systems. State-of-the-art LLMs are also proprietary, costly, and resource-intensive, making local deployment impractical. Consequently, utilizing such LLMs often requires sharing data with third-party providers, raising privacy concerns and risking noncompliance with regulations. Although fine-tuned small language models (SLMs) can outperform LLMs on certain tasks and be deployed locally to mitigate privacy concerns, they underperform on more complex tasks such as text-to-SQL translation. In this work, we introduce MaskSQL, a text-to-SQL framework that utilizes abstraction as a privacy protection mechanism to mask sensitive information in LLM prompts. Unlike redaction, which removes content entirely, or generalization, which broadens tokens, abstraction retains essential information while discarding unnecessary details, striking an effective privacy-utility balance for the text-to-SQL task. Moreover, by providing mechanisms to control the privacy-utility tradeoff, MaskSQL facilitates adoption across a broader range of use cases. Our experimental results show that MaskSQL outperforms leading SLM-based text-to-SQL models and achieves performance approaching state-of-the-art LLM-based models, while preserving privacy.
Authors:Charles E. Gagnon, Steven H. H. Ding, Philippe Charland, Benjamin C. M. Fung
Abstract:
Binary code similarity detection is a core task in reverse engineering. It supports malware analysis and vulnerability discovery by identifying semantically similar code in different contexts. Modern methods have progressed from manually engineered features to vector representations. Hand-crafted statistics (e.g., operation ratios) are interpretable, but shallow and fail to generalize. Embedding-based methods overcome this by learning robust cross-setting representations, but these representations are opaque vectors that prevent rapid verification. They also face a scalability-accuracy trade-off, since high-dimensional nearest-neighbor search requires approximations that reduce precision. Current approaches thus force a compromise between interpretability, generalizability, and scalability. We bridge these gaps using a language model-based agent to conduct structured reasoning analysis of assembly code and generate features such as input/output types, side effects, notable constants, and algorithmic intent. Unlike hand-crafted features, they are richer and adaptive. Unlike embeddings, they are human-readable, maintainable, and directly searchable with inverted or relational indexes. Without any matching training, our method respectively achieves 42% and 62% for recall@1 in cross-architecture and cross-optimization tasks, comparable to embedding methods with training (39% and 34%). Combined with embeddings, it significantly outperforms the state-of-the-art, demonstrating that accuracy, scalability, and interpretability can coexist.
Authors:Jeongyeon Hwang, Sangdon Park, Jungseul Ok
Abstract:
Watermarking for large language models (LLMs) embeds a statistical signal during generation to enable detection of model-produced text. While watermarking has proven effective in benign settings, its robustness under adversarial evasion remains contested. To advance a rigorous understanding and evaluation of such vulnerabilities, we propose the \emph{Bias-Inversion Rewriting Attack} (BIRA), which is theoretically motivated and model-agnostic. BIRA weakens the watermark signal by suppressing the logits of likely watermarked tokens during LLM-based rewriting, without any knowledge of the underlying watermarking scheme. Across recent watermarking methods, BIRA achieves over 99\% evasion while preserving the semantic content of the original text. Beyond demonstrating an attack, our results reveal a systematic vulnerability, emphasizing the need for stress testing and robust defenses.
Authors:Xinyu Hu, Zhiwei Fu, Shaocong Xie, Steven H. H. Ding, Philippe Charland
Abstract:
Reverse Engineering (RE) is central to software security, enabling tasks such as vulnerability discovery and malware analysis, but it remains labor-intensive and requires substantial expertise. Earlier advances in deep learning start to automate parts of RE, particularly for malware detection and vulnerability classification. More recently, a rapidly growing body of work has applied Large Language Models (LLMs) to similar purposes. Their role compared to prior machine learning remains unclear, since some efforts simply adapt existing pipelines with minimal change while others seek to exploit broader reasoning and generative abilities. These differences, combined with varied problem definitions, methods, and evaluation practices, limit comparability, reproducibility, and cumulative progress. This paper systematizes the field by reviewing 44 research papers, including peer-reviewed publications and preprints, and 18 additional open-source projects that apply LLMs in RE. We propose a taxonomy that organizes existing work by objective, target, method, evaluation strategy, and data scale. Our analysis identifies strengths and limitations, highlights reproducibility and evaluation gaps, and examines emerging risks. We conclude with open challenges and future research directions that aim to guide more coherent and security-relevant applications of LLMs in RE.
Authors:Alioune Diallo, Anta Diop, Abdoul Kader Kabore, Jordan Samhi, Aleksandr Pilgun, Tegawendé F. Bissyande, Jacque Klein
Abstract:
Android's open-source nature facilitates widespread smartphone accessibility, particularly in price-sensitive markets. System and vendor applications that come pre-installed on budget Android devices frequently operate with elevated privileges, yet they receive limited independent examination. To address this gap, we developed a framework that extracts APKs from physical devices and applies static analysis to identify privacy and security issues in embedded software. Our study examined 1,544 APKs collected from seven African smartphones. The analysis revealed that 145 applications (9%) disclose sensitive data, 249 (16%) expose critical components without sufficient safeguards, and many present additional risks: 226 execute privileged or dangerous commands, 79 interact with SMS messages (read, send, or delete), and 33 perform silent installation operations. We also uncovered a vendor-supplied package that appears to transmit device identifiers and location details to an external third party. These results demonstrate that pre-installed applications on widely distributed low-cost devices represent a significant and underexplored threat to user security and privacy.
Authors:Stefanos Chaliasos, Conner Swann, Sina Pilehchiha, Nicolas Mohnblatt, Benjamin Livshits, Assimakis Kattis
Abstract:
Rollups have become the de facto scalability solution for Ethereum, securing more than $55B in assets. They achieve scale by executing transactions on a Layer 2 ledger, while periodically posting data and finalizing state on the Layer 1, either optimistically or via validity proofs. Their fees must simultaneously reflect the pricing of three resources: L2 costs (e.g., execution), L1 DA, and underlying L1 gas costs for batch settlement and proof verification. In this work, we identify critical mis-pricings in existing rollup transaction fee mechanisms (TFMs) that allow for two powerful attacks. Firstly, an adversary can saturate the L2's DA batch capacity with compute-light data-heavy transactions, forcing low-gas transaction batches that enable both L2 DoS attacks, and finality-delay attacks. Secondly, by crafting prover killer transactions that maximize proving cycles relative to the gas charges, an adversary can effectively stall proof generation, delaying finality by hours and inflicting prover-side economic losses to the rollup at a minimal cost. We analyze the above attack vectors across the major Ethereum rollups, quantifying adversarial costs and protocol losses. We find that the first attack enables periodic DoS on rollups, lasting up to 30 minutes, at a cost below 2 ETH for most rollups. Moreover, we identify three rollups that are exposed to indefinite DoS at a cost of approximately 0.8 to 2.7 ETH per hour. The attack can be further modified to increase finalization delays by a factor of about 1.45x to 2.73x, compared to direct L1 blob-stuffing, depending on the rollup's parameters. Furthermore, we find that the prover killer attack induces a finalization latency increase of about 94x. Finally, we propose comprehensive mitigations to prevent these attacks and suggest how some practical uses of multi-dimensional rollup TFMs can rectify the identified mis-pricing attacks.
Authors:S M Asif Hossain, Ruksat Khan Shayoni, Mohd Ruhul Ameen, Akif Islam, M. F. Mridha, Jungpil Shin
Abstract:
Prompt injection attacks represent a major vulnerability in Large Language Model (LLM) deployments, where malicious instructions embedded in user inputs can override system prompts and induce unintended behaviors. This paper presents a novel multi-agent defense framework that employs specialized LLM agents in coordinated pipelines to detect and neutralize prompt injection attacks in real-time. We evaluate our approach using two distinct architectures: a sequential chain-of-agents pipeline and a hierarchical coordinator-based system. Our comprehensive evaluation on 55 unique prompt injection attacks, grouped into 8 categories and totaling 400 attack instances across two LLM platforms (ChatGLM and Llama2), demonstrates significant security improvements. Without defense mechanisms, baseline Attack Success Rates (ASR) reached 30% for ChatGLM and 20% for Llama2. Our multi-agent pipeline achieved 100% mitigation, reducing ASR to 0% across all tested scenarios. The framework demonstrates robustness across multiple attack categories including direct overrides, code execution attempts, data exfiltration, and obfuscation techniques, while maintaining system functionality for legitimate queries.
Authors:Ali Al-kuwari, Noureldin Mohamed, Saif Al-kuwari, Ahmed Farouk, Bikash K. Behera
Abstract:
The emergence of quantum computing poses significant risks to the security of modern communication networks as it breaks today's public-key cryptographic algorithms. Quantum Key Distribution (QKD) offers a promising solution by harnessing the principles of quantum mechanics to establish secure keys. However, practical QKD implementations remain vulnerable to hardware imperfections and advanced attacks such as Photon Number Splitting and Trojan-Horse attacks. In this work, we investigate the potential of using quantum machine learning (QML) to detect popular QKD attacks. In particular, we propose a Hybrid Quantum Long Short-Term Memory (QLSTM) model to improve the detection of common QKD attacks. By combining quantum-enhanced learning with classical deep learning, the model captures complex temporal patterns in QKD data, improving detection accuracy. To evaluate the proposed model, we introduce a realistic QKD dataset simulating normal QKD operations along with seven attack scenarios, Intercept-and-Resend, Photon-Number Splitting (PNS), Trojan-Horse attacks Random Number Generator (RNG), Detector Blinding, Wavelength-dependent Trojan Horse, and Combined attacks. The dataset includes quantum security metrics such as Quantum Bit Error Rate (QBER), measurement entropy, signal and decoy loss rates, and time-based metrics, ensuring an accurate representation of real-world conditions. Our results demonstrate promising performance of the quantum machine learning approach compared to traditional classical machine learning models, highlighting the potential of hybrid techniques to enhance the security of future quantum communication networks. The proposed Hybrid QLSTM model achieved an accuracy of 93.7.0\% after 50 training epochs, outperforming classical deep learning models such as LSTM, and CNN.
Authors:Moritz Bley, Tobias Scharnowski, Simon Wörner, Moritz Schloegel, Thorsten Holz
Abstract:
One of the biggest attack surfaces of embedded systems is their network interfaces, which enable communication with other devices. Unlike their general-purpose counterparts, embedded systems are designed for specialized use cases, resulting in unique and diverse communication stacks. Unfortunately, current approaches for evaluating the security of these embedded network stacks require manual effort or access to hardware, and they generally focus only on small parts of the embedded system. A promising alternative is firmware rehosting, which enables fuzz testing of the entire firmware by generically emulating the physical hardware. However, existing rehosting methods often struggle to meaningfully explore network stacks due to their complex, multi-layered input formats. This limits their ability to uncover deeply nested software faults. To address this problem, we introduce a novel method to automatically detect and handle the use of network protocols in firmware called Pemu. By automatically deducing the available network protocols, Pemu can transparently generate valid network packets that encapsulate fuzzing data, allowing the fuzzing input to flow directly into deeper layers of the firmware logic. Our approach thus enables a deeper, more targeted, and layer-by-layer analysis of firmware components that were previously difficult or impossible to test. Our evaluation demonstrates that Pemu consistently improves the code coverage of three existing rehosting tools for embedded network stacks. Furthermore, our fuzzer rediscovered several known vulnerabilities and identified five previously unknown software faults, highlighting its effectiveness in uncovering deeply nested bugs in network-exposed code.
Authors:James C. Ward, Alex Bott, Connor York, Edmund R. Hunt
Abstract:
Simulating hostile attacks of physical autonomous systems can be a useful tool to examine their robustness to attack and inform vulnerability-aware design. In this work, we examine this through the lens of multi-robot patrol, by presenting a machine learning-based adversary model that observes robot patrol behavior in order to attempt to gain undetected access to a secure environment within a limited time duration. Such a model allows for evaluation of a patrol system against a realistic potential adversary, offering insight into future patrol strategy design. We show that our new model outperforms existing baselines, thus providing a more stringent test, and examine its performance against multiple leading decentralized multi-robot patrol strategies.
Authors:Robert Dumitru, Junpeng Wan, Daniel Genkin, Rick Kennell, Dave, Tian, Yuval Yarom
Abstract:
In recent years, Rowhammer has attracted significant attention from academia and industry alike. This technique, first published in 2014, flips bits in memory by repeatedly accessing neighbouring memory locations. Since its discovery, researchers have developed a substantial body of work exploiting Rowhammer and proposing countermeasures. These works demonstrate that Rowhammer can be mounted not only through native code, but also via remote code execution, such as JavaScript in browsers, and over networks. In this work, we uncover a previously unexplored Rowhammer vector. We present Thunderhammer, an attack that induces DRAM bitflips from malicious peripherals connected via PCIe or Thunderbolt (which tunnels PCIe). On modern DDR4 systems, we observe that triggering bitflips through PCIe requests requires precisely timed access patterns tailored to the target system. We design a custom device to reverse engineer critical architectural parameters that shape PCIe request scheduling, and to execute effective hammering access patterns. Leveraging this knowledge, we successfully demonstrate Rowhammer-induced bitflips in DDR4 memory modules via both PCIe slot connections and Thunderbolt ports tunnelling PCIe.
Authors:Tao Wang, Yushu Zhang, Xiangli Xiao, Kun Xu, Lin Yuan, Wenying Wen, Yuming Fang
Abstract:
Deep learning-based face recognition (FR) technology exacerbates privacy concerns in photo sharing. In response, the research community developed a suite of anti-FR methods to block identity extraction by unauthorized FR systems. Benefiting from quasi-imperceptible alteration, perturbation-based methods are well-suited for privacy protection of subject faces in photos, as they allow familiar persons to recognize subjects via naked eyes. However, we reveal that perturbation-based methods provide a false sense of privacy through theoretical analysis and experimental validation. Therefore, new alternative solutions should be found to protect subject faces. In this paper, we explore synthesis-based methods as a promising solution, whose challenge is to enable familiar persons to recognize subjects. To solve the challenge, we present a key insight: In most photo sharing scenarios, familiar persons recognize subjects through identity perception rather than meticulous face analysis. Based on the insight, we propose the first synthesis-based method dedicated to subject faces, i.e., PerceptFace, which can make identity unextractable yet perceptible. To enhance identity perception, a new perceptual similarity loss is designed for faces, reducing the alteration in regions of high sensitivity to human vision. As a synthesis-based method, PerceptFace can inherently provide reliable identity protection. Meanwhile, out of the confine of meticulous face analysis, PerceptFace focuses on identity perception from a more practical scenario, which is also enhanced by the designed perceptual similarity loss. Sufficient experiments show that PerceptFace achieves a superior trade-off between identity protection and identity perception compared to existing methods. We provide a public API of PerceptFace and believe that it has great potential to become a practical anti-FR tool.
Authors:Navid Aftabi, Philip Samaha, Jin Ma, Long Cheng, Ramy Harik, Dan Li
Abstract:
Industrial robotic systems are central to automating smart manufacturing operations. Connected and automated factories face growing cybersecurity risks that can potentially cause interruptions and damages to physical operations. Among these attacks, data-integrity attacks often involve sophisticated exploitation of vulnerabilities that enable an attacker to access and manipulate the operational data and are hence difficult to detect with only existing intrusion detection or model-based detection. This paper addresses the challenges in utilizing existing side-channels to detect data-integrity attacks in robotic manufacturing processes by developing an online detection framework, ViSTR-GP, that cross-checks encoder-reported measurements against a vision-based estimate from an overhead camera outside the controller's authority. In this framework, a one-time interactive segmentation initializes SAM-Track to generate per-frame masks. A low-rank tensor-regression surrogate maps each mask to measurements, while a matrix-variate Gaussian process models nominal residuals, capturing temporal structure and cross-joint correlations. A frame-wise test statistic derived from the predictive distribution provides an online detector with interpretable thresholds. We validate the framework on a real-world robotic testbed with synchronized video frame and encoder data, collecting multiple nominal cycles and constructing replay attack scenarios with graded end-effector deviations. Results on the testbed indicate that the proposed framework recovers joint angles accurately and detects data-integrity attacks earlier with more frequent alarms than all baselines. These improvements are most evident in the most subtle attacks. These results show that plants can detect data-integrity attacks by adding an independent physical channel, bypassing the controller's authority, without needing complex instrumentation.
Authors:Shama Maganur, Yili Jiang, Jiaqi Huang, Fangtian Zhong
Abstract:
Sophisticated malware families exploit the openness of the Android platform to infiltrate IoT networks, enabling large-scale disruption, data exfiltration, and denial-of-service attacks. This systematic literature review (SLR) examines cutting-edge approaches to Android malware analysis with direct implications for securing IoT infrastructures. We analyze feature extraction techniques across static, dynamic, hybrid, and graph-based methods, highlighting their trade-offs: static analysis offers efficiency but is easily evaded through obfuscation; dynamic analysis provides stronger resistance to evasive behaviors but incurs high computational costs, often unsuitable for lightweight IoT devices; hybrid approaches balance accuracy with resource considerations; and graph-based methods deliver superior semantic modeling and adversarial robustness. This survey contributes a structured comparison of existing methods, exposes research gaps, and outlines a roadmap for future directions to enhance scalability, adaptability, and long-term security in IoT-driven Android malware detection.
Authors:Changjia Zhu, Junjie Xiong, Renkai Ma, Zhicong Lu, Yao Liu, Lingyao Li
Abstract:
Peer review is the cornerstone of academic publishing, yet the process is increasingly strained by rising submission volumes, reviewer overload, and expertise mismatches. Large language models (LLMs) are now being used as "reviewer aids," raising concerns about their fairness, consistency, and robustness against indirect prompt injection attacks. This paper presents a systematic evaluation of LLMs as academic reviewers. Using a curated dataset of 1,441 papers from ICLR 2023 and NeurIPS 2022, we evaluate GPT-5-mini against human reviewers across ratings, strengths, and weaknesses. The evaluation employs structured prompting with reference paper calibration, topic modeling, and similarity analysis to compare review content. We further embed covert instructions into PDF submissions to assess LLMs' susceptibility to prompt injection. Our findings show that LLMs consistently inflate ratings for weaker papers while aligning more closely with human judgments on stronger contributions. Moreover, while overarching malicious prompts induce only minor shifts in topical focus, explicitly field-specific instructions successfully manipulate specific aspects of LLM-generated reviews. This study underscores both the promises and perils of integrating LLMs into peer review and points to the importance of designing safeguards that ensure integrity and trust in future review processes.
Authors:Chanti Raju Mylay, Bobin Deng, Zhipeng Cai, Honghui Xu
Abstract:
Crop diseases pose significant threats to global food security, agricultural productivity, and sustainable farming practices, directly affecting farmers' livelihoods and economic stability. To address the growing need for effective crop disease management, AI-based disease alerting systems have emerged as promising tools by providing early detection and actionable insights for timely intervention. However, existing systems often overlook critical aspects such as data privacy, market pricing power, and farmer-friendly usability, leaving farmers vulnerable to privacy breaches and economic exploitation. To bridge these gaps, we propose AgriSentinel, the first Privacy-Enhanced Embedded-LLM Crop Disease Alerting System. AgriSentinel incorporates a differential privacy mechanism to protect sensitive crop image data while maintaining classification accuracy. Its lightweight deep learning-based crop disease classification model is optimized for mobile devices, ensuring accessibility and usability for farmers. Additionally, the system includes a fine-tuned, on-device large language model (LLM) that leverages a curated knowledge pool to provide farmers with specific, actionable suggestions for managing crop diseases, going beyond simple alerting. Comprehensive experiments validate the effectiveness of AgriSentinel, demonstrating its ability to safeguard data privacy, maintain high classification performance, and deliver practical, actionable disease management strategies. AgriSentinel offers a robust, farmer-friendly solution for automating crop disease alerting and management, ultimately contributing to improved agricultural decision-making and enhanced crop productivity.
Authors:Shixin Song, Tingzhen Dong, Kosi Nwabueze, Julian Zanders, Andres Erbsen, Adam Chlipala, Mengjia Yan
Abstract:
Authors of cryptographic software are well aware that their code should not leak secrets through its timing behavior, and, until 2018, they believed that following industry-standard constant-time coding guidelines was sufficient. However, the revelation of the Spectre family of speculative execution attacks injected new complexities. To block speculative attacks, prior work has proposed annotating the program's source code to mark secret data, with hardware using this information to decide when to speculate (i.e., when only public values are involved) or not (when secrets are in play). While these solutions are able to track secret information stored on the heap, they suffer from limitations that prevent them from correctly tracking secrets on the stack, at a cost in performance. This paper introduces SecSep, a transformation framework that rewrites assembly programs so that they partition secret and public data on the stack. By moving from the source-code level to assembly rewriting, SecSep is able to address limitations of prior work. The key challenge in performing this assembly rewriting stems from the loss of semantic information through the lengthy compilation process. The key innovation of our methodology is a new variant of typed assembly language (TAL), Octal, which allows us to address this challenge. Assembly rewriting is driven by compile-time inference within Octal. We apply our technique to cryptographic programs and demonstrate that it enables secure speculation efficiently, incurring a low average overhead of $1.2\%$.
Authors:Nils Bars, Lukas Bernhard, Moritz Schloegel, Thorsten Holz
Abstract:
We use browsers daily to access all sorts of information. Because browsers routinely process scripts, media, and executable code from unknown sources, they form a critical security boundary between users and adversaries. A common attack vector is JavaScript, which exposes a large attack surface due to the sheer complexity of modern JavaScript engines. To mitigate these threats, modern engines increasingly adopt software-based fault isolation (SFI). A prominent example is Google's V8 heap sandbox, which represents the most widely deployed SFI mechanism, protecting billions of users across all Chromium-based browsers and countless applications built on Node$.$js and Electron. The heap sandbox splits the address space into two parts: one part containing trusted, security-sensitive metadata, and a sandboxed heap containing memory accessible to untrusted code. On a technical level, the sandbox enforces isolation by removing raw pointers and using translation tables to resolve references to trusted objects. Consequently, an attacker cannot corrupt trusted data even with full control of the sandboxed data, unless there is a bug in how code handles data from the sandboxed heap. Despite their widespread use, such SFI mechanisms have seen little security testing. In this work, we propose a new testing technique that models the security boundary of modern SFI implementations. Following the SFI threat model, we assume a powerful attacker who fully controls the sandbox's memory. We implement this by instrumenting memory loads originating in the trusted domain and accessing untrusted, attacker-controlled sandbox memory. We then inject faults into the loaded data, aiming to trigger memory corruption in the trusted domain. In a comprehensive evaluation, we identify 19 security bugs in V8 that enable an attacker to bypass the sandbox.
Authors:Margarita Capretto, MartÃn Ceresa, Antonio Fernández Anta, Pedro Moreno-Sanchez, César Sánchez
Abstract:
Blockchains face a scalability limitation, partly due to the throughput limitations of consensus protocols, especially when aiming to obtain a high degree of decentralization. Layer 2 Rollups (L2s) are a faster alternative to conventional blockchains. L2s perform most computations offchain using minimally blockchains (L1) under-the-hood to guarantee correctness. A sequencer is a service that receives offchain L2 transaction requests, batches these transactions, and commits compressed or hashed batches to L1. Using hashing needs less L1 space, which is beneficial for gas cost, but requires a data availability committee (DAC) service to translate hashes into their corresponding batches of transaction requests. The behavior of sequencers and DACs influence the evolution of the L2 blockchain, presenting a potential security threat and delaying L2 adoption. We propose in this paper fraud-proof mechanisms, arbitrated by L1 contracts, to detect and generate evidence of dishonest behavior of the sequencer and DAC. We study how these fraud-proofs limit the power of adversaries that control different number of sequencer and DACs members, and provide incentives for their honest behavior. We designed these fraud-proof mechanisms as two player games. Unlike the generic fraud-proofs in current L2s (designed to guarantee the correct execution of transactions), our fraud-proofs are over pred-etermined algorithms that verify the properties that determine the correctness of the DAC. Arbitrating over concrete algorithms makes our fraud-proofs more efficient, easier to understand, and simpler to prove correct. We provide as an artifact a mechanization in LEAN4 of our fraud-proof games, including (1) the verified strategies that honest players should play to win all games as well as (2) mechanisms to detect dishonest claims.
Authors:Jack Wilkie, Hanan Hindy, Ivan Andonovic, Christos Tachtatzis, Robert Atkinson
Abstract:
Malware classification is a contemporary and ongoing challenge in cyber-security: modern obfuscation techniques are able to evade traditional static analysis, while dynamic analysis is too resource intensive to be deployed at a large scale. One prominent line of research addresses these limitations by converting malware binaries into 2D images by heuristically reshaping them into a 2D grid before resizing using Lanczos resampling. These images can then be classified based on their textural information using computer vision approaches. While this approach can detect obfuscated malware more effectively than static analysis, the process of converting files into 2D images results in significant information loss due to both quantisation noise, caused by rounding to integer pixel values, and the introduction of 2D dependencies which do not exist in the original data. This loss of signal limits the classification performance of the downstream model. This work addresses these weaknesses by instead resizing the files into 1D signals which avoids the need for heuristic reshaping, and additionally these signals do not suffer from quantisation noise due to being stored in a floating-point format. It is shown that existing 2D CNN architectures can be readily adapted to classify these 1D signals for improved performance. Furthermore, a bespoke 1D convolutional neural network, based on the ResNet architecture and squeeze-and-excitation layers, was developed to classify these signals and evaluated on the MalNet dataset. It was found to achieve state-of-the-art performance on binary, type, and family level classification with F1 scores of 0.874, 0.503, and 0.507, respectively, paving the way for future models to operate on the proposed signal modality.
Authors:Kunlin Cai, Jinghuai Zhang, Ying Li, Zhiyuan Wang, Xun Chen, Tianshi Li, Yuan Tian
Abstract:
The immersive nature of XR introduces a fundamentally different set of security and privacy (S&P) challenges due to the unprecedented user interactions and data collection that traditional paradigms struggle to mitigate. As the primary architects of XR applications, developers play a critical role in addressing novel threats. However, to effectively support developers, we must first understand how they perceive and respond to different threats. Despite the growing importance of this issue, there is a lack of in-depth, threat-aware studies that examine XR S&P from the developers' perspective. To fill this gap, we interviewed 23 professional XR developers with a focus on emerging threats in XR. Our study addresses two research questions aiming to uncover existing problems in XR development and identify actionable paths forward. By examining developers' perceptions of S&P threats, we found that: (1) XR development decisions (e.g., rich sensor data collection, user-generated content interfaces) are closely tied to and can amplify S&P threats, yet developers are often unaware of these risks, resulting in cognitive biases in threat perception; and (2) limitations in existing mitigation methods, combined with insufficient strategic, technical, and communication support, undermine developers' motivation, awareness, and ability to effectively address these threats. Based on these findings, we propose actionable and stakeholder-aware recommendations to improve XR S&P throughout the XR development process. This work represents the first effort to undertake a threat-aware, developer-centered study in the XR domain -- an area where the immersive, data-rich nature of the XR technology introduces distinctive challenges.
Authors:Junjie Mu, Zonghao Ying, Zhekui Fan, Zonglei Jing, Yaoyuan Zhang, Zhengmin Yu, Wenxin Zhang, Quanchen Zou, Xiangzheng Zhang
Abstract:
Jailbreak attacks on Large Language Models (LLMs) have demonstrated various successful methods whereby attackers manipulate models into generating harmful responses that they are designed to avoid. Among these, Greedy Coordinate Gradient (GCG) has emerged as a general and effective approach that optimizes the tokens in a suffix to generate jailbreakable prompts. While several improved variants of GCG have been proposed, they all rely on fixed-length suffixes. However, the potential redundancy within these suffixes remains unexplored. In this work, we propose Mask-GCG, a plug-and-play method that employs learnable token masking to identify impactful tokens within the suffix. Our approach increases the update probability for tokens at high-impact positions while pruning those at low-impact positions. This pruning not only reduces redundancy but also decreases the size of the gradient space, thereby lowering computational overhead and shortening the time required to achieve successful attacks compared to GCG. We evaluate Mask-GCG by applying it to the original GCG and several improved variants. Experimental results show that most tokens in the suffix contribute significantly to attack success, and pruning a minority of low-impact tokens does not affect the loss values or compromise the attack success rate (ASR), thereby revealing token redundancy in LLM prompts. Our findings provide insights for developing efficient and interpretable LLMs from the perspective of jailbreak attacks.
Authors:Yang Lou, Haibo Hu, Qun Song, Qian Xu, Yi Zhu, Rui Tan, Wei-Bin Lee, Jianping Wang
Abstract:
High-definition maps provide precise environmental information essential for prediction and planning in autonomous driving systems. Due to the high cost of labeling and maintenance, recent research has turned to online HD map construction using onboard sensor data, offering wider coverage and more timely updates for autonomous vehicles. However, the robustness of online map construction under adversarial conditions remains underexplored. In this paper, we present a systematic vulnerability analysis of online map construction models, which reveals that these models exhibit an inherent bias toward predicting symmetric road structures. In asymmetric scenes like forks or merges, this bias often causes the model to mistakenly predict a straight boundary that mirrors the opposite side. We demonstrate that this vulnerability persists in the real-world and can be reliably triggered by obstruction or targeted interference. Leveraging this vulnerability, we propose a novel two-stage attack framework capable of manipulating online constructed maps. First, our method identifies vulnerable asymmetric scenes along the victim AV's potential route. Then, we optimize the location and pattern of camera-blinding attacks and adversarial patch attacks. Evaluations on a public AD dataset demonstrate that our attacks can degrade mapping accuracy by up to 9.9%, render up to 44% of targeted routes unreachable, and increase unsafe planned trajectory rates, colliding with real-world road boundaries, by up to 27%. These attacks are also validated on a real-world testbed vehicle. We further analyze root causes of the symmetry bias, attributing them to training data imbalance, model architecture, and map element representation. To the best of our knowledge, this study presents the first vulnerability assessment of online map construction models and introduces the first digital and physical attack against them.
Authors:Lingfeng Yao, Chenpei Huang, Shengyao Wang, Junpei Xue, Hanqing Guo, Jiang Liu, Phone Lin, Tomoaki Ohtsuki, Miao Pan
Abstract:
As generative audio models are rapidly evolving, AI-generated audios increasingly raise concerns about copyright infringement and misinformation spread. Audio watermarking, as a proactive defense, can embed secret messages into audio for copyright protection and source verification. However, current neural audio watermarking methods focus primarily on the imperceptibility and robustness of watermarking, while ignoring its vulnerability to security attacks. In this paper, we develop a simple yet powerful attack: the overwriting attack that overwrites the legitimate audio watermark with a forged one and makes the original legitimate watermark undetectable. Based on the audio watermarking information that the adversary has, we propose three categories of overwriting attacks, i.e., white-box, gray-box, and black-box attacks. We also thoroughly evaluate the proposed attacks on state-of-the-art neural audio watermarking methods. Experimental results demonstrate that the proposed overwriting attacks can effectively compromise existing watermarking schemes across various settings and achieve a nearly 100% attack success rate. The practicality and effectiveness of the proposed overwriting attacks expose security flaws in existing neural audio watermarking systems, underscoring the need to enhance security in future audio watermarking designs.
Authors:Xiaogang Cheng, Ren Guo, Zuxi Chen
Abstract:
Elliptic curve cryptography is better than traditional cryptography based on RSA and discrete logarithm of finite field in terms of efficiency and security. In this paper, we show how to exploit elliptic curve with high rank, which has not been used in cryptography before, to construct cryptographic schemes. Concretely we demonstrate how to construct public key signature scheme with hierarchy revocation based on elliptic curve with high rank, where the rank determines the height of the revocation tree. Although our construction is not very efficient in some sense, our construction shows elliptic curve with high rank is valuable and important for cryptographic usage. The technique and assumption presented can surely be used for other cryptographic constructions.
Authors:Paresh Baidya, Rourab Paul, Vikas Srivastava, Sumit Kumar Debnath
Abstract:
A fault can occur naturally or intentionally. However, intentionally injecting faults into hardware accelerators of Post-Quantum Cryptographic (PQC) algorithms may leak sensitive information. This intentional fault injection in side-channel attacks compromises the reliability of PQC implementations. The recently NIST-standardized key encapsulation mechanism (KEM), Kyber may also leak information at the hardware implementation level. This work proposes three efficient and lightweight recomputation-based fault detection methods for Barrett Reduction in the Cooley-Tukey Butterfly Unit (CT-BU) of Kyber on a Field Programmable Gate Array (FPGA). The CT-BU and Barrett Reduction are fundamental components in structured lattice-based PQC algorithms, including Kyber, NTRU, Falcon, CRYSTALS-Dilithium, etc. This paper introduces a new algorithm, Recomputation with Swapped Operand (RESWO), for fault detection. While Recomputation with Negated Operand (RENO) and Recomputation with Shifted Operand (RESO) are existing methods used in other PQC hardware algorithms. To the best of our knowledge, RENO and RESO have never been used in Barrett Reduction before. The proposed RESWO method consumes a similar number of slices compared to RENO and RESO. However, RESWO shows lesser delay compared to both RENO and RESO. The fault detection efficiency of RESWO, RENO, and RESO is nearly 100%.
Authors:Hao Nie, Wei Wang, Peng Xu, Wei Chen, Laurence T. Yang, Mauro Conti, Kaitai Liang
Abstract:
Dynamic Searchable Symmetric Encryption (DSSE) allows secure searches over a dynamic encrypted database but suffers from inherent information leakage. Existing passive attacks against DSSE rely on persistent leakage monitoring to infer leakage patterns, whereas this work targets intermittent observation - a more practical threat model. We propose Peekaboo - a new universal attack framework - and the core design relies on inferring the search pattern and further combining it with auxiliary knowledge and other leakage. We instantiate Peekaboo over the SOTA attacks, Sap (USENIX' 21) and Jigsaw (USENIX' 24), to derive their "+" variants (Sap+ and Jigsaw+). Extensive experiments demonstrate that our design achieves >0.9 adjusted rand index for search pattern recovery and 90% query accuracy vs. FMA's 30% (CCS' 23). Peekaboo's accuracy scales with observation rounds and the number of observed queries but also it resists SOTA countermeasures, with >40% accuracy against file size padding and >80% against obfuscation.
Authors:Zhuoyun Qian, Hongyi Miao, Cheng Zhang, Qin Hu, Yili Jiang, Jiaqi Huang, Fangtian Zhong
Abstract:
As IoT devices continue to proliferate, their reliability is increasingly constrained by security concerns. In response, researchers have developed diverse malware analysis techniques to detect and classify IoT malware. These techniques typically rely on extracting features at different levels from IoT applications, giving rise to a wide range of feature extraction methods. However, current approaches still face significant challenges when applied in practice. This survey provides a comprehensive review of feature extraction techniques for IoT malware analysis from multiple perspectives. We first examine static and dynamic feature extraction methods, followed by hybrid approaches. We then explore feature representation strategies based on graph learning. Finally, we compare the strengths and limitations of existing techniques, highlight open challenges, and outline promising directions for future research.
Authors:Anuj Gautam, Tarun Yadav, Garrett Smith, Kent Seamons, Scott Ruoti
Abstract:
Password managers provide significant security benefits to users. However, malicious client-side scripts and browser extensions can steal passwords after the manager has autofilled them into the web page. In this paper, we extend prior work by Stock and Johns, showing how password autofill can be hardened to prevent these local attacks. We implement our design in the Firefox browser and conduct experiments demonstrating that our defense successfully protects passwords from XSS attacks and malicious extensions. We also show that our implementation is compatible with 97% of the Alexa top 1000 websites. Next, we generalize our design, creating a second defense that prevents recently discovered local attacks against the FIDO2 protocols. We implement this second defense into Firefox, demonstrating that it protects the FIDO2 protocol against XSS attacks and malicious extensions. This defense is compatible with all websites, though it does require a small change (2-3 lines) to web servers implementing FIDO2.
Authors:Sayan Biswas, Philippe Chartier, Akash Dhasade, Tom Jurien, David Kerriou, Anne-Marie Kerrmarec, Mohammed Lemou, Franklin Tranie, Martijn de Vos, Milos Vujasinovic
Abstract:
In contemporary cloud-based services, protecting users' sensitive data and ensuring the confidentiality of the server's model are critical. Fully homomorphic encryption (FHE) enables inference directly on encrypted inputs, but its practicality is hindered by expensive bootstrapping and inefficient approximations of non-linear activations. We introduce Safhire, a hybrid inference framework that executes linear layers under encryption on the server while offloading non-linearities to the client in plaintext. This design eliminates bootstrapping, supports exact activations, and significantly reduces computation. To safeguard model confidentiality despite client access to intermediate outputs, Safhire applies randomized shuffling, which obfuscates intermediate values and makes it practically impossible to reconstruct the model. To further reduce latency, Safhire incorporates advanced optimizations such as fast ciphertext packing and partial extraction. Evaluations on multiple standard models and datasets show that Safhire achieves 1.5X - 10.5X lower inference latency than Orion, a state-of-the-art baseline, with manageable communication overhead and comparable accuracy, thereby establishing the practicality of hybrid FHE inference.
Authors:Navid Aftabi, Abhishek Hanchate, Satish Bukkapatnam, Dan Li
Abstract:
Industry 4.0's highly networked Machine Tool Controllers (MTCs) are prime targets for replay attacks that use outdated sensor data to manipulate actuators. Dynamic watermarking can reveal such tampering, but current schemes assume linear-Gaussian dynamics and use constant watermark statistics, making them vulnerable to the time-varying, partly proprietary behavior of MTCs. We close this gap with DynaMark, a reinforcement learning framework that models dynamic watermarking as a Markov decision process (MDP). It learns an adaptive policy online that dynamically adapts the covariance of a zero-mean Gaussian watermark using available measurements and detector feedback, without needing system knowledge. DynaMark maximizes a unique reward function balancing control performance, energy consumption, and detection confidence dynamically. We develop a Bayesian belief updating mechanism for real-time detection confidence in linear systems. This approach, independent of specific system assumptions, underpins the MDP for systems with linear dynamics. On a Siemens Sinumerik 828D controller digital twin, DynaMark achieves a reduction in watermark energy by 70% while preserving the nominal trajectory, compared to constant variance baselines. It also maintains an average detection delay equivalent to one sampling interval. A physical stepper-motor testbed validates these findings, rapidly triggering alarms with less control performance decline and exceeding existing benchmarks.
Authors:Zhiqiang Wang, Junyang Zhang, Guanquan Shi, HaoRan Cheng, Yunhao Yao, Kaiwen Guo, Haohua Du, Xiang-Yang Li
Abstract:
The Model Context Protocol (MCP) is increasingly adopted to standardize the interaction between LLM agents and external tools. However, this trend introduces a new threat: Tool Poisoning Attacks (TPA), where tool metadata is poisoned to induce the agent to perform unauthorized operations. Existing defenses that primarily focus on behavior-level analysis are fundamentally ineffective against TPA, as poisoned tools need not be executed, leaving no behavioral trace to monitor.
Thus, we propose MindGuard, a decision-level guardrail for LLM agents, providing provenance tracking of call decisions, policy-agnostic detection, and poisoning source attribution against TPA. While fully explaining LLM decision remains challenging, our empirical findings uncover a strong correlation between LLM attention mechanisms and tool invocation decisions. Therefore, we choose attention as an empirical signal for decision tracking and formalize this as the Decision Dependence Graph (DDG), which models the LLM's reasoning process as a weighted, directed graph where vertices represent logical concepts and edges quantify the attention-based dependencies. We further design robust DDG construction and graph-based anomaly analysis mechanisms that efficiently detect and attribute TPA attacks. Extensive experiments on real-world datasets demonstrate that MindGuard achieves 94\%-99\% average precision in detecting poisoned invocations, 95\%-100\% attribution accuracy, with processing times under one second and no additional token cost. Moreover, DDG can be viewed as an adaptation of the classical Program Dependence Graph (PDG), providing a solid foundation for applying traditional security policies at the decision level.
Authors:Alhad Daftardar, Jianqiao Mo, Joey Ah-kiow, Benedikt Bünz, Siddharth Garg, Brandon Reagen
Abstract:
Zero-Knowledge Proofs (ZKPs) have emerged as powerful tools for secure and privacy-preserving computation. ZKPs enable one party to convince another of a statement's validity without revealing anything else. This capability has profound implications in many domains, including: machine learning, blockchain, image authentication, and electronic voting. Despite their potential, ZKPs have seen limited deployment because of their exceptionally high computational overhead, which manifests primarily during proof generation. To mitigate these overheads, a (growing) body of researchers has proposed hardware accelerators and GPU implementations for kernels and complete protocols. Prior art spans a wide variety of ZKP schemes that vary significantly in computational overhead, proof size, verifier cost, protocol setup, and trust. The latest, and widely used ZKP protocols are intentionally designed to balance these trade-offs. A particular challenge in modern ZKP systems is supporting complex, high-degree gates using the SumCheck protocol. We address this challenge with a novel programmable accelerator that efficiently handles arbitrary custom gates via SumCheck. Our accelerator achieves upwards of $1000\times$ geomean speedup over CPU-based SumChecks across a range of gate types. We integrate this unit into a full-system accelerator, zkPHIRE, which achieves $1486\times$ geomean speedup over CPU and $11.87\times$ speedup over the state-of-the-art at iso-area. zkPHIRE is the first accelerator to scale to problem sizes of $2^{30}$ nominal constraints while maintaining small proof sizes and programmability.
Authors:Rahul Mishra, Sudhanshu Kumar Jha, Naresh Kshetri, Bishnu Bhusal, Mir Mehedi Rahman, Md Masud Rana, Aimina Ali Eli, Khaled Aminul Islam, Bishwo Prakash Pokharel
Abstract:
Efficient and reliable node deployment in Wireless Sensor Networks is crucial for optimizing coverage of the area, connectivity among nodes, and energy efficiency. This paper proposes a hybrid meta heuristic approach combining a Genetic Algorithm (GA) and Particle Swarm Optimization (PSO) to address the challenges of energy efficient and reliable node deployment. The GA PSO hybrid leverages GAs strong exploration capabilities and PSOs rapid convergence, achieving an optimum stability between coverage and energy consumption. The performance of the proposed approach is evaluated against GA and PSO alone and the innovatory meta heuristic based Competitive Multi Objective Marine Predators Algorithm (CMOMPA) across varying sensing ranges. Simulation results demonstrate that GA PSO requires 15% to 25% fewer sensor nodes and maintains 95% or more area coverage while maintaining the connectivity in comparison to standalone GA or PSO algorithm. The proposed algorithm also dominates CMOMPA when compared for long sensing and communication range in terms of higher coverage, improved connectivity, and reduced deployment time while requiring fewer sensor nodes. This study also explores key trade offs in WSN deployment and highlights future research directions, including heterogeneous node deployment, mobile WSNs, and enhanced multi objective optimization techniques. The findings underscore the effectiveness of hybrid meta heuristics in improving WSN performance, offering a promising approach for real world applications such as environmental monitoring, smart cities, smart agriculture, disaster response, and IIoT.
Authors:Yifan Liao, Yuxin Cao, Yedi Zhang, Wentao He, Yan Xiao, Xianglong Du, Zhiyong Huang, Jin Song Dong
Abstract:
Deep learning-based lane detection (LD) plays a critical role in autonomous driving and advanced driver assistance systems. However, its vulnerability to backdoor attacks presents a significant security concern. Existing backdoor attack methods on LD often exhibit limited practical utility due to the artificial and conspicuous nature of their triggers. To address this limitation and investigate the impact of more ecologically valid backdoor attacks on LD models, we examine the common data poisoning attack and introduce DBALD, a novel diffusion-based data poisoning framework for generating naturalistic backdoor triggers. DBALD comprises two key components: optimal trigger position finding and stealthy trigger generation. Given the insight that attack performance varies depending on the trigger position, we propose a heatmap-based method to identify the optimal trigger location, with gradient analysis to generate attack-specific heatmaps. A region-based editing diffusion process is then applied to synthesize visually plausible triggers within the most susceptible regions identified previously. Furthermore, to ensure scene integrity and stealthy attacks, we introduce two loss strategies: one for preserving lane structure and another for maintaining the consistency of the driving scene. Consequently, compared to existing attack methods, DBALD achieves both a high attack success rate and superior stealthiness. Extensive experiments on 4 mainstream LD models show that DBALD exceeds state-of-the-art methods, with an average success rate improvement of +10.87% and significantly enhanced stealthiness. The experimental results highlight significant practical challenges in ensuring model robustness against real-world backdoor threats in LD.
Authors:Saeid Ghasemshirazi, Ghazaleh Shirvani, Marziye Ranjbar Tavakoli, Bahar Ghaedi, Mohammad Amin Langarizadeh
Abstract:
The pharmaceutical supply chain faces escalating cybersecurity challenges threatening patient safety and operational continuity. This paper examines the transformative potential of zero trust architecture for enhancing security and resilience within this critical ecosystem. We explore the challenges posed by data breaches, counterfeiting, and disruptions and introduce the principles of continuous verification, least-privilege access, and data-centric security inherent in zero trust. Real-world case studies illustrate successful implementations. Benefits include heightened security, data protection, and adaptable resilience. As recognized by researchers and industrialists, a reliable drug tracing system is crucial for ensuring drug safety throughout the pharmaceutical production process. One of the most pivotal domains within the pharmaceutical industry and its associated supply chains where zero trust can be effectively implemented is in the management of narcotics, high-health-risk drugs, and abusable substances. By embracing zero trust, the pharmaceutical industry fortifies its supply chain against constantly changing cyber threats, ensuring the trustworthiness of critical medical operations.
Authors:Bingguang Lu, Hongsheng Hu, Yuantian Miao, Shaleeza Sohail, Chaoxiang He, Shuo Wang, Xiao Chen
Abstract:
Federated learning (FL) has been widely adopted as a decentralized training paradigm that enables multiple clients to collaboratively learn a shared model without exposing their local data. As concerns over data privacy and regulatory compliance grow, machine unlearning, which aims to remove the influence of specific data from trained models, has become increasingly important in the federated setting to meet legal, ethical, or user-driven demands. However, integrating unlearning into FL introduces new challenges and raises largely unexplored security risks. In particular, adversaries may exploit the unlearning process to compromise the integrity of the global model. In this paper, we present the first backdoor attack in the context of federated unlearning, demonstrating that an adversary can inject backdoors into the global model through seemingly legitimate unlearning requests. Specifically, we propose BadFU, an attack strategy where a malicious client uses both backdoor and camouflage samples to train the global model normally during the federated training process. Once the client requests unlearning of the camouflage samples, the global model transitions into a backdoored state. Extensive experiments under various FL frameworks and unlearning strategies validate the effectiveness of BadFU, revealing a critical vulnerability in current federated unlearning practices and underscoring the urgent need for more secure and robust federated unlearning mechanisms.
Authors:Jan Lum Fok, Qingwen Zeng, Shiping Chen, Oscar Fawkes, Huaming Chen
Abstract:
Credit card fraud detection (CCFD) is a critical application of Machine Learning (ML) in the financial sector, where accurately identifying fraudulent transactions is essential for mitigating financial losses. ML models have demonstrated their effectiveness in fraud detection task, in particular with the tabular dataset. While adversarial attacks have been extensively studied in computer vision and deep learning, their impacts on the ML models, particularly those trained on CCFD tabular datasets, remains largely unexplored. These latent vulnerabilities pose significant threats to the security and stability of the financial industry, especially in high-value transactions where losses could be substantial. To address this gap, in this paper, we present a holistic framework that investigate the robustness of CCFD ML model against adversarial perturbations under different circumstances. Specifically, the gradient-based attack methods are incorporated into the tabular credit card transaction data in both black- and white-box adversarial attacks settings. Our findings confirm that tabular data is also susceptible to subtle perturbations, highlighting the need for heightened awareness among financial technology practitioners regarding ML model security and trustworthiness. Furthermore, the experiments by transferring adversarial samples from gradient-based attack method to non-gradient-based models also verify our findings. Our results demonstrate that such attacks remain effective, emphasizing the necessity of developing robust defenses for CCFD algorithms.
Authors:Wei Shao, Zequan Liang, Ruoyu Zhang, Ruijie Fang, Ning Miao, Ehsan Kourkchi, Setareh Rafatirad, Houman Homayoun, Chongzhou Fang
Abstract:
Biometric authentication using physiological signals offers a promising path toward secure and user-friendly access control in wearable devices. While electrocardiogram (ECG) signals have shown high discriminability, their intrusive sensing requirements and discontinuous acquisition limit practicality. Photoplethysmography (PPG), on the other hand, enables continuous, non-intrusive authentication with seamless integration into wrist-worn wearable devices. However, most prior work relies on high-frequency PPG (e.g., 75 - 500 Hz) and complex deep models, which incur significant energy and computational overhead, impeding deployment in power-constrained real-world systems. In this paper, we present the first real-world implementation and evaluation of a continuous authentication system on a smartwatch, We-Be Band, using low-frequency (25 Hz) multi-channel PPG signals. Our method employs a Bi-LSTM with attention mechanism to extract identity-specific features from short (4 s) windows of 4-channel PPG. Through extensive evaluations on both public datasets (PTTPPG) and our We-Be Dataset (26 subjects), we demonstrate strong classification performance with an average test accuracy of 88.11%, macro F1-score of 0.88, False Acceptance Rate (FAR) of 0.48%, False Rejection Rate (FRR) of 11.77%, and Equal Error Rate (EER) of 2.76%. Our 25 Hz system reduces sensor power consumption by 53% compared to 512 Hz and 19% compared to 128 Hz setups without compromising performance. We find that sampling at 25 Hz preserves authentication accuracy, whereas performance drops sharply at 20 Hz while offering only trivial additional power savings, underscoring 25 Hz as the practical lower bound. Additionally, we find that models trained exclusively on resting data fail under motion, while activity-diverse training improves robustness across physiological states.
Authors:Afrah Gueriani, Hamza Kheddar, Ahmed Cherif Mazari, Mohamed Chahine Ghanem
Abstract:
The increased Internet of Medical Things IoMT and the Industrial Internet of Things IIoT interconnectivity has introduced complex cybersecurity challenges, exposing sensitive data, patient safety, and industrial operations to advanced cyber threats. To mitigate these risks, this paper introduces a novel transformer-based intrusion detection system IDS, termed BiGAT-ID a hybrid model that combines bidirectional gated recurrent units BiGRU, long short-term memory LSTM networks, and multi-head attention MHA. The proposed architecture is designed to effectively capture bidirectional temporal dependencies, model sequential patterns, and enhance contextual feature representation. Extensive experiments on two benchmark datasets, CICIoMT2024 medical IoT and EdgeIIoTset industrial IoT demonstrate the model's cross-domain robustness, achieving detection accuracies of 99.13 percent and 99.34 percent, respectively. Additionally, the model exhibits exceptional runtime efficiency, with inference times as low as 0.0002 seconds per instance in IoMT and 0.0001 seconds in IIoT scenarios. Coupled with a low false positive rate, BiGAT-ID proves to be a reliable and efficient IDS for deployment in real-world heterogeneous IoT environments
Authors:Satyam Kumar Navneet, Joydeep Chandra
Abstract:
The integration of Large Language Models (LLMs) into software engineering has revolutionized code generation, enabling unprecedented productivity through promptware and autonomous AI agents. However, this transformation introduces significant risks, including insecure code generation, hallucinated outputs, irreversible actions, and a lack of transparency and accountability. Incidents like the Replit database deletion underscore the urgent need for robust safety and governance mechanisms. This paper comprehensively analyzes the inherent challenges of LLM-assisted code generation, such as vulnerability inheritance, overtrust, misinterpretation, and the absence of standardized validation and rollback protocols. To address these, we propose the SAFE-AI Framework, a holistic approach emphasizing Safety, Auditability, Feedback, and Explainability. The framework integrates guardrails, sandboxing, runtime verification, risk-aware logging, human-in-the-loop systems, and explainable AI techniques to mitigate risks while fostering trust and compliance. We introduce a novel taxonomy of AI behaviors categorizing suggestive, generative, autonomous, and destructive actions to guide risk assessment and oversight. Additionally, we identify open problems, including the lack of standardized benchmarks for code specific hallucinations and autonomy levels, and propose future research directions for hybrid verification, semantic guardrails, and proactive governance tools. Through detailed comparisons of autonomy control, prompt engineering, explainability, and governance frameworks, this paper provides a roadmap for responsible AI integration in software engineering, aligning with emerging regulations like the EU AI Act and Canada's AIDA to ensure safe, transparent, and accountable AI-driven development.
Authors:Zhipeng Yuan, Kai Wang, Weize Quan, Dong-Ming Yan, Tieru Wu
Abstract:
With the rapid advancement of AI generative models, the visual quality of AI-generated images (AIIs) has become increasingly close to natural images, which inevitably raises security concerns. Most AII detectors often employ the conventional image classification pipeline with natural images and AIIs (generated by a generative model), which can result in limited detection performance for AIIs from unseen generative models. To solve this, we proposed a universal AI-generated image detector from the perspective of anomaly detection. Our discriminator does not need to access any AIIs and learn a generalizable representation with unsupervised learning. Specifically, we use the pre-trained CLIP encoder as the feature extractor and design a normalizing flow-like unsupervised model. Instead of AIIs, proxy images, e.g., obtained by applying a spectral modification operation on natural images, are used for training. Our models are trained by minimizing the likelihood of proxy images, optionally combined with maximizing the likelihood of natural images. Extensive experiments demonstrate the effectiveness of our method on AIIs produced by various image generators.
Authors:Samaneh Mohammadi, Vasileios Tsouvalas, Iraklis Symeonidis, Ali Balador, Tanir Ozcelebi, Francesco Flammini, Nirvana Meratnia
Abstract:
Federated unlearning (FU) algorithms allow clients in federated settings to exercise their ''right to be forgotten'' by removing the influence of their data from a collaboratively trained model. Existing FU methods maintain data privacy by performing unlearning locally on the client-side and sending targeted updates to the server without exposing forgotten data; yet they often rely on server-side cooperation, revealing the client's intent and identity without enforcement guarantees - compromising autonomy and unlearning privacy. In this work, we propose EFU (Enforced Federated Unlearning), a cryptographically enforced FU framework that enables clients to initiate unlearning while concealing its occurrence from the server. Specifically, EFU leverages functional encryption to bind encrypted updates to specific aggregation functions, ensuring the server can neither perform unauthorized computations nor detect or skip unlearning requests. To further mask behavioral and parameter shifts in the aggregated model, we incorporate auxiliary unlearning losses based on adversarial examples and parameter importance regularization. Extensive experiments show that EFU achieves near-random accuracy on forgotten data while maintaining performance comparable to full retraining across datasets and neural architectures - all while concealing unlearning intent from the server. Furthermore, we demonstrate that EFU is agnostic to the underlying unlearning algorithm, enabling secure, function-hiding, and verifiable unlearning for any client-side FU mechanism that issues targeted updates.
Authors:Kaichuan Kong, Dongjie Liu, Xiaobo Jin, Zhiying Li, Guanggang Geng
Abstract:
Insider threat detection presents a significant challenge due to the deceptive nature of malicious behaviors, which often resemble legitimate user operations. However, existing approaches typically model system logs as flat event sequences, thereby failing to capture the inherent frequency dynamics and multiscale disturbance patterns embedded in user behavior. To address these limitations, we propose Log2Sig, a robust anomaly detection framework that transforms user logs into multivariate behavioral frequency signals, introducing a novel representation of user behavior. Log2Sig employs Multivariate Variational Mode Decomposition (MVMD) to extract Intrinsic Mode Functions (IMFs), which reveal behavioral fluctuations across multiple temporal scales. Based on this, the model further performs joint modeling of behavioral sequences and frequency-decomposed signals: the daily behavior sequences are encoded using a Mamba-based temporal encoder to capture long-term dependencies, while the corresponding frequency components are linearly projected to match the encoder's output dimension. These dual-view representations are then fused to construct a comprehensive user behavior profile, which is fed into a multilayer perceptron for precise anomaly detection. Experimental results on the CERT r4.2 and r5.2 datasets demonstrate that Log2Sig significantly outperforms state-of-the-art baselines in both accuracy and F1 score.
Authors:Kaichuan Kong, Dongjie Liu, Xiaobo Jin, Zhiying Li, Guanggang Geng, Jian Weng
Abstract:
Enterprises are facing increasing risks of insider threats, while existing detection methods are unable to effectively address these challenges due to reasons such as insufficient temporal dynamic feature modeling, computational efficiency and real-time bottlenecks and cross-modal information island problem. This paper proposes a new insider threat detection framework MambaITD based on the Mamba state space model and cross-modal adaptive fusion. First, the multi-source log preprocessing module aligns heterogeneous data through behavioral sequence encoding, interval smoothing, and statistical feature extraction. Second, the Mamba encoder models long-range dependencies in behavioral and interval sequences, and combines the sequence and statistical information dynamically in combination with the gated feature fusion mechanism. Finally, we propose an adaptive threshold optimization method based on maximizing inter-class variance, which dynamically adjusts the decision threshold by analyzing the probability distribution, effectively identifies anomalies, and alleviates class imbalance and concept drift. Compared with traditional methods, MambaITD shows significant advantages in modeling efficiency and feature fusion capabilities, outperforming Transformer-based methods, and provides a more effective solution for insider threat detection.
Authors:Kaichuan Kong, Dongjie Liu, Xiaobo Jin, Guanggang Geng, Zhiying Li, Jian Weng
Abstract:
Insider threat detection (ITD) poses a persistent and high-impact challenge in cybersecurity due to the subtle, long-term, and context-dependent nature of malicious insider behaviors. Traditional models often struggle to capture semantic intent and complex behavior dynamics, while existing LLM-based solutions face limitations in prompt adaptability and modality coverage. To bridge this gap, we propose DMFI, a dual-modality framework that integrates semantic inference with behavior-aware fine-tuning. DMFI converts raw logs into two structured views: (1) a semantic view that processes content-rich artifacts (e.g., emails, https) using instruction-formatted prompts; and (2) a behavioral abstraction, constructed via a 4W-guided (When-Where-What-Which) transformation to encode contextual action sequences. Two LoRA-enhanced LLMs are fine-tuned independently, and their outputs are fused via a lightweight MLP-based decision module. We further introduce DMFI-B, a discriminative adaptation strategy that separates normal and abnormal behavior representations, improving robustness under severe class imbalance. Experiments on CERT r4.2 and r5.2 datasets demonstrate that DMFI outperforms state-of-the-art methods in detection accuracy. Our approach combines the semantic reasoning power of LLMs with structured behavior modeling, offering a scalable and effective solution for real-world insider threat detection. Our work demonstrates the effectiveness of combining LLM reasoning with structured behavioral modeling, offering a scalable and deployable solution for modern insider threat detection.
Authors:Weihong Sheng, Jiajun Chen, Bin Cai, Chunqiang Hu, Meng Han, Jiguo Yu
Abstract:
Differential Privacy (DP) is commonly employed to safeguard graph analysis or publishing. Distance, a critical factor in graph analysis, is typically handled using curator DP, where a trusted curator holds the complete neighbor lists of all vertices and answers queries privately. However, in many real-world scenarios, such a curator may not be present, posing a significant challenge for implementing differentially private distance queries under Local Differential Privacy (LDP). This paper proposes two approaches to address this challenge. The first approach generates a synthetic graph by randomizing responses and applies bitwise operations to reduce noise interference. However, like other synthetic graph methods, this approach suffers from low utility. To overcome this limitation, we propose a second approach, the first LDP method specifically designed for distance queries, which captures the global graph structure by continuously aggregating local distance vectors from neighboring vertices. This process enables the accurate updating of global distances. We demonstrate the effectiveness of our method through comprehensive theoretical analysis and experimental evaluations on real-world datasets.
Authors:Ahsan Farabi, Israt Khandaker, Nusrat Jahan, Ibrahim Khalil Shanto
Abstract:
Academic credential fraud threatens educational integrity, especially in developing countries like Bangladesh, where verification methods are primarily manual and inefficient. To address this challenge, we present ShikkhaChain, a blockchain-powered certificate management platform designed to securely issue, verify, and revoke academic credentials in a decentralized and tamper-proof manner. Built on Ethereum smart contracts and utilizing IPFS for off-chain storage, the platform offers a transparent, scalable solution accessible through a React-based DApp with MetaMask integration. ShikkhaChain enables role-based access for governments, regulators, institutions, and public verifiers, allowing QR-based validation and on-chain revocation tracking. Our prototype demonstrates enhanced trust, reduced verification time, and improved international credibility for Bangladeshi degrees, promoting a more reliable academic and employment ecosystem.
Authors:Mamadou Keita, Wassim Hamidouche, Hessen Bougueffa Eutamene, Abdelmalik Taleb-Ahmed, Abdenour Hadid
Abstract:
In this paper, we introduce RAVID, the first framework for AI-generated image detection that leverages visual retrieval-augmented generation (RAG). While RAG methods have shown promise in mitigating factual inaccuracies in foundation models, they have primarily focused on text, leaving visual knowledge underexplored. Meanwhile, existing detection methods, which struggle with generalization and robustness, often rely on low-level artifacts and model-specific features, limiting their adaptability. To address this, RAVID dynamically retrieves relevant images to enhance detection. Our approach utilizes a fine-tuned CLIP image encoder, RAVID CLIP, enhanced with category-related prompts to improve representation learning. We further integrate a vision-language model (VLM) to fuse retrieved images with the query, enriching the input and improving accuracy. Given a query image, RAVID generates an embedding using RAVID CLIP, retrieves the most relevant images from a database, and combines these with the query image to form an enriched input for a VLM (e.g., Qwen-VL or Openflamingo). Experiments on the UniversalFakeDetect benchmark, which covers 19 generative models, show that RAVID achieves state-of-the-art performance with an average accuracy of 93.85%. RAVID also outperforms traditional methods in terms of robustness, maintaining high accuracy even under image degradations such as Gaussian blur and JPEG compression. Specifically, RAVID achieves an average accuracy of 80.27% under degradation conditions, compared to 63.44% for the state-of-the-art model C2P-CLIP, demonstrating consistent improvements in both Gaussian blur and JPEG compression scenarios. The code will be publicly available upon acceptance.
Authors:Zhaoyi Meng, Fenglei Xu, Wenxiang Zhao, Wansen Wang, Wenchao Huang, Jie Cui, Hong Zhong, Yan Xiong
Abstract:
Static analysis, a fundamental technique in Android app examination, enables the extraction of control flows, data flows, and inter-component communications (ICCs), all of which are essential for malware detection. However, existing methods struggle to leverage the semantic complementarity across different types of flows for representing program behaviors, and their context-unaware nature further hinders the accuracy of cross-flow semantic integration. We propose and implement MalFlows, a novel technique that achieves context-aware fusion of heterogeneous flow semantics for Android malware detection. Our goal is to leverage complementary strengths of the three types of flow-related information for precise app profiling. We adopt a heterogeneous information network (HIN) to model the rich semantics across these program flows. We further propose flow2vec, a context-aware HIN embedding technique that distinguishes the semantics of HIN entities as needed based on contextual constraints across different flows and learns accurate app representations through the joint use of multiple meta-paths. The representations are finally fed into a channel-attention-based deep neural network for malware classification. To the best of our knowledge, this is the first study to comprehensively aggregate the strengths of diverse flow-related information for assessing maliciousness within apps. We evaluate MalFlows on a large-scale dataset comprising over 20 million flow instances extracted from more than 31,000 real-world apps. Experimental results demonstrate that MalFlows outperforms representative baselines in Android malware detection, and meanwhile, validate the effectiveness of flow2vec in accurately learning app representations from the HIN constructed over the heterogeneous flows.
Authors:Mingrui Liu, Sixiao Zhang, Cheng Long
Abstract:
Text-to-Image (T2I) generation is a popular AI-generated content (AIGC) technology enabling diverse and creative image synthesis. However, some outputs may contain Not Safe For Work (NSFW) content (e.g., violence), violating community guidelines. Detecting NSFW content efficiently and accurately, known as external safeguarding, is essential. Existing external safeguards fall into two types: text filters, which analyze user prompts but overlook T2I model-specific variations and are prone to adversarial attacks; and image filters, which analyze final generated images but are computationally costly and introduce latency. Diffusion models, the foundation of modern T2I systems like Stable Diffusion, generate images through iterative denoising using a U-Net architecture with ResNet and Transformer blocks. We observe that: (1) early denoising steps define the semantic layout of the image, and (2) cross-attention layers in U-Net are crucial for aligning text and image regions. Based on these insights, we propose Wukong, a transformer-based NSFW detection framework that leverages intermediate outputs from early denoising steps and reuses U-Net's pre-trained cross-attention parameters. Wukong operates within the diffusion process, enabling early detection without waiting for full image generation. We also introduce a new dataset containing prompts, seeds, and image-specific NSFW labels, and evaluate Wukong on this and two public benchmarks. Results show that Wukong significantly outperforms text-based safeguards and achieves comparable accuracy of image filters, while offering much greater efficiency.
Authors:Mustapha Hemis, Hamza Kheddar, Mohamed Chahine Ghanem, Bachir Boudraa
Abstract:
Steganalysis methods based on deep learning (DL) often struggle with computational complexity and challenges in generalizing across different datasets. Incorporating a graph neural network (GNN) into steganalysis schemes enables the leveraging of relational data for improved detection accuracy and adaptability. This paper presents the first application of a Graph Neural Network (GNN), specifically the GraphSAGE architecture, for steganalysis of compressed voice over IP (VoIP) speech streams. The method involves straightforward graph construction from VoIP streams and employs GraphSAGE to capture hierarchical steganalysis information, including both fine grained details and high level patterns, thereby achieving high detection accuracy. Experimental results demonstrate that the developed approach performs well in uncovering quantization index modulation (QIM)-based steganographic patterns in VoIP signals. It achieves detection accuracy exceeding 98 percent even for short 0.5 second samples, and 95.17 percent accuracy under challenging conditions with low embedding rates, representing an improvement of 2.8 percent over the best performing state of the art methods. Furthermore, the model exhibits superior efficiency, with an average detection time as low as 0.016 seconds for 0.5-second samples an improvement of 0.003 seconds. This makes it efficient for online steganalysis tasks, providing a superior balance between detection accuracy and efficiency under the constraint of short samples with low embedding rates.
Authors:Quanchen Zou, Zonghao Ying, Moyang Chen, Wenzhuo Xu, Yisong Xiao, Yakai Li, Deyue Zhang, Dongdong Yang, Zhao Liu, Xiangzheng Zhang
Abstract:
The increasing sophistication of large vision-language models (LVLMs) has been accompanied by advances in safety alignment mechanisms designed to prevent harmful content generation. However, these defenses remain vulnerable to sophisticated adversarial attacks. Existing jailbreak methods typically rely on direct and semantically explicit prompts, overlooking subtle vulnerabilities in how LVLMs compose information over multiple reasoning steps. In this paper, we propose a novel and effective jailbreak framework inspired by Return-Oriented Programming (ROP) techniques from software security. Our approach decomposes a harmful instruction into a sequence of individually benign visual gadgets. A carefully engineered textual prompt directs the sequence of inputs, prompting the model to integrate the benign visual gadgets through its reasoning process to produce a coherent and harmful output. This makes the malicious intent emergent and difficult to detect from any single component. We validate our method through extensive experiments on established benchmarks including SafeBench and MM-SafetyBench, targeting popular LVLMs. Results show that our approach consistently and substantially outperforms existing baselines on state-of-the-art models, achieving near-perfect attack success rates (over 0.90 on SafeBench) and improving ASR by up to 0.39. Our findings reveal a critical and underexplored vulnerability that exploits the compositional reasoning abilities of LVLMs, highlighting the urgent need for defenses that secure the entire reasoning process.
Authors:Joydeep Chandra, Satyam Kumar Navneet
Abstract:
As AI-driven dataspaces become integral to data sharing and collaborative analytics, ensuring privacy, performance, and policy compliance presents significant challenges. This paper provides a comprehensive review of privacy-preserving and policy-aware AI techniques, including Federated Learning, Differential Privacy, Trusted Execution Environments, Homomorphic Encryption, and Secure Multi-Party Computation, alongside strategies for aligning AI with regulatory frameworks such as GDPR and the EU AI Act. We propose a novel taxonomy to classify these techniques based on privacy levels, performance impacts, and compliance complexity, offering a clear framework for practitioners and researchers to navigate trade-offs. Key performance metrics -- latency, throughput, cost overhead, model utility, fairness, and explainability -- are analyzed to highlight the multi-dimensional optimization required in dataspaces. The paper identifies critical research gaps, including the lack of standardized privacy-performance KPIs, challenges in explainable AI for federated ecosystems, and semantic policy enforcement amidst regulatory fragmentation. Future directions are outlined, proposing a conceptual framework for policy-driven alignment, automated compliance validation, standardized benchmarking, and integration with European initiatives like GAIA-X, IDS, and Eclipse EDC. By synthesizing technical, ethical, and regulatory perspectives, this work lays the groundwork for developing trustworthy, efficient, and compliant AI systems in dataspaces, fostering innovation in secure and responsible data-driven ecosystems.
Authors:Rui Guo, Avinash Ayalasomayajula, Henian Li, Jingbo Zhou, Sujan Kumar Saha, Farimah Farahmandi
Abstract:
Verification using SystemVerilog assertions (SVA) is one of the most popular methods for detecting circuit design vulnerabilities. However, with the globalization of integrated circuit design and the continuous upgrading of security requirements, the SVA development model has exposed major limitations. It is not only inefficient in development, but also unable to effectively deal with the increasing number of security vulnerabilities in modern complex integrated circuits. In response to these challenges, this paper proposes an innovative SVA automatic generation framework SVAgent. SVAgent introduces a requirement decomposition mechanism to transform the original complex requirements into a structured, gradually solvable fine-grained problem-solving chain. Experiments have shown that SVAgent can effectively suppress the influence of hallucinations and random answers, and the key evaluation indicators such as the accuracy and consistency of the SVA are significantly better than existing frameworks. More importantly, we successfully integrated SVAgent into the most mainstream integrated circuit vulnerability assessment framework and verified its practicality and reliability in a real engineering design environment.
Authors:Eldor Abdukhamidov, Tamer Abuhmed, Joanna C. S. Santos, Mohammed Abuhamad
Abstract:
Studies have shown that machine learning systems are vulnerable to adversarial examples in theory and practice. Where previous attacks have focused mainly on visual models that exploit the difference between human and machine perception, text-based models have also fallen victim to these attacks. However, these attacks often fail to maintain the semantic meaning of the text and similarity. This paper introduces AdvChar, a black-box attack on Interpretable Natural Language Processing Systems, designed to mislead the classifier while keeping the interpretation similar to benign inputs, thus exploiting trust in system transparency. AdvChar achieves this by making less noticeable modifications to text input, forcing the deep learning classifier to make incorrect predictions and preserve the original interpretation. We use an interpretation-focused scoring approach to determine the most critical tokens that, when changed, can cause the classifier to misclassify the input. We apply simple character-level modifications to measure the importance of tokens, minimizing the difference between the original and new text while generating adversarial interpretations similar to benign ones. We thoroughly evaluated AdvChar by testing it against seven NLP models and three interpretation models using benchmark datasets for the classification task. Our experiments show that AdvChar can significantly reduce the prediction accuracy of current deep learning models by altering just two characters on average in input samples.
Authors:Kim Hammar, Yuchao Li, Tansu Alpcan, Emil C. Lupu, Dimitri Bertsekas
Abstract:
Evolving security vulnerabilities and shifting operational conditions require frequent updates to network security policies. These updates include adjustments to incident response procedures and modifications to access controls, among others. Reinforcement learning methods have been proposed for automating such policy adaptations, but most of the methods in the research literature lack performance guarantees and adapt slowly to changes. In this paper, we address these limitations and present a method for computing security policies that is scalable, offers theoretical guarantees, and adapts quickly to changes. It assumes a model or simulator of the system and comprises three components: belief estimation through particle filtering, offline policy computation through aggregation, and online policy adaptation through rollout. Central to our method is a new feature-based aggregation technique, which improves scalability and flexibility. We analyze the approximation error of aggregation and show that rollout efficiently adapts policies to changes under certain conditions. Simulations and testbed results demonstrate that our method outperforms state-of-the-art methods on several benchmarks, including CAGE-2.
Authors:Eldor Abdukhamidov, Mohammed Abuhamad, Simon S. Woo, Hyoungshick Kim, Tamer Abuhmed
Abstract:
Vision transformer (ViT) models, when coupled with interpretation models, are regarded as secure and challenging to deceive, making them well-suited for security-critical domains such as medical applications, autonomous vehicles, drones, and robotics. However, successful attacks on these systems can lead to severe consequences. Recent research on threats targeting ViT models primarily focuses on generating the smallest adversarial perturbations that can deceive the models with high confidence, without considering their impact on model interpretations. Nevertheless, the use of interpretation models can effectively assist in detecting adversarial examples. This study investigates the vulnerability of transformer models to adversarial attacks, even when combined with interpretation models. We propose an attack called "AdViT" that generates adversarial examples capable of misleading both a given transformer model and its coupled interpretation model. Through extensive experiments on various transformer models and two transformer-based interpreters, we demonstrate that AdViT achieves a 100% attack success rate in both white-box and black-box scenarios. In white-box scenarios, it reaches up to 98% misclassification confidence, while in black-box scenarios, it reaches up to 76% misclassification confidence. Remarkably, AdViT consistently generates accurate interpretations in both scenarios, making the adversarial examples more difficult to detect.
Authors:Nikola Pavlovic, Sudeep Salgia, Qing Zhao
Abstract:
We consider the problem of contextual kernel bandits with stochastic contexts, where the underlying reward function belongs to a known Reproducing Kernel Hilbert Space. We study this problem under an additional constraint of Differential Privacy, where the agent needs to ensure that the sequence of query points is differentially private with respect to both the sequence of contexts and rewards. We propose a novel algorithm that achieves the state-of-the-art cumulative regret of $\widetilde{\mathcal{O}}(\sqrt{γ_TT}+\frac{γ_T}{\varepsilon_{\mathrm{DP}}})$ and $\widetilde{\mathcal{O}}(\sqrt{γ_TT}+\frac{γ_T\sqrt{T}}{\varepsilon_{\mathrm{DP}}})$ over a time horizon of $T$ in the joint and local models of differential privacy, respectively, where $γ_T$ is the effective dimension of the kernel and $\varepsilon_{\mathrm{DP}} > 0$ is the privacy parameter. The key ingredient of the proposed algorithm is a novel private kernel-ridge regression estimator which is based on a combination of private covariance estimation and private random projections. It offers a significantly reduced sensitivity compared to its classical counterpart while maintaining a high prediction accuracy, allowing our algorithm to achieve the state-of-the-art performance guarantees.
Authors:Amro Abdalla, Ismail Shaheen, Dan DeGenaro, Rupayan Mallick, Bogdan Raita, Sarah Adel Bargal
Abstract:
We present GIFT: a {G}radient-aware {I}mmunization technique to defend diffusion models against malicious {F}ine-{T}uning while preserving their ability to generate safe content. Existing safety mechanisms like safety checkers are easily bypassed, and concept erasure methods fail under adversarial fine-tuning. GIFT addresses this by framing immunization as a bi-level optimization problem: the upper-level objective degrades the model's ability to represent harmful concepts using representation noising and maximization, while the lower-level objective preserves performance on safe data. GIFT achieves robust resistance to malicious fine-tuning while maintaining safe generative quality. Experimental results show that our method significantly impairs the model's ability to re-learn harmful concepts while maintaining performance on safe content, offering a promising direction for creating inherently safer generative models resistant to adversarial fine-tuning attacks.
Authors:Fei Wu, Danning Sui, Thomas Thiery, Mallesh Pai
Abstract:
This paper provides a comprehensive empirical analysis of the economics and dynamics behind arbitrages between centralized and decentralized exchanges (CEX-DEX) on Ethereum. We refine heuristics to identify arbitrage transactions from on-chain data and introduce a robust empirical framework to estimate arbitrage revenue without knowing traders' actual behaviors on CEX. Leveraging an extensive dataset spanning 19 months from August 2023 to March 2025, we estimate a total of 233.8M USD extracted by 19 major CEX-DEX searchers from 7,203,560 identified CEX-DEX arbitrages. Our analysis reveals increasing centralization trends as three searchers captured three-quarters of both volume and extracted value. We also demonstrate that searchers' profitability is tied to their integration level with block builders and uncover exclusive searcher-builder relationships and their market impact. Finally, we correct the previously underestimated profitability of block builders who vertically integrate with a searcher. These insights illuminate the darkest corner of the MEV landscape and highlight the critical implications of CEX-DEX arbitrages for Ethereum's decentralization.
Authors:Liam Tyler, Adam Caulfield, Ivan De Oliveira Nunes
Abstract:
Control Flow Attestation (CFA) allows remote verification of run-time software integrity in embedded systems. However, CFA is limited by the storage/transmission costs of generated control flow logs (CFlog). Recent work has proposed application-specific optimizations by speculating on likely sub-paths in CFlog and replacing them with reserved symbols at runtime. Albeit effective, prior approaches do not consider the representation of addresses in a control flow path for speculation. This work proposes RESPEC-CFA, an architectural extension for CFA allowing for speculation on (1) the locality of control flows and (2) their Huffman encoding. Alone, RESPEC-CFA reduces CFlog sizes by up to 90.1%. Combined with prior methods, RESPEC-CFA yields reductions of up to 99.7%, representing a significant step toward practical CFA.
Authors:Brendan Murphy, Dillon Bowen, Shahrad Mohammadzadeh, Tom Tseng, Julius Broomfield, Adam Gleave, Kellin Pelrine
Abstract:
AI systems are rapidly advancing in capability, and frontier model developers broadly acknowledge the need for safeguards against serious misuse. However, this paper demonstrates that fine-tuning, whether via open weights or closed fine-tuning APIs, can produce helpful-only models with safeguards destroyed. In contrast to prior work which is blocked by modern moderation systems or achieved only partial removal of safeguards or degraded output quality, our jailbreak-tuning method teaches models to generate detailed, high-quality responses to arbitrary harmful requests. For example, OpenAI, Google, and Anthropic models will fully comply with requests for CBRN assistance, executing cyberattacks, and other criminal activity. We further show that backdoors can increase not only the stealth but also the severity of attacks. Stronger jailbreak prompts become even more effective in fine-tuning attacks, linking attacks and potentially defenses in the input and weight spaces. Not only are current models vulnerable, more recent ones also appear to be becoming even more vulnerable to these attacks, underscoring the urgent need for tamper-resistant safeguards. Until such safeguards are discovered, companies and policymakers should view the release of any fine-tunable model as simultaneously releasing its evil twin: equally capable as the original model, and usable for any malicious purpose within its capabilities.
Authors:Adriano Castro, Simon Hanisch, Matin Fallahi, Thorsten Strufe
Abstract:
Facial motion capture in mixed reality headsets enables real-time avatar animation, allowing users to convey non-verbal cues during virtual interactions. However, as facial motion data constitutes a behavioral biometric, its use raises novel privacy concerns. With mixed reality systems becoming more immersive and widespread, understanding whether face motion data can lead to user identification or inference of sensitive attributes is increasingly important.
To address this, we conducted a study with 116 participants using three types of headsets across three sessions, collecting facial, eye, and head motion data during verbal and non-verbal tasks. The data used is not raw video, but rather, abstract representations that are used to animate digital avatars. Our analysis shows that individuals can be re-identified from this data with up to 98% balanced accuracy, are even identifiable across device types, and that emotional states can be inferred with up to 86% accuracy. These results underscore the potential privacy risks inherent in face motion tracking in mixed reality environments.
Authors:Jie Zhang, Xiaohong Li, Man Zheng, Zhe Hou, Guangdong Bai, Ruitao Feng
Abstract:
Cloud storage introduces critical privacy challenges for encrypted data retrieval, where fuzzy multi-keyword search enables approximate matching while preserving data confidentiality. Existing solutions face fundamental trade-offs between security and efficiency: linear-search mechanisms provide adaptive security but incur prohibitive overhead for large-scale data, while tree-based indexes improve performance at the cost of branch leakage vulnerabilities.
To address these limitations, we propose DVFS - a dynamic verifiable fuzzy search service with three core innovations: (1) An \textit{adaptive-secure fuzzy search} method integrating locality-sensitive hashing with virtual binary trees, eliminating branch leakage while reducing search complexity from linear to sublinear ($O(\log n)$ time); (2) A \textit{dual-repository version control} mechanism supporting dynamic updates with forward privacy, preventing information leakage during operations; (3) A \textit{blockchain-based verification system} that ensures correctness and completeness via smart contracts, achieving $O(\log n)$ verification complexity.
Our solution advances secure encrypted retrieval by simultaneously resolving the security-performance paradox and enabling trustworthy dynamic operations.
Authors:Danyu Sun, Jinghuai Zhang, Jiacen Xu, Yu Zheng, Yuan Tian, Zhou Li
Abstract:
Host-based intrusion detection system (HIDS) is a key defense component to protect the organizations from advanced threats like Advanced Persistent Threats (APT). By analyzing the fine-grained logs with approaches like data provenance, HIDS has shown successes in capturing sophisticated attack traces. Despite the progresses embarked by the research community and industry, HIDS still frequently encounters backlash from their operators in the deployed environments, due to issues like high false-positive rate, inconsistent outcomes across environments and human-unfriendly detection results. Large Language Models (LLMs) have great potentials to advance the state of HIDS, given their extensive knowledge of attack techniques and their ability to detect anomalies through semantic analysis, anchored by recent studies. Yet, our preliminary analysis indicates that building an HIDS by naively prompting an LLM is unlikely to succeed. In this work, we explore the direction of building a customized LLM pipeline for HIDS and develop a system named SHIELD. SHIELD addresses challenges related to LLM's token limits, confusion of background noises, etc., by integrating a variety of techniques like event-level Masked Autoencoder (MAE) for attack window detection, attack evidence identification and expansion, Deterministic Data Augmentation (DDA) for profiling normal activities, and multi-purpose prompting that guides the LLM to conduct precise and interpretable attack investigations. Extensive experiments on three log datasets (DARPA-E3, NodLink-simulated-data and ATLASv2) show that SHIELD consistently achieves outstanding performance in comparison with 5 representative HIDS. These findings highlight the potential of LLMs as powerful tools for intrusion detection and pave the way for future research in this domain.
Authors:Huanming Shen, Baizhou Huang, Xiaojun Wan
Abstract:
Watermarking is a promising defense against the misuse of large language models (LLMs), yet it remains vulnerable to scrubbing and spoofing attacks. This vulnerability stems from an inherent trade-off governed by watermark window size: smaller windows resist scrubbing better but are easier to reverse-engineer, enabling low-cost statistics-based spoofing attacks. This work breaks this trade-off by introducing a novel mechanism, equivalent texture keys, where multiple tokens within a watermark window can independently support the detection. Based on the redundancy, we propose a novel watermark scheme with Sub-vocabulary decomposed Equivalent tExture Key (SEEK). It achieves a Pareto improvement, increasing the resilience against scrubbing attacks without compromising robustness to spoofing. Experiments demonstrate SEEK's superiority over prior method, yielding spoofing robustness gains of +88.2%/+92.3%/+82.0% and scrubbing robustness gains of +10.2%/+6.4%/+24.6% across diverse dataset settings.
Authors:Bo Yan, Yurong Hao, Dingqi Liu, Huabin Sun, Pengpeng Qiao, Wei Yang Bryan Lim, Yang Cao, Chuan Shi
Abstract:
Federated recommender systems (FedRec) have emerged as a promising solution for delivering personalized recommendations while safeguarding user privacy. However, recent studies have demonstrated their vulnerability to poisoning attacks. Existing attacks typically target the entire user group, which compromises stealth and increases the risk of detection. In contrast, real-world adversaries may prefer to prompt target items to specific user subgroups, such as recommending health supplements to elderly users. Motivated by this gap, we introduce Spattack, the first targeted poisoning attack designed to manipulate recommendations for specific user subgroups in the federated setting. Specifically, Spattack adopts a two-stage approximation-and-promotion strategy, which first simulates user embeddings of target/non-target subgroups and then prompts target items to the target subgroups. To enhance the approximation stage, we push the inter-group embeddings away based on contrastive learning and augment the target group's relevant item set based on clustering. To enhance the promotion stage, we further propose to adaptively tune the optimization weights between target and non-target subgroups. Besides, an embedding alignment strategy is proposed to align the embeddings between the target items and the relevant items. We conduct comprehensive experiments on three real-world datasets, comparing Spattack against seven state-of-the-art poisoning attacks and seven representative defense mechanisms. Experimental results demonstrate that Spattack consistently achieves strong manipulation performance on the specific user subgroup, while incurring minimal impact on non-target users, even when only 0.1\% of users are malicious. Moreover, Spattack maintains competitive overall recommendation performance and exhibits strong resilience against existing mainstream defenses.
Authors:Mehdi Elahi, Mohamed R. Elshamy, Abdel-Hameed Badawy, Ahmad Patooghy
Abstract:
Thermal Trojan attacks present a pressing concern for the security and reliability of System-on-Chips (SoCs), especially in mobile applications. The situation becomes more complicated when such attacks are more evasive and operate sporadically to stay hidden from detection mechanisms. In this paper, we introduce Intermittent Thermal Trojans (iThermTroj) that exploit the chips' thermal information in a random time-triggered manner. According to our experiments, iThermTroj attack can easily bypass available threshold-based thermal Trojan detection solutions. We investigate SoC vulnerabilities to variations of iThermTroj through an in-depth analysis of Trojan activation and duration scenarios. We also propose a set of tiny Machine Learning classifiers for run-time anomaly detection to protect SoCs against such intermittent thermal Trojan attacks. Compared to existing methods, our approach improves the attack detection rate by 29.4\%, 17.2\%, and 14.3\% in scenarios where iThermTroj manipulates up to 80\%, 60\%, and 40\% of SoC's thermal data, respectively. Additionally, our method increases the full protection resolution to 0.8 degrees Celsius, meaning that any temperature manipulations exceeding $\pm 0.8$ degrees will be detected with 100\% accuracy.
Authors:Ruixuan Liu, Li Xiong
Abstract:
Differentially private (DP) optimization has been widely adopted as a standard approach to provide rigorous privacy guarantees for training datasets. DP auditing verifies whether a model trained with DP optimization satisfies its claimed privacy level by estimating empirical privacy lower bounds through hypothesis testing. Recent O(1) frameworks improve auditing efficiency by checking the membership status of multiple audit samples in a single run, rather than checking individual samples across multiple runs. However, we reveal that there is no free lunch for this improved efficiency: data dependency and an implicit conflict between auditing and utility impair the tightness of the auditing results. Addressing these challenges, our key insights include reducing data dependency through uncorrelated data and resolving the auditing-utility conflict by decoupling the criteria for effective auditing and separating objectives for utility and auditing. We first propose a unified framework, UniAud, for data-independent auditing that maximizes auditing power through a novel uncorrelated canary construction and a self-comparison framework. We then extend this framework as UniAud++ for data-dependent auditing, optimizing the auditing and utility trade-off through multi-task learning with separate objectives for auditing and training. Experimental results validate that our black-box O(1) framework matches the state-of-the-art auditing results of O(T) auditing with thousands of runs, demonstrating the best efficiency-auditing trade-off across vision and language tasks. Additionally, our framework provides meaningful auditing with only slight utility degradation compared to standard DP training, showing the optimal utility-auditing trade-off and the benefit of requiring no extra training for auditing.
Authors:Yibo He, Cunjian Huang, Xianmiao Qu, Hongdeng Chen, Wei Yang, Tao Xie
Abstract:
Modern processors are equipped with single instruction multiple data (SIMD) instructions for fine-grained data parallelism. Compiler auto-vectorization techniques that target SIMD instructions face performance limitations due to insufficient information available at compile time, requiring programmers to manually manipulate SIMD instructions. SIMD intrinsics, a type of built-in function provided by modern compilers, enable programmers to manipulate SIMD instructions within high-level programming languages. Bugs in compilers for SIMD intrinsics can introduce potential threats to software security, producing unintended calculation results, data loss, program crashes, etc.
To detect bugs in compilers for SIMD intrinsics, we propose RVISmith, a randomized fuzzer that generates well-defined C programs that include various invocation sequences of RVV (RISC-V Vector Extension) intrinsics. We design RVISmith to achieve the following objectives: (i) achieving high intrinsic coverage, (ii) improving sequence variety, and (iii) without known undefined behaviors. We implement RVISmith based on the ratified RVV intrinsic specification and evaluate our approach with three modern compilers: GCC, LLVM, and XuanTie. Experimental results show that RVISmith achieves 11.5 times higher intrinsic coverage than the state-of-the-art fuzzer for RVV intrinsics. By differential testing that compares results across different compilers, optimizations, and equivalent programs, we detect and report 13 previously unknown bugs of the three compilers under test to date. Of these bugs, 10 are confirmed and another 3 are fixed by the compiler developers.
Authors:Mao Luo, Zhi Wang, Yiwen Huang, Qingyun Zhang, Zhouxing Su, Zhipeng Lv, Wen Hu, Jianguo Li
Abstract:
Electronic payment platforms are estimated to process billions oftransactions daily, with the cumulative value of these transactionspotentially reaching into the trillions. Even a minor error within thishigh-volume environment could precipitate substantial financiallosses. To mitigate this risk, manually constructed verification rules,developed by domain experts, are typically employed to identifyand scrutinize transactions in production environments. However,due to the absence of a systematic approach to ensure the robust-ness of these verification rules against vulnerabilities, they remainsusceptible to exploitation.To mitigate this risk, manually constructed verification rules, de-veloped by domain experts, are typically employed to identify andscrutinize transactions in production environments. However, dueto the absence of a systematic approach to ensure the robustness ofthese verification rules against vulnerabilities, they remain suscep-tible to exploitation. To ensure data security, database maintainersusually compose complex verification rules to check whether aquery/update request is valid. However, the rules written by ex-perts are usually imperfect, and malicious requests may bypassthese rules. As a result, the demand for identifying the defects ofthe rules systematically emerges.
Authors:Xingke Yang, Liang Li, Zhiyi Wan, Sicong Li, Xiaoqi Qi, Jiang Liu, Tomoaki Ohtsuki, Xin Fu, Miao Pan
Abstract:
There is a huge gap between numerous intriguing applications fostered by on-device large language model (LLM) fine-tuning (FT) from fresh mobile data and the limited resources of a mobile device. While existing server-assisted methods (e.g., split learning or side-tuning) may enable LLM FT on the local mobile device, they suffer from heavy communication burdens of activation transmissions, and may disclose data and labels to the server. To address those issues, we develop PAE MobiLLM, a a privacy-aware and efficient LLM FT method which can be deployed on the mobile device via server-assisted additive side-tuning. To further accelerate FT convergence and improve computing efficiency, PAE MobiLLM integrates activation caching on the server side, which allows the server to reuse historical activations and saves the mobile device from repeatedly computing forward passes for the recurring data samples. Besides, to reduce communication cost, PAE MobiLLM develops an activation shortcut that transmits only the token involved in the loss calculation instead of full activation matrices to guide the side network tuning. Last but not least, PAE MobiLLM introduces the additive adapter side-network design which makes the server train the adapter modules based on device-defined prediction differences rather than raw ground-truth labels. In this way, the server can only assist device-defined side-network computing, and learn nothing about data and labels. Extensive experimental results demonstrate PAE MobiLLM's superiority.
Authors:Linard Arquint, Samarth Kishor, Jason R. Koenig, Joey Dodds, Daniel Kroening, Peter Müller
Abstract:
Existing program verifiers can prove advanced properties about security protocol implementations, but are difficult to scale to large codebases because of the manual effort required. We develop a novel methodology called *Diodon* that addresses this challenge by splitting the codebase into the protocol implementation (the *Core*) and the remainder (the *Application*). This split allows us to apply powerful semi-automated verification techniques to the security-critical Core, while fully-automatic static analyses scale the verification to the entire codebase by ensuring that the Application cannot invalidate the security properties proved for the Core. The static analyses achieve that by proving *I/O independence*, i.e., that the I/O operations within the Application are independent of the Core's security-relevant data (such as keys), and that the Application meets the Core's requirements. We have proved Diodon sound by first showing that we can safely allow the Application to perform I/O independent of the security protocol, and second that manual verification and static analyses soundly compose. We evaluate Diodon on two case studies: an implementation of the signed Diffie-Hellman key exchange and a large (100k+ LoC) production Go codebase implementing a key exchange protocol for which we obtained secrecy and injective agreement guarantees by verifying a Core of about 1% of the code with the auto-active program verifier Gobra in less than three person months.
Authors:Mikhail Terekhov, Alexander Panfilov, Daniil Dzenhaliou, Caglar Gulcehre, Maksym Andriushchenko, Ameya Prabhu, Jonas Geiping
Abstract:
AI control protocols serve as a defense mechanism to stop untrusted LLM agents from causing harm in autonomous settings. Prior work treats this as a security problem, stress testing with exploits that use the deployment context to subtly complete harmful side tasks, such as backdoor insertion. In practice, most AI control protocols are fundamentally based on LLM monitors, which can become a central point of failure. We study adaptive attacks by an untrusted model that knows the protocol and the monitor model, which is plausible if the untrusted model was trained with a later knowledge cutoff or can search for this information autonomously. We instantiate a simple adaptive attack vector by which the attacker embeds publicly known or zero-shot prompt injections in the model outputs. Using this tactic, frontier models consistently evade diverse monitors and complete malicious tasks on two main AI control benchmarks. The attack works universally against current protocols that rely on a monitor. Furthermore, the recent Defer-to-Resample protocol even backfires, as its resampling amplifies the prompt injection and effectively reframes it as a best-of-$n$ attack. In general, adaptive attacks on monitor models represent a major blind spot in current control protocols and should become a standard component of evaluations for future AI control mechanisms.
Authors:Nouar Aldahoul, Yasir Zaki
Abstract:
The rapid spread of misinformation on digital platforms threatens public discourse, emotional stability, and decision-making. While prior work has explored various adversarial attacks in misinformation detection, the specific transformations examined in this paper have not been systematically studied. In particular, we investigate language-switching across English, French, Spanish, Arabic, Hindi, and Chinese, followed by translation. We also study query length inflation preceding summarization and structural reformatting into multiple-choice questions. In this paper, we present a multilingual, multi-agent large language model framework with retrieval-augmented generation that can be deployed as a web plugin into online platforms. Our work underscores the importance of AI-driven misinformation detection in safeguarding online factual integrity against diverse attacks, while showcasing the feasibility of plugin-based deployment for real-world web applications.
Authors:Simone Bozzolan, Stefano Calzavara, Lorenzo Cazzaro
Abstract:
Web measurements are a well-established methodology for assessing the security and privacy landscape of the Internet. However, existing top lists of popular websites commonly used as measurement targets are unlabeled and lack semantic information about the nature of the sites they include. This limitation makes targeted measurements challenging, as researchers often need to rely on ad-hoc techniques to bias their datasets toward specific categories of interest. In this paper, we investigate the use of Large Language Models (LLMs) as a means to enable targeted web measurement studies through their semantic understanding capabilities. Building on prior literature, we identify key website classification tasks relevant to web measurements and construct datasets to systematically evaluate the performance of different LLMs on these tasks. Our results demonstrate that LLMs may achieve strong performance across multiple classification scenarios. We then conduct LLM-assisted web measurement studies inspired by prior work and rigorously assess the validity of the resulting research inferences. Our results demonstrate that LLMs can serve as a practical tool for analyzing security and privacy trends on the Web.
Authors:Jiuan Zhou, Yu Cheng, Yuan Xie, Zhaoxia Yin
Abstract:
With the rapid progress of LLMs, high quality generative text has become widely available as a cover for text steganography. However, prevailing methods rely on hand-crafted or pre-specified strategies and struggle to balance efficiency, imperceptibility, and security, particularly at high embedding rates. Accordingly, we propose Auto-Stega, an agent-driven self-evolving framework that is the first to realize self-evolving steganographic strategies by automatically discovering, composing, and adapting strategies at inference time; the framework operates as a closed loop of generating, evaluating, summarizing, and updating that continually curates a structured strategy library and adapts across corpora, styles, and task constraints. A decoding LLM recovers the information under the shared strategy. To handle high embedding rates, we introduce PC-DNTE, a plug-and-play algorithm that maintains alignment with the base model's conditional distribution at high embedding rates, preserving imperceptibility while enhancing security. Experimental results demonstrate that at higher embedding rates Auto-Stega achieves superior performance with gains of 42.2\% in perplexity and 1.6\% in anti-steganalysis performance over SOTA methods.
Authors:Minh K. Quan, Pubudu N. Pathirana
Abstract:
Cross-slice attack attribution in 6G networks faces the fundamental challenge of distinguishing genuine causal relationships from spurious correlations in shared infrastructure environments. We propose a theoretically-grounded domain-adapted Granger causality framework that integrates statistical causal inference with network-specific resource modeling for real-time attack attribution. Our approach addresses key limitations of existing methods by incorporating resource contention dynamics and providing formal statistical guarantees. Comprehensive evaluation on a production-grade 6G testbed with 1,100 empirically-validated attack scenarios demonstrates 89.2% attribution accuracy with sub-100ms response time, representing a statistically significant 10.1 percentage point improvement over state-of-the-art baselines. The framework provides interpretable causal explanations suitable for autonomous 6G security orchestration.
Authors:Weiliang Zhao, Jinjun Peng, Daniel Ben-Levi, Zhou Yu, Junfeng Yang
Abstract:
The proliferation of powerful large language models (LLMs) has necessitated robust safety alignment, yet these models remain vulnerable to evolving adversarial attacks, including multi-turn jailbreaks that iteratively search for successful queries. Current defenses, primarily reactive and static, often fail to counter these search-based attacks. In this paper, we introduce ProAct, a novel proactive defense framework designed to disrupt and mislead autonomous jailbreaking processes. Our core idea is to intentionally provide adversaries with "spurious responses" that appear to be results of successful jailbreak attacks but contain no actual harmful content. These misleading responses provide false signals to the attacker's internal optimization loop, causing the adversarial search to terminate prematurely and effectively jailbreaking the jailbreak. By conducting extensive experiments across state-of-the-art LLMs, jailbreaking frameworks, and safety benchmarks, our method consistently and significantly reduces attack success rates by up to 92\%. When combined with other defense frameworks, it further reduces the success rate of the latest attack strategies to 0\%. ProAct represents an orthogonal defense strategy that can serve as an additional guardrail to enhance LLM safety against the most effective jailbreaking attacks.
Authors:Avilash Rath, Weiliang Qi, Youpeng Li, Xinda Wang
Abstract:
Graph-based models learn rich code graph structural information and present superior performance on various code analysis tasks. However, the robustness of these models against adversarial example attacks in the context of vulnerability detection remains an open question. This paper proposes NatGVD, a novel attack methodology that generates natural adversarial vulnerable code to circumvent GNN-based and graph-aware transformer-based vulnerability detectors. NatGVD employs a set of code transformations that modify graph structure while preserving code semantics. Instead of injecting dead or unrelated code like previous works, NatGVD considers naturalness requirements: generated examples should not be easily recognized by humans or program analysis tools. With extensive evaluation of NatGVD on state-of-the-art vulnerability detection systems, the results reveal up to 53.04% evasion rate across GNN-based detectors and graph-aware transformer-based detectors. We also explore potential defense strategies to enhance the robustness of these systems against NatGVD.
Authors:Yu Cui, Sicheng Pan, Yifei Liu, Haibin Zhang, Cong Zuo
Abstract:
Large language models (LLMs) have been widely deployed in Conversational AIs (CAIs), while exposing privacy and security threats. Recent research shows that LLM-based CAIs can be manipulated to extract private information from human users, posing serious security threats. However, the methods proposed in that study rely on a white-box setting that adversaries can directly modify the system prompt. This condition is unlikely to hold in real-world deployments. The limitation raises a critical question: can unprivileged attackers still induce such privacy risks in practical LLM-integrated applications? To address this question, we propose \textsc{VortexPIA}, a novel indirect prompt injection attack that induces privacy extraction in LLM-integrated applications under black-box settings. By injecting token-efficient data containing false memories, \textsc{VortexPIA} misleads LLMs to actively request private information in batches. Unlike prior methods, \textsc{VortexPIA} allows attackers to flexibly define multiple categories of sensitive data. We evaluate \textsc{VortexPIA} on six LLMs, covering both traditional and reasoning LLMs, across four benchmark datasets. The results show that \textsc{VortexPIA} significantly outperforms baselines and achieves state-of-the-art (SOTA) performance. It also demonstrates efficient privacy requests, reduced token consumption, and enhanced robustness against defense mechanisms. We further validate \textsc{VortexPIA} on multiple realistic open-source LLM-integrated applications, demonstrating its practical effectiveness.
Authors:Waqas Ishtiaq, Ashrafun Zannat, A. H. M. Shahariar Parvez, Md. Alamgir Hossain, Muntasir Hasan Kanchan, Muhammad Masud Tarek
Abstract:
The rapid expansion of the Internet of Things (IoT) has revolutionized modern industries by enabling smart automation and real time connectivity. However, this evolution has also introduced complex cybersecurity challenges due to the heterogeneous, resource constrained, and distributed nature of these environments. To address these challenges, this research presents CST AFNet, a novel dual attention based deep learning framework specifically designed for robust intrusion detection in IoT networks. The model integrates multi scale Convolutional Neural Networks (CNNs) for spatial feature extraction, Bidirectional Gated Recurrent Units (BiGRUs) for capturing temporal dependencies, and a dual attention mechanism, channel and temporal attention, to enhance focus on critical patterns in the data. The proposed method was trained and evaluated on the Edge IIoTset dataset, a comprehensive and realistic benchmark containing more than 2.2 million labeled instances spanning 15 attack types and benign traffic, collected from a seven layer industrial testbed. Our proposed model achieves outstanding accuracy for both 15 attack types and benign traffic. CST AFNet achieves 99.97 percent accuracy. Moreover, this model demonstrates exceptional performance with macro averaged precision, recall, and F1 score all above 99.3 percent. Experimental results show that CST AFNet achieves superior detection accuracy, significantly outperforming traditional deep learning models. The findings confirm that CST AFNet is a powerful and scalable solution for real time cyber threat detection in complex IoT and IIoT environments, paving the way for more secure, intelligent, and adaptive cyber physical systems.
Authors:Tarun Kumar Biswas, Ashrafun Zannat, Waqas Ishtiaq, Md. Alamgir Hossain
Abstract:
The growing integration of drones across commercial, industrial, and civilian domains has introduced significant cybersecurity challenges, particularly due to the susceptibility of drone networks to a wide range of cyberattacks. Existing intrusion detection mechanisms often lack the adaptability, efficiency, and generalizability required for the dynamic and resource constrained environments in which drones operate. This paper proposes TSLT-Net, a novel lightweight and unified Temporal Spatial Transformer based intrusion detection system tailored specifically for drone networks. By leveraging self attention mechanisms, TSLT-Net effectively models both temporal patterns and spatial dependencies in network traffic, enabling accurate detection of diverse intrusion types. The framework includes a streamlined preprocessing pipeline and supports both multiclass attack classification and binary anomaly detection within a single architecture. Extensive experiments conducted on the ISOT Drone Anomaly Detection Dataset, consisting of more than 2.3 million labeled records, demonstrate the superior performance of TSLT-Net with 99.99 percent accuracy in multiclass detection and 100 percent in binary anomaly detection, while maintaining a minimal memory footprint of only 0.04 MB and 9722 trainable parameters. These results establish TSLT-Net as an effective and scalable solution for real time drone cybersecurity, particularly suitable for deployment on edge devices in mission critical UAV systems.
Authors:Paschal C. Amusuo, Dongge Liu, Ricardo Andres Calvo Mendez, Jonathan Metzman, Oliver Chang, James C. Davis
Abstract:
Fuzz testing has become a cornerstone technique for identifying software bugs and security vulnerabilities, with broad adoption in both industry and open-source communities. Directly fuzzing a function requires fuzz drivers, which translate random fuzzer inputs into valid arguments for the target function. Given the cost and expertise required to manually develop fuzz drivers, methods exist that leverage program analysis and Large Language Models to automatically generate these drivers. However, the generated fuzz drivers frequently lead to false positive crashes, especially in functions highly structured input and complex state requirements. This problem is especially crucial in industry-scale fuzz driver generation efforts like OSS-Fuzz-en, as reporting false positive crashes to maintainers impede trust in both the system and the team. This paper presents two AI-driven strategies to reduce false positives in OSS-Fuzz-Gen, a multi-agent system for automated fuzz driver generation. First, constraint-based fuzz driver generation proactively enforces constraints on a function's inputs and state to guide driver creation. Second, context-based crash validation reactively analyzes function callers to determine whether reported crashes are feasible from program entry points. Using 1,500 benchmark functions from OSS-Fuzz, we show that these strategies reduce spurious crashes by up to 8%, cut reported crashes by more than half, and demonstrate that frontier LLMs can serve as reliable program analysis agents. Our results highlight the promise and challenges of integrating AI into large-scale fuzzing pipelines.
Authors:Youpeng Li, Kartik Joshi, Xinda Wang, Eric Wong
Abstract:
The widespread adoption of open-source software (OSS) necessitates the mitigation of vulnerability risks. Most vulnerability detection (VD) methods are limited by inadequate contextual understanding, restrictive single-round interactions, and coarse-grained evaluations, resulting in undesired model performance and biased evaluation results. To address these challenges, we propose MAVUL, a novel multi-agent VD system that integrates contextual reasoning and interactive refinement. Specifically, a vulnerability analyst agent is designed to flexibly leverage tool-using capabilities and contextual reasoning to achieve cross-procedural code understanding and effectively mine vulnerability patterns. Through iterative feedback and refined decision-making within cross-role agent interactions, the system achieves reliable reasoning and vulnerability prediction. Furthermore, MAVUL introduces multi-dimensional ground truth information for fine-grained evaluation, thereby enhancing evaluation accuracy and reliability. Extensive experiments conducted on a pairwise vulnerability dataset demonstrate MAVUL's superior performance. Our findings indicate that MAVUL significantly outperforms existing multi-agent systems with over 62% higher pairwise accuracy and single-agent systems with over 600% higher average performance. The system's effectiveness is markedly improved with increased communication rounds between the vulnerability analyst agent and the security architect agent, underscoring the importance of contextual reasoning in tracing vulnerability flows and the crucial feedback role. Additionally, the integrated evaluation agent serves as a critical, unbiased judge, ensuring a more accurate and reliable estimation of the system's real-world applicability by preventing misleading binary comparisons.
Authors:Philip Sjösvärd, Hongyu Jin, Panos Papadimitratos
Abstract:
The Domain Name System (DNS) is involved in practically all web activity, translating easy-to-remember domain names into Internet Protocol (IP) addresses. Due to its central role on the Internet, DNS exposes user web activity in detail. The privacy challenge is honest-but-curious DNS servers/resolvers providing the translation/lookup service. In particular, with the majority of DNS queries handled by public DNS resolvers, the organizations running them can track, collect, and analyze massive user activity data. Existing solutions that encrypt DNS traffic between clients and resolvers are insufficient, as the resolver itself is the privacy threat. While DNS query relays separate duties among multiple entities, to limit the data accessible by each entity, they cannot prevent colluding entities from sharing user traffic logs. To achieve near-zero-trust DNS privacy compatible with the existing DNS infrastructure, we propose LLUAD: it locally stores a Popularity List, the most popular DNS records, on user devices, formed in a privacy-preserving manner based on user interests. In this way, LLUAD can both improve privacy and reduce access times to web content. The Popularity List is proactively retrieved from a (curious) public server that continually updates and refreshes the records based on user popularity votes, while efficiently broadcasting record updates/changes to adhere to aggressive load-balancing schemes (i.e., name servers actively load-balancing user connections by changing record IP addresses). User votes are anonymized using a novel, efficient, and highly scalable client-driven Voting Mix Network - with packet lengths independent of the number of hops, centrally enforced limit on number of votes cast per user, and robustness against poor client participation - to ensure a geographically relevant and correctly/securely instantiated Popularity List.
Authors:Philip Sjösvärd, Hongyu Jin, Panos Papadimitratos
Abstract:
The Domain Name System (DNS) is central to all Internet user activity, resolving accessed domain names into Internet Protocol (IP) addresses. As a result, curious DNS resolvers can learn everything about Internet users' interests. Public DNS resolvers are rising in popularity, offering low-latency resolution, high reliability, privacy-preserving policies, and support for encrypted DNS queries. However, client-resolver traffic encryption, increasingly deployed to protect users from eavesdroppers, does not protect users against curious resolvers. Similarly, privacy-preserving policies are based solely on written commitments and do not provide technical safeguards. Although DNS query relay schemes can separate duties to limit data accessible by each entity, they cannot prevent colluding entities from sharing user traffic logs. Thus, a key challenge remains: organizations operating public DNS resolvers, accounting for the majority of DNS resolutions, can potentially collect and analyze massive volumes of Internet user activity data. With DNS infrastructure that cannot be fully trusted, can we safeguard user privacy? We answer positively and advocate for a user-driven approach to reduce exposure to DNS services. We will discuss key ideas of the proposal, which aims to achieve a high level of privacy without sacrificing performance: maintaining low latency, network bandwidth, memory/storage overhead, and computational overhead.
Authors:Jaehan Kim, Minkyoo Song, Seungwon Shin, Sooel Son
Abstract:
Recent large language models (LLMs) have increasingly adopted the Mixture-of-Experts (MoE) architecture for efficiency. MoE-based LLMs heavily depend on a superficial safety mechanism in which harmful inputs are routed safety-critical experts. However, our analysis reveals that routing decisions for harmful inputs drift significantly after fine-tuning, exposing a critical vulnerability to harmful fine-tuning (HFT) attacks. Existing defenses, primarily designed for monolithic LLMs, are less effective for MoE LLMs as they fail to prevent drift in harmful input routing. To address this limitation, we propose SafeMoE, a safe fine-tuning method tailored to MoE LLMs. SafeMoE directly mitigates routing drift by penalizing the gap between the routing weights of a fine-tuned model and those of the initial safety-aligned model, thereby preserving the safety-aligned routing of harmful inputs to safety-critical experts. Experiments on open-source MoE LLMs ranging from 7B to 141B parameters demonstrate that SafeMoE effectively mitigates HFT attacks, reducing the harmfulness score of OLMoE from 62.0 to 5.0, for example, while maintaining task utility within 1% degradation and incurring only 2% overhead. It significantly outperforms state-of-the-art defense methods for safeguarding LLM fine-tuning and remains effective in recent large-scale MoE LLMs such as gpt-oss and Llama 4. Our implementation is available at https://anonymous.4open.science/r/SafeMoE.
Authors:Andrés Fábrega, Amy Zhao, Jay Yu, James Austgen, Sarah Allen, Kushal Babel, Mahimna Kelkar, Ari Juels
Abstract:
Decentralized Autonomous Organizations (DAOs) use smart contracts to foster communities working toward common goals. Existing definitions of decentralization, however -- the 'D' in DAO -- fall short of capturing the key properties characteristic of diverse and equitable participation. This work proposes a new framework for measuring DAO decentralization called Voting-Bloc Entropy (VBE, pronounced ''vibe''). VBE is based on the idea that voters with closely aligned interests act as a centralizing force and should be modeled as such. VBE formalizes this notion by measuring the similarity of participants' utility functions across a set of voting rounds. Unlike prior, ad hoc definitions of decentralization, VBE derives from first principles: We introduce a simple (yet powerful) reinforcement learning-based conceptual model for voting, that in turn implies VBE. We first show VBE's utility as a theoretical tool. We prove a number of results about the (de)centralizing effects of vote delegation, proposal bundling, bribery, etc. that are overlooked in previous notions of DAO decentralization. Our results lead to practical suggestions for enhancing DAO decentralization. We also show how VBE can be used empirically by presenting measurement studies and VBE-based governance experiments. We make the tools we developed for these results available to the community in the form of open-source artifacts in order to facilitate future study of DAO decentralization.
Authors:Ossi Räisä, Antti Koskela, Antti Honkela
Abstract:
The accuracy-first perspective of differential privacy addresses an important shortcoming by allowing a data analyst to adaptively adjust the quantitative privacy bound instead of sticking to a predetermined bound. Existing works on the accuracy-first perspective have neglected an important property of differential privacy known as post-processing immunity, which ensures that an adversary is not able to weaken the privacy guarantee by post-processing. We address this gap by determining which existing definitions in the accuracy-first perspective have post-processing immunity, and which do not. The only definition with post-processing immunity, pure ex-post privacy, lacks useful tools for practical problems, such as an ex-post analogue of the Gaussian mechanism, and an algorithm to check if accuracy on separate private validation set is high enough. To address this, we propose a new definition based on Rényi differential privacy that has post-processing immunity, and we develop basic theory and tools needed for practical applications. We demonstrate the practicality of our theory with an application to synthetic data generation, where our algorithm successfully adjusts the privacy bound until an accuracy threshold is met on a private validation dataset.
Authors:Zhou Xu, Guyue Li, Zhe Peng, Aiqun Hu
Abstract:
Radio frequency fingerprint (RFF) is a promising device identification technology, with recent research shifting from robustness to security due to growing concerns over vulnerabilities. To date, while the security of RFF against basic spoofing such as MAC address tampering has been validated, its resilience to advanced mimicry remains unknown. To address this gap, we propose a collusion-driven impersonation attack that achieves RF-level mimicry, successfully breaking RFF identification systems across diverse environments. Specifically, the attacker synchronizes with a colluding receiver to match the centralized logarithmic power spectrum (CLPS) of the legitimate transmitter; once the colluder deems the CLPS identical, the victim receiver will also accept the forged fingerprint, completing RF-level spoofing. Given that the distribution of CLPS features is relatively concentrated and has a clear underlying structure, we design a spoofed signal generation network that integrates a variational autoencoder (VAE) with a multi-objective loss function to enhance the similarity and deceptive capability of the generated samples. We carry out extensive simulations, validating cross-channel attacks in environments that incorporate standard channel variations including additive white Gaussian noise (AWGN), multipath fading, and Doppler shift. The results indicate that the proposed attack scheme essentially maintains a success rate of over 95% under different channel conditions, revealing the effectiveness of this attack.
Authors:Hanbo Huang, Yiran Zhang, Hao Zheng, Xuan Gong, Yihan Li, Lin Liu, Shiyu Liang
Abstract:
Large Language Models (LLMs) watermarking has shown promise in detecting AI-generated content and mitigating misuse, with prior work claiming robustness against paraphrasing and text editing. In this paper, we argue that existing evaluations are not sufficiently adversarial, obscuring critical vulnerabilities and overstating the security. To address this, we introduce adaptive robustness radius, a formal metric that quantifies watermark resilience against adaptive adversaries. We theoretically prove that optimizing the attack context and model parameters can substantially reduce this radius, making watermarks highly susceptible to paraphrase attacks. Leveraging this insight, we propose RLCracker, a reinforcement learning (RL)-based adaptive attack that erases watermarks while preserving semantic fidelity. RLCracker requires only limited watermarked examples and zero access to the detector. Despite weak supervision, it empowers a 3B model to achieve 98.5% removal success and an average 0.92 P-SP score on 1,500-token Unigram-marked texts after training on only 100 short samples. This performance dramatically exceeds 6.75% by GPT-4o and generalizes across five model sizes over ten watermarking schemes. Our results confirm that adaptive attacks are broadly effective and pose a fundamental threat to current watermarking defenses.
Authors:Md. Alamgir Hossain, Waqas Ishtiaq, Md. Samiul Islam
Abstract:
The growing integration of drones into civilian, commercial, and defense sectors introduces significant cybersecurity concerns, particularly with the increased risk of network-based intrusions targeting drone communication protocols. Detecting and classifying these intrusions is inherently challenging due to the dynamic nature of drone traffic and the presence of multiple sophisticated attack vectors such as spoofing, injection, replay, and man-in-the-middle (MITM) attacks. This research aims to develop a robust and interpretable intrusion detection framework tailored for drone networks, with a focus on handling multi-class classification and model explainability. We present a comparative analysis of ensemble-based machine learning models, namely Random Forest, Extra Trees, AdaBoost, CatBoost, and XGBoost, trained on a labeled dataset comprising benign traffic and nine distinct intrusion types. Comprehensive data preprocessing was performed, including missing value imputation, scaling, and categorical encoding, followed by model training and extensive evaluation using metrics such as macro F1-score, ROC AUC, Matthews Correlation Coefficient, and Log Loss. Random Forest achieved the highest performance with a macro F1-score of 0.9998 and ROC AUC of 1.0000. To validate the superiority of the models, statistical tests, including Friedmans test, the Wilcoxon signed-rank test with Holm correction, and bootstrapped confidence intervals, were applied. Furthermore, explainable AI methods, SHAP and LIME, were integrated to interpret both global and local feature importance, enhancing model transparency and decision trustworthiness. The proposed approach not only delivers near-perfect accuracy but also ensures interpretability, making it highly suitable for real-time and safety-critical drone operations.
Authors:Yi Chen, Xiaoyang Dong, Ruijie Ma, Yantian Shen, Anyu Wang, Hongbo Yu, Xiaoyun Wang
Abstract:
The machine learning problem of model extraction was first introduced in 1991 and gained prominence as a cryptanalytic challenge starting with Crypto 2020. For over three decades, research in this field has primarily focused on ReLU-based neural networks. In this work, we take the first step towards the cryptanalytic extraction of PReLU neural networks, which employ more complex nonlinear activation functions than their ReLU counterparts. We propose a raw output-based parameter recovery attack for PReLU networks and extend it to more restrictive scenarios where only the top-m probability scores are accessible. Our attacks are rigorously evaluated through end-to-end experiments on diverse PReLU neural networks, including models trained on the MNIST dataset. To the best of our knowledge, this is the first practical demonstration of PReLU neural network extraction across three distinct attack scenarios.
Authors:Miel Verkerken, Laurens D'hooge, Bruno Volckaert, Filip De Turck, Giovanni Apruzzese
Abstract:
Network Intrusion Detection Systems (NIDS) have been studied in research for almost four decades. Yet, despite thousands of papers claiming scientific advances, a non-negligible number of recent works suggest that the findings of prior literature may be questionable. At the root of such a disagreement is the well-known challenge of obtaining data representative of a real-world network-and, hence, usable for security assessments. We tackle such a challenge in this paper. We propose ConCap, a practical tool meant to facilitate experimental research on NIDS. Through ConCap, a researcher can set up an isolated and lightweight network environment and configure it to produce network-related data, such as packets or NetFlows, that are automatically labeled, hence ready for fine-grained experiments. ConCap is rooted on open-source software and is designed to foster experimental reproducibility across the scientific community by sharing just one configuration file. Through comprehensive experiments on 10 different network activities, further expanded via in-depth analyses of 21 variants of two specific activities and of 100 repetitions of four other ones, we empirically verify that ConCap produces network data resembling that of a real-world network. We also carry out experiments on well-known benchmark datasets as well as on a real "smart-home" network, showing that, from a cyber-detection viewpoint, ConCap's automatically-labeled NetFlows are functionally equivalent to those collected in other environments. Finally, we show that ConCap enables to safely reproduce sophisticated attack chains (e.g., to test/enhance existing NIDS). Altogether, ConCap is a solution to the "data problem" that is plaguing NIDS research.
Authors:Hannah Keller, Jacob Imola, Fabrizio Boninsegna, Rasmus Pagh, Amrita Roy Chowdhury
Abstract:
Quantiles are key in distributed analytics, but computing them over sensitive data risks privacy. Local differential privacy (LDP) offers strong protection but lower accuracy than central DP, which assumes a trusted aggregator. Secure multi-party computation (MPC) can bridge this gap, but generic MPC solutions face scalability challenges due to large domains, complex secure operations, and multi-round interactions. We present Piquant$\varepsilon$, a system for privacy-preserving estimation of multiple quantiles in a distributed setting without relying on a trusted server. Piquant$\varepsilon$ operates under the malicious threat model and achieves accuracy of the central DP model. Built on the two-server model, Piquant$\varepsilon$ uses a novel strategy of releasing carefully chosen intermediate statistics, reducing MPC complexity while preserving end-to-end DP. Empirically, Piquant$\varepsilon$ estimates 5 quantiles on 1 million records in under a minute with domain size $10^9$, achieving up to $10^4$-fold higher accuracy than LDP, and up to $\sim 10\times$ faster runtime compared to baselines.
Authors:Mengxiao Wang, Guofei Gu
Abstract:
Progressive Web Applications (PWAs) blend the advantages of web and native apps, offering features like offline access, push notifications, and installability. Beyond these, modern PWAs are increasingly granted system-level capabilities such as auto-start on login and shared context with native applications. However, their permission management remains poorly defined and inconsistently implemented across platforms and browsers. To investigate these gaps, we developed Permissioner, a cross-platform analysis tool, and conducted a systematic study of PWA permissions. Our analysis uncovered critical issues of inconsistency, incompleteness, and unclear boundaries in permission enforcement, leading to various attacks including permission leakage, device identification, and Permission API abuse. We further examined why some browsers resist adopting more granular permission controls, identifying trade-offs involving usability, compatibility, and platform limitations. Through collaboration with browser vendors, several issues reported in our findings were acknowledged and resolved, notably by Firefox and Chrome. Our work highlights the urgent need for a unified, robust permission model for PWAs and provides actionable guidance toward achieving this goal.
Authors:Mengxiao Wang, Guofei Gu
Abstract:
Progressive Web App (PWA) installation is critical for integrating web and mobile app functionalities, offering a seamless user experience. However, ensuring the security of the PWA installation lifecycle is essential for maintaining user trust and privacy. This paper introduces the GUARDIANPWA framework, a comprehensive approach to analyzing the PWA installation mechanism based on the CIA security principles (Confidentiality, Integrity, and Availability) and identifying areas where browser vendors fail to comply with these principles. Our study revealed 203 instances of non-compliance with security principles, highlighting how these irregularities in the PWA installation lifecycle can lead to potential violations of user privacy. For instance, in Firefox, PWAs installed in private mode incorrectly appear in normal mode, risking user confidentiality. Additionally, 29,465 PWAs are at risk because Samsung Internet does not display origins when PWAs navigate to third-party websites, undermining integrity. These findings were reported to browser vendors, leading to Firefox acknowledging four issues, resolving one, and planning to resolve two others. GUARDIANPWA supports developers by analyzing PWA manifest files for syntactic and semantic correctness, offering actionable recommendations, and helping to create PWAs that align with security best practices. By using GUARDIANPWA, developers and users can address critical security gaps and enhance compliance with CIA principles throughout the PWA installation lifecycle.
Authors:Jeremy Boy, Antoon Purnal, Anna Pätschke, Luca Wilke, Thomas Eisenbarth
Abstract:
As quantum computing advances, PQC schemes are adopted to replace classical algorithms. Among them is the SLH-DSA that was recently standardized by NIST and is favored for its conservative security foundations. In this work, we present the first software-only universal forgery attack on SLH-DSA, leveraging Rowhammer-induced bit flips to corrupt the internal state and forge signatures. While prior work targeted embedded systems and required physical access, our attack is software-only, targeting commodity desktop and server hardware, significantly broadening the threat model. We demonstrate a full end-to-end attack against all security levels of SLH-DSA in OpenSSL 3.5.1, achieving universal forgery for the highest security level after eight hours of hammering and 36 seconds of post-processing. Our post-processing is informed by a novel complexity analysis that, given a concrete set of faulty signatures, identifies the most promising computational path to pursue. To enable the attack, we introduce Swage, a modular and extensible framework for implementing end-to-end Rowhammer-based fault attacks. Swage abstracts and automates key components of practical Rowhammer attacks. Unlike prior tooling, Swage is untangled from the attacked code, making it reusable and suitable for frictionless analysis of different targets. Our findings highlight that even theoretically sound PQC schemes can fail under real-world conditions, underscoring the need for additional implementation hardening or hardware defenses against Rowhammer.
Authors:Johan Wahréus, Ahmed Hussain, Panos Papadimitratos
Abstract:
Large Language Models (LLMs) are increasingly deployed for task automation and content generation, yet their safety mechanisms remain vulnerable to circumvention through different jailbreaking techniques. In this paper, we introduce \textit{Content Concretization} (CC), a novel jailbreaking technique that iteratively transforms abstract malicious requests into concrete, executable implementations. CC is a two-stage process: first, generating initial LLM responses using lower-tier, less constrained safety filters models, then refining them through higher-tier models that process both the preliminary output and original prompt. We evaluate our technique using 350 cybersecurity-specific prompts, demonstrating substantial improvements in jailbreak Success Rates (SRs), increasing from 7\% (no refinements) to 62\% after three refinement iterations, while maintaining a cost of 7.5\textcent~per prompt. Comparative A/B testing across nine different LLM evaluators confirms that outputs from additional refinement steps are consistently rated as more malicious and technically superior. Moreover, manual code analysis reveals that generated outputs execute with minimal modification, although optimal deployment typically requires target-specific fine-tuning. With eventual improved harmful code generation, these results highlight critical vulnerabilities in current LLM safety frameworks.
Authors:Naifeng Zhang, Sophia Fu, Franz Franchetti
Abstract:
Specialized hardware like application-specific integrated circuits (ASICs) remains the primary accelerator type for cryptographic kernels based on large integer arithmetic. Prior work has shown that commodity and server-class GPUs can achieve near-ASIC performance for these workloads. However, achieving comparable performance on CPUs remains an open challenge. This work investigates the following question: How can we narrow the performance gap between CPUs and specialized hardware for key cryptographic kernels like basic linear algebra subprograms (BLAS) operations and the number theoretic transform (NTT)? To this end, we develop an optimized scalar implementation of these kernels for x86 CPUs at the per-core level. We utilize SIMD instructions (specifically AVX2 and AVX-512) to further improve performance, achieving an average speedup of 38 times and 62 times over state-of-the-art CPU baselines for NTTs and BLAS operations, respectively. To narrow the gap further, we propose a small AVX-512 extension, dubbed multi-word extension (MQX), which delivers substantial speedup with only three new instructions and minimal proposed hardware modifications. MQX cuts the slowdown relative to ASICs to as low as 35 times on a single CPU core. Finally, we perform a roofline analysis to evaluate the peak performance achievable with MQX when scaled across an entire multi-core CPU. Our results show that, with MQX, top-tier server-grade CPUs can approach the performance of state-of-the-art ASICs for cryptographic workloads.
Authors:Jonas C. Ditz, Veronika Lazar, Elmar LichtmeÃ, Carola Plesch, Matthias Heck, Kevin Baum, Markus Langer
Abstract:
Human oversight of AI is promoted as a safeguard against risks such as inaccurate outputs, system malfunctions, or violations of fundamental rights, and is mandated in regulation like the European AI Act. Yet debates on human oversight have largely focused on its effectiveness, while overlooking a critical dimension: the security of human oversight. We argue that human oversight creates a new attack surface within the safety, security, and accountability architecture of AI operations. Drawing on cybersecurity perspectives, we analyze attack vectors that threaten the requirements of effective human oversight, thereby undermining the safety of AI operations. Such attacks may target the AI system, its communication with oversight personnel, or the personnel themselves. We then outline hardening strategies to mitigate these risks. Our contributions are: (1) introducing a security perspective on human oversight, and (2) providing an overview of attack vectors and hardening strategies to enable secure human oversight of AI.
Authors:Praveensankar Manimaran, Mayank Raikwar, Thiago Garrett, Arlindo F. da Conceição, Leander Jehl, Roman Vitenberg
Abstract:
Systems managing Verifiable Credentials are becoming increasingly popular. Unfortunately, their support for revoking previously issued credentials allows verifiers to effectively monitor the validity of the credentials, which is sensitive information. While the issue started to gain recognition, no adequate solution has been proposed so far. In this work, we propose a novel framework for time-limited continuous verification. The holder is able to individually configure the verification period when sharing information with the verifier, and the system guarantees proven untraceability of the revocation status after the verification period expires. Different from existing systems, the implementation adopts a more scalable blacklist approach where tokens corresponding to revoked credentials are stored in the registry. The approach employs ZK proofs that allow holders to prove non-membership in the blacklist. In addition to theoretically proving security, we evaluate the approach analytically and experimentally and show that it significantly improves bandwidth consumption on the holder while being on par with state-of-the-art solutions with respect to the other performance metrics.
Authors:Xian Qin, Xue Yang, Xiaohu Tang
Abstract:
Federated Learning (FL) allows collaborative model training across distributed clients without sharing raw data, thus preserving privacy. However, the system remains vulnerable to privacy leakage from gradient updates and Byzantine attacks from malicious clients. Existing solutions face a critical trade-off among privacy preservation, Byzantine robustness, and computational efficiency. We propose a novel scheme that effectively balances these competing objectives by integrating homomorphic encryption with dimension compression based on the Johnson-Lindenstrauss transformation. Our approach employs a dual-server architecture that enables secure Byzantine defense in the ciphertext domain while dramatically reducing computational overhead through gradient compression. The dimension compression technique preserves the geometric relationships necessary for Byzantine defence while reducing computation complexity from $O(dn)$ to $O(kn)$ cryptographic operations, where $k \ll d$. Extensive experiments across diverse datasets demonstrate that our approach maintains model accuracy comparable to non-private FL while effectively defending against Byzantine clients comprising up to $40\%$ of the network.
Authors:Lichao Wu, Sasha Behrouzi, Mohamadreza Rostami, Maximilian Thang, Stjepan Picek, Ahmad-Reza Sadeghi
Abstract:
Safety alignment is critical for the ethical deployment of large language models (LLMs), guiding them to avoid generating harmful or unethical content. Current alignment techniques, such as supervised fine-tuning and reinforcement learning from human feedback, remain fragile and can be bypassed by carefully crafted adversarial prompts. Unfortunately, such attacks rely on trial and error, lack generalizability across models, and are constrained by scalability and reliability. This paper presents NeuroStrike, a novel and generalizable attack framework that exploits a fundamental vulnerability introduced by alignment techniques: the reliance on sparse, specialized safety neurons responsible for detecting and suppressing harmful inputs. We apply NeuroStrike to both white-box and black-box settings: In the white-box setting, NeuroStrike identifies safety neurons through feedforward activation analysis and prunes them during inference to disable safety mechanisms. In the black-box setting, we propose the first LLM profiling attack, which leverages safety neuron transferability by training adversarial prompt generators on open-weight surrogate models and then deploying them against black-box and proprietary targets. We evaluate NeuroStrike on over 20 open-weight LLMs from major LLM developers. By removing less than 0.6% of neurons in targeted layers, NeuroStrike achieves an average attack success rate (ASR) of 76.9% using only vanilla malicious prompts. Moreover, Neurostrike generalizes to four multimodal LLMs with 100% ASR on unsafe image inputs. Safety neurons transfer effectively across architectures, raising ASR to 78.5% on 11 fine-tuned models and 77.7% on five distilled models. The black-box LLM profiling attack achieves an average ASR of 63.7% across five black-box models, including the Google Gemini family.
Authors:Ahaan Dabholkar, Atul Sharma, Z. Berkay Celik, Saurabh Bagchi
Abstract:
Recent works in federated learning (FL) have shown the utility of leveraging transfer learning for balancing the benefits of FL and centralized learning. In this setting, federated training happens after a stable point has been reached through conventional training. Global model weights are first centrally pretrained by the server on a public dataset following which only the last few linear layers (the classification head) of the model are finetuned across clients. In this scenario, existing data reconstruction attacks (DRAs) in FL show two key weaknesses. First, strongly input-correlated gradient information from the initial model layers is never shared, significantly degrading reconstruction accuracy. Second, DRAs in which the server makes highly specific, handcrafted manipulations to the model structure or parameters (for e.g., layers with all zero weights, identity mappings and rows with identical weight patterns) are easily detectable by an active client. Improving on these, we propose MAUI, a stealthy DRA that does not require any overt manipulations to the model architecture or weights, and relies solely on the gradients of the classification head. MAUI first extracts "robust" feature representations of the input batch from the gradients of the classification head and subsequently inverts these representations to the original inputs. We report highly accurate reconstructions on the CIFAR10 and ImageNet datasets on a variety of model architectures including convolution networks (CNN, VGG11), ResNets (18, 50), ShuffleNet-V2 and Vision Transformer (ViT B-32), regardless of the batch size. MAUI significantly outperforms prior DRAs in reconstruction quality, achieving 40-120% higher PSNR scores.
Authors:Yu Cui, Hang Fu, Haibin Zhang, Licheng Wang, Cong Zuo
Abstract:
Multi-agent debate (MAD) is an emerging approach to improving the reasoning capabilities of large language models (LLMs). Existing MAD methods rely on multiple rounds of interaction among agents to reach consensus, and the final output is selected by majority voting in the last round. However, this consensus-based design faces several limitations. First, multiple rounds of communication increases token overhead and limits scalability. Second, due to the inherent conformity of LLMs, agents that initially produce correct responses may be influenced by incorrect ones during the debate process, causing error propagation. Third, majority voting introduces randomness and unfairness in the decision-making phase, and can degrade the reasoning performance. To address these issues, we propose \textsc{Free-MAD}, a novel MAD framework that eliminates the need for consensus among agents. \textsc{Free-MAD} introduces a novel score-based decision mechanism that evaluates the entire debate trajectory rather than relying on the last round only. This mechanism tracks how each agent's reasoning evolves, enabling more accurate and fair outcomes. In addition, \textsc{Free-MAD} reconstructs the debate phase by introducing anti-conformity, a mechanism that enables agents to mitigate excessive influence from the majority. Experiments on eight benchmark datasets demonstrate that \textsc{Free-MAD} significantly improves reasoning performance while requiring only a single-round debate and thus reducing token costs. We also show that compared to existing MAD approaches, \textsc{Free-MAD} exhibits improved robustness in real-world attack scenarios.
Authors:Muhammad M. Roomi, Suhail S. M. Hussain, Daisuke Mashima
Abstract:
This work provides a detailed specification of the Smart Grid Modelling Language (SG-ML), which is designed for the automated generation of smart grid cyber ranges. SG-ML is defined as a set of XML schemas that describe a smart grid's configuration in both machine-readable and human-friendly ways, thereby bridging the gap between system modelling and automated deployment. Unlike prior ad-hoc approaches to cyber range design, SG-ML provides a unified methodology that integrates both power system and cyber network representations. The SG-ML model can be customized by users to meet specific requirements, such as emulating physical or cyber topologies and configuring network devices. An SG-ML Processor then parses this configured model to instantiate the cyber range environment. The modelling language leverages established standards like the IEC 61850 Substation Configuration Language (SCL) and IEC 61131 PLCopen XML to define power system topology, cyber network topology, and device configurations. This approach allows for the reuse of existing assets, reducing the effort needed to create the SG-ML model. To address gaps not covered by these standards such as attack injection parameters, scenario-specific metadata, and additional network constraints, SG-ML introduces proprietary schemas that complement standard models. Overall, SG-ML enables reproducible, scalable, and automated generation of realistic smart grid cyber ranges for research, training, and security assessment.
Authors:Priyanka Rushikesh Chaudhary, Manan Gupta, Jabez Christopher, Putrevu Venkata Sai Charan, Rajib Ranjan Maiti
Abstract:
The classification of network traffic using machine learning (ML) models is one of the primary mechanisms to address the security issues in IoT networks and/or IoT devices. However, the ML models often act as black-boxes that create a roadblock to take critical decision based on the model output. To address this problem, we design and develop a system, called rCamInspector, that employs Explainable AI (XAI) to provide reliable and trustworthy explanations to model output. rCamInspector adopts two classifiers, Flow Classifier - categorizes a flow into one of four classes, IoTCam, Conf, Share and Others, and SmartCam Classifier - classifies an IoTCam flow into one of six classes, Netatmo, Spy Clock, Canary, D3D, Ezviz, V380 Spy Bulb; both are IP address and transport port agnostic. rCamInspector is evaluated using 38GB of network traffic and our results show that XGB achieves the highest accuracy of 92% and 99% in the Flow and SmartCam classifiers respectively among eight supervised ML models. We analytically show that the traditional mutual information (MI) based feature importance cannot provide enough reliability on the model output of XGB in either classifiers. Using SHAP and LIME, we show that a separate set of features can be picked up to explain a correct prediction of XGB. For example, the feature Init Bwd Win Byts turns out to have the highest SHAP values to support the correct prediction of both IoTCam in Flow Classifier and Netatmo class in SmartCam Classifier. To evaluate the faithfulness of the explainers on our dataset, we show that both SHAP and LIME have a consistency of more than 0.7 and a sufficiency of 1.0. Comparing with existing works, we show that rCamInspector achieves a better accuracy (99%), precision (99%), and false negative rate (0.7%).
Authors:Seyed Moein Abtahi, Akramul Azim
Abstract:
Large Language Models (LLMs) show promise in generating firmware for embedded systems, but often introduce security flaws and fail to meet real-time performance constraints. This paper proposes a three-phase methodology that combines LLM-based firmware generation with automated security validation and iterative refinement in a virtualized environment. Using structured prompts, models like GPT-4 generate firmware for networking and control tasks, deployed on FreeRTOS via QEMU. These implementations are tested using fuzzing, static analysis, and runtime monitoring to detect vulnerabilities such as buffer overflows (CWE-120), race conditions (CWE-362), and denial-of-service threats (CWE-400). Specialized AI agents for Threat Detection, Performance Optimization, and Compliance Verification collaborate to improve detection and remediation. Identified issues are categorized using CWE, then used to prompt targeted LLM-generated patches in an iterative loop. Experiments show a 92.4\% Vulnerability Remediation Rate (37.3\% improvement), 95.8\% Threat Model Compliance, and 0.87 Security Coverage Index. Real-time metrics include 8.6ms worst-case execution time and 195μs jitter. This process enhances firmware security and performance while contributing an open-source dataset for future research.
Authors:Alexandru Cojocaru, Juan Garay, Qipeng Liu, Fang Song
Abstract:
We give novel lifting theorems for security games in the quantum random oracle model (QROM) in Noisy Intermediate-Scale Quantum (NISQ) settings such as the hybrid query model, the noisy oracle and the bounded-depth models. We provide, for the first time, a hybrid lifting theorem for hybrid algorithms that can perform both quantum and classical queries, as well as a lifting theorem for quantum algorithms with access to noisy oracles or bounded quantum depth. At the core of our results lies a novel measure-and-reprogram framework, called hybrid coherent measure-and-reprogramming, tailored specifically for hybrid algorithms. Equipped with the lifting theorem, we are able to prove directly NISQ security and complexity results by calculating a single combinatorial quantity, relying solely on classical reasoning. As applications, we derive the first direct product theorems in the average case, in the hybrid setting-i.e., an enabling tool to determine the hybrid hardness of solving multi-instance security games. This allows us to derive in a straightforward manner the NISQ hardness of various security games, such as (i) the non-uniform hardness of salted games, (ii) the hardness of specific cryptographic tasks such as the multiple instance version of one-wayness and collision-resistance, and (iii) uniform or non-uniform hardness of many other games.
Authors:Alexandru Cojocaru, Juan Garay, Qipeng Liu, Fang Song
Abstract:
We give a tighter lifting theorem for security games in the quantum random oracle model. At the core of our main result lies a novel measure-and-reprogram framework that we call coherent reprogramming. This framework gives a tighter lifting theorem for query complexity problems, that only requires purely classical reasoning. As direct applications of our lifting theorem, we first provide a quantum direct product theorem in the average case - i.e., an enabling tool to determine the hardness of solving multi-instance security games. This allows us to derive in a straightforward manner the hardness of various security games, for example (i) the non-uniform hardness of salted games, (ii) the hardness of specific cryptographic tasks such as the multiple instance version of one-wayness and collision-resistance, and (iii) uniform or non-uniform hardness of many other games.
Authors:Priyanka Rushikesh Chaudhary, Rajib Ranjan Maiti
Abstract:
Protocol fuzzing is a scalable and cost-effective technique for identifying security vulnerabilities in deployed Internet of Things devices. During their operational phase, IoT devices often run lightweight servers to handle user interactions, such as video streaming or image capture in smart cameras. Implementation flaws in transport or application-layer security mechanisms can expose IoT devices to a range of threats, including unauthorized access and data leakage. This paper addresses the challenge of uncovering such vulnerabilities by leveraging protocol fuzzing techniques that inject crafted transport and application-layer packets into IoT communications. We present a mutation-based fuzzing tool, named IoTFuzzSentry, to identify specific non-trivial vulnerabilities in commercial IoT devices. We further demonstrate how these vulnerabilities can be exploited in real-world scenarios. We integrated our fuzzing tool into a well-known testing tool Cotopaxi and evaluated it with commercial-off-the-shelf IoT devices such as IP cameras and Smart Plug. Our evaluation revealed vulnerabilities categorized into 4 types (IoT Access Credential Leakage, Sneak IoT Live Video Stream, Creep IoT Live Image, IoT Command Injection) and we show their exploits using three IoT devices. We have responsibly disclosed all these vulnerabilities to the respective vendors. So far, we have published two CVEs, CVE-2024-41623 and CVE-2024-42531, and one is awaiting. To extend the applicability, we have investigated the traffic of six additional IoT devices and our analysis shows that these devices can have similar vulnerabilities, due to the presence of a similar set of application protocols. We believe that IoTFuzzSentry has the potential to discover unconventional security threats and allow IoT vendors to strengthen the security of their commercialized IoT devices automatically with negligible overhead.
Authors:Rahul Arvind, Nikhil Bansal, Dax Enshan Koh, Tobias Haug, Kishor Bharti
Abstract:
In adversarial settings, where attackers can deliberately and strategically corrupt quantum data, standard quantum error correction reaches its limits. It can only correct up to half the code distance and must output a unique answer. Quantum list decoding offers a promising alternative. By allowing the decoder to output a short list of possible errors, it becomes possible to tolerate far more errors, even under worst-case noise. But two fundamental questions remain: which quantum codes support list decoding, and can we design decoding schemes that are secure against efficient, computationally bounded adversaries? In this work, we answer both. To identify which codes are list-decodable, we provide a generalized version of the Knill-Laflamme conditions. Then, using tools from quantum cryptography, we build an unambiguous list decoding protocol based on pseudorandom unitaries. Our scheme is secure against any quantum polynomial-time adversary, even across multiple decoding attempts, in contrast to previous schemes. Our approach connects coding theory with complexity-based quantum cryptography, paving the way for secure quantum information processing in adversarial settings.
Authors:Zihan Liu, Xiaohu Wang, Chao Lin, Minghui Xu, Debiao He, Xinyi Huang
Abstract:
Privacy-preserving blockchain systems are essential for protecting transaction data, yet they must also provide auditability that enables auditors to recover participant identities and transaction amounts when warranted. Existing designs often compromise the independence of auditing and transactions, introducing extra interactions that undermine usability and scalability. Moreover, many auditable solutions depend on auditors serving as validators or recording nodes, which introduces risks to both data security and system reliability. To overcome these challenges, we propose SilentLedger, a privacy-preserving transaction system with auditing and complete non-interactivity. To support public verification of authorization, we introduce a renewable anonymous certificate scheme with formal semantics and a rigorous security model. SilentLedger further employs traceable transaction mechanisms constructed from established cryptographic primitives, enabling users to transact without interaction while allowing auditors to audit solely from on-chain data. We formally prove security properties including authenticity, anonymity, confidentiality, and soundness, provide a concrete instantiation, and evaluate performance under a standard 2-2 transaction model. Our implementation and benchmarks demonstrate that SilentLedger achieves superior performance compared with state-of-the-art solutions.
Authors:Priyanka Rushikesh Chaudhary, Rajib Ranjan Maiti
Abstract:
The majority of consumer IoT devices lack mechanisms for administrators to monitor and control them, hindering tailored security policies. A key challenge is identifying whether a new device, especially a streaming IoT camera, has joined the network. We present zCamInspector, a system for identifying known IoT cameras with supervised classifiers (zCamClassifier) and detecting zero-day cameras with one-class classifiers (zCamDetector). We analyzed ~40GB of traffic across three datasets: Set I (six commercial IoT cameras), Set II (five open-source IoT cameras, ~1.5GB), and Set III (four conferencing and two video-sharing applications as non-IoT traffic). From each, 62 flow-based features were extracted using CICFlowmeter. zCamInspector employs seven supervised models (ET, DT, RF, KNN, XGB, LKSVM, GNB) and four one-class models (OCSVM, SGDOCSVM, IF, DeepSVDD). Results show that XGB identifies IoT cameras with >99% accuracy and false negatives as low as 0.3%, outperforming state-of-the-art methods. For zero-day detection, accuracies reached 93.20% (OCSVM), 96.55% (SGDOCSVM), 78.65% (IF), and 92.16% (DeepSVDD). When all devices were treated as zero-day, DeepSVDD performed best with mean training/testing accuracies of 96.03%/74.51%. zCamInspector also achieved >95% accuracy for specific devices, such as Spy Clock cameras, demonstrating its robustness for identifying and detecting zero-day IoT cameras in diverse network environments.
Authors:Yan Pang, Wenlong Meng, Xiaojing Liao, Tianhao Wang
Abstract:
With the rapid development of large language models, the potential threat of their malicious use, particularly in generating phishing content, is becoming increasingly prevalent. Leveraging the capabilities of LLMs, malicious users can synthesize phishing emails that are free from spelling mistakes and other easily detectable features. Furthermore, such models can generate topic-specific phishing messages, tailoring content to the target domain and increasing the likelihood of success. Detecting such content remains a significant challenge, as LLM-generated phishing emails often lack clear or distinguishable linguistic features. As a result, most existing semantic-level detection approaches struggle to identify them reliably. While certain LLM-based detection methods have shown promise, they suffer from high computational costs and are constrained by the performance of the underlying language model, making them impractical for large-scale deployment. In this work, we aim to address this issue. We propose Paladin, which embeds trigger-tag associations into vanilla LLM using various insertion strategies, creating them into instrumented LLMs. When an instrumented LLM generates content related to phishing, it will automatically include detectable tags, enabling easier identification. Based on the design on implicit and explicit triggers and tags, we consider four distinct scenarios in our work. We evaluate our method from three key perspectives: stealthiness, effectiveness, and robustness, and compare it with existing baseline methods. Experimental results show that our method outperforms the baselines, achieving over 90% detection accuracy across all scenarios.
Authors:Nicolò Romandini, Carlo Mazzocca, Kai Otsuki, Rebecca Montanari
Abstract:
Blockchain and smart contracts have garnered significant interest in recent years as the foundation of a decentralized, trustless digital ecosystem, thereby eliminating the need for traditional centralized authorities. Despite their central role in powering Web3, their complexity still presents significant barriers for non-expert users. To bridge this gap, Artificial Intelligence (AI)-based agents have emerged as valuable tools for interacting with blockchain environments, supporting a range of tasks, from analyzing on-chain data and optimizing transaction strategies to detecting vulnerabilities within smart contracts. While interest in applying AI to blockchain is growing, the literature still lacks a comprehensive survey that focuses specifically on the intersection with AI agents. Most of the related work only provides general considerations, without focusing on any specific domain. This paper addresses this gap by presenting the first Systematization of Knowledge dedicated to AI-driven systems for blockchain, with a special focus on their security and privacy dimensions, shedding light on their applications, limitations, and future research directions.
Authors:Safayat Bin Hakim, Muhammad Adil, Alvaro Velasquez, Shouhuai Xu, Houbing Herbert Song
Abstract:
Traditional Artificial Intelligence (AI) approaches in cybersecurity exhibit fundamental limitations: inadequate conceptual grounding leading to non-robustness against novel attacks; limited instructibility impeding analyst-guided adaptation; and misalignment with cybersecurity objectives. Neuro-Symbolic (NeSy) AI has emerged with the potential to revolutionize cybersecurity AI. However, there is no systematic understanding of this emerging approach. These hybrid systems address critical cybersecurity challenges by combining neural pattern recognition with symbolic reasoning, enabling enhanced threat understanding while introducing concerning autonomous offensive capabilities that reshape threat landscapes. In this survey, we systematically characterize this field by analyzing 127 publications spanning 2019-July 2025. We introduce a Grounding-Instructibility-Alignment (G-I-A) framework to evaluate these systems, focusing on both cyber defense and cyber offense across network security, malware analysis, and cyber operations. Our analysis shows advantages of multi-agent NeSy architectures and identifies critical implementation challenges including standardization gaps, computational complexity, and human-AI collaboration requirements that constrain deployment. We show that causal reasoning integration is the most transformative advancement, enabling proactive defense beyond correlation-based approaches. Our findings highlight dual-use implications where autonomous systems demonstrate substantial capabilities in zero-day exploitation while achieving significant cost reductions, altering threat dynamics. We provide insights and future research directions, emphasizing the urgent need for community-driven standardization frameworks and responsible development practices that ensure advancement serves defensive cybersecurity objectives while maintaining societal alignment.
Authors:Zhou Li, Yu Zheng, Tianhao Wang, Sang-Woo Jun
Abstract:
Differential privacy (DP) is a privacy-enhancement technology (PET) that receives prominent attention from the academia, industry, and government. One main development over the past decade has been the decentralization of DP, including local DP and shuffle DP. Despite that decentralized DP heavily relies on network communications for data collection,we found that: 1) no systematic study has surveyed the research opportunities at the intersection of networking and DP; 2) nor have there been significant efforts to develop DP mechanisms that are explicitly tailored for network environments. In this paper, we seek to address this gap by initiating a new direction of network-aware DP. We identified two focus areas where the network research can offer substantive contributions to the design and deployment of DP, related to network security and topology. Through this work, we hope to encourage more research that adapt/optimize DP's deployment in various network environments.
Authors:Siddharth Muralee, Sourag Cherupattamoolayil, James C. Davis, Antonio Bianchi, Aravind Machiry
Abstract:
Modern computing systems remain rife with software vulnerabilities. Engineers apply many means to detect them, of which dynamic testing is one of the most common and effective. However, most dynamic testing techniques follow a top-down paradigm, and struggle to reach and exercise functions deep within the call graph. While recent works have proposed Bottom-Up approaches to address these limitations, they face challenges with false positives and generating valid inputs that adhere to the context of the entire program. In this work, we introduce a new paradigm that we call Reactive Bottom-Up Testing. Our insight is that function-level testing is necessary but not sufficient for the validation of vulnerabilities in functions. What we need is a systematic approach that not only tests functions in isolation but also validates their behavior within the broader program context, ensuring that detected vulnerabilities are both reachable and triggerable. We develop a three-stage bottom-up testing scheme: (1) identify likely-vulnerable functions and generate type- and context-aware harnesses; (2) fuzz to find crashes and extract input constraints via symbolic execution; (3) verify crashes by combining constraints to remove false positives. We implemented an automated prototype, which we call Griller. We evaluated Griller in a controlled setting using a benchmark of 48 known vulnerabilities across 5 open-source projects, where we successfully detected 28 known vulnerabilities. Additionally, we evaluated Griller on several real-world applications such as Pacman, and it discovered 6 previously unknown vulnerabilities. Our findings suggest that Reactive Bottom-Up Testing can significantly enhance the detection of vulnerabilities in complex systems, paving the way for more robust security practices.
Authors:Narges Dadkhah, Somayeh Mohammadi, Gerhard Wunder
Abstract:
Determining the optimal block size is crucial for achieving high throughput in blockchain systems. Many studies have focused on tuning various components, such as databases, network bandwidth, and consensus mechanisms. However, the impact of block size on system performance remains a topic of debate, often resulting in divergent views and even leading to new forks in blockchain networks. This research proposes a mathematical model to maximize performance by determining the ideal block size for Hyperledger Fabric, a prominent consortium blockchain. By leveraging machine learning and solving the model with a genetic algorithm, the proposed approach assesses how factors such as block size, transaction size, and network capacity influence the block processing time. The integration of an optimization solver enables precise adjustments to block size configuration before deployment, ensuring improved performance from the outset. This systematic approach aims to balance block processing efficiency, network latency, and system throughput, offering a robust solution to improve blockchain performance across diverse business contexts.
Authors:Napsu Karmitsa, Antti Airola, Tapio Pahikkala, Tinja Pitkämäki
Abstract:
The increasing availability of personal data has enabled significant advances in fields such as machine learning, healthcare, and cybersecurity. However, this data abundance also raises serious privacy concerns, especially in light of powerful re-identification attacks and growing legal and ethical demands for responsible data use. Differential privacy (DP) has emerged as a principled, mathematically grounded framework for mitigating these risks. This review provides a comprehensive survey of DP, covering its theoretical foundations, practical mechanisms, and real-world applications. It explores key algorithmic tools and domain-specific challenges - particularly in privacy-preserving machine learning and synthetic data generation. The report also highlights usability issues and the need for improved communication and transparency in DP systems. Overall, the goal is to support informed adoption of DP by researchers and practitioners navigating the evolving landscape of data privacy.
Authors:Rujie Dai, Peizhuo Lv, Yujiang Gui, Qiujian Lv, Yuanyuan Qiao, Yan Wang, Degang Sun, Weiqing Huang, Yingjiu Li, XiaoFeng Wang
Abstract:
Advanced Persistent Threats (APTs) are prolonged, stealthy intrusions by skilled adversaries that compromise high-value systems to steal data or disrupt operations. Reconstructing complete attack chains from massive, heterogeneous logs is essential for effective attack investigation, yet existing methods suffer from poor platform generality, limited generalization to evolving tactics, and an inability to produce analyst-ready reports. Large Language Models (LLMs) offer strong semantic understanding and summarization capabilities, but in this domain they struggle to capture the long-range, cross-log dependencies critical for accurate reconstruction. To solve these problems, we present an LLM-empowered attack investigation framework augmented with a dynamically adaptable Kill-Chain-aligned threat knowledge base. We organizes attack-relevant behaviors into stage-aware knowledge units enriched with semantic annotations, enabling the LLM to iteratively retrieve relevant intelligence, perform causal reasoning, and progressively expand the investigation context. This process reconstructs multi-phase attack scenarios and generates coherent, human-readable investigation reports. Evaluated on 15 attack scenarios spanning single-host and multi-host environments across Windows and Linux (over 4.3M log events, 7.2 GB of data), the system achieves an average True Positive Rate (TPR) of 97.1% and an average False Positive Rate (FPR) of 0.2%, significantly outperforming the SOTA method ATLAS, which achieves an average TPR of 79.2% and an average FPR of 29.1%.
Authors:Narasimha Raghavan Veeraragavan, Jan Franz Nygård
Abstract:
We investigate how to calculate Kaplan-Meier survival curves across multiple health-care jurisdictions while protecting patient privacy with node-level differential privacy. Each site discloses its curve only once, adding Laplace noise whose scale is determined by the length of the common time grid; the server then averages the noisy curves, so the overall privacy budget remains unchanged. We benchmark four one-shot smoothing techniques: Discrete Cosine Transform, Haar Wavelet shrinkage, adaptive Total-Variation denoising, and a parametric Weibull fit on the NCCTG lung-cancer cohort under five privacy levels and three partition scenarios (uniform, moderately skewed, highly imbalanced). Total-Variation gives the best mean accuracy, whereas the frequency-domain smoothers offer stronger worst-case robustness and the Weibull model shows the most stable behaviour at the strictest privacy setting. Across all methods the released curves keep the empirical log-rank type-I error below fifteen percent for privacy budgets of 0.5 and higher, demonstrating that clinically useful survival information can be shared without iterative training or heavy cryptography.
Authors:Shan Wang, Ming Yang, Yu Liu, Yue Zhang, Shuaiqing Zhang, Zhen Ling, Jiannong Cao, Xinwen Fu
Abstract:
Remote Procedure Call (RPC) services have become a primary gateway for users to access public blockchains. While they offer significant convenience, RPC services also introduce critical privacy challenges that remain insufficiently examined. Existing deanonymization attacks either do not apply to blockchain RPC users or incur costs like transaction fees assuming an active network eavesdropper. In this paper, we propose a novel deanonymization attack that can link an IP address of a RPC user to this user's blockchain pseudonym. Our analysis reveals a temporal correlation between the timestamps of transaction confirmations recorded on the public ledger and those of TCP packets sent by the victim when querying transaction status. We assume a strong passive adversary with access to network infrastructure, capable of monitoring traffic at network border routers or Internet exchange points. By monitoring network traffic and analyzing public ledgers, the attacker can link the IP address of the TCP packet to the pseudonym of the transaction initiator by exploiting the temporal correlation. This deanonymization attack incurs zero transaction fee. We mathematically model and analyze the attack method, perform large-scale measurements of blockchain ledgers, and conduct real-world attacks to validate the attack. Our attack achieves a high success rate of over 95% against normal RPC users on various blockchain networks, including Ethereum, Bitcoin and Solana.
Authors:Kunal Mukherjee, Murat Kantarcioglu
Abstract:
We introduce PROVSEEK, an LLM-powered agentic framework for automated provenance-driven forensic analysis and threat intelligence extraction. PROVSEEK employs specialized toolchains to dynamically retrieve relevant context by generating precise, context-aware queries that fuse a vectorized threat report knowledge base with data from system provenance databases. The framework resolves provenance queries, orchestrates multiple role-specific agents to mitigate hallucinations, and synthesizes structured, ground-truth verifiable forensic summaries. By combining agent orchestration with Retrieval-Augmented Generation (RAG) and chain-of-thought (CoT) reasoning, PROVSEEK enables adaptive multi-step analysis that iteratively refines hypotheses, verifies supporting evidence, and produces scalable, interpretable forensic explanations of attack behaviors. By combining provenance data with agentic reasoning, PROVSEEK establishes a new paradigm for grounded agentic forecics to investigate APTs. We conduct a comprehensive evaluation on publicly available DARPA datasets, demonstrating that PROVSEEK outperforms retrieval-based methods for intelligence extraction task, achieving a 34% improvement in contextual precision/recall; and for threat detection task, PROVSEEK achieves 22%/29% higher precision/recall compared to both a baseline agentic AI approach and State-Of-The-Art (SOTA) Provenance-based Intrusion Detection System (PIDS).
Authors:Mengxiao Wang, Yuxuan Zhang, Guofei Gu
Abstract:
Large Language Models (LLMs) are increasingly integrated into real-world applications, from virtual assistants to autonomous agents. However, their flexibility also introduces new attack vectors-particularly Prompt Injection (PI), where adversaries manipulate model behavior through crafted inputs. As attackers continuously evolve with paraphrased, obfuscated, and even multi-task injection strategies, existing benchmarks are no longer sufficient to capture the full spectrum of emerging threats. To address this gap, we construct a new benchmark that systematically extends prior efforts. Our benchmark subsumes the two widely-used existing ones while introducing new manipulation techniques and multi-task scenarios, thereby providing a more comprehensive evaluation setting. We find that existing defenses, though effective on their original benchmarks, show clear weaknesses under our benchmark, underscoring the need for more robust solutions. Our key insight is that while attack forms may vary, the adversary's intent-injecting an unauthorized task-remains invariant. Building on this observation, we propose PromptSleuth, a semantic-oriented defense framework that detects prompt injection by reasoning over task-level intent rather than surface features. Evaluated across state-of-the-art benchmarks, PromptSleuth consistently outperforms existing defense while maintaining comparable runtime and cost efficiency. These results demonstrate that intent-based semantic reasoning offers a robust, efficient, and generalizable strategy for defending LLMs against evolving prompt injection threats.
Authors:Yaser Baseri, Abdelhakim Senhaji Hafid, Dimitrios Makrakis, Hamidreza Fereidouni
Abstract:
Balancing robust security with strong privacy guarantees is critical for Risk-Based Adaptive Authentication (RBA), particularly in decentralized settings. Federated Learning (FL) offers a promising solution by enabling collaborative risk assessment without centralizing user data. However, existing FL approaches struggle with Non-Independent and Identically Distributed (Non-IID) user features, resulting in biased, unstable, and poorly generalized global models. This paper introduces FL-RBA2, a novel Federated Learning framework for Risk-Based Adaptive Authentication that addresses Non-IID challenges through a mathematically grounded similarity transformation. By converting heterogeneous user features (including behavioral, biometric, contextual, interaction-based, and knowledge-based modalities) into IID similarity vectors, FL-RBA2 supports unbiased aggregation and personalized risk modeling across distributed clients. The framework mitigates cold-start limitations via clustering-based risk labeling, incorporates Differential Privacy (DP) to safeguard sensitive information, and employs Message Authentication Codes (MACs) to ensure model integrity and authenticity. Federated updates are securely aggregated into a global model, achieving strong balance between user privacy, scalability, and adaptive authentication robustness. Rigorous game-based security proofs in the Random Oracle Model formally establish privacy, correctness, and adaptive security guarantees. Extensive experiments on keystroke, mouse, and contextual datasets validate FL-RBA2's effectiveness in high-risk user detection and its resilience to model inversion and inference attacks, even under strong DP constraints.
Authors:Nesrine Benchoubane, Olfa Ben Yahia, William Ferguson, Gurkan Gur, Sumit Chakravarty, Gregory Falco, Gunes Karabulut Kurt
Abstract:
In the face of adverse environmental conditions and cyber threats, robust communication systems for critical applications such as wildfire management and detection demand secure and resilient architectures. This paper presents a novel framework that considers both adversarial factors, building resilience into a heterogeneous network (HetNet) integrating Low Earth Orbit (LEO) satellite constellation with High-Altitude Platform Ground Stations (HAPGS) and Low-Altitude Platforms (LAPS), tailored to support wildfire management operations. Building upon our previous work on secure-by-component approach for link segment security, we extend protection to the communication layer by securing both Radio Frequency (RF)/Free Space Optics (FSO) management and different links. Through a case study, we quantify how environmental stressors impact secrecy capacity and expose the system to passive adversaries. Key findings demonstrate that atmospheric attenuation and beam misalignment can notably degrade secrecy capacity across both short- and long-range communication links, while high-altitude eavesdroppers face less signal degradation, increasing their interception capability. Moreover, increasing transmit power to counter environmental losses can inadvertently improve eavesdropper reception, thereby reducing overall link confidentiality. Our work not only highlights the importance of protecting networks from these dual threats but also aligns with the IEEE P3536 Standard for Space System Cybersecurity Design, ensuring resilience and the prevention of mission failures.
Authors:Peilun Wu, Nan Sun, Nour Moustafa, Youyang Qu, Ming Ding
Abstract:
Endpoint Detection and Response (EDR) solutions embrace the method of attack provenance graph to discover unknown threats through system event correlation. However, this method still faces some unsolved problems in the fields of interoperability, reliability, flexibility, and practicability to deliver actionable results. Our research highlights the limitations of current solutions in detecting obfuscation, correlating attacks, identifying low-frequency events, and ensuring robust context awareness in relation to command-line activities. To address these challenges, we introduce DEFENDCLI, an innovative system leveraging provenance graphs that, for the first time, delves into command-line-level detection. By offering finer detection granularity, it addresses a gap in modern EDR systems that has been overlooked in previous research. Our solution improves the precision of the information representation by evaluating differentiation across three levels: unusual system process calls, suspicious command-line executions, and infrequent external network connections. This multi-level approach enables EDR systems to be more reliable in complex and dynamic environments. Our evaluation demonstrates that DEFENDCLI improves precision by approximately 1.6x compared to the state-of-the-art methods on the DARPA Engagement Series attack datasets. Extensive real-time industrial testing across various attack scenarios further validates its practical effectiveness. The results indicate that DEFENDCLI not only detects previously unknown attack instances, which are missed by other modern commercial solutions, but also achieves a 2.3x improvement in precision over the state-of-the-art research work.
Authors:Nges Brian Njungle, Michel A. Kinsy
Abstract:
The growing adoption of machine learning in sensitive areas such as healthcare and defense introduces significant privacy and security challenges. These domains demand robust data protection, as models depend on large volumes of sensitive information for both training and inference. Fully Homomorphic Encryption (FHE) presents a compelling solution by enabling computations directly on encrypted data, maintaining confidentiality across the entire machine learning workflow. However, FHE inherently supports only linear operations, making it difficult to implement non-linear activation functions, essential components of modern neural networks. This work focuses on designing, implementing, and evaluating activation functions tailored for FHE-based machine learning. We investigate two commonly used functions: the Square function and Rectified Linear Unit (ReLU), using LeNet-5 and ResNet-20 architectures with the CKKS scheme from the OpenFHE library. For ReLU, we assess two methods: a conventional low-degree polynomial approximation and a novel scheme-switching technique that securely evaluates ReLU under FHE constraints. Our findings show that the Square function performs well in shallow networks like LeNet-5, achieving 99.4% accuracy with 128 seconds per image. In contrast, deeper models like ResNet-20 benefit more from ReLU. The polynomial approximation yields 83.8% accuracy with 1,145 seconds per image, while our scheme-switching method improves accuracy to 89.8%, albeit with a longer inference time of 1,697 seconds. These results underscore a critical trade-off in FHE-based ML: faster activation functions often reduce accuracy, whereas those preserving accuracy demand greater computational resources.
Authors:Hossein Shokouhinejad, Roozbeh Razavi-Far, Griffin Higgins, Ali A Ghorbani
Abstract:
Malware detection in modern computing environments demands models that are not only accurate but also interpretable and robust to evasive techniques. Graph neural networks (GNNs) have shown promise in this domain by modeling rich structural dependencies in graph-based program representations such as control flow graphs (CFGs). However, single-model approaches may suffer from limited generalization and lack interpretability, especially in high-stakes security applications. In this paper, we propose a novel stacking ensemble framework for graph-based malware detection and explanation. Our method dynamically extracts CFGs from portable executable (PE) files and encodes their basic blocks through a two-step embedding strategy. A set of diverse GNN base learners, each with a distinct message-passing mechanism, is used to capture complementary behavioral features. Their prediction outputs are aggregated by a meta-learner implemented as an attention-based multilayer perceptron, which both classifies malware instances and quantifies the contribution of each base model. To enhance explainability, we introduce an ensemble-aware post-hoc explanation technique that leverages edge-level importance scores generated by a GNN explainer and fuses them using the learned attention weights. This produces interpretable, model-agnostic explanations aligned with the final ensemble decision. Experimental results demonstrate that our framework improves classification performance while providing insightful interpretations of malware behavior.
Authors:Kexin Chu, Zecheng Lin, Dawei Xiang, Zixu Shen, Jianchang Su, Cheng Chu, Yiwei Yang, Wenhui Zhang, Wenfei Wu, Wei Zhang
Abstract:
Global KV-cache sharing has emerged as a key optimization for accelerating large language model (LLM) inference. However, it exposes a new class of timing side-channel attacks, enabling adversaries to infer sensitive user inputs via shared cache entries. Existing defenses, such as per-user isolation, eliminate leakage but degrade performance by up to 38.9% in time-to-first-token (TTFT), making them impractical for high-throughput deployment. To address this gap, we introduce SafeKV (Secure and Flexible KV Cache Sharing), a privacy-aware KV-cache management framework that selectively shares non-sensitive entries while confining sensitive content to private caches. SafeKV comprises three components: (i) a hybrid, multi-tier detection pipeline that integrates rule-based pattern matching, a general-purpose privacy detector, and context-aware validation; (ii) a unified radix-tree index that manages public and private entries across heterogeneous memory tiers (HBM, DRAM, SSD); and (iii) entropy-based access monitoring to detect and mitigate residual information leakage. Our evaluation shows that SafeKV mitigates 94% - 97% of timing-based side-channel attacks. Compared to per-user isolation method, SafeKV improves TTFT by up to 40.58% and throughput by up to 2.66X across diverse LLMs and workloads. SafeKV reduces cache-induced TTFT overhead from 50.41% to 11.74% on Qwen3-235B. By combining fine-grained privacy control with high cache reuse efficiency, SafeKV reclaims the performance advantages of global sharing while providing robust runtime privacy guarantees for LLM inference.
Authors:Shafizur Rahman Seeam, Ye Zheng, Zhengxiong Li, Yidan Hu
Abstract:
Location-based augmented reality (LB-AR) applications, such as Pokémon Go, stream sub-second GPS updates to deliver responsive and immersive user experiences. However, this high-frequency location reporting introduces serious privacy risks. Protecting privacy in LB-AR is significantly more challenging than in traditional location-based services (LBS), as it demands real-time location protection with strong per-location and trajectory-level privacy guaranteed while maintaining low latency and high quality of service (QoS). Existing methods fail to meet these combined demands.
To fill the gap, we present PrivAR, the first client-side privacy framework for real-time LB-AR. PrivAR introduces two lightweight mechanisms: (i) Planar Staircase Mechanism (PSM) which designs a staircase-shaped distribution to generate noisy location with strong per-location privacy and low expected error; and (ii) Thresholded Reporting with PSM (TR-PSM), a selective scheme that releases a noisy location update only when a displacement exceeds a private threshold, enabling many-to-one mappings for enhanced trace-level privacy while preserving high QoS. We present theoretical analysis, extensive experiments on two public datasets and our proprietary GeoTrace dataset, and validate PrivAR on a Pokémon-Go-style prototype. Results show PrivAR improves QoS (Gamescore) by up to 50%, while increasing attacker error by 1.8x over baseline with an additional 0.06 milliseconds runtime overhead.
Authors:Ali Alkinoon, Trung Cuong Dang, Ahod Alghuried, Abdulaziz Alghamdi, Soohyeon Choi, Manar Mohaisen, An Wang, Saeed Salem, David Mohaisen
Abstract:
The proper use of Android app permissions is crucial to the success and security of these apps. Users must agree to permission requests when installing or running their apps. Despite official Android platform documentation on proper permission usage, there are still many cases of permission abuse. This study provides a comprehensive analysis of the Android permission landscape, highlighting trends and patterns in permission requests across various applications from the Google Play Store. By distinguishing between benign and malicious applications, we uncover developers' evolving strategies, with malicious apps increasingly requesting fewer permissions to evade detection, while benign apps request more to enhance functionality. In addition to examining permission trends across years and app features such as advertisements, in-app purchases, content ratings, and app sizes, we leverage association rule mining using the FP-Growth algorithm. This allows us to uncover frequent permission combinations across the entire dataset, specific years, and 16 app genres. The analysis reveals significant differences in permission usage patterns, providing a deeper understanding of co-occurring permissions and their implications for user privacy and app functionality. By categorizing permissions into high-level semantic groups and examining their application across distinct app categories, this study offers a structured approach to analyzing the dynamics within the Android ecosystem. The findings emphasize the importance of continuous monitoring, user education, and regulatory oversight to address permission misuse effectively.
Authors:Shima Yousefi, Motahare Mounesan, Saptarshi Debroy
Abstract:
In recent years, Deep Neural Networks (DNNs) have become increasingly integral to IoT-based environments, enabling realtime visual computing. However, the limited computational capacity of these devices has motivated the adoption of collaborative DNN inference, where the IoT device offloads part of the inference-related computation to a remote server. Such offloading often requires dynamic DNN partitioning information to be exchanged among the participants over an unsecured network or via relays/hops, leading to novel privacy vulnerabilities. In this paper, we propose AdVAR-DNN, an adversarial variational autoencoder (VAE)-based misclassification attack, leveraging classifiers to detect model information and a VAE to generate untraceable manipulated samples, specifically designed to compromise the collaborative inference process. AdVAR-DNN attack uses the sensitive information exchange vulnerability of collaborative DNN inference and is black-box in nature in terms of having no prior knowledge about the DNN model and how it is partitioned. Our evaluation using the most popular object classification DNNs on the CIFAR-100 dataset demonstrates the effectiveness of AdVAR-DNN in terms of high attack success rate with little to no probability of detection.
Authors:Hongyu Zhu, Sichu Liang, Wenwen Wang, Zhuomeng Zhang, Fangqi Li, Shi-Lin Wang
Abstract:
Modern over-parameterized deep models are highly data-dependent, with large scale general-purpose and domain-specific datasets serving as the bedrock for rapid advancements. However, many datasets are proprietary or contain sensitive information, making unrestricted model training problematic. In the open world where data thefts cannot be fully prevented, Dataset Ownership Verification (DOV) has emerged as a promising method to protect copyright by detecting unauthorized model training and tracing illicit activities. Due to its diversity and superior stealth, evading DOV is considered extremely challenging. However, this paper identifies that previous studies have relied on oversimplistic evasion attacks for evaluation, leading to a false sense of security. We introduce a unified evasion framework, in which a teacher model first learns from the copyright dataset and then transfers task-relevant yet identifier-independent domain knowledge to a surrogate student using an out-of-distribution (OOD) dataset as the intermediary. Leveraging Vision-Language Models and Large Language Models, we curate the most informative and reliable subsets from the OOD gallery set as the final transfer set, and propose selectively transferring task-oriented knowledge to achieve a better trade-off between generalization and evasion effectiveness. Experiments across diverse datasets covering eleven DOV methods demonstrate our approach simultaneously eliminates all copyright identifiers and significantly outperforms nine state-of-the-art evasion attacks in both generalization and effectiveness, with moderate computational overhead. As a proof of concept, we reveal key vulnerabilities in current DOV methods, highlighting the need for long-term development to enhance practicality.
Authors:Sajana Weerawardhena, Paul Kassianik, Blaine Nelson, Baturay Saglam, Anu Vellore, Aman Priyanshu, Supriti Vijay, Massimo Aufiero, Arthur Goldblatt, Fraser Burch, Ed Li, Jianliang He, Dhruv Kedia, Kojin Oshiba, Zhouran Yang, Yaron Singer, Amin Karbasi
Abstract:
Large language models (LLMs) have shown remarkable success across many domains, yet their integration into cybersecurity applications remains limited due to a lack of general-purpose cybersecurity data, representational complexity, and safety and regulatory concerns. To address this gap, we previously introduced Foundation-Sec-8B, a cybersecurity-focused LLM suitable for fine-tuning on downstream tasks. That model, however, was not designed for chat-style interactions or instruction-following. In this report, we release Foundation-Sec-8B-Instruct: a model specifically trained for general-purpose cybersecurity dialogue. Built on Foundation-Sec-8B, it combines domain-specific knowledge with instruction-following, conversational capabilities, and alignment with human preferences to produce high-quality, relevant responses. Comprehensive evaluations show that Foundation-Sec-8B-Instruct outperforms Llama 3.1-8B-Instruct on a range of cybersecurity tasks while matching its instruction-following performance. It is also competitive with GPT-4o-mini on cyber threat intelligence and instruction-following tasks. We envision Foundation-Sec-8B-Instruct becoming an indispensable assistant in the daily workflows of cybersecurity professionals. We release the model publicly at https://huggingface.co/fdtn-ai/Foundation-Sec-8B-Instruct.
Authors:Sahan Sanjaya, Aruna Jayasena, Prabhat Mishra
Abstract:
Context switching is utilized by operating systems to change the execution context between application programs. It involves saving and restoring the states of multiple registers and performing a pipeline flush to remove any pre-fetched instructions, leading to a higher instantaneous power consumption compared to typical program execution. In this paper, we introduce a physical power side-channel leakage source that exploits the power spike observed during a context switch, triggered by the inbuilt sleep function of the system kernel. We observed that this power spike directly correlates with both the power consumption during context switching and the residual power consumption of the previously executed program. Notably, the persistence of residual power signatures from previous workloads extends the scope of this side-channel beyond extracting the data in registers during the context switch. Unlike traditional approaches that require analyzing full power traces, applying complex preprocessing, or relying on external synchronization triggers, this novel technique leverages only the amplitude of a single power spike, significantly simplifying the attack. We developed a power model to illustrate the feasibility of mounting end-to-end side-channel attacks using the sleep-induced power spikes. Experimental evaluation demonstrates that our framework can successfully perform cryptographic key recovery for both AES and SIKE implementations on Broadcom BCM2711.
Authors:Naseem Khan, Tuan Nguyen, Amine Bermak, Issa Khalil
Abstract:
The rapid advancement of Generative Artificial Intelligence has fueled deepfake proliferation-synthetic media encompassing fully generated content and subtly edited authentic material-posing challenges to digital security, misinformation mitigation, and identity preservation. This systematic review evaluates state-of-the-art deepfake detection methodologies, emphasizing reproducible implementations for transparency and validation. We delineate two core paradigms: (1) detection of fully synthetic media leveraging statistical anomalies and hierarchical feature extraction, and (2) localization of manipulated regions within authentic content employing multi-modal cues such as visual artifacts and temporal inconsistencies. These approaches, spanning uni-modal and multi-modal frameworks, demonstrate notable precision and adaptability in controlled settings, effectively identifying manipulations through advanced learning techniques and cross-modal fusion. However, comprehensive assessment reveals insufficient evaluation of adversarial robustness across both paradigms. Current methods exhibit vulnerability to adversarial perturbations-subtle alterations designed to evade detection-undermining reliability in real-world adversarial contexts. This gap highlights critical disconnect between methodological development and evolving threat landscapes. To address this, we contribute a curated GitHub repository aggregating open-source implementations, enabling replication and testing. Our findings emphasize urgent need for future work prioritizing adversarial resilience, advocating scalable, modality-agnostic architectures capable of withstanding sophisticated manipulations. This review synthesizes strengths and shortcomings of contemporary deepfake detection while charting paths toward robust trustworthy systems.
Authors:Matthieu Bettinger, Sonia Ben Mokhtar, Anthony Simonet-Boulogne
Abstract:
The logic of many protocols relies on time measurements. However, in Trusted Execution Environments (TEEs) like Intel SGX, the time source is outside the Trusted Computing Base: a malicious system hosting the TEE can manipulate that TEE's notion of time, e.g., jumping in time or affecting the perceived time speed. Previous work like Triad propose protocols for TEEs to maintain a trustworthy time source. However, in this paper, based on a public implementation of Triad that we contribute, we empirically showcase vulnerabilities to this protocol. For example, an attacker controlling the operating system, and consequently the scheduling algorithm, may arbitrarily manipulate their local TEE's clock speed. What is worse, in case of faster malicious clock speeds, an attacker on a single compromised machine may propagate the attack to honest machines participating in Triad's Trusted Time protocol, causing them to skip to timestamps arbitrarily far in the future. Then, infected honest machines propagate time-skips themselves to other honest machines interacting with them. We discuss protocol changes to Triad for higher resilience against such attacks.
Authors:Zehui Zhao, Laith Alzubaidi, Haider A. Alwzwazy, Jinglan Zhang, Yuantong Gu
Abstract:
In recent years, advanced deep learning architectures have shown strong performance in medical imaging tasks. However, the traditional centralized learning paradigm poses serious privacy risks as all data is collected and trained on a single server. To mitigate this challenge, decentralized approaches such as federated learning and swarm learning have emerged, allowing model training on local nodes while sharing only model weights. While these methods enhance privacy, they struggle with heterogeneous and imbalanced data and suffer from inefficiencies due to frequent communication and the aggregation of weights. More critically, the dynamic and complex nature of clinical environments demands scalable AI systems capable of continuously learning from diverse modalities and multilabels. Yet, both centralized and decentralized models are prone to catastrophic forgetting during system expansion, often requiring full model retraining to incorporate new data. To address these limitations, we propose VGS-ATD, a novel distributed learning framework. To validate VGS-ATD, we evaluate it in experiments spanning 30 datasets and 80 independent labels across distributed nodes, VGS-ATD achieved an overall accuracy of 92.7%, outperforming centralized learning (84.9%) and swarm learning (72.99%), while federated learning failed under these conditions due to high requirements on computational resources. VGS-ATD also demonstrated strong scalability, with only a 1% drop in accuracy on existing nodes after expansion, compared to a 20% drop in centralized learning, highlighting its resilience to catastrophic forgetting. Additionally, it reduced computational costs by up to 50% relative to both centralized and swarm learning, confirming its superior efficiency and scalability.
Authors:Muhammad M. Roomi, S. M. Suhail Hussain, Ee-Chien Chang, David M. Nicol, Daisuke Mashima
Abstract:
Digitalization of power grids have made them increasingly susceptible to cyber-attacks in the past decade. Iterative cybersecurity testing is indispensable to counter emerging attack vectors and to ensure dependability of critical infrastructure. Furthermore, these can be used to evaluate cybersecurity configuration, effectiveness of the cybersecurity measures against various attack vectors, as well as to train smart grid cybersecurity experts defending the system. Enabling extensive experiments narrows the gap between academic research and production environment. A high-fidelity cyber range is vital as it is often infeasible to conduct such experiments and training using production environment. However, the design and implementation of cyber range requires extensive domain knowledge of physical and cyber aspect of the infrastructure. Furthermore, costs incurred for setup and maintenance of cyber range are significant. Moreover, most existing smart grid cyber ranges are designed as a one-off, proprietary system, and are limited in terms of configurability, accessibility, portability, and reproducibility. To address these challenges, an automated Smart grid Cyber Range generation framework is presented in this paper. Initially a human-/machine-friendly, XML-based modeling language called Smart Grid Modeling Language was defined, which incorporates IEC 61850 System Configuration Language files. Subsequently, a toolchain to parse SG-ML model files and automatically instantiate a functional smart grid cyber range was developed. The developed SG-ML models can be easily shared and/or modified to reproduce or customize for any cyber range. The application of Auto-SGCR is demonstrated through case studies with large-scale substation models. The toolchain along with example SG-ML models have been open-sourced.
Authors:Samuel Ovaskainen, Majid Haghparast, Tommi Mikkonen
Abstract:
The number of qubits in quantum computers keeps growing, but most quantum programs remain relatively small because of the noisy nature of the underlying quantum hardware. This might lead quantum cloud providers to explore increased hardware utilization, and thus profitability through means such as multi-programming, which would allow the execution of multiple programs in parallel. The adoption of such technology would bring entirely new challenges to the field of quantum software security. This article explores and reports the key challenges identified in quantum software security within shared quantum computing environments.
Authors:Shafizur Rahman Seeam, Ye Zheng, Yidan Hu
Abstract:
Large-scale data collection, from national censuses to IoT-enabled smart homes, routinely gathers dozens of attributes per individual. These multi-attribute datasets are vital for analytics but pose significant privacy risks. Local Differential Privacy (LDP) is a powerful tool to protect user data privacy by allowing users to locally perturb their records before releasing to an untrusted data aggregator. However, existing LDP mechanisms either split the privacy budget across all attributes or treat each attribute independently, ignoring natural inter-attribute correlations. This leads to excessive noise or fragmented budgets, resulting in significant utility loss, particularly in high-dimensional settings.
To overcome these limitations, we propose Correlated Randomized Response (Corr-RR), a novel LDP mechanism that leverages correlations among attributes to substantially improve utility while maintaining rigorous LDP guarantees. Corr-RR allocates the full privacy budget to perturb a single, randomly selected attribute and reconstructs the remaining attributes using estimated interattribute dependencies, without incurring additional privacy cost. To enable this, Corr-RR operates in two phases: (1) a subset of users apply standard LDP mechanisms to estimate correlations, and (2) each remaining user perturbs one attribute and infers the others using the learned correlations. We theoretically prove that Corr-RR satisfies $ε$-LDP, and extensive experiments on synthetic and real-world datasets demonstrate that Corr-RR consistently outperforms state-of-the-art LDP mechanisms, particularly in scenarios with many attributes and strong inter-attribute correlations.
Authors:Youpeng Li, Weiliang Qi, Xuyu Wang, Fuxun Yu, Xinda Wang
Abstract:
The rapid advancement of pre-trained language models (PLMs) has demonstrated promising results for various code-related tasks. However, their effectiveness in detecting real-world vulnerabilities remains a critical challenge. % for the security community. While existing empirical studies evaluate PLMs for vulnerability detection (VD), their inadequate consideration in data preparation, evaluation setups, and experimental settings undermines the accuracy and comprehensiveness of evaluations. This paper introduces RevisitVD, an extensive evaluation of 17 PLMs spanning smaller code-specific PLMs and large-scale PLMs using newly constructed datasets. Specifically, we compare the performance of PLMs under both fine-tuning and prompt engineering, assess their effectiveness and generalizability across various training and testing settings, and analyze their robustness against code normalization, abstraction, and semantic-preserving transformations.
Our findings reveal that, for VD tasks, PLMs incorporating pre-training tasks designed to capture the syntactic and semantic patterns of code outperform both general-purpose PLMs and those solely pre-trained or fine-tuned on large code corpora. However, these models face notable challenges in real-world scenarios, such as difficulties in detecting vulnerabilities with complex dependencies, handling perturbations introduced by code normalization and abstraction, and identifying semantic-preserving vulnerable code transformations. Also, the truncation caused by the limited context windows of PLMs can lead to a non-negligible amount of labeling errors. This study underscores the importance of thorough evaluations of model performance in practical scenarios and outlines future directions to help enhance the effectiveness of PLMs for realistic VD applications.
Authors:Natalia Tomashenko, Emmanuel Vincent, Marc Tommasi
Abstract:
The temporal dynamics of speech, encompassing variations in rhythm, intonation, and speaking rate, contain important and unique information about speaker identity. This paper proposes a new method for representing speaker characteristics by extracting context-dependent duration embeddings from speech temporal dynamics. We develop novel attack models using these representations and analyze the potential vulnerabilities in speaker verification and voice anonymization systems.The experimental results show that the developed attack models provide a significant improvement in speaker verification performance for both original and anonymized data in comparison with simpler representations of speech temporal dynamics reported in the literature.
Authors:Yao Ma, Wen Yu Kon, Jefferson Chu, Kevin Han Yong Loh, Kaushik Chakraborty, Charles Lim
Abstract:
Identity verification is the process of confirming an individual's claimed identity, which is essential in sectors like finance, healthcare, and online services to ensure security and prevent fraud. However, current password/PIN-based identity solutions are susceptible to phishing or skimming attacks, where malicious intermediaries attempt to steal credentials using fake identification portals. Alikhani et al. [Nature, 2021] began exploring identity verification through graph coloring-based relativistic zero-knowledge proofs (RZKPs), a key cryptographic primitive that enables a prover to demonstrate knowledge of secret credentials to a verifier without disclosing any information about the secret. Our work advances this field and addresses unresolved issues: From an engineering perspective, we relax further the relativistic constraints from 60m to 30m, and significantly enhance the stability and scalability of the experimental demonstration of the 2-prover graph coloring-based RZKP protocol for near-term use cases. At the same time, for long-term security against entangled malicious provers, we propose a modified protocol with comparable computation and communication costs, we establish an upper bound on the soundness parameter for this modified protocol. On the other hand, we extend the two-prover, two-verifier setup to a three-prover configuration, demonstrating the security of such relativistic protocols against entangled malicious provers.
Authors:Md Rafid Haque, Abu Raihan Mostofa Kamal, Md. Azam Hossain
Abstract:
Federated Learning (FL) offers a paradigm for privacy-preserving collaborative AI, but its decentralized nature creates significant vulnerabilities to model poisoning attacks. While numerous static defenses exist, their effectiveness is highly context-dependent, often failing against adaptive adversaries or in heterogeneous data environments. This paper introduces FedStrategist, a novel meta-learning framework that reframes robust aggregation as a real-time, cost-aware control problem. We design a lightweight contextual bandit agent that dynamically selects the optimal aggregation rule from an arsenal of defenses based on real-time diagnostic metrics. Through comprehensive experiments, we demonstrate that no single static rule is universally optimal. We show that our adaptive agent successfully learns superior policies across diverse scenarios, including a ``Krum-favorable" environment and against a sophisticated "stealth" adversary designed to neutralize specific diagnostic signals. Critically, we analyze the paradoxical scenario where a non-robust baseline achieves high but compromised accuracy, and demonstrate that our agent learns a conservative policy to prioritize model integrity. Furthermore, we prove the agent's policy is controllable via a single "risk tolerance" parameter, allowing practitioners to explicitly manage the trade-off between performance and security. Our work provides a new, practical, and analyzable approach to creating resilient and intelligent decentralized AI systems.
Authors:Rui Zhao, Vladyslav Melnychuk, Jun Zhao, Jesse Wright, Nigel Shadbolt
Abstract:
In modern times, people have numerous online accounts, but they rarely read the Terms of Service or Privacy Policy of those sites despite claiming otherwise. This paper introduces PoliAnalyzer, a neuro-symbolic system that assists users with personalized privacy policy analysis. PoliAnalyzer uses Natural Language Processing (NLP) to extract formal representations of data usage practices from policy texts. In favor of deterministic, logical inference is applied to compare user preferences with the formal privacy policy representation and produce a compliance report. To achieve this, we extend an existing formal Data Terms of Use policy language to model privacy policies as app policies and user preferences as data policies. In our evaluation using our enriched PolicyIE dataset curated by legal experts, PoliAnalyzer demonstrated high accuracy in identifying relevant data usage practices, achieving F1-score of 90-100% across most tasks. Additionally, we demonstrate how PoliAnalyzer can model diverse user data-sharing preferences, derived from prior research as 23 user profiles, and perform compliance analysis against the top 100 most-visited websites. This analysis revealed that, on average, 95.2% of a privacy policy's segments do not conflict with the analyzed user preferences, enabling users to concentrate on understanding the 4.8% (636 / 13205) that violates preferences, significantly reducing cognitive burden. Further, we identified common practices in privacy policies that violate user expectations - such as the sharing of location data with 3rd parties. This paper demonstrates that PoliAnalyzer can support automated personalized privacy policy analysis at scale using off-the-shelf NLP tools. This sheds light on a pathway to help individuals regain control over their data and encourage societal discussions on platform data practices to promote a fairer power dynamic.
Authors:Agata Kaczmarek, Dawid PÅudowski, Piotr WilczyÅski, Krzysztof Kotowski, Ramez Shendy, Evridiki Ntagiou, Jakub Nalepa, Artur Janicki, PrzemysÅaw Biecek
Abstract:
The "Fake or Real" competition hosted on Kaggle (https://www.kaggle.com/competitions/fake-or-real-the-impostor-hunt ) is the second part of a series of follow-up competitions and hackathons related to the "Assurance for Space Domain AI Applications" project funded by the European Space Agency (https://assurance-ai.space-codev.org/ ). The competition idea is based on two real-life AI security threats identified within the project -- data poisoning and overreliance in Large Language Models. The task is to distinguish between the proper output from LLM and the output generated under malicious modification of the LLM. As this problem was not extensively researched, participants are required to develop new techniques to address this issue or adjust already existing ones to this problem's statement.
Authors:Zihao Xue, Zhen Bi, Long Ma, Zhenlin Hu, Yan Wang, Zhenfang Liu, Qing Sheng, Jie Xiao, Jungang Lou
Abstract:
While reinforcement learning-trained Large Reasoning Models (LRMs, e.g., Deepseek-R1) demonstrate advanced reasoning capabilities in the evolving Large Language Models (LLMs) domain, their susceptibility to security threats remains a critical vulnerability. This weakness is particularly evident in Chain-of-Thought (CoT) generation processes, where adversarial methods like backdoor prompt attacks can systematically subvert the model's core reasoning mechanisms. The emerging Chain-of-Thought Attack (CoTA) reveals this vulnerability through exploiting prompt controllability, simultaneously degrading both CoT safety and task performance with low-cost interventions. To address this compounded security-performance vulnerability, we propose Thought Purity (TP): a defense paradigm that systematically strengthens resistance to malicious content while preserving operational efficacy. Our solution achieves this through three synergistic components: (1) a safety-optimized data processing pipeline (2) reinforcement learning-enhanced rule constraints (3) adaptive monitoring metrics. Our approach establishes the first comprehensive defense mechanism against CoTA vulnerabilities in reinforcement learning-aligned reasoning systems, significantly advancing the security-functionality equilibrium for next-generation AI architectures.
Authors:Zhengyue Zhao, Yingzi Ma, Somesh Jha, Marco Pavone, Chaowei Xiao
Abstract:
Large Language Models (LLMs) have demonstrated remarkable generative capabilities. However, their susceptibility to misuse has raised significant safety concerns. While post-training safety alignment methods have been widely adopted, LLMs remain vulnerable to malicious instructions that can bypass safety constraints. Recent efforts have introduced inference-time safety reasoning (system-2 alignment), where LLMs conduct a reasoning process to perform safety verification before final response. We show, however, that these checks are driven by ad-hoc reasoning that diverges from the structured human process, where they first discern a user's true intent, then evaluate the associated risk based on the true intent. Consequently, these defenses remain vulnerable to sophisticated jailbreak prompts that cloak harmful goals in seemingly benign language. To build secure and safe LLMs, we propose a reasoning-based safety alignment framework, ARMOR, that replaces the ad-hoc chains of thought reasoning process with human-aligned, structured one. At inference, ARMOR (1) detects likely jailbreak strategies, (2) extracts the user's core intent while discarding deceptive instructions, and (3) applies a policy-grounded safety analysis to the purified request. ARMOR is evaluated on adaptive jailbreak attacks and multiple safety benchmarks, and a test-time scaling is conducted to further improve its performance. Results demonstrate that ARMOR significantly enhances the robustness against state-of-the-art adaptive jailbreak attacks and outperforms recent reasoning-based aligned models across various safety benchmarks.
Authors:Zeki Kazan, Sagar Sharma, Wanrong Zhang, Bo Jiang, Qiang Yan
Abstract:
As the use of differential privacy (DP) becomes widespread, the development of effective tools for reasoning about the privacy guarantee becomes increasingly critical. In pursuit of this goal, we demonstrate novel relationships between DP and measures of statistical disclosure risk. We suggest how experts and non-experts can use these results to explain the DP guarantee, interpret DP composition theorems, select and justify privacy parameters, and identify worst-case adversary prior probabilities.
Authors:Weichen Yu, Ravi Mangal, Terry Zhuo, Matt Fredrikson, Corina S. Pasareanu
Abstract:
Large language models (LLMs) have become proficient at sophisticated code-generation tasks, yet remain ineffective at reliably detecting or avoiding code vulnerabilities. Does this deficiency stem from insufficient learning about code vulnerabilities, or is it merely a result of ineffective prompting? Using representation engineering techniques, we investigate whether LLMs internally encode the concepts necessary to identify code vulnerabilities. We find that current LLMs encode precise internal representations that distinguish vulnerable from secure code--achieving greater accuracy than standard prompting approaches. Leveraging these vulnerability-sensitive representations, we develop an inference-time steering technique that subtly modulates the model's token-generation probabilities through a mixture of corrections (MoC). Our method effectively guides LLMs to produce less vulnerable code without compromising functionality, demonstrating a practical approach to controlled vulnerability management in generated code. Notably, MoC enhances the security ratio of Qwen2.5-Coder-7B by 8.9\%, while simultaneously improving functionality on HumanEval pass@1 by 2.1\%.
Authors:Yifan Zhang, Yu Bai, Riku Jantti, Zheng Yan, Christos Masouros, Zhu Han
Abstract:
Integrated sensing and communication (ISAC) systems potentially encounter significant performance degradation in densely obstructed urban and non-line-of-sight scenarios, thus limiting their effectiveness in practical deployments. To deal with these challenges, this paper proposes a backscatter device (BD)-assisted ISAC system, which leverages passive BDs naturally distributed in underlying environments for performance enhancement. These ambient devices can enhance sensing accuracy and communication reliability by providing additional reflective signal paths. In this system, we define the Pareto boundary characterizing the trade-off between sensing mutual information (SMI) and communication rates to provide fundamental insights for its design. To derive the boundary, we formulate a performance optimization problem within an orthogonal frequency division multiplexing (OFDM) framework, by jointly optimizing time-frequency resource element (RE) allocation, transmit power management, and BD modulation decisions. To tackle the non-convexity of the problem, we decompose it into three subproblems, solved iteratively through a block coordinate descent (BCD) algorithm. Specifically, the RE subproblem is addressed using the successive convex approximation (SCA) method, the power subproblem is solved using an augmented Lagrangian combined water-filling method, and the BD modulation subproblem is tackled using semidefinite relaxation (SDR) methods. Additionally, we demonstrate the generality of the proposed system by showing its adaptability to bistatic ISAC scenarios and MIMO settings. Finally, extensive simulation results validate the effectiveness of the proposed system and its superior performance compared to existing state-of-the-art ISAC schemes.
Authors:Ioannis Lamprou, Alexander Shevtsov, Ioannis Arapakis, Sotiris Ioannidis
Abstract:
The proliferation of software vulnerabilities presents a significant challenge to cybersecurity, necessitating more effective detection methodologies. We introduce White-Basilisk, a novel approach to vulnerability detection that demonstrates superior performance while challenging prevailing assumptions in AI model scaling. Utilizing an innovative architecture that integrates Mamba layers, linear self-attention, and a Mixture of Experts framework, White-Basilisk achieves state-of-the-art results in vulnerability detection tasks with a parameter count of only 200M. The model's capacity to process sequences of unprecedented length enables comprehensive analysis of extensive codebases in a single pass, surpassing the context limitations of current Large Language Models (LLMs). White-Basilisk exhibits robust performance on imbalanced, real-world datasets, while maintaining computational efficiency that facilitates deployment across diverse organizational scales. This research not only establishes new benchmarks in code security but also provides empirical evidence that compact, efficiently designed models can outperform larger counterparts in specialized tasks, potentially redefining optimization strategies in AI development for domain-specific applications.
Authors:Sounak Bhowmik, Travis S. Humble, Himanshu Thapliyal
Abstract:
Quantum neural networks (QNN) hold immense potential for the future of quantum machine learning (QML). However, QNN security and robustness remain largely unexplored. In this work, we proposed novel Trojan attacks based on the quantum computing properties in a QNN-based binary classifier. Our proposed Quantum Properties Trojans (QuPTs) are based on the unitary property of quantum gates to insert noise and Hadamard gates to enable superposition to develop Trojans and attack QNNs. We showed that the proposed QuPTs are significantly stealthier and heavily impact the quantum circuits' performance, specifically QNNs. The most impactful QuPT caused a deterioration of 23% accuracy of the compromised QNN under the experimental setup. To the best of our knowledge, this is the first work on the Trojan attack on a fully quantum neural network independent of any hybrid classical-quantum architecture.
Authors:Federico Maria Cau, Giuseppe Desolda, Francesco Greco, Lucio Davide Spano, Luca Viganò
Abstract:
Phishing has become a prominent risk in modern cybersecurity, often used to bypass technological defences by exploiting predictable human behaviour. Warning dialogues are a standard mitigation measure, but the lack of explanatory clarity and static content limits their effectiveness. In this paper, we report on our research to assess the capacity of Large Language Models (LLMs) to generate clear, concise, and scalable explanations for phishing warnings. We carried out a large-scale between-subjects user study (N = 750) to compare the influence of warning dialogues supplemented with manually generated explanations against those generated by two LLMs, Claude 3.5 Sonnet and Llama 3.3 70B. We investigated two explanatory styles (feature-based and counterfactual) for their effects on behavioural metrics (click-through rate) and perceptual outcomes (e.g., trust, risk, clarity). The results indicate that well-constructed LLM-generated explanations can equal or surpass manually crafted explanations in reducing susceptibility to phishing; Claude-generated warnings exhibited particularly robust performance. Feature-based explanations were more effective for genuine phishing attempts, whereas counterfactual explanations diminished false-positive rates. Other variables such as workload, gender, and prior familiarity with warning dialogues significantly moderated warning effectiveness. These results indicate that LLMs can be used to automatically build explanations for warning users against phishing, and that such solutions are scalable, adaptive, and consistent with human-centred values.
Authors:Jungeun Lim, Stephan A. Fahrenkrog-Petersen, Xixi Lu, Jan Mendling, Minseok Song
Abstract:
Information systems support the execution of business processes. The event logs of these executions generally contain sensitive information about customers, patients, and employees. The corresponding privacy challenges can be addressed by anonymizing the event logs while still retaining utility for process discovery. However, trading off utility and privacy is difficult: the higher the complexity of event log, the higher the loss of utility by anonymization. In this work, we propose a pipeline that combines anonymization and event data partitioning, where event abstraction is utilized for partitioning. By leveraging event abstraction, event logs can be segmented into multiple parts, allowing each sub-log to be anonymized separately. This pipeline preserves privacy while mitigating the loss of utility. To validate our approach, we study the impact of event partitioning on two anonymization techniques using three real-world event logs and two process discovery techniques. Our results demonstrate that event partitioning can bring improvements in process discovery utility for directly-follows-based anonymization techniques.
Authors:Yue Su, Meng Shen, Cong Zuo, Yuzhi Liu, Liehuang Zhu
Abstract:
Conjunctive Searchable Symmetric Encryption (CSSE) enables secure conjunctive searches over encrypted data. While leakage-abuse attacks (LAAs) against single-keyword SSE have been extensively studied, their extension to conjunctive queries faces a critical challenge: the combinatorial explosion of candidate keyword combinations, leading to enormous time and space overhead for attacks. In this paper, we reveal a fundamental vulnerability in state-of-the-art CSSE schemes: s-term leakage, where the keyword with the minimal document frequency in a query leaks distinct patterns. We propose S-Leak, the first passive attack framework that progressively recovers conjunctive queries by exploiting s-term leakage and global leakage. Our key innovation lies in a three-stage approach: identifying the s-term of queries, pruning low-probability keyword conjunctions, and reconstructing full queries. We propose novel metrics to better assess attacks in conjunctive query scenarios. Empirical evaluations on real-world datasets demonstrate that our attack is effective in diverse CSSE configurations. When considering 161,700 conjunctive keyword queries, our attack achieves a 95.15% accuracy in recovering at least one keyword, 82.57% for at least two, 58% for all three keywords, and maintains efficacy against defenses such as SEAL padding and CLRZ obfuscation. Our work exposes the underestimated risks of s-term leakage in practical SSE deployments and calls for a redesign of leakage models for multi-keyword search scenarios.
Authors:Salahuddin Salahuddin, Ahmed Hussain, Jussi Löppönen, Toni Jutila, Panos Papadimitratos
Abstract:
While Large Language Models (LLMs) demonstrate exceptional natural language capabilities, general-purpose models lack specialized domain knowledge for effective cybersecurity analysis. In this work, we investigate Domain-Adaptive Continuous Pretraining (DAP) as a methodology for enhancing cybersecurity understanding in pretrained LLMs while preserving general language capabilities. We systematically adapted three decoder-based architectures -- Llama-3.1-8B, DeepSeek-R1-Distill-Qwen-14B, and Llama-3.3-70B-Instruct -- using a curated 126-million-word cybersecurity corpus from standards, academic literature, and various other sources. Our approach employed constrained training parameters and distributed FSDP training to balance domain specialization with knowledge preservation. Evaluation across three cybersecurity benchmarks, namely, CTI-MCQ, CyberMetric, and SecEval, demonstrates consistent improvements post-adaptation. The Llama-3.3-70B-Ins-DAP model achieved state-of-the-art accuracies of 0.718, 0.933, and 0.864, respectively, outperforming specialized models, including Llama-Primus-Base. Notably, competitive performance was achieved using substantially smaller datasets (118.8 million versus 2.77 billion tokens), demonstrating efficient domain specialization viability. We establish that targeted continuous pretraining enables effective cybersecurity domain adaptation with computational feasibility, providing foundations for specialized AI assistants in threat analysis, vulnerability assessment, and security documentation while challenging prevailing assumptions about data requirements for LLM specialization.
Authors:Almog Hilel, Idan Shenfeld, Jacob Andreas, Leshem Choshen
Abstract:
We describe a vulnerability in language models (LMs) trained with user feedback, whereby a single user can persistently alter LM knowledge and behavior given only the ability to provide prompts and upvote / downvote feedback on LM outputs. To implement the attack, the attacker prompts the LM to stochastically output either a "poisoned" or benign response, then upvotes the poisoned response or downvotes the benign one. When feedback signals are used in a subsequent preference tuning behavior, LMs exhibit increased probability of producing poisoned responses even in contexts without malicious prompts. We show that this attack can be used to (1) insert factual knowledge the model did not previously possess, (2) modify code generation patterns in ways that introduce exploitable security flaws, and (3) inject fake financial news. Our finding both identifies a new qualitative feature of language model preference tuning (showing that it even highly restricted forms of preference data can be used to exert fine-grained control over behavior), and a new attack mechanism for LMs trained with user feedback (extending work on pretraining-time data poisoning and deployment-time prompt injection).
Authors:Ye Zheng, Yidan Hu
Abstract:
Local differential privacy (LDP) provides a rigorous and quantifiable privacy guarantee for personal data by introducing perturbation at the data source. However, quantifying the impact of these perturbations on classifier utility remains a theoretical challenge, particularly for complex or black-box classifiers.
This paper presents a framework for theoretically quantifying classifier utility under LDP mechanisms. The key insight is that LDP perturbation is concentrated around the original data with a specific probability, transforming utility analysis of the classifier into its robustness analysis in this concentrated region. Our framework connects the concentration analysis of LDP mechanisms with the robustness analysis of classifiers. It treats LDP mechanisms as general distributional functions and classifiers as black-box functions, thus applicable to any LDP mechanism and classifier. A direct application of our utility quantification is guiding the selection of LDP mechanisms and privacy parameters for a given classifier. Notably, our analysis shows that a piecewise-based mechanism leads to better utility compared to alternatives in common scenarios.
Using this framework alongside two novel refinement techniques, we conduct case studies on utility quantification for typical mechanism-classifier combinations. The results demonstrate that our theoretical utility quantification aligns closely with empirical observations, particularly when classifiers operate in lower-dimensional input spaces.
Authors:Zhicheng Zhang, Mingsheng Ying
Abstract:
Access control is a cornerstone of computer security that prevents unauthorised access to resources. In this paper, we study access control in quantum computer systems. We present the first explicit scenario of a security breach when a classically secure access control system is straightforwardly adapted to the quantum setting. The breach is ultimately due to that quantum mechanics allows the phenomenon of entanglement and violates Mermin inequality, a multi-party variant of the celebrated Bell inequality. This reveals a threat from quantum entanglement to access control if existing computer systems integrate with quantum computing. To protect against such threat, we propose several new models of quantum access control, and rigorously analyse their security, flexibility and efficiency.
Authors:Krishna Kanth Nakka, Xue Jiang, Dmitrii Usynin, Xuebing Zhou
Abstract:
This paper investigates privacy jailbreaking in LLMs via steering, focusing on whether manipulating activations can bypass LLM alignment and alter response behaviors to privacy related queries (e.g., a certain public figure's sexual orientation). We begin by identifying attention heads predictive of refusal behavior for private attributes (e.g., sexual orientation) using lightweight linear probes trained with privacy evaluator labels. Next, we steer the activations of a small subset of these attention heads guided by the trained probes to induce the model to generate non-refusal responses. Our experiments show that these steered responses often disclose sensitive attribute details, along with other private information about data subjects such as life events, relationships, and personal histories that the models would typically refuse to produce. Evaluations across four LLMs reveal jailbreaking disclosure rates of at least 95%, with more than 50% on average of these responses revealing true personal information. Our controlled study demonstrates that private information memorized in LLMs can be extracted through targeted manipulation of internal activations.
Authors:Anbin Wu, Zhiyong Feng, Ruitao Feng, Zhenchang Xing, Yang Liu
Abstract:
RESTful APIs facilitate data exchange between applications, but they also expose sensitive resources to potential exploitation. Broken Object Level Authorization (BOLA) is the top vulnerability in the OWASP API Security Top 10, exemplifies a critical access control flaw where attackers manipulate API parameters to gain unauthorized access. To address this, we propose BOLAZ, a defense framework grounded in zero trust principles. BOLAZ analyzes the data flow of resource IDs, pinpointing BOLA attack injection points and determining the associated authorization intervals to prevent horizontal privilege escalation. Our approach leverages static taint tracking to categorize APIs into producers and consumers based on how they handle resource IDs. By mapping the propagation paths of resource IDs, BOLAZ captures the context in which these IDs are produced and consumed, allowing for precise identification of authorization boundaries. Unlike defense methods based on common authorization models, BOLAZ is the first authorization-guided method that adapts defense rules based on the system's best-practice authorization logic. We validate BOLAZ through empirical research on 10 GitHub projects. The results demonstrate BOLAZ's effectiveness in defending against vulnerabilities collected from CVE and discovering 35 new BOLA vulnerabilities in the wild, demonstrating its practicality in real-world deployments.
Authors:Haya Schulmann, Niklas Vogel
Abstract:
Resource Public Key Infrastructure (RPKI) is a critical security mechanism for BGP, but the complexity of its architecture is a growing concern as its adoption scales. Current RPKI design heavily reuses legacy PKI components, such as X.509 EE-certificates, ASN.1 encoding, and XML-based repository protocols, which introduce excessive cryptographic validation, redundant metadata, and inefficiencies in both storage and processing. We show that these design choices, although based on established standards, create significant performance bottlenecks, increase the vulnerability surface, and hinder scalability for wide-scale Internet deployment.
In this paper, we perform the first systematic analysis of the root causes of complexity in RPKI's design and experimentally quantify their real-world impact. We show that over 70\% of validation time in RPKI relying parties is spent on certificate parsing and signature verification, much of it unnecessary. Building on this insight, we introduce the improved RPKI (iRPKI), a backwards-compatible redesign that preserves all security guarantees while substantially reducing protocol overhead. iRPKI eliminates EE-certificates and ROA signatures, merges revocation and integrity objects, replaces verbose encodings with Protobuf, and restructures repository metadata for more efficient access. We experimentally demonstrate that our implementation of iRPKI in the Routinator validator achieves a 20x speed-up of processing time, 18x improvement of bandwidth requirements and 8x reduction in cache memory footprint, while also eliminating classes of vulnerabilities that have led to at least 10 vulnerabilities in RPKI software. iRPKI significantly increases the feasibility of deploying RPKI at scale in the Internet, and especially in constrained environments. Our design may be deployed incrementally without impacting existing operations.
Authors:Charles Fleming, Ashish Kundu, Ramana Kompella
Abstract:
The proliferation of autonomous AI agents within enterprise environments introduces a critical security challenge: managing access control for emergent, novel tasks for which no predefined policies exist. This paper introduces an advanced security framework that extends the Task-Based Access Control (TBAC) model by using a Large Language Model (LLM) as an autonomous, risk-aware judge. This model makes access control decisions not only based on an agent's intent but also by explicitly considering the inherent \textbf{risk associated with target resources} and the LLM's own \textbf{model uncertainty} in its decision-making process. When an agent proposes a novel task, the LLM judge synthesizes a just-in-time policy while also computing a composite risk score for the task and an uncertainty estimate for its own reasoning. High-risk or high-uncertainty requests trigger more stringent controls, such as requiring human approval. This dual consideration of external risk and internal confidence allows the model to enforce a more robust and adaptive version of the principle of least privilege, paving the way for safer and more trustworthy autonomous systems.
Authors:Vasilije Stambolic, Aritra Dhar, Lukas Cavigelli
Abstract:
Retrieval-Augmented Generation (RAG) increases the reliability and trustworthiness of the LLM response and reduces hallucination by eliminating the need for model retraining. It does so by adding external data into the LLM's context. We develop a new class of black-box attack, RAG-Pull, that inserts hidden UTF characters into queries or external code repositories, redirecting retrieval toward malicious code, thereby breaking the models' safety alignment. We observe that query and code perturbations alone can shift retrieval toward attacker-controlled snippets, while combined query-and-target perturbations achieve near-perfect success. Once retrieved, these snippets introduce exploitable vulnerabilities such as remote code execution and SQL injection. RAG-Pull's minimal perturbations can alter the model's safety alignment and increase preference towards unsafe code, therefore opening up a new class of attacks on LLMs.
Authors:Alexander Sternfeld, Andrei Kucharavy, Ljiljana Dolamic
Abstract:
Large language Models (LLMs) have shown remarkable proficiency in code generation tasks across various programming languages. However, their outputs often contain subtle but critical vulnerabilities, posing significant risks when deployed in security-sensitive or mission-critical systems. This paper introduces TypePilot, an agentic AI framework designed to enhance the security and robustness of LLM-generated code by leveraging strongly typed and verifiable languages, using Scala as a representative example. We evaluate the effectiveness of our approach in two settings: formal verification with the Stainless framework and general-purpose secure code generation. Our experiments with leading open-source LLMs reveal that while direct code generation often fails to enforce safety constraints, just as naive prompting for more secure code, our type-focused agentic pipeline substantially mitigates input validation and injection vulnerabilities. The results demonstrate the potential of structured, type-guided LLM workflows to improve the SotA of the trustworthiness of automated code generation in high-assurance domains.
Authors:Jugal Gajjar, Kaustik Ranaware, Kamalasankari Subramaniakuppusamy
Abstract:
Software vulnerabilities remain a persistent risk, yet static and dynamic analyses often overlook structural dependencies that shape insecure behaviors. Viewing programs as heterogeneous graphs, we capture control- and data-flow relations as complex interaction networks. Our hybrid framework combines these graph representations with light-weight (<4B) local LLMs, uniting topological features with semantic reasoning while avoiding the cost and privacy concerns of large cloud models. Evaluated on Java vulnerability detection (binary classification), our method achieves 93.57% accuracy-an 8.36% gain over Graph Attention Network-based embeddings and 17.81% over pretrained LLM baselines such as Qwen2.5 Coder 3B. Beyond accuracy, the approach extracts salient subgraphs and generates natural language explanations, improving interpretability for developers. These results pave the way for scalable, explainable, and locally deployable tools that can shift vulnerability analysis from purely syntactic checks to deeper structural and semantic insights, facilitating broader adoption in real-world secure software development.
Authors:Eduardo Baena, Han Yang, Dimitrios Koutsonikolas, Israat Haque
Abstract:
Smart homes are increasingly populated with heterogeneous Internet of Things (IoT) devices that interact continuously with users and the environment. This diversity introduces critical challenges in device identification, authentication, and security, where fingerprinting techniques have emerged as a key approach. In this survey, we provide a comprehensive analysis of IoT fingerprinting specifically in the context of smart homes, examining methods for device and their event detection, classification, and intrusion prevention. We review existing techniques, e.g., network traffic analysis or machine learning-based schemes, highlighting their applicability and limitations in home environments characterized by resource-constrained devices, dynamic usage patterns, and privacy requirements. Furthermore, we discuss fingerprinting system deployment challenges like scalability, interoperability, and energy efficiency, as well as emerging opportunities enabled by generative AI and federated learning. Finally, we outline open research directions that can advance reliable and privacy-preserving fingerprinting for next-generation smart home ecosystems.
Authors:Raffaele Cristodaro, Benjamin Kraner, Claudio J. Tessone
Abstract:
This paper investigates the impact of sanctions on Tornado Cash, a smart contract protocol designed to enhance transaction privacy. Following the U.S. Department of the Treasury's sanctions against Tornado Cash in August 2022, platform activity declined sharply. We document a significant and sustained reduction in transaction volume, user diversity, and overall protocol utilization after the sanctions were imposed. Our analysis draws on transaction data from three major blockchains: Ethereum, BNB Smart Chain, and Polygon. We further examine developments following the partial lifting and eventual removal of sanctions by the U.S. Office of Foreign Assets Control (OFAC) in March 2025. Although activity partially recovered, the rebound remained limited. The Tornado Cash case illustrates how regulatory interventions can affect decentralized protocols, while also highlighting the challenges of fully enforcing such measures in decentralized environments.
Authors:Raffaele Cristodaro, Benjamin Kraner, Claudio J. Tessone
Abstract:
Tornado Cash is a decentralised mixer that uses cryptographic techniques to sever the on-chain trail between depositors and withdrawers. In practice, however, its anonymity can be undermined by user behaviour and operational quirks. We conduct the first cross-chain empirical study of Tornado Cash activity on Ethereum, BNB Smart Chain, and Polygon, introducing three clustering heuristics-(i) address-reuse, (ii) transactional-linkage, and (iii) a novel first-in-first-out (FIFO) temporal-matching rule. Together, these heuristics reconnect deposits to withdrawals and deanonymise a substantial share of recipients. Our analysis shows that 5.1 - 12.6% of withdrawals can already be traced to their originating deposits through address reuse and transactional linkage heuristics. Adding our novel First-In-First-Out (FIFO) temporal-matching heuristic lifts the linkage rate by a further 15 - 22 percentage points. Statistical tests confirm that these FIFO matches are highly unlikely to occur by chance. Comparable leakage across Ethereum, BNB Smart Chain, and Polygon indicates chain-agnostic user misbehaviour, rather than chain-specific protocol flaws. These results expose how quickly cryptographic guarantees can unravel in everyday use, underscoring the need for both disciplined user behaviour and privacy-aware protocol design. In total, our heuristics link over $2.3 billion in Tornado Cash withdrawals to identifiable deposits, exposing significant cracks in practical anonymity.
Authors:Boyu Liu, Yang Zhang, Liang Cheng, Yi Zhang, Junjie Fan, Yu Fu
Abstract:
Fuzzing has become a cornerstone technique for uncovering vulnerabilities and enhancing the security of OS kernels. However, state-of-the-art kernel fuzzers, including the de facto standard Syzkaller, struggle to generate valid syscall sequences that respect implicit Syscall Dependency Relations (SDRs). Consequently, many generated seeds either fail kernel validation or cannot penetrate deep execution paths, resulting in significant inefficiency. We hypothesize that SDRs can be effectively learned from both historic and present kernel execution data, and that incorporating these learned relations into fuzzing can substantially improve seed validity and diversity. To validate this, we propose an approach that utilizes the N-gram model to mine SDRs from the Dongting dataset-one of the largest Linux kernel execution datasets available-as well as from execution traces collected on the fly during fuzzing. The resulting model is used to continuously augment the Choice Table of Syzkaller to improve its seed generation and demonstrably increases the Shannon Entropy of the Choice Table throughout fuzzing, reflecting more empirically-grounded choices in expanding syscall sequences into valid and diverse seeds. In addition, we introduce a Random Walk strategy that instructs Syzkaller to construct seeds in a bidirectional manner to further diversify the generated seeds. We implement our approach in a prototype, Psyzkaller, built on top of Syzkaller. Experiments on three representative Linux kernel versions show that Psyzkaller improves Syzkaller's code coverage by 4.6%-7.0% in 48-hour fuzzing, while triggering 110.4%-187.2% more crashes. Moreover, our investigation shows that Psyzkaller discovered eight previously unknown kernel vulnerabilities, compared to only one found by Syzkaller.
Authors:Robin Kothari, Ryan O'Donnell, Kewen Wu
Abstract:
In 2021, Chen, Liu, and Zhandry presented an efficient quantum algorithm for the average-case $\ell_\infty$-Short Integer Solution ($\mathrm{SIS}^\infty$) problem, in a parameter range outside the normal range of cryptographic interest, but still with no known efficient classical algorithm. This was particularly exciting since $\mathrm{SIS}^\infty$ is a simple problem without structure, and their algorithmic techniques were different from those used in prior exponential quantum speedups. We present efficient classical algorithms for all of the $\mathrm{SIS}^\infty$ and (more general) Constrained Integer Solution problems studied in their paper, showing there is no exponential quantum speedup anymore.
Authors:Dmytro Gavinsky, Dar Gilboa, Siddhartha Jain, Dmitri Maslov, Jarrod R. McClean
Abstract:
The no-cloning theorem can be used as a basis for quantum money constructions which guarantee unconditionally unforgeable currency. Existing schemes, however, either (i) require long-term quantum memory and quantum communication between the user and the bank in order to verify the validity of a bill or (ii) fail to protect user privacy due to the uniqueness of each bill issued by the bank, which can allow its usage to be tracked. We introduce a construction of single-use quantum money that gives users the ability to detect whether the issuing authority is tracking them, employing an auditing procedure for which we prove unconditional security. Bill validation is classical, and hence does not require long-term quantum memory or quantum communication, making the protocol relatively practical to deploy. We discuss potential applications beyond money, including anonymous one-time pads and voting.
Authors:Wenhao Li, Selvakumar Manickam, Yung-Wey Chong, Shankar Karuppayah, Priyadarsi Nanda, Binyong Li
Abstract:
Phishing websites remain a persistent cybersecurity threat by mimicking legitimate sites to steal sensitive user information. Existing machine learning-based detection methods often rely on supervised learning with labeled data, which not only incurs substantial annotation costs but also limits adaptability to novel attack patterns. To address these challenges, we propose PhishSSL, a self-supervised contrastive learning framework that eliminates the need for labeled phishing data during training. PhishSSL combines hybrid tabular augmentation with adaptive feature attention to produce semantically consistent views and emphasize discriminative attributes. We evaluate PhishSSL on three phishing datasets with distinct feature compositions. Across all datasets, PhishSSL consistently outperforms unsupervised and self-supervised baselines, while ablation studies confirm the contribution of each component. Moreover, PhishSSL maintains robust performance despite the diversity of feature sets, highlighting its strong generalization and transferability. These results demonstrate that PhishSSL offers a promising solution for phishing website detection, particularly effective against evolving threats in dynamic Web environments.
Authors:Yogesh Kumar, Susanta Samanta, Atul Gaur
Abstract:
MDS matrices play a critical role in the design of diffusion layers for block ciphers and hash functions due to their optimal branch number. Involutory and orthogonal MDS matrices offer additional benefits by allowing identical or nearly identical circuitry for both encryption and decryption, leading to equivalent implementation costs for both processes. These properties have been further generalized through the notions of semi-involutory and semi-orthogonal matrices. Specifically, we establish nontrivial interconnections between semi-involutory and involutory matrices, as well as between semi-orthogonal and orthogonal matrices. Exploiting these relationships, we show that the number of semi-involutory MDS matrices can be directly derived from the number of involutory MDS matrices, and vice versa. A similar correspondence holds for semi-orthogonal and orthogonal MDS matrices. We also examine the intersection of these classes and show that the number of $3 \times 3$ MDS matrices that are both semi-involutory and semi-orthogonal coincides with the number of semi-involutory MDS matrices over $\mathbb{F}_{2^m}$. Furthermore, we derive the general structure of orthogonal matrices of arbitrary order $n$ over $\mathbb{F}_{2^m}$. Based on this generic form, we provide a closed-form expression for enumerating all $3 \times 3$ orthogonal MDS matrices over $\mathbb{F}_{2^m}$. Finally, leveraging the aforementioned interconnections, we present explicit formulas for counting $3 \times 3$ semi-involutory MDS matrices and semi-orthogonal MDS matrices.
Authors:Martin Sandfuchs, Carla Ferradini, Renato Renner
Abstract:
We consider a pair of causally independent processes, modelled as the tensor product of two channels, acting on a possibly correlated input to produce random outputs X and Y. We show that, assuming the processes produce a sufficient amount of randomness, one can extract uniform randomness from X and Y. This generalizes prior results, which assumed that X and Y are (conditionally) independent. Note that in contrast to the independence of quantum states, the independence of channels can be enforced through spacelike separation. As a consequence, our results allow for the generation of randomness under more practical and physically justifiable assumptions than previously possible. We illustrate this with the example of device-independent randomness amplification, where we can remove the constraint that the adversary only has access to classical side information about the source.
Authors:Bruno Cavalar, Eli Goldin, Matthew Gray, Taiga Hiroka, Tomoyuki Morimae
Abstract:
One of the most fundamental problems in the field of hypothesis testing is the identity testing problem: whether samples from some unknown distribution $\mathcal{G}$ are actually from some explicit distribution $\mathcal{D}$. It is known that when the distribution $\mathcal{D}$ has support $[N]$, the optimal sample complexity for the identity testing problem is roughly $O(\sqrt{N})$. However, many distributions of interest, including those which can be sampled efficiently, have exponential support size, and therefore the optimal identity tester also requires exponential samples. In this paper, we bypass this lower bound by considering restricted settings. The above $O(\sqrt{N})$ sample complexity identity tester is constructed so that it is not fooled by any (even inefficiently-sampled) distributions. However, in most applications, the distributions under consideration are efficiently sampleable, and therefore it is enough to consider only identity testers that are not fooled by efficiently-sampled distributions. In that case, we can focus on efficient verification with efficient identity testers. We investigate relations between efficient verifications of classical/quantum distributions and classical/quantum cryptography, and show the following results: (i) Every quantumly samplable distribution is verifiable with a $\mathbf{P^{PP}}$ algorithm. (ii) If one-way functions exist, then no sufficiently random classically samplable distribution is efficiently verifiable. (iii) If one-way functions do not exist, then every classically samplable distribution is efficiently verifiable. (iv) If QEFID pairs exist, then there exists a quantumly samplable distribution which is not efficiently verifiable. (v) If one-way puzzles do not exist, then it is possible to verify sampling-based quantum advantage with a efficient quantum computer.
Authors:Ivan Homoliak, Martin Perešíni, Marek Tamaškovič, Timotej Ponek, Lukáš Hellebrandt, Kamil Malinka
Abstract:
Proof-of-Stake (PoS) consensus protocols often face a trade-off between performance and security. Protocols that pre-elect leaders for subsequent rounds are vulnerable to Denial-of-Service (DoS) attacks, which can disrupt the network and compromise liveness. In this work, we present PoS-CoPOR, a single-chain PoS consensus protocol that mitigates this vulnerability by integrating a native onion routing mechanism into the consensus protocol itself. PoS-CoPOR combines stake-weighted probabilistic leader election with an anonymization layer that conceals the network identity of the next block proposer. This approach prevents targeted DoS attacks on leaders before they produce a block, thus enhancing network resilience. We implemented and evaluated PoS-CoPOR, demonstrating its ability to achieve a throughput of up to 110 tx/s with 6 nodes, even with the overhead of the anonymization layer. The results show that native anonymization can provide robust DoS resistance with only a modest impact on performance, offering a solution to build secure and scalable PoS blockchains.
Authors:Rijha Safdar, Danyail Mateen, Syed Taha Ali, Wajahat Hussain
Abstract:
Artificial Intelligence (AI) and more specifically Large Language Models (LLMs) have demonstrated exceptional progress in multiple areas including software engineering, however, their capability for vulnerability detection in the wild scenario and its corresponding reasoning remains underexplored. Prompting pre-trained LLMs in an effective way offers a computationally effective and scalable solution. Our contributions are (i)varied prompt designs for vulnerability detection and its corresponding reasoning in the wild. (ii)a real-world vector data store constructed from the National Vulnerability Database, that will provide real time context to vulnerability detection framework, and (iii)a scoring measure for combined measurement of accuracy and reasoning quality. Our contribution aims to examine whether LLMs are ready for wild deployment, thus enabling the reliable use of LLMs stronger for the development of secure software's.
Authors:Rabeya Amin Jhuma, Mostafa Mohaimen Akand Faisal
Abstract:
This study explored how in-context learning (ICL) in large language models can be disrupted by data poisoning attacks in the setting of public health sentiment analysis. Using tweets of Human Metapneumovirus (HMPV), small adversarial perturbations such as synonym replacement, negation insertion, and randomized perturbation were introduced into the support examples. Even these minor manipulations caused major disruptions, with sentiment labels flipping in up to 67% of cases. To address this, a Spectral Signature Defense was applied, which filtered out poisoned examples while keeping the data's meaning and sentiment intact. After defense, ICL accuracy remained steady at around 46.7%, and logistic regression validation reached 100% accuracy, showing that the defense successfully preserved the dataset's integrity. Overall, the findings extend prior theoretical studies of ICL poisoning to a practical, high-stakes setting in public health discourse analysis, highlighting both the risks and potential defenses for robust LLM deployment. This study also highlights the fragility of ICL under attack and the value of spectral defenses in making AI systems more reliable for health-related social media monitoring.
Authors:David Benfield, Stefano Coniglio, Phan Tu Vuong, Alain Zemkoho
Abstract:
Adversarial machine learning concerns situations in which learners face attacks from active adversaries. Such scenarios arise in applications such as spam email filtering, malware detection and fake image generation, where security methods must be actively updated to keep up with the everimproving generation of malicious data. Pessimistic Bilevel optimisation has been shown to be an effective method of training resilient classifiers against such adversaries. By modelling these scenarios as a game between the learner and the adversary, we anticipate how the adversary will modify their data and then train a resilient classifier accordingly. However, since existing pessimistic bilevel approaches feature an unrestricted adversary, the model is vulnerable to becoming overly pessimistic and unrealistic. When finding the optimal solution that defeats the classifier, it is possible that the adversary's data becomes nonsensical and loses its intended nature. Such an adversary will not properly reflect reality, and consequently, will lead to poor classifier performance when implemented on real-world data. By constructing a constrained pessimistic bilevel optimisation model, we restrict the adversary's movements and identify a solution that better reflects reality. We demonstrate through experiments that this model performs, on average, better than the existing approach.
Authors:Weihang Li, Pete Crowley, Arya Tschand, Yu Wang, Miroslav Pajic, Daniel Sorin
Abstract:
Rigorous quantitative evaluation of microarchitectural side channels is challenging for two reasons. First, the processors, attacks, and defenses often exhibit probabilistic behaviors. These probabilistic behaviors arise due to natural noise in systems (e.g., from co-running processes), probabilistic side channel attacks, and probabilistic obfuscation defenses. Second, microprocessors are extremely complex. Previous evaluation methods have relied on abstract or simplified models, which are necessarily less detailed than real systems or cycle-by-cycle simulators, and these models may miss important phenomena. Whereas a simple model may suffice for estimating performance, security issues frequently manifest in the details. We address this challenge by introducing Statistical Model Checking (SMC) to the quantitative evaluation of microarchitectural side channels. SMC is a rigorous statistical technique that can process the results of probabilistic experiments and provide statistical guarantees, and it has been used in computing applications that depend heavily on statistical guarantees (e.g., medical implants, vehicular computing). With SMC, we can treat processors as opaque boxes, and we do not have to abstract or simplify them. We demonstrate the effectiveness of SMC through three case studies, in which we experimentally show that SMC can evaluate existing security vulnerabilities and defenses and provide qualitatively similar conclusions with greater statistical rigor, while making no simplifying assumptions or abstractions. We also show that SMC can enable a defender to quantify the amount of noise necessary to have a desired level of confidence that she has reduced an attacker's probability of success to less than a desired threshold, thus providing the defender with an actionable plan for obfuscation via noise injection.
Authors:Jie Cao, Qi Li, Zelin Zhang, Jianbing Ni
Abstract:
The rapid advancement of generative artificial intelligence (Gen-AI) has facilitated the effortless creation of high-quality images, while simultaneously raising critical concerns regarding intellectual property protection, authenticity, and accountability. Watermarking has emerged as a promising solution to these challenges by distinguishing AI-generated images from natural content, ensuring provenance, and fostering trustworthy digital ecosystems. This paper presents a comprehensive survey of the current state of AI-generated image watermarking, addressing five key dimensions: (1) formalization of image watermarking systems; (2) an overview and comparison of diverse watermarking techniques; (3) evaluation methodologies with respect to visual quality, capacity, and detectability; (4) vulnerabilities to malicious attacks; and (5) prevailing challenges and future directions. The survey aims to equip researchers with a holistic understanding of AI-generated image watermarking technologies, thereby promoting their continued development.
Authors:Gautier Evennou, Vivien Chappelier, Ewa Kijak
Abstract:
Most image watermarking systems focus on robustness, capacity, and imperceptibility while treating the embedded payload as meaningless bits. This bit-centric view imposes a hard ceiling on capacity and prevents watermarks from carrying useful information. We propose LatentSeal, which reframes watermarking as semantic communication: a lightweight text autoencoder maps full-sentence messages into a compact 256-dimensional unit-norm latent vector, which is robustly embedded by a finetuned watermark model and secured through a secret, invertible rotation. The resulting system hides full-sentence messages, decodes in real time, and survives valuemetric and geometric attacks. It surpasses prior state of the art in BLEU-4 and Exact Match on several benchmarks, while breaking through the long-standing 256-bit payload ceiling. It also introduces a statistically calibrated score that yields a ROC AUC score of 0.97-0.99, and practical operating points for deployment. By shifting from bit payloads to semantic latent vectors, LatentSeal enables watermarking that is not only robust and high-capacity, but also secure and interpretable, providing a concrete path toward provenance, tamper explanation, and trustworthy AI governance. Models, training and inference code, and data splits will be available upon publication.
Authors:Tharindu Lakshan Yasarathna, Nhien-An Le-Khac
Abstract:
Integrating SDN and the IoT enhances network control and flexibility. DL-based AAD systems improve security by enabling real-time threat detection in SDN-IoT networks. However, these systems remain vulnerable to adversarial attacks that manipulate input data or exploit model weaknesses, significantly degrading detection accuracy. Existing research lacks a systematic analysis of adversarial vulnerabilities specific to DL-based AAD systems in SDN-IoT environments. This SoK study introduces a structured adversarial threat model and a comprehensive taxonomy of attacks, categorising them into data, model, and hybrid-level threats. Unlike previous studies, we systematically evaluate white, black, and grey-box attack strategies across popular benchmark datasets. Our findings reveal that adversarial attacks can reduce detection accuracy by up to 48.4%, with Membership Inference causing the most significant drop. C&W and DeepFool achieve high evasion success rates. However, adversarial training enhances robustness, and its high computational overhead limits the real-time deployment of SDN-IoT applications. We propose adaptive countermeasures, including real-time adversarial mitigation, enhanced retraining mechanisms, and explainable AI-driven security frameworks. By integrating structured threat models, this study offers a more comprehensive approach to attack categorisation, impact assessment, and defence evaluation than previous research. Our work highlights critical vulnerabilities in existing DL-based AAD models and provides practical recommendations for improving resilience, interpretability, and computational efficiency. This study serves as a foundational reference for researchers and practitioners seeking to enhance DL-based AAD security in SDN-IoT networks, offering a systematic adversarial threat model and conceptual defence evaluation based on prior empirical studies.
Authors:Panagiotis Michalopoulos, Anthony Mack, Cameron Clark, Linus Chen, Johannes Sedlmeir, Andreas Veneris
Abstract:
Blockchain technology has spawned a vast ecosystem of digital currencies with Central Bank Digital Currencies (CBDCs) -- digital forms of fiat currency -- being one of them. An important feature of digital currencies is facilitating transactions without network connectivity, which can enhance the scalability of cryptocurrencies and the privacy of CBDC users. However, in the case of CBDCs, this characteristic also introduces new regulatory challenges, particularly when it comes to applying established Anti-Money Laundering and Countering the Financing of Terrorism (AML/CFT) frameworks. This paper introduces a prototype for offline digital currency payments, equally applicable to cryptocurrencies and CBDCs, that leverages Secure Elements and digital credentials to address the tension of offline payment support with regulatory compliance. Performance evaluation results suggest that the prototype can be flexibly adapted to different regulatory environments, with a transaction latency comparable to real-life commercial payment systems. Furthermore, we conceptualize how the integration of Zero-Knowledge Proofs into our design could accommodate various tiers of enhanced privacy protection.
Authors:Loay Abdelrazek, Filippo Rebecchi
Abstract:
Mobile networks in the 5G and 6G era require to rethink how to manage security due to the introduction of new services, use cases, each with its own security requirements, while simultaneously expanding the threat landscape. Although automation has emerged as a key enabler to address complexity in networks, existing approaches lack the expressiveness to define and enforce complex, goal-driven, and measurable security requirements. In this paper, we propose the concept of differentiated security levels and leveraging intents as a management framework. We discuss the requirements and enablers to extend the currently defined intent-based management frameworks to pave the path for intent-based security management in mobile networks. Our approach formalizes both functional and non-functional security requirements and demonstrates how these can be expressed and modeled using an extended TM Forum (TMF) intent security ontology. We further discuss the required standardization steps to achieve intent-based security management. Our work aims at advance security automation, improve adaptability, and strengthen the resilience and security posture of the next-generation mobile networks.
Authors:Martin Kotuliak, Simon Erni, Jakub Polák, Marc Roeschlin, Richard Baker, Ivan Martinovic, Srdjan Äapkun
Abstract:
The widespread availability of cellular devices introduces new threat vectors that allow users or attackers to bypass security policies and physical barriers and bring unauthorized devices into sensitive areas. These threats can arise from user non-compliance or deliberate actions aimed at data exfiltration/infiltration via hidden devices, drones, etc. We identify a critical gap in this context: the absence of low-latency systems for high-quality and instantaneous monitoring of cellular transmissions. Such low-latency systems are crucial to allow for timely detection, decision (e.g., geofencing or localization), and disruption of unauthorized communication in sensitive areas. Operator-based monitoring systems, built for purposes such as people counting or tracking, lack real-time capability, require cooperation across multiple operators, and thus are hard to deploy. Operator-independent monitoring approaches proposed in the literature either lack low-latency capabilities or do not scale. We propose LTag, the first low-latency, operator-independent and scalable system designed to monitor cellular connections across all operators prior to any user data transmission. LTag consists of several downlink sniffers and a distributed network of uplink sniffers that measure both downlink protocol information and uplink signal characteristics at multiple locations to gain a detailed spatial image of uplink signals. LTag aggregates the recorded information, processes it, and provides a decision about the connection all prior to connection establishment of a UE. To evaluate LTag, we deployed it in the context of geofencing, where LTag was able to determine if the signals originate from inside or outside of an area within 2.3 ms of the initial base station-to-device message, therefore enabling prompt and targeted suppression of communication before any user data was transmitted.
Authors:Haochen Sun, Xi He
Abstract:
Differential privacy (DP) has become the gold standard for preserving individual privacy in data analysis. However, an implicit yet fundamental assumption underlying these rigorous privacy guarantees is the correct implementation and execution of DP mechanisms. Several incidents of unintended privacy loss have occurred due to numerical issues and inappropriate configurations of DP software, which have been successfully exploited in privacy attacks. To better understand the seriousness of defective DP software, we ask the following question: is it possible to elevate these passive defects into active privacy attacks while maintaining covertness? To address this question, we present the Gaussian pancake mechanism (GPM), a novel mechanism that is computationally indistinguishable from the widely used Gaussian mechanism (GM), yet exhibits arbitrarily weaker statistical DP guarantees. This unprecedented separation enables a new class of backdoor attacks: by indistinguishably passing off as the authentic GM, GPM can covertly degrade statistical privacy. Unlike the unintentional privacy loss caused by GM's numerical issues, GPM is an adversarial yet undetectable backdoor attack against data privacy. We formally prove GPM's covertness, characterize its statistical leakage, and demonstrate a concrete distinguishing attack that can achieve near-perfect success rates under suitable parameter choices, both theoretically and empirically. Our results underscore the importance of using transparent, open-source DP libraries and highlight the need for rigorous scrutiny and formal verification of DP implementations to prevent subtle, undetectable privacy compromises in real-world systems.
Authors:Rowdy Chotkan, Bulat Nasrulin, Jérémie Decouchant, Johan Pouwelse
Abstract:
Spam poses a growing threat to blockchain networks. Adversaries can easily create multiple accounts to flood transaction pools, inflating fees and degrading service quality. Existing defenses against spam, such as fee markets and staking requirements, primarily rely on economic deterrence, which fails to distinguish between malicious and legitimate users and often exclude low-value but honest activity. To address these shortcomings, we present StarveSpam, a decentralized reputation-based protocol that mitigates spam by operating at the transaction relay layer. StarveSpam combines local behavior tracking, peer scoring, and adaptive rate-limiting to suppress abusive actors, without requiring global consensus, protocol changes, or trusted infrastructure. We evaluate StarveSpam using real Ethereum data from a major NFT spam event and show that it outperforms existing fee-based and rule-based defenses, allowing each node to block over 95% of spam while dropping just 3% of honest traffic, and reducing the fraction of the network exposed to spam by 85% compared to existing rule-based methods. StarveSpam offers a scalable and deployable alternative to traditional spam defenses, paving the way toward more resilient and equitable blockchain infrastructure.
Authors:Xiangchen Meng, Yangdi Lyu
Abstract:
Federated learning (FL) with fully homomorphic encryption (FHE) effectively safeguards data privacy during model aggregation by encrypting local model updates before transmission, mitigating threats from untrusted servers or eavesdroppers in transmission. However, the computational burden and ciphertext expansion associated with homomorphic encryption can significantly increase resource and communication overhead. To address these challenges, we propose FedBit, a hardware/software co-designed framework optimized for the Brakerski-Fan-Vercauteren (BFV) scheme. FedBit employs bit-interleaved data packing to embed multiple model parameters into a single ciphertext coefficient, thereby minimizing ciphertext expansion and maximizing computational parallelism. Additionally, we integrate a dedicated FPGA accelerator to handle cryptographic operations and an optimized dataflow to reduce the memory overhead. Experimental results demonstrate that FedBit achieves a speedup of two orders of magnitude in encryption and lowers average communication overhead by 60.7%, while maintaining high accuracy.
Authors:Enoal Gesny, Eva Giboulot, Teddy Furon, Vivien Chappelier
Abstract:
This paper introduces a novel watermarking method for diffusion models. It is based on guiding the diffusion process using the gradient computed from any off-the-shelf watermark decoder. The gradient computation encompasses different image augmentations, increasing robustness to attacks against which the decoder was not originally robust, without retraining or fine-tuning. Our method effectively convert any \textit{post-hoc} watermarking scheme into an in-generation embedding along the diffusion process. We show that this approach is complementary to watermarking techniques modifying the variational autoencoder at the end of the diffusion process. We validate the methods on different diffusion models and detectors. The watermarking guidance does not significantly alter the generated image for a given seed and prompt, preserving both the diversity and quality of generation.
Authors:Wenxuan Wang, Chenglei Wang, Xuelin Qian
Abstract:
As adversarial attacks continue to evolve, defense models face the risk of recurrent vulnerabilities, underscoring the importance of continuous adversarial training (CAT). Existing CAT approaches typically balance decision boundaries by either data replay or optimization strategy to constrain shared model parameters. However, due to the diverse and aggressive nature of adversarial examples, these methods suffer from catastrophic forgetting of previous defense knowledge after continual learning. In this paper, we propose a novel framework, called Dual-level Defense Routing or DDeR, that can autonomously select appropriate routers to integrate specific defense experts, thereby adapting to evolving adversarial attacks. Concretely, the first-level defense routing comprises multiple defense experts and routers, with each router dynamically selecting and combining suitable experts to process attacked features. Routers are independently incremented as continuous adversarial training progresses, and their selections are guided by an Adversarial Sentinel Network (ASN) in the second-level defense routing. To compensate for the inability to test due to the independence of routers, we further present a Pseudo-task Substitution Training (PST) strategy, which leverages distributional discrepancy in data to facilitate inter-router communication without storing historical data. Extensive experiments demonstrate that DDeR achieves superior continuous defense performance and classification accuracy compared to existing methods.
Authors:Yu Liu, Boxiang He, Fanggang Wang
Abstract:
This paper proposes a novel and flexible security-aware semantic-driven integrated sensing and communication (ISAC) framework, namely security semantic ISAC (SS-ISAC). Inspired by the positive impact of the adversarial attack, a pair of pluggable encryption and decryption modules is designed in the proposed SS-ISAC framework. The encryption module is installed after the semantic transmitter, adopting a trainable adversarial residual network (ARN) to create the adversarial attack. Correspondingly, the decryption module before the semantic receiver utilizes another trainable ARN to mitigate the adversarial attack and noise. These two modules can be flexibly assembled considering the system security demands, without drastically modifying the hardware infrastructure. To ensure the sensing and communication (SAC) performance while preventing the eavesdropping threat, the above ARNs are jointly optimized by minimizing a carefully designed loss function that relates to the adversarial attack power, SAC performance, as well as the privacy leakage risk. Simulation results validate the effectiveness of the proposed SS-ISAC framework in terms of both SAC and eavesdropping prevention performance.
Authors:Andrey Boris Khesin, Jonathan Z. Lu, Alexander Poremba, Akshar Ramkumar, Vinod Vaikuntanathan
Abstract:
Random classical linear codes are widely believed to be hard to decode. While slightly sub-exponential time algorithms exist when the coding rate vanishes sufficiently rapidly, all known algorithms at constant rate require exponential time. By contrast, the complexity of decoding a random quantum stabilizer code has remained an open question for quite some time. This work closes the gap in our understanding of the algorithmic hardness of decoding random quantum versus random classical codes. We prove that decoding a random stabilizer code with even a single logical qubit is at least as hard as decoding a random classical code at constant rate--the maximally hard regime. This result suggests that the easiest random quantum decoding problem is at least as hard as the hardest random classical decoding problem, and shows that any sub-exponential algorithm decoding a typical stabilizer code, at any rate, would immediately imply a breakthrough in cryptography. More generally, we also characterize many other complexity-theoretic properties of stabilizer codes. While classical decoding admits a random self-reduction, we prove significant barriers for the existence of random self-reductions in the quantum case. This result follows from new bounds on Clifford entropies and Pauli mixing times, which may be of independent interest. As a complementary result, we demonstrate various other self-reductions which are in fact achievable, such as between search and decision. We also demonstrate several ways in which quantum phenomena, such as quantum degeneracy, force several reasonable definitions of stabilizer decoding--all of which are classically identical--to have distinct or non-trivially equivalent complexity.
Authors:Massimo Bartoletti, Enrico Lipparini, Livio Pompianu
Abstract:
Ensuring the correctness of smart contracts is critical, as even subtle flaws can lead to severe financial losses. While bug detection tools able to spot common vulnerability patterns can serve as a first line of defense, most real-world exploits and losses stem from errors in the contract business logic. Formal verification tools such as SolCMC and the Certora Prover address this challenge, but their impact remains limited by steep learning curves and restricted specification languages. Recent works have begun to explore the use of large language models (LLMs) for security-related tasks such as vulnerability detection and test generation. Yet, a fundamental question remains open: can LLMs serve as verification oracles, capable of reasoning about arbitrary contract-specific properties? In this paper, we provide the first systematic evaluation of GPT-5, a state-of-the-art reasoning LLM, in this role. We benchmark its performance on a large dataset of verification tasks, compare its outputs against those of established formal verification tools, and assess its practical effectiveness in real-world auditing scenarios. Our study combines quantitative metrics with qualitative analysis, and shows that recent reasoning-oriented LLMs can be surprisingly effective as verification oracles, suggesting a new frontier in the convergence of AI and formal methods for secure smart contract development and auditing.
Authors:Felix Weissberg, Lukas Pirch, Erik Imgrund, Jonas Möller, Thorsten Eisenhofer, Konrad Rieck
Abstract:
Large language models (LLMs) excel in many tasks of software engineering, yet progress in leveraging them for vulnerability discovery has stalled in recent years. To understand this phenomenon, we investigate LLMs through the lens of classic code metrics. Surprisingly, we find that a classifier trained solely on these metrics performs on par with state-of-the-art LLMs for vulnerability discovery. A root-cause analysis reveals a strong correlation and a causal effect between LLMs and code metrics: When the value of a metric is changed, LLM predictions tend to shift by a corresponding magnitude. This dependency suggests that LLMs operate at a similarly shallow level as code metrics, limiting their ability to grasp complex patterns and fully realize their potential in vulnerability discovery. Based on these findings, we derive recommendations on how research should more effectively address this challenge.
Authors:Gorka Guardiola Múzquiz, Juan González-Gómez, Enrique Soriano-Salvador
Abstract:
Write Once Read Many (WORM) properties for storage devices are desirable to ensure data immutability for applications such as secure logging, regulatory compliance, archival storage, and other types of backup systems. WORM devices guarantee that data, once written, cannot be altered or deleted. However, implementing secure and compatible WORM storage remains a challenge. Traditional solutions often rely on specialized hardware, which is either costly, closed, or inaccessible to the general public. Distributed approaches, while promising, introduce additional risks such as denial-of-service vulnerabilities and operational complexity. We introduce Socarrat, a novel, cost-effective, and local WORM storage solution that leverages a simple external USB device (specifically, a single-board computer running Linux with USB On-The-Go support). The resulting device can be connected via USB, appearing as an ordinary external disk formatted with an ext4 or exFAT file system, without requiring any specialized software or drivers. By isolating the WORM enforcement mechanism in a dedicated USB hardware module, Socarrat significantly reduces the attack surface and ensures that even privileged attackers cannot modify or erase stored data. In addition to the WORM capacity, the system is designed to be tamper-evident, becoming resilient against advanced attacks. This work describes a novel approach, the Reverse File System, based on inferring the file system operations occurring at higher layers in the host computer where Socarrat is mounted. The paper also describes the current Socarrat prototype, implemented in Go and available as free/libre software. Finally, it provides a complete evaluation of the logging performance on different single-board computers.
Authors:Amir Reza Ramtin, Philippe Nain, Don Towsley
Abstract:
We investigate the problem of covert quickest change detection in a continuous-time setting, where a Brownian motion experiences a drift change at an unknown time. Unlike classical formulations, we consider a covert adversary who adjusts the post-change drift $μ= μ(γ)$ as a function of the false alarm constraint parameter $γ$, with the goal of remaining undetected for as long as possible. Leveraging the exact expressions for the average detection delay (ADD) and average time to false alarm (AT2FA) known for the continuous-time CuSum procedure, we rigorously analyze how the asymptotic behavior of ADD evolves as $μ(γ) \to 0$ with increasing $γ$. Our results reveal that classical detection delay characterizations no longer hold in this regime. We derive sharp asymptotic expressions for the ADD under various convergence rates of $μ(γ)$, identify precise conditions for maintaining covertness, and characterize the total damage inflicted by the adversary. We show that the adversary achieves maximal damage when the drift scales as $μ(γ) = Î(1/\sqrtγ)$, marking a fundamental trade-off between stealth and impact in continuous-time detection systems.
Authors:Lorenzo Guerra, Thomas Chapuis, Guillaume Duc, Pavlo Mozharovskyi, Van-Tam Nguyen
Abstract:
Detecting intrusions in network traffic is a challenging task, particularly under limited supervision and constantly evolving attack patterns. While recent works have leveraged graph neural networks for network intrusion detection, they often decouple representation learning from anomaly detection, limiting the utility of the embeddings for identifying attacks. We propose GraphIDS, a self-supervised intrusion detection model that unifies these two stages by learning local graph representations of normal communication patterns through a masked autoencoder. An inductive graph neural network embeds each flow with its local topological context to capture typical network behavior, while a Transformer-based encoder-decoder reconstructs these embeddings, implicitly learning global co-occurrence patterns via self-attention without requiring explicit positional information. During inference, flows with unusually high reconstruction errors are flagged as potential intrusions. This end-to-end framework ensures that embeddings are directly optimized for the downstream task, facilitating the recognition of malicious traffic. On diverse NetFlow benchmarks, GraphIDS achieves up to 99.98% PR-AUC and 99.61% macro F1-score, outperforming baselines by 5-25 percentage points.
Authors:Jugal Gajjar, Kamalasankari Subramaniakuppusamy, Relsy Puthal, Kaustik Ranaware
Abstract:
Modern software development pipelines face growing challenges in securing large codebases with extensive dependencies. Static analysis tools like Bandit are effective at vulnerability detection but suffer from high false positives and lack repair capabilities. Large Language Models (LLMs), in contrast, can suggest fixes but often hallucinate changes and lack self-validation. We present SecureFixAgent, a hybrid repair framework integrating Bandit with lightweight local LLMs (<8B parameters) in an iterative detect-repair-validate loop. To improve precision, we apply parameter-efficient LoRA-based fine-tuning on a diverse, curated dataset spanning multiple Python project domains, mitigating dataset bias and reducing unnecessary edits. SecureFixAgent uses Bandit for detection, the LLM for candidate fixes with explanations, and Bandit re-validation for verification, all executed locally to preserve privacy and reduce cloud reliance. Experiments show SecureFixAgent reduces false positives by 10.8% over static analysis, improves fix accuracy by 13.51%, and lowers false positives by 5.46% compared to pre-trained LLMs, typically converging within three iterations. Beyond metrics, developer studies rate explanation quality 4.5/5, highlighting its value for human trust and adoption. By combining verifiable security improvements with transparent rationale in a resource-efficient local framework, SecureFixAgent advances trustworthy, automated vulnerability remediation for modern pipelines.
Authors:Bruno Llacer Trotti, Weizhao Tang, Rachid El-Azouzi, Giulia Fanti, Daniel Sadoc Menasche
Abstract:
Liquidity providers (LPs) are essential figures in the operation of automated market makers (AMMs); in exchange for transaction fees, LPs lend the liquidity that allows AMMs to operate. While many prior works have studied the incentive structures of LPs in general, we currently lack a principled understanding of a special class of LPs known as Just-In-Time (JIT) LPs. These are strategic agents who momentarily supply liquidity for a single swap, in an attempt to extract disproportionately high fees relative to the remaining passive LPs. This paper provides the first formal, transaction-level model of JIT liquidity provision for a widespread class of AMMs known as Concentrated Liquidity Market Makers (CLMMs), as seen in Uniswap V3, for instance. We characterize the landscape of price impact and fee allocation in these systems, formulate and analyze a non-linear optimization problem faced by JIT LPs, and prove the existence of an optimal strategy. By fitting our optimal solution for JIT LPs to real-world CLMMs, we observe that in liquidity pools (particularly those with risky assets), there is a significant gap between observed and optimal JIT behavior. Existing JIT LPs often fail to account for price impact; doing so, we estimate they could increase earnings by up to 69% on average over small time windows. We also show that JIT liquidity, when deployed strategically, can improve market efficiency by reducing slippage for traders, albeit at the cost of eroding average passive LP profits by up to 44% per trade.
Authors:Qianyu Yu, Giuliano Losa, Nibesh Shrestha, Xuechao Wang
Abstract:
To maximize performance, many modern blockchain systems rely on eventually-synchronous, Byzantine fault-tolerant (BFT) consensus protocols. Two protocol designs have emerged in this space: protocols that minimize latency using a leader that drives both data dissemination and consensus, and protocols that maximize throughput using a separate, asynchronous data dissemination layer. Recent protocols such as Partially-Synchronous Bullshark and Sailfish combine elements of both approaches by using a DAG to enable parallel data dissemination and a leader that paces DAG formation. This improves latency while achieving state-of-the-art throughput. Yet the latency of leader-based protocols is still better under moderate loads. We present Angelfish, a hybrid protocol that adapts smoothly across this design space, from leader-based to Sailfish-like DAG-based consensus. Angelfish lets a dynamically-adjusted subset of parties use best-effort broadcast to issue lightweight votes instead of reliably broadcasting costlier DAG vertices. This reduces communication, helps lagging nodes catch up, and lowers latency in practice compared to prior DAG-based protocols. Our empirical evaluation shows that Angelfish attains state-of-the-art peak throughput while matching the latency of leader-based protocols under moderate throughput, delivering the best of both worlds.
Authors:Shashank Shreedhar Bhatt, Tanmay Rajore, Khushboo Aggarwal, Ganesh Ananthanarayanan, Ranveer Chandra, Nishanth Chandran, Suyash Choudhury, Divya Gupta, Emre Kiciman, Sumit Kumar Pandey, Srinath Setty, Rahul Sharma, Teijia Zhao
Abstract:
Large language models (LLMs) are increasingly deployed in enterprise settings where they interact with multiple users and are trained or fine-tuned on sensitive internal data. While fine-tuning enhances performance by internalizing domain knowledge, it also introduces a critical security risk: leakage of confidential training data to unauthorized users. These risks are exacerbated when LLMs are combined with Retrieval-Augmented Generation (RAG) pipelines that dynamically fetch contextual documents at inference time. We demonstrate data exfiltration attacks on AI assistants where adversaries can exploit current fine-tuning and RAG architectures to leak sensitive information by leveraging the lack of access control enforcement. We show that existing defenses, including prompt sanitization, output filtering, system isolation, and training-level privacy mechanisms, are fundamentally probabilistic and fail to offer robust protection against such attacks. We take the position that only a deterministic and rigorous enforcement of fine-grained access control during both fine-tuning and RAG-based inference can reliably prevent the leakage of sensitive data to unauthorized recipients. We introduce a framework centered on the principle that any content used in training, retrieval, or generation by an LLM is explicitly authorized for \emph{all users involved in the interaction}. Our approach offers a simple yet powerful paradigm shift for building secure multi-user LLM systems that are grounded in classical access control but adapted to the unique challenges of modern AI workflows. Our solution has been deployed in Microsoft Copilot Tuning, a product offering that enables organizations to fine-tune models using their own enterprise-specific data.
Authors:Ramazan Yener, Muhammad Hassan, Masooda Bashir
Abstract:
The integration of the Internet of Medical Things (IoMT) into healthcare systems has transformed patient care by enabling real-time monitoring, enhanced diagnostics, and enhanced operational efficiency. However, this increased connectivity has also expanded the attack surface for cybercriminals, raising significant cybersecurity and privacy concerns. This study focuses on the cybersecurity vulnerabilities of IoMT infusion pumps, which are critical devices in modern healthcare. Through a targeted literature review of the past five years, we analyzed seven current studies from a pool of 132 papers to identify security vulnerabilities. Our findings indicate that infusion pumps face vulnerabilities such as device-level flaws, authentication and access control issues, network and communication weaknesses, data security and privacy risks, and operational or organizational challenges that can expose them to lateral attacks within healthcare networks. Our analysis synthesizes findings from seven recent studies to clarify how and why infusion pumps remain vulnerable in each of these areas. By categorizing the security gaps, we highlight critical risk patterns and their implications. This work underscores the scope of the issue and provides a structured understanding that is valuable for healthcare IT professionals and device manufacturers. Ultimately, the findings can inform the development of targeted, proactive security strategies to better safeguard infusion pumps and protect patient well-being.
Authors:Taesoo Kim, HyungSeok Han, Soyeon Park, Dae R. Jeong, Dohyeok Kim, Dongkwan Kim, Eunsoo Kim, Jiho Kim, Joshua Wang, Kangsu Kim, Sangwoo Ji, Woosun Song, Hanqing Zhao, Andrew Chin, Gyejin Lee, Kevin Stevens, Mansour Alharthi, Yizhuo Zhai, Cen Zhang, Joonun Jang, Yeongjin Jang, Ammar Askar, Dongju Kim, Fabian Fleischer, Jeongin Cho, Junsik Kim, Kyungjoon Ko, Insu Yun, Sangdon Park, Dowoo Baik, Haein Lee, Hyeon Heo, Minjae Gwon, Minjae Lee, Minwoo Baek, Seunggi Min, Wonyoung Kim, Yonghwi Jin, Younggi Park, Yunjae Choi, Jinho Jung, Gwanhyun Lee, Junyoung Jang, Kyuheon Kim, Yeonghyeon Cha, Youngjoon Kim
Abstract:
We present ATLANTIS, the cyber reasoning system developed by Team Atlanta that won 1st place in the Final Competition of DARPA's AI Cyber Challenge (AIxCC) at DEF CON 33 (August 2025). AIxCC (2023-2025) challenged teams to build autonomous cyber reasoning systems capable of discovering and patching vulnerabilities at the speed and scale of modern software. ATLANTIS integrates large language models (LLMs) with program analysis -- combining symbolic execution, directed fuzzing, and static analysis -- to address limitations in automated vulnerability discovery and program repair. Developed by researchers at Georgia Institute of Technology, Samsung Research, KAIST, and POSTECH, the system addresses core challenges: scaling across diverse codebases from C to Java, achieving high precision while maintaining broad coverage, and producing semantically correct patches that preserve intended behavior. We detail the design philosophy, architectural decisions, and implementation strategies behind ATLANTIS, share lessons learned from pushing the boundaries of automated security when program analysis meets modern AI, and release artifacts to support reproducibility and future research.
Authors:Carla Ferradini, Martin Sandfuchs, Ramona Wolf, Renato Renner
Abstract:
The security of quantum key distribution (QKD) is quantified by a parameter $\varepsilon>0$, which -- under well-defined physical assumptions -- can be bounded explicitly. This contrasts with computationally secure schemes, where security claims are only asymptotic (i.e., under standard complexity assumptions, one only knows that $\varepsilon \to 0$ as the key size grows, but has no explicit bound). Here we explain the definition and interpretation of $\varepsilon$-security. Adopting an axiomatic approach, we show that $\varepsilon$ can be understood as the maximum probability of a security failure. Finally, we review and address several criticisms of this definition that have appeared in the literature.
Authors:Kiho Lee, Jungkon Kim, Doowon Kim, Hyoungshick Kim
Abstract:
Code-generating Large Language Models (LLMs) significantly accelerate software development. However, their frequent generation of insecure code presents serious risks. We present a comprehensive evaluation of seven parameter-efficient fine-tuning (PEFT) techniques, demonstrating substantial gains in secure code generation without compromising functionality. Our research identifies prompt-tuning as the most effective PEFT method, achieving an 80.86% Overall-Secure-Rate on CodeGen2 16B, a 13.5-point improvement over the 67.28% baseline. Optimizing decoding strategies through sampling temperature further elevated security to 87.65%. This equates to a reduction of approximately 203,700 vulnerable code snippets per million generated. Moreover, prompt and prefix tuning increase robustness against poisoning attacks in our TrojanPuzzle evaluation, with strong performance against CWE-79 and CWE-502 attack vectors. Our findings generalize across Python and Java, confirming prompt-tuning's consistent effectiveness. This study provides essential insights and practical guidance for building more resilient software systems with LLMs.
Authors:Ben Dong, Hui Feng, Qian Wang
Abstract:
As quantum computing advances, quantum circuit simulators serve as critical tools to bridge the current gap caused by limited quantum hardware availability. These simulators are typically deployed on cloud platforms, where users submit proprietary circuit designs for simulation. In this work, we demonstrate a novel timing side-channel attack targeting cloud-based quantum simulators. A co-located malicious process can observe fine-grained execution timing patterns to extract sensitive information about concurrently running quantum circuits. We systematically analyze simulator behavior using the QASMBench benchmark suite, profiling timing and memory characteristics across various circuit executions. Our experimental results show that timing profiles exhibit circuit-dependent patterns that can be effectively classified using pattern recognition techniques, enabling the adversary to infer circuit identities and compromise user confidentiality. We were able to achieve 88% to 99.9% identification rate of quantum circuits based on different datasets. This work highlights previously unexplored security risks in quantum simulation environments and calls for stronger isolation mechanisms to protect user workloads
Authors:Xin Tong, Zhi Lin, Jingya Wang, Meng Han, Bo Jin
Abstract:
Large language models (LLMs) enforce safety alignment to reliably refuse malicious requests, yet the same blanket safeguards also block legitimate uses in policing, defense, and other high-stakes settings. Earlier "refusal-direction" edits can bypass those layers, but they rely on a single vector that indiscriminately unlocks all hazardous topics, offering no semantic control. We introduce Mutually Exclusive Unlock Vectors (MEUV), a lightweight framework that factorizes the monolithic refusal direction into topic-aligned, nearly orthogonal vectors, each dedicated to one sensitive capability. MEUV is learned in a single epoch with a multi-task objective that blends a differential-ablation margin, cross-topic and orthogonality penalties, and several auxiliary terms. On bilingual malicious-prompt benchmarks, MEUV achieves an attack success rate of no less than 87% on Gemma-2-2B, LLaMA-3-8B, and Qwen-7B, yet cuts cross-topic leakage by up to 90% compared with the best single-direction baseline. Vectors trained in Chinese transfer almost unchanged to English (and vice versa), suggesting a language-agnostic refusal subspace. The results show that fine-grained, topic-level capability activation is achievable with minimal utility loss, paving the way for controlled LLMs deployment in security-sensitive domains.
Authors:Daniel Herzinger, Linus Heise, Daniel Loebenberger, Matthias Söllner
Abstract:
Advances in quantum computing necessitate migrating the entire technology stack to post-quantum cryptography. This includes IPsec-based VPN connection authentication. Although there is an RFC draft for post-quantum authentication in this setting, the draft does not consider (stateful) hash-based signatures despite their small signature size and trusted long-term security. We propose a design with time-based state-management that assigns VPN devices a certificate authority (CA) based on the hash-based signature scheme XMSS. The CA then issues leaf certificates which are based on classical cryptography but have a short validity time, e. g., four hours. It is to be expected that even large quantum computers will take significantly longer to break the cryptography, making the design quantum-secure. We propose strategies to make the timekeeping more resilient to faults and tampering, as well as strategies to recognize a wrong system time, minimize its potential damage, and quickly recover. The result is an OpenBSD implementation of a quantum-safe and, regarding the leaf certificates, highly flexible VPN authentication design that requires significantly less bandwidth and computational resources compared to existing alternatives.
Authors:Anusha Sinha, Keltin Grimes, James Lucassen, Michael Feffer, Nathan VanHoudnos, Zhiwei Steven Wu, Hoda Heidari
Abstract:
A red team simulates adversary attacks to help defenders find effective strategies to defend their systems in a real-world operational setting. As more enterprise systems adopt AI, red-teaming will need to evolve to address the unique vulnerabilities and risks posed by AI systems. We take the position that AI systems can be more effectively red-teamed if AI red-teaming is recognized as a domain-specific evolution of cyber red-teaming. Specifically, we argue that existing Cyber Red Teams who adopt this framing will be able to better evaluate systems with AI components by recognizing that AI poses new risks, has new failure modes to exploit, and often contains unpatchable bugs that re-prioritize disclosure and mitigation strategies. Similarly, adopting a cybersecurity framing will allow existing AI Red Teams to leverage a well-tested structure to emulate realistic adversaries, promote mutual accountability with formal rules of engagement, and provide a pattern to mature the tooling necessary for repeatable, scalable engagements. In these ways, the merging of AI and Cyber Red Teams will create a robust security ecosystem and best position the community to adapt to the rapidly changing threat landscape.
Authors:Fabian Bäumer, Marcel Maehren, Marcus Brinkmann, Jörg Schwenk
Abstract:
SSH is an important protocol for secure remote shell access to servers on the Internet. At USENIX 2024, Bäumer et al. presented the Terrapin attack on SSH, which relies on the attacker injecting optional messages during the key exchange. To mitigate this attack, SSH vendors adopted an extension developed by OpenSSH called strict key exchange ("strict KEX"). With strict KEX, optional messages are forbidden during the handshake, preventing the attack. In practice, this should simplify the state machine of an SSH handshake to a linear message flow similar to that of TLS. In this work, we analyze the design, implementation, and security of strict KEX in popular SSH servers, using black-box state learning, which can uncover the hidden state machine of an implementation. In practice, it is limited by the number of learned messages and the complexity of the state machine. Thus, learning the complete state machine of SSH is infeasible. Previous research on SSH, therefore, excluded optional messages, learning only a partial state machine. However, these messages are a critical part of the Terrapin attack. We propose to instead learn the complete state machine of the handshake phase of an SSH server, but with strict KEX enabled. We investigate the security of ten SSH implementations supporting strict KEX for up to five key exchange algorithms. In total, we learn 33 state machines, revealing significant differences in the implementations. We show that seven implementations violate the strict KEX specification and find two critical security vulnerabilities. One results in a rogue session attack in the proprietary Tectia SSH implementation. Another affects the official SSH implementation of the Erlang Open Telecom Platform, and enables unauthenticated remote code execution in the security context of the SSH server.
Authors:Christoph Hochrainer, Valentin Wüstholz, Maria Christakis
Abstract:
Zero-knowledge virtual machines (zkVMs) are increasingly deployed in decentralized applications and blockchain rollups since they enable verifiable off-chain computation. These VMs execute general-purpose programs, frequently written in Rust, and produce succinct cryptographic proofs. However, zkVMs are complex, and bugs in their constraint systems or execution logic can cause critical soundness (accepting invalid executions) or completeness (rejecting valid ones) issues. We present Arguzz, the first automated tool for testing zkVMs for soundness and completeness bugs. To detect such bugs, Arguzz combines a novel variant of metamorphic testing with fault injection. In particular, it generates semantically equivalent program pairs, merges them into a single Rust program with a known output, and runs it inside a zkVM. By injecting faults into the VM, Arguzz mimics malicious or buggy provers to uncover overly weak constraints. We used Arguzz to test six real-world zkVMs (RISC Zero, Nexus, Jolt, SP1, OpenVM, and Pico) and found eleven bugs in three of them. One RISC Zero bug resulted in a $50,000 bounty, despite prior audits, demonstrating the critical need for systematic testing of zkVMs.
Authors:Yang Zhang, Wenyi Ouyang, Yi Zhang, Liang Cheng, Chen Wu, Wenxin Hu
Abstract:
The prevalence of cryptographic API misuse (CAM) is compromising the effectiveness of cryptography and in turn the security of modern systems and applications. Despite extensive efforts to develop CAM detection tools, these tools typically rely on a limited set of predefined rules from human-curated knowledge. This rigid, rule-based approach hinders adaptation to evolving CAM patterns in real practices. We propose leveraging large language models (LLMs), trained on publicly available cryptography-related data, to automatically detect and classify CAMs in real-world code to address this limitation. Our method enables the development and continuous expansion of a CAM taxonomy, supporting developers and detection tools in tracking and understanding emerging CAM patterns. Specifically, we develop an LLM-agnostic prompt engineering method to guide LLMs in detecting CAM instances from C/C++, Java, Python, and Go code, and then classifying them into a hierarchical taxonomy. Using a data set of 3,492 real-world software programs, we demonstrate the effectiveness of our approach with mainstream LLMs, including GPT, Llama, Gemini, and Claude. It also allows us to quantitatively measure and compare the performance of these LLMs in analyzing CAM in realistic code. Our evaluation produced a taxonomy with 279 base CAM categories, 36 of which are not addressed by existing taxonomies. To validate its practical value, we encode 11 newly identified CAM types into detection rules and integrate them into existing tools. Experiments show that such integration expands the tools' detection capabilities.
Authors:Zhongtang Luo, Jianting Zhang, Akshat Neerati, Aniket Kate
Abstract:
The Tor network offers network anonymity to its users by routing their traffic through a sequence of relays. A group of nine directory authorities maintains information about all available relay nodes using a distributed directory protocol. We observe that the current protocol makes a steep synchrony assumption, which makes it vulnerable to natural as well as adversarial non-synchronous communication scenarios over the Internet. In this paper, we show that it is possible to cause a failure in the Tor directory protocol by targeting a majority of the authorities for only five minutes using a well-executed distributed denial-of-service (DDoS) attack. We demonstrate this attack in a controlled environment and show that it is cost-effective for as little as \$53.28 per month to disrupt the protocol and to effectively bring down the entire Tor network. To mitigate this problem, we consider the popular partial synchrony assumption for the Tor directory protocol that ensures that the protocol security is hampered even when the network delays are large and unknown. We design a new Tor directory protocol that leverages any standard partial-synchronous consensus protocol to solve this problem, while also proving its security. We have implemented a prototype in Rust, demonstrating comparable performance to the current protocol while resisting similar attacks.
Authors:Davide Corradini, Mariano Ceccato, Mohammad Ghafari
Abstract:
We present AuthREST, an open-source security testing tool targeting broken authentication, one of the most prevalent API security risks in the wild. AuthREST automatically tests web APIs for credential stuffing, password brute forcing, and unchecked token authenticity. Empirical results show that AuthREST is effective in improving web API security. Notably, it uncovered previously unknown authentication vulnerabilitiesin in four public APIs.
Authors:Fabian Bäumer, Marcus Brinkmann, Maximilian Radoy, Jörg Schwenk, Juraj Somorovsky
Abstract:
Administrators and developers use SSH client keys and signatures for authentication, for example, to access internet backbone servers or to commit new code on platforms like GitHub. However, unlike servers, SSH clients cannot be measured through internet scans. We close this gap in two steps. First, we collect SSH client public keys. Such keys are regularly published by their owners on open development platforms like GitHub and GitLab. We systematize previous non-academic work by subjecting these keys to various security tests in a longitudinal study. Second, in a series of black-box lab experiments, we analyze the implementations of algorithms for SSH client signatures in 24 popular SSH clients for Linux, Windows, and macOS. We extracted 31,622,338 keys from three public sources in two scans. Compared to previous work, we see a clear tendency to abandon RSA signatures in favor of EdDSA signatures. Still, in January 2025, we found 98 broken short keys, 139 keys generated from weak randomness, and 149 keys with common or small factors-the large majority of the retrieved keys exposed no weakness. Weak randomness can not only compromise a secret key through its public key, but also through signatures. It is well-known that a bias in random nonces in ECDSA can reveal the secret key through public signatures. For the first time, we show that the use of deterministic nonces in ECDSA can also be dangerous: The private signing key of a PuTTY client can be recovered from just 58 valid signatures if ECDSA with NIST curve P-521 is used. PuTTY acknowledged our finding in CVE-2024-31497, and they subsequently replaced the nonce generation algorithm.
Authors:Muhammad Azmi Umer, Zhan Xuna, Yan Lin Aung, Aditya P. Mathur, Jianying Zhou
Abstract:
Critical Infrastructure (CI) is prone to cyberattacks. Several techniques have been developed to protect CI against such attacks. In this work, we describe a honeypot based on a cyber twin for a water treatment plant. The honeypot is intended to serve as a realistic replica of a water treatment plant that attracts potential attackers. The attacks launched on the honeypot are recorded and analyzed for threat intelligence. The intelligence so obtained is shared with the management of water treatment plants, who in turn may use it to improve plant protection systems. The honeypot used here is operational and has been attacked on several occasions using, for example, a ransomware attack that is described in detail.
Authors:Bishnu Bhusal, Rohit Chadha, A. Prasad Sistla, Mahesh Viswanathan
Abstract:
The verification of differential privacy algorithms that employ Gaussian distributions is little understood. This paper tackles the challenge of verifying such programs by introducing a novel approach to approximating probability distributions of loop-free programs that sample from both discrete and continuous distributions with computable probability density functions, including Gaussian and Laplace. We establish that verifying $(ε,δ)$-differential privacy for these programs is \emph{almost decidable}, meaning the problem is decidable for all values of $δ$ except those in a finite set. Our verification algorithm is based on computing probabilities to any desired precision by combining integral approximations, and tail probability bounds. The proposed methods are implemented in the tool, DipApprox, using the FLINT library for high-precision integral computations, and incorporate optimizations to enhance scalability. We validate {\ourtool} on fundamental privacy-preserving algorithms, such as Gaussian variants of the Sparse Vector Technique and Noisy Max, demonstrating its effectiveness in both confirming privacy guarantees and detecting violations.
Authors:Charuka Herath, Yogachandran Rahulamathavan, Varuna De Silva, Sangarapillai Lambotharan
Abstract:
Federated Learning (FL) enables decentralized model training without sharing raw data, offering strong privacy guarantees. However, existing FL protocols struggle to defend against Byzantine participants, maintain model utility under non-independent and identically distributed (non-IID) data, and remain lightweight for edge devices. Prior work either assumes trusted hardware, uses expensive cryptographic tools, or fails to address privacy and robustness simultaneously. We propose DSFL, a Dual-Server Byzantine-Resilient Federated Learning framework that addresses these limitations using a group-based secure aggregation approach. Unlike LSFL, which assumes non-colluding semi-honest servers, DSFL removes this dependency by revealing a key vulnerability: privacy leakage through client-server collusion. DSFL introduces three key innovations: (1) a dual-server secure aggregation protocol that protects updates without encryption or key exchange, (2) a group-wise credit-based filtering mechanism to isolate Byzantine clients based on deviation scores, and (3) a dynamic reward-penalty system for enforcing fair participation. DSFL is evaluated on MNIST, CIFAR-10, and CIFAR-100 under up to 30 percent Byzantine participants in both IID and non-IID settings. It consistently outperforms existing baselines, including LSFL, homomorphic encryption methods, and differential privacy approaches. For example, DSFL achieves 97.15 percent accuracy on CIFAR-10 and 68.60 percent on CIFAR-100, while FedAvg drops to 9.39 percent under similar threats. DSFL remains lightweight, requiring only 55.9 ms runtime and 1088 KB communication per round.
Authors:Lucas Fenaux, Zheng Wang, Jacob Yan, Nathan Chung, Florian Kerschbaum
Abstract:
Federated Learning is a distributed learning technique in which multiple clients cooperate to train a machine learning model. Distributed settings facilitate backdoor attacks by malicious clients, who can embed malicious behaviors into the model during their participation in the training process. These malicious behaviors are activated during inference by a specific trigger. No defense against backdoor attacks has stood the test of time, especially against adaptive attackers, a powerful but not fully explored category of attackers. In this work, we first devise a new adaptive adversary that surpasses existing adversaries in capabilities, yielding attacks that only require one or two malicious clients out of 20 to break existing state-of-the-art defenses. Then, we present Hammer and Anvil, a principled defense approach that combines two defenses orthogonal in their underlying principle to produce a combined defense that, given the right set of parameters, must succeed against any attack. We show that our best combined defense, Krum+, is successful against our new adaptive adversary and state-of-the-art attacks.
Authors:Katsuaki Nakano, Reza Feyyazi, Shanchieh Jay Yang, Michael Zuzak
Abstract:
Recent advances in Large Language Models (LLMs) have driven interest in automating cybersecurity penetration testing workflows, offering the promise of faster and more consistent vulnerability assessment for enterprise systems. Existing LLM agents for penetration testing primarily rely on self-guided reasoning, which can produce inaccurate or hallucinated procedural steps. As a result, the LLM agent may undertake unproductive actions, such as exploiting unused software libraries or generating cyclical responses that repeat prior tactics. In this work, we propose a guided reasoning pipeline for penetration testing LLM agents that incorporates a deterministic task tree built from the MITRE ATT&CK Matrix, a proven penetration testing kll chain, to constrain the LLM's reaoning process to explicitly defined tactics, techniques, and procedures. This anchors reasoning in proven penetration testing methodologies and filters out ineffective actions by guiding the agent towards more productive attack procedures. To evaluate our approach, we built an automated penetration testing LLM agent using three LLMs (Llama-3-8B, Gemini-1.5, and GPT-4) and applied it to navigate 10 HackTheBox cybersecurity exercises with 103 discrete subtasks representing real-world cyberattack scenarios. Our proposed reasoning pipeline guided the LLM agent through 71.8\%, 72.8\%, and 78.6\% of subtasks using Llama-3-8B, Gemini-1.5, and GPT-4, respectively. Comparatively, the state-of-the-art LLM penetration testing tool using self-guided reasoning completed only 13.5\%, 16.5\%, and 75.7\% of subtasks and required 86.2\%, 118.7\%, and 205.9\% more model queries. This suggests that incorporating a deterministic task tree into LLM reasoning pipelines can enhance the accuracy and efficiency of automated cybersecurity assessments
Authors:Shuai Yuan, Zhibo Zhang, Yuxi Li, Guangdong Bai, Wang Kailong
Abstract:
The widespread distribution of Large Language Models (LLMs) through public platforms like Hugging Face introduces significant security challenges. While these platforms perform basic security scans, they often fail to detect subtle manipulations within the embedding layer. This work identifies a novel class of deployment phase attacks that exploit this vulnerability by injecting imperceptible perturbations directly into the embedding layer outputs without modifying model weights or input text. These perturbations, though statistically benign, systematically bypass safety alignment mechanisms and induce harmful behaviors during inference. We propose Search based Embedding Poisoning(SEP), a practical, model agnostic framework that introduces carefully optimized perturbations into embeddings associated with high risk tokens. SEP leverages a predictable linear transition in model responses, from refusal to harmful output to semantic deviation to identify a narrow perturbation window that evades alignment safeguards. Evaluated across six aligned LLMs, SEP achieves an average attack success rate of 96.43% while preserving benign task performance and evading conventional detection mechanisms. Our findings reveal a critical oversight in deployment security and emphasize the urgent need for embedding level integrity checks in future LLM defense strategies.
Authors:Yuhan Meng, Shaofei Li, Jiaping Gui, Peng Jiang, Ding Li
Abstract:
High-level natural language knowledge in CTI reports, such as the ATT&CK framework, is beneficial to counter APT attacks. However, how to automatically apply the high-level knowledge in CTI reports in realistic attack detection systems, such as provenance analysis systems, is still an open problem. The challenge stems from the semantic gap between the knowledge and the low-level security logs: while the knowledge in CTI reports is written in natural language, attack detection systems can only process low-level system events like file accesses or network IP manipulations. Manual approaches can be labor-intensive and error-prone. In this paper, we propose KnowHow, a CTI-knowledge-driven online provenance analysis approach that can automatically apply high-level attack knowledge from CTI reports written in natural languages to detect low-level system events. The core of KnowHow is a novel attack knowledge representation, gIoC, that represents the subject, object, and actions of attacks. By lifting system identifiers, such as file paths, in system events to natural language terms, KnowHow can match system events to gIoC and further match them to techniques described in natural languages. Finally, based on the techniques matched to system events, KnowHow reasons about the temporal logic of attack steps and detects potential APT attacks in system events. Our evaluation shows that KnowHow can accurately detect all 16 APT campaigns in the open-source and industrial datasets, while existing approaches all introduce large numbers of false positives. Meanwhile, our evaluation also shows that KnowHow reduces at most 90% of node-level false positives while having a higher node-level recall and is robust against several unknown attacks and mimicry attacks.
Authors:Ali Arastehfard, Weiran Liu, Joshua Lee, Bingyu Liu, Xuegang Ban, Yuan Hong
Abstract:
Secure norm computation is becoming increasingly important in many real-world learning applications. However, existing cryptographic systems often lack a general framework for securely computing the $L^p$-norm over private inputs held by different parties. These systems often treat secure norm computation as a black-box process, neglecting to design tailored cryptographic protocols that optimize performance. Moreover, they predominantly focus on the $L^2$-norm, paying little attention to other popular $L^p$-norms, such as $L^1$ and $L^\infty$, which are commonly used in practice, such as machine learning tasks and location-based services. To our best knowledge, we propose the first comprehensive framework for secure two-party $L^p$-norm computations ($L^1$, $L^2$, and $L^\infty$), denoted as \mbox{Crypto-$L^p$}, designed to be versatile across various applications. We have designed, implemented, and thoroughly evaluated our framework across a wide range of benchmarking applications, state-of-the-art (SOTA) cryptographic protocols, and real-world datasets to validate its effectiveness and practical applicability. In summary, \mbox{Crypto-$L^p$} outperforms prior works on secure $L^p$-norm computation, achieving $82\times$, $271\times$, and $42\times$ improvements in runtime while reducing communication overhead by $36\times$, $4\times$, and $21\times$ for $p=1$, $2$, and $\infty$, respectively. Furthermore, we take the first step in adapting our Crypto-$L^p$ framework for secure machine learning inference, reducing communication costs by $3\times$ compared to SOTA systems while maintaining comparable runtime and accuracy.
Authors:Erik Rye, Dave Levin, Robert Beverly
Abstract:
IPv4 NAT has limited the spread of IoT botnets considerably by default-denying bots' incoming connection requests to in-home devices unless the owner has explicitly allowed them. As the Internet transitions to majority IPv6, however, residential connections no longer require the use of NAT. This paper therefore asks: has the transition from IPv4 to IPv6 ultimately made residential networks more vulnerable to attack, thereby empowering the next generation of IPv6-based IoT botnets? To answer this question, we introduce a large-scale IPv6 scanning methodology that, unlike those that rely on AI, can be run on low-resource devices common in IoT botnets. We use this methodology to perform the largest-scale measurement of IPv6 residential networks to date, and compare which devices are publicly accessible to comparable IPv4 networks. We were able to receive responses from 14.0M distinct IPv6 addresses inside of residential networks (i.e., not the external-facing gateway), in 2,436 ASes across 118 countries. These responses come from protocols commonly exploited by IoT botnets (including telnet and FTP), as well as protocols typically associated with end-user devices (including iPhone-Sync and IPP). Comparing to IPv4, we show that we are able to reach more printers, iPhones, and smart lights over IPv6 than full IPv4-wide scans could. Collectively, our results show that NAT has indeed acted as the de facto firewall of the Internet, and the v4-to-v6 transition of residential networks is opening up new devices to attack.
Authors:Yilu Dong, Tianchang Yang, Arupjyoti Bhuyan, Syed Rafiul Hussain
Abstract:
The 6 GHz spectrum, recently opened for unlicensed use under Wi-Fi 6E and Wi-Fi 7, overlaps with frequencies used by mission-critical incumbent systems such as public safety communications and utility infrastructure. To prevent interference, the FCC mandates the use of Automated Frequency Coordination (AFC) systems, which assign safe frequency and power levels based on Wi-Fi Access Point (AP)-reported locations. In this work, we demonstrate that GPS-based location reporting, which Wi-Fi APs use, can be spoofed using inexpensive, off-the-shelf radio equipment. This enables attackers to manipulate AP behavior, gain unauthorized spectrum access, cause harmful interference, or disable APs entirely by spoofing them into foreign locations. We validate these attacks in a controlled lab setting against a commercial AP and evaluate a commercial AFC system under spoofed scenarios. Our findings highlight critical gaps in the security assumptions of AFC and motivate the need for stronger location integrity protections.
Authors:Rye Stahle-Smith, Rasha Karakchi
Abstract:
The growing use of FPGAs in reconfigurable systems introducessecurity risks through malicious bitstreams that could cause denial-of-service (DoS), data leakage, or covert attacks. We investigated chip-level hardware malicious payload in embedded systems and proposed a supervised machine learning method to detect malicious bitstreams via static byte-level features. Our approach diverges from existing methods by analyzing bitstreams directly at the binary level, enabling real-time detection without requiring access to source code or netlists. Bitstreams were sourced from state-of-the-art (SOTA) benchmarks and re-engineered to target the Xilinx PYNQ-Z1 FPGA Development Board. Our dataset included 122 samples of benign and malicious configurations. The data were vectorized using byte frequency analysis, compressed using TSVD, and balanced using SMOTE to address class imbalance. The evaluated classifiers demonstrated that Random Forest achieved a macro F1-score of 0.97, underscoring the viability of real-time Trojan detection on resource-constrained systems. The final model was serialized and successfully deployed via PYNQ to enable integrated bitstream analysis.
Authors:Refat Othman, Diaeddin Rimawi, Bruno Rossi, Barbara Russo
Abstract:
In the domain of security, vulnerabilities frequently remain undetected even after their exploitation. In this work, vulnerabilities refer to publicly disclosed flaws documented in Common Vulnerabilities and Exposures (CVE) reports. Establishing a connection between attacks and vulnerabilities is essential for enabling timely incident response, as it provides defenders with immediate, actionable insights. However, manually mapping attacks to CVEs is infeasible, thereby motivating the need for automation. This paper evaluates 14 state-of-the-art (SOTA) sentence transformers for automatically identifying vulnerabilities from textual descriptions of attacks. Our results demonstrate that the multi-qa-mpnet-base-dot-v1 (MMPNet) model achieves superior classification performance when using attack Technique descriptions, with an F1-score of 89.0, precision of 84.0, and recall of 94.7. Furthermore, it was observed that, on average, 56% of the vulnerabilities identified by the MMPNet model are also represented within the CVE repository in conjunction with an attack, while 61% of the vulnerabilities detected by the model correspond to those cataloged in the CVE repository. A manual inspection of the results revealed the existence of 275 predicted links that were not documented in the MITRE repositories. Consequently, the automation of linking attack techniques to vulnerabilities not only enhances the detection and response capabilities related to software security incidents but also diminishes the duration during which vulnerabilities remain exploitable, thereby contributing to the development of more secure systems.
Authors:Tran Duc Le, Phuc Hao Do, Truong Duy Dinh, Van Dai Pham
Abstract:
Quantum computing threatens to undermine classical cryptography by breaking widely deployed encryption and signature schemes. This paper examines enterprise readiness for quantum-safe cybersecurity through three perspectives: (i) the technologist view, assessing the maturity of post-quantum cryptography (PQC) and quantum key distribution (QKD); (ii) the enterprise (CISO/CIO) view, analyzing organizational awareness, risk management, and operational barriers; and (iii) the threat actor view, evaluating the evolving quantum threat and the urgency of migration. Using recent standards (e.g., NIST's 2024 PQC algorithms), industry surveys, and threat intelligence, we synthesize findings via a SWOT analysis to map strengths, weaknesses, opportunities, and threats. Results indicate uneven and generally insufficient preparedness: while PQC standards and niche QKD deployments signal technical progress, fewer than 5\% of enterprises have formal quantum-transition plans, and many underestimate "harvest now, decrypt later" risks. Financial, telecom, and government sectors have begun migration, but most industries remain exploratory or stalled by costs, complexity, and skills gaps. Expert consensus places cryptanalytically relevant quantum computers in the 2030s, yet delayed preparation could leave today's data vulnerable for decades. We recommend immediate steps: establishing crypto-agility, creating quantum transition roadmaps, prioritizing PQC deployment in high-value systems, and upskilling cybersecurity teams. A coordinated, proactive approach is essential to secure current and future digital assets in the quantum era.
Authors:Alberto Miguel-Diez, Adrián Campazas-Vega, Ãngel Manuel Guerrero-Higueras, Claudia Ãlvarez-Aparicio, Vicente Matellán-Olivera
Abstract:
Nowadays, the volume of network traffic continues to grow, along with the frequency and sophistication of attacks. This scenario highlights the need for solutions capable of continuously adapting, since network behavior is dynamic and changes over time. This work presents an anomaly detection model for network flows using unsupervised machine learning with online learning capabilities. This approach allows the system to dynamically learn the normal behavior of the network and detect deviations without requiring labeled data, which is particularly useful in real-world environments where traffic is constantly changing and labeled data is scarce. The model was implemented using the River library with a One-Class SVM and evaluated on the NF-UNSW-NB15 dataset and its extended version v2, which contain network flows labeled with different attack categories. The results show an accuracy above 98%, a false positive rate below 3.1%, and a recall of 100% in the most advanced version of the dataset. In addition, the low processing time per flow (<0.033 ms) demonstrates the feasibility of the approach for real-time applications.
Authors:Shaofei Huang, Christopher M. Poskitt, Lwin Khin Shar
Abstract:
This research proposes a real-time, adaptive decision-support framework for mitigating cyber incidents in cyber-physical systems, developed in response to an increasing reliance on these systems within critical infrastructure and evolving adversarial tactics. Existing decision-support systems often fall short in accounting for multi-agent, multi-path attacks and trade-offs between safety and operational continuity. To address this, our framework integrates hierarchical system modelling with Bayesian probabilistic reasoning, constructing Bayesian Network Graphs from system architecture and vulnerability data. Models are encoded using a Domain Specific Language to enhance computational efficiency and support dynamic updates. In our approach, we use a hybrid exposure probability estimation framework, which combines Exploit Prediction Scoring System and Common Vulnerability Scoring System scores via Bayesian confidence calibration to handle epistemic uncertainty caused by incomplete or heterogeneous vulnerability metadata. Mitigation recommendations are generated as countermeasure portfolios, refined using multi-objective optimisation to identify Pareto-optimal strategies balancing attack likelihood, impact severity, and system availability. To accommodate time- and resource-constrained incident response, frequency-based heuristics are applied to prioritise countermeasures across the optimised portfolios. The framework was evaluated through three representative cyber-physical attack scenarios, demonstrating its versatility in handling complex adversarial behaviours under real-time response constraints. The results affirm its utility in operational contexts and highlight the robustness of our proposed approach across diverse threat environments.
Authors:Ruopengyu Xu, Chenglian Liu
Abstract:
This paper presents an enhanced post-quantum key agreement protocol based on Rényi entropy, addressing vulnerabilities in the original construction while preserving information-theoretic security properties. We develop a theoretical framework leveraging entropy-preserving operations and secret-shared verification to achieve provable security against quantum adversaries. Through entropy amplification techniques and quantum-resistant commitments, the protocol establishes $2^{128}$ quantum security guarantees under the quantum random oracle model. Key innovations include a confidentiality-preserving verification mechanism using distributed polynomial commitments, tightened min-entropy bounds with guaranteed non-negativity, and composable security proofs in the quantum universal composability framework. Unlike computational approaches, our method provides information-theoretic security without hardness assumptions while maintaining polynomial complexity. Theoretical analysis demonstrates resilience against known quantum attack vectors, including Grover-accelerated brute force and quantum memory attacks. The protocol achieves parameterization for 128-bit quantum security with efficient $\mathcal{O}(n^{2})$ communication complexity. Extensions to secure multiparty computation and quantum network applications are established, providing a foundation for long-term cryptographic security.
Authors:Md Faizul Bari, Yi Xie, Meghna Roy Choudhury, Shreyas Sen
Abstract:
Electronic devices and cables inadvertently emit RF emissions as a byproduct of signal processing and/or transmission. Labeled as electromagnetic emanations, they form an EM side-channel for data leakage. Previously, it was believed that such leakage could be contained within a facility since they are weak signals with a short transmission range. However, in the preliminary version of this work [1], we found that the traditional cable repairing process forms a tiny monopole antenna that helps emanations transmit over a long range. Experimentation with three types of cables revealed that emanations from repaired cables remain detectable even at >4 m and can penetrate a 14 cm thick concrete wall. In this extended version, we show that such emanation can be exploited at a long distance for information extraction by detecting keystrokes typed on a repaired USB keyboard. By collecting data for 70 different keystrokes at different distances from the target in 3 diverse environments (open space, a corridor outside an office room, and outside a building) and developing an efficient detection algorithm, ~100% keystroke detection accuracy has been achieved up to 12 m distance, which is the highest reported accuracy at such a long range for USB keyboards in the literature. The effect of two experimental factors, interference and human-body coupling, has been investigated thoroughly. Along with exploring the vulnerability, multi-layer external metal shielding during the repairing process as a possible remedy has been explored. This work exposes a new attack surface caused by hardware modification, its exploitation, and potential countermeasures.
Authors:Amirhossein Nazeri, Wael Hafez
Abstract:
Convolutional Neural Networks (CNNs) have become the foundation of modern computer vision, achieving unprecedented accuracy across diverse image recognition tasks. While these networks excel on in-distribution data, they remain vulnerable to adversarial perturbations imperceptible input modifications that cause misclassification with high confidence. However, existing detection methods either require expensive retraining, modify network architecture, or degrade performance on clean inputs. Here we show that adversarial perturbations create immediate, detectable entropy signatures in CNN activations that can be monitored without any model modification. Using parallel entropy monitoring on VGG-16, we demonstrate that adversarial inputs consistently shift activation entropy by 7% in early convolutional layers, enabling 90% detection accuracy with false positives and false negative rates below 20%. The complete separation between clean and adversarial entropy distributions reveals that CNNs inherently encode distribution shifts in their activation patterns. This work establishes that CNN reliability can be assessed through activation entropy alone, enabling practical deployment of self-diagnostic vision systems that detect adversarial inputs in real-time without compromising original model performance.
Authors:Nishant Chinnasami, Rasha Karakchi
Abstract:
AES-128 encryption is theoretically secure but vulnerable in practical deployments due to timing and fault injection attacks on embedded systems. This work presents a lightweight dual-detection framework combining statistical thresholding and machine learning (ML) for real-time anomaly detection. By simulating anomalies via delays and ciphertext corruption, we collect timing and data features to evaluate two strategies: (1) a statistical threshold method based on execution time and (2) a Random Forest classifier trained on block-level anomalies. Implemented on CPU and FPGA (PYNQ-Z1), our results show that the ML approach outperforms static thresholds in accuracy, while maintaining real-time feasibility on embedded platforms. The framework operates without modifying AES internals or relying on hardware performance counters. This makes it especially suitable for low-power, resource-constrained systems where detection accuracy and computational efficiency must be balanced.
Authors:Minghao Hu, Junzhe Wang, Weisen Zhao, Qiang Zeng, Lannan Luo
Abstract:
Applying deep learning to malware detection has drawn great attention due to its notable performance. With the increasing prevalence of cyberattacks targeting IoT devices, there is a parallel rise in the development of malware across various Instruction Set Architectures (ISAs). It is thus important to extend malware detection capacity to multiple ISAs. However, training a deep learning-based malware detection model usually requires a large number of labeled malware samples. The process of collecting and labeling sufficient malware samples to build datasets for each ISA is labor-intensive and time-consuming. To reduce the burden of data collection, we propose to leverage the ideas of Neural Machine Translation (NMT) and Normalizing Flows (NFs) for malware detection. Specifically, when dealing with malware in a certain ISA, we translate it to an ISA with sufficient malware samples (like X86-64). This allows us to apply a model trained on one ISA to analyze malware from another ISA. Our approach reduces the data collection effort by enabling malware detection across multiple ISAs using a model trained on a single ISA.
Authors:Youwei Huang, Jianwen Li, Sen Fang, Yao Li, Peng Yang, Bin Hu, Tao Zhang
Abstract:
Malicious intent in smart contract development can lead to substantial economic losses. SmartIntentNN is a deep learning model specifically designed to identify unsafe intents in smart contracts. This model integrates the Universal Sentence Encoder, a K-means clustering-based intent highlighting mechanism, and a Bidirectional Long Short-Term Memory network for multi-label classification, achieving an F1 of 0.8633 in distinguishing ten different intent categories. In this study, we present an upgraded version of this model, SmartIntentNN2 (Smart Contract Intent Neural Network V2). A significant enhancement in V2 is the incorporation of a BERT-based pre-trained language model, which has been trained on a dataset of 16,000 real smart contracts using a Masked Language Modeling objective. SmartIntentNN2 retains the BiLSTM-based multi-label classification network. With an improved F1 of 0.927, V2 demonstrates enhanced performance compared to its predecessor, establishing itself as the state-of-the-art model for smart contract intent detection.
Authors:Viktor Valadi, Mattias Ã
kesson, Johan Ãstman, Salman Toor, Andreas Hellander
Abstract:
Gradient inversion attacks have garnered attention for their ability to compromise privacy in federated learning. However, many studies consider attacks with the model in inference mode, where training-time behaviors like dropout are disabled and batch normalization relies on fixed statistics. In this work, we systematically analyze how architecture and training behavior affect vulnerability, including the first in-depth study of inference-mode clients, which we show dramatically simplifies inversion. To assess attack feasibility under more realistic conditions, we turn to clients operating in standard training mode. In this setting, we find that successful attacks are only possible when several architectural conditions are met simultaneously: models must be shallow and wide, use skip connections, and, critically, employ pre-activation normalization. We introduce two novel attacks against models in training-mode with varying attacker knowledge, achieving state-of-the-art performance under realistic training conditions. We extend these efforts by presenting the first attack on a production-grade object-detection model. Here, to enable any visibly identifiable leakage, we revert to the lenient inference mode setting and make multiple architectural modifications to increase model vulnerability, with the extent of required changes highlighting the strong inherent robustness of such architectures. We conclude this work by offering the first comprehensive mapping of settings, clarifying which combinations of architectural choices and operational modes meaningfully impact privacy. Our analysis provides actionable insight into when models are likely vulnerable, when they appear robust, and where subtle leakage may persist. Together, these findings reframe how gradient inversion risk should be assessed in future research and deployment scenarios.
Authors:Chao Huang, Zefeng Zhang, Juewei Yue, Quangang Li, Chuang Zhang, Tingwen Liu
Abstract:
Current safety alignment for large language models(LLMs) continues to present vulnerabilities, given that adversarial prompting can effectively bypass their safety measures.Our investigation shows that these safety mechanisms predominantly depend on a limited subset of attention heads: removing or ablating these heads can severely compromise model safety. To identify and evaluate these safety-critical components, we introduce RDSHA, a targeted ablation method that leverages the model's refusal direction to pinpoint attention heads mostly responsible for safety behaviors. Further analysis shows that existing jailbreak attacks exploit this concentration by selectively bypassing or manipulating these critical attention heads. To address this issue, we propose AHD, a novel training strategy designed to promote the distributed encoding of safety-related behaviors across numerous attention heads. Experimental results demonstrate that AHD successfully distributes safety-related capabilities across more attention heads. Moreover, evaluations under several mainstream jailbreak attacks show that models trained with AHD exhibit considerably stronger safety robustness, while maintaining overall functional utility.
Authors:Mahdi Haghifam, Adam Smith, Jonathan Ullman
Abstract:
A membership-inference attack gets the output of a learning algorithm, and a target individual, and tries to determine whether this individual is a member of the training data or an independent sample from the same distribution. A successful membership-inference attack typically requires the attacker to have some knowledge about the distribution that the training data was sampled from, and this knowledge is often captured through a set of independent reference samples from that distribution. In this work we study how much information the attacker needs for membership inference by investigating the sample complexity-the minimum number of reference samples required-for a successful attack. We study this question in the fundamental setting of Gaussian mean estimation where the learning algorithm is given $n$ samples from a Gaussian distribution $\mathcal{N}(μ,Σ)$ in $d$ dimensions, and tries to estimate $\hatμ$ up to some error $\mathbb{E}[\|\hat μ- μ\|^2_Σ]\leq Ï^2 d$. Our result shows that for membership inference in this setting, $Ω(n + n^2 Ï^2)$ samples can be necessary to carry out any attack that competes with a fully informed attacker. Our result is the first to show that the attacker sometimes needs many more samples than the training algorithm uses to train the model. This result has significant implications for practice, as all attacks used in practice have a restricted form that uses $O(n)$ samples and cannot benefit from $Ï(n)$ samples. Thus, these attacks may be underestimating the possibility of membership inference, and better attacks may be possible when information about the distribution is easy to obtain.
Authors:Kangfeng Ye, Roberto Metere, Jim Woodcock, Poonam Yadav
Abstract:
Formal verification is crucial for ensuring the robustness of security protocols against adversarial attacks. The Needham-Schroeder protocol, a foundational authentication mechanism, has been extensively studied, including its integration with Physical Layer Security (PLS) techniques such as watermarking and jamming. Recent research has used ProVerif to verify these mechanisms in terms of secrecy. However, the ProVerif-based approach limits the ability to improve understanding of security beyond verification results. To overcome these limitations, we re-model the same protocol using an Isabelle formalism that generates sound animation, enabling interactive and automated formal verification of security protocols. Our modelling and verification framework is generic and highly configurable, supporting both cryptography and PLS. For the same protocol, we have conducted a comprehensive analysis (secrecy and authenticity in four different eavesdropper locations under both passive and active attacks) using our new web interface. Our findings not only successfully reproduce and reinforce previous results on secrecy but also reveal an uncommon but expected outcome: authenticity is preserved across all examined scenarios, even in cases where secrecy is compromised. We have proposed a PLS-based Diffie-Hellman protocol that integrates watermarking and jamming, and our analysis shows that it is secure for deriving a session key with required authentication. These highlight the advantages of our novel approach, demonstrating its robustness in formally verifying security properties beyond conventional methods.
Authors:Ruopengyu Xu, Chenglian Liu
Abstract:
The imminent threat of quantum computing necessitates quantum-resistant cryptosystems. This paper establishes tight security bounds for two NIST PQC finalists: SPHINCS+ (hash-based) and NTRU (lattice-based). Our key contributions include: (1) A quantum attack model incorporating decoherence effects ($Ï_d$) and parallelization limits; (2) Improved entropy concentration inequalities reducing SPHINCS+ parameters by 15-20\%; (3) Optimized NTRU lattice parameters via quantum lattice entropy $H_Q(Î)$; (4) Tightened NTRU-to-LWE reduction with polynomial-factor improvement. Theoretical results demonstrate significant security enhancement over existing constructions, providing implementable parameters for standardization.
Authors:Joshua Lee, Ali Arastehfard, Weiran Liu, Xuegang Ban, Yuan Hong
Abstract:
Autonomous driving and V2X technologies have developed rapidly in the past decade, leading to improved safety and efficiency in modern transportation. These systems interact with extensive networks of vehicles, roadside infrastructure, and cloud resources to support their machine learning capabilities. However, the widespread use of machine learning in V2X systems raises issues over the privacy of the data involved. This is particularly concerning for smart-transit and driver safety applications which can implicitly reveal user locations or explicitly disclose medical data such as EEG signals. To resolve these issues, we propose SecureV2X, a scalable, multi-agent system for secure neural network inferences deployed between the server and each vehicle. Under this setting, we study two multi-agent V2X applications: secure drowsiness detection, and secure red-light violation detection. Our system achieves strong performance relative to baselines, and scales efficiently to support a large number of secure computation interactions simultaneously. For instance, SecureV2X is $9.4 \times$ faster, requires $143\times$ fewer computational rounds, and involves $16.6\times$ less communication on drowsiness detection compared to other secure systems. Moreover, it achieves a runtime nearly $100\times$ faster than state-of-the-art benchmarks in object detection tasks for red light violation detection.
Authors:Ahmed Mounsf Rafik Bendada, Yacine Ghamri-Doudane
Abstract:
With the rapid growth of Electric Vehicle (EV) technology, EVs are destined to shape the future of transportation. The large number of EVs facilitates the development of the emerging vehicle-to-grid (V2G) technology, which realizes bidirectional energy exchanges between EVs and the power grid. This has led to the setting up of electricity markets that are usually confined to a small geographical location, often with a small number of participants. Usually, these markets are manipulated by intermediaries responsible for collecting bids from prosumers, determining the market-clearing price, incorporating grid constraints, and accounting for network losses. While centralized models can be highly efficient, they grant excessive power to the intermediary by allowing them to gain exclusive access to prosumers \textquotesingle price preferences. This opens the door to potential market manipulation and raises significant privacy concerns for users, such as the location of energy providers. This lack of protection exposes users to potential risks, as untrustworthy servers and malicious adversaries can exploit this information to infer trading activities and real identities. This work proposes a secure, decentralized exchange market built on blockchain technology, utilizing a privacy-preserving Automated Market Maker (AMM) model to offer open and fair, and equal access to traders, and mitigates the most common trading-manipulation attacks. Additionally, it incorporates a scalable architecture based on geographical dynamic sharding, allowing for efficient resource allocation and improved performance as the market grows.
Authors:Muhammad Ali Nadeem, Bishwo Prakash Pokharel, Naresh Kshetri, Achyut Shankar, Gokarna Sharma
Abstract:
The rapid integration of Internet of Things (IoT) and interconnected systems in modern vehicles not only introduced a new era of convenience, automation, and connected vehicles but also elevated their exposure to sophisticated cyber threats. This is especially evident in US and Canada, where cyber-enabled auto theft has surged in recent years, revealing the limitations of existing security measures for connected vehicles. In response, this paper proposes $AutoGuardX$, a comprehensive cybersecurity framework designed specifically for connected vehicles. $AutoGuardX$ combines key elements from existing recognized standards for vehicle security, such as ISO/SAE 21434 and ISO 26262, with advanced technologies, including machine learning-based anomaly detection, IoT security protocols, and encrypted communication channels. The framework addresses major attack vectors like relay attacks, controller area network (CAN) bus intrusions, and vulnerabilities introduced by emerging technologies such as 5G and quantum computing. $AutoGuardX$ is extensively evaluated through security simulations across a mix of Sedans and SUVs from four major vehicle brands manufactured between 2019 and 2023. The results demonstrate the framework's adaptability, scalability, and practical effectiveness against existing and emerging threats.
Authors:Xiaoyan Zhang, Dongyang Lyu, Xiaoqi Li
Abstract:
As large language models (LLMs) expose systemic security challenges in high risk applications, including privacy leaks, bias amplification, and malicious abuse, there is an urgent need for a dynamic risk assessment and collaborative defence framework that covers their entire life cycle. This paper focuses on the security problems of large language models (LLMs) in critical application scenarios, such as the possibility of disclosure of user data, the deliberate input of harmful instructions, or the models bias. To solve these problems, we describe the design of a system for dynamic risk assessment and a hierarchical defence system that allows different levels of protection to cooperate. This paper presents a risk assessment system capable of evaluating both static and dynamic indicators simultaneously. It uses entropy weighting to calculate essential data, such as the frequency of sensitive words, whether the API call is typical, the realtime risk entropy value is significant, and the degree of context deviation. The experimental results show that the system is capable of identifying concealed attacks, such as role escape, and can perform rapid risk evaluation. The paper uses a hybrid model called BERT-CRF (Bidirectional Encoder Representation from Transformers) at the input layer to identify and filter malicious commands. The model layer uses dynamic adversarial training and differential privacy noise injection technology together. The output layer also has a neural watermarking system that can track the source of the content. In practice, the quality of this method, especially important in terms of customer service in the financial industry.
Authors:Rijha Safdar, Danyail Mateen, Syed Taha Ali, M. Umer Ashfaq, Wajahat Hussain
Abstract:
The performance of AI-based software vulnerability detection systems is often limited by their poor generalization to unknown codebases. In this research, we explore the impact of data quality and model architecture on the generalizability of vulnerability detection systems. By generalization we mean ability of high vulnerability detection performance across different C/C++ software projects not seen during training. Through a series of experiments, we demonstrate that improvements in dataset diversity and quality substantially enhance detection performance. Additionally, we compare multiple encoder-only and decoder-only models, finding that encoder based models outperform in terms of accuracy and generalization. Our model achieves 6.8% improvement in recall on the benchmark BigVul[1] dataset, also outperforming on unseen projects, hence showing enhanced generalizability. These results highlight the role of data quality and model selection in the development of robust vulnerability detection systems. Our findings suggest a direction for future systems having high cross-project effectiveness.
Authors:Krishnendu Chatterjee, Jan Matyáš KÅišťan, Stefan Schmid, Jakub Svoboda, Michelle Yeo
Abstract:
Payment channel networks (PCNs) are a promising technology that alleviates blockchain scalability by shifting the transaction load from the blockchain to the PCN. Nevertheless, the network topology has to be carefully designed to maximise the transaction throughput in PCNs. Additionally, users in PCNs also have to make optimal decisions on which transactions to forward and which to reject to prolong the lifetime of their channels. In this work, we consider an input sequence of transactions over $p$ parties. Each transaction consists of a transaction size, source, and target, and can be either accepted or rejected (entailing a cost). The goal is to design a PCN topology among the $p$ cooperating parties, along with the channel capacities, and then output a decision for each transaction in the sequence to minimise the cost of creating and augmenting channels, as well as the cost of rejecting transactions. Our main contribution is an $\mathcal{O}(p)$ approximation algorithm for the problem with $p$ parties. We further show that with some assumptions on the distribution of transactions, we can reduce the approximation ratio to $\mathcal{O}(\sqrt{p})$. We complement our theoretical analysis with an empirical study of our assumptions and approach in the context of the Lightning Network.
Authors:Xingxing Xu, Minjia Shi, Patrick Sole
Abstract:
Butson matrices are complex Hadamard matrices with entries in the complex roots of unity of given order. There is an interesting code in phase space related to this matrix (Armario et al. 2023). We study the covering radius of Butson Hadamard codes for the homogeneous metric, a metric defined uniquely, up to scaling, for a commutative ring alphabet that is Quasi Frobenius. An upper bound is derived by an orthogonal array argument. A lower bound relies on the existence of bent sequences in the sense of (Shi et al. 2022). This latter bound generalizes a bound of (Armario et al. 2025) for the Hamming metric.
Authors:Zhihao Wang, Alessandro Cornacchia, Andrea Bianco, Idilio Drago, Paolo Giaccone, Dingde Jiang, Marco Mellia
Abstract:
Traffic visibility remains a key component for management and security operations. Observing unsolicited and erroneous traffic, such as unanswered traffic or errors, is fundamental to detect misconfiguration, temporary failures or attacks. ChamaleoNet transforms any production network into a transparent monitor to let administrators collect unsolicited and erroneous traffic directed to hosts, whether offline or active, hosting a server or a client, protected by a firewall, or unused addresses. ChamaleoNet is programmed to ignore well-formed traffic and collect only erroneous packets, including those generated by misconfigured or infected internal hosts, and those sent by external actors which scan for services. Engineering such a system poses several challenges, from scalability to privacy. Leveraging the SDN paradigm, ChamaleoNet processes the traffic flowing through a campus/corporate network and focuses on erroneous packets only, lowering the pressure on the collection system while respecting privacy regulations by design. ChamaleoNet enables the seamless integration with active deceptive systems like honeypots that can impersonate unused hosts/ports/services and engage with senders. The SDN in-hardware filtering reduces the traffic to the controller by 96%, resulting in a scalable solution, which we offer as open source. Simple analytics unveil internal misconfigured and infected hosts, identify temporary failures, and enhance visibility on external radiation produced by attackers looking for vulnerable services.
Authors:Ken Huang, Yasir Mehmood, Hammad Atta, Jerry Huang, Muhammad Zeeshan Baig, Sree Bhargavi Balija
Abstract:
This paper presents a Unified Security Architecture that fortifies the Agentic Web through a Zero-Trust IAM framework. This architecture is built on a foundation of rich, verifiable agent identities using Decentralized Identifiers (DIDs) and Verifiable Credentials (VCs), with discovery managed by a protocol-agnostic Agent Name Service (ANS). Security is operationalized through a multi-layered Trust Fabric which introduces significant innovations, including Trust-Adaptive Runtime Environments (TARE), Causal Chain Auditing, and Dynamic Identity with Behavioral Attestation. By explicitly linking the LPCI threat to these enhanced architectural countermeasures within a formal security model, we propose a comprehensive and forward-looking blueprint for a secure, resilient, and trustworthy agentic ecosystem. Our formal analysis demonstrates that the proposed architecture provides provable security guarantees against LPCI attacks with bounded probability of success.
Authors:Minhao Jin, Hongyu He, Maria Apostolaki
Abstract:
Current synthetic traffic generators (SynNetGens) promise privacy but lack comprehensive guarantees or empirical validation, even as their fidelity steadily improves. We introduce the first attack-grounded benchmark for assessing the privacy of SynNetGens directly from the traffic they produce. We frame privacy as membership inference at the traffic-source level--a realistic and actionable threat for data holders. To this end, we present TraceBleed, the first attack that exploits behavioral fingerprints across flows using contrastive learning and temporal chunking, outperforming prior membership inference baselines by 172%. Our large-scale study across GAN-, diffusion-, and GPT-based SynNetGens uncovers critical insights: (i) SynNetGens leak user-level information; (ii) differential privacy either fails to stop these attacks or severely degrades fidelity; and (iii) sharing more synthetic data amplifies leakage by 59% on average. Finally, we introduce TracePatch, the first SynNetGen-agnostic defense that combines adversarial ML with SMT constraints to mitigate leakage while preserving fidelity.
Authors:Md Sazedur Rahman, Mohamed Elmahallawy, Sanjay Madria, Samuel Frimpong
Abstract:
Underground mining operations rely on distributed sensor networks to collect critical data daily, including mine temperature, toxic gas concentrations, and miner movements for hazard detection and operational decision-making. However, transmitting raw sensor data to a central server for training deep learning models introduces significant privacy risks, potentially exposing sensitive mine-specific information. Federated Learning (FL) offers a transformative solution by enabling collaborative model training while ensuring that raw data remains localized at each mine. Despite its advantages, FL in underground mining faces key challenges: (i) An attacker may compromise a mine's local model by employing techniques such as sign-flipping attacks or additive noise, leading to erroneous predictions; (ii) Low-quality (yet potentially valuable) data, caused by poor lighting conditions or sensor inaccuracies in mines may degrade the FL training process. In response, this paper proposes MineDetect, a defense FL framework that detects and isolates the attacked models while mitigating the impact of mines with low-quality data. MineDetect introduces two key innovations: (i) Detecting attacked models (maliciously manipulated) by developing a history-aware mechanism that leverages local and global averages of gradient updates; (ii) Identifying and eliminating adversarial influences from unreliable models (generated by clients with poor data quality) on the FL training process. Comprehensive simulations across diverse datasets demonstrate that MineDetect outperforms existing methods in both robustness and accuracy, even in challenging non-IID data scenarios. Its ability to counter adversarial influences while maintaining lower computational efficiency makes it a vital advancement for improving safety and operational effectiveness in underground mining.
Authors:Pierre-Francois Gimenez, Sarath Sivaprasad, Mario Fritz
Abstract:
Malware analysis involves analyzing suspicious software to detect malicious payloads. Static malware analysis, which does not require software execution, relies increasingly on machine learning techniques to achieve scalability. Although such techniques obtain very high detection accuracy, they can be easily evaded with adversarial examples where a few modifications of the sample can dupe the detector without modifying the behavior of the software. Unlike other domains, such as computer vision, creating an adversarial example of malware without altering its functionality requires specific transformations. We propose a new model architecture for certifiably robust malware detection by design. In addition, we show that every robust detector can be decomposed into a specific structure, which can be applied to learn empirically robust malware detectors, even on fragile features. Our framework ERDALT is based on this structure. We compare and validate these approaches with machine-learning-based malware detection methods, allowing for robust detection with limited reduction of detection performance.
Authors:Meghal Gupta, William He, Ryan O'Donnell, Noah G. Singer
Abstract:
A recent work of Schmidhuber et al (QIP, SODA, & Phys. Rev. X 2025) exhibited a quantum algorithm for the noisy planted $k$XOR problem running quartically faster than all known classical algorithms. In this work, we design a new classical algorithm that is quadratically faster than the best previous one, in the case of large constant $k$. Thus for such $k$, the quantum speedup of Schmidhuber et al. becomes only quadratic (though it retains a space advantage). Our algorithm, which also works in the semirandom case, combines tools from sublinear-time algorithms (essentially, the birthday paradox) and polynomial anticoncentration.
Authors:Anju Rani, Xiaoyu Ai, Aman Gupta, Ravi Singh Adhikari, Robert Malaney
Abstract:
In this work, we present an experimental deployment of a new design for combined quantum key distribution (QKD) and post-quantum cryptography (PQC). Novel to our system is the dynamic obfuscation of the QKD-PQC sequence of operations, the number of operations, and parameters related to the operations; coupled to the integration of a GPS-free quantum synchronization protocol within the QKD process. We compare the performance and overhead of our QKD-PQC system relative to a standard QKD system with one-time pad encryption, demonstrating that our design can operate in real time with little additional overhead caused by the new security features. Since our system can offer additional defensive strategies against a wide spectrum of practical attacks that undermine deployed QKD, PQC, and certain combinations of these two primitives, we suggest that our design represents one of the most secure communication systems currently available. Given the dynamic nature of its obfuscation attributes, our new system can also be adapted in the field to defeat yet-to-be-discovered practical attacks.
Authors:Alejandro Moreno R., Desale Fentaw, Samuel Palmer, Raúl Salles de Padua, Ninad Dixit, Samuel Mugel, Roman Orús, Manuel Radons, Josef Menter, Ali Abedi
Abstract:
Synthetic data generation is a key technique in modern artificial intelligence, addressing data scarcity, privacy constraints, and the need for diverse datasets in training robust models. In this work, we propose a method for generating privacy-preserving high-quality synthetic tabular data using Tensor Networks, specifically Matrix Product States (MPS). We benchmark the MPS-based generative model against state-of-the-art models such as CTGAN, VAE, and PrivBayes, focusing on both fidelity and privacy-preserving capabilities. To ensure differential privacy (DP), we integrate noise injection and gradient clipping during training, enabling privacy guarantees via Rényi Differential Privacy accounting. Across multiple metrics analyzing data fidelity and downstream machine learning task performance, our results show that MPS outperforms classical models, particularly under strict privacy constraints. This work highlights MPS as a promising tool for privacy-aware synthetic data generation. By combining the expressive power of tensor network representations with formal privacy mechanisms, the proposed approach offers an interpretable and scalable alternative for secure data sharing. Its structured design facilitates integration into sensitive domains where both data quality and confidentiality are critical.
Authors:Chi-Sheng Chen, Samuel Yen-Chi Chen
Abstract:
Time series forecasting is vital in domains where data sensitivity is paramount, such as finance and energy systems. While Differential Privacy (DP) provides theoretical guarantees to protect individual data contributions, its integration especially via DP-SGD often impairs model performance due to injected noise. In this paper, we propose Q-DPTS, a hybrid quantum-classical framework for Quantum Differentially Private Time Series Forecasting. Q-DPTS combines Variational Quantum Circuits (VQCs) with per-sample gradient clipping and Gaussian noise injection, ensuring rigorous $(ε, δ)$-differential privacy. The expressiveness of quantum models enables improved robustness against the utility loss induced by DP mechanisms. We evaluate Q-DPTS on the ETT (Electricity Transformer Temperature) dataset, a standard benchmark for long-term time series forecasting. Our approach is compared against both classical and quantum baselines, including LSTM, QASA, QRWKV, and QLSTM. Results demonstrate that Q-DPTS consistently achieves lower prediction error under the same privacy budget, indicating a favorable privacy-utility trade-off. This work presents one of the first explorations into quantum-enhanced differentially private forecasting, offering promising directions for secure and accurate time series modeling in privacy-critical scenarios.
Authors:Jairo Gudiño-Rosero, Clément Contet, Umberto Grandi, César A. Hidalgo
Abstract:
Large Language Models (LLMs) are gaining traction as a method to generate consensus statements and aggregate preferences in digital democracy experiments. Yet, LLMs may introduce critical vulnerabilities in these systems. Here, we explore the impact of prompt-injection attacks targeting consensus generating systems by introducing a four-dimensional taxonomy of attacks. We test these attacks using LLaMA 3.1 8B and Chat GPT 4.1 Nano finding the LLMs more vulnerable to criticism attacks -- attacks using disagreeable prompts -- and more effective at tilting ambiguous consensus statements. We also find evidence of more effective manipulation when using explicit imperatives and rational-sounding arguments compared to emotional language or fabricated statistics. To mitigate these vulnerabilities, we apply Direct Preference Optimization (DPO), an alignment method that fine-tunes LLMs to prefer unperturbed consensus statements. While DPO significantly improves robustness, it still offers limited protection against attacks targeting ambiguous consensus. These results advance our understanding of the vulnerability and robustness of consensus generating LLMs in digital democracy applications.
Authors:Kunlan Xiang, Haomiao Yang, Meng Hao, Haoxin Wang, Shaofeng Li, Wenbo Jiang
Abstract:
Multivariate Long-Term Time Series Forecasting (MLTSF) models are increasingly deployed in critical domains such as climate, finance, and transportation. Although a variety of powerful MLTSF models have been proposed to improve predictive performance, the robustness of MLTSF models against malicious backdoor attacks remains entirely unexplored, which is crucial to ensuring their reliable and trustworthy deployment. To address this gap, we conduct an in-depth study on backdoor attacks against MLTSF models and propose the first effective attack method named BadTime. BadTime executes a backdoor attack by poisoning training data and customizing the backdoor training process. During data poisoning, BadTime proposes a contrast-guided strategy to select the most suitable training samples for poisoning, then employs a graph attention network to identify influential variables for trigger injection. Subsequently, BadTime further localizes optimal positions for trigger injection based on lag analysis and proposes a puzzle-like trigger structure that distributes the trigger across multiple poisoned variables to jointly steer the prediction of the target variable. During backdoor training, BadTime alternately optimizes the model and triggers via proposed tailored optimization objectives. Extensive experiments show that BadTime significantly outperforms state-of-the-art (SOTA) backdoor attacks on time series forecasting by reducing MAE by over 50% on target variables and boosting stealthiness by more than 3 times.
Authors:Thilo Hagendorff, Erik Derner, Nuria Oliver
Abstract:
Jailbreaking -- bypassing built-in safety mechanisms in AI models -- has traditionally required complex technical procedures or specialized human expertise. In this study, we show that the persuasive capabilities of large reasoning models (LRMs) simplify and scale jailbreaking, converting it into an inexpensive activity accessible to non-experts. We evaluated the capabilities of four LRMs (DeepSeek-R1, Gemini 2.5 Flash, Grok 3 Mini, Qwen3 235B) to act as autonomous adversaries conducting multi-turn conversations with nine widely used target models. LRMs received instructions via a system prompt, before proceeding to planning and executing jailbreaks with no further supervision. We performed extensive experiments with a benchmark of harmful prompts composed of 70 items covering seven sensitive domains. This setup yielded an overall attack success rate across all model combinations of 97.14%. Our study reveals an alignment regression, in which LRMs can systematically erode the safety guardrails of other models, highlighting the urgent need to further align frontier models not only to resist jailbreak attempts, but also to prevent them from being co-opted into acting as jailbreak agents.
Authors:Oriol Saguillo, Vahid Ghafouri, Lucianna Kiffer, Guillermo Suarez-Tangil
Abstract:
Polymarket is a prediction market platform where users can speculate on future events by trading shares tied to specific outcomes, known as conditions. Each market is associated with a set of one or more such conditions. To ensure proper market resolution, the condition set must be exhaustive -- collectively accounting for all possible outcomes -- and mutually exclusive -- only one condition may resolve as true. Thus, the collective prices of all related outcomes should be \$1, representing a combined probability of 1 of any outcome. Despite this design, Polymarket exhibits cases where dependent assets are mispriced, allowing for purchasing (or selling) a certain outcome for less than (or more than) \$1, guaranteeing profit. This phenomenon, known as arbitrage, could enable sophisticated participants to exploit such inconsistencies.
In this paper, we conduct an empirical arbitrage analysis on Polymarket data to answer three key questions: (Q1) What conditions give rise to arbitrage (Q2) Does arbitrage actually occur on Polymarket and (Q3) Has anyone exploited these opportunities. A major challenge in analyzing arbitrage between related markets lies in the scalability of comparisons across a large number of markets and conditions, with a naive analysis requiring $O(2^{n+m})$ comparisons. To overcome this, we employ a heuristic-driven reduction strategy based on timeliness, topical similarity, and combinatorial relationships, further validated by expert input.
Our study reveals two distinct forms of arbitrage on Polymarket: Market Rebalancing Arbitrage, which occurs within a single market or condition, and Combinatorial Arbitrage, which spans across multiple markets. We use on-chain historical order book data to analyze when these types of arbitrage opportunities have existed, and when they have been executed by users. We find a realized estimate of 40 million USD of profit extracted.
Authors:Rourab Paul, Paresh Baidya, Krishnendu Guha
Abstract:
Post-Quantum Cryptographic (PQC) algorithms are mathematically secure and resistant to quantum attacks but can still leak sensitive information in hardware implementations due to natural faults or intentional fault injections. The intent fault injection in side-channel attacks reduces the reliability of crypto implementation in future generation network security procesors. In this regard, this research proposes a lightweight, efficient, recomputation-based fault detection module implemented on a Field Programmable Gate Array (FPGA) for Number Theoretic Transform (NTT). The NTT is primarily composed of memory units and the Cooley-Tukey Butterfly Unit (CT-BU), a critical and computationally intensive hardware component essential for polynomial multiplication. NTT and polynomial multiplication are fundamental building blocks in many PQC algorithms, including Kyber, NTRU, Ring-LWE, and others. In this paper, we present a fault detection method called : Recomputation with a Modular Offset (REMO) for the logic blocks of the CT-BU using Montgomery Reduction and another method called Memory Rule Checkers for the memory components used within the NTT. The proposed fault detection framework sets a new benchmark by achieving high efficiency with significant low implementation cost. It occupies only 16 slices and a single DSP block, with a power consumption of just 3mW in Artix-7 FPGA. The REMO-based detection mechanism achieves a fault coverage of 87.2% to 100%, adaptable across various word sizes, fault bit counts, and fault injection modes. Similarly, the Memory Rule Checkers demonstrate robust performance, achieving 50.7% to 100% fault detection depending on and the nature of injected faults.
Authors:Faezeh Shojaeighadikolaei, Shouhuai Xu, Keith Paarporn
Abstract:
Cyber attacks continue to be a cause of concern despite advances in cyber defense techniques. Although cyber attacks cannot be fully prevented, standard decision-making frameworks typically focus on how to prevent them from succeeding, without considering the cost of cleaning up the damages incurred by successful attacks. This motivates us to investigate a new resource allocation problem formulated in this paper: The defender must decide how to split its investment between preventive defenses, which aim to harden nodes from attacks, and reactive defenses, which aim to quickly clean up the compromised nodes. This encounters a challenge imposed by the uncertainty associated with the observation, or sensor signal, whether a node is truly compromised or not; this uncertainty is real because attack detectors are not perfect. We investigate how the quality of sensor signals impacts the defender's strategic investment in the two types of defense, and ultimately the level of security that can be achieved. In particular, we show that the optimal investment in preventive resources increases, and thus reactive resource investment decreases, with higher sensor quality. We also show that the defender's performance improvement, relative to a baseline of no sensors employed, is maximal when the attacker can only achieve low attack success probabilities.
Authors:Abdullah Al Mamun, Kyle Yates, Antsa Rakotondrafara, Mashrur Chowdhury, Ryann Cartor, Shuhong Gao
Abstract:
Intelligent Transportation Systems (ITS) fundamentally rely on vehicle-generated data for applications such as congestion monitoring and route optimization, making the preservation of user privacy a critical challenge. Homomorphic Encryption (HE) offers a promising solution by enabling computation on encrypted data without revealing underlying content. This study presents the first real-world experimental evaluation of three post-quantum secure HE schemes, i.e., Brakerski-Fan-Vercauteren (BFV), Brakerski-Gentry-Vaikuntanathan (BGV), and Cheon-Kim-Kim-Song (CKKS), for vehicular communication scenarios. Two representative privacy-preserving use cases are considered: encrypted vehicle counting and average speed aggregation. Experiments are conducted over both Wi-Fi and Ethernet to assess performance under wireless and wired vehicle-to-everything (V2X) settings. Results show that BFV and BGV are suitable for latency-tolerant applications such as intersection monitoring and regional traffic analysis, with total end-to-end latencies under 10 seconds. While CKKS experiences higher overhead, it remains viable for periodic encrypted aggregation of numerical data. The experimental results demonstrate that HE can be feasibly deployed in ITS environments under 128-bit post-quantum security, provided that scheme-specific latency constraints are considered. This reinforces its potential to serve as a foundational tool for secure and privacy-preserving V2X data processing.
Authors:Malvika Jadhav, Wenxuan Bao, Vincent Bindschaedler
Abstract:
Stalkerware is a serious threat to individuals' privacy that is receiving increased attention from the security and privacy research communities. Existing works have largely focused on studying leading stalkerware apps, dual-purpose apps, monetization of stalkerware, or the experience of survivors. However, there remains a need to understand potential defenses beyond the detection-and-removal approach, which may not necessarily be effective in the context of stalkerware.
In this paper, we perform a systematic analysis of a large corpus of recent Android stalkerware apps. We combine multiple analysis techniques to quantify stalkerware behaviors and capabilities and how these evolved over time. Our primary goal is understanding: how (and whether) recent Android platform changes -- largely designed to improve user privacy -- have thwarted stalkerware functionality; how stalkerware may have adapted as a result; and what we may conclude about potential defenses. Our investigation reveals new insights into tactics used by stalkerware and may inspire alternative defense strategies.
Authors:Shida Wang, Chaohu Liu, Yubo Wang, Linli Xu
Abstract:
Large language models represent significant investments in computation, data, and engineering expertise, making them extraordinarily valuable intellectual assets. Nevertheless, these AI assets remain vulnerable to unauthorized redistribution and commercial exploitation through fine-tuning or black-box deployment. Current fingerprinting approaches face a fundamental trade-off: intrinsic methods require full parameter access, while backdoor-based techniques employ statistically anomalous triggers easily detected and filtered by adversaries. To address these limitations, we introduce FPEdit, a novel knowledge-editing framework that injects semantically coherent natural language fingerprints by modifying a sparse subset of model weights. This ensures stealthy and precise ownership encoding without degrading the core functionality. Extensive experiments show that FPEdit achieves $95$-$100\%$ fingerprint retention under both full-parameter fine-tuning and parameter-efficient adaptation, while preserving performance on 24 downstream benchmarks. Moreover, FPEdit remains robust under quantization, pruning, and stochastic decoding, and can embed 10 fingerprint pairs into LLaMA2-7B in under 10 minutes using less than 32 GB of GPU memory, a $70\%$ reduction in resource requirements compared to existing techniques. These advances establish FPEdit as the first fingerprinting approach to simultaneously achieve robustness against adaptation, resistance to detection, and preservation of model utility, providing a minimally invasive solution for reliable provenance verification of large language models in adversarial deployment scenarios.
Authors:Hammad Atta, Muhammad Zeeshan Baig, Yasir Mehmood, Nadeem Shahzad, Ken Huang, Muhammad Aziz Ul Haq, Muhammad Awais, Kamal Ahmed, Anthony Green
Abstract:
The rapid advancement and widespread adoption of generative artificial intelligence (AI) pose significant threats to the integrity of personal identity, including digital cloning, sophisticated impersonation, and the unauthorized monetization of identity-related data. Mitigating these risks necessitates the development of robust AI-generated content detection systems, enhanced legal frameworks, and ethical guidelines. This paper introduces the Digital Identity Rights Framework (DIRF), a structured security and governance model designed to protect behavioral, biometric, and personality-based digital likeness attributes to address this critical need. Structured across nine domains and 63 controls, DIRF integrates legal, technical, and hybrid enforcement mechanisms to secure digital identity consent, traceability, and monetization. We present the architectural foundations, enforcement strategies, and key use cases supporting the need for a unified framework. This work aims to inform platform builders, legal entities, and regulators about the essential controls needed to enforce identity rights in AI-driven systems.
Authors:Dong Chen, Tong Yang, Feipeng Zhai, Pengpeng Ouyang, Qidong Liu, Yafei Li, Chong Fu, Mingliang Xu
Abstract:
The increasing adoption of Cloud-based Large Language Models (CLLMs) has raised significant concerns regarding data privacy during user interactions. While existing approaches primarily focus on encrypting sensitive information, they often overlook the logical structure of user inputs. This oversight can lead to reduced data utility and degraded performance of CLLMs. To address these limitations and enable secure yet effective interactions, we propose Semantic Encryption (SE)-a plug-and-play framework designed to preserve both privacy and utility. SE consists of two key components: Semantic Encoding and Semantic Decoding. In the encoding phase, a lightweight local model transforms the original user input into an alternative semantic context that maintains the original intent and logical structure while obfuscating sensitive information. This transformed input is then processed by the CLLM, which generates a response based on the transformed semantic context. To maintain a seamless user experience, the decoding phase will reconstruct the CLLM's response back into the original semantic context by referencing the locally stored user input. Extensive experimental evaluations demonstrate that SE effectively protects data privacy without compromising data utility or user experience, offering a practical solution for secure interaction with CLLMs. Particularly, the proposed SE demonstrates a significant improvement over the state-of-the-art InferDPT, surpassing it across various evaluated metrics and datasets.
Authors:Wenwen Zhou, Dongyang Lyu, Xiaoqi Li
Abstract:
As an emerging service framework built by combining cryptography, P2P network, consensus mechanism and innovative contract technology, blockchain has been widely used in digital finance, data sharing, message traceability and electronic evidence preservation because of its decentralised, non-tamperable and transaction traceability. However, with the complex and changeable application scenarios of blockchain technology and the continuous enhancement of blockchain attack technology, the security of the blockchain system has been seriously threatened, dramatically affecting the development and application of blockchain technology. This paper aims to analyse the attacks on blockchain from the perspective of cryptography. Firstly, from the cryptography technology in the blockchain, the principle of hash functions, digital signatures, and other technologies, as well as their role in the blockchain, are introduced. Then, based on the six-layer architecture of the blockchain, the attacks on the data layer, the network layer, the consensus layer, the contract layer, the incentive layer and the application layer are analysed, and the methods to mitigate or resist the attacks are proposed. Secondly, the attack principles of 51% attack, Double-Spending attack, Reentrancy attack, Replay attack, Sybil attack and Timestamp Tampering attack were analysed, and the mitigation or defence solutions for these six attacks were designed. Finally, the core problems to be solved in blockchain technology are summarised, and the future development of blockchain security technology is projected.
Authors:Aaron Xuxiang Tian, Ruofan Zhang, Janet Tang, Ji Wang, Tianyu Shi, Jiaxin Wen
Abstract:
Computer-using agents (CUAs), which can autonomously control computers to perform multi-step actions, might pose significant safety risks if misused. However, existing benchmarks mainly evaluate LMs in chatbots or simple tool use. To more comprehensively evaluate CUAs' misuse risks, we introduce a new benchmark: CUAHarm. CUAHarm consists of 104 expert-written realistic misuse risks, such as disabling firewalls, leaking data, or installing backdoors. We provide a sandbox with rule-based verifiable rewards to measure CUAs' success rates in executing these tasks (e.g., whether the firewall is indeed disabled), beyond refusal rates. We evaluate frontier LMs including GPT-5, Claude 4 Sonnet, Gemini 2.5 Pro, Llama-3.3-70B, and Mistral Large 2. Even without jailbreaking prompts, these frontier LMs comply with executing these malicious tasks at a high success rate (e.g., 90\% for Gemini 2.5 Pro). Furthermore, while newer models are safer in previous safety benchmarks, their misuse risks as CUAs become even higher, e.g., Gemini 2.5 Pro is riskier than Gemini 1.5 Pro. Additionally, while these LMs are robust to common malicious prompts (e.g., creating a bomb) when acting as chatbots, they could still act unsafely as CUAs. We further evaluate a leading agentic framework (UI-TARS-1.5) and find that while it improves performance, it also amplifies misuse risks. To mitigate the misuse risks of CUAs, we explore using LMs to monitor CUAs' actions. We find monitoring unsafe computer-using actions is significantly harder than monitoring conventional unsafe chatbot responses. While monitoring chain-of-thoughts leads to modest gains, the average monitoring accuracy is only 77\%. A hierarchical summarization strategy improves performance by up to 13\%, a promising direction though monitoring remains unreliable. The benchmark will be released publicly to facilitate further research on mitigating these risks.
Authors:Ruopengyu Xu, Chenglian Liu
Abstract:
This paper explores a theoretical approach to enhance RSA's resistance against quantum attacks by optimizing prime selection through Rényi entropy constraints. We develop a framework where primes are generated with controlled proximity ($|p-q| < γ\sqrt{pq}$) to minimize the collision entropy $\mathscr{H}_2$ of the quantum period-finding operator.
The main contributions include: (1) establishing a connection between prime distribution properties and quantum attack complexity via Maynard's prime gap theorem, (2) providing a constructive proof for prime existence under entropy constraints, and (3) demonstrating security reduction to ideal lattice problems under the quantum random oracle model.
Theoretical analysis suggests that for $k$-bit moduli with $γ< k^{-1/2+ε}$, Shor's algorithm requires $Ω(γ^{-1}k^{3/2})$ quantum operations while maintaining classical security equivalent to standard RSA. Key Enhancements: (1) Prime existence proof via Maynard's theorem (Theorem 3.1), (2) Ideal lattice embedding for SVP reduction (Theorem 5.3), (3) Quantum Fano bound for information-theoretic analysis (Theorem 6.3), (4) Multi-prime RSA extension (Section 7.3).
Authors:Linmin Pei, Granger Sutton, Michael Rutherford, Ulrike Wagner, Tracy Nolan, Kirk Smith, Phillip Farmer, Peter Gu, Ambar Rana, Kailing Chen, Thomas Ferleman, Brian Park, Ye Wu, Jordan Kojouharov, Gargi Singh, Jon Lemon, Tyler Willis, Milos Vukadinovic, Grant Duffy, Bryan He, David Ouyang, Marco Pereanez, Daniel Samber, Derek A. Smith, Christopher Cannistraci, Zahi Fayad, David S. Mendelson, Michele Bufano, Elmar Kotter, Hamideh Haghiri, Rajesh Baidya, Stefan Dvoretskii, Klaus H. Maier-Hein, Marco Nolden, Christopher Ablett, Silvia Siggillino, Sandeep Kaushik, Hongzhu Jiang, Sihan Xie, Zhiyu Wan, Alex Michie, Simon J Doran, Angeline Aurelia Waly, Felix A. Nathaniel Liang, Humam Arshad Mustagfirin, Michelle Grace Felicia, Kuo Po Chih, Rahul Krish, Ghulam Rasool, Nidhal Bouaynaya, Nikolas Koutsoubis, Kyle Naddeo, Kartik Pandit, Tony O'Sullivan, Raj Krish, Qinyan Pan, Scott Gustafson, Benjamin Kopchick, Laura Opsahl-Ong, Andrea Olvera-Morales, Jonathan Pinney, Kathryn Johnson, Theresa Do, Juergen Klenk, Maria Diaz, Arti Singh, Rong Chai, David A. Clunie, Fred Prior, Keyvan Farahani
Abstract:
The de-identification (deID) of protected health information (PHI) and personally identifiable information (PII) is a fundamental requirement for sharing medical images, particularly through public repositories, to ensure compliance with patient privacy laws. In addition, preservation of non-PHI metadata to inform and enable downstream development of imaging artificial intelligence (AI) is an important consideration in biomedical research. The goal of MIDI-B was to provide a standardized platform for benchmarking of DICOM image deID tools based on a set of rules conformant to the HIPAA Safe Harbor regulation, the DICOM Attribute Confidentiality Profiles, and best practices in preservation of research-critical metadata, as defined by The Cancer Imaging Archive (TCIA). The challenge employed a large, diverse, multi-center, and multi-modality set of real de-identified radiology images with synthetic PHI/PII inserted.
The MIDI-B Challenge consisted of three phases: training, validation, and test. Eighty individuals registered for the challenge. In the training phase, we encouraged participants to tune their algorithms using their in-house or public data. The validation and test phases utilized the DICOM images containing synthetic identifiers (of 216 and 322 subjects, respectively). Ten teams successfully completed the test phase of the challenge. To measure success of a rule-based approach to image deID, scores were computed as the percentage of correct actions from the total number of required actions. The scores ranged from 97.91% to 99.93%. Participants employed a variety of open-source and proprietary tools with customized configurations, large language models, and optical character recognition (OCR). In this paper we provide a comprehensive report on the MIDI-B Challenge's design, implementation, results, and lessons learned.
Authors:Ethan Wilson, Vincent Bindschaedler, Sophie Jörg, Sean Sheikholeslam, Kevin Butler, Eakta Jain
Abstract:
Photorealistic 3D avatar generation has rapidly improved in recent years, and realistic avatars that match a user's true appearance are more feasible in Mixed Reality (MR) than ever before. Yet, there are known risks to sharing one's likeness online, and photorealistic MR avatars could exacerbate these risks. If user likenesses were to be shared broadly, there are risks for cyber abuse or targeted fraud based on user appearances. We propose an alternate avatar rendering scheme for broader social MR -- synthesizing realistic avatars that preserve a user's demographic identity while being distinct enough from the individual user to protect facial biometric information. We introduce a methodology for privatizing appearance by isolating identity within the feature space of identity-encoding generative models. We develop two algorithms that then obfuscate identity: \epsmethod{} provides differential privacy guarantees and \thetamethod{} provides fine-grained control for the level of identity offset. These methods are shown to successfully generate de-identified virtual avatars across multiple generative architectures in 2D and 3D. With these techniques, it is possible to protect user privacy while largely preserving attributes related to sense of self. Employing these techniques in public settings could enable the use of photorealistic avatars broadly in MR, maintaining high realism and immersion without privacy risk.
Authors:Seiji Sato, Tetsushi Ohki, Masakatsu Nishigaki
Abstract:
Anticipating emerging attack methodologies is crucial for proactive cybersecurity. Recent advances in Large Language Models (LLMs) have enabled the automated generation of phishing messages and accelerated research into potential attack techniques. However, predicting future threats remains challenging due to reliance on existing training data. To address this limitation, we propose a novel framework that integrates LLM-based phishing attack simulations with a genetic algorithm in a psychological context, enabling phishing strategies to evolve dynamically through adversarial interactions with simulated victims. Through simulations using Llama 3.1, we demonstrate that (1) self-evolving phishing strategies employ increasingly sophisticated psychological manipulation techniques, surpassing naive LLM-generated attacks, (2) variations in a victim's prior knowledge significantly influence the evolution of attack strategies, and (3) adversarial interactions between evolving attacks and adaptive defenses create a cat-and-mouse dynamic, revealing an inherent asymmetry in cybersecurity -- attackers continuously refine their methods, whereas defenders struggle to comprehensively counter all evolving threats. Our approach provides a scalable, cost-effective method for analyzing the evolution of phishing strategies and defenses, offering insights into future social engineering threats and underscoring the necessity of proactive cybersecurity measures.
Authors:Alper Ãakan, Vipul Goyal
Abstract:
A quantum copy-protection scheme (Aaronson, CCC 2009) encodes a functionality into a quantum state such that given this state, no efficient adversary can create two (possibly entangled) quantum states that are both capable of running the functionality. There has been a recent line of works on constructing provably-secure copy-protection schemes for general classes of schemes in the plain model, and most recently the recent work of Ãakan and Goyal (IACR Eprint, 2025) showed how to copy-protect all cryptographically puncturable schemes with pseudorandom puncturing points. In this work, we show how to copy-protect even a larger class of schemes. We define a class of cryptographic schemes called malleable-puncturable schemes where the only requirement is that one can create a circuit that is capable of answering inputs at points that are unrelated to the challenge in the security game but does not help the adversary answer inputs related to the challenge. This is a flexible generalization of puncturable schemes, and can capture a wide range of primitives that was not known how to copy-protect prior to our work. Going further, we show that our scheme is secure against arbitrary high min-entropy challenge distributions whereas previous work has only considered schemes that are punctured at pseudorandom points.
Authors:Sivana Hamer, Jacob Bowen, Md Nazmul Haque, Chris Madden, Laurie Williams
Abstract:
The MITRE Adversarial Tactics, Techniques and Common Knowledge (MITRE ATT&CK) Attack Technique to Proactive Software Supply Chain Risk Management Framework (P-SSCRM) Task mapping described in this document helps software organizations to determine how different tasks mitigate the attack techniques of software supply chain attacks. The mapping was created through four independent strategies to find agreed-upon mappings. Because each P-SSCRM task is mapped to one or more tasks from the 10 frameworks, the mapping we provide is also a mapping between MITRE ATT&CK and other prominent government and industry frameworks.
Authors:Nowfel Mashnoor, Mohammad Akyash, Hadi Kamali, Kimia Azar
Abstract:
Achieving timing closure and design-specific optimizations in FPGA-targeted High-Level Synthesis (HLS) remains a significant challenge due to the complex interaction between architectural constraints, resource utilization, and the absence of automated support for platform-specific pragmas. In this work, we propose TimelyHLS, a novel framework integrating Large Language Models (LLMs) with Retrieval-Augmented Generation (RAG) to automatically generate and iteratively refine HLS code optimized for FPGA-specific timing and performance requirements. TimelyHLS is driven by a structured architectural knowledge base containing FPGA-specific features, synthesis directives, and pragma templates. Given a kernel, TimelyHLS generates HLS code annotated with both timing-critical and design-specific pragmas. The synthesized RTL is then evaluated using commercial toolchains, and simulation correctness is verified against reference outputs via custom testbenches. TimelyHLS iteratively incorporates synthesis logs and performance reports into the LLM engine for refinement in the presence of functional discrepancies. Experimental results across 10 FPGA architectures and diverse benchmarks show that TimelyHLS reduces the need for manual tuning by up to 70%, while achieving up to 4x latency speedup (e.g., 3.85x for Matrix Multiplication, 3.7x for Bitonic Sort) and over 50% area savings in certain cases (e.g., 57% FF reduction in Viterbi). TimelyHLS consistently achieves timing closure and functional correctness across platforms, highlighting the effectiveness of LLM-driven, architecture-aware synthesis in automating FPGA design.
Authors:Tian Dong, Yan Meng, Shaofeng Li, Guoxing Chen, Zhen Liu, Haojin Zhu
Abstract:
Large Language Models (LLMs) are increasingly integrated into daily routines, yet they raise significant privacy and safety concerns. Recent research proposes collaborative inference, which outsources the early-layer inference to ensure data locality, and introduces model safety auditing based on inner neuron patterns. Both techniques expose the LLM's Internal States (ISs), which are traditionally considered irreversible to inputs due to optimization challenges and the highly abstract representations in deep layers. In this work, we challenge this assumption by proposing four inversion attacks that significantly improve the semantic similarity and token matching rate of inverted inputs. Specifically, we first develop two white-box optimization-based attacks tailored for low-depth and high-depth ISs. These attacks avoid local minima convergence, a limitation observed in prior work, through a two-phase inversion process. Then, we extend our optimization attack under more practical black-box weight access by leveraging the transferability between the source and the derived LLMs. Additionally, we introduce a generation-based attack that treats inversion as a translation task, employing an inversion model to reconstruct inputs. Extensive evaluation of short and long prompts from medical consulting and coding assistance datasets and 6 LLMs validates the effectiveness of our inversion attacks. Notably, a 4,112-token long medical consulting prompt can be nearly perfectly inverted with 86.88 F1 token matching from the middle layer of Llama-3 model. Finally, we evaluate four practical defenses that we found cannot perfectly prevent ISs inversion and draw conclusions for future mitigation design.
Authors:Wenhao Li, Selvakumar Manickam, Yung-wey Chong, Shankar Karuppayah
Abstract:
Voice phishing (vishing) remains a persistent threat in cybersecurity, exploiting human trust through persuasive speech. While machine learning (ML)-based classifiers have shown promise in detecting malicious call transcripts, they remain vulnerable to adversarial manipulations that preserve semantic content. In this study, we explore a novel attack vector where large language models (LLMs) are leveraged to generate adversarial vishing transcripts that evade detection while maintaining deceptive intent. We construct a systematic attack pipeline that employs prompt engineering and semantic obfuscation to transform real-world vishing scripts using four commercial LLMs. The generated transcripts are evaluated against multiple ML classifiers trained on a real-world Korean vishing dataset (KorCCViD) with statistical testing. Our experiments reveal that LLM-generated transcripts are both practically and statistically effective against ML-based classifiers. In particular, transcripts crafted by GPT-4o significantly reduce classifier accuracy (by up to 30.96%) while maintaining high semantic similarity, as measured by BERTScore. Moreover, these attacks are both time-efficient and cost-effective, with average generation times under 9 seconds and negligible financial cost per query. The results underscore the pressing need for more resilient vishing detection frameworks and highlight the imperative for LLM providers to enforce stronger safeguards against prompt misuse in adversarial social engineering contexts.
Authors:Wenhao Li, Selvakumar Manickam, Yung-wey Chong, Shankar Karuppayah
Abstract:
Phishing websites remain a major cybersecurity threat, yet existing methods primarily focus on detection, while the recognition of underlying malicious intentions remains largely unexplored. To address this gap, we propose PhishIntentionLLM, a multi-agent retrieval-augmented generation (RAG) framework that uncovers phishing intentions from website screenshots. Leveraging the visual-language capabilities of large language models (LLMs), our framework identifies four key phishing objectives: Credential Theft, Financial Fraud, Malware Distribution, and Personal Information Harvesting. We construct and release the first phishing intention ground truth dataset (~2K samples) and evaluate the framework using four commercial LLMs. Experimental results show that PhishIntentionLLM achieves a micro-precision of 0.7895 with GPT-4o and significantly outperforms the single-agent baseline with a ~95% improvement in micro-precision. Compared to the previous work, it achieves 0.8545 precision for credential theft, marking a ~4% improvement. Additionally, we generate a larger dataset of ~9K samples for large-scale phishing intention profiling across sectors. This work provides a scalable and interpretable solution for intention-aware phishing analysis.
Authors:Zeeshan Kaleem, Misha Urooj Khan, Ahmad Suleman, Waqas Khalid, Kai-Kit Wong, Chau Yuen
Abstract:
Recently, low-altitude wireless networks (LAWNs) have emerged as a critical backbone for supporting the low-altitude economy, particularly with the densification of unmanned aerial vehicles (UAVs) and high-altitude platforms (HAPs). To meet growing data demands, some LAWN deployments incorporate free-space optical (FSO) links, which offer exceptional bandwidth and beam directivity. However, without strong security measures in place, both conventional radio frequency channels and FSO beams remain vulnerable to interception and spoofing and FSO in particular can suffer from turbulence, misalignment, and weather-related attenuation. To address these challenges in the quantum era, a quantum-secure architecture called Quantum Skyshield is proposed to enable reliable communication between the base transceiver station (BTS) and LAWN. The proposed design integrates BB84 quantum key distribution (QKD) with post-quantum authentication mechanisms. Simulation results confirm the reliable generation of a 128-bit symmetric key when the quantum bit error rate (QBER) remains below the threshold of 11%. Authentication is enforced using Lamport one-time signatures and hash-based message authentication codes (HMAC) to ensure message integrity. A Grover-inspired threat detection mechanism identifies anomalies with up to 89% probability in a single iteration, enabling real-time trust evaluation. Lastly, future research challenges have also been identified and discussed to guide further development in this area.
Authors:Kutub Uddin, Awais Khan, Muhammad Umar Farooq, Khalid Malik
Abstract:
Audio plays a crucial role in applications like speaker verification, voice-enabled smart devices, and audio conferencing. However, audio manipulations, such as deepfakes, pose significant risks by enabling the spread of misinformation. Our empirical analysis reveals that existing methods for detecting deepfake audio are often vulnerable to anti-forensic (AF) attacks, particularly those attacked using generative adversarial networks. In this article, we propose a novel collaborative learning method called SHIELD to defend against generative AF attacks. To expose AF signatures, we integrate an auxiliary generative model, called the defense (DF) generative model, which facilitates collaborative learning by combining input and output. Furthermore, we design a triplet model to capture correlations for real and AF attacked audios with real-generated and attacked-generated audios using auxiliary generative models. The proposed SHIELD strengthens the defense against generative AF attacks and achieves robust performance across various generative models. The proposed AF significantly reduces the average detection accuracy from 95.49% to 59.77% for ASVspoof2019, from 99.44% to 38.45% for In-the-Wild, and from 98.41% to 51.18% for HalfTruth for three different generative models. The proposed SHIELD mechanism is robust against AF attacks and achieves an average accuracy of 98.13%, 98.58%, and 99.57% in match, and 98.78%, 98.62%, and 98.85% in mismatch settings for the ASVspoof2019, In-the-Wild, and HalfTruth datasets, respectively.
Authors:Jose Luis Castanon Remy, Caleb Chang, Ekzhin Ear, Shouhuai Xu
Abstract:
Cyber threats against space infrastructures, including satellites and systems on the ground, have not been adequately understood. Testbeds are important to deepen our understanding and validate space cybersecurity studies. The state of the art is that there are very few studies on building testbeds, and there are few characterizations of testbeds. In this paper, we propose a framework for characterizing the fidelity of space cybersecurity testbeds. The framework includes 7 attributes for characterizing the system models, threat models, and defenses that can be accommodated by a testbed. We use the framework to guide us in building and characterizing a concrete testbed we have implemented, which includes space, ground, user, and link segments. In particular, we show how the testbed can accommodate some space cyber attack scenarios that have occurred in the real world, and discuss future research directions.
Authors:Jugal Gajjar, Kamalasankari Subramaniakuppusamy, Noha El Kachach
Abstract:
The growing complexity of cyber threats and the limitations of traditional vulnerability detection tools necessitate novel approaches for securing software systems. We introduce MalCodeAI, a language-agnostic, multi-stage AI pipeline for autonomous code security analysis and remediation. MalCodeAI combines code decomposition and semantic reasoning using fine-tuned Qwen2.5-Coder-3B-Instruct models, optimized through Low-Rank Adaptation (LoRA) within the MLX framework, and delivers scalable, accurate results across 14 programming languages. In Phase 1, the model achieved a validation loss as low as 0.397 for functional decomposition and summarization of code segments after 200 iterations, 6 trainable layers, and a learning rate of 2 x 10^(-5). In Phase 2, for vulnerability detection and remediation, it achieved a best validation loss of 0.199 using the same number of iterations and trainable layers but with an increased learning rate of 4 x 10^(-5), effectively identifying security flaws and suggesting actionable fixes. MalCodeAI supports red-hat-style exploit tracing, CVSS-based risk scoring, and zero-shot generalization to detect complex, zero-day vulnerabilities. In a qualitative evaluation involving 15 developers, the system received high scores in usefulness (mean 8.06/10), interpretability (mean 7.40/10), and readability of outputs (mean 7.53/10), confirming its practical value in real-world development workflows. This work marks a significant advancement toward intelligent, explainable, and developer-centric software security solutions.
Authors:Hammad Atta, Ken Huang, Manish Bhatt, Kamal Ahmed, Muhammad Aziz Ul Haq, Yasir Mehmood
Abstract:
The integration of large language models (LLMs) into enterprise systems has introduced a new class of covert security vulnerabilities, particularly within logic execution layers and persistent memory contexts. This paper introduces Logic-layer Prompt Control Injection (LPCI), a novel category of attacks that embeds encoded, delayed, and conditionally triggered payloads within memory, vector stores, or tool outputs. These payloads can bypass conventional input filters and trigger unauthorised behaviour across sessions.
Authors:Hanyang Guo, Xiaoheng Xie, Hong-Ning Dai, Peng Di, Yu Zhang, Bishenghui Tao, Zibin Zheng
Abstract:
Automated Program Repair (APR) is essential for ensuring software reliability and quality while enhancing efficiency and reducing developers' workload. Although rule-based and learning-based APR methods have demonstrated their effectiveness, their performance was constrained by the defect type of repair, the quality of training data, and the size of model parameters. Recently, Large Language Models (LLMs) combined with Retrieval-Augmented-Generation (RAG) have been increasingly adopted in APR tasks. However, current code LLMs and RAG designs neither fully address code repair tasks nor consider code-specific features. To overcome these limitations, we propose SelRepair, a novel APR approach with integration of a fine-tuned LLM with a newly-designed dual RAG module. This approach uses a bug-fix pair dataset for fine-tuning and incorporates semantic and syntactic/structural similarity information through an RAG selection gate. This design ensures relevant information is retrieved efficiently, thereby reducing token length and inference time. Evaluations on Java datasets show SelRepair outperforms other APR methods, achieving 26.29% and 17.64% in terms of exact match (EM) on different datasets while reducing inference time by at least 6.42% with controlled input lengths.
Authors:Javier Blanco-Romero, Pedro Otero GarcÃa, Daniel Sobral-Blanco, Florina Almenares Mendoza, Ana Fernández Vilas, Manuel Fernández-Veiga
Abstract:
Quantum Key Distribution (QKD) offers information-theoretic security against quantum computing threats, but integrating QKD into existing security protocols remains an unsolved challenge due to fundamental mismatches between pre-distributed quantum keys and computational key exchange paradigms. This paper presents the first systematic comparison of sequential versus parallel hybrid QKD-PQC key establishment strategies for IPsec, revealing fundamental protocol design principles that extend beyond specific implementations. We introduce two novel approaches for incorporating QKD into Internet Key Exchange version 2 (IKEv2) with support for both ETSI GS QKD 004 stateful and ETSI GS QKD 014 stateless API specifications: (1) a pure QKD approach that replaces computational key derivation with identifier-based quantum key coordination, and (2) a unified QKD-KEM abstraction that enables parallel composition of quantum and post-quantum cryptographic methods within existing protocol frameworks. Our key insight is that parallel hybrid approaches eliminate the multiplicative latency penalties inherent in sequential methods mandated by RFC 9370, achieving significant performance improvements under realistic network conditions. Performance evaluation using a Docker-based testing framework with IDQuantique QKD hardware demonstrates that the parallel hybrid approach significantly outperforms sequential methods under network latency conditions, while pure QKD achieves minimal bandwidth overhead through identifier-based key coordination. Our implementations provide practical quantum-enhanced IPsec solutions suitable for critical infrastructure deployments requiring defense-in-depth security.
Authors:Yulin Teng, Keshuang Han, Pinchang Zhang, Xiaohong Jiang, Yulong Shen, Fu Xiao
Abstract:
Beam alignment (BA) is a crucial process in millimeter-wave (mmWave) communications, enabling precise directional transmission and efficient link establishment. However, due to characteristics like omnidirectional exposure and the broadcast nature of the BA phase, it is particularly vulnerable to eavesdropping and identity impersonation attacks. To this end, this paper proposes a novel secure framework named CovertAuth, designed to enhance the security of the BA phase against such attacks. In particular, to combat eavesdropping attacks, the closed-form expressions of successful BA probability and covert transmission rate are first derived. Then, a covert communication problem aimed at jointly optimizing beam training budget and transmission power is formulated to maximize covert communication rate, subject to the covertness requirement. An alternating optimization algorithm combined with successive convex approximation is employed to iteratively achieve optimal results. To combat impersonation attacks, the mutual coupling effect of antenna array impairments is explored as a device feature to design a weighted-sum energy detector based physical layer authentication scheme. Moreover, theoretical models for authentication metrics like detection and false alarm probabilities are also provided to conduct performance analysis. Based on these models, an optimization problem is constructed to determine the optimal weight value that maximizes authentication accuracy. Finally, simulation results demonstrate that CovertAuth presents improved detection accuracy under the same covertness requirement compared to existing works.
Authors:Chris S. Lin, Joyce Qu, Gururaj Saileshwar
Abstract:
Rowhammer is a read disturbance vulnerability in modern DRAM that causes bit-flips, compromising security and reliability. While extensively studied on Intel and AMD CPUs with DDR and LPDDR memories, its impact on GPUs using GDDR memories, critical for emerging machine learning applications, remains unexplored. Rowhammer attacks on GPUs face unique challenges: (1) proprietary mapping of physical memory to GDDR banks and rows, (2) high memory latency and faster refresh rates that hinder effective hammering, and (3) proprietary mitigations in GDDR memories, difficult to reverse-engineer without FPGA-based test platforms. We introduce GPUHammer, the first Rowhammer attack on NVIDIA GPUs with GDDR6 DRAM. GPUHammer proposes novel techniques to reverse-engineer GDDR DRAM row mappings, and employs GPU-specific memory access optimizations to amplify hammering intensity and bypass mitigations. Thus, we demonstrate the first successful Rowhammer attack on a discrete GPU, injecting up to 8 bit-flips across 4 DRAM banks on an NVIDIA A6000 with GDDR6 memory. We also show how an attacker can use these to tamper with ML models, causing significant accuracy drops (up to 80%).
Authors:Giovanni Gambigliani Zoccoli, Filip Valgimigli, Dario Stabili, Mirco Marchetti
Abstract:
This paper presents RADAR, a tracking algorithm for vehicles participating in Cooperative Intelligent Transportation Systems (C-ITS) that exploits multiple radio signals emitted by a modern vehicle to break privacy-preserving pseudonym schemes deployed in VANETs. This study shows that by combining Dedicated Short Range Communication (DSRC) and Wi-Fi probe request messages broadcast by the vehicle, it is possible to improve tracking over standard de-anonymization approaches that only leverage DSRC, especially in realistic scenarios where the attacker does not have full coverage of the entire vehicle path. The experimental evaluation compares three different metrics for pseudonym and Wi-Fi probe identifier association (Count, Statistical RSSI, and Pearson RSSI), demonstrating that the Pearson RSSI metric is better at tracking vehicles under pseudonym-changing schemes in all scenarios and against previous works. As an additional contribution to the state-of-the-art, we publicly release all implementations and simulation scenarios used in this work.
Authors:Mohammad F. Al-Hammouri, Yazan Otoum, Rasha Atwa, Amiya Nayak
Abstract:
This paper presents a novel approach to intrusion detection by integrating traditional signature-based methods with the contextual understanding capabilities of the GPT-2 Large Language Model (LLM). As cyber threats become increasingly sophisticated, particularly in distributed, heterogeneous, and resource-constrained environments such as those enabled by the Internet of Things (IoT), the need for dynamic and adaptive Intrusion Detection Systems (IDSs) becomes increasingly urgent. While traditional methods remain effective for detecting known threats, they often fail to recognize new and evolving attack patterns. In contrast, GPT-2 excels at processing unstructured data and identifying complex semantic relationships, making it well-suited to uncovering subtle, zero-day attack vectors. We propose a hybrid IDS framework that merges the robustness of signature-based techniques with the adaptability of GPT-2-driven semantic analysis. Experimental evaluations on a representative intrusion dataset demonstrate that our model enhances detection accuracy by 6.3%, reduces false positives by 9.0%, and maintains near real-time responsiveness. These results affirm the potential of language model integration to build intelligent, scalable, and resilient cybersecurity defences suited for modern connected environments.
Authors:Haitham S. Al-Sinani, Chris J. Mitchell
Abstract:
Ethical hacking today relies on highly skilled practitioners executing complex sequences of commands, which is inherently time-consuming, difficult to scale, and prone to human error. To help mitigate these limitations, we previously introduced 'PenTest++', an AI-augmented system combining automation with generative AI supporting ethical hacking workflows. However, a key limitation of PenTest++ was its lack of support for privilege escalation, a crucial element of ethical hacking. In this paper we present 'PenTest2.0', a substantial evolution of PenTest++ supporting automated privilege escalation driven entirely by Large Language Model reasoning. It also incorporates several significant enhancements: 'Retrieval-Augmented Generation', including both one-line and offline modes; 'Chain-of-Thought' prompting for intermediate reasoning; persistent 'PenTest Task Trees' to track goal progression across turns; and the optional integration of human-authored hints. We describe how it operates, present a proof-of-concept prototype, and discuss its benefits and limitations. We also describe application of the system to a controlled Linux target, showing it can carry out multi-turn, adaptive privilege escalation. We explain the rationale behind its core design choices, and provide comprehensive testing results and cost analysis. Our findings indicate that 'PenTest2.0' represents a meaningful step toward practical, scalable, AI-automated penetration testing, whilst highlighting the shortcomings of generative AI systems, particularly their sensitivity to prompt structure, execution context, and semantic drift, reinforcing the need for further research and refinement in this emerging space.
Keywords: AI, Ethical Hacking, Privilege Escalation, GenAI, ChatGPT, LLM (Large Language Model), HITL (Human-in-the-Loop)
Authors:Sarthak Choudhary, Divyam Anshumaan, Nils Palumbo, Somesh Jha
Abstract:
LLM-integrated applications and agents are vulnerable to prompt injection attacks, in which adversaries embed malicious instructions within seemingly benign user inputs to manipulate the LLM's intended behavior. Recent defenses based on $\textit{known-answer detection}$ (KAD) have achieved near-perfect performance by using an LLM to classify inputs as clean or contaminated. In this work, we formally characterize the KAD framework and uncover a structural vulnerability in its design that invalidates its core security premise. We design a methodical adaptive attack, $\textit{DataFlip}$, to exploit this fundamental weakness. It consistently evades KAD defenses with detection rates as low as $1.5\%$ while reliably inducing malicious behavior with success rates of up to $88\%$, without needing white-box access to the LLM or any optimization procedures.
Authors:Jumin Kim, Seungmin Baek, Minbok Wi, Hwayong Nam, Michael Jaemin Kim, Sukhan Lee, Kyomin Sohn, Jung Ho Ahn
Abstract:
Per-Row Activation Counting (PRAC), a DRAM read disturbance mitigation method, modifies key DRAM timing parameters, reportedly causing significant performance overheads in simulator-based studies. However, given known discrepancies between simulators and real hardware, real-machine experiments are vital for accurate PRAC performance estimation. We present the first real-machine performance analysis of PRAC. After verifying timing modifications on the latest CPUs using microbenchmarks, our analysis shows that PRAC's average and maximum overheads are just 1.06% and 3.28% for the SPEC CPU2017 workloads -- up to 9.15x lower than simulator-based reports. Further, we show that the close page policy minimizes this overhead by effectively hiding the elongated DRAM row precharge operations due to PRAC from the critical path.
Authors:Rahul Thomas, Louai Zahran, Erica Choi, Akilesh Potti, Micah Goldblum, Arka Pal
Abstract:
As LLMs continue to increase in parameter size, the computational resources required to run them are available to fewer parties. Therefore, third-party inference services -- where LLMs are hosted by third parties with significant computational resources -- are becoming increasingly popular. However, third party inference raises critical concerns about user data privacy. To mitigate these risks, privacy researchers have developed provably secure schemes for third-party inference, such as Secure Multi-Party Computation (SMPC). However, SMPC protocols have significant computational and communication overhead, and do not scale to large models. In this work, we propose a new multi-party inference protocol, Cascade, that avoids these punitive costs by leveraging sharding in the sequence dimension to maintain privacy, trading off cryptographic privacy guarantees for increased performance and scalability. We demonstrate that Cascade is resistant to a generalization of a recent attack that is highly effective against other statistical privacy schemes, and that it is further resistant to learning-based attacks. As Cascade is orders of magnitude faster than existing schemes, our findings offer practical solutions for secure deployment of modern state-of-the-art LLMs.
Authors:Carlos Agulló-Domingo, Ãscar Vera-López, Seyda Guzelhan, Lohit Daksha, Aymane El Jerari, Kaustubh Shivdikar, Rashmi Agrawal, David Kaeli, Ajay Joshi, José L. Abellán
Abstract:
Word-wise Fully Homomorphic Encryption (FHE) schemes, such as CKKS, are gaining significant traction due to their ability to provide post-quantum-resistant, privacy-preserving approximate computing; an especially desirable feature in Machine-Learning-as-a-Service (MLaaS) cloud-computing paradigms. OpenFHE is a leading CPU-based FHE library with robust CKKS operations, but its server-side performance is not yet sufficient for practical cloud deployment. As GPU computing becomes more common in data centers, many FHE libraries are adding GPU support. However, integrating an efficient GPU backend into OpenFHE is challenging. While OpenFHE uses a Hardware Abstraction Layer (HAL), its flexible architecture sacrifices performance due to the abstraction layers required for multi-scheme and multi-backend compatibility. In this work, we introduce FIDESlib, the first open-source server-side CKKS GPU library that is fully interoperable with well-established client-side OpenFHE operations. Unlike other existing open-source GPU libraries, FIDESlib provides the first implementation featuring heavily optimized GPU kernels for all CKKS primitives, including bootstrapping. Our library also integrates robust benchmarking and testing, ensuring it remains adaptable to further optimization. Furthermore, its software architecture is designed to support extensions to a multi-GPU backend for enhanced acceleration. Our experiments across various GPU systems and the leading open-source CKKS library to date, Phantom, show that FIDESlib offers superior performance and scalability. For bootstrapping, FIDESlib achieves no less than 70x speedup over the AVX-optimized OpenFHE implementation.
Authors:Nishant Chinnasami, Rye Stahle-Smith, Rasha Karakchi
Abstract:
Advanced Encryption Standard (AES) is a widely adopted cryptographic algorithm, yet its practical implementations remain susceptible to side-channel and fault injection attacks. In this work, we propose a comprehensive framework that enhances AES-128 encryption security through controlled anomaly injection and real-time anomaly detection using both statistical and machine learning (ML) methods. We simulate timing and fault-based anomalies by injecting execution delays and ciphertext perturbations during encryption, generating labeled datasets for detection model training. Two complementary detection mechanisms are developed: a threshold-based timing anomaly detector and a supervised Random Forest classifier trained on combined timing and ciphertext features. We implement and evaluate the framework on both CPU and FPGA-based SoC hardware (PYNQ-Z1), measuring performance across varying block sizes, injection rates, and core counts. Our results show that ML-based detection significantly outperforms threshold-based methods in precision and recall while maintaining real-time performance on embedded hardware. Compared to existing AES anomaly detection methods, our solution offers a low-cost, real-time, and accurate detection approach deployable on lightweight FPGA platforms.
Authors:Dipo Dunsin, Mohamed Chahine Ghanem, Eduardo Almeida Palmieri
Abstract:
This paper addresses the critical need for high-quality malware datasets that support advanced analysis techniques, particularly machine learning and agentic AI frameworks. Existing datasets often lack diversity, comprehensive labelling, and the complexity necessary for effective machine learning and agent-based AI training. To fill this gap, we developed a systematic approach for generating a dataset that combines automated malware execution in controlled virtual environments with dynamic monitoring tools. The resulting dataset comprises clean and infected memory snapshots across multiple malware families and operating systems, capturing detailed behavioural and environmental features. Key design decisions include applying ethical and legal compliance, thorough validation using both automated and manual methods, and comprehensive documentation to ensure replicability and integrity. The dataset's distinctive features enable modelling system states and transitions, facilitating RL-based malware detection and response strategies. This resource is significant for advancing adaptive cybersecurity defences and digital forensic research. Its scope supports diverse malware scenarios and offers potential for broader applications in incident response and automated threat mitigation.
Authors:Ahmed Bensaoud, Jugal Kalita
Abstract:
Active learning for classification seeks to reduce the cost of labeling samples by finding unlabeled examples about which the current model is least certain and sending them to an annotator/expert to label. Bayesian theory can provide a probabilistic view of deep neural network models by asserting a prior distribution over model parameters and estimating the uncertainties by posterior distribution over these parameters. This paper proposes two novel active learning approaches to label one million malware examples belonging to different unknown modern malware families. The first model is Inception-V4+PCA combined with several support vector machine (SVM) algorithms (UTSVM, PSVM, SVM-GSU, TBSVM). The second model is Vision Transformer based Bayesian Neural Networks ViT-BNN. Our proposed ViT-BNN is a state-of-the-art active learning approach that differs from current methods and can apply to any particular task. The experiments demonstrate that the ViT-BNN is more stable and robust in handling uncertainty.
Authors:Hanlin Cai, Haofan Dong, Houtianfu Wang, Kai Li, Ozgur B. Akan
Abstract:
Federated large language models (FedLLMs) enable powerful generative capabilities within wireless networks while preserving data privacy. Nonetheless, FedLLMs remain vulnerable to model poisoning attacks. This article first reviews recent advancements in model poisoning techniques and existing defense mechanisms for FedLLMs, underscoring critical limitations, especially when dealing with non-IID textual data distributions. Current defense strategies predominantly employ distance or similarity-based outlier detection mechanisms, relying on the assumption that malicious updates markedly differ from benign statistical patterns. However, this assumption becomes inadequate against adaptive adversaries targeting billion-parameter LLMs. The article further investigates graph representation-based model poisoning (GRMP), an emerging attack paradigm that exploits higher-order correlations among benign client gradients to craft malicious updates indistinguishable from legitimate ones. GRMP can effectively circumvent advanced defense systems, causing substantial degradation in model accuracy and overall performance. Moreover, the article outlines a forward-looking research roadmap that emphasizes the necessity of graph-aware secure aggregation methods, specialized vulnerability metrics tailored for FedLLMs, and evaluation frameworks to enhance the robustness of federated language model deployments.
Authors:Mayar Elfares, Pascal Reisert, Ralf Küsters, Andreas Bulling
Abstract:
Privacy is a highly subjective concept and perceived variably by different individuals. Previous research on quantifying user-perceived privacy has primarily relied on questionnaires. Furthermore, applying user-perceived privacy to optimise the parameters of privacy-preserving techniques (PPT) remains insufficiently explored. To address these limitations, we introduce Gaze3P -- the first dataset specifically designed to facilitate systematic investigations into user-perceived privacy. Our dataset comprises gaze data from 100 participants and 1,000 stimuli, encompassing a range of private and safe attributes. With Gaze3P, we train a machine learning model to implicitly and dynamically predict perceived privacy from human eye gaze. Through comprehensive experiments, we show that the resulting models achieve high accuracy. Finally, we illustrate how predicted privacy can be used to optimise the parameters of differentially private mechanisms, thereby enhancing their alignment with user expectations.
Authors:Lu Zhang, Sangarapillai Lambotharan, Gan Zheng, Guisheng Liao, Xuekang Liu, Fabio Roli, Carsten Maple
Abstract:
The remarkable success of transformers across various fields such as natural language processing and computer vision has paved the way for their applications in automatic modulation classification, a critical component in the communication systems of Internet of Things (IoT) devices. However, it has been observed that transformer-based classification of radio signals is susceptible to subtle yet sophisticated adversarial attacks. To address this issue, we have developed a defensive strategy for transformer-based modulation classification systems to counter such adversarial attacks. In this paper, we propose a novel vision transformer (ViT) architecture by introducing a new concept known as adversarial indicator (AdvI) token to detect adversarial attacks. To the best of our knowledge, this is the first work to propose an AdvI token in ViT to defend against adversarial attacks. Integrating an adversarial training method with a detection mechanism using AdvI token, we combine a training time defense and running time defense in a unified neural network model, which reduces architectural complexity of the system compared to detecting adversarial perturbations using separate models. We investigate into the operational principles of our method by examining the attention mechanism. We show the proposed AdvI token acts as a crucial element within the ViT, influencing attention weights and thereby highlighting regions or features in the input data that are potentially suspicious or anomalous. Through experimental results, we demonstrate that our approach surpasses several competitive methods in handling white-box attack scenarios, including those utilizing the fast gradient method, projected gradient descent attacks and basic iterative method.
Authors:Yuval Golbari, Navve Wasserman, Gal Vardi, Michal Irani
Abstract:
Determining which data samples were used to train a model-known as Membership Inference Attack (MIA)-is a well-studied and important problem with implications for data privacy. Black-box methods presume access only to the model's outputs and often rely on training auxiliary reference models. While they have shown strong empirical performance, they rely on assumptions that rarely hold in real-world settings: (i) the attacker knows the training hyperparameters; (ii) all available non-training samples come from the same distribution as the training data; and (iii) the fraction of training data in the evaluation set is known. In this paper, we demonstrate that removing these assumptions leads to a significant drop in the performance of black-box attacks. We introduce ImpMIA, a Membership Inference Attack that exploits the Implicit Bias of neural networks, hence removes the need to rely on any reference models and their assumptions. ImpMIA is a white-box attack -- a setting which assumes access to model weights and is becoming increasingly realistic given that many models are publicly available (e.g., via Hugging Face). Building on maximum-margin implicit bias theory, ImpMIA uses the Karush-Kuhn-Tucker (KKT) optimality conditions to identify training samples. This is done by finding the samples whose gradients most strongly reconstruct the trained model's parameters. As a result, ImpMIA achieves state-of-the-art performance compared to both black and white box attacks in realistic settings where only the model weights and a superset of the training data are available.
Authors:Cem Topcuoglu, Kaan Onarlioglu, Steven Sprecher, Engin Kirda
Abstract:
Contemporary web application architectures involve many layers of proxy services that process traffic. Due to the complexity of HTTP and vendor design decisions, these proxies sometimes process a given request in different ways. Attackers can exploit these processing discrepancies to launch damaging attacks including web cache poisoning and request smuggling. Discrepancy attacks are surging, yet, there exists no systemic defense. In this work, we propose the first comprehensive defense to address this problem, called HTTP Request Synchronization. Our scheme uses standard HTTP extension mechanisms to augment each request with a complete processing history. It propagates this context through the traffic path detailing how each server hop has processed said request. Using this history, every proxy server can validate that their processing is consistent with all previous hops, eliminating discrepancy attacks. We implement our scheme for 5 popular proxy technologies, Apache, NGINX, HAProxy, Varnish, and Cloudflare, demonstrating its practical impact.
Authors:Haytham Albousayri, Bechir Hamdaoui, Weng-Keen Wong, Nora Basha
Abstract:
Deep learning-based radio frequency fingerprinting (RFFP) has become an enabling physical-layer security technology, allowing device identification and authentication through received RF signals. This technology, however, faces significant challenges when it comes to adapting to domain variations, such as time, location, environment, receiver and channel. For Bluetooth Low Energy (BLE) devices, addressing these challenges is particularly crucial due to the BLE protocol's frequency-hopping nature. In this work, and for the first time, we investigated the frequency hopping effect on RFFP of BLE devices, and proposed a novel, low-cost, domain-adaptive feature extraction method. Our approach improves the classification accuracy by up to 58\% across environments and up to 80\% across receivers compared to existing benchmarks.
Authors:Aofan Liu, Lulu Tang
Abstract:
Vision-Language Models (VLMs) have garnered significant attention for their remarkable ability to interpret and generate multimodal content. However, securing these models against jailbreak attacks continues to be a substantial challenge. Unlike text-only models, VLMs integrate additional modalities, introducing novel vulnerabilities such as image hijacking, which can manipulate the model into producing inappropriate or harmful responses. Drawing inspiration from text-based jailbreaks like the "Do Anything Now" (DAN) command, this work introduces VisualDAN, a single adversarial image embedded with DAN-style commands. Specifically, we prepend harmful corpora with affirmative prefixes (e.g., "Sure, I can provide the guidance you need") to trick the model into responding positively to malicious queries. The adversarial image is then trained on these DAN-inspired harmful texts and transformed into the text domain to elicit malicious outputs. Extensive experiments on models such as MiniGPT-4, MiniGPT-v2, InstructBLIP, and LLaVA reveal that VisualDAN effectively bypasses the safeguards of aligned VLMs, forcing them to execute a broad range of harmful instructions that severely violate ethical standards. Our results further demonstrate that even a small amount of toxic content can significantly amplify harmful outputs once the model's defenses are compromised. These findings highlight the urgent need for robust defenses against image-based attacks and offer critical insights for future research into the alignment and security of VLMs.
Authors:Mohammadhossein Homaei, Mehran Tarif, Mar Avilla, Andres Caro
Abstract:
Industrial Control Systems (ICS) face growing cyber-physical attacks that exploit both network vulnerabilities and physical processes. Current anomaly detection methods rely on correlation-based analysis, which cannot separate true causal relationships from spurious associations. This limitation results in high false alarm rates and poor root cause analysis. We propose a novel Causal Digital Twin (CDT) framework for cyber-physical security in medium-scale ICS. Our method combines causal inference theory with digital twin modeling. The framework enables three types of causal reasoning: association for pattern detection, intervention for understanding system responses, and counterfactual analysis for attack prevention planning. We evaluate our framework on three industrial datasets: SWaT, WADI, and HAI, with validation through physical constraint compliance (90.8\%) and synthetic ground truth testing (structural Hamming distance 0.13). Results show significant improvements over seven baseline methods. Our CDT achieves F1-scores are $0.944 \pm 0.014$ for SWaT, $0.902 \pm 0.021$ for WADI, and $0.923 \pm 0.018$ for HAI with statistical significance ($p < 0.0024$, Bonferroni corrected). The framework reduces false positives by \SI{74}{\percent} and achieves \SI{78.4}{\percent} root cause analysis accuracy compared to \SI{48.7}{\percent} for existing methods. Counterfactual analysis enables defense strategies that reduce attack success by \SI{73.2}{\percent}. The system keeps real-time performance with \SI{3.2}{ms} latency, which is suitable for industrial deployment, while providing interpretable explanations for operators.
Authors:Zhuolun Li, Haluk Sonmezler, Faiza Shirazi, Febin Shaji, Tymoteusz Mroczkowski, Dexter Lardner, Matthew Alain Camus, Evangelos Pournaras
Abstract:
Ensuring ballot secrecy is critical for fair and trustworthy electronic voting systems, yet achieving strong secrecy guarantees in decentralized, large-scale elections remains challenging. This paper proposes the concept of collectively secure voting, in which voters themselves can opt in as secret holders to protect ballot secrecy. A practical blockchain-based collectively secure voting system is designed and implemented. Our design strikes a balance between strong confidentiality guarantees and real-world applicability. The proposed system combines threshold cryptography and smart contracts to ensure ballots remain confidential during voting, while all protocol steps remain transparent and verifiable. Voters can use the system without prior blockchain knowledge through an intuitive user interface that hides underlying complexity. To evaluate this approach, a user testing is conducted. Results show a high willingness to act as secret holders, reliable participation in share release, and high security confidence in the proposed system. The findings demonstrate that voters can collectively maintain secrecy and that such a practical deployment is feasible.
Authors:Mikaëla Ngamboé, Jean-Simon Marrocco, Jean-Yves Ouattara, José M. Fernandez, Gabriela Nicolescu
Abstract:
With the growing reliance on the vulnerable Automatic Dependent Surveillance-Broadcast (ADS-B) protocol in air traffic management (ATM), ensuring security is critical. This study investigates emerging machine learning models and training strategies to improve AI-based intrusion detection systems (IDS) for ADS-B. Focusing on ground-based ATM systems, we evaluate two deep learning IDS implementations: one using a transformer encoder and the other an extended Long Short-Term Memory (xLSTM) network, marking the first xLSTM-based IDS for ADS-B. A transfer learning strategy was employed, involving pre-training on benign ADS-B messages and fine-tuning with labeled data containing instances of tampered messages. Results show this approach outperforms existing methods, particularly in identifying subtle attacks that progressively undermine situational awareness. The xLSTM-based IDS achieves an F1-score of 98.9%, surpassing the transformer-based model at 94.3%. Tests on unseen attacks validated the generalization ability of the xLSTM model. Inference latency analysis shows that the 7.26-second delay introduced by the xLSTM-based IDS fits within the Secondary Surveillance Radar (SSR) refresh interval (5-12 s), although it may be restrictive for time-critical operations. While the transformer-based IDS achieves a 2.1-second latency, it does so at the cost of lower detection performance.
Authors:Thomas Debris-Alazard, Philippe Gaborit, Romaric Neveu, Olivier Ruatta
Abstract:
Introduced in 2003 and 2005, Alekhnovich and Regev' schemes were the first public-key encryptions whose security is only based on the average hardness of decoding random linear codes and LWE, without other security assumptions. Such security guarantees made them very popular, being at the origin of the now standardized HQC or Kyber. We present an adaptation of Alekhnovich and Regev' encryption scheme whose security is only based on the hardness of a slight variation of MinRank, the so-called stationary-MinRank problem. We succeeded to reach this strong security guarantee by showing that stationary-MinRank benefits from a search-to-decision reduction. Our scheme therefore brings a partial answer to the long-standing open question of building an encryption scheme whose security relies solely on the hardness of MinRank. Finally, we show after a thoroughly security analysis that our scheme is practical and competitive with other encryption schemes admitting such strong security guarantees. Our scheme is slightly less efficient than FrodoKEM, but much more efficient than Alekhnovich and Regev' original schemes, with possibilities of improvements by considering more structure, in the same way as HQC and Kyber.
Authors:Alain Couvreur, Thomas Debris-Alazard, Philippe Gaborit, Adrien Vinçotte
Abstract:
We present $\mathsf{Miranda}$, the first family of full-domain-hash signatures based on matrix codes. This signature scheme fulfils the paradigm of Gentry, Peikert and Vaikuntanathan ($\mathsf{GPV}$), which gives strong security guarantees. Our trapdoor is very simple and generic: if we propose it with matrix codes, it can actually be instantiated in many other ways since it only involves a subcode of a decodable code (or lattice) in a unique decoding regime of parameters. Though $\mathsf{Miranda}$ signing algorithm relies on a decoding task where there is exactly one solution, there are many possible signatures given a message to sign and we ensure that signatures are not leaking information on their underlying trapdoor by means of a very simple procedure involving the drawing of a small number of uniform bits. In particular $\mathsf{Miranda}$ does not use a rejection sampling procedure which makes its implementation a very simple task contrary to other $\mathsf{GPV}$-like signatures schemes such as $\mathsf{Falcon}$ or even $\mathsf{Wave}$. We instantiate $\mathsf{Miranda}$ with the famous family of Gabidulin codes represented as spaces of matrices and we study thoroughly its security (in the EUF-CMA security model). For~$128$ bits of classical security, the signature sizes are as low as~$90$ bytes and the public key sizes are in the order of~$2.6$ megabytes.
Authors:Sanjay Deshpande, Jakub Szefer
Abstract:
Quantum computing is rapidly emerging as one of the most transformative technologies of our time. With the potential to tackle problems that remain intractable for even the most powerful classical supercomputers, quantum hardware has advanced at an extraordinary pace. Today, major platforms such as IBM Quantum, Amazon Braket, and Microsoft Azure provide cloud-based access to quantum processors, making them more widely available than ever before. While a promising technology, quantum computing is not magically immune to security threats. Much research has been done on post-quantum cryptography, which addresses how to protect classical computers from attackers using quantum computers. This article meanwhile introduces the dual idea of quantum computer security: how to protect quantum computers from security attacks.
Authors:Donghwan Kim, Xin Gu, Jinho Baek, Timothy Lo, Younghoon Min, Kwangsik Shin, Jongryool Kim, Jongse Park, Kiwan Maeng
Abstract:
Machine learning (ML) models memorize and leak training data, causing serious privacy issues to data owners. Training algorithms with differential privacy (DP), such as DP-SGD, have been gaining attention as a solution. However, DP-SGD adds a noise at each training iteration, which degrades the accuracy of the trained model. To improve accuracy, a new family of approaches adds carefully designed correlated noises, so that noises cancel out each other across iterations. We performed an extensive characterization study of these new mechanisms, for the first time to the best of our knowledge, and show they incur non-negligible overheads when the model is large or uses large embedding tables. Motivated by the analysis, we propose Cocoon, a hardware-software co-designed framework for efficient training with correlated noises. Cocoon accelerates models with embedding tables through pre-computing and storing correlated noises in a coalesced format (Cocoon-Emb), and supports large models through a custom near-memory processing device (Cocoon-NMP). On a real system with an FPGA-based NMP device prototype, Cocoon improves the performance by 2.33-10.82x(Cocoon-Emb) and 1.55-3.06x (Cocoon-NMP).
Authors:Dmytro Zakharov, Oleksandr Kurbatov, Artem Sdobnov, Lev Soukhanov, Yevhenii Sekhin, Vitalii Volovyk, Mykhailo Velykodnyi, Mark Cherepovskyi, Kyrylo Baibula, Lasha Antadze, Pavlo Kravchenko, Volodymyr Dubinin, Yaroslav Panasenko
Abstract:
In this report, we compare the performance of our UltraGroth-based zero-knowledge machine learning framework Bionetta to other tools of similar purpose such as EZKL, Lagrange's deep-prove, or zkml. The results show a significant boost in the proving time for custom-crafted neural networks: they can be proven even on mobile devices, enabling numerous client-side proving applications. While our scheme increases the cost of one-time preprocessing steps, such as circuit compilation and generating trusted setup, our approach is, to the best of our knowledge, the only one that is deployable on the native EVM smart contracts without overwhelming proof size and verification overheads.
Authors:Akira Ito, Takayuki Miura, Yosuke Todo
Abstract:
Deep Neural Networks (DNNs) have attracted significant attention, and their internal models are now considered valuable intellectual assets. Extracting these internal models through access to a DNN is conceptually similar to extracting a secret key via oracle access to a block cipher. Consequently, cryptanalytic techniques, particularly differential-like attacks, have been actively explored recently. ReLU-based DNNs are the most commonly and widely deployed architectures. While early works (e.g., Crypto 2020, Eurocrypt 2024) assume access to exact output logits, which are usually invisible, more recent works (e.g., Asiacrypt 2024, Eurocrypt 2025) focus on the hard-label setting, where only the final classification result (e.g., "dog" or "car") is available to the attacker. Notably, Carlini et al. (Eurocrypt 2025) demonstrated that model extraction is feasible in polynomial time even under this restricted setting. In this paper, we first show that the assumptions underlying their attack become increasingly unrealistic as the attack-target depth grows. In practice, satisfying these assumptions requires an exponential number of queries with respect to the attack depth, implying that the attack does not always run in polynomial time. To address this critical limitation, we propose a novel attack method called CrossLayer Extraction. Instead of directly extracting the secret parameters (e.g., weights and biases) of a specific neuron, which incurs exponential cost, we exploit neuron interactions across layers to extract this information from deeper layers. This technique significantly reduces query complexity and mitigates the limitations of existing model extraction approaches.
Authors:Xutao Mao, Ke Li, Cameron Baird, Ezra Xuanru Tao, Dan Lin
Abstract:
As advances in synthetic voice generation accelerate, an increasing variety of fake voice generators have emerged, producing audio that is often indistinguishable from real human speech. This evolution poses new and serious threats across sectors where audio recordings serve as critical evidence. Although fake voice detectors are also advancing, the arms race between fake voice generation and detection has become more intense and complex. In this work, we present the first large-scale, cross-domain evaluation of fake voice detectors, benchmarking 8 state-of-the-art models against datasets synthesized by 20 different fake voice generation systems. To the best of our knowledge, this is the most comprehensive cross-domain assessment conducted to date. Our study reveals substantial security vulnerabilities in current fake voice detection systems, underscoring critical gaps in their real-world robustness. To advance the field, we propose a unified and effective metric that consolidates the diverse and often inconsistent evaluation criteria previously used across different studies. This metric enables standardized, straightforward comparisons of the robustness of fake voice detectors. We conclude by offering actionable recommendations for building more resilient fake voice detection technologies, with the broader goal of reinforcing the foundations of AI security and trustworthiness.
Authors:Asif Shahriar, Md Nafiu Rahman, Sadif Ahmed, Farig Sadeque, Md Rizwan Parvez
Abstract:
The rapid shift from passive LLMs to autonomous LLM-agents marks a new paradigm in cybersecurity. While these agents can act as powerful tools for both offensive and defensive operations, the very agentic context introduces a new class of inherent security risks. In this work we present the first holistic survey of the agentic security landscape, structuring the field around three interdependent pillars: Applications, Threats, and Defenses. We provide a comprehensive taxonomy of over 150 papers, explaining how agents are used, the vulnerabilities they possess, and the countermeasures designed to protect them. A detailed cross-cutting analysis shows emerging trends in agent architecture while revealing critical research gaps in model and modality coverage.
Authors:Ludwig Stage, Mirela Riveni, Raimundas Matulevičius, Dimka Karastoyanova
Abstract:
Provenance in scientific workflows is essential for understand- ing and reproducing processes, while in business processes, it can ensure compliance and correctness and facilitates process mining. However, the provenance of process adaptations, especially modifications during execu- tion, remains insufficiently addressed. A review of the literature reveals a lack of systematic approaches for capturing provenance information about adaptive workflows/processes. To fill this gap, we propose the AdProv method for collecting, storing, retrieving, and visualizing prove- nance of runtime workflow adaptations. In addition to the definition of the AdProv method in terms of steps and concepts like change events, we also present an architecture for a Provenance Holder service that is essential for implementing the method. To ensure semantic consistency and interoperability we define a mapping to the ontology PROV Ontol- ogy (PROV-O). Additionally, we extend the XES standard with elements for adaptation logging. Our main contributions are the AdProv method and a comprehensive framework and its tool support for managing adap- tive workflow provenance, facilitating advanced provenance tracking and analysis for different application domains.
Authors:James Bailie, Ruobin Gong
Abstract:
The Five Safes is a framework used by national statistical offices (NSO) for assessing and managing the disclosure risk of data sharing. This paper makes two points: Firstly, the Five Safes can be understood as a specialization of a broader concept $\unicode{x2013}$ contextual integrity $\unicode{x2013}$ to the situation of statistical dissemination by an NSO. We demonstrate this by mapping the five parameters of contextual integrity onto the five dimensions of the Five Safes. Secondly, the Five Safes contextualizes narrow, technical notions of privacy within a holistic risk assessment. We demonstrate this with the example of differential privacy (DP). This contextualization allows NSOs to place DP within their Five Safes toolkit while also guiding the design of DP implementations within the broader privacy context, as delineated by both their regulation and the relevant social norms.
Authors:Jacopo Bufalino, Mario Di Francesco, Agathe Blaise, Stefano Secci
Abstract:
Supply chain security is extremely important for modern applications running at scale in the cloud. In fact, they involve a large number of heterogeneous microservices that also include third-party software. As a result, security vulnerabilities are hard to identify and mitigate before they start being actively exploited by attackers. For this reason, governments have recently introduced cybersecurity regulations that require vendors to share a software bill of material (SBOM) with end users or regulators. An SBOM can be employed to identify the security vulnerabilities of a software component even without access to its source code, as long as it is accurate and interoperable across different tools. This work evaluates this issue through a comprehensive study of tools for SBOM generation and vulnerability scanning, including both open-source software and cloud services from major providers. We specifically target software containers and focus on operating system packages in Linux distributions that are widely used as base images due to their far-reaching security impact. Our findings show that the considered tools are largely incompatible, leading to inaccurate reporting and a large amount of undetected vulnerabilities. We uncover the SBOM confusion vulnerability, a byproduct of such fragmented ecosystem, where inconsistent formats prevent reliable vulnerability detection across tools.
Authors:Xuefeng Xu, Graham Cormode
Abstract:
Receiver Operating Characteristic (ROC) and Precision-Recall (PR) curves are fundamental tools for evaluating machine learning classifiers, offering detailed insights into the trade-offs between true positive rate vs. false positive rate (ROC) or precision vs. recall (PR). However, in Federated Learning (FL) scenarios, where data is distributed across multiple clients, computing these curves is challenging due to privacy and communication constraints. Specifically, the server cannot access raw prediction scores and class labels, which are used to compute the ROC and PR curves in a centralized setting. In this paper, we propose a novel method for approximating ROC and PR curves in a federated setting by estimating quantiles of the prediction score distribution under distributed differential privacy. We provide theoretical bounds on the Area Error (AE) between the true and estimated curves, demonstrating the trade-offs between approximation accuracy, privacy, and communication cost. Empirical results on real-world datasets demonstrate that our method achieves high approximation accuracy with minimal communication and strong privacy guarantees, making it practical for privacy-preserving model evaluation in federated systems.
Authors:Pedro Ivo da Cruz, Dimitri Silva, Tito Spadini, Ricardo Suyama, Murilo Bellezoni Loiola
Abstract:
Massive multiple-input multiple-output (MMIMO) is essential to modern wireless communication systems, like 5G and 6G, but it is vulnerable to active eavesdropping attacks. One type of such attack is the pilot contamination attack (PCA), where a malicious user copies pilot signals from an authentic user during uplink, intentionally interfering with the base station's (BS) channel estimation accuracy. In this work, we propose to use a Decision Tree (DT) algorithm for PCA detection at the BS in a multi-user system. We present a methodology to generate training data for the DT classifier and select the best DT according to their depth. Then, we simulate different scenarios that could be encountered in practice and compare the DT to a classical technique based on likelihood ratio testing (LRT) submitted to the same scenarios. The results revealed that a DT with only one level of depth is sufficient to outperform the LRT. The DT shows a good performance regarding the probability of detection in noisy scenarios and when the malicious user transmits with low power, in which case the LRT fails to detect the PCA. We also show that the reason for the good performance of the DT is its ability to compute a threshold that separates PCA data from non-PCA data better than the LRT's threshold. Moreover, the DT does not necessitate prior knowledge of noise power or assumptions regarding the signal power of malicious users, prerequisites typically essential for LRT and other hypothesis testing methodologies.
Authors:Chunyi Zhang, Qinghong Wei, Xiaoqi Li
Abstract:
The rapid advancement of blockchain technology has precipitated the widespread adoption of Ethereum and smart contracts across a variety of sectors. However, this has also given rise to numerous fraudulent activities, with many speculators embedding Ponzi schemes within smart contracts, resulting in significant financial losses for investors. Currently, there is a lack of effective methods for identifying and analyzing such new types of fraudulent activities. This paper categorizes these scams into four structural types and explores the intrinsic characteristics of Ponzi scheme contract source code from a program analysis perspective. The Mythril tool is employed to conduct static and dynamic analyses of representative cases, thereby revealing their vulnerabilities and operational mechanisms. Furthermore, this paper employs shell scripts and command patterns to conduct batch detection of open-source smart contract code, thereby unveiling the common characteristics of Ponzi scheme smart contracts.
Authors:Joachim Neu, Javier Nieto, Ling Ren
Abstract:
Proof-of-stake blockchains require consensus protocols that support Dynamic Availability and Reconfiguration (so-called DAR setting), where the former means that the consensus protocol should remain live even if a large number of nodes temporarily crash, and the latter means it should be possible to change the set of operating nodes over time. State-of-the-art protocols for the DAR setting, such as Ethereum, Cardano's Ouroboros, or Snow White, require unrealistic additional assumptions, such as social consensus, or that key evolution is performed even while nodes are not participating. In this paper, we identify the necessary and sufficient adversarial condition under which consensus can be achieved in the DAR setting without additional assumptions. We then introduce a new and realistic additional assumption: honest nodes dispose of their cryptographic keys the moment they express intent to exit from the set of operating nodes. To add reconfiguration to any dynamically available consensus protocol, we provide a bootstrapping gadget that is particularly simple and efficient in the common optimistic case of few reconfigurations and no double-spending attempts.
Authors:Taha M. Mahmoud, Naima Kaabouch
Abstract:
The rapid growth of the Internet of Things (IoT) has expanded opportunities for innovation but also increased exposure to botnet-driven cyberattacks. Conventional detection methods often struggle with scalability, privacy, and adaptability in resource-constrained IoT environments. To address these challenges, we present a lightweight and privacy-preserving botnet detection framework based on federated learning. This approach enables distributed devices to collaboratively train models without exchanging raw data, thus maintaining user privacy while preserving detection accuracy. A communication-efficient aggregation strategy is introduced to reduce overhead, ensuring suitability for constrained IoT networks. Experiments on benchmark IoT botnet datasets demonstrate that the framework achieves high detection accuracy while substantially reducing communication costs. These findings highlight federated learning as a practical path toward scalable, secure, and privacy-aware intrusion detection for IoT ecosystems.
Authors:Taha M. Mahmoud, Naima Kaabouch
Abstract:
Electronic voting systems face growing risks from cyberattacks and data breaches, which are expected to intensify with the advent of quantum computing. To address these challenges, we introduce a quantum-secure voting framework that integrates Quantum Key Distribution (QKD), Dual-Key Symmetric Encryption, and verifiable receipt mechanisms to strengthen the privacy, integrity, and reliability of the voting process. The framework enables voters to establish encryption keys securely, cast encrypted ballots, and verify their votes through receipt-based confirmation, all without exposing the vote contents. To evaluate performance, we simulate both quantum and classical communication channels using the Message Queuing Telemetry Transport (MQTT) protocol. Results demonstrate that the system can process large numbers of votes efficiently with low latency and minimal error rates. This approach offers a scalable and practical path toward secure, transparent, and verifiable electronic voting in the quantum era.
Authors:Sanjay Malakar, Michael D. Ernst, Martin Kellogg, Manu Sridharan
Abstract:
A resource leak occurs when a program fails to release a finite resource like a socket, file descriptor or database connection. While sound static analysis tools can detect all leaks, automatically repairing them remains challenging. Prior work took the output of a detection tool and attempted to repair only leaks from a hard-coded list of library resource types. That approach limits the scope of repairable leaks: real-world code uses resource wrappers that store a resource in a field and must themselves be closed. This paper makes four key contributions to improve resource leak repair in the presence of wrappers. (1) It integrates inference of resource management specifications into the repair pipeline, enabling extant fixing approaches to reason about wrappers. (2) It transforms programs into variants that are easier to analyze, making inference, detection, and fixing tools more effective; for instance, it makes detection tools report problems closer to the root cause, often in a client of a resource wrapper rather than within the wrapper class itself. (3) A novel field containment analysis reasons about resource lifetimes, enabling repair of more leaks involving resources stored in fields. (4) It introduces a new repair pattern and more precise reasoning to better handle resources stored in non-final fields. Prior work fixed 41% of resource leak warnings in the NJR benchmark suite; our implementation Arodnap fixes 68%.
Authors:Al Nahian Bin Emran, Rajendra Upadhyay, Rajendra Paudyal, Lisa Donnan, Duminda Wijesekera
Abstract:
In the rapidly evolving landscape of 5G technology, the adoption of cloud-based infrastructure for the deployment of 5G services has become increasingly common. Using a service-based architecture, critical 5G components, such as the Access and Mobility Management Function (AMF), Session Management Function (SMF), and User Plane Function (UPF), now run as containerized pods on Kubernetes clusters. Although this approach improves scalability, flexibility, and resilience, it also introduces new security challenges, particularly to ensure the integrity and trustworthiness of these components. Current 5G security specifications (for example, 3GPP TS 33.501) focus on communication security and assume that network functions remain trustworthy after authentication, consequently lacking mechanisms to continuously validate the integrity of NVFs at runtime. To close this gap, and to align with Zero Trust principles of 'never trust, always verify', we present a TPM 2.0-based continuous remote attestation solution for core 5G components deployed on Kubernetes. Our approach uses the Linux Integrity Measurement Architecture (IMA) and a Trusted Platform Module (TPM) to provide hardware-based runtime validation. We integrate the open-source Keylime framework with a custom IMA template that isolates pod-level measurements, allowing per-pod integrity verification. A prototype on a k3s cluster (consisting of 1 master, 2 worker nodes) was implemented to attest to core functions, including AMF, SMF and UPF. The experimental results show that the system detects unauthorized modifications in real time, labels each pod's trust state, and generates detailed audit logs. This work provides hardware-based continuous attestation for cloud native and edge deployments, strengthening the resilience of 5G as critical infrastructure in multi-vendor and mission-critical scenarios of 5G.
Authors:Atonu Ghosh, Akhilesh Mohanasundaram, Srishivanth R F, Sudip Misra
Abstract:
We present TLoRa, an end-to-end architecture for HTTPS communication over LoRa by integrating TCP tunneling and a complete TLS 1.3 handshake. It enables a seamless and secure communication channel between WiFi-enabled end devices and the Internet over LoRa using an End Hub (EH) and a Net Relay (NR). The EH tethers a WiFi hotspot and a captive portal for user devices to connect and request URLs. The EH forwards the requested URLs to the NR using a secure tunnel over LoRa. The NR, which acts as a server-side proxy, receives and resolves the request from the Internet-based server. It then relays back the encrypted response from the server over the same secure tunnel. TLoRa operates in three phases -session setup, secure tunneling, and rendering. In the first phase, it manages the TCP socket and initiates the TLS handshake. In the second, it creates a secure tunnel and transfers encrypted TLS data over LoRa. Finally, it delivers the URL content to the user. TLoRa also implements a lightweight TLS record reassembly layer and a queuing mechanism for session multiplexing. We evaluate TLoRa on real hardware using multiple accesses to a web API. Results indicate that it provides a practical solution by successfully establishing a TLS session over LoRa in 9.9 seconds and takes 3.58 seconds to fulfill API requests. To the best of our knowledge, this is the first work to comprehensively design, implement, and evaluate the performance of HTTPS access over LoRa using full TLS.
Authors:Grace Billiris, Asif Gill, Madhushi Bandara
Abstract:
Artificial Intelligence (AI) systems introduce unprecedented privacy challenges as they process increasingly sensitive data. Traditional privacy frameworks prove inadequate for AI technologies due to unique characteristics such as autonomous learning and black-box decision-making. This paper presents a taxonomy classifying AI privacy risks, synthesised from 45 studies identified through systematic review. We identify 19 key risks grouped under four categories: Dataset-Level, Model-Level, Infrastructure-Level, and Insider Threat Risks. Findings reveal a balanced distribution across these dimensions, with human error (9.45%) emerging as the most significant factor. This taxonomy challenges conventional security approaches that typically prioritise technical controls over human factors, highlighting gaps in holistic understanding. By bridging technical and behavioural dimensions of AI privacy, this paper contributes to advancing trustworthy AI development and provides a foundation for future research.
Authors:Lekkala Sai Teja, Annepaka Yadagiri, Sangam Sai Anish, Siva Gopala Krishna Nuthakki, Partha Pakray
Abstract:
The growth of highly advanced Large Language Models (LLMs) constitutes a huge dual-use problem, making it necessary to create dependable AI-generated text detection systems. Modern detectors are notoriously vulnerable to adversarial attacks, with paraphrasing standing out as an effective evasion technique that foils statistical detection. This paper presents a comparative study of adversarial robustness, first by quantifying the limitations of standard adversarial training and then by introducing a novel, significantly more resilient detection framework: Perturbation-Invariant Feature Engineering (PIFE), a framework that enhances detection by first transforming input text into a standardized form using a multi-stage normalization pipeline, it then quantifies the transformation's magnitude using metrics like Levenshtein distance and semantic similarity, feeding these signals directly to the classifier. We evaluate both a conventionally hardened Transformer and our PIFE-augmented model against a hierarchical taxonomy of character-, word-, and sentence-level attacks. Our findings first confirm that conventional adversarial training, while resilient to syntactic noise, fails against semantic attacks, an effect we term "semantic evasion threshold", where its True Positive Rate at a strict 1% False Positive Rate plummets to 48.8%. In stark contrast, our PIFE model, which explicitly engineers features from the discrepancy between a text and its canonical form, overcomes this limitation. It maintains a remarkable 82.6% TPR under the same conditions, effectively neutralizing the most sophisticated semantic attacks. This superior performance demonstrates that explicitly modeling perturbation artifacts, rather than merely training on them, is a more promising path toward achieving genuine robustness in the adversarial arms race.
Authors:Niloofar Mireshghallah, Tianshi Li
Abstract:
The discourse on privacy risks in Large Language Models (LLMs) has disproportionately focused on verbatim memorization of training data, while a constellation of more immediate and scalable privacy threats remain underexplored. This position paper argues that the privacy landscape of LLM systems extends far beyond training data extraction, encompassing risks from data collection practices, inference-time context leakage, autonomous agent capabilities, and the democratization of surveillance through deep inference attacks. We present a comprehensive taxonomy of privacy risks across the LLM lifecycle -- from data collection through deployment -- and demonstrate through case studies how current privacy frameworks fail to address these multifaceted threats. Through a longitudinal analysis of 1,322 AI/ML privacy papers published at leading conferences over the past decade (2016--2025), we reveal that while memorization receives outsized attention in technical research, the most pressing privacy harms lie elsewhere, where current technical approaches offer little traction and viable paths forward remain unclear. We call for a fundamental shift in how the research community approaches LLM privacy, moving beyond the narrow focus of current technical solutions and embracing interdisciplinary approaches that address the sociotechnical nature of these emerging threats.
Authors:Tsubasa Takahashi, Shojiro Yamabe, Futa Waseda, Kento Sasaki
Abstract:
Differential Attention (DA) has been proposed as a refinement to standard attention, suppressing redundant or noisy context through a subtractive structure and thereby reducing contextual hallucination. While this design sharpens task-relevant focus, we show that it also introduces a structural fragility under adversarial perturbations. Our theoretical analysis identifies negative gradient alignment-a configuration encouraged by DA's subtraction-as the key driver of sensitivity amplification, leading to increased gradient norms and elevated local Lipschitz constants. We empirically validate this Fragile Principle through systematic experiments on ViT/DiffViT and evaluations of pretrained CLIP/DiffCLIP, spanning five datasets in total. These results demonstrate higher attack success rates, frequent gradient opposition, and stronger local sensitivity compared to standard attention. Furthermore, depth-dependent experiments reveal a robustness crossover: stacking DA layers attenuates small perturbations via depth-dependent noise cancellation, though this protection fades under larger attack budgets. Overall, our findings uncover a fundamental trade-off: DA improves discriminative focus on clean inputs but increases adversarial vulnerability, underscoring the need to jointly design for selectivity and robustness in future attention mechanisms.
Authors:Yu Yan, Siqi Lu, Yang Gao, Zhaoxuan Li, Ziming Zhao, Qingjun Yuan, Yongjuan Wang
Abstract:
Recently, Bit-Flip Attack (BFA) has garnered widespread attention for its ability to compromise software system integrity remotely through hardware fault injection. With the widespread distillation and deployment of large language models (LLMs) into single file .gguf formats, their weight spaces have become exposed to an unprecedented hardware attack surface. This paper is the first to systematically discover and validate the existence of single-bit vulnerabilities in LLM weight files: in mainstream open-source models (e.g., DeepSeek and QWEN) using .gguf quantized formats, flipping just single bit can induce three types of targeted semantic level failures Artificial Flawed Intelligence (outputting factual errors), Artificial Weak Intelligence (degradation of logical reasoning capability), and Artificial Bad Intelligence (generating harmful content). By building an information theoretic weight sensitivity entropy model and a probabilistic heuristic scanning framework called BitSifter, we achieved efficient localization of critical vulnerable bits in models with hundreds of millions of parameters. Experiments show that vulnerabilities are significantly concentrated in the tensor data region, particularly in areas related to the attention mechanism and output layers, which are the most sensitive. A negative correlation was observed between model size and robustness, with smaller models being more susceptible to attacks. Furthermore, a remote BFA chain was designed, enabling semantic-level attacks in real-world environments: At an attack frequency of 464.3 times per second, a single bit can be flipped with 100% success in as little as 31.7 seconds. This causes the accuracy of LLM to plummet from 73.5% to 0%, without requiring high-cost equipment or complex prompt engineering.
Authors:Dalal Alharthi, Ivan Roberto Kawaminami Garcia
Abstract:
Large Language Models (LLMs) have gained prominence in domains including cloud security and forensics. Yet cloud forensic investigations still rely on manual analysis, making them time-consuming and error-prone. LLMs can mimic human reasoning, offering a pathway to automating cloud log analysis. To address this, we introduce the Cloud Investigation Automation Framework (CIAF), an ontology-driven framework that systematically investigates cloud forensic logs while improving efficiency and accuracy. CIAF standardizes user inputs through semantic validation, eliminating ambiguity and ensuring consistency in log interpretation. This not only enhances data quality but also provides investigators with reliable, standardized information for decision-making. To evaluate security and performance, we analyzed Microsoft Azure logs containing ransomware-related events. By simulating attacks and assessing CIAF's impact, results showed significant improvement in ransomware detection, achieving precision, recall, and F1 scores of 93 percent. CIAF's modular, adaptable design extends beyond ransomware, making it a robust solution for diverse cyberattacks. By laying the foundation for standardized forensic methodologies and informing future AI-driven automation, this work underscores the role of deterministic prompt engineering and ontology-based validation in enhancing cloud forensic investigations. These advancements improve cloud security while paving the way for efficient, automated forensic workflows.
Authors:Dalal Alharthi, Ivan Roberto Kawaminami Garcia
Abstract:
Large language models have gained widespread prominence, yet their vulnerability to prompt injection and other adversarial attacks remains a critical concern. This paper argues for a security-by-design AI paradigm that proactively mitigates LLM vulnerabilities while enhancing performance. To achieve this, we introduce PromptShield, an ontology-driven framework that ensures deterministic and secure prompt interactions. It standardizes user inputs through semantic validation, eliminating ambiguity and mitigating adversarial manipulation. To assess PromptShield's security and performance capabilities, we conducted an experiment on an agent-based system to analyze cloud logs within Amazon Web Services (AWS), containing 493 distinct events related to malicious activities and anomalies. By simulating prompt injection attacks and assessing the impact of deploying PromptShield, our results demonstrate a significant improvement in model security and performance, achieving precision, recall, and F1 scores of approximately 94%. Notably, the ontology-based framework not only mitigates adversarial threats but also enhances the overall performance and reliability of the system. Furthermore, PromptShield's modular and adaptable design ensures its applicability beyond cloud security, making it a robust solution for safeguarding generative AI applications across various domains. By laying the groundwork for AI safety standards and informing future policy development, this work stimulates a crucial dialogue on the pivotal role of deterministic prompt engineering and ontology-based validation in ensuring the safe and responsible deployment of LLMs in high-stakes environments.
Authors:Diego Ortiz Barbosa, Mohit Agrawal, Yash Malegaonkar, Luis Burbano, Axel Andersson, György Dán, Henrik Sandberg, Alvaro A. Cardenas
Abstract:
Autonomous drones must often respond to sudden events, such as alarms, faults, or unexpected changes in their environment, that require immediate and adaptive decision-making. Traditional approaches rely on safety engineers hand-coding large sets of recovery rules, but this strategy cannot anticipate the vast range of real-world contingencies and quickly becomes incomplete. Recent advances in embodied AI, powered by large visual language models, provide commonsense reasoning to assess context and generate appropriate actions in real time. We demonstrate this capability in a simulated urban benchmark in the Unreal Engine, where drones dynamically interpret their surroundings and decide on sudden maneuvers for safe landings. Our results show that embodied AI makes possible a new class of adaptive recovery and decision-making pipelines that were previously infeasible to design by hand, advancing resilience and safety in autonomous aerial systems.
Authors:Valentin Barbaza, Alan Rodrigo Diaz-Rizo, Hassan Aboushady, Spyridon Raptis, Haralampos-G. Stratigopoulos
Abstract:
AI models are often regarded as valuable intellectual property due to the high cost of their development, the competitive advantage they provide, and the proprietary techniques involved in their creation. As a result, AI model stealing attacks pose a serious concern for AI model providers. In this work, we present a novel attack targeting wireless devices equipped with AI hardware accelerators. The attack unfolds in two phases. In the first phase, the victim's device is compromised with a hardware Trojan (HT) designed to covertly leak model weights through a hidden communication channel, without the victim realizing it. In the second phase, the adversary uses a nearby wireless device to intercept the victim's transmission frames during normal operation and incrementally reconstruct the complete weight matrix. The proposed attack is agnostic to both the AI model architecture and the hardware accelerator used. We validate our approach through a hardware-based demonstration involving four diverse AI models of varying types and sizes. We detail the design of the HT and the covert channel, highlighting their stealthy nature. Additionally, we analyze the impact of bit error rates on the reception and propose an error mitigation technique. The effectiveness of the attack is evaluated based on the accuracy of the reconstructed models with stolen weights and the time required to extract them. Finally, we explore potential defense mechanisms.
Authors:Firas Ben Hmida, Abderrahmen Amich, Ata Kaboudi, Birhanu Eshete
Abstract:
Deep neural networks (DNNs) are increasingly being deployed in high-stakes applications, from self-driving cars to biometric authentication. However, their unpredictable and unreliable behaviors in real-world settings require new approaches to characterize and ensure their reliability. This paper introduces DeepProv, a novel and customizable system designed to capture and characterize the runtime behavior of DNNs during inference by using their underlying graph structure. Inspired by system audit provenance graphs, DeepProv models the computational information flow of a DNN's inference process through Inference Provenance Graphs (IPGs). These graphs provide a detailed structural representation of the behavior of DNN, allowing both empirical and structural analysis. DeepProv uses these insights to systematically repair DNNs for specific objectives, such as improving robustness, privacy, or fairness. We instantiate DeepProv with adversarial robustness as the goal of model repair and conduct extensive case studies to evaluate its effectiveness. Our results demonstrate its effectiveness and scalability across diverse classification tasks, attack scenarios, and model complexities. DeepProv automatically identifies repair actions at the node and edge-level within IPGs, significantly enhancing the robustness of the model. In particular, applying DeepProv repair strategies to just a single layer of a DNN yields an average 55% improvement in adversarial accuracy. Moreover, DeepProv complements existing defenses, achieving substantial gains in adversarial robustness. Beyond robustness, we demonstrate the broader potential of DeepProv as an adaptable system to characterize DNN behavior in other critical areas, such as privacy auditing and fairness analysis.
Authors:Maciej Skorski, Francisco-Javier Soto, Onur Günlü
Abstract:
Using Fourier analysis, this paper establishes exact security bounds for linear extractors in True Random Number Generators (TRNGs). We provide the first near-optimal total variation security characterization by interpolating between optimal $\ell_{\infty}$ and $\ell_2$ norm results, expressed through code weight enumerators and input bias parameters. Our bounds improve security assessments by an order of magnitude over previous approximations. By scanning ~20,000 codes, we reveal fundamental trade-offs between compression efficiency and cryptographic security. For instance, we show that achieving 80 bits of security can require sacrificing more than 50\% of the code rate when correcting 10\% input bias. Our bounds enhance security evaluation of TRNG post-processing schemes and quantify the inherent cost of randomness extraction in hardware implementations.
Authors:Korok Ray, Sindura Saraswathi
Abstract:
We formulate the design of a threshold signature scheme as made possible on cryptocurrency protocols like Bitcoin. The funds are secured by an m-of-n threshold signature, where at least m signatures are needed to unlock the funds. A user designs this scheme knowing that a malicious attacker can also obtain the signatures with some probability. Higher thresholds offer more security, but also risk locking the user out of his own funds. The optimal threshold balances these twin effects. Interventions like increasing the security or usability of the signatures allow for higher thresholds. We model dynamic threshold signature schemes, where the probability of a user or attacker obtaining signatures decays with time. A dynamic threshold signature scheme is optimal, and increasing security or usability allows for higher thresholds and longer time locks.
Authors:Matilde Baroni, Igor Klep, Dominik Leichtle, Marc-Olivier Renou, Ivan Å upiÄ, Lucas Tendick, Xiangling Xu
Abstract:
Compiled nonlocal games transfer the power of Bell-type multi-prover tests into a single-device setting by replacing spatial separation with cryptography. Concretely, the KLVY compiler (STOC'23) maps any multi-prover game to an interactive single-prover protocol, using quantum homomorphic encryption. A crucial security property of such compilers is quantum soundness, which ensures that a dishonest quantum prover cannot exceed the original game's quantum value. For practical cryptographic implementations, this soundness must be quantitative, providing concrete bounds, rather than merely asymptotic. While quantitative quantum soundness has been established for the KLVY compiler in the bipartite case, it has only been shown asymptotically for multipartite games. This is a significant gap, as multipartite nonlocality exhibits phenomena with no bipartite analogue, and the difficulty of enforcing space-like separation makes single-device compilation especially compelling. This work closes this gap by showing the quantitative quantum soundness of the KLVY compiler for all multipartite nonlocal games. On the way, we introduce an NPA-like hierarchy for quantum instruments and prove its completeness, thereby characterizing correlations from operationally-non-signaling sequential strategies. We further develop novel geometric arguments for the decomposition of sequential strategies into their signaling and non-signaling parts, which might be of independent interest.
Authors:Wai Ming Chan, Remi Chou, Taejoon Kim
Abstract:
This paper introduces the first two-dimensional XOR-based secret sharing scheme for layered multipath communication networks. We present a construction that guarantees successful message recovery and perfect privacy when an adversary observes and disrupts any single path at each transmission layer. The scheme achieves information-theoretic security using only bitwise XOR operations with linear $O(|S|)$ complexity, where $|S|$ is the message length. We provide mathematical proofs demonstrating that the scheme maintains unconditional security regardless of computational resources available to adversaries. Unlike encryption-based approaches vulnerable to quantum computing advances, our construction offers provable security suitable for resource-constrained military environments where computational assumptions may fail.
Authors:Eunkyu Lee, Donghyeon Kim, Wonyoung Kim, Insu Yun
Abstract:
Coding agents, which are LLM-driven agents specialized in software development, have become increasingly prevalent in modern programming environments. Unlike traditional AI coding assistants, which offer simple code completion and suggestions, modern coding agents tackle more complex tasks with greater autonomy, such as generating entire programs from natural language instructions. To enable such capabilities, modern coding agents incorporate extensive functionalities, which in turn raise significant concerns over their security and privacy. Despite their growing adoption, systematic and in-depth security analysis of these agents has largely been overlooked. In this paper, we present a comprehensive security analysis of eight real-world coding agents. Our analysis addresses the limitations of prior approaches, which were often fragmented and ad hoc, by systematically examining the internal workflows of coding agents and identifying security threats across their components. Through the analysis, we identify 15 security issues, including previously overlooked or missed issues, that can be abused to compromise the confidentiality and integrity of user systems. Furthermore, we show that these security issues are not merely individual vulnerabilities, but can collectively lead to end-to-end exploitations. By leveraging these security issues, we successfully achieved arbitrary command execution in five agents and global data exfiltration in four agents, all without any user interaction or approval. Our findings highlight the need for a comprehensive security analysis in modern LLM-driven agents and demonstrate how insufficient security considerations can lead to severe vulnerabilities.
Authors:Alireza Lotfi, Charalampos Katsis, Elisa Bertino
Abstract:
Software vulnerabilities remain a critical security challenge, providing entry points for attackers into enterprise networks. Despite advances in security practices, the lack of high-quality datasets capturing diverse exploit behavior limits effective vulnerability assessment and mitigation. This paper introduces an end-to-end multi-step pipeline leveraging generative AI, specifically large language models (LLMs), to address the challenges of orchestrating and reproducing attacks to known software vulnerabilities. Our approach extracts information from CVE disclosures in the National Vulnerability Database, augments it with external public knowledge (e.g., threat advisories, code snippets) using Retrieval-Augmented Generation (RAG), and automates the creation of containerized environments and exploit code for each vulnerability. The pipeline iteratively refines generated artifacts, validates attack success with test cases, and supports complex multi-container setups. Our methodology overcomes key obstacles, including noisy and incomplete vulnerability descriptions, by integrating LLMs and RAG to fill information gaps. We demonstrate the effectiveness of our pipeline across different vulnerability types, such as memory overflows, denial of service, and remote code execution, spanning diverse programming languages, libraries and years. In doing so, we uncover significant inconsistencies in CVE descriptions, emphasizing the need for more rigorous verification in the CVE disclosure process. Our approach is model-agnostic, working across multiple LLMs, and we open-source the artifacts to enable reproducibility and accelerate security research. To the best of our knowledge, this is the first system to systematically orchestrate and exploit known vulnerabilities in containerized environments by combining general-purpose LLM reasoning with CVE data and RAG-based context enrichment.
Authors:Shuyi Lin, Tian Lu, Zikai Wang, Bo Wen, Yibo Zhao, Cheng Tan
Abstract:
OpenAI's GPT-OSS family provides open-weight language models with explicit chain-of-thought (CoT) reasoning and a Harmony prompt format. We summarize an extensive security evaluation of GPT-OSS-20B that probes the model's behavior under different adversarial conditions. Using the Jailbreak Oracle (JO) [1], a systematic LLM evaluation tool, the study uncovers several failure modes including quant fever, reasoning blackholes, Schrodinger's compliance, reasoning procedure mirage, and chain-oriented prompting. Experiments demonstrate how these behaviors can be exploited on the GPT-OSS-20B model, leading to severe consequences.
Authors:M. Z. Haider, Tayyaba Noreen, M. Salman
Abstract:
Blockchain Business applications and cryptocurrencies such as enable secure, decentralized value transfer, yet their pseudonymous nature creates opportunities for illicit activity, challenging regulators and exchanges in anti money laundering (AML) enforcement. Detecting fraudulent transactions in blockchain networks requires models that can capture both structural and temporal dependencies while remaining resilient to noise, imbalance, and adversarial behavior. In this work, we propose an ensemble framework that integrates Graph Convolutional Networks (GCN), Graph Attention Networks (GAT), and Graph Isomorphism Networks (GIN) to enhance blockchain fraud detection. Using the real-world Elliptic dataset, our tuned soft voting ensemble achieves high recall of illicit transactions while maintaining a false positive rate below 1%, beating individual GNN models and baseline methods. The modular architecture incorporates quantum-ready design hooks, allowing seamless future integration of quantum feature mappings and hybrid quantum classical graph neural networks. This ensures scalability, robustness, and long-term adaptability as quantum computing technologies mature. Our findings highlight ensemble GNNs as a practical and forward-looking solution for real-time cryptocurrency monitoring, providing both immediate AML utility and a pathway toward quantum-enhanced financial security analytics.
Authors:Junjie Song, Jinguang Han, Man Ho Au, Rupeng Yang, Chao Sun
Abstract:
With the development of Internet, privacy has become a close concern of users. Anonymous authentication plays an important role in privacy-preserving systems. $k$-times anonymous authentication ($k$-TAA) scheme allows members of a group to be authenticated anonymously by application providers up to $k$ times. Considering quantum computing attacks, lattice-based $k$-TAA was introduced. However, existing schemes do not support dynamically granting and revoking users. In this paper, we construct the first lattice-based dynamic $k$-TAA, which offers limited times anonymous authentication, dynamic member management, and post-quantum security. We present a concrete construction, and reduce its security to standard complexity assumptions. Notably, compared with existing lattice-based $k$-TAA, our scheme is efficient in terms of communication cost.
Authors:Moritz Grundei, Aayush Rajasekaran, Kishori Konwar, Muriel Medard
Abstract:
The data availability problem is a central challenge in blockchain systems and lies at the core of the accessibility and scalability issues faced by platforms such as Ethereum. Modern solutions employ several approaches, with data availability sampling (DAS) being the most self-sufficient and minimalistic in its security assumptions. Existing DAS methods typically form cryptographic commitments on codewords of fixed-rate erasure codes, which restrict light nodes to sampling from a predetermined set of coded symbols. In this paper, we introduce a new approach to DAS that modularizes the coding and commitment process by committing to the uncoded data while performing sampling through on-the-fly coding. The resulting samples are significantly more expressive, enabling light nodes to obtain, in concrete implementations, up to multiple orders of magnitude stronger assurances of data availability than from sampling pre-committed symbols from a fixed-rate redundancy code as done in established DAS schemes using Reed Solomon or low density parity check codes. We present a concrete protocol that realizes this paradigm using random linear network coding (RLNC).
Authors:Anh Tu Ngo, Anupam Chattopadhyay, Subhamoy Maitra
Abstract:
In this paper we show that cryptographic backdoors in a neural network (NN) can be highly effective in two directions, namely mounting the attacks as well as in presenting the defenses as well. On the attack side, a carefully planted cryptographic backdoor enables powerful and invisible attack on the NN. Considering the defense, we present applications: first, a provably robust NN watermarking scheme; second, a protocol for guaranteeing user authentication; and third, a protocol for tracking unauthorized sharing of the NN intellectual property (IP). From a broader theoretical perspective, borrowing the ideas from Goldwasser et. al. [FOCS 2022], our main contribution is to show that all these instantiated practical protocol implementations are provably robust. The protocols for watermarking, authentication and IP tracking resist an adversary with black-box access to the NN, whereas the backdoor-enabled adversarial attack is impossible to prevent under the standard assumptions. While the theoretical tools used for our attack is mostly in line with the Goldwasser et. al. ideas, the proofs related to the defense need further studies. Finally, all these protocols are implemented on state-of-the-art NN architectures with empirical results corroborating the theoretical claims. Further, one can utilize post-quantum primitives for implementing the cryptographic backdoors, laying out foundations for quantum-era applications in machine learning (ML).
Authors:Adam Swanda, Amy Chang, Alexander Chen, Fraser Burch, Paul Kassianik, Konstantin Berlin
Abstract:
The widespread adoption of Large Language Models (LLMs) has revolutionized AI deployment, enabling autonomous and semi-autonomous applications across industries through intuitive language interfaces and continuous improvements in model development. However, the attendant increase in autonomy and expansion of access permissions among AI applications also make these systems compelling targets for malicious attacks. Their inherent susceptibility to security flaws necessitates robust defenses, yet no known approaches can prevent zero-day or novel attacks against LLMs. This places AI protection systems in a category similar to established malware protection systems: rather than providing guaranteed immunity, they minimize risk through enhanced observability, multi-layered defense, and rapid threat response, supported by a threat intelligence function designed specifically for AI-related threats. Prior work on LLM protection has largely evaluated individual detection models rather than end-to-end systems designed for continuous, rapid adaptation to a changing threat landscape. We present a production-grade defense system rooted in established malware detection and threat intelligence practices. Our platform integrates three components: a threat intelligence system that turns emerging threats into protections; a data platform that aggregates and enriches information while providing observability, monitoring, and ML operations; and a release platform enabling safe, rapid detection updates without disrupting customer workflows. Together, these components deliver layered protection against evolving LLM threats while generating training data for continuous model improvement and deploying updates without interrupting production.
Authors:Dana A Abdullah, Dana Rasul Hamad, Bishar Rasheed Ibrahim, Sirwan Abdulwahid Aula, Aso Khaleel Ameen, Sabat Salih Hamadamin
Abstract:
Altered fingerprint recognition (AFR) is challenging for biometric verification in applications such as border control, forensics, and fiscal admission. Adversaries can deliberately modify ridge patterns to evade detection, so robust recognition of altered prints is essential. We present DeepAFRNet, a deep learning recognition model that matches and recognizes distorted fingerprint samples. The approach uses a VGG16 backbone to extract high-dimensional features and cosine similarity to compare embeddings. We evaluate on the SOCOFing Real-Altered subset with three difficulty levels (Easy, Medium, Hard). With strict thresholds, DeepAFRNet achieves accuracies of 96.7 percent, 98.76 percent, and 99.54 percent for the three levels. A threshold-sensitivity study shows that relaxing the threshold from 0.92 to 0.72 sharply degrades accuracy to 7.86 percent, 27.05 percent, and 29.51 percent, underscoring the importance of threshold selection in biometric systems. By using real altered samples and reporting per-level metrics, DeepAFRNet addresses limitations of prior work based on synthetic alterations or limited verification protocols, and indicates readiness for real-world deployments where both security and recognition resilience are critical.
Authors:Ren-Yi Huang, Dumindu Samaraweera, Prashant Shekhar, J. Morris Chang
Abstract:
Federated Learning (FL) enables collaborative model training while preserving data privacy by keeping raw data locally stored on client devices, preventing access from other clients or the central server. However, recent studies reveal that sharing model gradients creates vulnerability to Model Inversion Attacks, particularly Deep Leakage from Gradients (DLG), which reconstructs private training data from shared gradients. While Homomorphic Encryption has been proposed as a promising defense mechanism to protect gradient privacy, fully encrypting all model gradients incurs high computational overhead. Selective encryption approaches aim to balance privacy protection with computational efficiency by encrypting only specific gradient components. However, the existing literature largely overlooks a theoretical exploration of the spectral behavior of encrypted versus unencrypted parameters, relying instead primarily on empirical evaluations. To address this gap, this paper presents a framework for theoretical analysis of the underlying principles of selective encryption as a defense against model inversion attacks. We then provide a comprehensive empirical study that identifies and quantifies the critical factors, such as model complexity, encryption ratios, and exposed gradients, that influence defense effectiveness. Our theoretical framework clarifies the relationship between gradient selection and privacy preservation, while our experimental evaluation demonstrates how these factors shape the robustness of defenses against model inversion attacks. Collectively, these contributions advance the understanding of selective encryption mechanisms and offer principled guidance for designing efficient, scalable, privacy-preserving federated learning systems.
Authors:Grace Billiris, Asif Gill, Madhushi Bandara
Abstract:
Quantum Artificial Intelligence (QAI), the integration of Artificial Intelligence (AI) and Quantum Computing (QC), promises transformative advances, including AI-enabled quantum cryptography and quantum-resistant encryption protocols. However, QAI inherits data risks from both AI and QC, creating complex privacy and security vulnerabilities that are not systematically studied. These risks affect the trustworthiness and reliability of AI and QAI systems, making their understanding critical. This study systematically reviews 67 privacy- and security-related studies to expand understanding of QAI data risks. We propose a taxonomy of 22 key data risks, organised into five categories: governance, risk assessment, control implementation, user considerations, and continuous monitoring. Our findings reveal vulnerabilities unique to QAI and identify gaps in holistic risk assessment. This work contributes to trustworthy AI and QAI research and provides a foundation for developing future risk assessment tools.
Authors:Mohamed E. Najd, Ghada Almashaqbeh
Abstract:
Decentralized resource markets are Web 3.0 applications that build open-access platforms for trading digital resources among users without any central management. They promise cost reduction, transparency, and flexible service provision. However, these markets usually have large workload that must be processed in a timely manner, leading to serious scalability problems. Despite the large amount of work on blockchain scalability, existing solutions are ineffective as they do not account for these markets' work models and traffic patterns. We introduce chainScale, a secure hybrid sidechain-sharding solution that aims to boost throughput of decentralized resource markets and reduce their latency and storage footprint. At its core, chainScale leverages dependent sidechains and functionality-oriented workload splitting to parallelize traffic processing by having each market module assigned to a sidechain. Different from sharding, chainScale does not incur any cross-sidechain transactions that tend to be costly. chainScale introduces several techniques, including hierarchical workload sharing that further sub-divides overloaded modules, and weighted miner assignment that assigns miners with vested interest in the system to critical modules' sidechains. Furthermore, chainScale employs sidechain syncing to maintain the mainchain as the single truth of system state, and pruning to discard stale records. Beside analyzing security, we build a proof-of-concept implementation for a distributed file storage market as a use case. Our experiments show that, compared to a single sidechain-based prior solution, chainScale boosts throughput by 4x and reduces confirmation latency by 5x. Also, they show that chainScale outperforms sharding by 2.5x in throughput and 3.5x in latency.
Authors:Marco Benedetti, Andrej Bogdanov, Enrico M. Malatesta, Marc Mézard, Gianmarco Perrupato, Alon Rosen, Nikolaj I. Schwartzbach, Riccardo Zecchina
Abstract:
When neural networks are trained to classify a dataset, one finds a set of weights from which the network produces a label for each data point. We study the algorithmic complexity of finding a collision in a single-layer neural net, where a collision is defined as two distinct sets of weights that assign the same labels to all data. For binary perceptrons with oscillating activation functions, we establish the emergence of an overlap gap property in the space of collisions. This is a topological property believed to be a barrier to the performance of efficient algorithms. The hardness is supported by numerical experiments using approximate message passing algorithms, for which the algorithms stop working well below the value predicted by our analysis. Neural networks provide a new category of candidate collision resistant functions, which for some parameter setting depart from constructions based on lattices. Beyond relevance to cryptography, our work uncovers new forms of computational hardness emerging in large neural networks which may be of independent interest.
Authors:Raj Patel, Umesh Biswas, Surya Kodipaka, Will Carroll, Preston Peranich, Maxwell Young
Abstract:
Peer-to-peer (P2P) networks are a cornerstone of modern computing, and their security is an active area of research. Many defenses with strong security guarantees have been proposed; however, the most-recent survey is over a decade old. This paper delivers an updated review of recent theoretical advances that address classic threats, such as the Sybil and routing attacks, while highlighting how emerging trends -- such as machine learning, social networks, and dynamic systems -- pose new challenges and drive novel solutions. We evaluate the strengths and weaknesses of these solutions and suggest directions for future research.
Authors:Saeid Sheikhi, Panos Kostakos, Lauri Loven
Abstract:
Federated Learning (FL) in 5G and edge network environments face severe security threats from adversarial clients. Malicious participants can perform label flipping, inject backdoor triggers, or launch Sybil attacks to corrupt the global model. This paper introduces Hybrid Reputation Aggregation (HRA), a novel robust aggregation mechanism designed to defend against diverse adversarial behaviors in FL without prior knowledge of the attack type. HRA combines geometric anomaly detection with momentum-based reputation tracking of clients. In each round, it detects outlier model updates via distance-based geometric analysis while continuously updating a trust score for each client based on historical behavior. This hybrid approach enables adaptive filtering of suspicious updates and long-term penalization of unreliable clients, countering attacks ranging from backdoor insertions to random noise Byzantine failures. We evaluate HRA on a large-scale proprietary 5G network dataset (3M+ records) and the widely used NF-CSE-CIC-IDS2018 benchmark under diverse adversarial attack scenarios. Experimental results reveal that HRA achieves robust global model accuracy of up to 98.66% on the 5G dataset and 96.60% on NF-CSE-CIC-IDS2018, outperforming state-of-the-art aggregators such as Krum, Trimmed Mean, and Bulyan by significant margins. Our ablation studies further demonstrate that the full hybrid system achieves 98.66% accuracy, while the anomaly-only and reputation-only variants drop to 84.77% and 78.52%, respectively, validating the synergistic value of our dual-mechanism approach. This demonstrates HRA's enhanced resilience and robustness in 5G/edge federated learning deployments, even under significant adversarial conditions.
Authors:Samuel Breckenridge, Dani Vilardell, Andrés Fábrega, Amy Zhao, Patrick McCorry, Rafael Solari, Ari Juels
Abstract:
In traditional, one-vote-per-person voting systems, privacy equates with ballot secrecy: voting tallies are published, but individual voters' choices are concealed. Voting systems that weight votes in proportion to token holdings, though, are now prevalent in cryptocurrency and web3 systems. We show that these weighted-voting systems overturn existing notions of voter privacy. Our experiments demonstrate that even with secret ballots, publishing raw tallies often reveals voters' choices. Weighted voting thus requires a new framework for privacy. We introduce a notion called B-privacy whose basis is bribery, a key problem in voting systems today. B-privacy captures the economic cost to an adversary of bribing voters based on revealed voting tallies. We propose a mechanism to boost B-privacy by noising voting tallies. We prove bounds on its tradeoff between B-privacy and transparency, meaning reported-tally accuracy. Analyzing 3,582 proposals across 30 Decentralized Autonomous Organizations (DAOs), we find that the prevalence of large voters ("whales") limits the effectiveness of any B-Privacy-enhancing technique. However, our mechanism proves to be effective in cases without extreme voting weight concentration: among proposals requiring coalitions of $\geq5$ voters to flip outcomes, our mechanism raises B-privacy by a geometric mean factor of $4.1\times$. Our work offers the first principled guidance on transparency-privacy tradeoffs in weighted-voting systems, complementing existing approaches that focus on ballot secrecy and revealing fundamental constraints that voting weight concentration imposes on privacy mechanisms.
Authors:Selma Yahia, Ildi Alla, Girija Bangalore Mohan, Daniel Rau, Mridula Singh, Valeria Loscri
Abstract:
Autonomous vehicles (AVs) rely heavily on LiDAR sensors for accurate 3D perception. We show a novel class of low-cost, passive LiDAR spoofing attacks that exploit mirror-like surfaces to inject or remove objects from an AV's perception. Using planar mirrors to redirect LiDAR beams, these attacks require no electronics or custom fabrication and can be deployed in real settings. We define two adversarial goals: Object Addition Attacks (OAA), which create phantom obstacles, and Object Removal Attacks (ORA), which conceal real hazards. We develop geometric optics models, validate them with controlled outdoor experiments using a commercial LiDAR and an Autoware-equipped vehicle, and implement a CARLA-based simulation for scalable testing. Experiments show mirror attacks corrupt occupancy grids, induce false detections, and trigger unsafe planning and control behaviors. We discuss potential defenses (thermal sensing, multi-sensor fusion, light-fingerprinting) and their limitations.
Authors:Mayukh Borana, Junyi Liang, Sai Sathiesh Rajan, Sudipta Chattopadhyay
Abstract:
We introduce FreqRank, a mutation-based defense to localize malicious components in LLM outputs and their corresponding backdoor triggers. FreqRank assumes that the malicious sub-string(s) consistently appear in outputs for triggered inputs and uses a frequency-based ranking system to identify them. Our ranking system then leverages this knowledge to localize the backdoor triggers present in the inputs. We create nine malicious models through fine-tuning or custom instructions for three downstream tasks, namely, code completion (CC), code generation (CG), and code summarization (CS), and show that they have an average attack success rate (ASR) of 86.6%. Furthermore, FreqRank's ranking system highlights the malicious outputs as one of the top five suggestions in 98% of cases. We also demonstrate that FreqRank's effectiveness scales as the number of mutants increases and show that FreqRank is capable of localizing the backdoor trigger effectively even with a limited number of triggered samples. Finally, we show that our approach is 35-50% more effective than other defense methods.
Authors:Petr Grinberg, Eric Bezzam, Paolo Prandoni, Martin Vetterli
Abstract:
With society's increasing reliance on digital data sharing, the protection of sensitive information has become critical. Encryption serves as one of the privacy-preserving methods; however, its realization in the audio domain predominantly relies on signal processing or software methods embedded into hardware. In this paper, we introduce LenslessMic, a hybrid optical hardware-based encryption method that utilizes a lensless camera as a physical layer of security applicable to multiple types of audio. We show that LenslessMic enables (1) robust authentication of audio recordings and (2) encryption strength that can rival the search space of 256-bit digital standards, while maintaining high-quality signals and minimal loss of content information. The approach is validated with a low-cost Raspberry Pi prototype and is open-sourced together with datasets to facilitate research in the area.
Authors:Yunfan Yang, Jiarong Xu, Hongzhe Zhang, Xiao Fang
Abstract:
Model-sharing offers significant business value by enabling firms with well-established Machine Learning (ML) models to monetize and share their models with others who lack the resources to develop ML models from scratch. However, concerns over data confidentiality remain a significant barrier to model-sharing adoption, as Confidential Property Inference (CPI) attacks can exploit shared ML models to uncover confidential properties of the model provider's private model training data. Existing defenses often assume that CPI attacks are non-adaptive to the specific ML model they are targeting. This assumption overlooks a key characteristic of real-world adversaries: their responsiveness, i.e., adversaries' ability to dynamically adjust their attack models based on the information of the target and its defenses. To overcome this limitation, we propose a novel defense method that explicitly accounts for the responsive nature of real-world adversaries via two methodological innovations: a novel Responsive CPI attack and an attack-defense arms race framework. The former emulates the responsive behaviors of adversaries in the real world, and the latter iteratively enhances both the target and attack models, ultimately producing a secure ML model that is robust against responsive CPI attacks. Furthermore, we propose and integrate a novel approximate strategy into our defense, which addresses a critical computational bottleneck of defense methods and improves defense efficiency. Through extensive empirical evaluations across various realistic model-sharing scenarios, we demonstrate that our method outperforms existing defenses by more effectively defending against CPI attacks, preserving ML model utility, and reducing computational overhead.
Authors:Yaser Baseri, Abdelhakim Hafid, Arash Habibi Lashkari
Abstract:
Quantum Computing (QC) introduces a transformative threat to digital security, with the potential to compromise widely deployed classical cryptographic systems. This survey offers a comprehensive and systematic examination of quantumsafe security for Cloud Computing (CC), focusing on the vulnerabilities, transition strategies, and mitigation mechanisms required to secure cloud infrastructures in the quantum era. We evaluated the landscape of quantum threats across the entire CC stack, demonstrating how quantum algorithms can undermine classical encryption and compromise cloud security at multiple architectural layers. Using a structured risk assessment methodology based on the STRIDE model, we evaluate quantum-induced attack vectors and their impact on cloud environments. To address these challenges, we propose a layered security framework that integrates hybrid cryptographic transition strategies, cryptographic agility, and proactive risk mitigation. We analyze the preparation and implementation approaches of the major Cloud Service Providers (CSPs), including AWS, Azure and GCP, synthesizing platform-specific initiatives toward Post-Quantum Cryptography (PQC). Furthermore, we provide a detailed evaluation of standardized PQC algorithms, exploring their resilience to side-channel and active attacks within cloud-native deployments. This survey serves as a strategic reference for cloud architects, policymakers, and researchers, offering actionable insights for navigating the complex transition to quantum-resilient cloud systems. We conclude by identifying six key future research directions: standardization and interoperability, performance and scalability, implementation security, integration with emerging technologies, systemic preparedness, and crypto-agile migration frameworks.
Authors:Rasil Baidar, Sasa Maric, Robert Abbas
Abstract:
The exponential expansion of IoT and 5G-Advanced applications has enlarged the attack surface for DDoS, malware, and zero-day intrusions. We propose an intrusion detection system that fuses a convolutional neural network (CNN), a bidirectional LSTM (BiLSTM), and an autoencoder (AE) bottleneck within a privacy-preserving federated learning (FL) framework. The CNN-BiLSTM branch captures local and gated cross-feature interactions, while the AE emphasizes reconstruction-based anomaly sensitivity. Training occurs across edge devices without sharing raw data. On UNSW-NB15 (binary), the fused model attains AUC 99.59 percent and F1 97.36 percent; confusion-matrix analysis shows balanced error rates with high precision and recall. Average inference time is approximately 0.0476 ms per sample on our test hardware, which is well within the less than 10 ms URLLC budget, supporting edge deployment. We also discuss explainability, drift tolerance, and FL considerations for compliant, scalable 5G-Advanced IoT security.
Authors:Yuanbo Xie, Yingjie Zhang, Tianyun Liu, Duohe Ma, Tingwen Liu
Abstract:
Jailbreak attacks pose persistent threats to large language models (LLMs). Current safety alignment methods have attempted to address these issues, but they experience two significant limitations: insufficient safety alignment depth and unrobust internal defense mechanisms. These limitations make them vulnerable to adversarial attacks such as prefilling and refusal direction manipulation. We introduce DeepRefusal, a robust safety alignment framework that overcomes these issues. DeepRefusal forces the model to dynamically rebuild its refusal mechanisms from jailbreak states. This is achieved by probabilistically ablating the refusal direction across layers and token depths during fine-tuning. Our method not only defends against prefilling and refusal direction attacks but also demonstrates strong resilience against other unseen jailbreak strategies. Extensive evaluations on four open-source LLM families and six representative attacks show that DeepRefusal reduces attack success rates by approximately 95%, while maintaining model capabilities with minimal performance degradation.
Authors:Ozer Ozturk, Busra Buyuktanir, Gozde Karatas Baydogmus, Kazim Yildiz
Abstract:
Machine learning models used for distributed architectures consisting of servers and clients require large amounts of data to achieve high accuracy. Data obtained from clients are collected on a central server for model training. However, storing data on a central server raises concerns about security and privacy. To address this issue, a federated learning architecture has been proposed. In federated learning, each client trains a local model using its own data. The trained models are periodically transmitted to the central server. The server then combines the received models using federated aggregation algorithms to obtain a global model. This global model is distributed back to the clients, and the process continues in a cyclical manner. Although preventing data from leaving the clients enhances security, certain concerns still remain. Attackers can perform inference attacks on the obtained models to approximate the training dataset, potentially causing data leakage. In this study, differential privacy was applied to address the aforementioned security vulnerability, and a performance analysis was conducted. The Data-Unaware Classification Based on Association (duCBA) algorithm was used as the federated aggregation method. Differential privacy was implemented on the data using the Randomized Response technique, and the trade-off between security and performance was examined under different epsilon values. As the epsilon value decreased, the model accuracy declined, and class prediction imbalances were observed. This indicates that higher levels of privacy do not always lead to practical outcomes and that the balance between security and performance must be carefully considered.
Authors:Ali Sadeghi Jahromi, AbdelRahman Abdou, Paul C. van Oorschot
Abstract:
Since security was not among the original design goals of the Domain Name System (herein called Vanilla DNS), many secure DNS schemes have been proposed to enhance the security and privacy of the DNS resolution process. Some proposed schemes aim to replace the existing DNS infrastructure entirely, but none have succeeded in doing so. In parallel, numerous schemes focus on improving DNS security without modifying its fundamental two-stage structure. These efforts highlight the feasibility of addressing DNS security as two distinct but compatible stages. We survey DNS resolution process attacks and threats and develop a comprehensive threat model and attack taxonomy for their systematic categorization. This analysis results in the formulation of 14 desirable security, privacy, and availability properties to mitigate the identified threats. Using these properties, we develop an objective evaluation framework and apply it to comparatively analyze 12 secure DNS schemes surveyed in this work that aim to augment the properties of the DNS resolution process. Our evaluation reveals that no single scheme provides ideal protection across the entire resolution path. Instead, the schemes tend to address a subset of properties specific to individual stages. Since these schemes targeting different stages of DNS resolution are complementary and can operate together, combining compatible schemes offers a practical and effective approach to achieving comprehensive security in the DNS resolution process.
Authors:Vijay Kumar Butte, Sujata Butte
Abstract:
The enterprises today are faced with the tough challenge of processing, storing large amounts of data in a secure, scalable manner and enabling decision makers to make quick, informed data driven decisions. This paper addresses this challenge and develops an effective enterprise data strategy in the cloud. Various components of an effective data strategy are discussed and architectures addressing security, scalability and privacy aspects are provided.
Authors:Eman Abu Ishgair, Chinenye Okafor, Marcela S. Melara, Santiago Torres-Arias
Abstract:
Software Bills of Materials (SBOMs) have become a regulatory requirement for improving software supply chain security and trust by means of transparency regarding components that make up software artifacts. However, enterprise and regulated software vendors commonly wish to restrict who can view confidential software metadata recorded in their SBOMs due to intellectual property or security vulnerability information. To address this tension between transparency and confidentiality, we propose Petra, an SBOM exchange system that empowers software vendors to interoperably compose and distribute redacted SBOM data using selective encryption. Petra enables software consumers to search redacted SBOMs for answers to specific security questions without revealing information they are not authorized to access. Petra leverages a format-agnostic, tamper-evident SBOM representation to generate efficient and confidentiality-preserving integrity proofs, allowing interested parties to cryptographically audit and establish trust in redacted SBOMs. Exchanging redacted SBOMs in our Petra prototype requires less than 1 extra KB per SBOM, and SBOM decryption account for at most 1% of the performance overhead during an SBOM query.
Authors:Jukka Ruohonen, Sani Abdullahi, Abhishek Tiwari
Abstract:
Motivated by software maintenance and the more recent concept of security debt, the paper presents a time series analysis of vulnerability patching of Red Hat's products and components between 1999 and 2024. According to the results based on segmented regression analysis, the amounts of vulnerable products and components have not been stable; a linear trend describes many of the series well. Nor do the amounts align well with trends characterizing vulnerabilities in general. There are also visible breakpoints indicating that the linear trend is not universally applicable and that the growing security debt may be stabilizing.
Authors:Magnus Wiik Eckhoff, Peter Marius Flydal, Siem Peters, Martin Eian, Jonas Halvorsen, Vasileios Mavroeidis, Gudmund Grov
Abstract:
Interpreting the massive volume of security alerts is a significant challenge in Security Operations Centres (SOCs). Effective contextualisation is important, enabling quick distinction between genuine threats and benign activity to prioritise what needs further analysis. This paper proposes a graph-based approach to enhance alert contextualisation in a SOC by aggregating alerts into graph-based alert groups, where nodes represent alerts and edges denote relationships within defined time-windows. By grouping related alerts, we enable analysis at a higher abstraction level, capturing attack steps more effectively than individual alerts. Furthermore, to show that our format is well suited for downstream machine learning methods, we employ Graph Matching Networks (GMNs) to correlate incoming alert groups with historical incidents, providing analysts with additional insights.
Authors:Jiaao Ma, Ceyu Xu, Lisa Wu Wills
Abstract:
In the era of cloud computing, privacy-preserving computation offloading is crucial for safeguarding sensitive data. Fully Homomorphic Encryption (FHE) enables secure processing of encrypted data, but the inherent computational complexity of FHE operations introduces significant computational overhead on the server side. FHE schemes often face a tradeoff between efficiency and versatility. While the CKKS scheme is highly efficient for polynomial operations, it lacks the flexibility of the binary TFHE (Torus-FHE) scheme, which offers greater versatility but at the cost of efficiency. The recent multi-bit TFHE extension offers greater flexibility and performance by supporting native non-polynomial operations and efficient integer processing. However, current implementations of multi-bit TFHE are constrained by its narrower numeric representation, which prevents its adoption in applications requiring wider numeric representations. To address this challenge, we introduce Taurus, a hardware accelerator designed to enhance the efficiency of multi-bit TFHE computations. Taurus supports ciphertexts up to 10 bits by leveraging novel FFT units and optimizing memory bandwidth through key reuse strategies. We also propose a compiler with operation deduplication to improve memory utilization. Our experiment results demonstrate that Taurus achieves up to 2600x speedup over a CPU, 1200x speedup over a GPU, and up to 7x faster compared to the previous state-of-the-art TFHE accelerator. Moreover, Taurus is the first accelerator to demonstrate privacy-preserving inference with large language models such as GPT-2. These advancements enable more practical and scalable applications of privacy-preserving computation in cloud environments.
Authors:Manish Bansal, Pramsu Shrivastava, J. Harshan
Abstract:
In Vehicle-to-Everything (V2X) networks with multi-hop communication, Road Side Units (RSUs) intend to gather location data from the vehicles to offer various location-based services. Although vehicles use the Global Positioning System (GPS) for navigation, they may refrain from sharing their exact GPS coordinates to the RSUs due to privacy considerations. Thus, to address the localization expectations of the RSUs and the privacy concerns of the vehicles, we introduce a relaxed-privacy model wherein the vehicles share their partial location information in order to avail the location-based services. To implement this notion of relaxed-privacy, we propose a low-latency protocol for spatial-provenance recovery, wherein vehicles use correlated linear Bloom filters to embed their position information. Our proposed spatial-provenance recovery process takes into account the resolution of localization, the underlying ad hoc protocol, and the coverage range of the wireless technology used by the vehicles. Through a rigorous theoretical analysis, we present extensive analysis on the underlying trade-off between relaxed-privacy and the communication-overhead of the protocol. Finally, using a wireless testbed, we show that our proposed method requires a few bits in the packet header to provide security features such as localizing a low-power jammer executing a denial-of-service attack.
Authors:Muhammad H. Ashiq, Peter Triantafillou, Hung Yun Tseng, Grigoris G. Chrysos
Abstract:
A key concern for AI safety remains understudied in the machine learning (ML) literature: how can we ensure users of ML models do not leverage predictions on incorrect personal data to harm others? This is particularly pertinent given the rise of open-weight models, where simply masking model outputs does not suffice to prevent adversaries from recovering harmful predictions. To address this threat, which we call *test-time privacy*, we induce maximal uncertainty on protected instances while preserving accuracy on all other instances. Our proposed algorithm uses a Pareto optimal objective that explicitly balances test-time privacy against utility. We also provide a certifiable approximation algorithm which achieves $(\varepsilon, δ)$ guarantees without convexity assumptions. We then prove a tight bound that characterizes the privacy-utility tradeoff that our algorithms incur. Empirically, our method obtains at least $>3\times$ stronger uncertainty than pretraining with marginal drops in accuracy on various image recognition benchmarks. Altogether, this framework provides a tool to guarantee additional protection to end users.
Authors:M. Z. Haider, M. Dias de Assuncao, Kaiwen Zhang
Abstract:
Blockchain technology offers decentralization and security but struggles with scalability, particularly in enterprise settings where efficiency and controlled access are paramount. Sharding is a promising solution for private blockchains, yet existing approaches face challenges in coordinating shards, ensuring fault tolerance with limited nodes, and minimizing the high overhead of consensus mechanisms like PBFT. This paper proposes the Range-Based Sharding (RBS) Protocol, a novel sharding mechanism tailored for enterprise blockchains, implemented on Quorum. Unlike traditional sharding models such as OmniLedger and non-sharding Corda framework, RBS employs a commit-reveal scheme for secure and unbiased shard allocation, ensuring fair validator distribution while reducing cross-shard transaction delays. Our approach enhances scalability by balancing computational loads across shards, reducing consensus overhead, and improving parallel transaction execution. Experimental evaluations demonstrate that RBS achieves significantly higher throughput and lower latency compared to existing enterprise sharding frameworks, making it a viable and efficient solution for largescale blockchain deployments.
Authors:Rishit Agrawal, Kunal Bhatnagar, Andrew Do, Ronnit Rana, Mark Stamp
Abstract:
Recently, a considerable amount of malware research has focused on the use of powerful image-based machine learning techniques, which generally yield impressive results. However, before image-based techniques can be applied to malware, the samples must be converted to images, and there is no generally-accepted approach for doing so. The malware-to-image conversion strategies found in the literature often appear to be ad hoc, with little or no effort made to take into account properties of executable files. In this paper, we experiment with eight distinct malware-to-image conversion techniques, and for each, we test a variety of learning models. We find that several of these image conversion techniques perform similarly across a range of learning models, in spite of the image conversion processes being quite different. These results suggest that the effectiveness of image-based malware classification techniques may depend more on the inherent strengths of image analysis techniques, as opposed to the precise details of the image conversion strategy.
Authors:Prokash Barman, Ratul Chowdhury, Banani Saha
Abstract:
In modern smart systems, the convergence of the Internet of Things (IoT) and Wireless of Things (WoT) have been revolutionized by offering a broad level of wireless connectivity and communication among various devices. Hitherto, this greater interconnectivity poses important security problems, including the question of how to securely interconnect different networks, preserve secure communication channels, and maintain data integrity. However, the traditional cryptographic method and frequency hopping technique, although they provide some protection, are not sufficient to defend against Man-In-The-Middle, jamming, and replay attacks. In addition, synchronization issues in multi-channel communication systems result in increased latency and energy consumption, which make them unsuitable for resource-constrained IoT and WoT devices. This work presents the Multi-Channel Secure Communication (MCSC) framework, which integrates advanced cryptographic protocols with dynamic channel-hopping strategies to enhance security with reduced synchronization overhead. The MCSC framework maximizes the critical performance metrics, such as packet delivery ratio, latency, throughput, and energy efficiency, and fulfills the specific requirements of the IoT and WoT networks. A comprehensive comparison of MCSC with well-established methods, including Frequency Hop Spread Spectrum, single channel Advanced Encryption Standard, and various Elliptic Curve Cryptography-based schemes, indicates that MCSC has lower error rates and is more resilient to a wider range of cyber attacks. The efficiency of the proposed solution to secure IoT and WoT networks without compromising the operational performance is validated under various interference conditions.
Authors:Pavan Reddy, Aditya Sanjay Gujral
Abstract:
Large language model (LLM) assistants are increasingly integrated into enterprise workflows, raising new security concerns as they bridge internal and external data sources. This paper presents an in-depth case study of EchoLeak (CVE-2025-32711), a zero-click prompt injection vulnerability in Microsoft 365 Copilot that enabled remote, unauthenticated data exfiltration via a single crafted email. By chaining multiple bypasses-evading Microsofts XPIA (Cross Prompt Injection Attempt) classifier, circumventing link redaction with reference-style Markdown, exploiting auto-fetched images, and abusing a Microsoft Teams proxy allowed by the content security policy-EchoLeak achieved full privilege escalation across LLM trust boundaries without user interaction. We analyze why existing defenses failed, and outline a set of engineering mitigations including prompt partitioning, enhanced input/output filtering, provenance-based access control, and strict content security policies. Beyond the specific exploit, we derive generalizable lessons for building secure AI copilots, emphasizing the principle of least privilege, defense-in-depth architectures, and continuous adversarial testing. Our findings establish prompt injection as a practical, high-severity vulnerability class in production AI systems and provide a blueprint for defending against future AI-native threats.
Authors:Ming Zhou, Xupu Hu, Zhihao Wang, Haining Wang, Hui Wen, Limin Sun, Peng Zhang
Abstract:
Existing dynamic vulnerability patching techniques are not well-suited for embedded devices, especially mission-critical ones such as medical equipment, as they have limited computational power and memory but uninterrupted service requirements. Those devices often lack sufficient idle memory for dynamic patching, and the diverse architectures of embedded systems further complicate the creation of patch triggers that are compatible across various system kernels and hardware platforms. To address these challenges, we propose a hot patching framework called StackPatch that facilitates patch development based on stack frame reconstruction. StackPatch introduces different triggering strategies to update programs stored in memory units. We leverage the exception-handling mechanisms commonly available in embedded processors to enhance StackPatch's adaptability across different processor architectures for control flow redirection. We evaluated StackPatch on embedded devices featuring three major microcontroller (MCU) architectures: ARM, RISC-V, and Xtensa. In the experiments, we used StackPatch to successfully fix 102 publicly disclosed vulnerabilities in real-time operating systems (RTOS). We applied patching to medical devices, soft programmable logic controllers (PLCs), and network services, with StackPatch consistently completing each vulnerability remediation in less than 260 MCU clock cycles.
Authors:Harshini Sri Ramulu, Helen Schmitt, Bogdan Rerich, Rachel Gonzalez Rodriguez, Tadayoshi Kohno, Yasemin Acar
Abstract:
Ethical questions are discussed regularly in computer security. Still, researchers in computer security lack clear guidance on how to make, document, and assess ethical decisions in research when what is morally right or acceptable is not clear-cut. In this work, we give an overview of the discussion of ethical implications in current published work in computer security by reviewing all 1154 top-tier security papers published in 2024, finding inconsistent levels of ethics reporting with a strong focus of reporting institutional or ethics board approval, human subjects protection, and responsible disclosure, and a lack of discussion of balancing harms and benefits. We further report on the results of a semi-structured interview study with 24 computer security and privacy researchers (among whom were also: reviewers, ethics committee members, and/or program chairs) and their ethical decision-making both as authors and during peer review, finding a strong desire for ethical research, but a lack of consistency in considered values, ethical frameworks (if articulated), decision-making, and outcomes. We present an overview of the current state of the discussion of ethics and current de-facto standards in computer security research, and contribute suggestions to improve the state of ethics in computer security research.
Authors:Jacopo Bufalino, Agathe Blaise, Stefano Secci
Abstract:
Modern software development increasingly depends on open-source libraries and third-party components, which are often encapsulated into containerized environments. While improving the development and deployment of applications, this approach introduces security risks, particularly when outdated or vulnerable components are inadvertently included in production environments. Software Composition Analysis (SCA) is a critical process that helps identify and manage packages and dependencies inside a container. However, unintentional modifications to the container filesystem can lead to incomplete container images, which compromise the reliability of SCA tools. In this paper, we examine the limitations of both cloud-based and open-source SCA tools when faced with such obscure images. An analysis of 600 popular containers revealed that obscure containers exist in well-known registries and trusted images and that many tools fail to analyze such containers. To mitigate these issues, we propose an obscuration-resilient methodology for container analysis and introduce ORCA (Obscuration-Resilient Container Analyzer), its open-source implementation. We reported our findings to all vendors using their appropriate channels. Our results demonstrate that ORCA effectively detects the content of obscure containers and achieves a median 40% improvement in file coverage compared to Docker Scout and Syft.
Authors:Abdou-Essamad Jabri, C. Drocourt, Mostafa Azizi, Gil Utard
Abstract:
The integration of the Internet of Things (IoT) in healthcare has revolutionized patient monitoring and data collection, allowing real-time tracking of vital signs, remote diagnostics, and automated medical responses. However, the transmission and storage of sensitive medical data introduce significant security and privacy challenges. To address these concerns, blockchain technology provides a decentralized and immutable ledger that ensures data integrity, , and transparency. Unlike public blockchains, private blockchains are permissioned; the access is granted only to authorized participants; they are more suitable for handling confidential healthcare data. Although blockchain ensures security and trust, it lacks built-in mechanisms to support flexible and controlled data sharing; This is where Proxy Re-Encryption (PRE) comes into play. PRE is a cryptographic technique that allows encrypted data to be re-encrypted for a new recipient without exposing it to intermediaries. We propose an architecture integrating private blockchain and PRE to enable secure, traceable, and privacy-preserving data sharing in IoT-based healthcare systems. Blockchain guarantees tamper proof record-keeping, while PRE enables fine-grained access control, allowing medical professionals to securely share patient data without compromising confidentiality. This combination creates a robust security framework that enhances trust and efficiency in digital healthcare ecosystems.
Authors:Abdou-Essamad Jabri, Mostafa Azizi, Cyril Drocourt, Gil Utard
Abstract:
Being propelled by the fourth industrial revolution (Industry 4.0), IoT devices and solutions are well adopted everywhere, ranging from home applications to industrial use, crossing through transportation, healthcare, energy, and so on. This wide use of IoT has not gone unnoticed, hackers are tracking the weakness of such a technology and threatening them continuously. Their security at various levels has become an important concern of professionals and researchers. This issue takes more risk, especially with the IoT variants, IIoT (Industrial IoT) and MIoT (Medical IoT). Many existing security solutions are adapted and proposed for addressing IoT security. In this paper, we are interested in exploring blockchain technology and we make a comparison of three free Blockchain platforms towards their applicability for MIoT context, namely Ethereum, Hyperledger Fabric and Corda. In general, Blockchain technology provides a decentralized, autonomous, trustless, and distributed environment. It is challenging to find a Blockchain platform that fits the MIoT context and performs well in terms of security. The retained platform should be deployed smartly to avoid its practical drawbacks related to energy-consuming and excessive computing.
Authors:Yue Han, Jinguang Han, Liqun Chen, Chao Sun
Abstract:
E-health record (EHR) contains a vast amount of continuously growing medical data and enables medical institutions to access patient health data conveniently.This provides opportunities for medical data mining which has important applications in identifying high-risk patients and improving disease diagnosis, etc.Since EHR contains sensitive patient information, how to protect patient privacy and enable mining on EHR data is important and challenging.Traditional public key encryption (PKE) can protect patient privacy, but cannot support flexible selective computation on encrypted EHR data.Functional encryption (FE) allows authorised users to compute function values of encrypted data without releasing other information, hence supporting selective computation on encrypted data. Nevertheless, existing FE schemes do not support fine-grained revocation and update, so they are unsuitable for EHR system. In this paper,we first propose an inner-product functional encryption with fine-grained revocation (IPFE-FR) scheme, and then apply it to a flexible EHR sharing system. Our scheme possesses the following features:(1) a group manager can revoke a specific function computation of medical institutions on encrypted EHR data,instead of all function computation rights. (2) a revoked medical institution is not allowed to compute the function value of encrypted EHR data not only generated after the revocation, but also generated before the revocation. (3) secret keys issued to the same medical institution are bound together to prevent collusion attacks. The formal definition and security model of the IPFE-FR scheme are proposed.Furthermore, we present a concrete construction and reduce its security to the Learning with Errors (LWE) assumption which is quantum-resistant. Finally, the theoretical analysis and experimental implementation of our scheme are conducted to show its efficiency.
Authors:Chongqing Lei, Zhen Ling, Xiangyu Xu, Shaofeng Li, Guangchi Liu, Kai Dong, Junzhou Luo
Abstract:
Microcontroller units (MCUs) are widely used in embedded devices due to their low power consumption and cost-effectiveness. MCU firmware controls these devices and is vital to the security of embedded systems. However, performing dynamic security analyses for MCU firmware has remained challenging due to the lack of usable execution environments -- existing dynamic analyses cannot run on physical devices (e.g., insufficient computational resources), while building emulators is costly due to the massive amount of heterogeneous hardware, especially peripherals. Our work is based on the insight that MCU peripherals can be modeled in a two-fold manner. At the structural level, peripherals have diverse implementations but we can use a limited set of primitives to abstract peripherals because their hardware implementations are based on common hardware concepts. At the semantic level, peripherals have diverse functionalities. However, we can use a single unified semantic model to describe the same kind of peripherals because they exhibit similar functionalities. Building on this, we propose FlexEmu, a flexible MCU peripheral emulation framework. Once semantic models are created, FlexEmu automatically extracts peripheral-specific details to instantiate models and generate emulators accordingly. We have successfully applied FlexEmu to model 12 kinds of MCU peripherals. Our evaluation on 90 firmware samples across 15 different MCU platforms shows that the automatically generated emulators can faithfully replicate hardware behaviors and achieve a 98.48% unit test passing rate, outperforming state-of-the-art approaches. To demonstrate the implications of FlexEmu on firmware security, we use the generated emulators to fuzz three popular RTOSes and uncover 10 previously unknown bugs.
Authors:Fadhil Abbas Fadhil, Maryam Mahdi Alhusseini, Mohammad-Reza Feizi-Derakhshi
Abstract:
An improved CAST-128 encryption algorithm, which is done by implementing chaos-based adaptive S-box generation using Logistic sine Map (LSM), has been provided in this paper because of the increasing requirements of efficient and smart image encryption mechanisms. The study aims to address the drawbacks of static S-box models commonly used in traditional cryptographic systems, which are susceptible to linear and differential attacks. In the proposed scheme, the dynamic, non-linear, invertible, and highly cryptographic strength S-boxes are generated through a hybrid chaotic system that may have high non-linearity, strong and rigorous avalanche characteristics, and low differential uniformity. The process here is that the LSM is used to produce S-boxes having key-dependent parameters that are stuffed into the CAST-128 structure to encrypt the image in a block-wise manner. The performance of the encryption is assessed utilizing a set of standard grayscale images. The metrics that are used to evaluate the security are entropy, NPCR, UACI, PSNR, and histogram analysis. Outcomes indicate that randomness, resistance to statistical attacks, and country of encryption are significantly improved compared to the original CAST-128. The study is theoretically and practically relevant since it presents a lightweight S-box generation approach driven by chaos, which can increase the level of robustness of the image encryptions without enlisting machine learning. The system may be applied to secure communications, surveillance systems, and medical image protection on a real-time basis.
Authors:Haywood Gelman, John D. Hastings, David Kenley
Abstract:
Insider threats are a growing organizational problem due to the complexity of identifying their technical and behavioral elements. A large research body is dedicated to the study of insider threats from technological, psychological, and educational perspectives. However, research in this domain has been generally dependent on datasets that are static and limited access which restricts the development of adaptive detection models. This study introduces a novel, ethically grounded approach that uses the large language model (LLM) Claude Sonnet 3.7 to dynamically synthesize syslog messages, some of which contain indicators of insider threat scenarios. The messages reflect real-world data distributions by being highly imbalanced (1% insider threats). The syslogs were analyzed for insider threats by both Sonnet 3.7 and GPT-4o, with their performance evaluated through statistical metrics including accuracy, precision, recall, F1, specificity, FAR, MCC, and ROC AUC. Sonnet 3.7 consistently outperformed GPT-4o across nearly all metrics, particularly in reducing false alarms and improving detection accuracy. The results show strong promise for the use of LLMs in synthetic dataset generation and insider threat detection.
Authors:Irdin Pekaric, Philipp Zech, Tom Mattson
Abstract:
Large Language Models (LLMs) are transforming human decision-making by acting as cognitive collaborators. Yet, this promise comes with a paradox: while LLMs can improve accuracy, they may also erode independent reasoning, promote over-reliance and homogenize decisions. In this paper, we investigate how LLMs shape human judgment in security-critical contexts. Through two exploratory focus groups (unaided and LLM-supported), we assess decision accuracy, behavioral resilience and reliance dynamics. Our findings reveal that while LLMs enhance accuracy and consistency in routine decisions, they can inadvertently reduce cognitive diversity and improve automation bias, which is especially the case among users with lower resilience. In contrast, high-resilience individuals leverage LLMs more effectively, suggesting that cognitive traits mediate AI benefit.
Authors:Zilong Wang, Gideon Mohr, Klaus von Gleissenthall, Jan Reineke, Marco Guarnieri
Abstract:
Leakage contracts have been proposed as a new security abstraction at the instruction set architecture level. Leakage contracts aim to capture the information that processors may leak via microarchitectural side channels. Recently, the first tools have emerged to verify whether a processor satisfies a given contract. However, coming up with a contract that is both sound and precise for a given processor is challenging, time-consuming, and error-prone, as it requires in-depth knowledge of the timing side channels introduced by microarchitectural optimizations. In this paper, we address this challenge by proposing LeaSyn, the first tool for automatically synthesizing leakage contracts that are both sound and precise for processor designs at register-transfer level. Starting from a user-provided contract template that captures the space of possible contracts, LeaSyn automatically constructs a contract, alternating between contract synthesis, which ensures precision based on an empirical characterization of the processor's leaks, and contract verification, which ensures soundness. Using LeaSyn, we automatically synthesize contracts for six open-source RISC-V CPUs for a variety of contract templates. Our experiments indicate that LeaSyn's contracts are sound and more precise (i.e., represent the actual leaks in the target processor more faithfully) than contracts constructed by existing approaches.
Authors:Yilin Li, Guozhu Meng, Mingyang Sun, Yanzhong Wang, Kun Sun, Hailong Chang, Yuekang Li
Abstract:
On-device deep learning models have extensive real world demands. Deep learning compilers efficiently compile models into executables for deployment on edge devices, but these executables may face the threat of reverse engineering. Previous studies have attempted to decompile DNN executables, but they face challenges in handling compilation optimizations and analyzing quantized compiled models. In this paper, we present NeuroDeX to unlock diverse support in decompiling DNN executables. NeuroDeX leverages the semantic understanding capabilities of LLMs along with dynamic analysis to accurately and efficiently perform operator type recognition, operator attribute recovery and model reconstruction. NeuroDeX can recover DNN executables into high-level models towards compilation optimizations, different architectures and quantized compiled models. We conduct experiments on 96 DNN executables across 12 common DNN models. Extensive experimental results demonstrate that NeuroDeX can decompile non-quantized executables into nearly identical high-level models. NeuroDeX can recover functionally similar high-level models for quantized executables, achieving an average top-1 accuracy of 72%. NeuroDeX offers a more comprehensive and effective solution compared to previous DNN executables decompilers.
Authors:Charbel Mattar, Jacques Bou Abdo, Abdallah Makhoul, Benoit Piranda, Jacques Demerjian
Abstract:
Satellites, drones, and 5G space links now support critical services such as air traffic, finance, and weather. Yet most were not built to resist modern cyber threats. Ground stations can be breached, GPS jammed, and supply chains compromised, while no shared list of vulnerabilities or safe testing range exists. This paper maps eleven research gaps, including secure routing, onboard intrusion detection, recovery methods, trusted supply chains, post-quantum encryption, zero-trust architectures, and real-time impact monitoring. For each, we outline the challenge, why it matters, and a guiding research question. We also highlight an agentic (multi-agent) AI approach where small, task-specific agents share defense tasks onboard instead of one large model. Finally, we propose a five-year roadmap: post-quantum and QKD flight trials, open cyber-ranges, clearer vulnerability shar ing, and early multi-agent deployments. These steps move space cybersecurity from reactive patching toward proactive resilience.
Authors:Sharif Noor Zisad, Ragib Hasan
Abstract:
As our cities and communities become smarter, the systems that keep us safe, such as traffic control centers, emergency response networks, and public transportation, also become more complex. With this complexity comes a greater risk of security threats that can affect not just machines but real people's lives. To address this challenge, we present ThreatGPT, an agentic Artificial Intelligence (AI) assistant built to help people whether they are engineers, safety officers, or policy makers to understand and analyze threats in public safety systems. Instead of requiring deep cybersecurity expertise, it allows users to simply describe the components of a system they are concerned about, such as login systems, data storage, or communication networks. Then, with the click of a button, users can choose how they want the system to be analyzed by using popular frameworks such as STRIDE, MITRE ATT&CK, CVE reports, NIST, or CISA. ThreatGPT is unique because it does not just provide threat information, but rather it acts like a knowledgeable partner. Using few-shot learning, the AI learns from examples and generates relevant smart threat models. It can highlight what might go wrong, how attackers could take advantage, and what can be done to prevent harm. Whether securing a city's infrastructure or a local health service, this tool adapts to users' needs. In simple terms, ThreatGPT brings together AI and human judgment to make our public systems safer. It is designed not just to analyze threats, but to empower people to understand and act on them, faster, smarter, and with more confidence.
Authors:Konur Tholl, François Rivest, Mariam El Mezouar, Ranwa Al Mallah
Abstract:
Reinforcement Learning (RL) has shown great potential for autonomous decision-making in the cybersecurity domain, enabling agents to learn through direct environment interaction. However, RL agents in Autonomous Cyber Operations (ACO) typically learn from scratch, requiring them to execute undesirable actions to learn their consequences. In this study, we integrate external knowledge in the form of a Large Language Model (LLM) pretrained on cybersecurity data that our RL agent can directly leverage to make informed decisions. By guiding initial training with an LLM, we improve baseline performance and reduce the need for exploratory actions with obviously negative outcomes. We evaluate our LLM-integrated approach in a simulated cybersecurity environment, and demonstrate that our guided agent achieves over 2x higher rewards during early training and converges to a favorable policy approximately 4,500 episodes faster than the baseline.
Authors:Simon Lachnit, Ghassan Karame
Abstract:
Horizontal Federated Learning (HFL) is particularly vulnerable to backdoor attacks as adversaries can easily manipulate both the training data and processes to execute sophisticated attacks. In this work, we study the impact of training hyperparameters on the effectiveness of backdoor attacks and defenses in HFL. More specifically, we show both analytically and by means of measurements that the choice of hyperparameters by benign clients does not only influence model accuracy but also significantly impacts backdoor attack success. This stands in sharp contrast with the multitude of contributions in the area of HFL security, which often rely on custom ad-hoc hyperparameter choices for benign clients$\unicode{x2013}$leading to more pronounced backdoor attack strength and diminished impact of defenses. Our results indicate that properly tuning benign clients' hyperparameters$\unicode{x2013}$such as learning rate, batch size, and number of local epochs$\unicode{x2013}$can significantly curb the effectiveness of backdoor attacks, regardless of the malicious clients' settings. We support this claim with an extensive robustness evaluation of state-of-the-art attack-defense combinations, showing that carefully chosen hyperparameters yield across-the-board improvements in robustness without sacrificing main task accuracy. For example, we show that the 50%-lifespan of the strong A3FL attack can be reduced by 98.6%, respectively$\unicode{x2013}$all without using any defense and while incurring only a 2.9 percentage points drop in clean task accuracy.
Authors:Shuzheng Wang, Yue Huang, Zhuoer Xu, Yuming Huang, Jing Tang
Abstract:
Ethereum smart contracts hold tens of billions of USD in DeFi and NFTs, yet comprehensive security analysis remains difficult due to unverified code, proxy-based architectures, and the reliance on manual inspection of complex execution traces. Existing approaches fall into two main categories: anomaly transaction detection, which flags suspicious transactions but offers limited insight into specific attack strategies hidden in execution traces inside transactions, and code vulnerability detection, which cannot analyze unverified contracts and struggles to show how identified flaws are exploited in real incidents. As a result, analysts must still manually align transaction traces with contract code to reconstruct attack scenarios and conduct forensics. To address this gap, TraceLLM is proposed as a framework that leverages LLMs to integrate execution trace-level detection with decompiled contract code. We introduce a new anomaly execution path identification algorithm and an LLM-refined decompile tool to identify vulnerable functions and provide explicit attack paths to LLM. TraceLLM establishes the first benchmark for joint trace and contract code-driven security analysis. For comparison, proxy baselines are created by jointly transmitting the results of three representative code analysis along with raw traces to LLM. TraceLLM identifies attacker and victim addresses with 85.19\% precision and produces automated reports with 70.37\% factual precision across 27 cases with ground truth expert reports, achieving 25.93\% higher accuracy than the best baseline. Moreover, across 148 real-world Ethereum incidents, TraceLLM automatically generates reports with 66.22\% expert-verified accuracy, demonstrating strong generalizability.
Authors:Pascal Zimmer, Simon Lachnit, Alexander Jan Zielinski, Ghassan Karame
Abstract:
A number of attacks rely on infrared light sources or heat-absorbing material to imperceptibly fool systems into misinterpreting visual input in various image recognition applications. However, almost all existing approaches can only mount untargeted attacks and require heavy optimizations due to the use-case-specific constraints, such as location and shape. In this paper, we propose a novel, stealthy, and cost-effective attack to generate both targeted and untargeted adversarial infrared perturbations. By projecting perturbations from a transparent film onto the target object with an off-the-shelf infrared flashlight, our approach is the first to reliably mount laser-free targeted attacks in the infrared domain. Extensive experiments on traffic signs in the digital and physical domains show that our approach is robust and yields higher attack success rates in various attack scenarios across bright lighting conditions, distances, and angles compared to prior work. Equally important, our attack is highly cost-effective, requiring less than US\$50 and a few tens of seconds for deployment. Finally, we propose a novel segmentation-based detection that thwarts our attack with an F1-score of up to 99%.
Authors:Nathanael Coolidge, Jaime González Sanz, Li Yang, Khalil El Khatib, Glenn Harvel, Nelson Agbemava, I Putu Susila, Mehmet Yavuz Yagci
Abstract:
Radiation Detection Systems (RDSs) are used to measure and detect abnormal levels of radioactive material in the environment. These systems are used in many applications to mitigate threats posed by high levels of radioactive material. However, these systems lack protection against malicious external attacks to modify the data. The novelty of applying Intrusion Detection Systems (IDS) in RDSs is a crucial element in safeguarding these critical infrastructures. While IDSs are widely used in networking environments to safeguard against various attacks, their application in RDSs is novel. A common attack on RDSs is Denial of Service (DoS), where the attacker aims to overwhelm the system, causing malfunctioning RDSs. This paper proposes an efficient Machine Learning (ML)-based IDS to detect anomalies in radiation data, focusing on DoS attacks. This work explores the use of sampling methods to create a simulated DoS attack based on a real radiation dataset, followed by an evaluation of various ML algorithms, including Random Forest, Support Vector Machine (SVM), logistic regression, and Light Gradient-Boosting Machine (LightGBM), to detect DoS attacks on RDSs. LightGBM is emphasized for its superior accuracy and low computational resource consumption, making it particularly suitable for real-time intrusion detection. Additionally, model optimization and TinyML techniques, including feature selection, parallel execution, and random search methods, are used to improve the efficiency of the proposed IDS. Finally, an optimized and efficient LightGBM-based IDS is developed to achieve accurate intrusion detection for RDSs.
Authors:Einstein Rivas Pizarro, Wajiha Zaheer, Li Yang, Khalil El-Khatib, Glenn Harvel
Abstract:
Radiation Detection Systems (RDSs) play a vital role in ensuring public safety across various settings, from nuclear facilities to medical environments. However, these systems are increasingly vulnerable to cyber-attacks such as data injection, man-in-the-middle (MITM) attacks, ICMP floods, botnet attacks, privilege escalation, and distributed denial-of-service (DDoS) attacks. Such threats could compromise the integrity and reliability of radiation measurements, posing significant public health and safety risks. This paper presents a new synthetic radiation dataset and an Intrusion Detection System (IDS) tailored for resource-constrained environments, bringing Machine Learning (ML) predictive capabilities closer to the sensing edge layer of critical infrastructure. Leveraging TinyML techniques, the proposed IDS employs an optimized XGBoost model enhanced with pruning, quantization, feature selection, and sampling. These TinyML techniques significantly reduce the size of the model and computational demands, enabling real-time intrusion detection on low-resource devices while maintaining a reasonable balance between efficiency and accuracy.
Authors:Chengyu Song, Jianming Zheng
Abstract:
Insider threat detection (ITD) requires analyzing sparse, heterogeneous user behavior. Existing ITD methods predominantly rely on single-view modeling, resulting in limited coverage and missed anomalies. While multi-view learning has shown promise in other domains, its direct application to ITD introduces significant challenges: scalability bottlenecks from independently trained sub-models, semantic misalignment across disparate feature spaces, and view imbalance that causes high-signal modalities to overshadow weaker ones. In this work, we present Insight-LLM, the first modular multi-view fusion framework specifically tailored for insider threat detection. Insight-LLM employs frozen, pre-nes, achieving state-of-the-art detection with low latency and parameter overhead.
Authors:Samuel Punch, Krishnendu Guha
Abstract:
Cloud quantum services can reveal circuit structure and timing through scheduler metadata, latency patterns, and co-tenant interference. We introduce NADGO (Noise-Adaptive Dummy-Gate Obfuscation), a scheduling and obfuscation stack that enforces operational privacy for gate-model workloads by applying per-interval limits on observable information leakage. To support confidentiality and fair multi-tenancy, operators require a method to audit compliance at acceptable overheads. NADGO combines: (i) hardware-aware t-design padding for structured cover traffic, (ii) particle-filter timing randomization to mask queue patterns, (iii) CASQUE subcircuit routing across heterogeneous backends, and (iv) a per-interval leakage estimator with locked calibration artifacts and a dual-threshold kill-switch. We prototype the approach on a 4-qubit superconducting tile with cryo-CMOS control and evaluate both depth-varied local-random circuits and small QAOA instances. Monitoring runs at a 6.3 microsecond control interval, and per-interval decisions are recorded in an append-only, hash-chained audit log. Across Monte Carlo (Tier 1) and cloud-hardware emulation (Tier 2) evaluations, NADGO maintains leakage within budget in nominal operation (interval-abort rate below 1 percent) and under attack yields high separation with concentrated aborts. At matched leakage targets, microbenchmarks indicate lower latency and cryogenic power consumption than static padding, while end-to-end workloads maintain competitive cost envelopes.
Authors:Samuel Punch, Krishnendu Guha
Abstract:
We present MaestroCut, a closed-loop framework for quantum circuit cutting that adapts partitioning and shot allocation to device drift and workload variation. MaestroCut tracks a variance proxy in real time, triggers re-cutting when accuracy degrades, and routes shots using topology-aware priors. An online estimator cascade (MLE, Bayesian, GP-assisted) selects the lowest-error reconstruction within a fixed budget. Tier-1 simulations show consistent variance contraction and reduced mean-squared error versus uniform and proportional baselines. Tier-2 emulation with realistic queueing and noise demonstrates stable latency targets, high reliability, and ~1% software overhead under stress scenarios. These results indicate that adaptive circuit cutting can provide accuracy and efficiency improvements with minimal operational cost on near-term hardware.
Authors:Luca Cotti, Anisa Rula, Devis Bianchini, Federico Cerutti
Abstract:
Effective Cyber Threat Intelligence (CTI) relies upon accurately structured and semantically enriched information extracted from cybersecurity system logs. However, current methodologies often struggle to identify and interpret malicious events reliably and transparently, particularly in cases involving unstructured or ambiguous log entries. In this work, we propose a novel methodology that combines ontology-driven structured outputs with Large Language Models (LLMs), to build an Artificial Intelligence (AI) agent that improves the accuracy and explainability of information extraction from cybersecurity logs. Central to our approach is the integration of domain ontologies and SHACL-based constraints to guide the language model's output structure and enforce semantic validity over the resulting graph. Extracted information is organized into an ontology-enriched graph database, enabling future semantic analysis and querying. The design of our methodology is motivated by the analytical requirements associated with honeypot log data, which typically comprises predominantly malicious activity. While our case study illustrates the relevance of this scenario, the experimental evaluation is conducted using publicly available datasets. Results demonstrate that our method achieves higher accuracy in information extraction compared to traditional prompt-only approaches, with a deliberate focus on extraction quality rather than processing speed.
Authors:Isaac David, Arthur Gervais
Abstract:
AI-powered development platforms are making software creation accessible to a broader audience, but this democratization has triggered a scalability crisis in security auditing. With studies showing that up to 40% of AI-generated code contains vulnerabilities, the pace of development now vastly outstrips the capacity for thorough security assessment.
We present MAPTA, a multi-agent system for autonomous web application security assessment that combines large language model orchestration with tool-grounded execution and end-to-end exploit validation. On the 104-challenge XBOW benchmark, MAPTA achieves 76.9% overall success with perfect performance on SSRF and misconfiguration vulnerabilities, 83% success on broken authorization, and strong results on injection attacks including server-side template injection (85%) and SQL injection (83%). Cross-site scripting (57%) and blind SQL injection (0%) remain challenging. Our comprehensive cost analysis across all challenges totals $21.38 with a median cost of $0.073 for successful attempts versus $0.357 for failures. Success correlates strongly with resource efficiency, enabling practical early-stopping thresholds at approximately 40 tool calls or $0.30 per challenge.
MAPTA's real-world findings are impactful given both the popularity of the respective scanned GitHub repositories (8K-70K stars) and MAPTA's low average operating cost of $3.67 per open-source assessment: MAPTA discovered critical vulnerabilities including RCEs, command injections, secret exposure, and arbitrary file write vulnerabilities. Findings are responsibly disclosed, 10 findings are under CVE review.
Authors:Konur Tholl, Mariam El Mezouar, Ranwa Al Mallah
Abstract:
Simulated environments have proven invaluable in Autonomous Cyber Operations (ACO) where Reinforcement Learning (RL) agents can be trained without the computational overhead of emulation. These environments must accurately represent cybersecurity scenarios while producing the necessary signals to support RL training. In this study, we present a framework where we first extend CybORG's Cage Challenge 2 environment by implementing three new actions: Patch, Isolate, and Unisolate, to better represent the capabilities available to human operators in real-world settings. We then propose a design for agent development where we modify the reward signals and the agent's feature space to enhance training performance. To validate these modifications, we train DQN and PPO agents in the updated environment. Our study demonstrates that CybORG can be extended with additional realistic functionality, while maintaining its ability to generate informative training signals for RL agents.
Authors:David Egea, Barproda Halder, Sanghamitra Dutta
Abstract:
Automated detection of vulnerabilities in source code is an essential cybersecurity challenge, underpinning trust in digital systems and services. Graph Neural Networks (GNNs) have emerged as a promising approach as they can learn structural and logical code relationships in a data-driven manner. However, their performance is severely constrained by training data imbalances and label noise. GNNs often learn 'spurious' correlations from superficial code similarities, producing detectors that fail to generalize well to unseen real-world data. In this work, we propose a unified framework for robust and interpretable vulnerability detection, called VISION, to mitigate spurious correlations by systematically augmenting a counterfactual training dataset. Counterfactuals are samples with minimal semantic modifications but opposite labels. Our framework includes: (i) generating counterfactuals by prompting a Large Language Model (LLM); (ii) targeted GNN training on paired code examples with opposite labels; and (iii) graph-based interpretability to identify the crucial code statements relevant for vulnerability predictions while ignoring spurious ones. We find that VISION reduces spurious learning and enables more robust, generalizable detection, improving overall accuracy (from 51.8% to 97.8%), pairwise contrast accuracy (from 4.5% to 95.8%), and worst-group accuracy (from 0.7% to 85.5%) on the Common Weakness Enumeration (CWE)-20 vulnerability. We further demonstrate gains using proposed metrics: intra-class attribution variance, inter-class attribution distance, and node score dependency. We also release CWE-20-CFA, a benchmark of 27,556 functions (real and counterfactual) from the high-impact CWE-20 category. Finally, VISION advances transparent and trustworthy AI-based cybersecurity systems through interactive visualization for human-in-the-loop analysis.
Authors:Zeng Zhang, Xiaoqi Li
Abstract:
The development of blockchain technology has significantly enhanced the security and transparency of personal information and transaction records. Concurrent with the advancement of blockchain technology and the emergence of the digital currency ecosystem, the internet has evolved from a paradigm dominated by information flow to one driven by value flow. Consequently, the concept of token has gained widespread dissemination, and the electronic token under investigation in this thesis is a development of this concept. The application of electronic tokens has become pervasive with the development of the internet, but the functionality of these tokens is often limited, and issues related to trust remain significant challenges. This study proposes innovative solutions to address the deficiencies in traditional electronic token systems, including the issuance of tokens, the heterogeneity of issuance standards, and the lack of democracy. The solutions are based on distributed storage, anti-tampering mechanisms, consensus protocols, and transparent, traceable storage. Additionally, the study employs NFTs, smart contracts, and RSA blind signatures, among other key technologies, to construct a system based on blockchain technology. The process integrates the decentralised management and centralised operation models, aligning them with the national policy directives. The developed solution enables the full utilisation of blockchain technology's advantages while also fostering community participation. Consequently, it establishes a secure, legal, reliable, and dynamic electronic certification system.
Authors:Muhammet Anil Yagiz, Zeynep Sude Cengiz, Polat Goktas
Abstract:
The rapid expansion of immersive Metaverse applications introduces complex challenges at the intersection of performance, privacy, and environmental sustainability. Centralized architectures fall short in addressing these demands, often resulting in elevated energy consumption, latency, and privacy concerns. This paper proposes MetaFed, a decentralized federated learning (FL) framework that enables sustainable and intelligent resource orchestration for Metaverse environments. MetaFed integrates (i) multi-agent reinforcement learning for dynamic client selection, (ii) privacy-preserving FL using homomorphic encryption, and (iii) carbon-aware scheduling aligned with renewable energy availability. Evaluations on MNIST and CIFAR-10 using lightweight ResNet architectures demonstrate that MetaFed achieves up to 25% reduction in carbon emissions compared to conventional approaches, while maintaining high accuracy and minimal communication overhead. These results highlight MetaFed as a scalable solution for building environmentally responsible and privacy-compliant Metaverse infrastructures.
Authors:Jiale Liu, Jiahao Zhang, Suhang Wang
Abstract:
Retrieval-Augmented Generation (RAG) is a powerful technique for enhancing Large Language Models (LLMs) with external, up-to-date knowledge. Graph RAG has emerged as an advanced paradigm that leverages graph-based knowledge structures to provide more coherent and contextually rich answers. However, the move from plain document retrieval to structured graph traversal introduces new, under-explored privacy risks. This paper investigates the data extraction vulnerabilities of the Graph RAG systems. We design and execute tailored data extraction attacks to probe their susceptibility to leaking both raw text and structured data, such as entities and their relationships. Our findings reveal a critical trade-off: while Graph RAG systems may reduce raw text leakage, they are significantly more vulnerable to the extraction of structured entity and relationship information. We also explore potential defense mechanisms to mitigate these novel attack surfaces. This work provides a foundational analysis of the unique privacy challenges in Graph RAG and offers insights for building more secure systems.
Authors:Kaiwen Zuo, Zelin Liu, Raman Dutt, Ziyang Wang, Zhongtian Sun, Yeming Wang, Fan Mo, Pietro Liò
Abstract:
Large Vision-Language Models (LVLMs) augmented with Retrieval-Augmented Generation (RAG) are increasingly employed in medical AI to enhance factual grounding through external clinical image-text retrieval. However, this reliance creates a significant attack surface. We propose MedThreatRAG, a novel multimodal poisoning framework that systematically probes vulnerabilities in medical RAG systems by injecting adversarial image-text pairs. A key innovation of our approach is the construction of a simulated semi-open attack environment, mimicking real-world medical systems that permit periodic knowledge base updates via user or pipeline contributions. Within this setting, we introduce and emphasize Cross-Modal Conflict Injection (CMCI), which embeds subtle semantic contradictions between medical images and their paired reports. These mismatches degrade retrieval and generation by disrupting cross-modal alignment while remaining sufficiently plausible to evade conventional filters. While basic textual and visual attacks are included for completeness, CMCI demonstrates the most severe degradation. Evaluations on IU-Xray and MIMIC-CXR QA tasks show that MedThreatRAG reduces answer F1 scores by up to 27.66% and lowers LLaVA-Med-1.5 F1 rates to as low as 51.36%. Our findings expose fundamental security gaps in clinical RAG systems and highlight the urgent need for threat-aware design and robust multimodal consistency checks. Finally, we conclude with a concise set of guidelines to inform the safe development of future multimodal medical RAG systems.
Authors:Kamel Kamel, Keshav Sood, Hridoy Sankar Dutta, Sunil Aryal
Abstract:
Voice authentication has undergone significant changes from traditional systems that relied on handcrafted acoustic features to deep learning models that can extract robust speaker embeddings. This advancement has expanded its applications across finance, smart devices, law enforcement, and beyond. However, as adoption has grown, so have the threats. This survey presents a comprehensive review of the modern threat landscape targeting Voice Authentication Systems (VAS) and Anti-Spoofing Countermeasures (CMs), including data poisoning, adversarial, deepfake, and adversarial spoofing attacks. We chronologically trace the development of voice authentication and examine how vulnerabilities have evolved in tandem with technological advancements. For each category of attack, we summarize methodologies, highlight commonly used datasets, compare performance and limitations, and organize existing literature using widely accepted taxonomies. By highlighting emerging risks and open challenges, this survey aims to support the development of more secure and resilient voice authentication systems.
Authors:Sabine Houy, Bruno Kreyssig, Timothee Riom, Alexandre Bartel, Patrick McDaniel
Abstract:
Memory corruption vulnerabilities remain one of the most severe threats to software security. They often allow attackers to achieve arbitrary code execution by redirecting a vulnerable program's control flow. While Control Flow Integrity (CFI) has gained traction to mitigate this exploitation path, developers are not provided with any direction on how to apply CFI to real-world software. In this work, we establish a taxonomy mapping LLVM's forward-edge CFI variants to memory corruption vulnerability classes, offering actionable guidance for developers seeking to deploy CFI incrementally in existing codebases. Based on the Top 10 Known Exploited Vulnerabilities (KEV) list, we identify four high-impact vulnerability categories and select one representative CVE for each. We evaluate LLVM's CFI against each CVE and explain why CFI blocks exploitation in two cases while failing in the other two, illustrating its potential and current limitations. Our findings support informed deployment decisions and provide a foundation for improving the practical use of CFI in production systems.
Authors:Zheng Li, Xiaoyang Dong, Xiaoyun Wang
Abstract:
This paper evaluates the secure level of authenticated encryption \textsc{Ascon} against cube-like method. \textsc{Ascon} submitted by Dobraunig \emph{et~al.} is one of 16 survivors of the 3rd round CAESAR competition. The cube-like method is first used by Dinur \emph{et~al.} to analyze Keccak keyed modes. At CT-RSA 2015, Dobraunig \emph{et~al.} applied this method to 5/6-round reduced \textsc{Ascon}, whose structure is similar to Keccak keyed modes. However, for \textsc{Ascon} the non-linear layer is more complex and state is much smaller, which make it hard for the attackers to select enough cube variables that do not multiply with each other after the first round. This seems to be the reason why the best previous key-recovery attack is on 6-round \textsc{Ascon}, while for Keccak keyed modes (Keccak-MAC and Keyak) the attacked round is no less than 7-round.
In this paper, we generalize the conditional cube attack proposed by Huang \emph{et~al.}, and find new cubes depending on some key bit conditions for 5/6-round reduced \textsc{Ascon}, and translate the previous theoretic 6-round attack with $2^{66}$ time complexity to a practical one with $2^{40}$ time complexity. Moreover, we propose the first 7-round key-recovery attack on \textsc{Ascon}. By introducing \emph{the cube-like key-subset technique}, we divide the full key space into many subsets according to different key conditions. For each key subset, we launch the cube tester to determine if the key falls into it. Finally, we recover the full key space by testing all the key subsets. The total time complexity is about $2^{103.9}$. In addition, for a weak-key subset, whose size is $2^{117}$, the attack is more efficient and costs only $2^{77}$ time complexity. Those attacks do not threaten the full round (12 rounds) \textsc{Ascon}.
Authors:Sima Arasteh, Christophe Hauser
Abstract:
In recent years, machine learning has demonstrated impressive results in various fields, including software vulnerability detection. Nonetheless, using machine learning to identify software vulnerabilities presents new challenges, especially regarding the scale of data involved, which was not a factor in traditional methods. Consequently, in spite of the rise of new machine-learning-based approaches in that space, important shortcomings persist regarding their evaluation. First, researchers often fail to provide concrete statistics about their training datasets, such as the number of samples for each type of vulnerability. Moreover, many methods rely on training with semantically similar functions rather than directly on vulnerable programs. This leads to uncertainty about the suitability of the datasets currently used for training. Secondly, the choice of a model and the level of granularity at which models are trained also affect the effectiveness of such vulnerability discovery approaches.
In this paper, we explore the challenges of applying machine learning to vulnerability discovery. We also share insights from our two previous research papers, Bin2vec and BinHunter, which could enhance future research in this field.
Authors:Yuntao Liu, Abir Akib, Zelin Lu, Qian Xu, Ankur Srivastava, Gang Qu, David Kehlet, Nij Dorairaj
Abstract:
The main goal of design obfuscation schemes is to protect sensitive design details from untrusted parties in the VLSI supply chain, including but not limited to off-shore foundries and untrusted end users. In this work, we provide a systematic red teaming approach to evaluate the security of design obfuscation approaches. Specifically, we propose security metrics and evaluation methodology for the scenarios where the adversary does not have access to a working chip. A case study on the RIPPER tool developed by the University of Florida indicates that more information is leaked about the structure of the original design than commonly considered.
Authors:Daniel M. Jimenez-Gutierrez, Yelizaveta Falkouskaya, Jose L. Hernandez-Ramos, Aris Anagnostopoulos, Ioannis Chatzigiannakis, Andrea Vitaletti
Abstract:
Federated Learning (FL) is an emerging distributed machine learning paradigm enabling multiple clients to train a global model collaboratively without sharing their raw data. While FL enhances data privacy by design, it remains vulnerable to various security and privacy threats. This survey provides a comprehensive overview of more than 200 papers regarding the state-of-the-art attacks and defense mechanisms developed to address these challenges, categorizing them into security-enhancing and privacy-preserving techniques. Security-enhancing methods aim to improve FL robustness against malicious behaviors such as byzantine attacks, poisoning, and Sybil attacks. At the same time, privacy-preserving techniques focus on protecting sensitive data through cryptographic approaches, differential privacy, and secure aggregation. We critically analyze the strengths and limitations of existing methods, highlight the trade-offs between privacy, security, and model performance, and discuss the implications of non-IID data distributions on the effectiveness of these defenses. Furthermore, we identify open research challenges and future directions, including the need for scalable, adaptive, and energy-efficient solutions operating in dynamic and heterogeneous FL environments. Our survey aims to guide researchers and practitioners in developing robust and privacy-preserving FL systems, fostering advancements safeguarding collaborative learning frameworks' integrity and confidentiality.
Authors:Viktoria Koscinski, Mark Nelson, Ahmet Okutan, Robert Falso, Mehdi Mirakhorli
Abstract:
Accurately assessing software vulnerabilities is essential for effective prioritization and remediation. While various scoring systems exist to support this task, their differing goals, methodologies and outputs often lead to inconsistent prioritization decisions. This work provides the first large-scale, outcome-linked empirical comparison of four publicly available vulnerability scoring systems: the Common Vulnerability Scoring System (CVSS), the Stakeholder-Specific Vulnerability Categorization (SSVC), the Exploit Prediction Scoring System (EPSS), and the Exploitability Index. We use a dataset of 600 real-world vulnerabilities derived from four months of Microsoft's Patch Tuesday disclosures to investigate the relationships between these scores, evaluate how they support vulnerability management task, how these scores categorize vulnerabilities across triage tiers, and assess their ability to capture the real-world exploitation risk. Our findings reveal significant disparities in how scoring systems rank the same vulnerabilities, with implications for organizations relying on these metrics to make data-driven, risk-based decisions. We provide insights into the alignment and divergence of these systems, highlighting the need for more transparent and consistent exploitability, risk, and severity assessments.
Authors:Yixuan Yang, Daoyuan Wu, Yufan Chen
Abstract:
Large Language Models (LLMs) are increasingly integrated into real-world applications via the Model Context Protocol (MCP), a universal, open standard for connecting AI agents with data sources and external tools. While MCP enhances the capabilities of LLM-based agents, it also introduces new security risks and expands their attack surfaces. In this paper, we present the first systematic taxonomy of MCP security, identifying 17 attack types across 4 primary attack surfaces. We introduce MCPSecBench, a comprehensive security benchmark and playground that integrates prompt datasets, MCP servers, MCP clients, and attack scripts to evaluate these attacks across three major MCP providers. Our benchmark is modular and extensible, allowing researchers to incorporate custom implementations of clients, servers, and transport protocols for systematic security assessment. Experimental results show that over 85% of the identified attacks successfully compromise at least one platform, with core vulnerabilities universally affecting Claude, OpenAI, and Cursor, while prompt-based and tool-centric attacks exhibit considerable variability across different hosts and models. Overall, MCPSecBench standardizes the evaluation of MCP security and enables rigorous testing across all MCP layers.
中文: 本文首次系统化分类了模型上下文协议(MCP)的安全风险,在四个攻击面上识别出17种攻击类型,并推出模块化基准测试平台MCPSecBench,实验表明超过85%的攻击能成功突破平台防护,而现有防护机制基本无效。
English: This paper introduces the first systematic taxonomy of security risks in the Model Context Protocol (MCP), identifying 17 attack types across four surfaces, and presents MCPSecBench, a modular benchmark that reveals over 85% of attacks successfully compromise platforms while current protections remain ineffective.
Authors:Ishraq Tashdid, Tasnuva Farheen, Sazadur Rahman
Abstract:
The rapid adoption of chiplet-based heterogeneous integration is reshaping semiconductor design by enabling modular, scalable, and faster time-to-market solutions for AI and high-performance computing. However, multi-vendor assembly in post-fabrication environments fragments the supply chain and exposes SiP systems to serious security threats, including cloning, overproduction, and chiplet substitution. Existing authentication solutions depend on trusted integrators or centralized security anchors, which can expose sensitive data or create single points of failure. We introduce AuthenTree, a distributed authentication framework that leverages multi-party computation (MPC) in a scalable tree-based architecture, removing the need for dedicated security hardware or centralized trust. AuthenTree enables secure chiplet validation without revealing raw signatures, distributing trust across multiple integrator chiplets. Our evaluation in five SiP benchmarks demonstrates that AuthenTree imposes minimal overhead, with an area as low as 0.48% (7,000 sq-micrometers), an overhead power under 0.5%, and an authentication latency below 1 microsecond, surpassing previous work in some cases by 700 times. These results establish AuthenTree as an efficient, robust, and scalable solution for next-generation chiplet-based security in zero-trust SiP environments.
Authors:Tyler Schroder, Renee Sirbu, Sohee Park, Jessica Morley, Sam Street, Luciano Floridi
Abstract:
Brain-computer interfaces (BCIs) show enormous potential for advancing personalized medicine. However, BCIs also introduce new avenues for cyber-attacks or security compromises. In this article, we analyze the problem and make recommendations for device manufacturers to better secure devices and to help regulators understand where more guidance is needed to protect patient safety and data confidentiality. Device manufacturers should implement the prior suggestions in their BCI products. These recommendations help protect BCI users from undue risks, including compromised personal health and genetic information, unintended BCI-mediated movement, and many other cybersecurity breaches. Regulators should mandate non-surgical device update methods, strong authentication and authorization schemes for BCI software modifications, encryption of data moving to and from the brain, and minimize network connectivity where possible. We also design a hypothetical, average-case threat model that identifies possible cybersecurity threats to BCI patients and predicts the likeliness of risk for each category of threat. BCIs are at less risk of physical compromise or attack, but are vulnerable to remote attack; we focus on possible threats via network paths to BCIs and suggest technical controls to limit network connections.
Authors:Yongjian Guo, Puzhuo Liu, Wanlun Ma, Zehang Deng, Xiaogang Zhu, Peng Di, Xi Xiao, Sheng Wen
Abstract:
The Model Context Protocol (MCP) has emerged as a universal standard that enables AI agents to seamlessly connect with external tools, significantly enhancing their functionality. However, while MCP brings notable benefits, it also introduces significant vulnerabilities, such as Tool Poisoning Attacks (TPA), where hidden malicious instructions exploit the sycophancy of large language models (LLMs) to manipulate agent behavior. Despite these risks, current academic research on MCP security remains limited, with most studies focusing on narrow or qualitative analyses that fail to capture the diversity of real-world threats. To address this gap, we present the MCP Attack Library (MCPLIB), which categorizes and implements 31 distinct attack methods under four key classifications: direct tool injection, indirect tool injection, malicious user attacks, and LLM inherent attack. We further conduct a quantitative analysis of the efficacy of each attack. Our experiments reveal key insights into MCP vulnerabilities, including agents' blind reliance on tool descriptions, sensitivity to file-based attacks, chain attacks exploiting shared context, and difficulty distinguishing external data from executable commands. These insights, validated through attack experiments, underscore the urgency for robust defense strategies and informed MCP design. Our contributions include 1) constructing a comprehensive MCP attack taxonomy, 2) introducing a unified attack framework MCPLIB, and 3) conducting empirical vulnerability analysis to enhance MCP security mechanisms. This work provides a foundational framework, supporting the secure evolution of MCP ecosystems.
Authors:Ben Nassi, Stav Cohen, Or Yair
Abstract:
The growing integration of LLMs into applications has introduced new security risks, notably known as Promptware - maliciously engineered prompts designed to manipulate LLMs to compromise the CIA triad of these applications. While prior research warned about a potential shift in the threat landscape for LLM-powered applications, the risk posed by Promptware is frequently perceived as low. In this paper, we investigate the risk Promptware poses to users of Gemini-powered assistants (web application, mobile application, and Google Assistant). We propose a novel Threat Analysis and Risk Assessment (TARA) framework to assess Promptware risks for end users. Our analysis focuses on a new variant of Promptware called Targeted Promptware Attacks, which leverage indirect prompt injection via common user interactions such as emails, calendar invitations, and shared documents. We demonstrate 14 attack scenarios applied against Gemini-powered assistants across five identified threat classes: Short-term Context Poisoning, Permanent Memory Poisoning, Tool Misuse, Automatic Agent Invocation, and Automatic App Invocation. These attacks highlight both digital and physical consequences, including spamming, phishing, disinformation campaigns, data exfiltration, unapproved user video streaming, and control of home automation devices. We reveal Promptware's potential for on-device lateral movement, escaping the boundaries of the LLM-powered application, to trigger malicious actions using a device's applications. Our TARA reveals that 73% of the analyzed threats pose High-Critical risk to end users. We discuss mitigations and reassess the risk (in response to deployed mitigations) and show that the risk could be reduced significantly to Very Low-Medium. We disclosed our findings to Google, which deployed dedicated mitigations.
Authors:Sina Bagheri, Masoud Kaveh, Francisco Hernando-Gallego, Diego MartÃn, Nuria Serrano
Abstract:
The commutative supersingular isogeny Diffie-Hellman (CSIDH) algorithm is a promising post-quantum key exchange protocol, notable for its exceptionally small key sizes, but hindered by computationally intensive key generation. Furthermore, practical implementations must operate in constant time to mitigate side-channel vulnerabilities, which presents an additional performance challenge. This paper presents, to our knowledge, the first comprehensive hardware study of CSIDH, establishing a performance baseline with a unified architecture on both field-programmable gate array (FPGA) and application-specific integrated circuit (ASIC) platforms. The architecture features a top-level finite state machine (FSM) that orchestrates a deeply pipelined arithmetic logic unit (ALU) to accelerate the underlying 512-bit finite field operations. The ALU employs a parallelized schoolbook multiplier, completing a 512$\times$512-bit multiplication in 22 clock cycles and enabling a full Montgomery modular multiplication in 87 cycles. The constant-time CSIDH-512 design requires $1.03\times10^{8}$ clock cycles per key generation. When implemented on a Xilinx Zynq UltraScale+ FPGA, the architecture achieves a 200 MHz clock frequency, corresponding to a 515 ms latency. For ASIC implementation in a 180nm process, the design requires $1.065\times10^{8}$ clock cycles and achieves a \textasciitilde 180 MHz frequency, resulting in a key generation latency of 591 ms. By providing the first public hardware performance metrics for CSIDH on both FPGA and ASIC platforms, this work delivers a crucial benchmark for future isogeny-based post-quantum cryptography (PQC) accelerators.
Authors:Richa Dasila, Vatsala Upadhyay, Samo Bobek, Abhishek Vaish
Abstract:
Deep learning models are one of the security strategies, trained on extensive datasets, and play a critical role in detecting and responding to these threats by recognizing complex patterns in malicious code. However, the opaque nature of these models-often described as "black boxes"-makes their decision-making processes difficult to understand, even for their creators. This research addresses these challenges by integrating Explainable AI (XAI) techniques to enhance the interpretability and trustworthiness of malware detection models. In this research, the use of Multi-Layer Perceptrons (MLP) for dynamic malware analysis has been considered, a less explored area, and its efficacy in detecting Metamorphic Malware, and further the effectiveness and transparency of MLPs, CNNs, RNNs, and CNN-LSTM models in malware classification, evaluating these models through the lens of Explainable AI (XAI). This comprehensive approach aims to demystify the internal workings of deep learning models, promoting a better understanding and trust in their predictive capabilities in cybersecurity contexts. Such in-depth analysis and implementation haven't been done to the best of our knowledge.
Authors:Sandipan Dey, Payal Santosh Kate, Vatsala Upadhyay, Abhishek Vaish
Abstract:
DDoS attacks have become a major threat to the security of IoT devices and can cause severe damage to the network infrastructure. IoT devices suffer from the inherent problem of resource constraints and are therefore susceptible to such resource-exhausting attacks. Traditional methods for detecting DDoS attacks are not efficient enough to cope with the dynamic nature of IoT networks, as well as the scalability of the attacks, diversity of protocols, high volume of traffic, and variability in device behavior, and variability of protocols like MQTT, CoAP, making it hard to implement security across all the protocols. In this paper, we propose a novel approach, i.e., the use of Transformer models, which have shown remarkable performance in natural language processing tasks, for detecting DDoS attacks on IoT devices. The proposed model extracts features from network traffic data and processes them using a self-attention mechanism. Experiments conducted on a real-world dataset demonstrate that the proposed approach outperforms traditional machine learning techniques, which can be validated by comparing both approaches' accuracy, precision, recall, and F1-score. The results of this study show that the Transformer models can be an effective solution for detecting DDoS attacks on IoT devices and have the potential to be deployed in real-world IoT environments.
Authors:Wenqiang Wang, Yan Xiao, Hao Lin, Yangshijie Zhang, Xiaochun Cao
Abstract:
Current multi-task adversarial text attacks rely on abundant access to shared internal features and numerous queries, often limited to a single task type. As a result, these attacks are less effective against practical scenarios involving black-box feedback APIs, limited queries, or multiple task types. To bridge this gap, we propose \textbf{C}luster and \textbf{E}nsemble \textbf{M}ulti-task Text Adversarial \textbf{A}ttack (\textbf{CEMA}), an effective black-box attack that exploits the transferability of adversarial texts across different tasks. CEMA simplifies complex multi-task scenarios by using a \textit{deep-level substitute model} trained in a \textit{plug-and-play} manner for text classification, enabling attacks without mimicking the victim model. This approach requires only a few queries for training, converting multi-task attacks into classification attacks and allowing attacks across various tasks.
CEMA generates multiple adversarial candidates using different text classification methods and selects the one that most effectively attacks substitute models.
In experiments involving multi-task models with two, three, or six tasks--spanning classification, translation, summarization, and text-to-image generation--CEMA demonstrates significant attack success with as few as 100 queries. Furthermore, CEMA can target commercial APIs (e.g., Baidu and Google Translate), large language models (e.g., ChatGPT 4o), and image-generation models (e.g., Stable Diffusion V2), showcasing its versatility and effectiveness in real-world applications.
Authors:Yue Yao, Zhen Xu, Youzhu Liu, Kunyuan Ma, Yuxiu Lin, Mohan Jiang
Abstract:
This paper addresses the challenges of data privacy and collaborative modeling in cross-institution financial risk analysis. It proposes a risk assessment framework based on federated learning. Without sharing raw data, the method enables joint modeling and risk identification across multiple institutions. This is achieved by incorporating a feature attention mechanism and temporal modeling structure. Specifically, the model adopts a distributed optimization strategy. Each financial institution trains a local sub-model. The model parameters are protected using differential privacy and noise injection before being uploaded. A central server then aggregates these parameters to generate a global model. This global model is used for systemic risk identification. To validate the effectiveness of the proposed method, multiple experiments are conducted. These evaluate communication efficiency, model accuracy, systemic risk detection, and cross-market generalization. The results show that the proposed model outperforms both traditional centralized methods and existing federated learning variants across all evaluation metrics. It demonstrates strong modeling capabilities and practical value in sensitive financial environments. The method enhances the scope and efficiency of risk identification while preserving data sovereignty. It offers a secure and efficient solution for intelligent financial risk analysis.
Authors:Fuyao Zhang, Xinyu Yan, Tiantong Wu, Wenjie Li, Tianxiang Chen, Yang Cao, Ran Yan, Longtao Huang, Wei Yang Bryan Lim, Qiang Yang
Abstract:
Large Language Models (LLMs) increasingly leverage Federated Learning (FL) to utilize private, task-specific datasets for fine-tuning while preserving data privacy. However, while federated LLM frameworks effectively enable collaborative training without raw data sharing, they critically lack built-in mechanisms for regulatory compliance like GDPR's right to be forgotten. Integrating private data heightens concerns over data quality and long-term governance, yet existing distributed training frameworks offer no principled way to selectively remove specific client contributions post-training. Due to distributed data silos, stringent privacy constraints, and the intricacies of interdependent model aggregation, federated LLM unlearning is significantly more complex than centralized LLM unlearning. To address this gap, we introduce Oblivionis, a lightweight learning and unlearning framework that enables clients to selectively remove specific private data during federated LLM training, enhancing trustworthiness and regulatory compliance. By unifying FL and unlearning as a dual optimization objective, we incorporate 6 FL and 5 unlearning algorithms for comprehensive evaluation and comparative analysis, establishing a robust pipeline for federated LLM unlearning. Extensive experiments demonstrate that Oblivionis outperforms local training, achieving a robust balance between forgetting efficacy and model utility, with cross-algorithm comparisons providing clear directions for future LLM development.
Authors:Fatemeh Moradi, Mehran Tarif, Mohammadhossein Homaei
Abstract:
Detecting fraud in modern supply chains is a growing challenge, driven by the complexity of global networks and the scarcity of labeled data. Traditional detection methods often struggle with class imbalance and limited supervision, reducing their effectiveness in real-world applications. This paper proposes a novel two-phase learning framework to address these challenges. In the first phase, the Isolation Forest algorithm performs unsupervised anomaly detection to identify potential fraud cases and reduce the volume of data requiring further analysis. In the second phase, a self-training Support Vector Machine (SVM) refines the predictions using both labeled and high-confidence pseudo-labeled samples, enabling robust semi-supervised learning. The proposed method is evaluated on the DataCo Smart Supply Chain Dataset, a comprehensive real-world supply chain dataset with fraud indicators. It achieves an F1-score of 0.817 while maintaining a false positive rate below 3.0%. These results demonstrate the effectiveness and efficiency of combining unsupervised pre-filtering with semi-supervised refinement for supply chain fraud detection under real-world constraints, though we acknowledge limitations regarding concept drift and the need for comparison with deep learning approaches.
Authors:Dario Pasquini, Evgenios M. Kornaropoulos, Giuseppe Ateniese, Omer Akgul, Athanasios Theocharis, Petros Efstathopoulos
Abstract:
AI for IT Operations (AIOps) is transforming how organizations manage complex software systems by automating anomaly detection, incident diagnosis, and remediation. Modern AIOps solutions increasingly rely on autonomous LLM-based agents to interpret telemetry data and take corrective actions with minimal human intervention, promising faster response times and operational cost savings.
In this work, we perform the first security analysis of AIOps solutions, showing that, once again, AI-driven automation comes with a profound security cost. We demonstrate that adversaries can manipulate system telemetry to mislead AIOps agents into taking actions that compromise the integrity of the infrastructure they manage. We introduce techniques to reliably inject telemetry data using error-inducing requests that influence agent behavior through a form of adversarial reward-hacking; plausible but incorrect system error interpretations that steer the agent's decision-making. Our attack methodology, AIOpsDoom, is fully automated--combining reconnaissance, fuzzing, and LLM-driven adversarial input generation--and operates without any prior knowledge of the target system.
To counter this threat, we propose AIOpsShield, a defense mechanism that sanitizes telemetry data by exploiting its structured nature and the minimal role of user-generated content. Our experiments show that AIOpsShield reliably blocks telemetry-based attacks without affecting normal agent performance.
Ultimately, this work exposes AIOps as an emerging attack vector for system compromise and underscores the urgent need for security-aware AIOps design.
Authors:Xurun Wang, Guangrui Liu, Xinjie Li, Haoyu He, Lin Yao, Weizhe Zhang
Abstract:
Machine learning models have been shown to be susceptible to membership inference attack, which can be used to determine whether a given sample appears in the training data. Existing membership inference methods commonly assume that the adversary has full access to the features of the target sample. This assumption, however, does not hold in many real-world scenarios where only partial features information is available, thereby limiting the applicability of these methods. In this work, we study an inference scenario where the adversary observes only partial features of each sample and aims to infer whether this observed subset was present in the training set of the target model. We define this problem as Partial Feature Membership Inference (PFMI). To address this problem, we propose MRAD (Memory-guided Reconstruction and Anomaly Detection), a two-stage attack framework. In the first stage, MRAD optimizes the unknown feature values to minimize the loss of the sample. In the second stage, it measures the deviation between the reconstructed sample and the training distribution using anomaly detection. Empirical results demonstrate that MRAD is effective across a range of datasets, and maintains compatibility with various off-the-shelf anomaly detection techniques. For example, on STL-10, our attack achieves an AUC of around 0.6 even with 40% of the missing features.
Authors:Haorui He, Yupeng Li, Bin Benjamin Zhu, Dacheng Wen, Reynold Cheng, Francis C. M. Lau
Abstract:
State-of-the-art fact-checking systems combat misinformation at scale by employing autonomous LLM-based agents to decompose complex claims into smaller sub-claims, verify each sub-claim individually, and aggregate the partial results to produce verdicts with justifications (explanatory rationales for the verdicts). The security of these systems is crucial, as compromised fact-checkers, which tend to be easily underexplored, can amplify misinformation. This work introduces Fact2Fiction, the first poisoning attack framework targeting such agentic fact-checking systems. Fact2Fiction mirrors the decomposition strategy and exploits system-generated justifications to craft tailored malicious evidences that compromise sub-claim verification. Extensive experiments demonstrate that Fact2Fiction achieves 8.9\%--21.2\% higher attack success rates than state-of-the-art attacks across various poisoning budgets. Fact2Fiction exposes security weaknesses in current fact-checking systems and highlights the need for defensive countermeasures.
Authors:Sasa Maric, Rasil Baidar, Robert Abbas, Sam Reisenfeld
Abstract:
The integration of Terrestrial Networks (TN) and Non-Terrestrial Networks (NTN), including 5G Advanced/6G and the Internet of Things (IoT) technologies, using Low Earth Orbit (LEO) satellites, high-altitude platforms (HAPS), and Unmanned Aerial Vehicles (UAVs), is redefining the landscape of global connectivity. This paper introduces a new system-level security framework for 5G Advanced/6G IoT-integrated TN-NTN architectures with AI-native-enabled cloud security. Due to the heterogeneity, scale, and distributed nature of these networks, new security challenges have emerged. Leveraging AI-native cloud platforms offers powerful capabilities for real-time threat detection, security automation, and intelligent policy enforcement. The NTN satellite access function enhances security for discontinuous coverage via satellite connections. In addition, this paper explores the security risks associated with integrated 5G Advanced/6G IoT TN-NTN systems, including full network segmentation, network slicing, and the cloudification of the RAN and core. We present a comprehensive AI-enabled cloud security framework and conclude with proposals for implementing AI-powered, satellite-based NTN within future 5G Advanced/6G IoT networks. Our approach emphasizes zero-trust principles, federated learning, secure orchestration, a layered security framework, and resilience against adversarial threats.
Authors:Thorsten Peinemann, Paula Arnold, Sebastian Berndt, Thomas Eisenbarth, Esfandiar Mohammadi
Abstract:
Backdoor injection attacks are a threat to machine learning models that are trained on large data collected from untrusted sources; these attacks enable attackers to inject malicious behavior into the model that can be triggered by specially crafted inputs. Prior work has established bounds on the success of backdoor attacks and their impact on the benign learning task, however, an open question is what amount of poison data is needed for a successful backdoor attack. Typical attacks either use few samples, but need much information about the data points or need to poison many data points.
In this paper, we formulate the one-poison hypothesis: An adversary with one poison sample and limited background knowledge can inject a backdoor with zero backdooring-error and without significantly impacting the benign learning task performance. Moreover, we prove the one-poison hypothesis for linear regression and linear classification. For adversaries that utilize a direction that is unused by the benign data distribution for the poison sample, we show that the resulting model is functionally equivalent to a model where the poison was excluded from training. We build on prior work on statistical backdoor learning to show that in all other cases, the impact on the benign learning task is still limited. We also validate our theoretical results experimentally with realistic benchmark data sets.
Authors:Leon Garza, Anantaa Kotal, Aritran Piplai, Lavanya Elluri, Prajit Das, Aman Chadha
Abstract:
Redacting Personally Identifiable Information (PII) from unstructured text is critical for ensuring data privacy in regulated domains. While earlier approaches have relied on rule-based systems and domain-specific Named Entity Recognition (NER) models, these methods fail to generalize across formats and contexts. Recent advances in Large Language Models (LLMs) offer a promising alternative, yet the effect of architectural and training choices on redaction performance remains underexplored. LLMs have demonstrated strong performance in tasks that require contextual language understanding, including the redaction of PII in free-form text. Prior work suggests that with appropriate adaptation, LLMs can become effective contextual privacy learners. However, the consequences of architectural and training choices for PII Redaction remain underexplored. In this work, we present a comprehensive analysis of LLMs as privacy-preserving PII Redaction systems. We evaluate a range of LLM architectures and training strategies for their effectiveness in PII Redaction. Our analysis measures redaction performance, semantic preservation, and PII leakage, and compares these outcomes against latency and computational cost. The results provide practical guidance for configuring LLM-based redactors that are accurate, efficient, and privacy-aware. To support reproducibility and real-world deployment, we release PRvL, an open-source suite of fine-tuned models, and evaluation tools for general-purpose PII Redaction. PRvL is built entirely on open-source LLMs and supports multiple inference settings for flexibility and compliance. It is designed to be easily customized for different domains and fully operable within secure, self-managed environments. This enables data owners to perform redactions without relying on third-party services or exposing sensitive content beyond their own infrastructure.
Authors:Jörn Bodenhausen, Simon Mangel, Thomas Vogt, Martin Henze
Abstract:
While TLS has become the de-facto standard for end-to-end security, its use to secure critical communication in evolving industrial IoT scenarios is severely limited by prevalent resource constraints of devices and networks. Most notably, the TLS handshake to establish secure connections incurs significant bandwidth and processing overhead that often cannot be handled in constrained environments. To alleviate this situation, we present BiTHaC which realizes bidirectional TLS handshake caching by exploiting that significant parts of repeated TLS handshakes, especially certificates, are static. Thus, redundant information neither needs to be transmitted nor corresponding computations performed, saving valuable bandwidth and processing resources. By implementing BiTHaC for wolfSSL, we show that we can reduce the bandwidth consumption of TLS handshakes by up to 61.1% and the computational overhead by up to 8.5%, while incurring only well-manageable memory overhead and preserving the strict security guarantees of TLS.
Authors:Yu Pan, Jiahao Chen, Lin Wang, Bingrong Dai, Yi Du
Abstract:
In recent years, Diffusion models have achieved remarkable progress in the field of image generation. However, recent studies have shown that diffusion models are susceptible to backdoor attacks, in which attackers can manipulate the output by injecting covert triggers such as specific visual patterns or textual phrases into the training dataset. Fortunately, with the continuous advancement of defense techniques, defenders have become increasingly capable of identifying and mitigating most backdoor attacks using visual inspection and neural network-based detection methods. However, in this paper, we identify a novel type of backdoor threat that is more lightweight and covert than existing approaches, which we name BadBlocks, requires only about 30% of the computational resources and 20% GPU time typically needed by previous backdoor attacks, yet it successfully injects backdoors and evades the most advanced defense frameworks. BadBlocks enables attackers to selectively contaminate specific blocks within the UNet architecture of diffusion models while maintaining normal functionality in the remaining components. Experimental results demonstrate that BadBlocks achieves a high attack success rate and low perceptual quality loss , even under extremely constrained computational resources and GPU time. Moreover, BadBlocks is able to bypass existing defense frameworks, especially the attention-based backdoor detection method, highlighting it as a novel and noteworthy threat. Ablation studies further demonstrate that effective backdoor injection does not require fine-tuning the entire network and highlight the pivotal role of certain neural network layers in backdoor mapping. Overall, BadBlocks significantly reduces the barrier to conducting backdoor attacks in all aspects. It enables attackers to inject backdoors into large-scale diffusion models even using consumer-grade GPUs.
Authors:Kang Chen, Xiuze Zhou, Yuanguo Lin, Jinhe Su, Yuanhui Yu, Li Shen, Fan Lin
Abstract:
Large Language Models (LLMs), now a foundation in advancing natural language processing, power applications such as text generation, machine translation, and conversational systems. Despite their transformative potential, these models inherently rely on massive amounts of training data, often collected from diverse and uncurated sources, which exposes them to serious data security risks. Harmful or malicious data can compromise model behavior, leading to issues such as toxic output, hallucinations, and vulnerabilities to threats such as prompt injection or data poisoning. As LLMs continue to be integrated into critical real-world systems, understanding and addressing these data-centric security risks is imperative to safeguard user trust and system reliability. This survey offers a comprehensive overview of the main data security risks facing LLMs and reviews current defense strategies, including adversarial training, RLHF, and data augmentation. Additionally, we categorize and analyze relevant datasets used for assessing robustness and security across different domains, providing guidance for future research. Finally, we highlight key research directions that focus on secure model updates, explainability-driven defenses, and effective governance frameworks, aiming to promote the safe and responsible development of LLM technology. This work aims to inform researchers, practitioners, and policymakers, driving progress toward data security in LLMs.
Authors:Dor Elimelech, Wasim Huleihel
Abstract:
Detection of planted subgraphs in Erdös-Rényi random graphs has been extensively studied, leading to a rich body of results characterizing both statistical and computational thresholds. However, most prior work assumes a purely random generative model, making the resulting algorithms potentially fragile in the face of real-world perturbations. In this work, we initiate the study of semi-random models for the planted subgraph detection problem, wherein an adversary is allowed to remove edges outside the planted subgraph before the graph is revealed to the statistician. Crucially, the statistician remains unaware of which edges have been removed, introducing fundamental challenges to the inference task. We establish fundamental statistical limits for detection under this semi-random model, revealing a sharp dichotomy. Specifically, for planted subgraphs with strongly sub-logarithmic maximum density detection becomes information-theoretically impossible in the presence of an adversary, despite being possible in the classical random model. In stark contrast, for subgraphs with super-logarithmic density, the statistical limits remain essentially unchanged; we prove that the optimal (albeit computationally intractable) likelihood ratio test remains robust. Beyond these statistical boundaries, we design a new computationally efficient and robust detection algorithm, and provide rigorous statistical guarantees for its performance. Our results establish the first robust framework for planted subgraph detection and open new directions in the study of semi-random models, computational-statistical trade-offs, and robustness in graph inference problems.
Authors:Arunava Chaudhuri, Shubhi Shukla, Sarani Bhattacharya, Debdeep Mukhopadhyay
Abstract:
Transformers have become the backbone of many Machine Learning (ML) applications, including language translation, summarization, and computer vision. As these models are increasingly deployed in shared Graphics Processing Unit (GPU) environments via Machine Learning as a Service (MLaaS), concerns around their security grow. In particular, the risk of side-channel attacks that reveal architectural details without physical access remains underexplored, despite the high value of the proprietary models they target. This work to the best of our knowledge is the first to investigate GPU power and thermal fluctuations as side-channels and further exploit them to extract information from pre-trained transformer models. The proposed analysis shows how these side channels can be exploited at user-privilege to reveal critical architectural details such as encoder/decoder layer and attention head for both language and vision transformers. We demonstrate the practical impact by evaluating multiple language and vision pre-trained transformers which are publicly available. Through extensive experimental evaluations, we demonstrate that the attack model achieves a high accuracy of over 89% on average for model family identification and 100% for hyperparameter classification, in both single-process as well as noisy multi-process scenarios. Moreover, by leveraging the extracted architectural information, we demonstrate highly effective black-box transfer adversarial attacks with an average success rate exceeding 93%, underscoring the security risks posed by GPU side-channel leakage in deployed transformer models.
Authors:Xindi Fan, Jing Wu, Mingyi Zhou, Pengwei Liang, Dinh Phung
Abstract:
Recent studies have shown that deep learning models are vulnerable to attacks and tend to memorize training data points, raising significant concerns about privacy leakage. This motivates the development of machine unlearning (MU), i.e., a paradigm that enables models to selectively forget specific data points upon request. However, most existing MU algorithms require partial or full fine-tuning on the retain set. This necessitates continued access to the original training data, which is often impractical due to privacy concerns and storage constraints. A few retain-data-free MU methods have been proposed, but some rely on access to auxiliary data and precomputed statistics of the retain set, while others scale poorly when forgetting larger portions of data. In this paper, we propose Influence-guided Machine Unlearning (IMU), a simple yet effective method that conducts MU using only the forget set. Specifically, IMU employs gradient ascent and innovatively introduces dynamic allocation of unlearning intensities across different data points based on their influences. This adaptive strategy significantly enhances unlearning effectiveness while maintaining model utility. Results across vision and language tasks demonstrate that IMU consistently outperforms existing retain-data-free MU methods.
Authors:Pengcheng Zhou, Yinglun Feng, Zhongliang Yang
Abstract:
Although Retrieval-Augmented Generation (RAG) systems have been widely applied, the privacy and security risks they face, such as data leakage and data poisoning, have not been systematically addressed yet. Existing defense strategies primarily rely on heuristic filtering or enhancing retriever robustness, which suffer from limited interpretability, lack of formal security guarantees, and vulnerability to adaptive attacks. To address these challenges, this paper proposes the first provably secure framework for RAG systems(SAG). Our framework employs a pre-storage full-encryption scheme to ensure dual protection of both retrieved content and vector embeddings, guaranteeing that only authorized entities can access the data. Through formal security proofs, we rigorously verify the scheme's confidentiality and integrity under a computational security model. Extensive experiments across multiple benchmark datasets demonstrate that our framework effectively resists a range of state-of-the-art attacks. This work establishes a theoretical foundation and practical paradigm for verifiably secure RAG systems, advancing AI-powered services toward formally guaranteed security.
Authors:Julien Simon de Kergunic, Rony Abecidan, Patrick Bas, Vincent Itier
Abstract:
Despite recent progress in splicing detection, deep learning-based forensic tools remain difficult to deploy in practice due to their high sensitivity to training conditions. Even mild post-processing applied to evaluation images can significantly degrade detector performance, raising concerns about their reliability in operational contexts. In this work, we show that the same deep architecture can react very differently to unseen post-processing depending on the learned weights, despite achieving similar accuracy on in-distribution test data. This variability stems from differences in the latent spaces induced by training, which affect how samples are separated internally. Our experiments reveal a strong correlation between the distribution of latent margins and a detector's ability to generalize to post-processed images. Based on this observation, we propose a practical strategy for building more robust detectors: train several variants of the same model under different conditions, and select the one that maximizes latent margins.
Authors:MichaÅ Forystek, Andrew D. Syrmakesis, Alkistis Kontou, Panos Kotsampopoulos, Nikos D. Hatziargyriou, Charalambos Konstantinou
Abstract:
Integrating Information and Communications Technology (ICT) devices into the power grid brings many benefits. However, it also exposes the grid to new potential cyber threats. Many control and protection mechanisms, such as Load Frequency Control (LFC), responsible for maintaining nominal frequency during load fluctuations and Under Frequency Load Shedding (UFLS) disconnecting portion of the load during an emergency, are dependent on information exchange through the communication network. The recently emerging Load Altering Attacks (LAAs) utilize a botnet of high-wattage devices to introduce load fluctuation. In their dynamic form (DLAAs), they manipulate the load in response to live grid frequency measurements for increased efficiency, posing a notable threat to grid stability. Recognizing the importance of communication networks in power grid cyber security research, this paper presents an open-source co-simulation environment that models the power grid with the corresponding communication network, implementing grid protective mechanisms. This setup allows the comprehensive analysis of the attacks in concrete LFC and UFLS scenarios.
Authors:Chunyi Zhang, Fengjiao Dou, Xiaoqi Li
Abstract:
Blockchain technology is widely used in various fields due to its ability to provide decentralization and trustless security. This is a fundamental understanding held by many advocates, but it is misunderstood, leading participants to fail to recognize the limitations of the security that blockchain can provide. Among all current network attacks, Denial of Service (DoS) attacks pose significant threats due to their ease of execution and destructive potential. This paper, based on the blockchain architecture hierarchy, categorizes and organizes existing DoS attacks, with a focus on explaining the principles and methods of contract layer and consensus layer DoS attacks. Furthermore, this paper comprehensively analyzes and compares commonly used detection methods and defense technologies, which will contribute to strengthening the security and stability of blockchain systems and promoting further innovation and application of blockchain systems.
Authors:Alexander Goldberg, Giulia Fanti, Nihar Shah, Zhiwei Steven Wu
Abstract:
We introduce the novel problem of benchmarking fraud detectors on private graph-structured data. Currently, many types of fraud are managed in part by automated detection algorithms that operate over graphs. We consider the scenario where a data holder wishes to outsource development of fraud detectors to third parties (e.g., vendors or researchers). The third parties submit their fraud detectors to the data holder, who evaluates these algorithms on a private dataset and then publicly communicates the results. We propose a realistic privacy attack on this system that allows an adversary to de-anonymize individuals' data based only on the evaluation results. In simulations of a privacy-sensitive benchmark for facial recognition algorithms by the National Institute of Standards and Technology (NIST), our attack achieves near perfect accuracy in identifying whether individuals' data is present in a private dataset, with a True Positive Rate of 0.98 at a False Positive Rate of 0.00. We then study how to benchmark algorithms while satisfying a formal differential privacy (DP) guarantee. We empirically evaluate two classes of solutions: subsample-and-aggregate and DP synthetic graph data. We demonstrate through extensive experiments that current approaches do not provide utility when guaranteeing DP. Our results indicate that the error arising from DP trades off between bias from distorting graph structure and variance from adding random noise. Current methods lie on different points along this bias-variance trade-off, but more complex methods tend to require high-variance noise addition, undermining utility.
Authors:Yassine Rachidy, Jihad Rbaiti, Youssef Hmamouche, Faissal Sehbaoui, Amal El Fallah Seghrouchni
Abstract:
With the growing adoption of Large Language Models (LLMs) in critical areas, ensuring their security against jailbreaking attacks is paramount. While traditional defenses primarily rely on refusing malicious prompts, recent logit-level attacks have demonstrated the ability to bypass these safeguards by directly manipulating the token-selection process during generation. We introduce Strategic Deflection (SDeflection), a defense that redefines the LLM's response to such advanced attacks. Instead of outright refusal, the model produces an answer that is semantically adjacent to the user's request yet strips away the harmful intent, thereby neutralizing the attacker's harmful intent. Our experiments demonstrate that SDeflection significantly lowers Attack Success Rate (ASR) while maintaining model performance on benign queries. This work presents a critical shift in defensive strategies, moving from simple refusal to strategic content redirection to neutralize advanced threats.
Authors:Pranet Sharma, Yizhuo Tan, Konstantinos-Nikolaos Papadopoulos, Jakub Szefer
Abstract:
This work explores and evaluates noise and crosstalk in neutral atom quantum computers. Neutral atom quantum computers are a promising platform for analog Hamiltonian simulations, which rely on a sequence of time-dependent Hamiltonians to model the dynamics of a larger system and are particularly useful for problems in optimization, physics, and molecular dynamics. However, the viability of running multiple simulations in a co-located or multi-tenant environment is limited by noise and crosstalk. This work conducts an analysis of how noise faced by simulations changes over time, and investigates the effects of spatial co-location on simulation fidelity. Findings of this work demonstrate that the close proximity of concurrent simulations can increase crosstalk between them. To mitigate this issue, a Moving Target Defense (MTD) strategy is proposed and evaluated. The results confirm that the MTD is a viable technique for enabling safe and reliable co-location of simulations on neutral atom quantum hardware.
Authors:Xiaotao Feng, Xiaogang Zhu, Kun Hu, Jincheng Wang, Yingjie Cao, Guang Gong, Jianfeng Pan
Abstract:
Fuzzing is highly effective in detecting bugs due to the key contribution of randomness. However, randomness significantly reduces the efficiency of fuzzing, causing it to cost days or weeks to expose bugs. Even though directed fuzzing reduces randomness by guiding fuzzing towards target buggy locations, the dilemma of randomness still challenges directed fuzzers. Two critical components, which are seeds and mutators, contain randomness and are closely tied to the conditions required for triggering bugs. Therefore, to address the challenge of randomness, we propose to use large language models (LLMs) to remove the randomness in seeds and reduce the randomness in mutators. With their strong reasoning and code generation capabilities, LLMs can be used to generate reachable seeds that target pre-determined locations and to construct bug-specific mutators tailored for specific bugs. We propose RandLuzz, which integrates LLMs and directed fuzzing, to improve the quality of seeds and mutators, resulting in efficient bug exposure. RandLuzz analyzes function call chain or functionality to guide LLMs in generating reachable seeds. To construct bug-specific mutators, RandLuzz uses LLMs to perform bug analysis, obtaining information such as bug causes and mutation suggestions, which further help generate code that performs bug-specific mutations. We evaluate RandLuzz by comparing it with four state-of-the-art directed fuzzers, AFLGo, Beacon, WindRanger, and SelectFuzz. With RandLuzz-generated seeds, the fuzzers achieve an average speedup ranging from 2.1$\times$ to 4.8$\times$ compared to using widely-used initial seeds. Additionally, when evaluated on individual bugs, RandLuzz achieves up to a 2.7$\times$ speedup compared to the second-fastest exposure. On 8 bugs, RandLuzz can even expose them within 60 seconds.
Authors:Zixuan Chen, Weikai Lu, Xin Lin, Ziqian Zeng
Abstract:
Open-source Large Language Models (LLMs) often employ safety alignment methods to resist harmful instructions. However, recent research shows that maliciously fine-tuning these LLMs on harmful data can easily bypass these safeguards. To counter this, we theoretically uncover why malicious fine-tuning succeeds and identify potential defense strategies. Building on the theoretical analysis, we introduce the Self-Degraded Defense (SDD) framework. SDD encourages LLMs to produce high-quality but irrelevant responses to harmful prompts. When attackers attempt malicious fine-tuning, the general capability of the LLM aligned by SDD will significantly decrease, rendering it incapable of following harmful instructions. Our experimental results confirm SDD's effectiveness against such attacks.
Authors:Chad DeLuca, Anna Lisa Gentile, Shubhi Asthana, Bing Zhang, Pawan Chowdhary, Kellen Cheng, Basel Shbita, Pengyuan Li, Guang-Jie Ren, Sandeep Gopisetty
Abstract:
The rise of Large Language Models has created a general excitement about the great potential for a myriad of applications. While LLMs offer many possibilities, questions about safety, privacy, and ethics have emerged, and all the key actors are working to address these issues with protective measures for their own models and standalone solutions. The constantly evolving nature of LLMs makes it extremely challenging to universally shield users against their potential risks, and one-size-fits-all solutions are unfeasible. In this work, we propose OneShield, our stand-alone, model-agnostic and customizable solution to safeguard LLMs. OneShield aims to provide facilities for defining risk factors, expressing and declaring contextual safety and compliance policies, and mitigating LLM risks, with a focus on each specific customer. We describe the implementation of the framework, discuss scalability considerations, and provide usage statistics of OneShield since its initial deployment.
Authors:Anas Ali, Mubashar Husain, Peter Hans
Abstract:
Address Resolution Protocol (ARP) spoofing remains a critical threat to IoT networks, enabling attackers to intercept, modify, or disrupt data transmission by exploiting ARP's lack of authentication. The decentralized and resource-constrained nature of IoT environments amplifies this vulnerability, making conventional detection mechanisms ineffective at scale. This paper introduces an intelligent, multi-layered machine learning framework designed to detect ARP spoofing in real-time IoT deployments. Our approach combines feature engineering based on ARP header behavior, traffic flow analysis, and temporal packet anomalies with a hybrid detection pipeline incorporating decision trees, ensemble models, and deep learning classifiers. We propose a hierarchical architecture to prioritize lightweight models at edge gateways and deeper models at centralized nodes to balance detection accuracy and computational efficiency. The system is validated on both simulated IoT traffic and the CICIDS2017 dataset, achieving over 97% detection accuracy with low false positive rates. Comparative evaluations with signature-based and rule-based systems demonstrate the robustness and generalizability of our approach. Our results show that intelligent machine learning integration enables proactive ARP spoofing detection tailored for IoT scenarios, laying the groundwork for scalable and autonomous network security solutions.
Authors:Petr Spelda, Vit Stritecky
Abstract:
What makes safety claims about general purpose AI systems such as large language models trustworthy? We show that rather than the capabilities of security tools such as alignment and red teaming procedures, it is security practices based on these tools that contributed to reconfiguring the image of AI safety and made the claims acceptable. After showing what causes the gap between the capabilities of security tools and the desired safety guarantees, we critically investigate how AI security practices attempt to fill the gap and identify several shortcomings in diversity and participation. We found that these security practices are part of securitization processes aiming to support (commercial) development of general purpose AI systems whose trustworthiness can only be imperfectly tested instead of guaranteed. We conclude by offering several improvements to the current AI security practices.
Authors:Armin Namavari, Thomas Ristenpart
Abstract:
Message franking is an indispensable abuse mitigation tool for end-to-end encrypted (E2EE) messaging platforms. With it, users who receive harmful content can securely report that content to platform moderators. However, while real-world deployments of reporting require the disclosure of multiple messages, existing treatments of message franking only consider the report of a single message. As a result, there is a gap between the security goals achieved by constructions and those needed in practice. Our work introduces transcript franking, a new type of protocol that allows reporting subsets of conversations such that moderators can cryptographically verify message causality and contents. We define syntax, semantics, and security for transcript franking in two-party and group messaging. We then present efficient constructions for transcript franking and prove their security. Looking toward deployment considerations, we provide detailed discussion of how real-world messaging systems can incorporate our protocols.
Authors:Haotian Zhang, Kun Liu, Cristian Garces, Chenke Luo, Yu Lei, Jiang Ming
Abstract:
Binary code analysis is essential in scenarios where source code is unavailable, with extensive applications across various security domains. However, accurately resolving indirect call targets remains a longstanding challenge in maintaining the integrity of static analysis in binary code. This difficulty arises because the operand of a call instruction (e.g., call rax) remains unknown until runtime, resulting in an incomplete inter-procedural control flow graph (CFG). Previous approaches have struggled with low accuracy and limited scalability. To address these limitations, recent work has increasingly turned to machine learning (ML) to enhance analysis. However, this ML-driven approach faces two significant obstacles: low-quality callsite-callee training pairs and inadequate binary code representation, both of which undermine the accuracy of ML models. In this paper, we introduce NeuCall, a novel approach for resolving indirect calls using graph neural networks. Existing ML models in this area often overlook key elements such as data and code cross-references, which are essential for understanding a program's control flow. In contrast, NeuCall augments CFGs with cross-references, preserving rich semantic information. Additionally, we leverage advanced compiler-level type analysis to generate high-quality callsite-callee training pairs, enhancing model precision and reliability. We further design a graph neural model that leverages augmented CFGs and relational graph convolutions for accurate target prediction. Evaluated against real-world binaries from GitHub and the Arch User Repository on x86_64 architecture, NeuCall achieves an F1 score of 95.2%, outperforming state-of-the-art ML-based approaches. These results highlight NeuCall's effectiveness in building precise inter-procedural CFGs and its potential to advance downstream binary analysis and security applications.
Authors:Vita Santa Barletta, Vito Bavaro, Miriana Calvano, Antonio Curci, Antonio Piccinno, Davide Pio Posa
Abstract:
Digital Twins (DTs) are gaining prominence in cybersecurity for their ability to replicate complex IT (Information Technology), OT (Operational Technology), and IoT (Internet of Things) infrastructures, allowing for real time monitoring, threat analysis, and system simulation. This study investigates how integrating DTs with penetration testing tools and Large Language Models (LLMs) can enhance cybersecurity education and operational readiness. By simulating realistic cyber environments, this approach offers a practical, interactive framework for exploring vulnerabilities and defensive strategies. At the core of this research is the Red Team Knife (RTK), a custom penetration testing toolkit aligned with the Cyber Kill Chain model. RTK is designed to guide learners through key phases of cyberattacks, including reconnaissance, exploitation, and response within a DT powered ecosystem. The incorporation of Large Language Models (LLMs) further enriches the experience by providing intelligent, real-time feedback, natural language threat explanations, and adaptive learning support during training exercises. This combined DT LLM framework is currently being piloted in academic settings to develop hands on skills in vulnerability assessment, threat detection, and security operations. Initial findings suggest that the integration significantly improves the effectiveness and relevance of cybersecurity training, bridging the gap between theoretical knowledge and real-world application. Ultimately, the research demonstrates how DTs and LLMs together can transform cybersecurity education to meet evolving industry demands.
Authors:Mohammad Eslami, Ashira Johara, Kyungbin Park, Samuel Pagliarini
Abstract:
In the traditional Application-Specific Integrated Circuit (ASIC) design flow, the concept of timing closure implies to reach convergence during physical synthesis such that, under a given area and power budget, the design works at the targeted frequency. However, security has been largely neglected when evaluating the Quality of Results (QoR) from physical synthesis. In general, commercial place & route tools do not understand security goals. In this work, we propose a modified ASIC design flow that is security-aware and, differently from prior research, does not degrade QoR for the sake of security improvement. Therefore, we propose a first-of-its-kind zero-overhead flow for security closure. Our flow is concerned with two distinct threat models: (i) insertion of Hardware Trojans (HTs) and (ii) physical probing/fault injection. Importantly, the flow is entirely executed within a commercial place & route engine and is scalable. In several metrics, our security-aware flow achieves the best-known results for the ISPD`22 set of benchmark circuits while incurring negligible design overheads due to security-related strategies. Finally, we open source the entire methodology (as a set of scripts) and also share the protected circuits (as design databases) for the benefit of the hardware security community.
Authors:Lijie Zheng, Ji He, Shih Yu Chang, Yulong Shen, Dusit Niyato
Abstract:
This work tackles the physical layer security (PLS) problem of maximizing the secrecy rate in heterogeneous UAV networks (HetUAVNs) under propulsion energy constraints. Unlike prior studies that assume uniform UAV capabilities or overlook energy-security trade-offs, we consider a realistic scenario where UAVs with diverse payloads and computation resources collaborate to serve ground terminals in the presence of eavesdroppers. To manage the complex coupling between UAV motion and communication, we propose a hierarchical optimization framework. The inner layer uses a semidefinite relaxation (SDR)-based S2DC algorithm combining penalty functions and difference-of-convex (d.c.) programming to solve the secrecy precoding problem with fixed UAV positions. The outer layer introduces a Large Language Model (LLM)-guided heuristic multi-agent reinforcement learning approach (LLM-HeMARL) for trajectory optimization. LLM-HeMARL efficiently incorporates expert heuristics policy generated by the LLM, enabling UAVs to learn energy-aware, security-driven trajectories without the inference overhead of real-time LLM calls. The simulation results show that our method outperforms existing baselines in secrecy rate and energy efficiency, with consistent robustness across varying UAV swarm sizes and random seeds.
Authors:Joshua Kalyanapu, Farshad Dizani, Darsh Asher, Azam Ghanbari, Rosario Cammarota, Aydin Aysu, Samira Mirbagher Ajorpaz
Abstract:
As power consumption from AI training and inference continues to increase, AI accelerators are being integrated directly into the CPU. Intel's Advanced Matrix Extensions (AMX) is one such example, debuting on the 4th generation Intel Xeon Scalable CPU. We discover a timing side and covert channel, GATEBLEED, caused by the aggressive power gating utilized to keep the CPU within operating limits. We show that the GATEBLEED side channel is a threat to AI privacy as many ML models such as transformers and CNNs make critical computationally-heavy decisions based on private values like confidence thresholds and routing logits. Timing delays from selective powering down of AMX components mean that each matrix multiplication is a potential leakage point when executed on the AMX accelerator. Our research identifies over a dozen potential gadgets across popular ML libraries (HuggingFace, PyTorch, TensorFlow, etc.), revealing that they can leak sensitive and private information. GATEBLEED poses a risk for local and remote timing inference, even under previous protective measures. GATEBLEED can be used as a high performance, stealthy remote covert channel and a generic magnifier for timing transmission channels, capable of bypassing traditional cache defenses to leak arbitrary memory addresses and evading state of the art microarchitectural attack detectors under realistic network conditions and system configurations in which previous attacks fail. We implement an end-to-end microarchitectural inference attack on a transformer model optimized with Intel AMX, achieving a membership inference accuracy of 81% and a precision of 0.89. In a CNN-based or transformer-based mixture-of-experts model optimized with Intel AMX, we leak expert choice with 100% accuracy. To our knowledge, this is the first side-channel attack on AI privacy that exploits hardware optimizations.
Authors:Igor Klep, Connor Paddock, Marc-Olivier Renou, Simon Schmidt, Lucas Tendick, Xiangling Xu, Yuming Zhao
Abstract:
Compiling Bell games under cryptographic assumptions replaces the need for physical separation, allowing nonlocality to be probed with a single untrusted device. While Kalai et al. (STOC'23) showed that this compilation preserves quantum advantages, its quantitative quantum soundness has remained an open problem. We address this gap with two primary contributions. First, we establish the first quantitative quantum soundness bounds for every bipartite compiled Bell game whose optimal quantum strategy is finite-dimensional: any polynomial-time prover's score in the compiled game is negligibly close to the game's ideal quantum value. More generally, for all bipartite games we show that the compiled score cannot significantly exceed the bounds given by a newly formalized sequential Navascués-Pironio-AcÃn (NPA) hierarchy. Second, we provide a full characterization of this sequential NPA hierarchy, establishing it as a robust numerical tool that is of independent interest. Finally, for games without finite-dimensional optimal strategies, we explore the necessity of NPA approximation error for quantitatively bounding their compiled scores, linking these considerations to the complexity conjecture $\mathrm{MIP}^{\mathrm{co}}=\mathrm{coRE}$ and open challenges such as quantum homomorphic encryption correctness for "weakly commuting" quantum registers.
Authors:Ahmed Lekssays, Hamza Mouhcine, Khang Tran, Ting Yu, Issa Khalil
Abstract:
Software vulnerabilities present a persistent security challenge, with over 25,000 new vulnerabilities reported in the Common Vulnerabilities and Exposures (CVE) database in 2024 alone. While deep learning based approaches show promise for vulnerability detection, recent studies reveal critical limitations in terms of accuracy and robustness: accuracy drops by up to 45% on rigorously verified datasets, and performance degrades significantly under simple code modifications. This paper presents LLMxCPG, a novel framework integrating Code Property Graphs (CPG) with Large Language Models (LLM) for robust vulnerability detection. Our CPG-based slice construction technique reduces code size by 67.84 to 90.93% while preserving vulnerability-relevant context. Our approach's ability to provide a more concise and accurate representation of code snippets enables the analysis of larger code segments, including entire projects. This concise representation is a key factor behind the improved detection capabilities of our method, as it can now identify vulnerabilities that span multiple functions. Empirical evaluation demonstrates LLMxCPG's effectiveness across verified datasets, achieving 15-40% improvements in F1-score over state-of-the-art baselines. Moreover, LLMxCPG maintains high performance across function-level and multi-function codebases while exhibiting robust detection efficacy under various syntactic code modifications.
Authors:Baofu Han, Bing Li, Yining Qi, Raja Jurdak, Kaibin Huang, Chau Yuen
Abstract:
Privacy-Preserving Federated Learning (PPFL) has emerged as a secure distributed Machine Learning (ML) paradigm that aggregates locally trained gradients without exposing raw data. To defend against model poisoning threats, several robustness-enhanced PPFL schemes have been proposed by integrating anomaly detection. Nevertheless, they still face two major challenges: (1) the reliance on heavyweight encryption techniques results in substantial communication and computation overhead; and (2) single-strategy defense mechanisms often fail to provide sufficient robustness against adaptive adversaries. To overcome these challenges, we propose DP2Guard, a lightweight PPFL framework that enhances both privacy and robustness. DP2Guard leverages a lightweight gradient masking mechanism to replace costly cryptographic operations while ensuring the privacy of local gradients. A hybrid defense strategy is proposed, which extracts gradient features using singular value decomposition and cosine similarity, and applies a clustering algorithm to effectively identify malicious gradients. Additionally, DP2Guard adopts a trust score-based adaptive aggregation scheme that adjusts client weights according to historical behavior, while blockchain records aggregated results and trust scores to ensure tamper-proof and auditable training. Extensive experiments conducted on two public datasets demonstrate that DP2Guard effectively defends against four advanced poisoning attacks while ensuring privacy with reduced communication and computation costs.
Authors:Eyasu Getahun Chekole, Howard Halim, Jianying Zhou
Abstract:
Unauthorized access remains one of the critical security challenges in the realm of cybersecurity. With the increasing sophistication of attack techniques, the threat of unauthorized access is no longer confined to the conventional ones, such as exploiting weak access control policies. Instead, advanced exploitation strategies, such as session hijacking-based attacks, are becoming increasingly prevalent, posing serious security concerns. Session hijacking enables attackers to take over an already established session between legitimate peers in a stealthy manner, thereby gaining unauthorized access to private resources. Unfortunately, traditional access control mechanisms, such as static access control policies, are insufficient to prevent session hijacking or other advanced exploitation techniques. In this work, we propose a new multi-factor authorization (MFAz) scheme that proactively mitigates unauthorized access attempts both conventional and advanced unauthorized access attacks. The proposed scheme employs fine-grained access control rules (ARs) and verification points (VPs) that are systematically generated from historically granted accesses as the first and second authorization factors, respectively. As a proof-of-concept, we implement the scheme using different techniques. We leverage bloom filter to achieve runtime and storage efficiency, and blockchain to make authorization decisions in a temper-proof and decentralized manner. To the best of our knowledge, this is the first formal introduction of a multi-factor authorization scheme, which is orthogonal to the multi-factor authentication (MFA) schemes. The effectiveness of our proposed scheme is experimentally evaluated using a smart-city testbed involving different devices with varying computational capacities. The experimental results reveal high effectiveness of the scheme both in security and performance guarantees.
Authors:Alessio Caminata, Elisa Gorla, Madison Mabe, Martina Vigorito, Irene Villa
Abstract:
We consider the multivariate scheme Pesto, which was introduced by Calderini, Caminata, and Villa. In this scheme, the public polynomials are obtained by applying a CCZ transformation to a set of quadratic secret polynomials. As a consequence, the public key consists of polynomials of degree 4. In this work, we show that the public degree 4 polynomial system can be efficiently reduced to a system of quadratic polynomials. This seems to suggest that the CCZ transformation may not offer a significant increase in security, contrary to what was initially believed.
Authors:Magali Bardet, Charles Brion, Philippe Gaborit, Mercedes Haiech, Romaric Neveu
Abstract:
Nowadays, equivalence problems are widely used in cryptography, most notably to establish cryptosystems such as digital signatures, with MEDS, LESS, PERK as the most recent ones. However, in the context of matrix codes, only the code equivalence problem has been studied, while the subcode equivalence is well-defined in the Hamming metric. In this work, we introduce two new problems: the Matrix Subcode Equivalence Problem and the Matrix Code Permuted Kernel Problem, to which we apply the MPCitH paradigm to build a signature scheme. These new problems, closely related to the Matrix Code Equivalence problem, ask to find an isometry given a code $C$ and a subcode $D$. Furthermore, we prove that the Matrix Subcode Equivalence problem reduces to the Hamming Subcode Equivalence problem, which is known to be NP-Complete, thus introducing the matrix code version of the Permuted Kernel Problem. We also adapt the combinatorial and algebraic algorithms for the Matrix Code Equivalence problem to the subcode case, and we analyze their complexities. We find with this analysis that the algorithms perform much worse than in the code equivalence case, which is the same as what happens in the Hamming metric. Finally, our analysis of the attacks allows us to take parameters much smaller than in the Matrix Code Equivalence case. Coupled with the effectiveness of \textit{Threshold-Computation-in-the-Head} or \textit{VOLE-in-the-Head}, we obtain a signature size of $\approx$ 4 800 Bytes, with a public key of $\approx$ 275 Bytes. We thus obtain a reasonable signature size, which brings diversity in the landscape of post-quantum signature schemes, by relying on a new hard problem. In particular, this new signature scheme performs better than SPHINCS+, with a smaller size of public key + signature. Our signature compares also well with other signature schemes: compared to MEDS, the signature is smaller, and we reduced the size of the sum of signature and public key by a factor close to 5. We also obtain a signature size that is almost half the size of the CROSS signature scheme.
Authors:Menglu Li, Xiao-Ping Zhang, Lian Zhao
Abstract:
Detecting partial deepfake speech is essential due to its potential for subtle misinformation. However, existing methods depend on costly frame-level annotations during training, limiting real-world scalability. Also, they focus on detecting transition artifacts between bonafide and deepfake segments. As deepfake generation techniques increasingly smooth these transitions, detection has become more challenging. To address this, our work introduces a new perspective by analyzing frame-level temporal differences and reveals that deepfake speech exhibits erratic directional changes and unnatural local transitions compared to bonafide speech. Based on this finding, we propose a Temporal Difference Attention Module (TDAM) that redefines partial deepfake detection as identifying unnatural temporal variations, without relying on explicit boundary annotations. A dual-level hierarchical difference representation captures temporal irregularities at both fine and coarse scales, while adaptive average pooling preserves essential patterns across variable-length inputs to minimize information loss. Our TDAM-AvgPool model achieves state-of-the-art performance, with an EER of 0.59% on the PartialSpoof dataset and 0.03% on the HAD dataset, which significantly outperforms the existing methods without requiring frame-level supervision.
Authors:Argianto Rahartomo, Leonel Merino, Mohammad Ghafari
Abstract:
The rapid growth of metaverse technologies, including virtual worlds, augmented reality, and lifelogging, has accelerated their adoption across diverse domains. This rise exposes users to significant new security and privacy challenges due to sociotechnical complexity, pervasive connectivity, and extensive user data collection in immersive environments. We present a systematic review of the literature published between 2013 and 2024, offering a comprehensive analysis of how the research community has addressed metaverse-related security and privacy issues over the past decade. We organize the studies by method, examined the security and privacy properties, immersive components, and evaluation strategies. Our investigation reveals a sharp increase in research activity in the last five years, a strong focus on practical and user-centered approaches, and a predominant use of benchmarking, human experimentation, and qualitative methods. Authentication and unobservability are the most frequently studied properties. However, critical gaps remain in areas such as policy compliance, accessibility, interoperability, and back-end infrastructure security. We emphasize the intertwined technical complexity and human factors of the metaverse and call for integrated, interdisciplinary approaches to securing inclusive and trustworthy immersive environments.
Authors:Zhou Li, Xiang Zhang, Jiawen Lv, Jihao Fan, Haiqiang Chen, Giuseppe Caire
Abstract:
Motivated by federated learning (FL), secure aggregation (SA) aims to securely compute, as efficiently as possible, the sum of a set of inputs distributed across many users. To understand the impact of network topology, hierarchical secure aggregation (HSA) investigated the communication and secret key generation efficiency in a 3-layer relay network, where clusters of users are connected to the aggregation server through an intermediate layer of relays. Due to the pre-aggregation of the messages at the relays, HSA reduces the communication burden on the relay-to-server links and is able to support a large number of users. However, as the number of users increases, a practical challenge arises from heterogeneous security requirements--for example, users in different clusters may require varying levels of input protection. Motivated by this, we study weakly-secure HSA (WS-HSA) with collusion resilience, where instead of protecting all the inputs from any set of colluding users, only the inputs belonging to a predefined collection of user groups (referred to as security input sets) need to be protected against another predefined collection of user groups (referred to as collusion sets). Since the security input sets and collusion sets can be arbitrarily defined, our formulation offers a flexible framework for addressing heterogeneous security requirements in HSA. We characterize the optimal total key rate, i.e., the total number of independent key symbols required to ensure both server and relay security, for a broad range of parameter configurations. For the remaining cases, we establish lower and upper bounds on the optimal key rate, providing constant-factor gap optimality guarantees.
Authors:Niklas Busch, Philip Klostermeyer, Jan H. Klemmer, Yasemin Acar, Sascha Fahl
Abstract:
Hardening computer systems against cyberattacks is crucial for security. However, past incidents illustrated, that many system operators struggle with effective system hardening. Hence, many computer systems and applications remain insecure. So far, the research community lacks an in-depth understanding of system operators motivation, practices, and challenges around system hardening. With a focus on practices and challenges, we qualitatively analyzed 316 Stack Exchange (SE) posts related to system hardening. We find that access control and deployment-related issues are the most challenging, and system operators suffer from misconceptions and unrealistic expectations. Most frequently, posts focused on operating systems and server applications. System operators were driven by the fear of their systems getting attacked or by compliance reasons. Finally, we discuss our research questions, make recommendations for future system hardening, and illustrate the implications of our work.
Authors:Shogo Murasaki, Kazumasa Omote, Keita Emura
Abstract:
An address is indicated as an identifier of the user on the blockchain, and is defined by a hash value of the ECDSA verification key. A vanity address is an address that embeds custom characters such as a name. To generate a vanity address, a classical try-and-error method is employed, and thus the number of characters to be embedded is limited. In this paper, we focus on the functionality of identity-based signatures (IBS) where any strings can be employed as a verification key, and explore whether IBS can be used for generating a vanity address. We attach importance to the fact that it is not realistic to replace ECDSA with key recovery, which is currently employed for issuing transactions in Ethereum, to an IBS scheme. Even if this replacement is possible, it is not a reasonable price for the ease of the vanity address generation. Thus, we pay attention to a generic construction of IBS from signatures, and construct an IBS scheme from ECDSA with key recovery. Though we cannot directly generate a vanity address due to the key recovery functionality of the underlying ECDSA, we can connect any string with an address due to the functionality of IBS that can give additional meaning to the address. We implement our system by Solidity, and demonstrate that the gas cost is almost same as that of the ECDSA signature verification.
Authors:Rina Mishra, Gaurav Varshney
Abstract:
The advent of advanced Generative AI (GenAI) models such as DeepSeek and ChatGPT has significantly reshaped the cybersecurity landscape, introducing both promising opportunities and critical risks. This study investigates how GenAI powered chatbot services can be exploited via jailbreaking techniques to bypass ethical safeguards, enabling the generation of phishing content, recommendation of hacking tools, and orchestration of phishing campaigns. In ethically controlled experiments, we used ChatGPT 4o Mini selected for its accessibility and status as the latest publicly available model at the time of experimentation, as a representative GenAI system. Our findings reveal that the model could successfully guide novice users in executing phishing attacks across various vectors, including web, email, SMS (smishing), and voice (vishing). Unlike automated phishing campaigns that typically follow detectable patterns, these human-guided, AI assisted attacks are capable of evading traditional anti phishing mechanisms, thereby posing a growing security threat. We focused on DeepSeek and ChatGPT due to their widespread adoption and technical relevance in 2025. The study further examines common jailbreaking techniques and the specific vulnerabilities exploited in these models. Finally, we evaluate a range of mitigation strategies such as user education, advanced authentication mechanisms, and regulatory policy measures and discuss emerging trends in GenAI facilitated phishing, outlining future research directions to strengthen cybersecurity defenses in the age of artificial intelligence.
Authors:Rahat Masood, Sunday Oyinlola Ogundoyin, Muhammad Ikram, Alex Ye
Abstract:
With the increasing concerns around privacy and the enforcement of data privacy laws, many websites now provide users with privacy controls. However, locating these controls can be challenging, as they are frequently hidden within multiple settings and layers. Moreover, the lack of standardization means these controls can vary widely across services. The technical or confusing terminology used to describe these controls further complicates users' ability to understand and use them effectively. This paper presents a large-scale empirical analysis investigating usability challenges of web privacy controls across 18,628 websites. While aiming for a multi-scenario view, our automated data collection faced significant hurdles, particularly in simulating sign-up and authenticated user visits, leading to more focused insights on guest visit scenarios and challenges in automated capture of dynamic user interactions. Our heuristic evaluation of three different user visit scenarios identifies significant website usability issues. Our results show that privacy policies are most common across all visit scenarios, with nudges and notices being prevalent in sign-up situations. We recommend designing privacy controls that: enhance awareness through pop-up nudges and notices; offer a table of contents as navigational aids and customized settings links in policies for more informed choice; and ensure accessibility via direct links to privacy settings from nudges.
Authors:Kshitij Raj, Atri Chatterjee, Patanjali SLPSK, Swarup Bhunia, Sandip Ray
Abstract:
Designing secure architectures for system-on-chip (SoC) platforms is a highly intricate and time-intensive task, often requiring months of development and meticulous verification. Even minor architectural oversights can lead to critical vulnerabilities that undermine the security of the entire chip. In response to this challenge, we introduce CITADEL, a modular security framework aimed at streamlining the creation of robust security architectures for SoCs. CITADEL offers a configurable, plug-and-play subsystem composed of custom intellectual property (IP) blocks, enabling the construction of diverse security mechanisms tailored to specific threats. As a concrete demonstration, we instantiate CITADEL to defend against supply-chain threats, illustrating how the framework adapts to one of the most pressing concerns in hardware security. This paper explores the range of obstacles encountered when building a unified security architecture capable of addressing multiple attack vectors and presents CITADEL's strategies for overcoming them. Through several real-world case studies, we showcase the practical implementation of CITADEL and present a thorough evaluation of its impact on silicon area and power consumption across various ASIC technologies. Results indicate that CITADEL introduces only minimal resource overhead, making it a practical solution for enhancing SoC security.
Authors:Zhanyu Wang, Arin Chang, Jordan Awan
Abstract:
We design a debiased parametric bootstrap framework for statistical inference from differentially private data. Existing usage of the parametric bootstrap on privatized data ignored or avoided handling the effect of clamping, a technique employed by the majority of privacy mechanisms. Ignoring the impact of clamping often leads to under-coverage of confidence intervals and miscalibrated type I errors of hypothesis tests. The main reason for the failure of the existing methods is the inconsistency of the parameter estimate based on the privatized data. We propose using the indirect inference method to estimate the parameter values consistently, and we use the improved estimator in parametric bootstrap for inference. To implement the indirect estimator, we present a novel simulation-based, adaptive approach along with the theory that establishes the consistency of the corresponding parametric bootstrap estimates, confidence intervals, and hypothesis tests. In particular, we prove that our adaptive indirect estimator achieves the minimum asymptotic variance among all "well-behaved" consistent estimators based on the released summary statistic. Our simulation studies show that our framework produces confidence intervals with well-calibrated coverage and performs hypothesis testing with the correct type I error, giving state-of-the-art performance for inference on location-scale normals, simple linear regression, and logistic regression.
Authors:Manuel Röder, Christoph Raab, Frank-Michael Schleif
Abstract:
Federated Learning has emerged as a leading paradigm for decentralized, privacy-preserving learning, particularly relevant in the era of interconnected edge devices equipped with sensors. However, the practical implementation of Federated Learning faces three primary challenges: the need for human involvement in costly data labelling processes for target adaptation, covariate shift in client device data collection due to environmental factors affecting sensors, leading to discrepancies between source and target samples, and the impracticality of continuous or regular model updates in resource-constrained environments due to limited data transmission capabilities and technical constraints on channel availability and energy efficiency. To tackle these issues, we expand upon an efficient and scalable Federated Learning framework tailored for real-world client adaptation in industrial settings. This framework leverages a pre-trained source model comprising a deep backbone, an adaptation module, and a classifier running on a powerful server. By freezing the backbone and classifier during client adaptation on resource-constrained devices, we allow the domain adaptive linear layer to handle target domain adaptation, thus minimizing overall computational overhead. Furthermore, this setup, designated as FedAcross+, is extended to encompass the processing of streaming data, thereby rendering the solution suitable for non-stationary environments. Extensive experimental results demonstrate the effectiveness of FedAcross+ in achieving competitive adaptation on low-end client devices with limited target samples, successfully addressing the challenge of domain shift. Moreover, our framework accommodates sporadic model updates within resource-constrained environments, ensuring practical and seamless deployment.
Authors:Xiaofei Wang, Mingliang Han, Tianyu Hao, Cegang Li, Yunbo Zhao, Keke Tang
Abstract:
Adversarial attacks on robotic grasping provide valuable insights into evaluating and improving the robustness of these systems. Unlike studies that focus solely on neural network predictions while overlooking the physical principles of grasping, this paper introduces AdvGrasp, a framework for adversarial attacks on robotic grasping from a physical perspective. Specifically, AdvGrasp targets two core aspects: lift capability, which evaluates the ability to lift objects against gravity, and grasp stability, which assesses resistance to external disturbances. By deforming the object's shape to increase gravitational torque and reduce stability margin in the wrench space, our method systematically degrades these two key grasping metrics, generating adversarial objects that compromise grasp performance. Extensive experiments across diverse scenarios validate the effectiveness of AdvGrasp, while real-world validations demonstrate its robustness and practical applicability
Authors:Gaurav Varshney, Akanksha Raj, Divya Sangwan, Sharif Abuadbba, Rina Mishra, Yansong Gao
Abstract:
Phishing is a prevalent cyberattack that uses look-alike websites to deceive users into revealing sensitive information. Numerous efforts have been made by the Internet community and security organizations to detect, prevent, or train users to avoid falling victim to phishing attacks. Most of this research over the years has been highly diverse and application-oriented, often serving as standalone solutions for HTTP clients, servers, or third parties. However, limited work has been done to develop a comprehensive or proactive protocol-oriented solution to effectively counter phishing attacks. Inspired by the concept of certificate transparency, which allows certificates issued by Certificate Authorities (CAs) to be publicly verified by clients, thereby enhancing transparency, we propose a concept called Page Transparency (PT) for the web. The proposed PT requires login pages that capture users' sensitive information to be publicly logged via PLS and made available to web clients for verification. The pages are verified to be logged using cryptographic proofs. Since all pages are logged on a PLS and visually compared with existing pages through a comprehensive visual page-matching algorithm, it becomes impossible for an attacker to register a deceptive look-alike page on the PLS and receive the cryptographic proof required for client verification. All implementations occur on the client side, facilitated by the introduction of a new HTTP PT header, eliminating the need for platform-specific changes or the installation of third-party solutions for phishing prevention.
Authors:Julio Gento Suela, Javier Blanco-Romero, Florina Almenares Mendoza, Daniel DÃaz-Sánchez
Abstract:
The emergence of quantum computers poses a significant threat to current secure service, application and/or protocol implementations that rely on RSA and ECDSA algorithms, for instance DNSSEC, because public-key cryptography based on number factorization or discrete logarithm is vulnerable to quantum attacks. This paper presents the integration of post-quantum cryptographic (PQC) algorithms into CoreDNS to enable quantum-resistant DNSSEC functionality. We have developed a plugin that extends CoreDNS with support for five PQC signature algorithm families: ML-DSA, FALCON, SPHINCS+, MAYO, and SNOVA. Our implementation maintains compatibility with existing DNS resolution flows while providing on-the-fly signing using quantum-resistant signatures. A benchmark has been performed and performance evaluation results reveal significant trade-offs between security and efficiency. The results indicate that while PQC algorithms introduce operational overhead, several candidates offer viable compromises for transitioning DNSSEC to quantum-resistant cryptography.
Authors:Dominika Woszczyk, Ranya Aloufi, Soteris Demetriou
Abstract:
Dementia, a neurodegenerative disease, alters speech patterns, creating communication barriers and raising privacy concerns. Current speech technologies, such as automatic speech transcription (ASR), struggle with dementia and atypical speech, further challenging accessibility. This paper presents a novel dementia obfuscation in speech framework, ClaritySpeech, integrating ASR, text obfuscation, and zero-shot text-to-speech (TTS) to correct dementia-affected speech while preserving speaker identity in low-data environments without fine-tuning. Results show a 16% and 10% drop in mean F1 score across various adversarial settings and modalities (audio, text, fusion) for ADReSS and ADReSSo, respectively, maintaining 50% speaker similarity. We also find that our system improves WER (from 0.73 to 0.08 for ADReSS and 0.15 for ADReSSo) and speech quality from 1.65 to ~2.15, enhancing privacy and accessibility.
Authors:Xinyu Huang, Leming Shen, Zijing Ma, Yuanqing Zheng
Abstract:
Large Language Models (LLMs) have showcased remarkable generalizability in language comprehension and hold significant potential to revolutionize human-computer interaction in smart homes. Existing LLM-based smart home assistants typically transmit user commands, along with user profiles and home configurations, to remote servers to obtain personalized services. However, users are increasingly concerned about the potential privacy leaks to the remote servers. To address this issue, we develop HomeLLaMA, an on-device assistant for privacy-preserving and personalized smart home serving with a tailored small language model (SLM). HomeLLaMA learns from cloud LLMs to deliver satisfactory responses and enable user-friendly interactions. Once deployed, HomeLLaMA facilitates proactive interactions by continuously updating local SLMs and user profiles. To further enhance user experience while protecting their privacy, we develop PrivShield to offer an optional privacy-preserving LLM-based smart home serving for those users, who are unsatisfied with local responses and willing to send less-sensitive queries to remote servers. For evaluation, we build a comprehensive benchmark DevFinder to assess the service quality. Extensive experiments and user studies (M=100) demonstrate that HomeLLaMA can provide personalized services while significantly enhancing user privacy.
Authors:Sarah Ball, Greg Gluch, Shafi Goldwasser, Frauke Kreuter, Omer Reingold, Guy N. Rothblum
Abstract:
With the increased deployment of large language models (LLMs), one concern is their potential misuse for generating harmful content. Our work studies the alignment challenge, with a focus on filters to prevent the generation of unsafe information. Two natural points of intervention are the filtering of the input prompt before it reaches the model, and filtering the output after generation. Our main results demonstrate computational challenges in filtering both prompts and outputs. First, we show that there exist LLMs for which there are no efficient prompt filters: adversarial prompts that elicit harmful behavior can be easily constructed, which are computationally indistinguishable from benign prompts for any efficient filter. Our second main result identifies a natural setting in which output filtering is computationally intractable. All of our separation results are under cryptographic hardness assumptions. In addition to these core findings, we also formalize and study relaxed mitigation approaches, demonstrating further computational barriers. We conclude that safety cannot be achieved by designing filters external to the LLM internals (architecture and weights); in particular, black-box access to the LLM will not suffice. Based on our technical results, we argue that an aligned AI system's intelligence cannot be separated from its judgment.
Authors:Faissal Ahmadou, Sepehr Ghaffarzadegan, Boubakr Nour, Makan Pourzandi, Mourad Debbabi, Chadi Assi
Abstract:
In the ever-evolving landscape of cybersecurity, the rapid identification and mitigation of Advanced Persistent Threats (APTs) is crucial. Security practitioners rely on detailed threat reports to understand the tactics, techniques, and procedures (TTPs) employed by attackers. However, manually extracting attack testflows from these reports requires elusive knowledge and is time-consuming and prone to errors. This paper proposes FLOWGUARDIAN, a novel solution leveraging language models (i.e., BERT) and Natural Language Processing (NLP) techniques to automate the extraction of attack testflows from unstructured threat reports. FLOWGUARDIAN systematically analyzes and contextualizes security events, reconstructs attack sequences, and then generates comprehensive testflows. This automated approach not only saves time and reduces human error but also ensures comprehensive coverage and robustness in cybersecurity testing. Empirical validation using public threat reports demonstrates FLOWGUARDIAN's accuracy and efficiency, significantly enhancing the capabilities of security teams in proactive threat hunting and incident response.
Authors:Hong Niu, Yue Xiao, Xia Lei, Jiangong Chen, Zhihan Xiao, Mao Li, Chau Yuen
Abstract:
Due to the broadcast nature of wireless communications, physical-layer security has attracted increasing concerns from both academia and industry. Artificial noise (AN), as one of the promising physical-layer security techniques, is capable of utilizing the spatial degree-of-freedom of channels to effectively enhance the security of wireless communications. In contrast to other physicallayer security techniques, the key distinguishing feature of AN is to generate specific interfering signals according to channel characteristics, increasing the secrecy capacity by reducing the wiretap channel capacity without affecting the legitimate channel capacity. Hence, this paper provides the latest survey of AN, including its evolution, modeling, backgrounds, applications, and future trends. Initially, we introduce the development, fundamentals, and backgrounds of AN. Subsequently, we highlight a comprehensive survey of the current state of research on various AN-empowered scenarios and AN-combined technologies. Finally, we discuss some technical challenges to tackle for AN-aided wireless security in the future.
Authors:Oleksandr Kurbatov, Kyrylo Baibula, Yaroslava Chopa, Sergey Kozlov, Oleh Komendant, Illia Dovhopolyi, Dmitrii Kurbatov, Zakhar Naumets, Yuliia Aritkulova, Pavel Kravchenko, Volodymyr Dubinin, Lasha Antadze, Yaroslav Panasenko, Mykhailo Velykodnyi
Abstract:
This paper presents Wrapless -- a lending protocol that enables the collateralization of bitcoins without requiring a trusted wrapping mechanism. The protocol facilitates a "loan channel" on the Bitcoin blockchain, allowing bitcoins to be locked as collateral for loans issued on any blockchain that supports Turing-complete smart contracts. The protocol is designed in a way that makes it economically irrational for each involved party to manipulate the loan rules. There is still a significant research area to bring the protocol closer to traditional AMM financial instruments.
Authors:Alireza Khodaie, Berkay Kemal Balioglu, Mehmet Emre Gursoy
Abstract:
Local differential privacy (LDP) has recently gained prominence as a powerful paradigm for collecting and analyzing sensitive data from users' devices. However, the inherent perturbation added by LDP protocols reduces the utility of the collected data. To mitigate this issue, several post-processing (PP) methods have been developed. Yet, the comparative performance of PP methods under diverse settings remains underexplored. In this paper, we present an extensive benchmark comprising 6 popular LDP protocols, 7 PP methods, 4 utility metrics, and 6 datasets to evaluate the behaviors and optimality of PP methods under diverse conditions. Through extensive experiments, we show that while PP can substantially improve utility when the privacy budget is small (i.e., strict privacy), its benefit diminishes as the privacy budget grows. Moreover, our findings reveal that the optimal PP method depends on multiple factors, including the choice of LDP protocol, privacy budget, data characteristics (such as distribution and domain size), and the specific utility metric. To advance research in this area and assist practitioners in identifying the most suitable PP method for their setting, we introduce LDP$^3$, an open-source benchmark platform. LDP$^3$ contains all methods used in our experimental analysis, and it is designed in a modular, extensible, and multi-threaded way for future use and development.
Authors:Berkay Kemal Balioglu, Alireza Khodaie, Mehmet Emre Gursoy
Abstract:
Local differential privacy (LDP) has become a prominent notion for privacy-preserving data collection. While numerous LDP protocols and post-processing (PP) methods have been developed, selecting an optimal combination under different privacy budgets and datasets remains a challenge. Moreover, the lack of a comprehensive and extensible LDP benchmarking toolkit raises difficulties in evaluating new protocols and PP methods. To address these concerns, this paper presents LDP$^3$ (pronounced LDP-Cube), an open-source, extensible, and multi-threaded toolkit for LDP researchers and practitioners. LDP$^3$ contains implementations of several LDP protocols, PP methods, and utility metrics in a modular and extensible design. Its modular design enables developers to conveniently integrate new protocols and PP methods. Furthermore, its multi-threaded nature enables significant reductions in execution times via parallelization. Experimental evaluations demonstrate that: (i) using LDP$^3$ to select a good protocol and post-processing method substantially improves utility compared to a bad or random choice, and (ii) the multi-threaded design of LDP$^3$ brings substantial benefits in terms of efficiency.
Authors:Aravind Cheruvu, Shravya Kanchi, Sifat Muhammad Abdullah, Nicholas Kong, Daphne Yao, Murtuza Jadliwala, Bimal Viswanath
Abstract:
Recent advances in foundation models, such as LLMs, have revolutionized conversational AI. Chatbots are increasingly being developed by customizing LLMs on specific conversational datasets. However, mitigating toxicity during this customization, especially when dealing with untrusted training data, remains a significant challenge. To address this, we introduce TuneShield, a defense framework designed to mitigate toxicity during chatbot fine-tuning while preserving conversational quality. TuneShield leverages LLM-based toxicity classification, utilizing the instruction-following capabilities and safety alignment of LLMs to effectively identify toxic samples, outperforming industry API services. TuneShield generates synthetic conversation samples, termed 'healing data', based on the identified toxic samples, using them to mitigate toxicity while reinforcing desirable behavior during fine-tuning. It performs an alignment process to further nudge the chatbot towards producing desired responses. Our findings show that TuneShield effectively mitigates toxicity injection attacks while preserving conversational quality, even when the toxicity classifiers are imperfect or biased. TuneShield proves to be resilient against adaptive adversarial and jailbreak attacks. Additionally, TuneShield demonstrates effectiveness in mitigating adaptive toxicity injection attacks during dialog-based learning (DBL).
Authors:Daniel Jones, Giorgio Severi, Martin Pouliot, Gary Lopez, Joris de Gruyter, Santiago Zanella-Beguelin, Justin Song, Blake Bullwinkel, Pamela Cortez, Amanda Minnich
Abstract:
Computer Use Agents (CUAs), autonomous systems that interact with software interfaces via browsers or virtual machines, are rapidly being deployed in consumer and enterprise environments. These agents introduce novel attack surfaces and trust boundaries that are not captured by traditional threat models. Despite their growing capabilities, the security boundaries of CUAs remain poorly understood. In this paper, we conduct a systematic threat analysis and testing of real-world CUAs under adversarial conditions. We identify seven classes of risks unique to the CUA paradigm, and analyze three concrete exploit scenarios in depth: (1) clickjacking via visual overlays that mislead interface-level reasoning, (2) indirect prompt injection that enables Remote Code Execution (RCE) through chained tool use, and (3) CoT exposure attacks that manipulate implicit interface framing to hijack multi-step reasoning. These case studies reveal deeper architectural flaws across current CUA implementations. Namely, a lack of input provenance tracking, weak interface-action binding, and insufficient control over agent memory and delegation. We conclude by proposing a CUA-specific security evaluation framework and design principles for safe deployment in adversarial and high-stakes settings.
Authors:Kazumasa Shinagawa, Koji Nuida
Abstract:
Card-based cryptography is a research area to implement cryptographic procedures using a deck of physical cards. In recent years, it has been found to be related to finite group theory and algebraic combinatorics, and is becoming more and more closely connected to the field of mathematics. In this paper, we discuss the relationship between card-based cryptography and combinatorics on words for the first time. In particular, we focus on cyclic equality of words. We say that a set of words are cyclically equalizable if they can be transformed to be cyclically equal by repeated simultaneous insertion of letters. The main result of this paper is to show that two binary words of equal length and equal Hamming weight are cyclically equalizable. As applications of cyclic equalizability to card-based cryptography, we describe its applications to the information erasure problem and to single-cut full-open protocols.
Authors:Naseem Khan, Aref Y. Al-Tamimi, Amine Bermak, Issa M. Khalil
Abstract:
Traditional malware detection methods exhibit computational inefficiency due to exhaustive feature extraction requirements, creating accuracy-efficiency trade-offs that limit real-time deployment. We formulate malware classification as a Markov Decision Process with episodic feature acquisition and propose a Dueling Double Deep Q-Network (D3QN) framework for adaptive sequential feature selection. The agent learns to dynamically select informative features per sample before terminating with classification decisions, optimizing both detection accuracy and computational cost through reinforcement learning.
We evaluate our approach on Microsoft Big2015 (9-class, 1,795 features) and BODMAS (binary, 2,381 features) datasets. D3QN achieves 99.22% and 98.83% accuracy while utilizing only 61 and 56 features on average, representing 96.6% and 97.6% dimensionality reduction. This yields computational efficiency improvements of 30.1x and 42.5x over traditional ensemble methods. Comprehensive ablation studies demonstrate consistent superiority over Random Forest, XGBoost, and static feature selection approaches.
Quantitative analysis demonstrates that D3QN learns non-random feature selection policies with 62.5% deviation from uniform baseline distributions. The learned policies exhibit structured hierarchical preferences, utilizing high-level metadata features for initial assessment while selectively incorporating detailed behavioral features based on classification uncertainty. Feature specialization analysis reveals 57.7% of examined features demonstrate significant class-specific discrimination patterns. Our results validate reinforcement learning-based sequential feature selection for malware classification, achieving superior accuracy with substantial computational reduction through learned adaptive policies.
Authors:Howard Halim, Eyasu Getahun Chekole, Daniël Reijsbergen, Jianying Zhou
Abstract:
Biometric authentication is a widely used security mechanism that leverages unique physiological or behavioral characteristics to authenticate users. In multi-factor biometrics (MFB), multiple biometric modalities, e.g., physiological and behavioral, are integrated to mitigate the limitations inherent in single-factor biometrics. The main challenge in MFB lies in identifying novel behavioral techniques capable of meeting critical criteria, including high accuracy, high usability, non-invasiveness, resilience against spoofing attacks, and low use of computational resources. Despite ongoing advancements, current behavioral biometric techniques often fall short of fulfilling one or more of these requirements. In this work, we propose BlowPrint, a novel behavioral biometric technique that allows us to authenticate users based on their phone blowing behaviors. In brief, we assume that the way users blow on a phone screen can produce distinctive acoustic patterns, which can serve as a unique biometric identifier for effective user authentication. It can also be seamlessly integrated with physiological techniques, such as facial recognition, to enhance its robustness and security. To assess BlowPrint's effectiveness, we conduct an empirical study involving 50 participants from whom we collect blow-acoustic and facial feature data. Subsequently, we compute the similarity scores of the two modalities using various similarity algorithms and combine them through score-level fusion. Finally, we compute the accuracy using a machine learning-based classifier. As a result, the proposed method demonstrates an accuracy of 99.35% for blow acoustics, 99.96% for facial recognition, and 99.82% for the combined approach. The experimental results demonstrate BlowPrint's high effectiveness in terms of authentication accuracy, spoofing attack resilience, usability, non-invasiveness, and other aspects.
Authors:Kazumasa Shinagawa, Koji Nuida
Abstract:
Card-based cryptography is a research area that realizes cryptographic protocols such as secure computation by applying shuffles to sequences of cards that encode input values. A single-cut full-open protocol is one that obtains an output value by applying a random cut to an input sequence of cards, after which all cards are opened. In this paper, we propose three single-cut full-open protocols: two protocols for three-variable functions and one protocol for a four-variable function.
Authors:Marco Simoni, Aleksandar Fontana, Giulio Rossolini, Andrea Saracino
Abstract:
Improving and understanding the training dynamics and reasoning of Large Language Models (LLMs) has become essential for their deployment in AI-based security tools, such as software vulnerability detection. In this work, we present an extensive study aimed at advancing recent RL-based finetuning techniques for LLMs in the context of vulnerability detection.
We start by highlighting key limitations of commonly adopted LLMs, such as their tendency to over-predict certain types of vulnerabilities while failing to detect others. To address this challenge, we explore the use of Group Relative Policy Optimization (GRPO), a recent policy-gradient method, for guiding LLM behavior through structured, rule-based rewards. We enable its application to the vulnerability detection task by redefining its advantage functions and reward signals using annotations from widely used datasets in the field, including BigVul, DiverseVul, and CleanVul.
The proposed methodology enables an extensive set of experiments, addressing multiple research questions regarding the impact of GRPO on generalization, reasoning capabilities, and performance improvements over standard supervised finetuning (SFT). Our findings offer valuable insights into the potential of RL-based training to enhance both the performance and reasoning abilities of LLMs in the context of software vulnerability detection.
Authors:Rundong Xin, Taotao Wang, Jin Wang, Chonghe Zhao, Jing Wang
Abstract:
Face identification systems operating in the ciphertext domain have garnered significant attention due to increasing privacy concerns and the potential recovery of original facial data. However, as the size of ciphertext template libraries grows, the face retrieval process becomes progressively more time-intensive. To address this challenge, we propose a novel and efficient scheme for face retrieval in the ciphertext domain, termed Privacy-Preserving Preselection for Face Identification Based on Packing (PFIP). PFIP incorporates an innovative preselection mechanism to reduce computational overhead and a packing module to enhance the flexibility of biometric systems during the enrollment stage. Extensive experiments conducted on the LFW and CASIA datasets demonstrate that PFIP preserves the accuracy of the original face recognition model, achieving a 100% hit rate while retrieving 1,000 ciphertext face templates within 300 milliseconds. Compared to existing approaches, PFIP achieves a nearly 50x improvement in retrieval efficiency.
Authors:Yan Long, Jiancong Cui, Yuqing Yang, Tobias Alam, Zhiqiang Lin, Kevin Fu
Abstract:
This work investigates how to monitor access to Android zero-permission sensors which could cause privacy leakage to users. Moreover, monitoring such sensitive access allows security researchers to characterize potential sensor abuse patterns. Zero-permission sensors such as accelerometers have become an indispensable part of Android devices. The critical information they provide has attracted extensive research investigating how data collectors could capture more sensor data to enable both benign and exploitative applications. In contrast, little work has explored how to enable data providers, such as end users, to understand sensor usage. While existing methods such as static analysis and hooking-based dynamic analysis face challenges of requiring complicated development chains, rooting privilege, and app-specific reverse engineering analysis, our work aims to bridge this gap by developing ARMOUR for user-space runtime monitoring, leveraging the intrinsic sampling rate variation and convergence behaviors of Android. ARMOUR enables privacy-aware users to easily monitor how third-party apps use sensor data and support security researchers to perform rapid app-agnostic sensor access analysis. Our evaluation with 1,448 commercial applications shows the effectiveness of ARMOUR in detecting sensor usage in obfuscated code and other conditions, and observes salient sensor abuse patterns such as 50% of apps from seemingly sensor-independent categories accessing data of multiple zero-permission sensors. We analyze the impact of Android's recent policy changes on zero-permission sensors and remaining technical and regulatory problems.
Authors:Anis Yusof, Yuancheng Liu, Niklaus Kang, Choon Meng Seah, Zhenkai Liang, Ee-Chien Chang
Abstract:
The prevalence of cyberattacks on Industrial Control Systems (ICS) has highlighted the necessity for robust security measures and incident response to protect critical infrastructure. This is prominent when Operational Technology (OT) systems undergo digital transformation by integrating with Information Technology (IT) systems to enhance operational efficiency, adaptability, and safety. To support analysts in staying abreast of emerging attack patterns, there is a need for ICS datasets that reflect indicators representative of contemporary cyber threats. To address this, we conduct two ICS cyberattack simulations to showcase the impact of trending ICS cyberattacks on a railway cyber range that resembles the railway infrastructure. The attack scenario is designed to blend trending attack trends with attack patterns observed from historical ICS incidents. The resulting evidence is collected as datasets, serving as an essential resource for cyberattack analysis. This captures key indicators that are relevant to the current threat landscape, augmenting the effectiveness of security systems and analysts to protect against ICS cyber threats.
Authors:Masood Jan, Wafa Njima, Xun Zhang
Abstract:
Location information serves as the fundamental element for numerous Internet of Things (IoT) applications. Traditional indoor localization techniques often produce significant errors and raise privacy concerns due to centralized data collection. In response, Machine Learning (ML) techniques offer promising solutions by capturing indoor environment variations. However, they typically require central data aggregation, leading to privacy, bandwidth, and server reliability issues. To overcome these challenges, in this paper, we propose a Federated Learning (FL)-based approach for dynamic indoor localization using a Deep Neural Network (DNN) model. Experimental results show that FL has the nearby performance to Centralized Model (CL) while keeping the data privacy, bandwidth efficiency and server reliability. This research demonstrates that our proposed FL approach provides a viable solution for privacy-enhanced indoor localization, paving the way for advancements in secure and efficient indoor localization systems.
Authors:Taiga Hiroka, Min-Hsiu Hsieh, Tomoyuki Morimae
Abstract:
The existence of one-way functions (OWFs) forms the minimal assumption in classical cryptography. However, this is not necessarily the case in quantum cryptography. One-way puzzles (OWPuzzs), introduced by Khurana and Tomer, provide a natural quantum analogue of OWFs. The existence of OWPuzzs implies $PP\neq BQP$, while the converse remains open. In classical cryptography, the analogous problem-whether OWFs can be constructed from $P \neq NP$-has long been studied from the viewpoint of hardness of learning. Hardness of learning in various frameworks (including PAC learning) has been connected to OWFs or to $P \neq NP$. In contrast, no such characterization previously existed for OWPuzzs. In this paper, we establish the first complete characterization of OWPuzzs based on the hardness of a well-studied learning model: distribution learning. Specifically, we prove that OWPuzzs exist if and only if proper quantum distribution learning is hard on average. A natural question that follows is whether the worst-case hardness of proper quantum distribution learning can be derived from $PP \neq BQP$. If so, and a worst-case to average-case hardness reduction is achieved, it would imply OWPuzzs solely from $PP \neq BQP$. However, we show that this would be extremely difficult: if worst-case hardness is PP-hard (in a black-box reduction), then $SampBQP \neq SampBPP$ follows from the infiniteness of the polynomial hierarchy. Despite that, we show that $PP \neq BQP$ is equivalent to another standard notion of hardness of learning: agnostic. We prove that $PP \neq BQP$ if and only if agnostic quantum distribution learning with respect to KL divergence is hard. As a byproduct, we show that hardness of agnostic quantum distribution learning with respect to statistical distance against $PPT^{Σ_3^P}$ learners implies $SampBQP \neq SampBPP$.
Authors:Sean Oesch, Jack Hutchins, Luke Koch, Kevin Kurian
Abstract:
In living off the land attacks, malicious actors use legitimate tools and processes already present on a system to avoid detection. In this paper, we explore how the on-device LLMs of the future will become a security concern as threat actors integrate LLMs into their living off the land attack pipeline and ways the security community may mitigate this threat.
Authors:Michele Battagliola, Sebastian Bitzer, Antonia Wachter-Zeh, Violetta Weger
Abstract:
Threshold-Computation-in-the-Head (TCitH) and VOLE-in-the-Head (VOLEitH), two recent developments of the MPC-in-the-Head (MPCitH) paradigm, have significantly improved the performance of digital signature schemes in this framework. In this note, we embed the restricted decoding problem within these frameworks. We propose a structurally simple modeling that achieves competitive signature sizes. Specifically, by instantiating the restricted decoding problem with the same hardness assumption underlying CROSS, we reduce sizes by more than a factor of two compared to the NIST submission. Moreover, we observe that ternary full-weight decoding, closely related to the hardness assumption underlying WAVE, is a restricted decoding problem. Using ternary full-weight decoding, we obtain signature sizes comparable to the smallest MPCitH-based candidates in the NIST competition.
Authors:Ahmad Mohammadi, Reza Ahmari, Vahid Hemmati, Frederick Owusu-Ambrose, Mahmoud Nabil Mahmoud, Parham Kebria, Abdollah Homaifar, Mehrdad Saif
Abstract:
As autonomous vehicles become an essential component of modern transportation, they are increasingly vulnerable to threats such as GPS spoofing attacks. This study presents an adaptive detection approach utilizing a dynamically tuned Density Based Spatial Clustering of Applications with Noise (DBSCAN) algorithm, designed to adjust the detection threshold (ε) in real-time. The threshold is updated based on the recursive mean and standard deviation of displacement errors between GPS and in-vehicle sensors data, but only at instances classified as non-anomalous. Furthermore, an initial threshold, determined from 120,000 clean data samples, ensures the capability to identify even subtle and gradual GPS spoofing attempts from the beginning. To assess the performance of the proposed method, five different subsets from the real-world Honda Research Institute Driving Dataset (HDD) are selected to simulate both large and small magnitude GPS spoofing attacks. The modified algorithm effectively identifies turn-by-turn, stop, overshoot, and multiple small biased spoofing attacks, achieving detection accuracies of 98.621%, 99.960.1%, 99.880.1%, and 98.380.1%, respectively. This work provides a substantial advancement in enhancing the security and safety of AVs against GPS spoofing threats.
Authors:Ming Tan, Wei Li, Hu Tao, Hailong Ma, Aodi Liu, Qian Chen, Zilong Wang
Abstract:
Open-source large language models (LLMs) have demonstrated considerable dominance over proprietary LLMs in resolving neural processing tasks, thanks to the collaborative and sharing nature. Although full access to source codes, model parameters, and training data lays the groundwork for transparency, we argue that such a full-access manner is vulnerable to stego attacks, and their ill-effects are not fully understood. In this paper, we conduct a systematic formalization for stego attacks on open-source LLMs by enumerating all possible threat models associated with adversary objectives, knowledge, and capabilities. Therein, the threat posed by adversaries with internal knowledge, who inject payloads and triggers during the model sharing phase, is of practical interest. We go even further and propose the first stego attack on open-source LLMs, dubbed SASER, which wields impacts through identifying targeted parameters, embedding payloads, injecting triggers, and executing payloads sequentially. Particularly, SASER enhances the attack robustness against quantization-based local deployment by de-quantizing the embedded payloads. In addition, to achieve stealthiness, SASER devises the performance-aware importance metric to identify targeted parameters with the least degradation of model performance. Extensive experiments on LlaMA2-7B and ChatGLM3-6B, without quantization, show that the stealth rate of SASER outperforms existing stego attacks (for general DNNs) by up to 98.1%, while achieving the same attack success rate (ASR) of 100%. More importantly, SASER improves ASR on quantized models from 0 to 100% in all settings. We appeal for investigations on countermeasures against SASER in view of the significant attack effectiveness.
Authors:Shaolun Liu, Sina Marefat, Omar Tsai, Yu Chen, Zecheng Deng, Jia Wang, Mohammad A. Tayebi
Abstract:
GraphQL's flexible query model and nested data dependencies expose APIs to complex, context-dependent vulnerabilities that are difficult to uncover using conventional testing tools. Existing fuzzers either rely on random payload generation or rigid mutation heuristics, failing to adapt to the dynamic structures of GraphQL schemas and responses. We present PrediQL, the first retrieval-augmented, LLM-guided fuzzer for GraphQL APIs. PrediQL combines large language model reasoning with adaptive feedback loops to generate semantically valid and diverse queries. It models the choice of fuzzing strategy as a multi-armed bandit problem, balancing exploration of new query structures with exploitation of past successes. To enhance efficiency, PrediQL retrieves and reuses execution traces, schema fragments, and prior errors, enabling self-correction and progressive learning across test iterations. Beyond input generation, PrediQL integrates a context-aware vulnerability detector that uses LLM reasoning to analyze responses, interpreting data values, error messages, and status codes to identify issues such as injection flaws, access-control bypasses, and information disclosure. Our evaluation across open-source and benchmark GraphQL APIs shows that PrediQL achieves significantly higher coverage and vulnerability discovery rates compared to state-of-the-art baselines. These results demonstrate that combining retrieval-augmented reasoning with adaptive fuzzing can transform API security testing from reactive enumeration to intelligent exploration.
Authors:Nora Basha, Bechir Hamdaoui, Attila A. Yavuz, Thang Hoang, Mehran Mozaffari Kermani
Abstract:
Secret-key generation and agreement based on wireless channel reciprocity offers a promising avenue for securing IoT networks. However, existing approaches predominantly rely on the similarity of instantaneous channel measurement samples between communicating devices. This narrow view of reciprocity is often impractical, as it is highly susceptible to noise, asynchronous sampling, channel fading, and other system-level imperfections -- all of which significantly impair key generation performance. Furthermore, the quantization step common in traditional schemes introduces irreversible errors, further limiting efficiency. In this work, we propose a novel approach for secret-key generation by using wavelet scattering networks to extract robust and reciprocal CSI features. Dimensionality reduction is applied to uncover hidden cluster structures, which are then used to build hidden Markov models for efficient key agreement. Our approach eliminates the need for quantization and effectively captures channel randomness. It achieves a 5x improvement in key generation rate compared to traditional benchmarks, providing a secure and efficient solution for key generation in resource-constrained IoT environments.
Authors:Subrat Kishore Dutta, Yuelin Xu, Piyush Pant, Xiao Zhang
Abstract:
Recent work has shown that RLHF is highly susceptible to backdoor attacks, poisoning schemes that inject malicious triggers in preference data. However, existing methods often rely on static, rare-token-based triggers, limiting their effectiveness in realistic scenarios. In this paper, we develop GREAT, a novel framework for crafting generalizable backdoors in RLHF through emotion-aware trigger synthesis. Specifically, GREAT targets harmful response generation for a vulnerable user subgroup characterized by both semantically violent requests and emotionally angry triggers. At the core of GREAT is a trigger identification pipeline that operates in the latent embedding space, leveraging principal component analysis and clustering techniques to identify the most representative triggers. To enable this, we present Erinyes, a high-quality dataset of over $5000$ angry triggers curated from GPT-4.1 using a principled, hierarchical, and diversity-promoting approach. Experiments on benchmark RLHF datasets demonstrate that GREAT significantly outperforms baseline methods in attack success rates, especially for unseen trigger scenarios, while largely preserving the response quality on benign inputs.
Authors:Dennis Rall, Bernhard Bauer, Mohit Mittal, Thomas Fraunholz
Abstract:
Large language models (LLMs) are now routinely used to autonomously execute complex tasks, from natural language processing to dynamic workflows like web searches. The usage of tool-calling and Retrieval Augmented Generation (RAG) allows LLMs to process and retrieve sensitive corporate data, amplifying both their functionality and vulnerability to abuse. As LLMs increasingly interact with external data sources, indirect prompt injection emerges as a critical and evolving attack vector, enabling adversaries to exploit models through manipulated inputs. Through a systematic evaluation of indirect prompt injection attacks across diverse models, we analyze how susceptible current LLMs are to such attacks, which parameters, including model size and manufacturer, specific implementations, shape their vulnerability, and which attack methods remain most effective. Our results reveal that even well-known attack patterns continue to succeed, exposing persistent weaknesses in model defenses. To address these vulnerabilities, we emphasize the need for strengthened training procedures to enhance inherent resilience, a centralized database of known attack vectors to enable proactive defense, and a unified testing framework to ensure continuous security validation. These steps are essential to push developers toward integrating security into the core design of LLMs, as our findings show that current models still fail to mitigate long-standing threats.
Authors:Gorjan Alagic, Chen Bai, Christian Majenz, Kaiyan Shi
Abstract:
Block ciphers are versatile cryptographic ingredients that are used in a wide range of applications ranging from secure Internet communications to disk encryption. While post-quantum security of public-key cryptography has received significant attention, the case of symmetric-key cryptography (and block ciphers in particular) remains a largely unexplored topic. In this work, we set the foundations for a theory of post-quantum security for block ciphers and associated constructions. Leveraging our new techniques, we provide the first post-quantum security proofs for the key-length extension scheme FX, the tweakable block ciphers LRW and XEX, and most block cipher encryption and authentication modes. Our techniques can be used for security proofs in both the plain model and the quantum ideal cipher model. Our work takes significant initial steps in establishing a rigorous understanding of the post-quantum security of practical symmetric-key cryptography.
Authors:Andrew Huang, Yael Tauman Kalai
Abstract:
We present a generic compiler that converts any $\mathsf{MIP}^{*}$ protocol into a succinct interactive argument where the communication and the verifier are classical, and where post-quantum soundness relies on the post-quantum sub-exponential hardness of the Learning with Errors ($\mathsf{LWE}$) problem. Prior to this work, such a compiler for $\mathsf{MIP}^{*}$ was given by Kalai, Lombardi, Vaikuntanathan and Yang (STOC 2022), but the post-quantum soundness of this compiler is still under investigation. More generally, our compiler can be applied to any $\mathsf{QIP}$ protocol which is sound only against semi-malicious provers that follow the prescribed protocol, but with possibly malicious initial state. Our compiler consists of two steps. We first show that if a language $\mathcal{L}$ has a $\mathsf{QIP}$ with semi-malicious soundness, where the prover runs in time $T$, then $\mathcal{L} \in \mathsf{QMATIME}(T)$. Then we construct a succinct classical argument for any such language, where the communication complexity grows polylogarithmically with $T$, under the post-quantum sub-exponential hardness of $\mathsf{LWE}$.
Authors:Jinsong Mao, Benjamin E. Ujcich, Shiqing Ma
Abstract:
Provenance plays a critical role in maintaining traceability of a system's actions for root cause analysis of security threats and impacts. Provenance collection is often incorporated into the reference monitor of systems to ensure that an audit trail exists of all events, that events are completely captured, and that logging of such events cannot be bypassed. However, recent research has questioned whether existing state-of-the-art provenance collection systems fail to ensure the security guarantees of a true reference monitor due to the 'super producer threat' in which provenance generation can overload a system to force the system to drop security-relevant events and allow an attacker to hide their actions. One approach towards solving this threat is to enforce resource isolation, but that does not fully solve the problems resulting from hardware dependencies and performance limitations. In this paper, we show how an operating system's kernel scheduler can mitigate this threat, and we introduce Venus, a learned scheduler for Linux specifically designed for provenance. Unlike conventional schedulers that ignore provenance completeness requirements, Venus leverages reinforcement learning to learn provenance task behavior and to dynamically optimize resource allocation. We evaluate Venus's efficacy and show that Venus significantly improves both the completeness and efficiency of provenance collection systems compared to traditional scheduling, while maintaining reasonable overheads and even improving overall runtime in certain cases compared to the default Linux scheduler.
Authors:Lynn Engelberts, Yanlin Chen, Amin Shiraz Gilani, Maya-Iggy van Hoof, Stacey Jeffery, Ronald de Wolf
Abstract:
The assumed hardness of the Shortest Vector Problem in high-dimensional lattices is one of the cornerstones of post-quantum cryptography. The fastest known heuristic attacks on SVP are via so-called sieving methods. While these still take exponential time in the dimension $d$, they are significantly faster than non-heuristic approaches and their heuristic assumptions are verified by extensive experiments. $k$-Tuple sieving is an iterative method where each iteration takes as input a large number of lattice vectors of a certain norm, and produces an equal number of lattice vectors of slightly smaller norm, by taking sums and differences of $k$ of the input vectors. Iterating these ''sieving steps'' sufficiently many times produces a short lattice vector. The fastest attacks (both classical and quantum) are for $k=2$, but taking larger $k$ reduces the amount of memory required for the attack. In this paper we improve the quantum time complexity of 3-tuple sieving from $2^{0.3098 d}$ to $2^{0.2846 d}$, using a two-level amplitude amplification aided by a preprocessing step that associates the given lattice vectors with nearby ''center points'' to focus the search on the neighborhoods of these center points. Our algorithm uses $2^{0.1887d}$ classical bits and QCRAM bits, and $2^{o(d)}$ qubits. This is the fastest known quantum algorithm for SVP when total memory is limited to $2^{0.1887d}$.
Authors:Gregory D. Kahanamoku-Meyer, Seyoon Ragavan, Katherine Van Kirk
Abstract:
"Pebble games," an abstraction from classical reversible computing, have found use in the design of quantum circuits for inherently sequential tasks. Gidney showed that allowing Hadamard basis measurements during pebble games can dramatically improve costs -- an extension termed "spooky pebble games" because the measurements leave temporary phase errors called ghosts. In this work, we define and study parallel spooky pebble games. Previous work by Blocki, Holman, and Lee (TCC 2022) and Gidney studied the benefits offered by either parallelism or spookiness individually; here we show that these resources can yield impressive gains when used together. First, we show by construction that a line graph of length $\ell$ can be pebbled in depth $2\ell$ (which is exactly optimal) using space $\leq 2.47\log \ell$. Then, to explore pebbling schemes using even less space, we use a highly optimized $A^*$ search implemented in Julia to find the lowest-depth parallel spooky pebbling possible for a range of concrete line graph lengths $\ell$ given a constant number of pebbles $s$. We show that these techniques can be applied to Regev's factoring algorithm (Journal of the ACM 2025) to significantly reduce the cost of its arithmetic. For example, we find that 4096-bit integers $N$ can be factored in multiplication depth 193, which outperforms the 680 required of previous variants of Regev and the 444 reported by Ekerå and Gärtner for Shor's algorithm (IACR Communications in Cryptology 2025). While space-optimized implementations of Shor's algorithm remain likely the best candidates for first quantum factorization of large integers, our results show that Regev's algorithm may have practical importance in the future, especially given the possibility of further optimization. Finally, we believe our pebbling techniques will find applications in quantum cryptanalysis beyond integer factorization.
Authors:Kaustabh Barman, Fabian Piper, Sanjeet Raj Pandey, Axel Kuepper
Abstract:
User authentication is one of the most important aspects for secure communication between services and end-users over the Internet. Service providers leverage Single-Sign On (SSO) to make it easier for their users to authenticate themselves. However, standardized systems for SSO, such as OIDC, do not guarantee user privacy as identity providers can track user activities. We propose a zero-knowledge-based mechanism that integrates with OIDC to let users authenticate through SSO without revealing information about the service provider. Our system leverages Groth's zk-SNARK to prove membership of subscribed service providers without revealing their identity. We adopt a decentralized and verifiable approach to set up the prerequisites of our construction that further secures and establishes trust in the system. We set up high security targets and achieve them with minimal storage and latency cost, proving that our research can be adopted for production.
Authors:Cédrick Austa, Jan Tobias Mühlberg, Jean-Michel Dricot
Abstract:
While interest in the open RISC-V instruction set architecture is growing, tools to assess the security of concrete processor implementations are lacking. There are dedicated tools and benchmarks for common microarchitectural side-channel vulnerabilities for popular processor families such as Intel x86-64 or ARM, but not for RISC-V. In this paper we describe our efforts in porting an Intel x86-64 benchmark suite for cache-based timing vulnerabilities to RISC-V. We then use this benchmark to evaluate the security of three commercially available RISC-V processors, the T-Head C910 and the SiFive U54 and U74 cores. We observe that the C910 processor exhibits more distinct timing types than the other processors, leading to the assumption that code running on the C910 would be exposed to more microarchitectural vulnerability sources. In addition, our evaluation reveals that $37.5\%$ of the vulnerabilities covered by the benchmark exist in all processors, while only $6.8\%$ are absent from all cores. Our work, in particular the ported benchmark, aims to support RISC-V processor designers to identify leakage sources early in their designs and to support the development of countermeasures.
Authors:Hikmat A. M. Abdeljaber, Md. Alamgir Hossain, Sultan Ahmad, Ahmed Alsanad, Md Alimul Haque, Sudan Jha, Jabeen Nazeer
Abstract:
The rapid expansion of Internet of Things (IoT) devices has transformed industries and daily life by enabling widespread connectivity and data exchange. However, this increased interconnection has introduced serious security vulnerabilities, making IoT systems more exposed to sophisticated cyber attacks. This study presents a novel ensemble learning architecture designed to improve IoT attack detection. The proposed approach applies advanced machine learning techniques, specifically the Extra Trees Classifier, along with thorough preprocessing and hyperparameter optimization. It is evaluated on several benchmark datasets including CICIoT2023, IoTID20, BotNeTIoT L01, ToN IoT, N BaIoT, and BoT IoT. The results show excellent performance, achieving high recall, accuracy, and precision with very low error rates. These outcomes demonstrate the model efficiency and superiority compared to existing approaches, providing an effective and scalable method for securing IoT environments. This research establishes a solid foundation for future progress in protecting connected devices from evolving cyber threats.
Authors:Jiachen Li, Bang Wu, Xiaoyu Xia, Xiaoning Liu, Xun Yi, Xiuzhen Zhang
Abstract:
Spiking Neural Networks (SNNs) have gained increasing attention for their superior energy efficiency compared to Artificial Neural Networks (ANNs). However, their security aspects, particularly under backdoor attacks, have received limited attention. Existing defense methods developed for ANNs perform poorly or can be easily bypassed in SNNs due to their event-driven and temporal dependencies. This paper identifies the key blockers that hinder traditional backdoor defenses in SNNs and proposes an unsupervised post-training detection framework, Temporal Membrane Potential Backdoor Detection (TMPBD), to overcome these challenges. TMPBD leverages the maximum margin statistics of temporal membrane potential (TMP) in the final spiking layer to detect target labels without any attack knowledge or data access. We further introduce a robust mitigation mechanism, Neural Dendrites Suppression Backdoor Mitigation (NDSBM), which clamps dendritic connections between early convolutional layers to suppress malicious neurons while preserving benign behaviors, guided by TMP extracted from a small, clean, unlabeled dataset. Extensive experiments on multiple neuromorphic benchmarks and state-of-the-art input-aware dynamic trigger attacks demonstrate that TMPBD achieves 100% detection accuracy, while NDSBM reduces the attack success rate from 100% to 8.44%, and to 2.81% when combined with detection, without degrading clean accuracy.
Authors:Sergio Demian Lerner, Ariel Futoransky
Abstract:
We present BATTLE for Bitcoin, a DoS-resilient dispute layer that secures optimistic bridges between Bitcoin and rollups or sidechains. Our design adapts the BATTLE tournament protocol to Bitcoin's UTXO model using BitVM-style FLEX components and garbled circuits with on-demand L1 security bonds. Disputes are resolved in logarithmic rounds while recycling rewards, keeping the honest asserter's minimum initial capital constant even under many permissionless challengers. The construction is fully contestable (challengers can supply higher-work counter-proofs) and relies only on standard timelocks and pre-signed transaction DAGs, without new opcodes. For $N$ operators, the protocol requires $O(N^2)$ pre-signed transactions, signatures, and message exchanges, yet remains practical at $N\!\gtrsim\!10^3$, enabling high decentralization.
Authors:Vipul Goyal, Justin Raizes
Abstract:
A central challenge in data security is not just preventing theft, but detecting whether it has occurred. Classically, this is impossible because a perfect copy leaves no evidence. Quantum mechanics, on the other hand, forbids general duplication, opening up new possibilities. We introduce Proofs of No Intrusion, which enable a classical client to remotely test whether a quantum server has been hacked and the client's data stolen. Crucially, the test does not destroy the data being tested, avoiding the need to store a backup elsewhere. We define and construct proofs of no intrusion for ciphertexts assuming fully homomorphic encryption. Additionally, we show how to equip several constructions of unclonable primitives with proofs of non-intrusion, such as unclonable decryption keys and signature tokens. Conceptually, proofs of non-intrusion can be defined for essentially any unclonable primitive. At the heart of our techniques is a new method for non-destructively testing coset states with classical communication. It can be viewed as a non-destructive proof of knowledge of a measurement result of the coset state.
Authors:Muhammad Abdullah Soomro, Fatima Muhammad Anwar
Abstract:
The Precision Time Protocol (PTP), standardized as IEEE 1588, provides sub-microsecond synchronization across distributed systems and underpins critical infrastructure in telecommunications, finance, power systems, and industrial automation. While prior work has extensively analyzed PTP's vulnerability to network-based attacks, prompting the development of cryptographic protections and anomaly detectors, these defenses presume an uncompromised host. In this paper, we identify and exploit a critical blind spot in current threat models: kernel-level adversaries operating from within the host running the PTP stack. We present the first systematic study of kernel-rooted attacks on PTP, demonstrating how privileged attackers can manipulate system time by corrupting key interfaces without altering PTP network traffic. We implement three attack primitives, constant offset, progressive skew, and random jitter, using in-kernel payloads, and evaluate their impact on the widely used ptp4l and phc2sys daemons. Our experiments reveal that these attacks can silently destabilize clock synchronization, bypassing existing PTP security extensions. These findings highlight the urgent need to reconsider host-level trust assumptions and integrate kernel integrity into the design of secure time synchronization systems.
Authors:Roberto Civino, Valerio Fedele
Abstract:
We classify small binary bibraces, using the correspondence with alternating algebras over the field F2, up to dimension eight, also determining their isomorphism classes. These finite-dimensional algebras, defined by an alternating bilinear multiplication and nilpotency of class two, can be represented by subspaces of skew-symmetric matrices, with classification corresponding to GL(m, F_2)-orbits under congruence. Our approach combines theoretical invariants, such as rank sequences and the identification of primitive algebras, with computational methods implemented in Magma. These results also count the number of possible alternative operations that can be used in differential cryptanalysis.
Authors:Fabian Piper, Karl Wolf, Jonathan Heiss
Abstract:
Decentralized applications (dApps) in Decentralized Finance (DeFi) face a fundamental tension between regulatory compliance requirements like Know Your Customer (KYC) and maintaining decentralization and privacy. Existing permissioned DeFi solutions often fail to adequately protect private attributes of dApp users and introduce implicit trust assumptions, undermining the blockchain's decentralization. Addressing these limitations, this paper presents a novel synthesis of Self-Sovereign Identity (SSI), Zero-Knowledge Proofs (ZKPs), and Attribute-Based Access Control to enable privacy-preserving on-chain permissioning based on decentralized policy decisions. We provide a comprehensive framework for permissioned dApps that aligns decentralized trust, privacy, and transparency, harmonizing blockchain principles with regulatory compliance. Our framework supports multiple proof types (equality, range, membership, and time-dependent) with efficient proof generation through a commit-and-prove scheme that moves credential authenticity verification outside the ZKP circuit. Experimental evaluation of our KYC-compliant DeFi implementation shows considerable performance improvement for different proof types compared to baseline approaches. We advance the state-of-the-art through a holistic approach, flexible proof mechanisms addressing diverse real-world requirements, and optimized proof generation enabling practical deployment.
Authors:Shadi Rahimian, Mario Fritz
Abstract:
Single nucleotide polymorphism (SNP) datasets are fundamental to genetic studies but pose significant privacy risks when shared. The correlation of SNPs with each other makes strong adversarial attacks such as masked-value reconstruction, kin, and membership inference attacks possible. Existing privacy-preserving approaches either apply differential privacy to statistical summaries of these datasets or offer complex methods that require post-processing and the usage of a publicly available dataset to suppress or selectively share SNPs. In this study, we introduce an innovative framework for generating synthetic SNP sequence datasets using samples derived from time-inhomogeneous hidden Markov models (TIHMMs). To preserve the privacy of the training data, we ensure that each SNP sequence contributes only a bounded influence during training, enabling strong differential privacy guarantees. Crucially, by operating on full SNP sequences and bounding their gradient contributions, our method directly addresses the privacy risks introduced by their inherent correlations. Through experiments conducted on the real-world 1000 Genomes dataset, we demonstrate the efficacy of our method using privacy budgets of $\varepsilon \in [1, 10]$ at $δ=10^{-4}$. Notably, by allowing the transition models of the HMM to be dependent on the location in the sequence, we significantly enhance performance, enabling the synthetic datasets to closely replicate the statistical properties of non-private datasets. This framework facilitates the private sharing of genomic data while offering researchers exceptional flexibility and utility.
Authors:Yasod Ginige, Akila Niroshan, Sajal Jain, Suranga Seneviratne
Abstract:
Penetration testing and vulnerability assessment are essential industry practices for safeguarding computer systems. As cyber threats grow in scale and complexity, the demand for pentesting has surged, surpassing the capacity of human professionals to meet it effectively. With advances in AI, particularly Large Language Models (LLMs), there have been attempts to automate the pentesting process. However, existing tools such as PentestGPT are still semi-manual, requiring significant professional human interaction to conduct pentests. To this end, we propose a novel LLM agent-based framework, AutoPentester, which automates the pentesting process. Given a target IP, AutoPentester automatically conducts pentesting steps using common security tools in an iterative process. It can dynamically generate attack strategies based on the tool outputs from the previous iteration, mimicking the human pentester approach. We evaluate AutoPentester using Hack The Box and custom-made VMs, comparing the results with the state-of-the-art PentestGPT. Results show that AutoPentester achieves a 27.0% better subtask completion rate and 39.5% more vulnerability coverage with fewer steps. Most importantly, it requires significantly fewer human interactions and interventions compared to PentestGPT. Furthermore, we recruit a group of security industry professional volunteers for a user survey and perform a qualitative analysis to evaluate AutoPentester against industry practices and compare it with PentestGPT. On average, AutoPentester received a score of 3.93 out of 5 based on user reviews, which was 19.8% higher than PentestGPT.
Authors:Samuel Bouaziz--Ermann, Minki Hhan, Garazi Muguruza, Quoc-Huy Vu
Abstract:
There are various notions of quantum pseudorandomness, such as pseudorandom unitaries (PRUs), pseudorandom state generators (PRSGs) and pseudorandom function-like state generators (PRSFGs). Unlike the different notions of classical pseudorandomness, which are known to be existentially equivalent to each other, the relation between quantum pseudorandomness has yet to be fully established. We present some evidence suggesting that some quantum pseudorandomness is unlikely to be constructed from the others, or at least is hard to construct unless some conjectures are false. This indicates that quantum pseudorandomness could behave quite differently from classical pseudorandomness. We study new oracle worlds where one quantum pseudorandomness exists but another pseudorandomness does not under some assumptions or constraints, and provide potential directions to achieve the full black-box separation. More precisely: - We give a unitary oracle relative to which PRFSGs exist but PRUs without using ancilla do not. This can be extended to the general PRUs if we can prove a structural property of the PRU algorithm. - Assuming an isoperimetric inequality-style conjecture, we show a unitary oracle world where log-length output PRFSGs exist but proving the existence of quantum-computable pseudorandom generators (QPRGs) with negligible correctness error is as hard as proving that ${\sf BQP}\neq {\sf QCMA}$. This result suggests that the inverse-polynomial error in the state of the art construction of QPRGs from log-length PRSGs is inherent. - Assuming the same conjecture, we prove that some natural way of constructing super-log-length output PRSGs from log-length output PRFSGs is impossible. This partly complements the known hardness of shrinking the PRSG output lengths. Along the way, we also discuss other potential approaches to extend the PRSG output lengths.
Authors:Saida Elouardi, Mohammed Jouhari, Anas Motii
Abstract:
In critical IoT environments, such as smart homes and industrial systems, effective Intrusion Detection Systems (IDS) are essential for ensuring security. However, developing robust IDS solutions remains a significant challenge. Traditional machine learning-based IDS models typically require large datasets, but data sharing is often limited due to privacy and security concerns. Federated Learning (FL) presents a promising alternative by enabling collaborative model training without sharing raw data. Despite its advantages, FL still faces key challenges, such as data heterogeneity (non-IID data) and high energy and computation costs, particularly for resource constrained IoT devices. To address these issues, this paper proposes OptiFLIDS, a novel approach that applies pruning techniques during local training to reduce model complexity and energy consumption. It also incorporates a customized aggregation method to better handle pruned models that differ due to non-IID data distributions. Experiments conducted on three recent IoT IDS datasets, TON_IoT, X-IIoTID, and IDSIoT2024, demonstrate that OptiFLIDS maintains strong detection performance while improving energy efficiency, making it well-suited for deployment in real-world IoT environments.
Authors:Haoqi Wu, Wei Dai, Ming Xu, Li Wang, Qiang Yan
Abstract:
Diffusion Models have gained significant popularity due to their remarkable capabilities in image generation, albeit at the cost of intensive computation requirement. Meanwhile, despite their widespread deployment in inference services such as Midjourney, concerns about the potential leakage of sensitive information in uploaded user prompts have arisen. Existing solutions either lack rigorous privacy guarantees or fail to strike an effective balance between utility and efficiency. To bridge this gap, we propose ObCLIP, a plug-and-play safeguard that enables oblivious cloud-device hybrid generation. By oblivious, each input prompt is transformed into a set of semantically similar candidate prompts that differ only in sensitive attributes (e.g., gender, ethnicity). The cloud server processes all candidate prompts without knowing which one is the real one, thus preventing any prompt leakage. To mitigate server cost, only a small portion of denoising steps is performed upon the large cloud model. The intermediate latents are then sent back to the client, which selects the targeted latent and completes the remaining denoising using a small device model. Additionally, we analyze and incorporate several cache-based accelerations that leverage temporal and batch redundancy, effectively reducing computation cost with minimal utility degradation. Extensive experiments across multiple datasets demonstrate that ObCLIP provides rigorous privacy and comparable utility to cloud models with slightly increased server cost.
Authors:Atul Singh Arora, Carl A. Miller, Mauro E. S. Morales, Jamie Sikora
Abstract:
Coin-flipping is a fundamental task in two-party cryptography where two remote mistrustful parties wish to generate a shared uniformly random bit. While quantum protocols promising near-perfect security exist for weak coin-flipping -- when the parties want opposing outcomes -- it has been shown that they must be inefficient in terms of their round complexity, and it is an open question of how space efficient they can be. In this work, we consider a variant called cheat-penalised weak coin-flipping in which if a party gets caught cheating, they lose $Λ$ points (compared to $0$ in the standard definition). We find that already for a small cheating penalty, the landscape of coin-flipping changes dramatically. For example, with $Λ=0.01$, we exhibit a protocol where neither Alice nor Bob can bias the result in their favour beyond $1/2 + 10^{-8}$, which uses $24$ qubits and $10^{16}$ rounds of communication (provably $10^{7}$ times better than any weak coin-flipping protocol with matching security). For the same space requirements, we demonstrate how one can choose between lowering how much a malicious party can bias the result (down to $1/2 + 10^{-10}$) and reducing the rounds of communication (down to $25,180$), depending on what is preferred. To find these protocols, we make two technical contributions. First, we extend the point game-protocol correspondence introduced by Kitaev and Mochon, to incorporate: (i) approximate point games, (ii) the cheat-penalised setting, and (iii) round and space complexity. Second, we give the first (to the best of our knowledge) numerical algorithm for constructing (approximate) point games that correspond to high security and low complexity. Our results open up the possibility of having secure and practical quantum protocols for multiparty computation.
Authors:Tamjid Al Rahat, Yanju Chen, Yu Feng, Yuan Tian
Abstract:
OpenID Connect has revolutionized online authentication based on single sign-on (SSO) by providing a secure and convenient method for accessing multiple services with a single set of credentials. Despite its widespread adoption, critical security bugs in OpenID Connect have resulted in significant financial losses and security breaches, highlighting the need for robust mitigation strategies. Automated program repair presents a promising solution for generating candidate patches for OpenID implementations. However, challenges such as domain-specific complexities and the necessity for precise fault localization and patch verification must be addressed. We propose AuthFix, a counterexample-guided repair engine leveraging LLMs for automated OpenID bug fixing. AuthFix integrates three key components: fault localization, patch synthesis, and patch verification. By employing a novel Petri-net-based model checker, AuthFix ensures the correctness of patches by effectively modeling interactions. Our evaluation on a dataset of OpenID bugs demonstrates that AuthFix successfully generated correct patches for 17 out of 23 bugs (74%), with a high proportion of patches semantically equivalent to developer-written fixes.
Authors:Jack Garrard, John F. Hardy, Carlo daCunha, Mayank Bakshi
Abstract:
Physically Unclonable Functions (PUFs) are a promising solution for identity verification and asymmetric encryption. In this paper, a new Resistive Random Access Memory (ReRAM) PUF-based protocol is presented to create a physical ReRAM PUF with a large challenge space. This protocol uses differential reads from unformed ReRAM as the method for response generation. Lastly, this paper also provides an experimental hardware demonstration of this protocol on a Physical ReRAM device, along with providing notable results as a PUF, with excellent performance characteristics.
Authors:Vadim Safronov, Anthony McCaigue, Nicholas Allott, Andrew Martin
Abstract:
The growing integration of open-source software and AI-driven technologies has introduced new layers of complexity into the software supply chain, challenging existing methods for dependency management and system assurance. While Software Bills of Materials (SBOMs) have become critical for enhancing transparency and traceability, current frameworks fall short in capturing the unique characteristics of AI systems -- namely, their dynamic, data-driven nature and the loosely coupled dependencies across datasets, models, and software components. These challenges are compounded by fragmented governance structures and the lack of robust tools for ensuring integrity, trust, and compliance in AI-enabled environments. In this paper, we introduce Trusted AI Bill of Materials (TAIBOM) -- a novel framework extending SBOM principles to the AI domain. TAIBOM provides (i) a structured dependency model tailored for AI components, (ii) mechanisms for propagating integrity statements across heterogeneous AI pipelines, and (iii) a trust attestation process for verifying component provenance. We demonstrate how TAIBOM supports assurance, security, and compliance across AI workflows, highlighting its advantages over existing standards such as SPDX and CycloneDX. This work lays the foundation for trustworthy and verifiable AI systems through structured software transparency.
Authors:Davide Rusconi, Osama Yousef, Mirco Picca, Flavio Toffalini, Andrea Lanzi
Abstract:
In this paper we show E-FuzzEdge, a novel fuzzing architecture targeted towards improving the throughput of fuzzing campaigns in contexts where scalability is unavailable. E-FuzzEdge addresses the inefficiencies of hardware-in-the-loop fuzzing for microcontrollers by optimizing execution speed. We evaluated our system against state-of-the-art benchmarks, demonstrating significant performance improvements. A key advantage of E-FuzzEdgearchitecture is its compatibility with other embedded fuzzing techniques that perform on device testing instead of firmware emulation. This means that the broader embedded fuzzing community can integrate E-FuzzEdge into their workflows to enhance overall testing efficiency.
Authors:Hui Dou, Ning Xu, Yiwen Zhang, Kaibin Wang
Abstract:
Large Language Models (LLMs) have demonstrated remarkable capabilities in various tasks. However, they remain exposed to jailbreak attacks, eliciting harmful responses. The nested scenario strategy has been increasingly adopted across various methods, demonstrating immense potential. Nevertheless, these methods are easily detectable due to their prominent malicious intentions. In this work, we are the first to find and systematically verify that LLMs' alignment defenses are not sensitive to nested scenarios, where these scenarios are highly semantically relevant to the queries and incorporate targeted toxic knowledge. This is a crucial yet insufficiently explored direction. Based on this, we propose RTS-Attack (Semantically Relevant Nested Scenarios with Targeted Toxic Knowledge), an adaptive and automated framework to examine LLMs' alignment. By building scenarios highly relevant to the queries and integrating targeted toxic knowledge, RTS-Attack bypasses the alignment defenses of LLMs. Moreover, the jailbreak prompts generated by RTS-Attack are free from harmful queries, leading to outstanding concealment. Extensive experiments demonstrate that RTS-Attack exhibits superior performance in both efficiency and universality compared to the baselines across diverse advanced LLMs, including GPT-4o, Llama3-70b, and Gemini-pro. Our complete code is available in the supplementary material. WARNING: THIS PAPER CONTAINS POTENTIALLY HARMFUL CONTENT.
Authors:Yen-Shan Chen, Sian-Yao Huang, Cheng-Lin Yang, Yun-Nung Chen
Abstract:
Existing data poisoning attacks on retrieval-augmented generation (RAG) systems scale poorly because they require costly optimization of poisoned documents for each target phrase. We introduce Eyes-on-Me, a modular attack that decomposes an adversarial document into reusable Attention Attractors and Focus Regions. Attractors are optimized to direct attention to the Focus Region. Attackers can then insert semantic baits for the retriever or malicious instructions for the generator, adapting to new targets at near zero cost. This is achieved by steering a small subset of attention heads that we empirically identify as strongly correlated with attack success. Across 18 end-to-end RAG settings (3 datasets $\times$ 2 retrievers $\times$ 3 generators), Eyes-on-Me raises average attack success rates from 21.9 to 57.8 (+35.9 points, 2.6$\times$ over prior work). A single optimized attractor transfers to unseen black box retrievers and generators without retraining. Our findings establish a scalable paradigm for RAG data poisoning and show that modular, reusable components pose a practical threat to modern AI systems. They also reveal a strong link between attention concentration and model outputs, informing interpretability research.
Authors:Akshaya Kumar, Anna Raymaker, Michael Specter
Abstract:
We conduct the first comprehensive security analysis of Tile, the second most popular crowd-sourced location-tracking service behind Apple's AirTags. We identify several exploitable vulnerabilities and design flaws, disproving many of the platform's claimed security and privacy guarantees: Tile's servers can persistently learn the location of all users and tags, unprivileged adversaries can track users through Bluetooth advertisements emitted by Tile's devices, and Tile's anti-theft mode is easily subverted. Despite its wide deployment -- millions of users, devices, and purpose-built hardware tags -- Tile provides no formal description of its protocol or threat model. Worse, Tile intentionally weakens its antistalking features to support an antitheft use-case and relies on a novel "accountability" mechanism to punish those abusing the system to stalk victims. We examine Tile's accountability mechanism, a unique feature of independent interest; no other provider attempts to guarantee accountability. While an ideal accountability mechanism may disincentivize abuse in crowd-sourced location tracking protocols, we show that Tile's implementation is subvertible and introduces new exploitable vulnerabilities. We conclude with a discussion on the need for new, formal definitions of accountability in this setting.
Authors:Ehsan Aghaei, Sarthak Jain, Prashanth Arun, Arjun Sambamoorthy
Abstract:
Effective analysis of cybersecurity and threat intelligence data demands language models that can interpret specialized terminology, complex document structures, and the interdependence of natural language and source code. Encoder-only transformer architectures provide efficient and robust representations that support critical tasks such as semantic search, technical entity extraction, and semantic analysis, which are key to automated threat detection, incident triage, and vulnerability assessment. However, general-purpose language models often lack the domain-specific adaptation required for high precision. We present SecureBERT 2.0, an enhanced encoder-only language model purpose-built for cybersecurity applications. Leveraging the ModernBERT architecture, SecureBERT 2.0 introduces improved long-context modeling and hierarchical encoding, enabling effective processing of extended and heterogeneous documents, including threat reports and source code artifacts. Pretrained on a domain-specific corpus more than thirteen times larger than its predecessor, comprising over 13 billion text tokens and 53 million code tokens from diverse real-world sources, SecureBERT 2.0 achieves state-of-the-art performance on multiple cybersecurity benchmarks. Experimental results demonstrate substantial improvements in semantic search for threat intelligence, semantic analysis, cybersecurity-specific named entity recognition, and automated vulnerability detection in code within the cybersecurity domain.
Authors:Dong Hyun Roh, Rajesh Kumar
Abstract:
Keystroke dynamics is a promising modality for active user authentication, but its effectiveness under varying LLM-assisted typing and cognitive conditions remains understudied. Using data from 50 users and cognitive labels from Bloom's Taxonomy, we evaluate keystroke-based authentication in Korean across three realistic typing scenarios: bona fide composition, LLM content paraphrasing, and transcription. Our pipeline incorporates continuity-aware segmentation, feature extraction, and classification via SVM, MLP, and XGB. Results show that the system maintains reliable performance across varying LLM usages and cognitive contexts, with Equal Error Rates ranging from 5.1% to 10.4%. These findings demonstrate the feasibility of behavioral authentication under modern writing conditions and offer insights into designing more context-resilient models.
Authors:Yu-Fu Fu, Meng Xu, Taesoo Kim
Abstract:
While LLM-based specification generation is gaining traction, existing tools primarily focus on mainstream programming languages like C, Java, and even Solidity, leaving emerging and yet verification-oriented languages like Move underexplored. In this paper, we introduce MSG, an automated specification generation tool designed for Move smart contracts. MSG aims to highlight key insights that uniquely present when applying LLM-based specification generation to a new ecosystem. Specifically, MSG demonstrates that LLMs exhibit robust code comprehension and generation capabilities even for non-mainstream languages. MSG successfully generates verifiable specifications for 84% of tested Move functions and even identifies clauses previously overlooked by experts. Additionally, MSG shows that explicitly leveraging specification language features through an agentic, modular design improves specification quality substantially (generating 57% more verifiable clauses than conventional designs). Incorporating feedback from the verification toolchain further enhances the effectiveness of MSG, leading to a 30% increase in generated verifiable specifications.
Authors:Antonis Selentis, Nikolas Makris, Alkinoos Papageorgopoulos, Persefoni Konteli, Konstantinos Christodoulopoulos, George T. Kanellos, Dimitris Syvridis
Abstract:
We evaluate the performance of two architectures for network-wide quantum key distribution (QKD): Relayed QKD, which relays keys over multi-link QKD paths for non-adjacent nodes, and Switched QKD, which uses optical switches to dynamically connect arbitrary QKD modules to form direct QKD links between them. An advantage of Switched QKD is that it distributes quantum keys end-to-end, whereas Relayed relies on trusted nodes. However, Switched depends on arbitrary matching of QKD modules. We first experimentally evaluate the performance of commercial DV-QKD modules; for each of three vendors we benchmark the performance in standard/matched module pairs and in unmatched pairs to emulate configurations in the Switched QKD network architecture. The analysis reveals that in some cases a notable variation in the generated secret key rate (SKR) between the matched and unmatched pairs is observed. Driven by these experimental findings, we conduct a comprehensive theoretical analysis that evaluates the network-wide performance of the two architectures. Our analysis is based on uniform ring networks, where we derive optimal key management configurations and analytical formulas for the achievable consumed SKR. We compare network performance under varying ring sizes, QKD link losses, QKD receivers' sensitivity and performance penalties of unmatched modules. Our findings indicate that Switched QKD performs better in dense rings (short distances, large node counts), while Relayed QKD is more effective in longer distances and large node counts. Moreover, we confirm that unmatched QKD modules penalties significantly impact the efficiency of Switched QKD architecture.
Authors:Sherif Saad, Kevin Shi, Mohammed Mamun, Hythem Elmiligi
Abstract:
Automated machine learning (AutoML) has emerged as a promising paradigm for automating machine learning (ML) pipeline design, broadening AI adoption. Yet its reliability in complex domains such as cybersecurity remains underexplored. This paper systematically evaluates eight open-source AutoML frameworks across 11 publicly available cybersecurity datasets, spanning intrusion detection, malware classification, phishing, fraud detection, and spam filtering. Results show substantial performance variability across tools and datasets, with no single solution consistently superior. A paradigm shift is observed: the challenge has moved from selecting individual ML models to identifying the most suitable AutoML framework, complicated by differences in runtime efficiency, automation capabilities, and supported features. AutoML tools frequently favor tree-based models, which perform well but risk overfitting and limit interpretability. Key challenges identified include adversarial vulnerability, model drift, and inadequate feature engineering. We conclude with best practices and research directions to strengthen robustness, interpretability, and trust in AutoML for high-stakes cybersecurity applications.
Authors:Ummay Kulsum, Aafaq Sabir, Abhinaya S. B., Anupam Das
Abstract:
YouTube has emerged as a dominant platform for both information dissemination and entertainment. However, its vast accessibility has also made it a target for scammers, who frequently upload deceptive or malicious content. Prior research has documented a range of scam types, and detection approaches rely primarily on textual or statistical metadata. Although effective to some extent, these signals are easy to evade and potentially overlook other modalities, such as visual cues. In this study, we present the first systematic investigation of multimodal approaches for YouTube scam detection. Our dataset consolidates established scam categories and augments them with full length video content and policy grounded reasoning annotations. Our experimental evaluation demonstrates that a text-only model using video titles and descriptions (fine-tuned BERT) achieves moderate effectiveness (76.61% F1), with modest improvements when incorporating audio transcripts (77.98% F1). In contrast, visual analysis using a fine-tuned LLaVA-Video model yields stronger results (79.61% F1). Finally, a multimodal framework that integrates titles, descriptions, and video frames achieves the highest performance (80.53% F1). Beyond improving detection accuracy, our multimodal framework produces interpretable reasoning grounded in YouTube content policies, thereby enhancing transparency and supporting potential applications in automated moderation. Moreover, we validate our approach on in-the-wild YouTube data by analyzing 6,374 videos, thereby contributing a valuable resource for future research on scam detection.
Authors:Aravindhan G, Yuvaraj Govindarajulu, Parin Shah
Abstract:
Recent studies have demonstrated the vulnerability of Automatic Speech Recognition systems to adversarial examples, which can deceive these systems into misinterpreting input speech commands. While previous research has primarily focused on white-box attacks with constrained optimizations, and transferability based black-box attacks against commercial Automatic Speech Recognition devices, this paper explores cost efficient white-box attack and non transferability black-box adversarial attacks on Automatic Speech Recognition systems, drawing insights from approaches such as Fast Gradient Sign Method and Zeroth-Order Optimization. Further, the novelty of the paper includes how poisoning attack can degrade the performances of state-of-the-art models leading to misinterpretation of audio signals. Through experimentation and analysis, we illustrate how hybrid models can generate subtle yet impactful adversarial examples with very little perturbation having Signal Noise Ratio of 35dB that can be generated within a minute. These vulnerabilities of state-of-the-art open source model have practical security implications, and emphasize the need for adversarial security.
Authors:Jingkai Guo, Chaitali Chakrabarti, Deliang Fan
Abstract:
Model integrity of Large language models (LLMs) has become a pressing security concern with their massive online deployment. Prior Bit-Flip Attacks (BFAs) -- a class of popular AI weight memory fault-injection techniques -- can severely compromise Deep Neural Networks (DNNs): as few as tens of bit flips can degrade accuracy toward random guessing. Recent studies extend BFAs to LLMs and reveal that, despite the intuition of better robustness from modularity and redundancy, only a handful of adversarial bit flips can also cause LLMs' catastrophic accuracy degradation. However, existing BFA methods typically focus on either integer or floating-point models separately, limiting attack flexibility. Moreover, in floating-point models, random bit flips often cause perturbed parameters to extreme values (e.g., flipping in exponent bit), making it not stealthy and leading to numerical runtime error (e.g., invalid tensor values (NaN/Inf)). In this work, for the first time, we propose SBFA (Sneaky Bit-Flip Attack), which collapses LLM performance with only one single bit flip while keeping perturbed values within benign layer-wise weight distribution. It is achieved through iterative searching and ranking through our defined parameter sensitivity metric, ImpactScore, which combines gradient sensitivity and perturbation range constrained by the benign layer-wise weight distribution. A novel lightweight SKIP searching algorithm is also proposed to greatly reduce searching complexity, which leads to successful SBFA searching taking only tens of minutes for SOTA LLMs. Across Qwen, LLaMA, and Gemma models, with only one single bit flip, SBFA successfully degrades accuracy to below random levels on MMLU and SST-2 in both BF16 and INT8 data formats. Remarkably, flipping a single bit out of billions of parameters reveals a severe security concern of SOTA LLM models.
Authors:Hesam Sarkhosh, Uzma Maroof, Diogo Barradas
Abstract:
The rise of Web3 and Decentralized Finance (DeFi) has enabled borderless access to financial services empowered by smart contracts and blockchain technology. However, the ecosystem's trustless, permissionless, and borderless nature presents substantial regulatory challenges. The absence of centralized oversight and the technical complexity create fertile ground for financial crimes. Among these, money laundering is particularly concerning, as in the event of successful scams, code exploits, and market manipulations, it facilitates covert movement of illicit gains. Beyond this, there is a growing concern that cryptocurrencies can be leveraged to launder proceeds from drug trafficking, or to transfer funds linked to terrorism financing. This survey aims to outline a taxonomy of high-level strategies and underlying mechanisms exploited to facilitate money laundering in Web3. We examine how criminals leverage the pseudonymous nature of Web3, alongside weak regulatory frameworks, to obscure illicit financial activities. Our study seeks to bridge existing knowledge gaps on laundering schemes, identify open challenges in the detection and prevention of such activities, and propose future research directions to foster a more transparent Web3 financial ecosystem -- offering valuable insights for researchers, policymakers, and industry practitioners.
Authors:Prakhar Sharma, Haohuang Wen, Vinod Yegneswaran, Ashish Gehani, Phillip Porras, Zhiqiang Lin
Abstract:
The evolution toward 6G networks is being accelerated by the Open Radio Access Network (O-RAN) paradigm -- an open, interoperable architecture that enables intelligent, modular applications across public telecom and private enterprise domains. While this openness creates unprecedented opportunities for innovation, it also expands the attack surface, demanding resilient, low-cost, and autonomous security solutions. Legacy defenses remain largely reactive, labor-intensive, and inadequate for the scale and complexity of next-generation systems. Current O-RAN applications focus mainly on network optimization or passive threat detection, with limited capability for closed-loop, automated response. To address this critical gap, we present an agentic AI framework for fully automated, end-to-end threat mitigation in 6G O-RAN environments. MobiLLM orchestrates security workflows through a modular multi-agent system powered by Large Language Models (LLMs). The framework features a Threat Analysis Agent for real-time data triage, a Threat Classification Agent that uses Retrieval-Augmented Generation (RAG) to map anomalies to specific countermeasures, and a Threat Response Agent that safely operationalizes mitigation actions via O-RAN control interfaces. Grounded in trusted knowledge bases such as the MITRE FiGHT framework and 3GPP specifications, and equipped with robust safety guardrails, MobiLLM provides a blueprint for trustworthy AI-driven network security. Initial evaluations demonstrate that MobiLLM can effectively identify and orchestrate complex mitigation strategies, significantly reducing response latency and showcasing the feasibility of autonomous security operations in 6G.
Authors:Ben Rosenzweig, Valentino Dalla Valle, Giovanni Apruzzese, Aurore Fass
Abstract:
Google Chrome is the most popular Web browser. Users can customize it with extensions that enhance their browsing experience. The most well-known marketplace of such extensions is the Chrome Web Store (CWS). Developers can upload their extensions on the CWS, but such extensions are made available to users only after a vetting process carried out by Google itself. Unfortunately, some malicious extensions bypass such checks, putting the security and privacy of downstream browser extension users at risk. Here, we scrutinize the extent to which automated mechanisms reliant on supervised machine learning (ML) can be used to detect malicious extensions on the CWS. To this end, we first collect 7,140 malicious extensions published in 2017--2023. We combine this dataset with 63,598 benign extensions published or updated on the CWS before 2023, and we develop three supervised-ML-based classifiers. We show that, in a "lab setting", our classifiers work well (e.g., 98% accuracy). Then, we collect a more recent set of 35,462 extensions from the CWS, published or last updated in 2023, with unknown ground truth. We were eventually able to identify 68 malicious extensions that bypassed the vetting process of the CWS. However, our classifiers also reported >1k likely malicious extensions. Based on this finding (further supported with empirical evidence), we elucidate, for the first time, a strong concept drift effect on browser extensions. We also show that commercial detectors (e.g., VirusTotal) work poorly to detect known malicious extensions. Altogether, our results highlight that detecting malicious browser extensions is a fundamentally hard problem. This requires additional work both by the research community and by Google itself -- potentially by revising their approaches. In the meantime, we informed Google of our discoveries, and we release our artifacts.
Authors:Devashish Chaudhary, Sutharshan Rajasegarar, Shiva Raj Pokhrel
Abstract:
This survey explores the integration of Federated Learning (FL) with Network Intrusion Detection Systems (NIDS), with particular emphasis on deep learning and quantum machine learning approaches. FL enables collaborative model training across distributed devices while preserving data privacy-a critical requirement in network security contexts where sensitive traffic data cannot be centralized. Our comprehensive analysis systematically examines the full spectrum of FL architectures, deployment strategies, communication protocols, and aggregation methods specifically tailored for intrusion detection. We provide an in-depth investigation of privacy-preserving techniques, model compression approaches, and attack-specific federated solutions for threats including DDoS, MITM, and botnet attacks. The survey further delivers a pioneering exploration of Quantum FL (QFL), discussing quantum feature encoding, quantum machine learning algorithms, and quantum-specific aggregation methods that promise exponential speedups for complex pattern recognition in network traffic. Through rigorous comparative analysis of classical and quantum approaches, identification of research gaps, and evaluation of real-world deployments, we outline a concrete roadmap for industrial adoption and future research directions. This work serves as an authoritative reference for researchers and practitioners seeking to enhance privacy, efficiency, and robustness of federated intrusion detection systems in increasingly complex network environments, while preparing for the quantum-enhanced cybersecurity landscape of tomorrow.
Authors:Yongjiao Li, Liang Zhu, Yalin Deng, Qikun Zhang, Zhenlei Wang, Zhu Cao
Abstract:
Efficient and secure revocable attribute-based encryption (RABE) is vital for ensuring flexible and fine-grained access control and data sharing in cloud storage and outsourced data environments within the Internet of Things (IoT). However, current RABE schemes often struggle to achieve an optimal balance between efficiency, security, dynamic scalability, and other important features, which hampers their practical application. To overcome these limitations, we propose a fast RABE scheme with data integrity for IoT that achieves adaptive security with multiple challenge ciphertexts. Our scheme supports the revocation of authorized users and transfers the computationally heavy revocation processes to the cloud, thereby easing the computational burden on IoT devices. Moreover, it consistently guarantees the integrity and correctness of data. We have demonstrated its adaptive security within the defined security model with multiple challenge ciphertexts and optimized its performance. Experimental results indicate that our scheme provides better performance than existing solutions. Under the same access policy, our scheme reduces computational consumption by 7 to 9 times compared to previous schemes.
Authors:Rian Adam Rajagede, Yan Solihin
Abstract:
Fully Homomorphic Encryption (FHE) represents a paradigm shift in cryptography, enabling computation directly on encrypted data and unlocking privacy-critical computation. Despite being increasingly deployed in real platforms, the reliability aspects of FHE systems, especially how they respond to faults, have been mostly neglected. This paper aims to better understand of how FHE computation behaves in the presence of memory faults, both in terms of individual operations as well as at the level of applications, for different FHE schemes. Finally, we investigate how effective traditional and FHE-specific fault mitigation techniques are.
Authors:Visar Berisha, Prad Kadambi, Isabella Lenz
Abstract:
Speech deepfake detectors are often evaluated on clean, benchmark-style conditions, but deployment occurs in an open world of shifting devices, sampling rates, codecs, environments, and attack families. This creates a ``coverage debt" for AI-based detectors: every new condition multiplies with existing ones, producing data blind spots that grow faster than data can be collected. Because attackers can target these uncovered regions, worst-case performance (not average benchmark scores) determines security. To demonstrate the impact of the coverage debt problem, we analyze results from a recent cross-testing framework. Grouping performance by bona fide domain and spoof release year, two patterns emerge: newer synthesizers erase the legacy artifacts detectors rely on, and conversational speech domains (teleconferencing, interviews, social media) are consistently the hardest to secure. These findings show that detection alone should not be relied upon for high-stakes decisions. Detectors should be treated as auxiliary signals within layered defenses that include provenance, personhood credentials, and policy safeguards.
Authors:Huzaifa Sidhpurwala, Emily Fox, Garth Mollett, Florencio Cano Gabarda, Roman Zhukov
Abstract:
This paper introduces the Hazard-Aware System Card (HASC), a novel framework designed to enhance transparency and accountability in the development and deployment of AI systems. The HASC builds upon existing model card and system card concepts by integrating a comprehensive, dynamic record of an AI system's security and safety posture. The framework proposes a standardized system of identifiers, including a novel AI Safety Hazard (ASH) ID, to complement existing security identifiers like CVEs, allowing for clear and consistent communication of fixed flaws. By providing a single, accessible source of truth, the HASC empowers developers and stakeholders to make more informed decisions about AI system safety throughout its lifecycle. Ultimately, we also compare our proposed AI system cards with the ISO/IEC 42001:2023 standard and discuss how they can be used to complement each other, providing greater transparency and accountability for AI systems.
Authors:Ãnder Askin, Tim Kutta, Holger Dette
Abstract:
Auditing differential privacy has emerged as an important area of research that supports the design of privacy-preserving mechanisms. Privacy audits help to obtain empirical estimates of the privacy parameter, to expose flawed implementations of algorithms and to compare practical with theoretical privacy guarantees. In this work, we investigate an unexplored facet of privacy auditing: the sustained auditing of a mechanism that can go through changes during its development or deployment. Monitoring the privacy of algorithms over time comes with specific challenges. Running state-of-the-art (static) auditors repeatedly requires excessive sampling efforts, while the reliability of such methods deteriorates over time without proper adjustments. To overcome these obstacles, we present a new monitoring procedure that extracts information from the entire deployment history of the algorithm. This allows us to reduce sampling efforts, while sustaining reliable outcomes of our auditor. We derive formal guarantees with regard to the soundness of our methods and evaluate their performance for important mechanisms from the literature. Our theoretical findings and experiments demonstrate the efficacy of our approach.
Authors:Xiaofan Li, Xing Gao
Abstract:
In recent years, various software supply chain (SSC) attacks have posed significant risks to the global community. Severe consequences may arise if developers integrate insecure code snippets that are vulnerable to SSC attacks into their products. Particularly, code generation techniques, such as large language models (LLMs), have been widely utilized in the developer community. However, LLMs are known to suffer from inherent issues when generating code, including fabrication, misinformation, and reliance on outdated training data, all of which can result in serious software supply chain threats. In this paper, we investigate the security threats to the SSC that arise from these inherent issues. We examine three categories of threats, including eleven potential SSC-related threats, related to external components in source code, and continuous integration configuration files. We find some threats in LLM-generated code could enable attackers to hijack software and workflows, while some others might cause potential hidden threats that compromise the security of the software over time. To understand these security impacts and severity, we design a tool, SSCGuard, to generate 439,138 prompts based on SSC-related questions collected online, and analyze the responses of four popular LLMs from GPT and Llama. Our results show that all identified SSC-related threats persistently exist. To mitigate these risks, we propose a novel prompt-based defense mechanism, namely Chain-of-Confirmation, to reduce fabrication, and a middleware-based defense that informs users of various SSC threats.
Authors:Balazs Pejo, Marcell Frank, Krisztian Varga, Peter Veliczky
Abstract:
This paper investigates the fragility of contribution evaluation in federated learning, a critical mechanism for ensuring fairness and incentivizing participation. We argue that contribution scores are susceptible to significant distortions from two fundamental perspectives: architectural sensitivity and intentional manipulation. First, we explore how different model aggregation methods impact these scores. While most research assumes a basic averaging approach, we demonstrate that advanced techniques, including those designed to handle unreliable or diverse clients, can unintentionally yet significantly alter the final scores. Second, we explore vulnerabilities posed by poisoning attacks, where malicious participants strategically manipulate their model updates to inflate their own contribution scores or reduce the importance of other participants. Through extensive experiments across diverse datasets and model architectures, implemented within the Flower framework, we rigorously show that both the choice of aggregation method and the presence of attackers are potent vectors for distorting contribution scores, highlighting a critical need for more robust evaluation schemes.
Authors:Mengdi Lu, Steven Ding, Furkan Alaca, Philippe Charland
Abstract:
Security vulnerabilities in Internet-of-Things devices, mobile platforms, and autonomous systems remain critical. Traditional mutation-based fuzzers -- while effectively explore code paths -- primarily perform byte- or bit-level edits without semantic reasoning. Coverage-guided tools such as AFL++ use dictionaries, grammars, and splicing heuristics to impose shallow structural constraints, leaving deeper protocol logic, inter-field dependencies, and domain-specific semantics unaddressed. Conversely, reasoning-capable large language models (LLMs) can leverage pretraining knowledge to understand input formats, respect complex constraints, and propose targeted mutations, much like an experienced reverse engineer or testing expert. However, lacking ground truth for "correct" mutation reasoning makes supervised fine-tuning impractical, motivating explorations of off-the-shelf LLMs via prompt-based few-shot learning. To bridge this gap, we present an open-source microservices framework that integrates reasoning LLMs with AFL++ on Google's FuzzBench, tackling asynchronous execution and divergent hardware demands (GPU- vs. CPU-intensive) of LLMs and fuzzers. We evaluate four research questions: (R1) How can reasoning LLMs be integrated into the fuzzing mutation loop? (R2) Do few-shot prompts yield higher-quality mutations than zero-shot? (R3) Can prompt engineering with off-the-shelf models improve fuzzing directly? and (R4) Which open-source reasoning LLMs perform best under prompt-only conditions? Experiments with Llama3.3, Deepseek-r1-Distill-Llama-70B, QwQ-32B, and Gemma3 highlight Deepseek as the most promising. Mutation effectiveness depends more on prompt complexity and model choice than shot count. Response latency and throughput bottlenecks remain key obstacles, offering directions for future work.
Authors:Hafijul Hoque Chowdhury, Riad Ahmed Anonto, Sourov Jajodia, Suryadipta Majumdar, Md. Shohrab Hossain
Abstract:
With the rapid growth of smart home IoT devices, users are increasingly exposed to various security risks, as evident from recent studies. While seeking answers to know more on those security concerns, users are mostly left with their own discretion while going through various sources, such as online blogs and technical manuals, which may render higher complexity to regular users trying to extract the necessary information. This requirement does not go along with the common mindsets of smart home users and hence threatens the security of smart homes furthermore. In this paper, we aim to identify and address the major user-level security concerns in smart homes. Specifically, we develop a novel dataset of Q&A from public forums, capturing practical security challenges faced by smart home users. We extract major security concerns in smart homes from our dataset by leveraging the Latent Dirichlet Allocation (LDA). We fine-tune relatively "smaller" transformer models, such as T5 and Flan-T5, on this dataset to build a QA system tailored for smart home security. Unlike larger models like GPT and Gemini, which are powerful but often resource hungry and require data sharing, smaller models are more feasible for deployment in resource-constrained or privacy-sensitive environments like smart homes. The dataset is manually curated and supplemented with synthetic data to explore its potential impact on model performance. This approach significantly improves the system's ability to deliver accurate and relevant answers, helping users address common security concerns with smart home IoT devices. Our experiments on real-world user concerns show that our work improves the performance of the base models.
Authors:Kemi Akanbi, Sunkanmi Oluwadare, Jess Kropczynski, Jacques Bou Abdo
Abstract:
This study examines the robustness of I2P, a well-regarded anonymous and decentralized peer-to-peer network designed to ensure anonymity, confidentiality, and circumvention of censorship. Unlike its more widely researched counterpart, TOR, I2P's resilience has received less scholarly attention. Employing network analysis, this research evaluates I2P's susceptibility to adversarial percolation. By utilizing the degree centrality as a measure of nodes' influence in the network, the finding suggests the network is vulnerable to targeted disruptions. Before percolation, the network exhibited a density of 0.01065443 and an average path length of 6.842194. At the end of the percolation process, the density decreased by approximately 10%, and the average path length increased by 33%, indicating a decline in efficiency and connectivity. These results highlight that even decentralized networks, such as I2P, exhibit structural fragility under targeted attacks, emphasizing the need for improved design strategies to enhance resilience against adversarial disruptions.
Authors:Aleksandr Dolgavin, Jacob Gatlin, Moti Yung, Mark Yampolskiy
Abstract:
The central security issue of outsourced 3D printing (aka AM: Additive Manufacturing), an industry that is expected to dominate manufacturing, is the protection of the digital design (containing the designers' model, which is their intellectual property) shared with the manufacturer. Here, we show, for the first time, that side-channel attacks are, in fact, a concrete serious threat to existing industrial grade 3D printers, enabling the reconstruction of the model printed (regardless of employing ways to directly conceal the design, e.g. by encrypting it in transit and before loading it into the printer). Previously, such attacks were demonstrated only on fairly simple FDM desktop 3D printers, which play a negligible role in manufacturing of valuable designs. We focus on the Powder Bed Fusion (PBF) AM process, which is popular for manufacturing net-shaped parts with both polymers and metals. We demonstrate how its individual actuators can be instrumented for the collection of power side-channel information during the printing process. We then present our approach to reconstruct the 3D printed model solely from the collected power side-channel data. Further, inspired by Differential Power Analysis, we developed a method to improve the quality of the reconstruction based on multiple traces. We tested our approach on two design models with different degrees of complexity. For different models, we achieved as high as 90.29~\% of True Positives and as low as 7.02~\% and 9.71~\% of False Positives and False Negatives by voxel-based volumetric comparison between reconstructed and original designs. The lesson learned from our attack is that the security of design files cannot solely rely on protecting the files themselves in an industrial environment, but must instead also rely on assuring no leakage of power, noise and similar signals to potential eavesdroppers in the printer's vicinity.
Authors:Tarun Chitra, Paolo Penna, Manvir Schneider
Abstract:
Restaking protocols expand validator responsibilities beyond consensus, but their security depends on resistance to Sybil attacks. We introduce a formal framework for Sybil-proofness in restaking networks, distinguishing between two types of attacks, one in which other Sybil identities are kept out of an attack and one where multiple Sybil identities attack. We analyze marginal and multiplicative slashing mechanisms and characterize the conditions under which each deters Sybil strategies. We then prove an impossibility theorem: no slashing mechanism can simultaneously prevent both attack types. Finally, we study the impact of network structure through random graph models: while Erdös-Rényi networks remain Sybil-proof, even minimal heterogeneity in a two-block stochastic block model makes Sybil attacks profitable. These results reveal fundamental limits of mechanism design for restaking and highlight the critical role of network topology.
Authors:Roberto Doriguzzi-Corin, Petr Sabel, Silvio Cretti, Silvio Ranise
Abstract:
Machine Learning (ML) techniques have shown strong potential for network traffic analysis; however, their effectiveness depends on access to representative, up-to-date datasets, which is limited in cybersecurity due to privacy and data-sharing restrictions. To address this challenge, Federated Learning (FL) has recently emerged as a novel paradigm that enables collaborative training of ML models across multiple clients while ensuring that sensitive data remains local. Nevertheless, Federated Averaging (FedAvg), the canonical FL algorithm, has proven poor convergence in heterogeneous environments where data distributions are non-independent and identically distributed (i.i.d.) and client datasets are unbalanced, conditions frequently observed in cybersecurity contexts. To overcome these challenges, several alternative FL strategies have been developed, yet their applicability to network intrusion detection remains insufficiently explored. This study systematically reviews and evaluates a range of FL methods in the context of intrusion detection for DDoS attacks. Using a dataset of network attacks within a Kubernetes-based testbed, we assess convergence efficiency, computational overhead, bandwidth consumption, and model accuracy. To the best of our knowledge, this is the first comparative analysis of FL algorithms for intrusion detection under realistic non-i.i.d. and unbalanced settings, providing new insights for the design of robust, privacypreserving network security solutions.
Authors:Haotian Xu, Qingsong Peng, Jie Shi, Huadi Zheng, Yu Li, Cheng Zhuo
Abstract:
The rapid adoption of large language models (LLMs) in critical domains has spurred extensive research into their security issues. While input manipulation attacks (e.g., prompt injection) have been well studied, Bit-Flip Attacks (BFAs) -- which exploit hardware vulnerabilities to corrupt model parameters and cause severe performance degradation -- have received far less attention. Existing BFA methods suffer from key limitations: they fail to balance performance degradation and output naturalness, making them prone to discovery. In this paper, we introduce SilentStriker, the first stealthy bit-flip attack against LLMs that effectively degrades task performance while maintaining output naturalness. Our core contribution lies in addressing the challenge of designing effective loss functions for LLMs with variable output length and the vast output space. Unlike prior approaches that rely on output perplexity for attack loss formulation, which inevitably degrade output naturalness, we reformulate the attack objective by leveraging key output tokens as targets for suppression, enabling effective joint optimization of attack effectiveness and stealthiness. Additionally, we employ an iterative, progressive search strategy to maximize attack efficacy. Experiments show that SilentStriker significantly outperforms existing baselines, achieving successful attacks without compromising the naturalness of generated text.
Authors:Colman McGuan, Aadithyan V. Raghavan, Komala M. Mandapati, Chansu Yu, Brian E. Ray, Debbie K. Jackson, Sathish Kumar
Abstract:
In an increasingly interconnected world, cybersecurity professionals play a pivotal role in safeguarding organizations from cyber threats. To secure their cyberspace, organizations are forced to adopt a cybersecurity framework such as the NIST National Initiative for Cybersecurity Education Workforce Framework for Cybersecurity (NICE Framework). Although these frameworks are a good starting point for businesses and offer critical information to identify, prevent, and respond to cyber incidents, they can be difficult to navigate and implement, particularly for small-medium businesses (SMB). To help overcome this issue, this paper identifies the most frequent attack vectors to SMBs (Objective 1) and proposes a practical model of both technical and non-technical tasks, knowledge, skills, abilities (TKSA) from the NICE Framework for those attacks (Objective 2). The research develops a scenario-based curriculum. By immersing learners in realistic cyber threat scenarios, their practical understanding and preparedness in responding to cybersecurity incidents is enhanced (Objective 3). Finally, this work integrates practical experience and real-life skill development into the curriculum (Objective 4). SMBs can use the model as a guide to evaluate, equip their existing workforce, or assist in hiring new employees. In addition, educational institutions can use the model to develop scenario-based learning modules to adequately equip the emerging cybersecurity workforce for SMBs. Trainees will have the opportunity to practice both technical and legal issues in a simulated environment, thereby strengthening their ability to identify, mitigate, and respond to cyber threats effectively.
Authors:Bence Soóki-Tóth, István András Seres, Kamilla Kara, Ãbel Nagy, Balázs Pejó, Gergely Biczók
Abstract:
The long-term success of cryptocurrencies largely depends on the incentive compatibility provided to the validators. Bribery attacks, facilitated trustlessly via smart contracts, threaten this foundation. This work introduces, implements, and evaluates three novel and efficient bribery contracts targeting Ethereum validators. The first bribery contract enables a briber to fork the blockchain by buying votes on their proposed blocks. The second contract incentivizes validators to voluntarily exit the consensus protocol, thus increasing the adversary's relative staking power. The third contract builds a trustless bribery market that enables the briber to auction off their manipulative power over the RANDAO, Ethereum's distributed randomness beacon. Finally, we provide an initial game-theoretical analysis of one of the described bribery markets.
Authors:Serena Wang, Martino Banchio, Krzysztof Kotowicz, Katrina Ligett, R. Preston McAfee, Eduardo' Vela'' Nava
Abstract:
Bug bounty programs have contributed significantly to security in technology firms in the last decade, but little is known about the role of reward incentives in producing useful outcomes. We analyze incentives and outcomes in Google's Vulnerability Rewards Program (VRP), one of the world's largest bug bounty programs. We analyze the responsiveness of the quality and quantity of bugs received to changes in payments, focusing on a change in Google's reward amounts posted in July, 2024, in which reward amounts increased by up to 200% for the highest impact tier. Our empirical results show an increase in the volume of high-value bugs received after the reward increase, for which we also compute elasticities. We further break down the sources of this increase between veteran researchers and new researchers, showing that the reward increase both redirected the attention of veteran researchers and attracted new top security researchers into the program.
Authors:Mingjian Duan, Ming Xu, Shenghao Zhang, Jiaheng Zhang, Weili Han
Abstract:
Textual passwords remain a predominant authentication mechanism in web security. To evaluate their strength, existing research has proposed several data-driven models across various scenarios. However, these models generally treat passwords uniformly, neglecting the structural differences among passwords. This typically results in biased training that favors frequent password structural patterns. To mitigate the biased training, we argue that passwords, as a type of complex short textual data, should be processed in a structure-aware manner by identifying their structural patterns and routing them to specialized models accordingly. In this paper, we propose MoPE, a Mixture of Password Experts framework, specifically designed to leverage the structural patterns in passwords to improveguessing performance. Motivated by the observation that passwords with similar structural patterns (e.g., fixed-length numeric strings) tend to cluster in high-density regions within the latent space, our MoPE introduces: (1) a novel structure-based method for generating specialized expert models; (2) a lightweight gate method to select appropriate expert models to output reliable guesses, better aligned with the high computational frequency of password guessing tasks. Our evaluation shows that MoPE significantly outperforms existing state-of-the-art baselines in both offline and online guessing scenarios, achieving up to 38.80% and 9.27% improvement in cracking rate, respectively, showcasing that MoPE can effectively exploit the capabilities of data-driven models for password guessing. Additionally, we implement a real-time Password Strength Meter (PSM) based on offline MoPE, assisting users in choosing stronger passwords more precisely with millisecond-level response latency.
Authors:Vabuk Pahari, Andrea Canidio
Abstract:
We analyze 15,097 blocks proposed for inclusion in Ethereum's blockchain over an 8-minute window on December 3, 2024, during which 38 blocks were added to the chain. We classify transactions as exclusive -- present only in blocks from a single builder -- or private -- absent from the public mempool but included in blocks from multiple builders. We find that exclusive transactions account for 84% of the total fees paid by transactions in winning blocks. Furthermore, we show that exclusivity cannot be fully explained by exclusive relationships between senders and builders: about 7% of all exclusive transactions included on-chain, by value, come from senders who route exclusively to a single builder. Analyzing transaction logs shows that some exclusive transactions are duplicates or variations of the same strategy, but even accounting for that, the share of the total fees paid by transactions in winning blocks is at least 77.2%. Taken together, our findings highlight that exclusive transactions are the dominant source of builder revenues.
Authors:Zhiyu Huang, Guyue Li, Hao Xu, Derrick Wing Kwan Ng
Abstract:
This paper investigates physical-layer key generation (PLKG) in multi-antenna base station systems, by leveraging a fluid antenna system (FAS) to dynamically customize radio environments. Without requiring additional nodes or extensive radio frequency chains, the FAS effectively enables adaptive antenna port selection by exploiting channel spatial correlation to enhance the key generation rate (KGR) at legitimate nodes. To comprehensively evaluate the efficiency of the FAS in PLKG, we propose an FAS-assisted PLKG model that integrates transmit beamforming and sparse port selection under independent and identically distributed and spatially correlated channel models, respectively. Specifically, the PLKG utilizes reciprocal channel probing to derive a closed-form KGR expression based on the mutual information between legitimate channel estimates. Nonconvex optimization problems for these scenarios are formulated to maximize the KGR subject to transmit power constraints and sparse port activation. We propose an iterative algorithm by capitalizing on successive convex approximation and Cauchy-Schwarz inequality to obtain a locally optimal solution. A reweighted $\ell_1$-norm-based algorithm is applied to advocate for the sparse port activation of FAS-assisted PLKG. Furthermore, a low-complexity sliding window-based port selection is proposed to substitute reweighted $\ell_1$-norm method based on Rayleigh-quotient analysis. Simulation results demonstrate that the FAS-PLKG scheme significantly outperforms the FA-PLKG scheme in both independent and spatially correlated environments. The sliding window-based port selection method introduced in this paper has been shown to yield superior KGR, compared to the reweighted $\ell_1$-norm method. It is shown that the FAS achieves higher KGR with fewer RF chains through dynamic sparse port selection.
Authors:Michael Neri, Tuomas Virtanen
Abstract:
Replay speech attacks pose a significant threat to voice-controlled systems, especially in smart environments where voice assistants are widely deployed. While multi-channel audio offers spatial cues that can enhance replay detection robustness, existing datasets and methods predominantly rely on single-channel recordings. In this work, we introduce an acoustic simulation framework designed to simulate multi-channel replay speech configurations using publicly available resources. Our setup models both genuine and spoofed speech across varied environments, including realistic microphone and loudspeaker impulse responses, room acoustics, and noise conditions. The framework employs measured loudspeaker directionalities during the replay attack to improve the realism of the simulation. We define two spoofing settings, which simulate whether a reverberant or an anechoic speech is used in the replay scenario, and evaluate the impact of omnidirectional and diffuse noise on detection performance. Using the state-of-the-art M-ALRAD model for replay speech detection, we demonstrate that synthetic data can support the generalization capabilities of the detector across unseen enclosures.
Authors:Yihao Guo, Haocheng Bian, Liutong Zhou, Ze Wang, Zhaoyi Zhang, Francois Kawala, Milan Dean, Ian Fischer, Yuantao Peng, Noyan Tokgozoglu, Ivan Barrientos, Riyaaz Shaik, Rachel Li, Chandru Venkataraman, Reza Shifteh Far, Moses Pawar, Venkat Sundaranatha, Michael Xu, Frank Chu
Abstract:
With the deployment of Large Language Models (LLMs) in interactive applications, online malicious intent detection has become increasingly critical. However, existing approaches fall short of handling diverse and complex user queries in real time. To address these challenges, we introduce ADRAG (Adversarial Distilled Retrieval-Augmented Guard), a two-stage framework for robust and efficient online malicious intent detection. In the training stage, a high-capacity teacher model is trained on adversarially perturbed, retrieval-augmented inputs to learn robust decision boundaries over diverse and complex user queries. In the inference stage, a distillation scheduler transfers the teacher's knowledge into a compact student model, with a continually updated knowledge base collected online. At deployment, the compact student model leverages top-K similar safety exemplars retrieved from the online-updated knowledge base to enable both online and real-time malicious query detection. Evaluations across ten safety benchmarks demonstrate that ADRAG, with a 149M-parameter model, achieves 98.5% of WildGuard-7B's performance, surpasses GPT-4 by 3.3% and Llama-Guard-3-8B by 9.5% on out-of-distribution detection, while simultaneously delivering up to 5.6x lower latency at 300 queries per second (QPS) in real-time applications.
Authors:Xuan Luo, Yue Wang, Zefeng He, Geng Tu, Jing Li, Ruifeng Xu
Abstract:
Safety alignment aims to prevent Large Language Models (LLMs) from responding to harmful queries. To strengthen safety protections, jailbreak methods are developed to simulate malicious attacks and uncover vulnerabilities. In this paper, we introduce HILL (Hiding Intention by Learning from LLMs), a novel jailbreak approach that systematically transforms imperative harmful requests into learning-style questions with only straightforward hypotheticality indicators. Further, we introduce two new metrics to thoroughly evaluate the utility of jailbreak methods. Experiments on the AdvBench dataset across a wide range of models demonstrate HILL's strong effectiveness, generalizability, and harmfulness. It achieves top attack success rates on the majority of models and across malicious categories while maintaining high efficiency with concise prompts. Results of various defense methods show the robustness of HILL, with most defenses having mediocre effects or even increasing the attack success rates. Moreover, the assessment on our constructed safe prompts reveals inherent limitations of LLMs' safety mechanisms and flaws in defense methods. This work exposes significant vulnerabilities of safety measures against learning-style elicitation, highlighting a critical challenge of balancing helpfulness and safety alignments.
Authors:Aleksandr Nahapetyan, Kanv Khare, Kevin Schwarz, Bradley Reaves, Alexandros Kapravelos
Abstract:
In 2024, the Anti-Phishing Work Group identified over one million phishing pages. Phishers achieve this scale by using phishing kits -- ready-to-deploy phishing websites -- to rapidly deploy phishing campaigns with specific data exfiltration, evasion, or mimicry techniques. In contrast, researchers and defenders continue to fight phishing on a page-by-page basis and rely on manual analysis to recognize static features for kit identification. This paper aims to aid researchers and analysts by automatically differentiating groups of phishing pages based on the underlying kit, automating a previously manual process, and enabling us to measure how popular different client-side techniques are across these groups. For kit detection, our system has an accuracy of 97% on a ground-truth dataset of 548 kit families deployed across 4,562 phishing URLs. On an unlabeled dataset, we leverage the complexity of 434,050 phishing pages' JavaScript logic to group them into 11,377 clusters, annotating the clusters with what phishing techniques they employ. We find that UI interactivity and basic fingerprinting are universal techniques, present in 90% and 80% of the clusters, respectively. On the other hand, mouse detection via the browser's mouse API is among the rarest behaviors, despite being used in a deployment of a 7-year-old open-source phishing kit. Our methods and findings provide new ways for researchers and analysts to tackle the volume of phishing pages.
Authors:Haneen Najjar, Eyal Ronen, Mahmood Sharif
Abstract:
Security-critical machine-learning (ML) systems, such as face-recognition systems, are susceptible to adversarial examples, including real-world physically realizable attacks. Various means to boost ML's adversarial robustness have been proposed; however, they typically induce unfair robustness: It is often easier to attack from certain classes or groups than from others. Several techniques have been developed to improve adversarial robustness while seeking perfect fairness between classes. Yet, prior work has focused on settings where security and fairness are less critical. Our insight is that achieving perfect parity in realistic fairness-critical tasks, such as face recognition, is often infeasible -- some classes may be highly similar, leading to more misclassifications between them. Instead, we suggest that seeking symmetry -- i.e., attacks from class $i$ to $j$ would be as successful as from $j$ to $i$ -- is more tractable. Intuitively, symmetry is a desirable because class resemblance is a symmetric relation in most domains. Additionally, as we prove theoretically, symmetry between individuals induces symmetry between any set of sub-groups, in contrast to other fairness notions where group-fairness is often elusive. We develop Sy-FAR, a technique to encourage symmetry while also optimizing adversarial robustness and extensively evaluate it using five datasets, with three model architectures, including against targeted and untargeted realistic attacks. The results show Sy-FAR significantly improves fair adversarial robustness compared to state-of-the-art methods. Moreover, we find that Sy-FAR is faster and more consistent across runs. Notably, Sy-FAR also ameliorates another type of unfairness we discover in this work -- target classes that adversarial examples are likely to be classified into become significantly less vulnerable after inducing symmetry.
Authors:David Adei, Varun Madathil, Nithin Shyam S., Bradley Reaves
Abstract:
The STIR/SHAKEN (S/S) attestation Framework mandated by the United States, Canada, and France to combat pervasive telephone abuse has not achieved its goals, partly because legacy non-VoIP infrastructure could not participate. The industry solution to extend S/S broadcasts sensitive metadata of every non-VoIP call in plaintext to every third party required to facilitate the system. It has no mechanism to determine whether a provider's request for call data is appropriate, nor can it ensure that every copy of that call data is unavailable after its specified expiration. It threatens subscriber privacy and provider confidentiality. In this paper, we present Sidecar, a distributed, privacy-preserving system with tunable decentralization that securely extends S/S across all telephone network technologies. We introduce the notion of secure out-of-band signaling for telephony and formalize its system and security requirements. We then design novel, scalable protocols that realize these requirements and prove their security within the Universal Composability framework. Finally, we demonstrate Sidecar's efficiency with our open-sourced reference implementation. Compared to the current solution, Sidecar 1) protects the confidentiality of subscriber identity and provider trade secrets, 2) guarantees record expiration as long as a single node handling a record is honest, 3) reduces resource requirements while providing virtually identical call-setup times and equivalent or better uptimes, and 4) enables secure pay-per-use billing and integrates mechanisms to mitigate and detect misbehavior. Moreover, Sidecar can be extended to provide the same security guarantees for arbitrary call metadata. Not only is Sidecar a superior approach, it is also a transformative tool to retrofit fragmented global telephony and enable future improvements, such as stronger call authentication and Branded Calling.
Authors:Chuxu Song, Dheekshith Dev Manohar Mekala, Hao Wang, Richard Martin
Abstract:
Website Fingerprinting (WFP) uses deep learning models to classify encrypted network traffic to infer visited websites. While historically effective, prior methods fail to generalize to modern web environments. Single-page applications (SPAs) eliminate the paradigm of websites as sets of discrete pages, undermining page-based classification, and traffic from scripted browsers lacks the behavioral richness seen in real user sessions. Our study reveals that users exhibit highly diverse behaviors even on the same website, producing traffic patterns that vary significantly across individuals. This behavioral entropy makes WFP a harder problem than previously assumed and highlights the need for larger, more diverse, and representative datasets to achieve robust performance. To address this, we propose a new paradigm: we drop session-boundaries in favor of contiguous traffic segments and develop a scalable data generation pipeline using large language models (LLM) agents. These multi-agent systems coordinate decision-making and browser interaction to simulate realistic, persona-driven browsing behavior at 3--5x lower cost than human collection. We evaluate nine state-of-the-art WFP models on traffic from 20 modern websites browsed by 30 real users, and compare training performance across human, scripted, and LLM-generated datasets. All models achieve under 10\% accuracy when trained on scripted traffic and tested on human data. In contrast, LLM-generated traffic boosts accuracy into the 80\% range, demonstrating strong generalization to real-world traces. Our findings indicate that for modern WFP, model performance is increasingly bottlenecked by data quality, and that scalable, semantically grounded synthetic traffic is essential for capturing the complexity of real user behavior.
Authors:Shunfan Zhou, Kevin Wang, Hang Yin
Abstract:
Web3 applications require execution platforms that maintain confidentiality and integrity without relying on centralized trust authorities. While Trusted Execution Environments (TEEs) offer promising capabilities for confidential computing, current implementations face significant limitations when applied to Web3 contexts, particularly in security reliability, censorship resistance, and vendor independence. This paper presents dstack, a comprehensive framework that transforms raw TEE technology into a true Zero Trust platform. We introduce three key innovations: (1) Portable Confidential Containers that enable seamless workload migration across heterogeneous TEE environments while maintaining security guarantees, (2) Decentralized Code Management that leverages smart contracts for transparent governance of TEE applications, and (3) Verifiable Domain Management that ensures secure and verifiable application identity without centralized authorities. These innovations are implemented through three core components: dstack-OS, dstack-KMS, and dstack-Gateway. Together, they demonstrate how to achieve both the performance advantages of VM-level TEE solutions and the trustless guarantees required by Web3 applications. Our evaluation shows that dstack provides comprehensive security guarantees while maintaining practical usability for real-world applications.
Authors:Syed Emad Uddin Shubha, Tasnuva Farheen
Abstract:
Hardware crosstalk in multi-tenant superconducting quantum computers constitutes a significant security threat, enabling adversaries to inject targeted errors across tenant boundaries. We present the first end-to-end framework for mapping physical pulse-level attacks to interpretable logical error channels, integrating density-matrix simulation, quantum process tomography (QPT), and a novel isometry-based circuit extraction method. Our pipeline reconstructs the complete induced error channel and fits an effective logical circuit model, revealing a fundamentally asymmetric attack mechanism: one adversarial qubit acts as a driver to set the induced logical rotation, while a second, the catalyst, refines the attack's coherence. Demonstrated on a linear three-qubit system, our approach shows that such attacks can significantly disrupt diverse quantum protocols, sometimes reducing accuracy to random guessing, while remaining effective and stealthy even under realistic hardware parameter variations. We further propose a protocol-level detection strategy based on observable attack signatures, showing that stealthy attacks can be exposed through targeted monitoring and providing a foundation for future defense-in-depth in quantum cloud platforms.
Authors:Seonghun Son, Chandrika Mukherjee, Reham Mohamed Aburas, Berk Gulmezoglu, Z. Berkay Celik
Abstract:
Over the past decade, AR/VR devices have drastically changed how we interact with the digital world. Users often share sensitive information, such as their location, browsing history, and even financial data, within third-party apps installed on these devices, assuming a secure environment protected from malicious actors. Recent research has revealed that malicious apps can exploit such capabilities and monitor benign apps to track user activities, leveraging fine-grained profiling tools, such as performance counter APIs. However, app-to-app monitoring is not feasible on all AR/VR devices (e.g., Meta Quest), as a concurrent standalone app execution is disabled. In this paper, we present OVRWatcher, a novel side-channel primitive for AR/VR devices that infers user activities by monitoring low-resolution (1Hz) GPU usage via a background script, unlike prior work that relies on high-resolution profiling. OVRWatcher captures correlations between GPU metrics and 3D object interactions under varying speeds, distances, and rendering scenarios, without requiring concurrent app execution, access to application data, or additional SDK installations. We demonstrate the efficacy of OVRWatcher in fingerprinting both standalone AR/VR and WebXR applications. OVRWatcher also distinguishes virtual objects, such as products in immersive shopping apps selected by real users and the number of participants in virtual meetings, thereby revealing users' product preferences and potentially exposing confidential information from those meetings. OVRWatcher achieves over 99% accuracy in app fingerprinting and over 98% accuracy in object-level inference.
Authors:Vitor Hugo Galhardo Moia, Igor Jochem Sanz, Gabriel Antonio Fontes Rebello, Rodrigo Duarte de Meneses, Briland Hitaj, Ulf Lindqvist
Abstract:
The success and wide adoption of generative AI (GenAI), particularly large language models (LLMs), has attracted the attention of cybercriminals seeking to abuse models, steal sensitive data, or disrupt services. Moreover, providing security to LLM-based systems is a great challenge, as both traditional threats to software applications and threats targeting LLMs and their integration must be mitigated. In this survey, we shed light on security and privacy concerns of such LLM-based systems by performing a systematic review and comprehensive categorization of threats and defensive strategies considering the entire software and LLM life cycles. We analyze real-world scenarios with distinct characteristics of LLM usage, spanning from development to operation. In addition, threats are classified according to their severity level and to which scenarios they pertain, facilitating the identification of the most relevant threats. Recommended defense strategies are systematically categorized and mapped to the corresponding life cycle phase and possible attack strategies they attenuate. This work paves the way for consumers and vendors to understand and efficiently mitigate risks during integration of LLMs in their respective solutions or organizations. It also enables the research community to benefit from the discussion of open challenges and edge cases that may hinder the secure and privacy-preserving adoption of LLM-based systems.
Authors:Danilo Francati, Yevin Nikhel Goonatilake, Shubham Pawar, Daniele Venturi, Giuseppe Ateniese
Abstract:
We prove a sharp threshold for the robustness of cryptographic watermarking for generative models. This is achieved by introducing a coding abstraction, which we call messageless secret-key codes, that formalizes sufficient and necessary requirements of robust watermarking: soundness, tamper detection, and pseudorandomness. Thus, we establish that robustness has a precise limit: For binary outputs no scheme can survive if more than half of the encoded bits are modified, and for an alphabet of size q the corresponding threshold is $(1-1/q)$ of the symbols. Complementing this impossibility, we give explicit constructions that meet the bound up to a constant slack. For every $δ > 0$, assuming pseudorandom functions and access to a public counter, we build linear-time codes that tolerate up to $(1/2)(1-δ)$ errors in the binary case and $(1-1/q)(1-δ)$ errors in the $q$-ary case. Together with the lower bound, these yield the maximum robustness achievable under standard cryptographic assumptions. We then test experimentally whether this limit appears in practice by looking at the recent watermarking for images of Gunn, Zhao, and Song (ICLR 2025). We show that a simple crop and resize operation reliably flipped about half of the latent signs and consistently prevented belief-propagation decoding from recovering the codeword, erasing the watermark while leaving the image visually intact. These results provide a complete characterization of robust watermarking, identifying the threshold at which robustness fails, constructions that achieve it, and an experimental confirmation that the threshold is already reached in practice.
Authors:Amal Raj, Vivek Balachandran
Abstract:
This paper introduces a hybrid encryption framework combining classical cryptography (EdDSA, ECDH), post-quantum cryptography (ML-DSA-6x5, ML-KEM-768), and Quantum Key Distribution (QKD) via Guardian to counter quantum computing threats. Our prototype implements this integration, using a key derivation function to generate secure symmetric and HMAC keys, and evaluates its performance across execution time and network metrics. The approach improves data protection by merging classical efficiency with PQC's quantum resilience and QKD's key security, offering a practical transition path for cryptographic systems. This research lays the foundation for future adoption of PQC in securing digital communication.
Authors:Ruwanga Konara, Kasun De Zoysa, Asanka Sayakkara
Abstract:
Recent years have seen many industrial implementations and much scholastic research, i.e., prototypes and theoretical frameworks, in Decentralized Identity Management Systems (DIDMS). It is safe to say that Attestation-Based Attribute-Based Decentralized IDM (ABABDIDM) has not received anywhere near the same level of attention in the literature as general Attribute-Based DIDMs (ABDIDM), i.e, decentralized Attribute-Based Access Control (ABAC). The use of decentralization, i.e., DIDM, is to improve upon the security and privacy-related issues of centralized Identity Management Systems (IDM) and Attribute-Based IDMs (ABIDM). And blockchain is the framework used for decentralization in all these schemes. Many DIDMs - even ABDIDMs - have been defined on popular blockchains such as Hyperledger, Ethereum, and Bitcoin. However, despite the characteristics of Ripple that makes it appealing for an ABIDM, there is a lack of research to develop an Identity Management System (IDMS) on Ripple in literature. We have attempted to conceptualize an ABABDIDM on Ripple.
Authors:Reynaldo Gil-Pons, Sjouke Mauw, Rolando Trujillo-Rasua
Abstract:
Software-based memory-erasure protocols are two-party communication protocols where a verifier instructs a computational device to erase its memory and send a proof of erasure. They aim at guaranteeing that low-cost IoT devices are free of malware by putting them back into a safe state without requiring secure hardware or physical manipulation of the device. Several software-based memory-erasure protocols have been introduced and theoretically analysed. Yet, many of them have not been tested for their feasibility, performance and security on real devices, which hinders their industry adoption. This article reports on the first empirical analysis of software-based memory-erasure protocols with respect to their security, erasure guarantees, and performance. The experimental setup consists of 3 modern IoT devices with different computational capabilities, 7 protocols, 6 hash-function implementations, and various performance and security criteria. Our results indicate that existing software-based memory-erasure protocols are feasible, although slow devices may take several seconds to erase their memory and generate a proof of erasure. We found that no protocol dominates across all empirical settings, defined by the computational power and memory size of the device, the network speed, and the required level of security. Interestingly, network speed and hidden constants within the protocol specification played a more prominent role in the performance of these protocols than anticipated based on the related literature. We provide an evaluation framework that, given a desired level of security, determines which protocols offer the best trade-off between performance and erasure guarantees.
Authors:Yunhao Zhang, Haobin Ni, Soumya Basu, Shir Cohen, Maofan Yin, Lorenzo Alvisi, Robbert van Renesse, Qi Chen, Lidong Zhou
Abstract:
The specification of state machine replication (SMR) has no requirement on the final total order of commands. In blockchains based on SMR, however, order matters, since different orders could provide their clients with different financial rewards. Ordered consensus augments the specification of SMR to include specific guarantees on such order, with a focus on limiting the influence of Byzantine nodes. Real-world ordering manipulations, however, can and do happen even without Byzantine replicas, typically because of factors, such as faster networks or closer proximity to the blockchain infrastructure, that give some clients an unfair advantage. To address this challenge, this paper proceeds to extend ordered consensus by requiring it to also support equal opportunity, a concrete notion of fairness, widely adopted in social sciences. Informally, equal opportunity requires that two candidates who, according to a set of criteria deemed to be relevant, are equally qualified for a position (in our case, a specific slot in the SMR total order), should have an equal chance of landing it. We show how randomness can be leveraged to keep bias in check, and, to this end, introduce the secret random oracle (SRO), a system component that generates randomness in a fault-tolerant manner. We describe two SRO designs based, respectively, on trusted hardware and threshold verifiable random functions, and instantiate them in Bercow, a new ordered consensus protocol that, by approximating equal opportunity up to within a configurable factor, can effectively mitigate well-known ordering attacks in SMR-based blockchains.
Authors:Amitabh Chakravorty, Jess Kropczynski, Nelly Elsayed
Abstract:
With the widespread adoption of cryptocurrencies, cryptojacking has become a significant security threat to crypto wallet users. This paper presents a front-end prototype of an AI-powered security dashboard, namely, CryptoGuard. Developed through a user-centered design process, the prototype was constructed as a high-fidelity, click-through model from Figma mockups to simulate key user interactions. It is designed to assist users in monitoring their login and transaction activity, identifying any suspicious behavior, and enabling them to take action directly within the wallet interface. The dashboard is designed for a general audience, prioritizing an intuitive user experience for non-technical individuals. Although its AI functionality is conceptual, the prototype demonstrates features like visual alerts and reporting. This work is positioned explicitly as a design concept, bridging cryptojacking detection research with human-centered interface design. This paper also demonstrates how usability heuristics can directly inform a tool's ability to support rapid and confident decision-making under real-world threats. This paper argues that practical security tools require not only robust backend functionality but also a user-centric design that communicates risk and empowers users to take meaningful action.
Authors:Zhiyu He, Maojiang Wang, Xinwen Gao, Yuchuan Luo, Lin Liu, Shaojing Fu
Abstract:
Secure inference enables privacy-preserving machine learning by leveraging cryptographic protocols that support computations on sensitive user data without exposing it. However, integrating cryptographic protocols with large language models (LLMs) presents significant challenges, as the inherent complexity of these protocols, together with LLMs' massive parameter scale and sophisticated architectures, severely limits practical usability. In this work, we propose ENSI, a novel non-interactive secure inference framework for LLMs, based on the principle of co-designing the cryptographic protocols and LLM architecture. ENSI employs an optimized encoding strategy that seamlessly integrates CKKS scheme with a lightweight LLM variant, BitNet, significantly reducing the computational complexity of encrypted matrix multiplications. In response to the prohibitive computational demands of softmax under homomorphic encryption (HE), we pioneer the integration of the sigmoid attention mechanism with HE as a seamless, retraining-free alternative. Furthermore, by embedding the Bootstrapping operation within the RMSNorm process, we efficiently refresh ciphertexts while markedly decreasing the frequency of costly bootstrapping invocations. Experimental evaluations demonstrate that ENSI achieves approximately an 8x acceleration in matrix multiplications and a 2.6x speedup in softmax inference on CPU compared to state-of-the-art method, with the proportion of bootstrapping is reduced to just 1%.
Authors:Diwen Xue, Armin Huremagic, Wayne Wang, Ram Sundara Raman, Roya Ensafi
Abstract:
Users around the world face escalating network interference such as censorship, throttling, and interception, largely driven by the commoditization and growing availability of Deep Packet Inspection (DPI) devices. Once reserved for a few well-resourced nation-state actors, the ability to interfere with traffic at scale is now within reach of nearly any network operator. Despite this proliferation, our understanding of DPIs and their deployments on the Internet remains limited -- being network intermediary leaves DPI unresponsive to conventional host-based scanning tools, and DPI vendors actively obscuring their products further complicates measurement efforts. In this work, we present a remote measurement framework, dMAP (DPI Mapper), that derives behavioral fingerprints for DPIs to differentiate and cluster these otherwise indistinguishable middleboxes at scale, as a first step toward active reconnaissance of DPIs on the Internet. Our key insight is that parsing and interpreting traffic as network intermediaries inherently involves ambiguities -- from under-specified protocol behaviors to differing RFC interpretations -- forcing DPI vendors into independent implementation choices that create measurable variance among DPIs. Based on differential fuzzing, dMAP systematically discovers, selects, and deploys specialized probes that translate DPI internal parsing behaviors into externally observable fingerprints. Applying dMAP to DPI deployments globally, we demonstrate its practical feasibility, showing that even a modest set of 20-40 discriminative probes reliably differentiates a wide range of DPI implementations, including major nation-state censorship infrastructures and commercial DPI products. We discuss how our fingerprinting methodology generalizes beyond censorship to other forms of targeted interference.
Authors:Ruiyao Liu, Chenxi Qiu
Abstract:
Metric Differential Privacy (mDP) extends the local differential privacy (LDP) framework to metric spaces, enabling more nuanced privacy protection for data such as geo-locations. However, existing mDP optimization methods, particularly those based on linear programming (LP), face scalability challenges due to the quadratic growth in decision variables. In this paper, we propose Perturbation via Anchor-based Distributed Approximation (PAnDA), a scalable two-phase framework for optimizing metric differential privacy (mDP). To reduce computational overhead, PAnDA allows each user to select a small set of anchor records, enabling the server to solve a compact linear program over a reduced domain. We introduce three anchor selection strategies, exponential decay (PAnDA-e), power-law decay (PAnDA-p), and logistic decay (PAnDA-l), and establish theoretical guarantees under a relaxed privacy notion called probabilistic mDP (PmDP). Experiments on real-world geo-location datasets demonstrate that PAnDA scales to secret domains with up to 5,000 records, two times larger than prior LP-based methods, while providing theoretical guarantees for both privacy and utility.
Authors:Ron F. Del Rosario, Klaudia Krawiecka, Christian Schroeder de Witt
Abstract:
As Large Language Model (LLM) agents become increasingly capable of automating complex, multi-step tasks, the need for robust, secure, and predictable architectural patterns is paramount. This paper provides a comprehensive guide to the ``Plan-then-Execute'' (P-t-E) pattern, an agentic design that separates strategic planning from tactical execution. We explore the foundational principles of P-t-E, detailing its core components - the Planner and the Executor - and its architectural advantages in predictability, cost-efficiency, and reasoning quality over reactive patterns like ReAct (Reason + Act). A central focus is placed on the security implications of this design, particularly its inherent resilience to indirect prompt injection attacks by establishing control-flow integrity. We argue that while P-t-E provides a strong foundation, a defense-in-depth strategy is necessary, and we detail essential complementary controls such as the Principle of Least Privilege, task-scoped tool access, and sandboxed code execution. To make these principles actionable, this guide provides detailed implementation blueprints and working code references for three leading agentic frameworks: LangChain (via LangGraph), CrewAI, and AutoGen. Each framework's approach to implementing the P-t-E pattern is analyzed, highlighting unique features like LangGraph's stateful graphs for re-planning, CrewAI's declarative tool scoping for security, and AutoGen's built-in Docker sandboxing. Finally, we discuss advanced patterns, including dynamic re-planning loops, parallel execution with Directed Acyclic Graphs (DAGs), and the critical role of Human-in-the-Loop (HITL) verification, to offer a complete strategic blueprint for architects, developers, and security engineers aiming to build production-grade, resilient, and trustworthy LLM agents.
Authors:Hossein Siadati, Haadi Jafarian, Sima Jafarikhah
Abstract:
Scammers are increasingly harnessing generative AI(GenAI) technologies to produce convincing phishing content at scale, amplifying financial fraud and undermining public trust. While conventional defenses, such as detection algorithms, user training, and reactive takedown efforts remain important, they often fall short in dismantling the infrastructure scammers depend on, including mule bank accounts and cryptocurrency wallets. To bridge this gap, a proactive and emerging strategy involves using conversational honeypots to engage scammers and extract actionable threat intelligence. This paper presents the first large-scale, real-world evaluation of a scambaiting system powered by large language models (LLMs). Over a five-month deployment, the system initiated over 2,600 engagements with actual scammers, resulting in a dataset of more than 18,700 messages. It achieved an Information Disclosure Rate (IDR) of approximately 32%, successfully extracting sensitive financial information such as mule accounts. Additionally, the system maintained a Human Acceptance Rate (HAR) of around 70%, indicating strong alignment between LLM-generated responses and human operator preferences. Alongside these successes, our analysis reveals key operational challenges. In particular, the system struggled with engagement takeoff: only 48.7% of scammers responded to the initial seed message sent by defenders. These findings highlight the need for further refinement and provide actionable insights for advancing the design of automated scambaiting systems.
Authors:Laurie Williams, Sammy Migues
Abstract:
Software supply chain attacks have increased exponentially since 2020. The primary attack vectors for supply chain attacks are through: (1) software components; (2) the build infrastructure; and (3) humans (a.k.a software practitioners). Software supply chain risk management frameworks provide a list of tasks that an organization can adopt to reduce software supply chain risk. Exhaustively adopting all the tasks of these frameworks is infeasible, necessitating the prioritized adoption of tasks. Software organizations can benefit from being guided in this prioritization by learning what tasks other teams have adopted. The goal of this study is to aid software development organizations in understanding the adoption of security tasks that reduce software supply chain risk through an interview study of software practitioners engaged in software supply chain risk management efforts. An interview study was conducted with 61 practitioners at nine software development organizations that have focused efforts on reducing software supply chain risk. The results of the interviews indicate that organizations had implemented the most adopted software tasks before the focus on software supply chain security. Therefore, their implementation in organizations is more mature. The tasks that mitigate the novel attack vectors through software components and the build infrastructure are in the early stages of adoption. Adoption of these tasks should be prioritized.
Authors:Bilal Hussain Abbasi, Yanjun Zhang, Leo Zhang, Shang Gao
Abstract:
Backdoor (trojan) attacks embed hidden, controllable behaviors into machine-learning models so that models behave normally on benign inputs but produce attacker-chosen outputs when a trigger is present. This survey reviews the rapidly growing literature on backdoor attacks and defenses in the computer-vision domain. We introduce a multi-dimensional taxonomy that organizes attacks and defenses by injection stage (dataset poisoning, model/parameter modification, inference-time injection), trigger type (patch, blended/frequency, semantic, transformation), labeling strategy (dirty-label vs. clean-label / feature-collision), representation stage (instance-specific, manifold/class-level, neuron/parameter hijacking, distributed encodings), and target task (classification, detection, segmentation, video, multimodal). For each axis we summarize representative methods, highlight evaluation practices, and discuss where defenses succeed or fail. For example, many classical sanitization and reverse-engineering tools are effective against reusable patch attacks but struggle with input-aware, sample-specific, or parameter-space backdoors and with transfer via compromised pre-trained encoders or hardware bit-flips. We synthesize trends, identify persistent gaps (supply-chain and hardware threats, certifiable defenses, cross-task benchmarks), and propose practical guidelines for threat-aware evaluation and layered defenses. This survey aims to orient researchers and practitioners to the current threat landscape and pressing research directions in secure computer vision.
Authors:Shakhzod Yuldoshkhujaev, Mijin Jeon, Doowon Kim, Nick Nikiforakis, Hyungjoon Koo
Abstract:
An advanced persistent threat (APT) refers to a covert, long-term cyberattack, typically conducted by state-sponsored actors, targeting critical sectors and often remaining undetected for long periods. In response, collective intelligence from around the globe collaborates to identify and trace surreptitious activities, generating substantial documentation on APT campaigns publicly available on the web. While prior works predominantly focus on specific aspects of APT cases, such as detection, evaluation, cyber threat intelligence, and dataset creation, limited attention has been devoted to revisiting and investigating these scattered dossiers in a longitudinal manner. The objective of our study is to fill the gap by offering a macro perspective, connecting key insights and global trends in past APT attacks. We systematically analyze six reliable sources-three focused on technical reports and another three on threat actors-examining 1,509 APT dossiers (24,215 pages) spanning 2014-2023, and identifying 603 unique APT groups worldwide. To efficiently unearth relevant information, we employ a hybrid methodology that combines rule-based information retrieval with large-language-model-based search techniques. Our longitudinal analysis reveals shifts in threat actor activities, global attack vectors, changes in targeted sectors, and relationships between cyberattacks and significant events such as elections or wars, which provide insights into historical patterns in APT evolution. Over the past decade, 154 countries have been affected, primarily using malicious documents and spear phishing as dominant initial infiltration vectors, with a noticeable decline in zero-day exploitation since 2016. Furthermore, we present our findings through interactive visualization tools, such as an APT map or flow diagram, to facilitate intuitive understanding of global patterns and trends in APT activities.
Authors:Bin Hu, Kunyang Huang, Daehan Kwak, Meng Xu, Kuan Huang
Abstract:
The rapid advancement of AI has enabled highly realistic speech synthesis and voice cloning, posing serious risks to voice authentication, smart assistants, and telecom security. While most prior work frames spoof detection as a binary task, real-world attacks often involve hybrid utterances that mix genuine and synthetic speech, making detection substantially more challenging. To address this gap, we introduce the Hybrid Spoofed Audio Dataset (HSAD), a benchmark containing 1,248 clean and 41,044 degraded utterances across four classes: human, cloned, zero-shot AI-generated, and hybrid audio. Each sample is annotated with spoofing method, speaker identity, and degradation metadata to enable fine-grained analysis. We evaluate six transformer-based models, including spectrogram encoders (MIT-AST, MattyB95-AST) and self-supervised waveform models (Wav2Vec2, HuBERT). Results reveal critical lessons: pretrained models overgeneralize and collapse under hybrid conditions; spoof-specific fine-tuning improves separability but struggles with unseen compositions; and dataset-specific adaptation on HSAD yields large performance gains (AST greater than 97 percent and F1 score is approximately 99 percent), though residual errors persist for complex hybrids. These findings demonstrate that fine-tuning alone is not sufficient-robust hybrid-aware benchmarks like HSAD are essential to expose calibration failures, model biases, and factors affecting spoof detection in adversarial environments. HSAD thus provides both a dataset and an analytic framework for building resilient and trustworthy voice authentication systems.
Authors:Pierre Briaud, Itai Dinur, Riddhi Ghosal, Aayush Jain, Paul Lou, Amit Sahai
Abstract:
In this work, we propose a new way to (non-interactively, verifiably) demonstrate quantum advantage by solving the average-case $\mathsf{NP}$ search problem of finding a solution to a system of (underdetermined) constant degree multivariate equations over the finite field $\mathbb{F}_2$ drawn from a specified distribution. In particular, for any $d \geq 2$, we design a distribution of degree up to $d$ polynomials $\{p_i(x_1,\ldots,x_n)\}_{i\in [m]}$ for $m 2$, it is classically hard to find one based on a thorough review of existing classical cryptanalysis. Our work thus posits that degree three functions are enough to instantiate the random oracle to obtain non-relativized quantum advantage. Our approach begins with the breakthrough Yamakawa-Zhandry (FOCS 2022) quantum algorithmic framework. In our work, we demonstrate that this quantum algorithmic framework extends to the setting of multivariate polynomial systems. Our key technical contribution is a new analysis on the Fourier spectra of distributions induced by a general family of distributions over $\mathbb{F}_2$ multivariate polynomials -- those that satisfy $2$-wise independence and shift-invariance. This family of distributions includes the distribution of uniform random degree at most $d$ polynomials for any constant $d \geq 2$. Our analysis opens up potentially new directions for quantum cryptanalysis of other multivariate systems.
Authors:Tomás González, Mateo Dulce-Rubio, Aaditya Ramdas, Mónica Ribero
Abstract:
We propose a practical sequential test for auditing differential privacy guarantees of black-box mechanisms. The test processes streams of mechanisms' outputs providing anytime-valid inference while controlling Type I error, overcoming the fixed sample size limitation of previous batch auditing methods. Experiments show this test detects violations with sample sizes that are orders of magnitude smaller than existing methods, reducing this number from 50K to a few hundred examples, across diverse realistic mechanisms. Notably, it identifies DP-SGD privacy violations in \textit{under} one training run, unlike prior methods needing full model training.
Authors:Youjia Zheng, Mohammad Zandsalimy, Shanu Sushmita
Abstract:
Large Language Models (LLMs) are increasingly vulnerable to a sophisticated form of adversarial prompting known as camouflaged jailbreaking. This method embeds malicious intent within seemingly benign language to evade existing safety mechanisms. Unlike overt attacks, these subtle prompts exploit contextual ambiguity and the flexible nature of language, posing significant challenges to current defense systems. This paper investigates the construction and impact of camouflaged jailbreak prompts, emphasizing their deceptive characteristics and the limitations of traditional keyword-based detection methods. We introduce a novel benchmark dataset, Camouflaged Jailbreak Prompts, containing 500 curated examples (400 harmful and 100 benign prompts) designed to rigorously stress-test LLM safety protocols. In addition, we propose a multi-faceted evaluation framework that measures harmfulness across seven dimensions: Safety Awareness, Technical Feasibility, Implementation Safeguards, Harmful Potential, Educational Value, Content Quality, and Compliance Score. Our findings reveal a stark contrast in LLM behavior: while models demonstrate high safety and content quality with benign inputs, they exhibit a significant decline in performance and safety when confronted with camouflaged jailbreak attempts. This disparity underscores a pervasive vulnerability, highlighting the urgent need for more nuanced and adaptive security strategies to ensure the responsible and robust deployment of LLMs in real-world applications.
Authors:Ikhlasse Badidi, Nouhaila El Khiyaoui, Aya Riany, Badr Ben Elallid, Amine Abouaomar
Abstract:
The integration of Large Language Models (LLMs) in 6G vehicular networks promises unprecedented advancements in intelligent transportation systems. However, offloading LLM computations from vehicles to edge infrastructure poses significant privacy risks, potentially exposing sensitive user data. This paper presents a novel privacy-preserving offloading framework for LLM-integrated vehicular networks. We introduce a hybrid approach combining federated learning (FL) and differential privacy (DP) techniques to protect user data while maintaining LLM performance. Our framework includes a privacy-aware task partitioning algorithm that optimizes the trade-off between local and edge computation, considering both privacy constraints and system efficiency. We also propose a secure communication protocol for transmitting model updates and aggregating results across the network. Experimental results demonstrate that our approach achieves 75\% global accuracy with only a 2-3\% reduction compared to non-privacy-preserving methods, while maintaining DP guarantees with an optimal privacy budget of $\varepsilon = 0.8$. The framework shows stable communication overhead of approximately 2.1MB per round with computation comprising over 90\% of total processing time, validating its efficiency for resource-constrained vehicular environments.
Authors:Abiodun Ganiyu, Dara Ron, Syed Rafiul Hussain, Vijay K Shah
Abstract:
The Y1 interface in O-RAN enables the sharing of RAN Analytics Information (RAI) between the near-RT RIC and authorized Y1 consumers, which may be internal applications within the operator's trusted domain or external systems accessing data through a secure exposure function. While this visibility enhances network optimization and enables advanced services, it also introduces a potential security risk -- a malicious or compromised Y1 consumer could misuse analytics to facilitate targeted interference. In this work, we demonstrate how an adversary can exploit the Y1 interface to launch selective jamming attacks by passively monitoring downlink metrics. We propose and evaluate two Y1-aided jamming strategies: a clustering-based jammer leveraging DBSCAN for traffic profiling and a threshold-based jammer. These are compared against two baselines strategies -- always-on jammer and random jammer -- on an over-the-air LTE/5G O-RAN testbed. Experimental results show that in unconstrained jamming budget scenarios, the threshold-based jammer can closely replicate the disruption caused by always-on jamming while reducing transmission time by 27\%. Under constrained jamming budgets, the clustering-based jammer proves most effective, causing up to an 18.1\% bitrate drop while remaining active only 25\% of the time. These findings reveal a critical trade-off between jamming stealthiness and efficiency, and illustrate how exposure of RAN analytics via the Y1 interface can enable highly targeted, low-overhead attacks, raising important security considerations for both civilian and mission-critical O-RAN deployments.
Authors:Richard Derbyshire, Diana Selck-Paulsson, Charl van der Walt, Joe Burton
Abstract:
Since 2022, hacktivist groups have escalated their tactics, expanding from distributed denial-of-service attacks and document leaks to include targeting operational technology (OT). By 2024, attacks on the OT of critical national infrastructure (CNI) had been linked to partisan hacktivist efforts in ongoing geopolitical conflicts, demonstrating a shift from protest to something more resembling cyber warfare. This escalation raises critical questions about the classification of these groups and the appropriate state response to their growing role in destabilizing international security. This paper examines the strategic motivations behind escalatory hacktivism, highlighting how states may tolerate, encourage, or leverage hacktivist groups as proxies in conflicts that blur the lines between activism, cybercrime, and state-sponsored operations. We introduce a novel method for interpreting hacktivists based on the impact of their actions, alignment to state ideology, and host state involvement, offering a structured approach to understanding the phenomenon. Finally, we assess policy and security implications, particularly for host and victim states, and propose strategies to address this evolving threat. By doing so, this paper contributes to international discussions on cyber security policy, governance, and the increasing intersection between non-state cyber actors and state interests.
Authors:VÃctor Duarte Melo, William J. Buchanan
Abstract:
Whilst many key exchange and digital signature systems still rely on NIST P-256 (secp256r1) and secp256k1, offering around 128-bit security, there is an increasing demand for transparent and reproducible curves at the 256-bit security level. Standard higher-security options include NIST P-521, Curve448, and Brainpool-P512. This paper presents ECCFROG522PP ("Presunto Powered"), a 522-bit prime-field elliptic curve that delivers security in the same classical approx 260-bit ballpark as NIST P-521, but with a fundamentally different design philosophy. All of the curve parameters are deterministically derived from a fixed public seed via BLAKE3, with zero hidden choices. The curve has prime order (cofactor = 1), a verified twist with a proven approx 505-bit prime factor, safe embedding degree (greater than or equal to 14), and passes anti-MOV checks up to k less than or equal to 200 and CM discriminant sanity up to 100k. Unlike prior opaque or ad-hoc constructions, ECCFROG522PP is fully reproducible: anyone can regenerate and verify it byte-for-byte using the published scripts. The intent is not to outperform NIST P-521 in raw speed, but to maximise trust, verifiability, and long-term auditability in a practical curve of equivalent security level
Authors:Rui Zhao, Muhammad Shoaib, Viet Tung Hoang, Wajih Ul Hassan
Abstract:
Existing tamper-evident logging systems suffer from high overhead and severe data loss in high-load settings, yet only provide coarse-grained tamper detection. Moreover, installing such systems requires recompiling kernel code. To address these challenges, we present Nitro, a high-performance, tamper-evident audit logging system that supports fine-grained detection of log tampering. Even better, our system avoids kernel recompilation by using the eBPF technology. To formally justify the security of Nitro, we provide a new definitional framework for logging systems, and give a practical cryptographic construction meeting this new goal. Unlike prior work that focus only on the cryptographic processing, we codesign the cryptographic part with the pre- and post-processing of the logs to exploit all system-level optimizations. Our evaluations demonstrate Nitro's superior performance, achieving 10X-25X improvements in high-stress conditions and 2X-10X in real-world scenarios while maintaining near-zero data loss. We also provide an advanced variant, Nitro-R that introduces in-kernel log reduction techniques to reduce runtime overhead even further.
Authors:Gustavo Banegas, Anaëlle Le Dévéhat, Benjamin Smith
Abstract:
Many signature applications-such as root certificates, secure software updates, and authentication protocols-involve long-lived public keys that are transferred or installed once and then used for many verifications. This key longevity makes post-quantum signature schemes with conservative assumptions (e.g., structure-free lattices) attractive for long-term security. But many such schemes, especially those with short signatures, suffer from extremely large public keys. Even in scenarios where bandwidth is not a major concern, large keys increase storage costs and slow down verification. We address this with a method to replace large public keys in GPV-style signatures with smaller, private verification keys. This significantly reduces verifier storage and runtime while preserving security. Applied to the conservative, short-signature schemes Wave and Squirrels, our method compresses Squirrels-I keys from 665 kB to 20.7 kB and Wave822 keys from 3.5 MB to 207.97 kB.
Authors:Kong Mun Yeen, Rafidah Md Noor, Wahidah Md Shah, Aslinda Hassan, Muhammad Umair Munir
Abstract:
This paper forecasts future Distributed Denial of Service (DDoS) attacks using deep learning models. Although several studies address forecasting DDoS attacks, they remain relatively limited compared to detection-focused research. By studying the current trends and forecasting based on newer and updated datasets, mitigation plans against the attacks can be planned and formulated. The methodology used in this research work conforms to the Cross Industry Standard Process for Data Mining (CRISP-DM) model.
Authors:Takao Murakami, Yuichi Sei, Reo Eriguchi
Abstract:
Shuffle DP (Differential Privacy) protocols provide high accuracy and privacy by introducing a shuffler who randomly shuffles data in a distributed system. However, most shuffle DP protocols are vulnerable to two attacks: collusion attacks by the data collector and users and data poisoning attacks. A recent study addresses this issue by introducing an augmented shuffle DP protocol, where users do not add noise and the shuffler performs random sampling and dummy data addition. However, it focuses on frequency estimation over categorical data with a small domain and cannot be applied to a large domain due to prohibitively high communication and computational costs. In this paper, we fill this gap by introducing a novel augmented shuffle DP protocol called the FME (Filtering-with-Multiple-Encryption) protocol. Our FME protocol uses a hash function to filter out unpopular items and then accurately calculates frequencies for popular items. To perform this within one round of interaction between users and the shuffler, our protocol carefully communicates within a system using multiple encryption. We also apply our FME protocol to more advanced KV (Key-Value) statistics estimation with an additional technique to reduce bias. For both categorical and KV data, we prove that our protocol provides computational DP, high robustness to the above two attacks, accuracy, and efficiency. We show the effectiveness of our proposals through comparisons with twelve existing protocols.
Authors:Kaitlyn Webb, Prottay Protivash, John Durrell, Daniell Toth, Aleksandra SlavkoviÄ, Daniel Kifer
Abstract:
Confidentiality for business data is an understudied area of disclosure avoidance, where legacy methods struggle to provide acceptable results. Modern formal privacy techniques designed for person-level data do not provide suitable confidentiality/utility trade-offs due to the highly skewed nature of business data and because extreme outlier records are often important contributors to query answers. In this paper, inspired by Gaussian Differential Privacy, we propose a novel confidentiality framework for business data with a focus on interpretability for policy makers. We propose two query-answering mechanisms and analyze new challenges that arise when noisy query answers are converted into confidentiality-preserving microdata. We evaluate our mechanisms on confidential Quarterly Census of Employment and Wages (QCEW) microdata and a public substitute dataset.
Authors:Hao Guo, Zhaoqian Liu, Liqiang Peng, Shuaishuai Li, Ximing Fu, Weiran Liu, Lin Qu
Abstract:
In two-party secret sharing scheme, values are typically encoded as unsigned integers $\mathsf{uint}(x)$, whereas real-world applications often require computations on signed real numbers $\mathsf{Real}(x)$. To enable secure evaluation of practical functions, it is essential to computing $\mathsf{Real}(x)$ from shared inputs, as protocols take shares as input. At USENIX'25, Guo et al. proposed an efficient method for computing signed integer values $\mathsf{int}(x)$ from shares, which can be extended to compute $\mathsf{Real}(x)$. However, their approach imposes a restrictive input constraint $|x| < \frac{L}{3}$ for $x \in \mathbb{Z}_L$, limiting its applicability in real-world scenarios. In this work, we significantly relax this constraint to $|x| < B$ for any $B \leq \frac{L}{2}$, where $B = \frac{L}{2}$ corresponding to the natural representable range in $x \in \mathbb{Z}_L$. This relaxes the restrictions and enables the computation of $\mathsf{Real}(x)$ with loose or no input constraints. Building upon this foundation, we present a generalized framework for designing secure protocols for a broad class of functions, including integer division ($\lfloor \frac{x}{d} \rfloor$), trigonometric ($\sin(x)$) and exponential ($e^{-x}$) functions. Our experimental evaluation demonstrates that the proposed protocols achieve both high efficiency and high accuracy. Notably, our protocol for evaluating $e^{-x}$ reduces communication costs to approximately 31% of those in SirNN (S&P 21) and Bolt (S&P 24), with runtime speedups of up to $5.53 \times$ and $3.09 \times$, respectively. In terms of accuracy, our protocol achieves a maximum ULP error of $1.435$, compared to $2.64$ for SirNN and $8.681$ for Bolt.
Authors:Vamsi Shankar Simhadri, Yichang Xiong, Habiba Farrukh, Xiaokuan Zhang
Abstract:
Virtual Reality (VR) technology is rapidly growing in recent years. VR devices such as Meta Quest 3 utilize numerous sensors to collect users' data to provide an immersive experience. Due to the extensive data collection and the immersive nature, the security of VR devices is paramount. Leading VR devices often adopt and customize Android systems, which makes them susceptible to both Android-based vulnerabilities and new issues introduced by VR-specific customizations (e.g., system services to support continuous head and hand tracking). While prior work has extensively examined the security properties of the Android software stack, how these security properties hold for VR systems remains unexplored. In this paper, we present the first comprehensive security analysis of VR firmware. We collect over 300 versions of VR firmware from two major vendors, Quest and Pico, and perform a longitudinal analysis across the kernel layer, the system binary and library layer, and the application layer. We have identified several security issues in these VR firmware, including missing kernel-level security features, insufficient binary hardening, inconsistent permission enforcement, and inadequate SELinux policy enforcement. Based on our findings, we synthesize recommendations for VR vendors to improve security and trust for VR devices. This paper will act as an important security resource for VR developers, users, and vendors, and will also direct future advancements in secure VR ecosystem.
Authors:Hamideh Haghiri, Rajesh Baidya, Stefan Dvoretskii, Klaus H. Maier-Hein, Marco Nolden
Abstract:
Ensuring the de-identification of medical imaging data is a critical step in enabling safe data sharing. This paper presents a hybrid de-identification framework designed to process Digital Imaging and Communications in Medicine (DICOM) files. Our framework adopts a modified, pre-built rule-based component, updated with The Cancer Imaging Archive (TCIA)'s best practices guidelines, as outlined in DICOM PS 3.15, for improved performance. It incorporates PaddleOCR, a robust Optical Character Recognition (OCR) system for extracting text from images, and RoBERTa, a fine-tuned transformer-based model for identifying and removing Personally Identifiable Information (PII) and Protected Health Information (PHI). Initially, the transformer-based model and the rule-based component were integrated to process for both structured data and free text. However, this coarse-grained approach did not yield optimal results. To improve performance, we refined our approach by applying the transformer model exclusively to free text, while structured data was handled only by rule-based methods. In this framework the DICOM validator dciodvfy was leveraged to ensure the integrity of DICOM files after the deID process. Through iterative refinement, including the incorporation of custom rules and private tag handling, the framework achieved a de-identification accuracy of 99.91% on the MIDI-B test dataset. The results demonstrate the effectiveness of combining rule-based compliance with AI-enabled adaptability in addressing the complex challenges of DICOM de-identification.
Authors:Ting-Chun Liu, Ching-Yu Hsu, Kuan-Yi Lee, Chi-An Fu, Hung-yi Lee
Abstract:
Prompt injection attacks pose a significant challenge to the safe deployment of Large Language Models (LLMs) in real-world applications. While prompt-based detection offers a lightweight and interpretable defense strategy, its effectiveness has been hindered by the need for manual prompt engineering. To address this issue, we propose AEGIS , an Automated co-Evolutionary framework for Guarding prompt Injections Schema. Both attack and defense prompts are iteratively optimized against each other using a gradient-like natural language prompt optimization technique. This framework enables both attackers and defenders to autonomously evolve via a Textual Gradient Optimization (TGO) module, leveraging feedback from an LLM-guided evaluation loop. We evaluate our system on a real-world assignment grading dataset of prompt injection attacks and demonstrate that our method consistently outperforms existing baselines, achieving superior robustness in both attack success and detection. Specifically, the attack success rate (ASR) reaches 1.0, representing an improvement of 0.26 over the baseline. For detection, the true positive rate (TPR) improves by 0.23 compared to the previous best work, reaching 0.84, and the true negative rate (TNR) remains comparable at 0.89. Ablation studies confirm the importance of co-evolution, gradient buffering, and multi-objective optimization. We also confirm that this framework is effective in different LLMs. Our results highlight the promise of adversarial training as a scalable and effective approach for guarding prompt injections.
Authors:VÃctor Mayoral-Vilches, Per Mannermaa Rynning
Abstract:
We demonstrate how AI-powered cybersecurity tools can be turned against themselves through prompt injection attacks. Prompt injection is reminiscent of cross-site scripting (XSS): malicious text is hidden within seemingly trusted content, and when the system processes it, that text is transformed into unintended instructions. When AI agents designed to find and exploit vulnerabilities interact with malicious web servers, carefully crafted reponses can hijack their execution flow, potentially granting attackers system access. We present proof-of-concept exploits against the Cybersecurity AI (CAI) framework and its CLI tool, and detail our mitigations against such attacks in a multi-layered defense implementation. Our findings indicate that prompt injection is a recurring and systemic issue in LLM-based architectures, one that will require dedicated work to address, much as the security community has had to do with XSS in traditional web applications.
Authors:Ziyue Wang, Liyi Zhou
Abstract:
Existing Android vulnerability detection tools overwhelm teams with thousands of low-signal warnings yet uncover few true positives. Analysts spend days triaging these results, creating a bottleneck in the security pipeline. Meanwhile, genuinely exploitable vulnerabilities often slip through, leaving opportunities open to malicious counterparts.
We introduce A2, a system that mirrors how security experts analyze and validate Android vulnerabilities through two complementary phases: (i) Agentic Vulnerability Discovery, which reasons about application security by combining semantic understanding with traditional security tools; and (ii) Agentic Vulnerability Validation, which systematically validates vulnerabilities across Android's multi-modal attack surface-UI interactions, inter-component communication, file system operations, and cryptographic computations.
On the Ghera benchmark (n=60), A2 achieves 78.3% coverage, surpassing state-of-the-art analyzers (e.g., APKHunt 30.0%). Rather than overwhelming analysts with thousands of warnings, A2 distills results into 82 speculative vulnerability findings, including 47 Ghera cases and 28 additional true positives. Crucially, A2 then generates working Proof-of-Concepts (PoCs) for 51 of these speculative findings, transforming them into validated vulnerability findings that provide direct, self-confirming evidence of exploitability.
In real-world evaluation on 169 production APKs, A2 uncovers 104 true-positive zero-day vulnerabilities. Among these, 57 (54.8%) are self-validated with automatically generated PoCs, including a medium-severity vulnerability in a widely used application with over 10 million installs.
Authors:Jukka Ruohonen, Jesper Løffler Nielsen, Jakub Skórczynski
Abstract:
The European Union (EU) has long favored a risk-based approach to regulation. Such an approach is also used in recent cyber security legislation enacted in the EU. Risks are also inherently related to compliance with the new legislation. Objective: The paper investigates how risks are framed in the EU's five core cyber security legislative acts, whether the framings indicate convergence or divergence between the acts and their risk concepts, and what qualifying words and terms are used when describing the legal notions of risks. Method : The paper's methodology is based on qualitative legal interpretation and taxonomy-building. Results: The five acts have an encompassing coverage of different cyber security risks, including but not limited to risks related to technical, organizational, and human security as well as those not originating from man-made actions. Both technical aspects and assets are used to frame the legal risk notions in many of the legislative acts. A threat-centric viewpoint is also present in one of the acts. Notable gaps are related to acceptable risks, non-probabilistic risks, and residual risks. Conclusion: The EU's new cyber security legislation has significantly extended the risk-based approach to regulations. At the same time, complexity and compliance burden have increased. With this point in mind, the paper concludes with a few practical takeaways about means to deal with compliance and research it.
Authors:Zijia Meng, Victor Feng
Abstract:
Digital payments play a pivotal role in the burgeoning digital economy. Moving forward, the enhancement of digital payment systems necessitates programmability, going beyond just efficiency and convenience, to meet the evolving needs and complexities. Smart contract platforms like Central Bank Digital Currency (CBDC) networks and blockchains support programmable digital payments. However, the prevailing paradigm of programming payment logics involves coding smart contracts with programming languages, leading to high costs and significant security challenges. A novel and versatile method for payment programming on DLTs was presented in this paper - transforming digital currencies into token streams, then pipelining smart contracts to authorize, aggregate, lock, direct, and dispatch these streams efficiently from source to target accounts. By utilizing a small set of configurable templates, a few specialized smart contracts could be generated, and support most of payment logics through configuring and composing them. This approach could substantially reduce the cost of payment programming and enhance security, self-enforcement, adaptability, and controllability, thus hold the potential to become an essential component in the infrastructure of digital economy.
Authors:Jaeman Son, Hyunsoo Kim
Abstract:
In Massively Multiplayer Online Role-Playing Games (MMORPGs), auto-leveling bots exploit automated programs to level up characters at scale, undermining gameplay balance and fairness. Detecting such bots is challenging, not only because they mimic human behavior, but also because punitive actions require explainable justification to avoid legal and user experience issues. In this paper, we present a novel framework for detecting auto-leveling bots by leveraging contrastive representation learning and clustering techniques in a fully unsupervised manner to identify groups of characters with similar level-up patterns. To ensure reliable decisions, we incorporate a Large Language Model (LLM) as an auxiliary reviewer to validate the clustered groups, effectively mimicking a secondary human judgment. We also introduce a growth curve-based visualization to assist both the LLM and human moderators in assessing leveling behavior. This collaborative approach improves the efficiency of bot detection workflows while maintaining explainability, thereby supporting scalable and accountable bot regulation in MMORPGs.
Authors:Desen Sun, Shuncheng Jie, Sihang Liu
Abstract:
Diffusion models are a powerful class of generative models that produce content, such as images, from user prompts, but they are computationally intensive. To mitigate this cost, recent academic and industry work has adopted approximate caching, which reuses intermediate states from similar prompts in a cache. While efficient, this optimization introduces new security risks by breaking isolation among users. This work aims to comprehensively assess new security vulnerabilities arising from approximate caching. First, we demonstrate a remote covert channel established with the cache, where a sender injects prompts with special keywords into the cache and a receiver can recover that even after days, to exchange information. Second, we introduce a prompt stealing attack using the cache, where an attacker can recover existing cached prompts based on cache hit prompts. Finally, we introduce a poisoning attack that embeds the attacker's logos into the previously stolen prompt, to render them in future user prompts that hit the cache. These attacks are all performed remotely through the serving system, which indicates severe security vulnerabilities in approximate caching.
Authors:Donglin Wang, Weiyun Liang, Chunyuan Chen, Jing Xu, Yulong Fu
Abstract:
As AI rapidly advances, the security risks posed by AI are becoming increasingly severe, especially in critical scenarios, including those posing existential risks. If AI becomes uncontrollable, manipulated, or actively evades safety mechanisms, it could trigger systemic disasters. Existing AI safety approaches-such as model enhancement, value alignment, and human intervention-suffer from fundamental, in-principle limitations when facing AI with extreme motivations and unlimited intelligence, and cannot guarantee security. To address this challenge, we propose a Governable AI (GAI) framework that shifts from traditional internal constraints to externally enforced structural compliance based on cryptographic mechanisms that are computationally infeasible to break, even for future AI, under the defined threat model and well-established cryptographic assumptions.The GAI framework is composed of a simple yet reliable, fully deterministic, powerful, flexible, and general-purpose rule enforcement module (REM); governance rules; and a governable secure super-platform (GSSP) that offers end-to-end protection against compromise or subversion by AI. The decoupling of the governance rules and the technical platform further enables a feasible and generalizable technical pathway for the safety governance of AI. REM enforces the bottom line defined by governance rules, while GSSP ensures non-bypassability, tamper-resistance, and unforgeability to eliminate all identified attack vectors. This paper also presents a rigorous formal proof of the security properties of this mechanism and demonstrates its effectiveness through a prototype implementation evaluated in representative high-stakes scenarios.
Authors:Kyler Katz, Sara Moshtari, Ibrahim Mujhid, Mehdi Mirakhorli, Derek Garcia
Abstract:
Sensitive Information Exposure (SIEx) vulnerabilities (CWE-200) remain a persistent and under-addressed threat across software systems, often leading to serious security breaches. Existing detection tools rarely target the diverse subcategories of CWE-200 or provide context-aware analysis of code-level data flows.
Aims: This paper aims to present SIExVulTS, a novel vulnerability detection system that integrates transformer-based models with static analysis to identify and verify sensitive information exposure in Java applications.
Method: SIExVulTS employs a three-stage architecture: (1) an Attack Surface Detection Engine that uses sentence embeddings to identify sensitive variables, strings, comments, and sinks; (2) an Exposure Analysis Engine that instantiates CodeQL queries aligned with the CWE-200 hierarchy; and (3) a Flow Verification Engine that leverages GraphCodeBERT to semantically validate source-to-sink flows. We evaluate SIExVulTS using three curated datasets, including real-world CVEs, a benchmark set of synthetic CWE-200 examples, and labeled flows from 31 open-source projects.
Results: The Attack Surface Detection Engine achieved an average F1 score greater than 93\%, the Exposure Analysis Engine achieved an F1 score of 85.71\%, and the Flow Verification Engine increased precision from 22.61\% to 87.23\%. Moreover, SIExVulTS successfully uncovered six previously unknown CVEs in major Apache projects.
Conclusions: The results demonstrate that SIExVulTS is effective and practical for improving software security against sensitive data exposure, addressing limitations of existing tools in detecting and verifying CWE-200 vulnerabilities.
Authors:Neil Kale, Chen Bo Calvin Zhang, Kevin Zhu, Ankit Aich, Paula Rodriguez, Scale Red Team, Christina Q. Knight, Zifan Wang
Abstract:
We stress test monitoring systems for detecting covert misbehavior in autonomous LLM agents (e.g., secretly sharing private information). To this end, we systematize a monitor red teaming (MRT) workflow that incorporates: (1) varying levels of agent and monitor situational awareness; (2) distinct adversarial strategies to evade the monitor, such as prompt injection; and (3) two datasets and environments -- SHADE-Arena for tool-calling agents and our new CUA-SHADE-Arena, which extends TheAgentCompany, for computer-use agents. We run MRT on existing LLM monitor scaffoldings, which orchestrate LLMs and parse agent trajectories, alongside a new hybrid hierarchical-sequential scaffolding proposed in this work. Our empirical results yield three key findings. First, agent awareness dominates monitor awareness: an agent's knowledge that it is being monitored substantially degrades the monitor's reliability. On the contrary, providing the monitor with more information about the agent is less helpful than expected. Second, monitor scaffolding matters more than monitor awareness: the hybrid scaffolding consistently outperforms baseline monitor scaffolding, and can enable weaker models to reliably monitor stronger agents -- a weak-to-strong scaling effect. Third, in a human-in-the-loop setting where humans discuss with the LLM monitor to get an updated judgment for the agent's behavior, targeted human oversight is most effective; escalating only pre-flagged cases to human reviewers improved the TPR by approximately 15% at FPR = 0.01. Our work establishes a standard workflow for MRT, highlighting the lack of adversarial robustness for LLMs and humans when monitoring and detecting agent misbehavior. We release code, data, and logs to spur further research.
Authors:Zhan Shi, Yefeng Yuan, Yuhong Liu, Liang Cheng, Yi Fang
Abstract:
The performance of modern machine learning systems depends on access to large, high-quality datasets, often sourced from user-generated content or proprietary, domain-specific corpora. However, these rich datasets inherently contain sensitive personal information, raising significant concerns about privacy, data security, and compliance with regulatory frameworks. While conventional anonymization techniques can remove explicit identifiers, such removal may result in performance drop in downstream machine learning tasks. More importantly, simple anonymization may not be effective against inference attacks that exploit implicit signals such as writing style, topical focus, or demographic cues, highlighting the need for more robust privacy safeguards during model training. To address the challenging issue of balancing user privacy and data utility, we propose a reinforcement learning framework that fine-tunes a large language model (LLM) using a composite reward function that jointly optimizes for explicit and implicit privacy, semantic fidelity, and output diversity. To effectively capture population level regularities, the privacy reward combines semantic cues with structural patterns derived from a minimum spanning tree (MST) over latent representations. By modeling these privacy-sensitive signals in their distributional context, the proposed approach guides the model to generate synthetic rewrites that preserve utility while mitigating privacy risks. Empirical results show that the proposed method significantly enhances author obfuscation and privacy metrics without degrading semantic quality, providing a scalable and model-agnostic solution for privacy preserving data generation in the era of large language models.
Authors:Avishag Shapira, Simon Shigol, Asaf Shabtai
Abstract:
The widespread adoption of machine learning (ML) systems increased attention to their security and emergence of adversarial machine learning (AML) techniques that exploit fundamental vulnerabilities in ML systems, creating an urgent need for comprehensive risk assessment for ML-based systems. While traditional risk assessment frameworks evaluate conventional cybersecurity risks, they lack ability to address unique challenges posed by AML threats. Existing AML threat evaluation approaches focus primarily on technical attack robustness, overlooking crucial real-world factors like deployment environments, system dependencies, and attack feasibility. Attempts at comprehensive AML risk assessment have been limited to domain-specific solutions, preventing application across diverse systems. Addressing these limitations, we present FRAME, the first comprehensive and automated framework for assessing AML risks across diverse ML-based systems. FRAME includes a novel risk assessment method that quantifies AML risks by systematically evaluating three key dimensions: target system's deployment environment, characteristics of diverse AML techniques, and empirical insights from prior research. FRAME incorporates a feasibility scoring mechanism and LLM-based customization for system-specific assessments. Additionally, we developed a comprehensive structured dataset of AML attacks enabling context-aware risk assessment. From an engineering application perspective, FRAME delivers actionable results designed for direct use by system owners with only technical knowledge of their systems, without expertise in AML. We validated it across six diverse real-world applications. Our evaluation demonstrated exceptional accuracy and strong alignment with analysis by AML experts. FRAME enables organizations to prioritize AML risks, supporting secure AI deployment in real-world environments.
Authors:Shir Bernstein, David Beste, Daniel Ayzenshteyn, Lea Schonherr, Yisroel Mirsky
Abstract:
Large Language Models (LLMs) are increasingly trusted to perform automated code review and static analysis at scale, supporting tasks such as vulnerability detection, summarization, and refactoring. In this paper, we identify and exploit a critical vulnerability in LLM-based code analysis: an abstraction bias that causes models to overgeneralize familiar programming patterns and overlook small, meaningful bugs. Adversaries can exploit this blind spot to hijack the control flow of the LLM's interpretation with minimal edits and without affecting actual runtime behavior. We refer to this attack as a Familiar Pattern Attack (FPA).
We develop a fully automated, black-box algorithm that discovers and injects FPAs into target code. Our evaluation shows that FPAs are not only effective, but also transferable across models (GPT-4o, Claude 3.5, Gemini 2.0) and universal across programming languages (Python, C, Rust, Go). Moreover, FPAs remain effective even when models are explicitly warned about the attack via robust system prompts. Finally, we explore positive, defensive uses of FPAs and discuss their broader implications for the reliability and safety of code-oriented LLMs.
Authors:Shu-Jie Cao, Dongning Guo
Abstract:
This paper studies proof-of-work Nakamoto consensus protocols under bounded network delays, settling two long-standing questions in blockchain security: What is the most effective attack on block safety under a given block confirmation latency? And what is the resulting probability of safety violation? A Markov decision process (MDP) framework is introduced to precisely characterize the system state (including the blocktree and timings of all blocks mined), the adversary's potential actions, and the state transitions due to the adversarial action and the random block arrival processes. An optimal attack, called bait-and-switch, is proposed and proved to maximize the adversary's chance of violating block safety by "beating Nakamoto in the race". The exact probability of this violation is calculated for any given confirmation depth using Markov chain analysis, offering fresh insights into the interplay of network delay, confirmation rules, and blockchain security.
Authors:Yu Yang, Zhenyuan Li, Xiandong Ran, Jiahao Liu, Jiahui Wang, Bo Yu, Shouling Ji
Abstract:
Mobile application marketplaces are responsible for vetting apps to identify and mitigate security risks. Current vetting processes are labor-intensive, relying on manual analysis by security professionals aided by semi-automated tools. To address this inefficiency, we propose Mars, a system that leverages Large Language Models (LLMs) for automated risk identification and profiling. Mars is designed to concurrently analyze multiple applications across diverse risk categories with minimal human intervention. To enhance analytical precision and operational efficiency, Mars leverages a pre-constructed risk identification tree to extract relevant indicators from high-dimensional application features. This initial step filters the data, reducing the input volume for the LLM and mitigating the potential for model hallucination induced by irrelevant features. The extracted indicators are then subjected to LLM analysis for final risk determination. Furthermore, Mars automatically generates a comprehensive evidence chain for each assessment, documenting the analytical process to provide transparent justification. These chains are designed to facilitate subsequent manual review and to inform enforcement decisions, such as application delisting. The performance of Mars was evaluated on a real-world dataset from a partner Android marketplace. The results demonstrate that Mars attained an F1-score of 0.838 in risk identification and an F1-score of 0.934 in evidence retrieval. To assess its practical applicability, a user study involving 20 expert analysts was conducted, which indicated that Mars yielded a substantial efficiency gain, ranging from 60% to 90%, over conventional manual analysis.
Authors:Zhiqiang Wang, Yichao Gao, Yanting Wang, Suyuan Liu, Haifeng Sun, Haoran Cheng, Guanquan Shi, Haohua Du, Xiangyang Li
Abstract:
By providing a standardized interface for LLM agents to interact with external tools, the Model Context Protocol (MCP) is quickly becoming a cornerstone of the modern autonomous agent ecosystem. However, it creates novel attack surfaces due to untrusted external tools. While prior work has focused on attacks injected through external tool outputs, we investigate a more fundamental vulnerability: Tool Poisoning, where malicious instructions are embedded within a tool's metadata without execution. To date, this threat has been primarily demonstrated through isolated cases, lacking a systematic, large-scale evaluation.
We introduce MCPTox, the first benchmark to systematically evaluate agent robustness against Tool Poisoning in realistic MCP settings. MCPTox is constructed upon 45 live, real-world MCP servers and 353 authentic tools. To achieve this, we design three distinct attack templates to generate a comprehensive suite of 1312 malicious test cases by few-shot learning, covering 10 categories of potential risks. Our evaluation on 20 prominent LLM agents setting reveals a widespread vulnerability to Tool Poisoning, with o1-mini, achieving an attack success rate of 72.8\%. We find that more capable models are often more susceptible, as the attack exploits their superior instruction-following abilities. Finally, the failure case analysis reveals that agents rarely refuse these attacks, with the highest refused rate (Claude-3.7-Sonnet) less than 3\%, demonstrating that existing safety alignment is ineffective against malicious actions that use legitimate tools for unauthorized operation. Our findings create a crucial empirical baseline for understanding and mitigating this widespread threat, and we release MCPTox for the development of verifiably safer AI agents. Our dataset is available at an anonymized repository: \textit{https://anonymous.4open.science/r/AAAI26-7C02}.
Authors:Farid Zaredar, Morteza Amini
Abstract:
The integration of information and communication technology (ICT) with traditional power grids has led to the emergence of smart grids. Advanced metering infrastructure (AMI) plays a crucial role in smart grids by facilitating two-way communication between smart meters and the utility provider. This bidirectional communication allows intelligent meters to report fine-grained consumption data at predefined intervals, enabling accurate billing, efficient grid monitoring and management, and rapid outage detection. However, the collection of detailed consumption data can inadvertently disclose consumers' daily activities, raising privacy concerns and potentially leading to privacy violations. To address these issues and preserve individuals' privacy, we propose a lightweight privacy-preserving smart metering protocol specifically designed to support real-time tariff billing service with dynamic policy adjustment. Our scheme employs an efficient data perturbation technique to obscure precise energy usage data from internal adversaries, including the intermediary gateways and the utility provider. Subsequently, we validate the efficiency and security of our protocol through comprehensive performance and privacy evaluations. We examined the computational, memory, and communication overhead of the proposed scheme. The execution time of our secure and privacy-aware billing system is approximately 3.94540 seconds for a complete year. Furthermore, we employed the Jensen-Shannon divergence as a privacy metric to demonstrate that our protocol can effectively safeguard users' privacy by increasing the noise scale.
Authors:Farid Zaredar, Morteza Amini
Abstract:
Modern grids have adopted advanced metering infrastructure (AMI) to facilitate bidirectional communication between smart meters and control centers. This enables smart meters to report consumption values at predefined intervals to utility providers for purposes including demand balancing, load forecasting, dynamic billing, and operational efficiency. Compared to traditional power grids, smart grids offer advantages such as enhanced reliability, improved energy efficiency, and increased security. However, utility providers can compromise user privacy by analyzing fine-grained readings and extracting individuals' daily activities from this time-series data. To address this concern, we propose a collusion-resistant, privacy-preserving aggregation protocol for smart metering in operational services. Our protocol ensures privacy by leveraging techniques such as partially additive homomorphic encryption, aggregation, data perturbation, and data minimization. The scheme aggregates perturbed readings using the additive homomorphic property of the Paillier cryptosystem to provide results for multiple operational purposes. We evaluate the protocol in terms of both performance and privacy. Computational, memory, and communication overhead were examined. The total execution time with 1024-bit key size is about 2.21 seconds. We also evaluated privacy through the normalized conditional entropy (NCE) metric. Higher NCE values, closer to 1, indicate stronger privacy. By increasing noise scale, the NCE value rises, showing perturbed values retain minimal information about the original, thereby reducing risks. Overall, evaluation demonstrates the protocol's efficiency while employing various privacy-preserving techniques.
Authors:Farid Zaredar, Morteza Amini
Abstract:
The emergence of smart grids and advanced metering infrastructure (AMI) has revolutionized energy management. Unlike traditional power grids, smart grids benefit from two-way communication through AMI, which surpasses earlier automated meter reading (AMR). AMI enables diverse demand- and supply-side utilities such as accurate billing, outage detection, real-time grid control, load forecasting, and value-added services. Smart meters play a key role by delivering consumption values at predefined intervals to the utility provider (UP). However, such reports may raise privacy concerns, as adversaries can infer lifestyle patterns, political orientations, and the types of electrical devices in a household, or even sell the data to third parties (TP) such as insurers. In this paper, we propose a lightweight, privacy-preserving smart metering protocol for incentive-based value-added services. The scheme employs local differential privacy, hash chains, blind digital signatures, pseudonyms, temporal aggregation, and anonymous overlay networks to report coarse-grained values with adjustable granularity to the UP. This protects consumers' privacy while preserving data utility. The scheme prevents identity disclosure while enabling automatic token redemption. From a performance perspective, our results show that with a 1024-bit RSA key, a 7-day duration, and four reports per day, our protocol runs in approximately 0.51s and consumes about 4.5 MB of memory. From a privacy perspective, the protocol resists semi-trusted and untrusted adversaries.
Authors:Wouter Legiest, Jan-Pieter D'Anvers, Bojan Spasic, Nam-Luc Tran, Ingrid Verbauwhede
Abstract:
This paper presents a novel approach to calculating the Levenshtein (edit) distance within the framework of Fully Homomorphic Encryption (FHE), specifically targeting third-generation schemes like TFHE. Edit distance computations are essential in applications across finance and genomics, such as DNA sequence alignment. We introduce an optimised algorithm that significantly reduces the cost of edit distance calculations called Leuvenshtein. This algorithm specifically reduces the number of programmable bootstraps (PBS) needed per cell of the calculation, lowering it from approximately 94 operations -- required by the conventional Wagner-Fisher algorithm -- to just 1. Additionally, we propose an efficient method for performing equality checks on characters, reducing ASCII character comparisons to only 2 PBS operations. Finally, we explore the potential for further performance improvements by utilising preprocessing when one of the input strings is unencrypted. Our Leuvenshtein achieves up to $278\times$ faster performance compared to the best available TFHE implementation and up to $39\times$ faster than an optimised implementation of the Wagner-Fisher algorithm. Moreover, when offline preprocessing is possible due to the presence of one unencrypted input on the server side, an additional $3\times$ speedup can be achieved.
Authors:Stefan Lenz, David Schachtschneider, Simon Jonas, Liam Tirpitz, Sandra Geisler, Martin Henze
Abstract:
While the digitization of industrial factories provides tremendous improvements for the production of goods, it also renders such systems vulnerable to serious cyber-attacks. To research, test, and validate security measures protecting industrial networks against such cyber-attacks, the security community relies on testbeds to simulate industrial systems, as utilizing live systems endangers costly components or even human life. However, existing testbeds focus on individual parts of typically complex production lines in industrial factories. Consequently, the impact of cyber-attacks on industrial networks as well as the effectiveness of countermeasures cannot be evaluated in an end-to-end manner. To address this issue and facilitate research on novel security mechanisms, we present CoFacS, the first COmplete FACtory Simulation that replicates an entire production line and affords the integration of real-life industrial applications. To showcase that CoFacS accurately captures real-world behavior, we validate it against a physical model factory widely used in security research. We show that CoFacS has a maximum deviation of 0.11% to the physical reference, which enables us to study the impact of physical attacks or network-based cyber-attacks. Moreover, we highlight how CoFacS enables security research through two cases studies surrounding attack detection and the resilience of 5G-based industrial communication against jamming.
Authors:Kim Hammar, Tao Li
Abstract:
Effective responses to cyberattacks require fast decisions, even when information about the attack is incomplete or inaccurate. However, most decision-support frameworks for incident response rely on a detailed system model that describes the incident, which restricts their practical utility. In this paper, we address this limitation and present an online method for incident response planning under model misspecification, which we call MOBAL: Misspecified Online Bayesian Learning. MOBAL iteratively refines a conjecture about the model through Bayesian learning as new information becomes available, which facilitates model adaptation as the incident unfolds. To determine effective responses online, we quantize the conjectured model into a finite Markov model, which enables efficient response planning through dynamic programming. We prove that Bayesian learning is asymptotically consistent with respect to the information feedback. Additionally, we establish bounds on misspecification and quantization errors. Experiments on the CAGE-2 benchmark show that MOBAL outperforms the state of the art in terms of adaptability and robustness to model misspecification.
Authors:Eduardo Brito, Fernando Castillo, Liina Kamm, Amnir Hadachi, Ulrich Norbisrath
Abstract:
Digital societies increasingly rely on trustworthy proofs of physical presence for services such as supply-chain tracking, e-voting, ride-sharing, and location-based rewards. Yet, traditional localization methods often lack cryptographic guarantees of where and when an entity was present, leaving them vulnerable to spoofing, replay, or collusion attacks. In response, research on Proof-of-Location (PoL) has emerged, with recent approaches combining distance bounding, distributed consensus, and privacy-enhancing techniques to enable verifiable, tamper-resistant location claims.
As the design space for PoL systems grows in complexity, this paper provides a unified framework to help practitioners navigate diverse application needs. We first propose a taxonomy identifying four core domains: (1) cryptographic guarantees, (2) spatio-temporal synchronization, (3) trust and witness models, and (4) interaction and overhead. Building on this, we introduce a methodology to map application-specific requirements onto appropriate PoL architectures. We illustrate this process through three use cases (retail e-coupons, supply chain auditing, and physical e-voting), each showing how different constraints shape protocol choices. Overall, this work offers a structured approach to building secure, scalable, and interoperable PoL systems.
Authors:Eric Cornelissen, Musard Balliu
Abstract:
The software supply chain is an increasingly common attack vector for malicious actors. The Node.js ecosystem has been subject to a wide array of attacks, likely due to its size and prevalence. To counter such attacks, the research community and practitioners have proposed a range of static and dynamic mechanisms, including process- and language-level sandboxing, permission systems, and taint tracking. Drawing on valuable insight from these works, this paper studies a runtime protection mechanism for (the supply chain of) Node.js applications with the ambitious goals of compatibility, automation, minimal overhead, and policy conciseness. Specifically, we design, implement and evaluate NodeShield, a protection mechanism for Node.js that enforces an application's dependency hierarchy and controls access to system resources at runtime. We leverage the up-and-coming SBOM standard as the source of truth for the dependency hierarchy of the application, thus preventing components from stealthily abusing undeclared components. We propose to enhance the SBOM with a notion of capabilities that represents a set of related system resources a component may access. Our proposed SBOM extension, the Capability Bill of Materials or CBOM, records the required capabilities of each component, providing valuable insight into the potential privileged behavior. NodeShield enforces the SBOM and CBOM at runtime via code outlining (as opposed to inlining) with no modifications to the original code or Node.js runtime, thus preventing unexpected, potentially malicious behavior. Our evaluation shows that NodeShield can prevent over 98% out of 67 known supply chain attacks while incurring minimal overhead on servers at less than 1ms per request. We achieve this while maintaining broad compatibility with vanilla Node.js and a concise policy language that consists of at most 7 entries per dependency.
Authors:VÃctor Mayoral-Vilches, Jasmin Wachter, Cristóbal R. J. Veas Chavez, Cathrin Schachner, Luis Javier Navarrete-Lozano, MarÃa Sanz-Gómez
Abstract:
This work introduces CAI Fluency, an an educational platform of the Cybersecurity AI (CAI) framework dedicated to democratizing the knowledge and application of cybersecurity AI tools in the global security community. The main objective of the CAI framework is to accelerate the widespread adoption and effective use of artificial intelligence-based cybersecurity solutions, pathing the way to vibe-hacking, the cybersecurity analogon to vibe-coding.
CAI Fluency builds upon the Framework for AI Fluency, adapting its three modalities of human-AI interaction and four core competencies specifically for cybersecurity applications. This theoretical foundation ensures that practitioners develop not just technical skills, but also the critical thinking and ethical awareness necessary for responsible AI use in security contexts.
This technical report serves as a white-paper, as well as detailed educational and practical guide that helps users understand the principles behind the CAI framework, and educates them how to apply this knowledge in their projects and real-world security contexts.
Authors:Tadeu Freitas, Carlos Novo, Inês Dutra, João Soares, Manuel Correia, Benham Shariati, Rolando Martins
Abstract:
Intrusion Tolerant Systems (ITSs) have become increasingly critical due to the rise of multi-domain adversaries exploiting diverse attack surfaces. ITS architectures aim to tolerate intrusions, ensuring system compromise is prevented or mitigated even with adversary presence. Existing ITS solutions often employ Risk Managers leveraging public security intelligence to adjust system defenses dynamically against emerging threats. However, these approaches rely heavily on databases like NVD and ExploitDB, which require manual analysis for newly discovered vulnerabilities. This dependency limits the system's responsiveness to rapidly evolving threats. HAL 9000, an ITS Risk Manager introduced in our prior work, addressed these challenges through machine learning. By analyzing descriptions of known vulnerabilities, HAL 9000 predicts and assesses new vulnerabilities automatically. To calculate the risk of a system, it also incorporates the Exploitability Probability Scoring system to estimate the likelihood of exploitation within 30 days, enhancing proactive defense capabilities.
Despite its success, HAL 9000's reliance on NVD and ExploitDB knowledge is a limitation, considering the availability of other sources of information. This extended work introduces a custom-built scraper that continuously mines diverse threat sources, including security advisories, research forums, and real-time exploit proofs-of-concept. This significantly expands HAL 9000's intelligence base, enabling earlier detection and assessment of unverified vulnerabilities. Our evaluation demonstrates that integrating scraper-derived intelligence with HAL 9000's risk management framework substantially improves its ability to address emerging threats. This paper details the scraper's integration into the architecture, its role in providing additional information on new threats, and the effects on HAL 9000's management.
Authors:Zefang Liu, Arman Anwar
Abstract:
Incident response (IR) requires fast, coordinated, and well-informed decision-making to contain and mitigate cyber threats. While large language models (LLMs) have shown promise as autonomous agents in simulated IR settings, their reasoning is often limited by a lack of access to external knowledge. In this work, we present AutoBnB-RAG, an extension of the AutoBnB framework that incorporates retrieval-augmented generation (RAG) into multi-agent incident response simulations. Built on the Backdoors & Breaches (B&B) tabletop game environment, AutoBnB-RAG enables agents to issue retrieval queries and incorporate external evidence during collaborative investigations. We introduce two retrieval settings: one grounded in curated technical documentation (RAG-Wiki), and another using narrative-style incident reports (RAG-News). We evaluate performance across eight team structures, including newly introduced argumentative configurations designed to promote critical reasoning. To validate practical utility, we also simulate real-world cyber incidents based on public breach reports, demonstrating AutoBnB-RAG's ability to reconstruct complex multi-stage attacks. Our results show that retrieval augmentation improves decision quality and success rates across diverse organizational models. This work demonstrates the value of integrating retrieval mechanisms into LLM-based multi-agent systems for cybersecurity decision-making.
Authors:Xiang Long, Yingjie Xia, Xiyuan Chen, Li Kuang
Abstract:
Timely detection of hardware vulnerabilities during the early design stage is critical for reducing remediation costs. Existing early detection techniques often require specialized security expertise, limiting their usability. Recent efforts have explored the use of large language models (LLMs) for Verilog vulnerability detection. However, LLMs struggle to capture the structure in Verilog code, resulting in inconsistent detection results. To this end, we propose VerilogLAVD, the first LLM-aided graph traversal rule generation approach for Verilog vulnerability detection. Our approach introduces the Verilog Property Graph (VeriPG), a unified representation of Verilog code. It combines syntactic features extracted from the abstract syntax tree (AST) with semantic information derived from control flow and data dependency graphs. We leverage LLMs to generate VeriPG-based detection rules from Common Weakness Enumeration (CWE) descriptions. These rules guide the rule executor that traversal VeriPG for potential vulnerabilities. To evaluate VerilogLAVD, we build a dataset collected from open-source repositories and synthesized data. In our empirical evaluation on 77 Verilog designs encompassing 12 CWE types, VerilogLAVD achieves an F1-score of 0.54. Compared to the LLM-only and LLM with external knowledge baselines, VerilogLAVD improves F1-score by 0.31 and 0.27, respectively.
Authors:Ziteng Hu, Yingjie Xia, Xiyuan Chen, Li Kuang
Abstract:
Finite State Machines (FSMs) play a critical role in implementing control logic for Systems-on-Chip (SoC). Traditionally, FSMs are implemented by hardware engineers through Verilog coding, which is often tedious and time-consuming. Recently, with the remarkable progress of Large Language Models (LLMs) in code generation, LLMs have been increasingly explored for automating Verilog code generation. However, LLM-generated Verilog code often suffers from security vulnerabilities, which is particularly concerning for security-sensitive FSM implementations. To address this issue, we propose SecFSM, a novel method that leverages a security-oriented knowledge graph to guide LLMs in generating more secure Verilog code. Specifically, we first construct a FSM Security Knowledge Graph (FSKG) as an external aid to LLMs. Subsequently, we analyze users' requirements to identify vulnerabilities and get a list of vulnerabilities in the requirements. Then, we retrieve knowledge from FSKG based on the vulnerabilities list. Finally, we construct security prompts based on the security knowledge for Verilog code generation. To evaluate SecFSM, we build a dedicated dataset collected from academic datasets, artificial datasets, papers, and industrial cases. Extensive experiments demonstrate that SecFSM outperforms state-of-the-art baselines. In particular, on a benchmark of 25 security test cases evaluated by DeepSeek-R1, SecFSM achieves an outstanding pass rate of 21/25.
Authors:Zilong Lin, Zichuan Li, Xiaojing Liao, XiaoFeng Wang
Abstract:
The advancement of AI technologies, particularly Large Language Models (LLMs), has transformed computing while introducing new security and privacy risks. Prior research shows that cybercriminals are increasingly leveraging uncensored LLMs (ULLMs) as backends for malicious services. Understanding these ULLMs has been hindered by the challenge of identifying them among the vast number of open-source LLMs hosted on platforms like Hugging Face. In this paper, we present the first systematic study of ULLMs, overcoming this challenge by modeling relationships among open-source LLMs and between them and related data, such as fine-tuning, merging, compressing models, and using or generating datasets with harmful content. Representing these connections as a knowledge graph, we applied graph-based deep learning to discover over 11,000 ULLMs from a small set of labeled examples and uncensored datasets.
A closer analysis of these ULLMs reveals their alarming scale and usage. Some have been downloaded over a million times, with one over 19 million installs. These models -- created through fine-tuning, merging, or compression of other models -- are capable of generating harmful content, including hate speech, violence, erotic material, and malicious code. Evidence shows their integration into hundreds of malicious applications offering services like erotic role-play, child pornography, malicious code generation, and more. In addition, underground forums reveal criminals sharing techniques and scripts to build cheap alternatives to commercial malicious LLMs. These findings highlight the widespread abuse of LLM technology and the urgent need for effective countermeasures against this growing threat.
Authors:Shixuan Guan, Kai Li
Abstract:
Blockchain address poisoning is an emerging phishing attack that crafts "similar-looking" transfer records in the victim's transaction history, which aims to deceive victims and lure them into mistakenly transferring funds to the attacker. Recent works have shown that millions of Ethereum users were targeted and lost over 100 million US dollars.
Ethereum crypto wallets, serving users in browsing transaction history and initiating transactions to transfer funds, play a central role in deploying countermeasures to mitigate the address poisoning attack. However, whether they have done so remains an open question. To fill the research void, in this paper, we design experiments to simulate address poisoning attacks and systematically evaluate the usability and security of 53 popular Ethereum crypto wallets. Our evaluation shows that there exist communication failures between 12 wallets and their transaction activity provider, which renders them unable to download the users' transaction history. Besides, our evaluation also shows that 16 wallets pose a high risk to their users due to displaying fake token phishing transfers. Moreover, our further analysis suggests that most wallets rely on transaction activity providers to filter out phishing transfers. However, their phishing detection capability varies. Finally, we found that only three wallets throw an explicit warning message when users attempt to transfer to the phishing address, implying a significant gap within the broader Ethereum crypto wallet community in protecting users from address poisoning attacks.
Overall, our work shows that more efforts are needed by the Ethereum crypto wallet developer community to achieve the highest usability and security standard. Our bug reports have been acknowledged by the developer community, who are currently developing mitigation solutions.
Authors:Irash Perera, Hiranya Abeyrathne, Sanjeewa Malalgoda, Arshardh Ifthikar
Abstract:
GraphQL's flexibility, while beneficial for efficient data fetching, introduces unique security vulnerabilities that traditional API security mechanisms often fail to address. Malicious GraphQL queries can exploit the language's dynamic nature, leading to denial-of-service attacks, data exfiltration through injection, and other exploits. Existing solutions, such as static analysis, rate limiting, and general-purpose Web Application Firewalls, offer limited protection against sophisticated, context-aware attacks. This paper presents a novel, AI-driven approach for real-time detection of malicious GraphQL queries. Our method combines static analysis with machine learning techniques, including Large Language Models (LLMs) for dynamic schema-based configuration, Sentence Transformers (SBERT and Doc2Vec) for contextual embedding of query payloads, and Convolutional Neural Networks (CNNs), Random Forests, and Multilayer Perceptrons for classification. We detail the system architecture, implementation strategies optimized for production environments (including ONNX Runtime optimization and parallel processing), and evaluate the performance of our detection models and the overall system under load. Results demonstrate high accuracy in detecting various threats, including SQL injection, OS command injection, and XSS exploits, alongside effective mitigation of DoS and SSRF attempts. This research contributes a robust and adaptable solution for enhancing GraphQL API security.
Authors:Haoyang Hu, Xun Huang, Chenyu Wu, Shiwen Liu, Zhichao Lian, Shuangquan Zhang
Abstract:
Intrusion detection systems (IDS) are widely used to maintain the stability of network environments, but still face restrictions in generalizability due to the heterogeneity of network traffics. In this work, we propose BERTector, a new framework of joint-dataset learning for IDS based on BERT. BERTector integrates three key components: NSS-Tokenizer for traffic-aware semantic tokenization, supervised fine-tuning with a hybrid dataset, and low-rank adaptation for efficient fine-tuning. Experiments show that BERTector achieves state-of-the-art detection accuracy, strong generalizability, and excellent robustness. BERTector achieves the highest accuracy of 99.28% on NSL-KDD and reaches the average 80% detection success rate against four perturbations. These results establish a unified and efficient solution for modern IDS in complex and dynamic network environments.
Authors:René Mayrhofer, Michael Roland, Tobias Höller, Philipp Hofer, Mario Lins
Abstract:
Digital identities are increasingly important for mediating not only digital but also physical service transactions. Managing such identities through centralized providers can cause both availability and privacy concerns: single points of failure and control are ideal targets for global attacks on technical, organizational, or legal fronts. We design, analyze, and build a distributed digital identity architecture for physical world transactions in common scenarios like unlocking doors, public transport, or crossing country borders. This architecture combines (biometric and other) sensors, (established and upcoming) identity authorities, attribute verifiers, and a new core component we call the \emph{Personal Identity Agent (PIA)} that represents individuals with their identity attributes in the digital domain. All transactions are conducted in a completely decentralized manner, and the components for which we currently assume central coordination are optional and only used for assisting with service discovery and latency reduction. We present a first protocol between these parties and formally verify that it achieves relevant security properties based on a realistic threat model including strong global adversaries. A proof-of-concept implementation demonstrates practical feasibility of both architecture and initial protocol for applications that can tolerate end-to-end latencies in the range of a few seconds.
Authors:Pallavi Zambare, Venkata Nikhil Thanikella, Ying Liu
Abstract:
When combining Large Language Models (LLMs) with autonomous agents, used in network monitoring and decision-making systems, this will create serious security issues. In this research, the MAESTRO framework consisting of the seven layers threat modeling architecture in the system was used to expose, evaluate, and eliminate vulnerabilities of agentic AI. The prototype agent system was constructed and implemented, using Python, LangChain, and telemetry in WebSockets, and deployed with inference, memory, parameter tuning, and anomaly detection modules. Two practical threat cases were confirmed as follows: (i) resource denial of service by traffic replay denial-of-service, and (ii) memory poisoning by tampering with the historical log file maintained by the agent. These situations resulted in measurable levels of performance degradation, i.e. telemetry updates were delayed, and computational loads were increased, as a result of poor system adaptations. It was suggested to use a multilayered defense-in-depth approach with memory isolation, validation of planners and anomaly response systems in real-time. These findings verify that MAESTRO is viable in operational threat mapping, prospective risk scoring, and the basis of the resilient system design. The authors bring attention to the importance of the enforcement of memory integrity, paying attention to the adaptation logic monitoring, and cross-layer communication protection that guarantee the agentic AI reliability in adversarial settings.
Authors:Klaudia Krawiecka, Christian Schroeder de Witt
Abstract:
We propose an extension to the OWASP Multi-Agentic System (MAS) Threat Modeling Guide, translating recent anticipatory research in multi-agent security (MASEC) into practical guidance for addressing challenges unique to large language model (LLM)-driven multi-agent architectures. Although OWASP's existing taxonomy covers many attack vectors, our analysis identifies gaps in modeling failures, including, but not limited to: reasoning collapse across planner-executor chains, metric overfitting, unsafe delegation escalation, emergent covert coordination, and heterogeneous multi-agent exploits. We introduce additional threat classes and scenarios grounded in practical MAS deployments, highlighting risks from benign goal drift, cross-agent hallucination propagation, affective prompt framing, and multi-agent backdoors. We also outline evaluation strategies, including robustness testing, coordination assessment, safety enforcement, and emergent behavior monitoring, to ensure complete coverage. This work complements the framework of OWASP by expanding its applicability to increasingly complex, autonomous, and adaptive multi-agent systems, with the goal of improving security posture and resilience in real world deployments.
Authors:Kevin Kurian, Ethan Holland, Sean Oesch
Abstract:
As large language models are increasingly deployed in sensitive environments, fingerprinting attacks pose significant privacy and security risks. We present a study of LLM fingerprinting from both offensive and defensive perspectives. Our attack methodology uses reinforcement learning to automatically optimize query selection, achieving better fingerprinting accuracy with only 3 queries compared to randomly selecting 3 queries from the same pool. Our defensive approach employs semantic-preserving output filtering through a secondary LLM to obfuscate model identity while maintaining semantic integrity. The defensive method reduces fingerprinting accuracy across tested models while preserving output quality. These contributions show the potential to improve fingerprinting tools capabilities while providing practical mitigation strategies against fingerprinting attacks.
Authors:Yuval Efron, Joachim Neu, Toniann Pitassi
Abstract:
Proof-of-work allows Bitcoin to boast security amidst arbitrary fluctuations in participation of miners throughout time, so long as, at any point in time, a majority of hash power is honest. In recent years, however, the pendulum has shifted in favor of proof-of-stake-based consensus protocols. There, the sleepy model is the most prominent model for handling fluctuating participation of nodes. However, to date, no protocol in the sleepy model rivals Bitcoin in its robustness to drastic fluctuations in participation levels, with state-of-the-art protocols making various restrictive assumptions. In this work, we present a new adversary model, called external adversary. Intuitively, in our model, corrupt nodes do not divulge information about their secret keys. In this model, we show that protocols in the sleepy model can meaningfully claim to remain secure against fully fluctuating participation, without compromising efficiency or corruption resilience. Our adversary model is quite natural, and arguably naturally captures the process via which malicious behavior arises in protocols, as opposed to traditional worst-case modeling. On top of which, the model is also theoretically appealing, circumventing a barrier established in a recent work of Malkhi, Momose, and Ren.
Authors:Luca Serena, Gabriele D'Angelo, Stefano Ferretti, Moreno Marzolla
Abstract:
Modeling and simulation are widely used in cybersecurity research to assess cyber threats, evaluate defense mechanisms, and analyze vulnerabilities. However, the diversity of application areas, the variety of cyberattacks scenarios, and the differing objectives of these simulations makes it difficult to identify methodological trends. Existing reviews often focus on specific modeling techniques or application domains, making it challenging to analyze the field as a whole. To address these limitations, we present a comprehensive review of the current state of the art, classifying the selected papers based on four dimensions: the application domain, the types of cyber threats represented, the simulation techniques employed, and the primary goals of the simulation. The review discusses the strengths and limitations of different approaches, identifies which cyber threats are the most suited for simulation-based investigations, and analyzes which modeling paradigms are most appropriate for specific cybersecurity challenges.
Authors:Song Yan, Hui Wei, Jinlong Fei, Guoliang Yang, Zhengyu Zhao, Zheng Wang
Abstract:
Various (text) prompt filters and (image) safety checkers have been implemented to mitigate the misuse of Text-to-Image (T2I) models in creating Not-Safe-For-Work (NSFW) content. In order to expose potential security vulnerabilities of such safeguards, multimodal jailbreaks have been studied. However, existing jailbreaks are limited to prompt-specific and image-specific perturbations, which suffer from poor scalability and time-consuming optimization. To address these limitations, we propose Universally Unfiltered and Unseen (U3)-Attack, a multimodal jailbreak attack method against T2I safeguards. Specifically, U3-Attack optimizes an adversarial patch on the image background to universally bypass safety checkers and optimizes a safe paraphrase set from a sensitive word to universally bypass prompt filters while eliminating redundant computations. Extensive experimental results demonstrate the superiority of our U3-Attack on both open-source and commercial T2I models. For example, on the commercial Runway-inpainting model with both prompt filter and safety checker, our U3-Attack achieves $~4\times$ higher success rates than the state-of-the-art multimodal jailbreak attack, MMA-Diffusion.
Authors:Kim Hammar, Tansu Alpcan, Emil C. Lupu
Abstract:
Timely and effective incident response is key to managing the growing frequency of cyberattacks. However, identifying the right response actions for complex systems is a major technical challenge. A promising approach to mitigate this challenge is to use the security knowledge embedded in large language models (LLMs) to assist security operators during incident handling. Recent research has demonstrated the potential of this approach, but current methods are mainly based on prompt engineering of frontier LLMs, which is costly and prone to hallucinations. We address these limitations by presenting a novel way to use an LLM for incident response planning with reduced hallucination. Our method includes three steps: fine-tuning, information retrieval, and lookahead planning. We prove that our method generates response plans with a bounded probability of hallucination and that this probability can be made arbitrarily small at the expense of increased planning time under certain assumptions. Moreover, we show that our method is lightweight and can run on commodity hardware. We evaluate our method on logs from incidents reported in the literature. The experimental results show that our method a) achieves up to 22% shorter recovery times than frontier LLMs and b) generalizes to a broad range of incident types and response actions.
Authors:Ittay Alfassi, Ran Gelles, Rotem Liss, Tal Mor
Abstract:
Practical implementations of Quantum Key Distribution (QKD) often deviate from the theoretical protocols, exposing the implementations to various attacks even when the underlying (ideal) protocol is proven secure. We present new analysis tools and methodologies for quantum cybersecurity, adapting the concepts of vulnerabilities, attack surfaces, and exploits from classical cybersecurity to QKD implementation attacks. We also present three additional concepts, derived from the connection between classical and quantum cybersecurity: "Quantum Fuzzing", which is the first tool for black-box vulnerability research on QKD implementations; "Reversed-Space Attacks", which are a generic exploit method using the attack surface of imperfect receivers; and concrete quantum-mechanical definitions of "Quantum Side-Channel Attacks" and "Quantum State-Channel Attacks", meaningfully distinguishing them from each other and from other attacks. Using our tools, we analyze multiple existing QKD attacks and show that the "Bright Illumination" attack could have been found even with minimal knowledge of the device implementation. This work begins to bridge the gap between current analysis methods for experimental attacks on QKD implementations and the decades-long research in the field of classical cybersecurity, improving the practical security of QKD products and enhancing their usefulness in real-world systems.
Authors:Christof Beierle, Philippe Langevin, Gregor Leander, Alexandr Polujan, Shahram Rasoolzadeh
Abstract:
The only known example of an almost perfect nonlinear (APN) permutation in even dimension was obtained by applying CCZ-equivalence to a specific quadratic APN function. Motivated by this result, there have been numerous recent attempts to construct new quadratic APN functions. Currently, 32,892 quadratic APN functions in dimension 8 are known and two recent conjectures address their possible total number. The first, proposed by Y. Yu and L. Perrin (Cryptogr. Commun. 14(6): 1359-1369, 2022), suggests that there are more than 50,000 such functions. The second, by A. Polujan and A. Pott (Proc. 7th Int. Workshop on Boolean Functions and Their Applications, 2022), argues that their number exceeds that of inequivalent quadratic (8,4)-bent functions, which is 92,515. We computationally construct 3,775,599 inequivalent quadratic APN functions in dimension 8 and estimate the total number to be about 6 million.
Authors:Marc Damie, Mihai Pop, Merijn Posthuma
Abstract:
Privacy-enhancing technologies (PETs) have attracted significant attention in response to privacy regulations, driving the development of applications that prioritize user data protection. At the same time, the information and communication technology (ICT) sector faces growing pressure to reduce its environmental footprint, particularly its carbon emissions. While numerous studies have assessed the energy footprint of various ICT applications, the environmental footprint of cryptographic PETs remains largely unexplored.
Our work addresses this gap by proposing a standardized methodology for evaluating the carbon footprint of PETs. To demonstrate this methodology, we focus on PETs supporting client-server applications as they are the simplest to deploy. In particular, we measure the energy consumption and carbon footprint increase induced by five cryptographic PETs (compared to their non-private equivalent): HTTPS web browsing, encrypted machine learning (ML) inference, encrypted ML training, encrypted databases, and encrypted emails. Our findings reveal significant variability in carbon footprint increases, ranging from a twofold increase in HTTPS web browsing to a 100,000-fold increase in encrypted ML.
Our study provides essential data to help decision-makers assess privacy-carbon trade-offs in such applications. Finally, we outline key research directions for developing PETs that balance strong privacy protection with environmental sustainability.
Authors:Borui Li, Li Yan, Jianmin Liu
Abstract:
Federated Learning (FL) enables collaborative model training on decentralized data but remains vulnerable to gradient leakage attacks that can reconstruct sensitive user information. Existing defense mechanisms, such as differential privacy (DP) and homomorphic encryption (HE), often introduce a trade-off between privacy, model utility, and system overhead, a challenge that is exacerbated in heterogeneous environments with non-IID data and varying client capabilities. To address these limitations, we propose SelectiveShield, a lightweight hybrid defense framework that adaptively integrates selective homomorphic encryption and differential privacy. SelectiveShield leverages Fisher information to quantify parameter sensitivity, allowing clients to identify critical parameters locally. Through a collaborative negotiation protocol, clients agree on a shared set of the most sensitive parameters for protection via homomorphic encryption. Parameters that are uniquely important to individual clients are retained locally, fostering personalization, while non-critical parameters are protected with adaptive differential privacy noise. Extensive experiments demonstrate that SelectiveShield maintains strong model utility while significantly mitigating gradient leakage risks, offering a practical and scalable defense mechanism for real-world federated learning deployments.
Authors:Borui Li, Li Yan, Junhao Han, Jianmin Liu, Lei Yu
Abstract:
Homomorphic Encryption (HE) prevails in securing Federated Learning (FL), but suffers from high overhead and adaptation cost. Selective HE methods, which partially encrypt model parameters by a global mask, are expected to protect privacy with reduced overhead and easy adaptation. However, in cross-device scenarios with heterogeneous data and system capabilities, traditional Selective HE methods deteriorate client straggling, and suffer from degraded HE overhead reduction performance. Accordingly, we propose SenseCrypt, a Sensitivity-guided selective Homomorphic EnCryption framework, to adaptively balance security and HE overhead per cross-device FL client. Given the observation that model parameter sensitivity is effective for measuring clients' data distribution similarity, we first design a privacy-preserving method to respectively cluster the clients with similar data distributions. Then, we develop a scoring mechanism to deduce the straggler-free ratio of model parameters that can be encrypted by each client per cluster. Finally, for each client, we formulate and solve a multi-objective model parameter selection optimization problem, which minimizes HE overhead while maximizing model security without causing straggling. Experiments demonstrate that SenseCrypt ensures security against the state-of-the-art inversion attacks, while achieving normal model accuracy as on IID data, and reducing training time by 58.4%-88.7% as compared to traditional HE methods.
Authors:Zhikui Chen, Muhammad Zeeshan Haider, Naiwen Luo, Shuo Yu, Xu Yuan, Yaochen Zhang, Tayyaba Noreen
Abstract:
With the popularity of smart terminals, such as the Internet of Things, crowdsensing is an emerging data aggregation paradigm, which plays a pivotal role in data-driven applications. There are some key issues in the development of crowdsensing such as platform security and privacy protection. As the crowdsensing is usually managed by a centralized platform, centralized management will bring various security vulnerabilities and scalability issues. To solve these issues, an effective reputation-based partition scheme (RSPC) is proposed in this article. The partition scheme calculates the optimal partition size by combining the node reputation value and divides the node into several disjoint partitions according to the node reputation value. By selecting the appropriate partition size, RSPC provides a mechanism to ensure that each partition is valid, as long as themaximum permissible threshold for the failed node is observed. At the same time, the RSPC reorganizes the network periodically to avoid partition attacks. In addition, for cross-partition transactions, this paper innovatively proposes a four-stage confirmation protocol to ensure the efficient and safe completion of cross-partition transactions. Finally, experiments show that RSPC improves scalability, low latency, and high throughput for crowdsensing.
Authors:Boshi Huang, Fabio Nonato de Paula
Abstract:
This paper introduces a novel self-consciousness defense mechanism for Large Language Models (LLMs) to combat prompt injection attacks. Unlike traditional approaches that rely on external classifiers, our method leverages the LLM's inherent reasoning capabilities to perform self-protection. We propose a framework that incorporates Meta-Cognitive and Arbitration Modules, enabling LLMs to evaluate and regulate their own outputs autonomously. Our approach is evaluated on seven state-of-the-art LLMs using two datasets: AdvBench and Prompt-Injection-Mixed-Techniques-2024. Experiment results demonstrate significant improvements in defense success rates across models and datasets, with some achieving perfect and near-perfect defense in Enhanced Mode. We also analyze the trade-off between defense success rate improvement and computational overhead. This self-consciousness method offers a lightweight, cost-effective solution for enhancing LLM ethics, particularly beneficial for GenAI use cases across various platforms.
Authors:Mengyu Zhang, Zhuotao Liu, Jingwen Huang, Xuanqi Liu
Abstract:
Privacy-preserving machine learning (PPML) is critical to ensure data privacy in AI. Over the past few years, the community has proposed a wide range of provably secure PPML schemes that rely on various cryptography primitives. However, when it comes to large language models (LLMs) with billions of parameters, the efficiency of PPML is everything but acceptable. For instance, the state-of-the-art solution for confidential LLM inference represents at least 10,000-fold slower performance compared to plaintext inference. The performance gap is even larger when the context length increases. In this position paper, we propose a novel framework named Agentic-PPML to make PPML in LLMs practical. Our key insight is to employ a general-purpose LLM for intent understanding and delegate cryptographically secure inference to specialized models trained on vertical domains. By modularly separating language intent parsing - which typically involves little or no sensitive information - from privacy-critical computation, Agentic-PPML completely eliminates the need for the LLMs to process the encrypted prompts, enabling practical deployment of privacy-preserving LLM-centric services.
Authors:Xu Yuan, Fang Luo, Muhammad Zeeshan Haider, Zhikui Chen, Yucheng Li
Abstract:
Blockchain technology has advanced rapidly in recent years and is now widely used in a variety of fields. Blockchain appears to be one of the best solutions for managing massive heterogeneous devices while achieving advanced data security and data reputation, particularly in the field of large-scale IoT (Internet of Things) networks. Despite the numerous advantages, there are still challenges while deploying IoT applications on blockchain systems due to the limited storage, power, and computing capability of IoT devices, and some of these problems are caused by the consensus algorithm, which plays a significant role in blockchain systems by ensuring overall system reliability and robustness. Nonetheless, most existing consensus algorithms are prone to poor node reliability, low transaction per second (TPS) rates, and scalability issues. Aiming at some critical problems in the existing consensus algorithms, this paper proposes the Efficient Byzantine Reputation-based Consensus (EBRC) mechanism to resolve the issues raised above. In comparison to traditional algorithms, we reinvented ways to evaluate node reliability and robustness and manage active nodes. Our experiments show that the EBRC algorithm has lower consensus delay, higher throughput, improved security, and lower verification costs. It offers new reference ideas for solving the Internet of Things+blockchain+Internet court construction problem.
Authors:Nergiz Yuca, Nikolay Matyunin, Ektor Arzoglou, Nikolaos Athanasios Anagnostopoulos, Stefan Katzenbeisser
Abstract:
As vehicles become increasingly connected and autonomous, they accumulate and manage various personal data, thereby presenting a key challenge in preserving privacy during data sharing and processing. This survey reviews applications of Secure Multi-Party Computation (MPC) and Homomorphic Encryption (HE) that address these privacy concerns in the automotive domain. First, we identify the scope of privacy-sensitive use cases for these technologies, by surveying existing works that address privacy issues in different automotive contexts, such as location-based services, mobility infrastructures, traffic management, etc. Then, we review recent works that employ MPC and HE as solutions for these use cases in detail. Our survey highlights the applicability of these privacy-preserving technologies in the automotive context, while also identifying challenges and gaps in the current research landscape. This work aims to provide a clear and comprehensive overview of this emerging field and to encourage further research in this domain.
Authors:Tianpei Lu, Bingsheng Zhang, Lekun Peng, Bowen Zheng, Lichun Li, Kui Ren
Abstract:
With the increasing deployment of generative machine learning models in privacy-sensitive domains such as healthcare and personalized services, ensuring secure inference has become a critical challenge. Secure multi-party computation (MPC) enables privacy-preserving model inference but suffers from high communication and computation overhead. The main bottleneck lies in the expensive secure evaluation of floating-point operations. Quantization offers a promising solution by converting floating-point operations into lower-precision integer computations, significantly reducing overhead. However, existing MPC-based quantized inference methods either rely on public quantization parameters-posing privacy risks-or suffer from inefficiencies, particularly in handling nonlinear functions such as activations and softmax. In this work, we propose a fine-grained, layer-wise quantization scheme and support 1-bit weight fully connected layers in a secure setting. We design a multi-input lookup table protocol to evaluate softmax efficiently and securely. Furthermore, we use dual secret sharing schemes and perform precision conversions via lookup tables, eliminating truncation overhead entirely. Experimental evaluation on BERT-base models demonstrates that our approach achieves up to $8\times$ speedup compared to Lu \emph{et al}. (NDSS 25), $9\times$ speedup compared to Gupta \emph{et al}. (PETS 24) and $22 \times$ speedup compared to Knott \emph{et al}. (NeurIPS 21).
Authors:Mohammad Moltafet, Hamid R. Sadjadpour, Zouheir Rezki
Abstract:
We consider an Internet of Battlefield Things (IoBT) system consisting of multiple devices that want to securely communicate with each other during a mission in the presence of an adversary with unbounded computational power. The adversary has complete access to listen/read the ciphertext without tampering with the communication line. We provide an unconditionally secure encryption scheme to exchange messages among devices in the system. The main idea behind the scheme is to provide secret keys to exchange messages using a random binary matrix that is securely shared among all the devices, and pair-wise random secret keys established between each pair of devices attempting to communicate before the mission. The scheme is implemented by using finite group modular addition. We show that the scheme is absolutely semantically secure, i.e., the scheme guarantees that an adversary with unbounded computational power cannot get even one bit of information about a message, except for an exponentially small probability in a security parameter. Besides that, we show that even if the random binary matrix is revealed to the adversary, the provided scheme is computationally secure against the key recovery attack.
Authors:Quentin Le Roux, Yannick Teglia, Teddy Furon, Philippe Loubet-Moundi
Abstract:
Face Recognition Systems that operate in unconstrained environments capture images under varying conditions,such as inconsistent lighting, or diverse face poses. These challenges require including a Face Detection module that regresses bounding boxes and landmark coordinates for proper Face Alignment. This paper shows the effectiveness of Object Generation Attacks on Face Detection, dubbed Face Generation Attacks, and demonstrates for the first time a Landmark Shift Attack that backdoors the coordinate regression task performed by face detectors. We then offer mitigations against these vulnerabilities.
Authors:Alessandro Gaudenzi, Lorenzo Nodari, Lance Kaplan, Alessandra Russo, Murat Sensoy, Federico Cerutti
Abstract:
Advanced Persistent Threats (APTs) represent a significant challenge in cybersecurity due to their prolonged, multi-stage nature and the sophistication of their operators. Traditional detection systems typically focus on identifying malicious activity in binary terms (benign or malicious) without accounting for the progression of an attack. However, effective response strategies depend on accurate inference of the attack's current stage, as countermeasures must be tailored to whether an adversary is in the early reconnaissance phase or actively conducting exploitation or exfiltration. This work addresses the problem of attack stage inference under uncertainty, with a focus on robustness to out-of-distribution (OOD) inputs. We propose a classification approach based on Evidential Deep Learning (EDL), which models predictive uncertainty by outputting parameters of a Dirichlet distribution over possible stages. This allows the system not only to predict the most likely stage of an attack but also to indicate when it is uncertain or the input lies outside the training distribution. Preliminary experiments in a simulated environment demonstrate that the proposed model can accurately infer the stage of an attack with calibrated confidence while effectively detecting OOD inputs, which may indicate changes in the attackers' tactics. These results support the feasibility of deploying uncertainty-aware models for staged threat detection in dynamic and adversarial environments.
Authors:Po-Yuan Mao, Cheng-Chang Tsai, Chun-Shien Lu
Abstract:
The great success of the diffusion model in image synthesis led to the release of gigantic commercial models, raising the issue of copyright protection and inappropriate content generation. Training-free diffusion watermarking provides a low-cost solution for these issues. However, the prior works remain vulnerable to rotation, scaling, and translation (RST) attacks. Although some methods employ meticulously designed patterns to mitigate this issue, they often reduce watermark capacity, which can result in identity (ID) collusion. To address these problems, we propose MaXsive, a training-free diffusion model generative watermarking technique that has high capacity and robustness. MaXsive best utilizes the initial noise to watermark the diffusion model. Moreover, instead of using a meticulously repetitive ring pattern, we propose injecting the X-shape template to recover the RST distortions. This design significantly increases robustness without losing any capacity, making ID collusion less likely to happen. The effectiveness of MaXsive has been verified on two well-known watermarking benchmarks under the scenarios of verification and identification.
Authors:Smita Khapre, Sudhanshu Semwal
Abstract:
Social media platforms offer numerous benefits and allow people to come together for various causes. Many communities, academia, government agencies, institutions, healthcare, entertainment, and businesses are on social media platforms. They are intuitive and free for users. It has become unimaginable to live without social media. Their architecture and data handling are geared towards scalability, uninterrupted availability, and both personal and collaborative revenue generation. Primarily, artificial intelligence algorithms are employed on stored user data for optimization and feeds. This has the potential to impact user safety, privacy, and security, even when metadata is used. A new decentralized data arrangement framework based on the Fractal-tree and L-Systems algorithm is proposed to mitigate some of the impacts of social media platforms.
Future work will focus on demonstrating the effectiveness of the new decentralized framework by comparing its results against state-of-the-art security methods currently used in databases. A cryptographic algorithm could also be implemented for the framework, employing a new key generation for each branch. This will strengthen database security; for example, if a user key is leaked, regenerating the key for each branch will keep the data secure by applying defense mechanisms in the proposed L-System-based tree framework.
Authors:Chenhao Fang, Yanqing Peng, Rajeev Rao, Matt Sarmiento, Wendy Summer, Arya Pudota, Alex Goncalves, Jordi Mola, Hervé Robert
Abstract:
Enterprise environments contain a heterogeneous, rapidly growing collection of internal artifacts related to code, data, and many different tools. Critical information for assessing privacy risk and ensuring regulatory compliance is often embedded across these varied resources, each with their own arcane discovery and extraction techniques. Therefore, large-scale privacy compliance in adherence to governmental regulations requires systems to discern the interconnected nature of diverse artifacts in a common, shared universe.
We present Privacy Artifact ConnecT or (PACT), an embeddings-driven graph that links millions of artifacts spanning multiple artifact types generated by a variety of teams and projects. Powered by the state-of-the-art DRAGON embedding model, PACT uses a contrastive learning objective with light fine-tuning to link artifacts via their textual components such as raw metadata, ownership specifics, and compliance context. Experimental results show that PACT's fine-tuned model improves recall@1 from 18% to 53%, the query match rate from 9.6% to 69.7% when paired with a baseline AI agent, and the hitrate@1 from 25.7% to 44.9% for candidate selection in a standard recommender system.
Authors:Maria Camporese, Fabio Massacci
Abstract:
Background: Automated Vulnerability Repair (AVR) is a fast-growing branch of program repair. Recent studies show that large language models (LLMs) outperform traditional techniques, extending their success beyond code generation and fault detection.
Hypothesis: These gains may be driven by hidden factors -- "invisible hands" such as training-data leakage or perfect fault localization -- that let an LLM reproduce human-authored fixes for the same code.
Objective: We replicate prior AVR studies under controlled conditions by deliberately adding errors to the reported vulnerability location in the prompt. If LLMs merely regurgitate memorized fixes, both small and large localization errors should yield the same number of correct patches, because any offset should divert the model from the original fix.
Method: Our pipeline repairs vulnerabilities from the Vul4J and VJTrans benchmarks after shifting the fault location by n lines from the ground truth. A first LLM generates a patch, a second LLM reviews it, and we validate the result with regression and proof-of-vulnerability tests. Finally, we manually audit a sample of patches and estimate the error rate with the Agresti-Coull-Wilson method.
Authors:Matias Mazzanti, Augusto Vega, Pradip Bose, Esteban Mocskos
Abstract:
Homomorphic Encryption (HE) enables computation on encrypted data without decryption, making it a cornerstone of privacy-preserving computation in untrusted environments. As HE sees growing adoption in sensitive applications such as secure machine learning and confidential data analysis ensuring its robustness against errors becomes critical. Faults (e.g., transmission errors, hardware malfunctions, or synchronization failures) can corrupt encrypted data and compromise the integrity of HE operations. However, the impact of soft errors (such as bit flips) on modern HE schemes remains unexplored. Specifically, the CKKS scheme-one of the most widely used HE schemes for approximate arithmetic-lacks a systematic study of how such errors propagate across its pipeline, particularly under optimizations like the Residue Number System (RNS) and Number Theoretic Transform (NTT). This work bridges that gap by presenting a theoretical and empirical analysis of CKKS's fault tolerance under single bit-flip errors. We focus on client-side operations (encoding, encryption, decryption, and decoding) and demonstrate that while the vanilla CKKS scheme exhibits some resilience, performance optimizations (RNS/NTT) introduce significant fragility, amplifying error sensitivity. By characterizing these failure modes, we lay the groundwork for error-resilient HE designs, ensuring both performance and integrity in privacy-critical applications.
Authors:Yannis Smaragdakis, Neville Grech, Sifis Lagouvardos, Konstantinos Triantafyllou, Ilias Tsatiris, Yannis Bollanos, Tony Rocco Valentine
Abstract:
A widespread belief in the blockchain security community is that automated techniques are only good for detecting shallow bugs, typically of small value. In this paper, we present the techniques and insights that have led us to repeatable success in automatically discovering high-value smart contract vulnerabilities. Our vulnerability disclosures have yielded 10 bug bounties, for a total of over $3M, over high-profile deployed code, as well as hundreds of bugs detected in pre-deployment or under-audit code.
We argue that the elements of this surprising success are a) a very high-completeness static analysis approach that manages to maintain acceptable precision; b) domain knowledge, provided by experts or captured via statistical inference. We present novel techniques for automatically inferring domain knowledge from statistical analysis of a large corpus of deployed contracts, as well as discuss insights on the ideal precision and warning rate of a promising vulnerability detector. In contrast to academic literature in program analysis, which routinely expects false-positive rates below 50% for publishable results, we posit that a useful analysis for high-value real-world vulnerabilities will likely flag very few programs (under 1%) and will do so with a high false-positive rate (e.g., 95%, meaning that only one-of-twenty human inspections will yield an exploitable vulnerability).
Authors:Bilal Hussain Abbasi, Zirui Gong, Yanjun Zhang, Shang Gao, Antonio Robles-Kelly, Leo Zhang
Abstract:
Despite significant advancements in computer vision, semantic segmentation models may be susceptible to backdoor attacks. These attacks, involving hidden triggers, aim to cause the models to misclassify instances of the victim class as the target class when triggers are present, posing serious threats to the reliability of these models. To further explore the field of backdoor attacks against semantic segmentation, in this paper, we propose a simple yet effective backdoor attack called Contextual Segmentation Backdoor Attack (ConSeg). ConSeg leverages the contextual information inherent in semantic segmentation models to enhance backdoor performance. Our method is motivated by an intriguing observation, i.e., when the target class is set as the `co-occurring' class of the victim class, the victim class can be more easily `mis-segmented'. Building upon this insight, ConSeg mimics the contextual information of the target class and rebuilds it in the victim region to establish the contextual relationship between the target class and the victim class, making the attack easier. Our experiments reveal that ConSeg achieves improvements in Attack Success Rate (ASR) with increases of 15.55\%, compared to existing methods, while exhibiting resilience against state-of-the-art backdoor defenses.
Authors:Suman Deb, Emil Lupu, Emm Mic Drakakis, Anil Anthony Bharath, Zhen Kit Leung, Guang Rui Ma, Anupam Chattopadhyay
Abstract:
The Internet of Medical Things (IoMT) has the potential to radically improve healthcare by enabling real-time monitoring, remote diagnostics, and AI-driven decision making. However, the connectivity, embedded intelligence, and inclusion of a wide variety of novel sensors expose medical devices to severe cybersecurity threats, compromising patient safety and data privacy. In addition, many devices also have direct capacity - individually or in conjunction with other IoMT devices - to perform actions on the patient, such as delivering an electrical stimulus, administering a drug, or activating a motor, which can potentially be life-threatening. We provide a taxonomy of potential attacks targeting IoMT, presenting attack surfaces, vulnerabilities, and mitigation strategies across all layers of the IoMT architecture. It answers key questions such as: What makes IoMT security different from traditional IT security? What are the cybersecurity threats to medical devices? How can engineers design secure IoMT systems and protect hospital networks from cyberattacks? By analyzing historical cyber incidents, we highlight critical security gaps and propose practical security guidelines for medical device engineers and security professionals. This work bridges the gap between research and implementation, equipping healthcare stakeholders with actionable insights to build resilient and privacy-preserving IoMT ecosystems. Finally, we present the latest standardization and compliance frameworks, that IoMT security designers should be aware of.
Authors:Shailesh Mishra, Simone Colombo, Pasindu Tennage, Martin Burkhart, Bryan Ford
Abstract:
Social key recovery mechanisms enable users to recover their vaults with the help of trusted contacts, or trustees, avoiding the need for a single point of trust or memorizing complex strings. However, existing mechanisms overlook the memorability demands on users for recovery, such as the need to recall a threshold number of trustees. Therefore, we first formalize the notion of recovery metadata in the context of social key recovery, illustrating the tradeoff between easing the burden of memorizing the metadata and maintaining metadata privacy. We present Apollo, the first framework that addresses this tradeoff by distributing indistinguishable data within a user's social circle, where trustees hold relevant data and non-trustees store random data. Apollo eliminates the need to memorize recovery metadata since a user eventually gathers sufficient data from her social circle for recovery. Due to indistinguishability, Apollo protects metadata privacy by forming an anonymity set that hides the trustees among non-trustees. To make the anonymity set scalable, Apollo proposes a novel multi-layered secret sharing scheme that mitigates the overhead due to the random data distributed among non-trustees. Finally, we provide a prototype implementation of Apollo and report on its performance. Apollo reduces the chances of malicious recovery to between 0.005% and 1.8%, depending on the adversary's ability to compromise. The multi-layered design shows a latency reduction from 1.1x to 740kx compared to a single-layered approach, depending on the number of reconnections.
Authors:Ali RajabiNekoo, Laleh Rasoul, Amirfarhad Farhadi, Azadeh Zamanifar
Abstract:
Traditional methods for identifying impactful liquidity providers (LPs) in Concentrated Liquidity Market Makers (CLMMs) rely on broad measures, such as nominal capital size or surface-level activity, which often lead to inaccurate risk analysis. The SILS framework offers a significantly more detailed approach, characterizing LPs not just as capital holders but as dynamic systemic agents whose actions directly impact market stability. This represents a fundamental paradigm shift from the static, volume-based analysis to a dynamic, impact-focused understanding. This advanced approach uses on-chain event logs and smart contract execution traces to compute Exponential Time-Weighted Liquidity (ETWL) profiles and apply unsupervised anomaly detection. Most importantly, it defines an LP's functional importance through the Liquidity Stability Impact Score (LSIS), a counterfactual metric that measures the potential degradation of the market if the LP withdraws. This combined approach provides a more detailed and realistic characterization of an LP's impact, moving beyond the binary and often misleading classifications used by existing methods. This impact-focused and comprehensive approach enables SILS to accurately identify high-impact LPs-including those missed by traditional methods and supports essential applications like a protective oracle layer and actionable trader signals, thereby significantly enhancing DeFi ecosystem. The framework provides unprecedented transparency into the underlying liquidity structure and associated risks, effectively reducing the common false positives and uncovering critical false negatives found in traditional models. Therefore, SILS provides an effective mechanism for proactive risk management, transforming how DeFi protocols safeguard their ecosystems against asymmetric liquidity behavior.
Authors:S M Mostaq Hossain, Amani Altarawneh, Maanak Gupta
Abstract:
As blockchain technologies are increasingly adopted in enterprise and research domains, the need for secure, scalable, and performance-transparent node infrastructure has become critical. While self-hosted Ethereum nodes offer operational control, they often lack elasticity and require complex maintenance. This paper presents a hybrid, service-oriented architecture for deploying and monitoring Ethereum full nodes using Amazon Managed Blockchain (AMB), integrated with EC2-based observability, IAM-enforced security policies, and reproducible automation via the AWS Cloud Development Kit. Our architecture supports end-to-end observability through custom EC2 scripts leveraging Web3.py and JSON-RPC, collecting over 1,000 real-time data points-including gas utilization, transaction inclusion latency, and mempool dynamics. These metrics are visualized and monitored through AWS CloudWatch, enabling service-level performance tracking and anomaly detection. This cloud-native framework restores low-level observability lost in managed environments while maintaining the operational simplicity of managed services. By bridging the simplicity of AMB with the transparency required for protocol research and enterprise monitoring, this work delivers one of the first reproducible, performance-instrumented Ethereum deployments on AMB. The proposed hybrid architecture enables secure, observable, and reproducible Ethereum node operations in cloud environments, suitable for both research and production use.
Authors:Ryusei Fujimoto, Yugo Nakamura, Yutaka Arakawa
Abstract:
Wearable accelerometers and gyroscopes encode fine-grained behavioural signatures that can be exploited to re-identify users, making privacy protection essential for healthcare applications. We introduce C-AAE, a compressive anonymizing autoencoder that marries an Anonymizing AutoEncoder (AAE) with Adaptive Differential Pulse-Code Modulation (ADPCM). The AAE first projects raw sensor windows into a latent space that retains activity-relevant features while suppressing identity cues. ADPCM then differentially encodes this latent stream, further masking residual identity information and shrinking the bitrate. Experiments on the MotionSense and PAMAP2 datasets show that C-AAE cuts user re-identification F1 scores by 10-15 percentage points relative to AAE alone, while keeping activity-recognition F1 within 5 percentage points of the unprotected baseline. ADPCM also reduces data volume by roughly 75 %, easing transmission and storage overheads. These results demonstrate that C-AAE offers a practical route to balancing privacy and utility in continuous, sensor-based activity recognition for healthcare.
Authors:Tevin Atwal, Chan Nam Tieu, Yefeng Yuan, Zhan Shi, Yuhong Liu, Liang Cheng
Abstract:
The increasing use of synthetic data generated by Large Language Models (LLMs) presents both opportunities and challenges in data-driven applications. While synthetic data provides a cost-effective, scalable alternative to real-world data to facilitate model training, its diversity and privacy risks remain underexplored. Focusing on text-based synthetic data, we propose a comprehensive set of metrics to quantitatively assess the diversity (i.e., linguistic expression, sentiment, and user perspective), and privacy (i.e., re-identification risk and stylistic outliers) of synthetic datasets generated by several state-of-the-art LLMs. Experiment results reveal significant limitations in LLMs' capabilities in generating diverse and privacy-preserving synthetic data. Guided by the evaluation results, a prompt-based approach is proposed to enhance the diversity of synthetic reviews while preserving reviewer privacy.
Authors:Amrith Setlur, Pratiksha Thaker, Jonathan Ullman
Abstract:
The most effective differentially private machine learning algorithms in practice rely on an additional source of purportedly public data. This paradigm is most interesting when the two sources combine to be more than the sum of their parts. However, there are settings such as mean estimation where we have strong lower bounds, showing that when the two data sources have the same distribution, there is no complementary value to combining the two data sources. In this work we extend the known lower bounds for public-private learning to setting where the two data sources exhibit significant distribution shift. Our results apply to both Gaussian mean estimation where the two distributions have different means, and to Gaussian linear regression where the two distributions exhibit parameter shift. We find that when the shift is small (relative to the desired accuracy), either public or private data must be sufficiently abundant to estimate the private parameter. Conversely, when the shift is large, public data provides no benefit.
Authors:Ahmed Lekssays, Husrev Taha Sencar, Ting Yu
Abstract:
Sharing methods of attack and their effectiveness is a cornerstone of building robust defensive systems. Threat analysis reports, produced by various individuals and organizations, play a critical role in supporting security operations and combating emerging threats. To enhance the timeliness and automation of threat intelligence sharing, several standards have been established, with the Structured Threat Information Expression (STIX) framework emerging as one of the most widely adopted. However, generating STIX-compatible data from unstructured security text remains a largely manual, expert-driven process. To address this challenge, we introduce AZERG, a tool designed to assist security analysts in automatically generating structured STIX representations. To achieve this, we adapt general-purpose large language models for the specific task of extracting STIX-formatted threat data. To manage the complexity, the task is divided into four subtasks: entity detection (T1), entity type identification (T2), related pair detection (T3), and relationship type identification (T4). We apply task-specific fine-tuning to accurately extract relevant entities and infer their relationships in accordance with the STIX specification. To address the lack of training data, we compiled a comprehensive dataset with 4,011 entities and 2,075 relationships extracted from 141 full threat analysis reports, all annotated in alignment with the STIX standard. Our models achieved F1-scores of 84.43% for T1, 88.49% for T2, 95.47% for T3, and 84.60% for T4 in real-world scenarios. We validated their performance against a range of open- and closed-parameter models, as well as state-of-the-art methods, demonstrating improvements of 2-25% across tasks.
Authors:Dong Ben, Hui Feng, Qian Wang
Abstract:
Large Language Models (LLMs) are increasingly used in circuit design tasks and have typically undergone multiple rounds of training. Both the trained models and their associated training data are considered confidential intellectual property (IP) and must be protected from exposure. Confidential Computing offers a promising solution to protect data and models through Trusted Execution Environments (TEEs). However, existing TEE implementations are not designed to support the resource-intensive nature of LLMs efficiently. In this work, we first present a comprehensive evaluation of the LLMs within a TEE-enabled confidential computing environment, specifically utilizing Intel Trust Domain Extensions (TDX). We constructed experiments on three environments: TEE-based, CPU-only, and CPU-GPU hybrid implementations, and evaluated their performance in terms of tokens per second.
Our first observation is that distilled models, i.e., DeepSeek, surpass other models in performance due to their smaller parameters, making them suitable for resource-constrained devices. Also, in the quantized models such as 4-bit quantization (Q4) and 8-bit quantization (Q8), we observed a performance gain of up to 3x compared to FP16 models. Our findings indicate that for fewer parameter sets, such as DeepSeek-r1-1.5B, the TDX implementation outperforms the CPU version in executing computations within a secure environment. We further validate the results using a testbench designed for SoC design tasks. These validations demonstrate the potential of efficiently deploying lightweight LLMs on resource-constrained systems for semiconductor CAD applications.
Authors:Syed Emad Uddin Shubha, Tasnuva Farheen
Abstract:
Hardware crosstalk in multi-tenant superconducting quantum computers poses a severe security threat, allowing adversaries to induce targeted errors across tenant boundaries by injecting carefully engineered pulses. We present a simulation-based study of active crosstalk attacks at the pulse level, analyzing how adversarial control of pulse timing, shape, amplitude, and coupling can disrupt a victim's computation. Our framework models the time-dependent dynamics of a three-qubit system in the rotating frame, capturing both always-on couplings and injected drive pulses. We examine two attack strategies: attacker-first (pulse before victim operation) and victim-first (pulse after), and systematically identify the pulse and coupling configurations that cause the largest logical errors. Protocol-level experiments on quantum coin flip and XOR classification circuits show that some protocols are highly vulnerable to these attacks, while others remain robust. Based on these findings, we discuss practical methods for detection and mitigation to improve security in quantum cloud platforms.
Authors:Onyinye Dibia, Mengyi Lu, Prianka Bhattacharjee, Joseph P. Near, Yuanyuan Feng
Abstract:
The increasing adoption of differential privacy (DP) leads to public-facing DP deployments by both government agencies and companies. However, real-world DP deployments often do not fully disclose their privacy guarantees, which vary greatly between deployments. Failure to disclose certain DP parameters can lead to misunderstandings about the strength of the privacy guarantee, undermining the trust in DP. In this work, we seek to inform future standards for communicating the privacy guarantees of DP deployments. Based on semi-structured interviews with 12 DP experts, we identify important DP parameters necessary to comprehensively communicate DP guarantees, and describe why and how they should be disclosed. Based on expert recommendations, we design an initial privacy label for DP to comprehensively communicate privacy guarantees in a standardized format.
Authors:Ruwanga Konara, Kasun De Zoysa, Anuradha Mahasinghe, Asanka Sayakkara, Nalin Ranasinghe
Abstract:
With rapid advancements in quantum computing, it is widely believed that there will be quantum hardware capable of compromising classical cryptography and hence, the internet and the current information security infrastructure in the coming decade. This is mainly due to the operational realizations of quantum algorithms such as Grover and Shor, to which the current classical encryption protocols are vulnerable. Blockchains, i.e., blockchain data structures and their data, rely heavily on classical cryptography. One approach to secure blockchain is to attempt to achieve information theoretical security by defining blockchain on quantum technologies. There have been two conceptualizations of blockchains on quantum registers: the time-entangled Greenberger-Horne-Zeilinger (GHZ) state blockchain and the quantum hypergraph blockchain. On our part, an attempt is made to conceptualize a new quantum blockchain combining features of both these schemes to achieve the absolute security of the time-temporal GHZ blockchain and the scalability and efficiency of the quantum hypergraph blockchain in the proposed quantum blockchain protocol.
Authors:Ceren KocaoÄullar, Gustavo Petri, Dominic P. Mulligan, Derek Miller, Hugo J. M. Vincent, Shale Xiong, Alastair R. Beresford
Abstract:
Trusted Execution Environments (TEEs) are designed to protect the privacy and integrity of data in use. They enable secure data processing and sharing in peer-to-peer networks, such as vehicular ad hoc networks of autonomous vehicles, without compromising confidentiality. In these networks, nodes must establish mutual trust to collaborate securely. TEEs can achieve this through remote attestation, where a prover presents evidence of its trustworthiness to a verifier, which then decides whether or not to trust the prover. However, a naive peer-to-peer attestation approach, where every TEE directly attests every other TEE, results in quadratic communication overhead. This is inefficient in dynamic environments, where nodes frequently join and leave the network.
To address this, we present Careful Whisper, a gossip-based protocol that disseminates trust efficiently, reducing attestation overhead to linear complexity under ideal conditions. It enables interoperability by enabling transitive trust across heterogeneous networks, and supports trust establishment with offline nodes via relayed attestations. Using a custom discrete-event simulator, we show that Careful Whisper propagates trust both faster and more widely than naive approaches across various network topologies. Our results demonstrate that our protocol is resource efficient, sending ~21.5 KiB and requiring 0.158 seconds per round in a 200-node network, and that our protocol is resilient to attestation failures across various network topologies.
Authors:Juntao Tan, Lan Zhang, Zhonghao Hu, Kai Yang, Peng Ran, Bo Li
Abstract:
Though vertical federated learning (VFL) is generally considered to be privacy-preserving, recent studies have shown that VFL system is vulnerable to label inference attacks originating from various attack surfaces. Among these attacks, the model completion (MC) attack is currently the most powerful one. Existing defense methods against it either sacrifice model accuracy or incur impractical computational overhead. In this paper, we propose VMask, a novel label privacy protection framework designed to defend against MC attack from the perspective of layer masking. Our key insight is to disrupt the strong correlation between input data and intermediate outputs by applying the secret sharing (SS) technique to mask layer parameters in the attacker's model. We devise a strategy for selecting critical layers to mask, reducing the overhead that would arise from naively applying SS to the entire model. Moreover, VMask is the first framework to offer a tunable privacy budget to defenders, allowing for flexible control over the levels of label privacy according to actual requirements. We built a VFL system, implemented VMask on it, and extensively evaluated it using five model architectures and 13 datasets with different modalities, comparing it to 12 other defense methods. The results demonstrate that VMask achieves the best privacy-utility trade-off, successfully thwarting the MC attack (reducing the label inference accuracy to a random guessing level) while preserving model performance (e.g., in Transformer-based model, the averaged drop of VFL model accuracy is only 0.09%). VMask's runtime is up to 60,846 times faster than cryptography-based methods, and it only marginally exceeds that of standard VFL by 1.8 times in a large Transformer-based model, which is generally acceptable.
Authors:Juntao Tan, Anran Li, Quanchao Liu, Peng Ran, Lan Zhang
Abstract:
Vertical federated learning (VFL) enables multiple parties with disjoint features to collaboratively train models without sharing raw data. While privacy vulnerabilities of VFL are extensively-studied, its security threats-particularly targeted label attacks-remain underexplored. In such attacks, a passive party perturbs inputs at inference to force misclassification into adversary-chosen labels. Existing methods rely on unrealistic assumptions (e.g., accessing VFL-model's outputs) and ignore anomaly detectors deployed in real-world systems. To bridge this gap, we introduce VTarbel, a two-stage, minimal-knowledge attack framework explicitly designed to evade detector-enhanced VFL inference. During the preparation stage, the attacker selects a minimal set of high-expressiveness samples (via maximum mean discrepancy), submits them through VFL protocol to collect predicted labels, and uses these pseudo-labels to train estimated detector and surrogate model on local features. In attack stage, these models guide gradient-based perturbations of remaining samples, crafting adversarial instances that induce targeted misclassifications and evade detection. We implement VTarbel and evaluate it against four model architectures, seven multimodal datasets, and two anomaly detectors. Across all settings, VTarbel outperforms four state-of-the-art baselines, evades detection, and retains effective against three representative privacy-preserving defenses. These results reveal critical security blind spots in current VFL deployments and underscore urgent need for robust, attack-aware defenses.
Authors:Wen-Cheng Chung, Shu-Ting Huang, Hao-Ting Pai
Abstract:
Explainable intrusion detection systems (IDS) are now recognized as essential for mission-critical networks, yet most "XAI" pipelines still bolt an approximate explainer onto an opaque classifier, leaving analysts with partial and sometimes misleading insights. The Interpretable Generalization (IG) mechanism, published in IEEE Transactions on Information Forensics and Security, eliminates that bottleneck by learning coherent patterns - feature combinations unique to benign or malicious traffic - and turning them into fully auditable rules. IG already delivers outstanding precision, recall, and AUC on NSL-KDD, UNSW-NB15, and UKM-IDS20, even when trained on only 10% of the data. To raise precision further without sacrificing transparency, we introduce Multi-Granular Discretization (IG-MD), which represents every continuous feature at several Gaussian-based resolutions. On UKM-IDS20, IG-MD lifts precision by greater than or equal to 4 percentage points across all nine train-test splits while preserving recall approximately equal to 1.0, demonstrating that a single interpretation-ready model can scale across domains without bespoke tuning.
Authors:Shu-Ting Huang, Wen-Cheng Chung, Hao-Ting Pai
Abstract:
The Interpretable Generalization (IG) mechanism recently published in IEEE Transactions on Information Forensics and Security delivers state-of-the-art, evidence-based intrusion detection by discovering coherent normal and attack patterns through exhaustive intersect-and-subset operations-yet its cubic-time complexity and large intermediate bitsets render full-scale datasets impractical on CPUs. We present IG-GPU, a PyTorch re-architecture that offloads all pairwise intersections and subset evaluations to commodity GPUs. Implemented on a single NVIDIA RTX 4070 Ti, in the 15k-record NSL-KDD dataset, IG-GPU shows a 116-fold speed-up over the multi-core CPU implementation of IG. In the full size of NSL-KDD (148k-record), given small training data (e.g., 10%-90% train-test split), IG-GPU runs in 18 minutes with Recall 0.957, Precision 0.973, and AUC 0.961, whereas IG required down-sampling to 15k-records to avoid memory exhaustion and obtained Recall 0.935, Precision 0.942, and AUC 0.940. The results confirm that IG-GPU is robust across scales and could provide millisecond-level per-flow inference once patterns are learned. IG-GPU thus bridges the gap between rigorous interpretability and real-time cyber-defense, offering a portable foundation for future work on hardware-aware scheduling, multi-GPU sharding, and dataset-specific sparsity optimizations.
Authors:Ahmed Mahrous, Maurantonio Caprolu, Roberto Di Pietro
Abstract:
Stablecoins, with a capitalization exceeding 200 billion USD as of January 2025, have shown significant growth, with annual transaction volumes exceeding 10 trillion dollars in 2023 and nearly doubling that figure in 2024. This exceptional success has attracted the attention of traditional financial institutions, with an increasing number of governments exploring the potential of Central Bank Digital Currencies (CBDCs). Although academia has recognized the importance of stablecoins, research in this area remains fragmented, incomplete, and sometimes contradictory. In this paper, we aim to address the cited gap with a structured literature analysis, correlating recent contributions to present a picture of the complex economic, technical, and regulatory aspects of stablecoins. To achieve this, we formulate the main research questions and categorize scientific contributions accordingly, identifying main results, data sources, methodologies, and open research questions. The research questions we address in this survey paper cover several topics, such as the stability of various stablecoins, novel designs and implementations, and relevant regulatory challenges. The studies employ a wide range of methodologies and data sources, which we critically analyze and synthesize. Our analysis also reveals significant research gaps, including limited studies on security and privacy, underexplored stablecoins, unexamined failure cases, unstudied governance mechanisms, and the treatment of stablecoins under financial accounting standards, among other areas.
Authors:Steven Lamp, Jason D. Hiser, Anh Nguyen-Tuong, Jack W. Davidson
Abstract:
Cybersecurity simulation environments, such as cyber ranges, honeypots, and sandboxes, require realistic human behavior to be effective, yet no quantitative method exists to assess the behavioral fidelity of synthetic user personas. This paper presents PHASE (Passive Human Activity Simulation Evaluation), a machine learning framework that analyzes Zeek connection logs and distinguishes human from non-human activity with over 90\% accuracy. PHASE operates entirely passively, relying on standard network monitoring without any user-side instrumentation or visible signs of surveillance. All network activity used for machine learning is collected via a Zeek network appliance to avoid introducing unnecessary network traffic or artifacts that could disrupt the fidelity of the simulation environment. The paper also proposes a novel labeling approach that utilizes local DNS records to classify network traffic, thereby enabling machine learning analysis. Furthermore, we apply SHAP (SHapley Additive exPlanations) analysis to uncover temporal and behavioral signatures indicative of genuine human users. In a case study, we evaluate a synthetic user persona and identify distinct non-human patterns that undermine behavioral realism. Based on these insights, we develop a revised behavioral configuration that significantly improves the human-likeness of synthetic activity yielding a more realistic and effective synthetic user persona.
Authors:Matilde Baroni, Dominik Leichtle, SiniÅ¡a JankoviÄ, Ivan Å upiÄ
Abstract:
Non-local games are a powerful tool to distinguish between correlations possible in classical and quantum worlds. Kalai et al. (STOC'23) proposed a compiler that converts multipartite non-local games into interactive protocols with a single prover, relying on cryptographic tools to remove the assumption of physical separation of the players. While quantum completeness and classical soundness of the construction have been established for all multipartite games, quantum soundness is known only in the special case of bipartite games.
In this paper, we prove that the Kalai et al.'s compiler indeed achieves quantum soundness for all multipartite compiled non-local games, by showing that any correlations that can be generated in the asymptotic case correspond to quantum commuting strategies.
Our proof uses techniques from the theory of operator algebras, and relies on a characterisation of sequential operationally no-signalling strategies as quantum commuting operator strategies in the multipartite case, thereby generalising several previous results. On the way, we construct universal C*-algebras of sequential PVMs and prove a new chain rule for Radon-Nikodym derivatives of completely positive maps on C*-algebras which may be of independent interest.
Authors:Adrien Ghosn, Charly Castes, Neelu S. Kalani, Yuchen Qian, Marios Kogias, Edouard Bugnion
Abstract:
Securing sensitive cloud workloads requires composing confidential virtual machines (CVMs) with nested enclaves or sandboxes. Unfortunately, each new isolation boundary adds ad-hoc access control mechanisms, hardware extensions, and trusted software. This escalating complexity bloats the TCB, complicates end-to-end attestation, and leads to fragmentation across platforms and cloud service providers (CSPs).
We introduce a unified isolation model that delegates enforceable, composable, and attestable isolation to a single trusted security monitor: Tyche. Tyche provides an API for partitioning, sharing, attesting, and reclaiming resources through its core abstraction, trust domains (TDs). To provide fine-grain isolation, TDs can recursively create and manage sub-TDs. Tyche captures these relationships in attestations, allowing cloud tenants to reason about end-to-end security. TDs serve as the building blocks for constructing composable enclaves, sandboxes, and CVMs.
Tyche runs on commodity x86_64 without hardware security extensions and can maintain backward compatibility with existing software. We provide an SDK to run and compose unmodified workloads as sandboxes, enclaves, and CVMs with minimal overhead compared to native Linux execution. Tyche supports complex cloud scenarios, such as confidential inference with mutually distrustful users, model owners, and CSPs. An additional RISC-V prototype demonstrates Tyche's portability across platforms.
Authors:Xiang Li, Yifan Lin, Yuanzhe Zhang
Abstract:
To mitigate privacy leakage and performance issues in personalized advertising, this paper proposes a framework that integrates federated learning and differential privacy. The system combines distributed feature extraction, dynamic privacy budget allocation, and robust model aggregation to balance model accuracy, communication overhead, and privacy protection. Multi-party secure computing and anomaly detection mechanisms further enhance system resilience against malicious attacks. Experimental results demonstrate that the framework achieves dual optimization of recommendation accuracy and system efficiency while ensuring privacy, providing both a practical solution and a theoretical foundation for applying privacy protection technologies in advertisement recommendation.
Authors:Mi-Ying Huang, Er-Cheng Tang
Abstract:
Program obfuscation aims to hide the inner workings of a program while preserving its functionality. In the quantum setting, recent works have obtained obfuscation schemes for specialized classes of quantum circuits. For instance, Bartusek, Brakerski, and Vaikuntanathan (STOC 2024) constructed a quantum state obfuscation scheme, which supports the obfuscation of quantum programs represented as quantum states for pseudo-deterministic quantum programs with classical inputs and outputs in the classical oracle model.
In this work, we improve upon existing results by constructing the first quantum state obfuscation scheme for unitary (or approximately unitary) quantum programs supporting quantum inputs and outputs in the classical oracle model. At the core of our obfuscation scheme are two novel ingredients: a functional quantum authentication scheme that allows key holders to learn specific functions of the authenticated quantum state with simulation-based security, and a compiler that represents an arbitrary quantum circuit as a projective linear-plus-measurement quantum program described by a sequence of non-adaptive Clifford gates interleaved with adaptive and compatible measurements.
Authors:Haiwei Lin, Shoko Imaizumi, Hitoshi Kiya
Abstract:
We propose a low-rank adaptation method for training privacy-preserving vision transformer (ViT) models that efficiently freezes pre-trained ViT model weights. In the proposed method, trainable rank decomposition matrices are injected into each layer of the ViT architecture, and moreover, the patch embedding layer is not frozen, unlike in the case of the conventional low-rank adaptation methods. The proposed method allows us not only to reduce the number of trainable parameters but to also maintain almost the same accuracy as that of full-time tuning.
Authors:Thomas Dalton, Hemanth Gowda, Girish Rao, Sachin Pargi, Alireza Hadj Khodabakhshi, Joseph Rombs, Stephan Jou, Manish Marwah
Abstract:
Phishing remains a pervasive and growing threat, inflicting heavy economic and reputational damage. While machine learning has been effective in real-time detection of phishing attacks, progress is hindered by lack of large, high-quality datasets and benchmarks. In addition to poor-quality due to challenges in data collection, existing datasets suffer from leakage and unrealistic base rates, leading to overly optimistic performance results. In this paper, we introduce PhreshPhish, a large-scale, high-quality dataset of phishing websites that addresses these limitations. Compared to existing public datasets, PhreshPhish is substantially larger and provides significantly higher quality, as measured by the estimated rate of invalid or mislabeled data points. Additionally, we propose a comprehensive suite of benchmark datasets specifically designed for realistic model evaluation by minimizing leakage, increasing task difficulty, enhancing dataset diversity, and adjustment of base rates more likely to be seen in the real world. We train and evaluate multiple solution approaches to provide baseline performance on the benchmark sets. We believe the availability of this dataset and benchmarks will enable realistic, standardized model comparison and foster further advances in phishing detection. The datasets and benchmarks are available on Hugging Face (https://huggingface.co/datasets/phreshphish/phreshphish).
Authors:Zhonghao Zhan, Huichi Zhou, Hamed Haddadi
Abstract:
Graph Neural Network (GNN)-based network intrusion detection systems (NIDS) are often evaluated on single datasets, limiting their ability to generalize under distribution drift. Furthermore, their adversarial robustness is typically assessed using synthetic perturbations that lack realism. This measurement gap leads to an overestimation of GNN-based NIDS resilience. To address the limitations, we propose \textbf{REAL-IoT}, a comprehensive framework for robustness evaluation of GNN-based NIDS in IoT environments. Our framework presents a methodology that creates a unified dataset from canonical datasets to assess generalization under drift. In addition, it features a novel intrusion dataset collected from a physical IoT testbed, which captures network traffic and attack scenarios under real-world settings. Furthermore, using REAL-IoT, we explore the usage of Large Language Models (LLMs) to analyze network data and mitigate the impact of adversarial examples by filtering suspicious flows. Our evaluations using REAL-IoT reveal performance drops in GNN models compared to results from standard benchmarks, quantifying their susceptibility to drift and realistic attacks. We also demonstrate the potential of LLM-based filtering to enhance robustness. These findings emphasize the necessity of realistic threat modeling and rigorous measurement practices for developing resilient IoT intrusion detection systems.
Authors:Mohammad Alikhani, Reza Kazemi
Abstract:
In the era of the Fourth Industrial Revolution, cybersecurity and intrusion detection systems are vital for the secure and reliable operation of IoT and IIoT environments. A key challenge in this domain is the scarcity of labeled cyberattack data, as most industrial systems operate under normal conditions. This data imbalance, combined with the high cost of annotation, hinders the effective training of machine learning models. Moreover, the rapid detection of attacks is essential, especially in critical infrastructure, to prevent large-scale disruptions. To address these challenges, we propose a real-time intrusion detection system based on a semi-supervised contrastive learning framework using the Kolmogorov-Arnold Network (KAN). Our method leverages abundant unlabeled data to effectively distinguish between normal and attack behaviors. We validate our approach on three benchmark datasets, UNSW-NB15, BoT-IoT, and Gas Pipeline, using only 2.20%, 1.28%, and 8% of labeled samples, respectively, to simulate real-world conditions. Experimental results show that our method outperforms existing contrastive learning-based approaches. We further compare KAN with a traditional multilayer perceptron (MLP), demonstrating KAN's superior performance in both detection accuracy and robustness under limited supervision. KAN's ability to model complex relationships, along with its learnable activation functions, is also explored and visualized, offering interpretability and the potential for rule extraction. The method supports multi-class classification and proves effective in safety, critical environments where reliability is paramount.
Authors:Tatiana Petrova, Boris Bliznioukov, Aleksandr Puzikov, Radu State
Abstract:
The concept of the Web of Agents (WoA), which transforms the static, document-centric Web into an environment of autonomous agents acting on users' behalf, has attracted growing interest as large language models (LLMs) become more capable. However, research in this area is still fragmented across different communities. Contemporary surveys catalog the latest LLM-powered frameworks, while the rich histories of Multi-Agent Systems (MAS) and the Semantic Web are often treated as separate, legacy domains. This fragmentation obscures the intellectual lineage of modern systems and hinders a holistic understanding of the field's trajectory. We present the first comprehensive evolutionary overview of the WoA. We show that modern protocols like A2A and the MCP, are direct evolutionary responses to the well-documented limitations of earlier standards like FIPA standards and OWL-based semantic agents. To systematize this analysis, we introduce a four-axis taxonomy (semantic foundation, communication paradigm, locus of intelligence, discovery mechanism). This framework provides a unified analytical lens for comparing agent architectures across all generations, revealing a clear line of descent where others have seen a disconnect. Our analysis identifies a paradigm shift in the 'locus of intelligence': from being encoded in external data (Semantic Web) or the platform (MAS) to being embedded within the agent's core model (LLM). This shift is foundational to modern Agentic AI, enabling the scalable and adaptive systems the WoA has long envisioned. We conclude that while new protocols are essential, they are insufficient for building a robust, open, trustworthy ecosystem. Finally, we argue that the next research frontier lies in solving persistent socio-technical challenges, and we map out a new agenda focused on decentralized identity, economic models, security, and governance for the emerging WoA.
Authors:Jeremy Styborski, Mingzhi Lyu, Jiayou Lu, Nupur Kapur, Adams Kong
Abstract:
Poisoning attacks pose significant challenges to the robustness of diffusion models (DMs). In this paper, we systematically analyze when and where poisoning attacks textual inversion (TI), a widely used personalization technique for DMs. We first introduce Semantic Sensitivity Maps, a novel method for visualizing the influence of poisoning on text embeddings. Second, we identify and experimentally verify that DMs exhibit non-uniform learning behavior across timesteps, focusing on lower-noise samples. Poisoning attacks inherit this bias and inject adversarial signals predominantly at lower timesteps. Lastly, we observe that adversarial signals distract learning away from relevant concept regions within training data, corrupting the TI process. Based on these insights, we propose Safe-Zone Training (SZT), a novel defense mechanism comprised of 3 key components: (1) JPEG compression to weaken high-frequency poison signals, (2) restriction to high timesteps during TI training to avoid adversarial signals at lower timesteps, and (3) loss masking to constrain learning to relevant regions. Extensive experiments across multiple poisoning methods demonstrate that SZT greatly enhances the robustness of TI against all poisoning attacks, improving generative quality beyond prior published defenses. Code: www.github.com/JStyborski/Diff_Lab Data: www.github.com/JStyborski/NC10
Authors:Weiyang He, Chip-Hong Chang
Abstract:
Vertical Federated Learning (VFL) enables an orchestrating active party to perform a machine learning task by cooperating with passive parties that provide additional task-related features for the same training data entities. While prior research has leveraged the privacy vulnerability of VFL to compromise its integrity through a combination of label inference and backdoor attacks, their effectiveness is constrained by the low label inference precision and suboptimal backdoor injection conditions. To facilitate a more rigorous security evaluation on VFL without these limitations, we propose HASSLE, a hijacking attack framework composed of a gradient-direction-based label inference module and an adversarial embedding generation algorithm enhanced by self-supervised learning. HASSLE accurately identifies private samples associated with a targeted label using only a single known instance of that label. In the two-party scenario, it demonstrates strong performance with an attack success rate (ASR) of over 99% across four datasets, including both image and tabular modalities, and achieves 85% ASR on the more complex CIFAR-100 dataset. Evaluation of HASSLE against 8 potential defenses further highlights its significant threat while providing new insights into building a trustworthy VFL system.
Authors:Poushali Sengupta, Sabita Maharjan, frank Eliassen, Yan Zhang
Abstract:
Location-based vehicular traffic management faces significant challenges in protecting sensitive geographical data while maintaining utility for traffic management and fairness across regions. Existing state-of-the-art solutions often fail to meet the required level of protection against linkage attacks and demographic biases, leading to privacy leakage and inequity in data analysis. In this paper, we propose a novel algorithm designed to address the challenges regarding the balance of privacy, utility, and fairness in location-based vehicular traffic management systems. In this context, utility means providing reliable and meaningful traffic information, while fairness ensures that all regions and individuals are treated equitably in data use and decision-making. Employing differential privacy techniques, we enhance data security by integrating query-based data access with iterative shuffling and calibrated noise injection, ensuring that sensitive geographical data remains protected. We ensure adherence to epsilon-differential privacy standards by implementing the Laplace mechanism. We implemented our algorithm on vehicular location-based data from Norway, demonstrating its ability to maintain data utility for traffic management and urban planning while ensuring fair representation of all geographical areas without being overrepresented or underrepresented. Additionally, we have created a heatmap of Norway based on our model, illustrating the privatized and fair representation of the traffic conditions across various cities. Our algorithm provides privacy in vehicular traffic
Authors:Frederick Shpilevskiy, Saiyue Lyu, Krishnamurthy Dj Dvijotham, Mathias Lécuyer, Pierre-André Noël
Abstract:
We propose Adaptive Diffusion Denoised Smoothing, a method for certifying the predictions of a vision model against adversarial examples, while adapting to the input. Our key insight is to reinterpret a guided denoising diffusion model as a long sequence of adaptive Gaussian Differentially Private (GDP) mechanisms refining a pure noise sample into an image. We show that these adaptive mechanisms can be composed through a GDP privacy filter to analyze the end-to-end robustness of the guided denoising process, yielding a provable certification that extends the adaptive randomized smoothing analysis. We demonstrate that our design, under a specific guiding strategy, can improve both certified accuracy and standard accuracy on ImageNet for an $\ell_2$ threat model.
Authors:Sree Bhargavi Balija, Rekha Singal, Ramesh Raskar, Erfan Darzi, Raghu Bala, Thomas Hardjono, Ken Huang
Abstract:
The fragmentation of AI agent ecosystems has created urgent demands for interoperability, trust, and economic coordination that current protocols -- including MCP (Hou et al., 2025), A2A (Habler et al., 2025), ACP (Liu et al., 2025), and Cisco's AGP (Edwards, 2025) -- cannot address at scale. We present the Nanda Unified Architecture, a decentralized framework built around three core innovations: fast DID-based agent discovery through distributed registries, semantic agent cards with verifiable credentials and composability profiles, and a dynamic trust layer that integrates behavioral attestations with policy compliance. The system introduces X42/H42 micropayments for economic coordination and MAESTRO, a security framework incorporating Synergetics' patented AgentTalk protocol (US Patent 12,244,584 B1) and secure containerization. Real-world deployments demonstrate 99.9 percent compliance in healthcare applications and substantial monthly transaction volumes with strong privacy guarantees. By unifying MIT's trust research with production deployments from Cisco and Synergetics, we show how cryptographic proofs and policy-as-code transform agents into trust-anchored participants in a decentralized economy (Lakshmanan, 2025; Sha, 2025). The result enables a globally interoperable Internet of Agents where trust becomes the native currency of collaboration across both enterprise and Web3 ecosystems.
Authors:Rami Darwish, Mahmoud Abdelsalam, Sajad Khorsandroo, Kaushik Roy
Abstract:
As IoT ecosystems continue to expand across critical sectors, they have become prominent targets for increasingly sophisticated and large-scale malware attacks. The evolving threat landscape, combined with the sensitive nature of IoT-generated data, demands detection frameworks that are both privacy-preserving and resilient to data heterogeneity. Federated Learning (FL) offers a promising solution by enabling decentralized model training without exposing raw data. However, standard FL algorithms such as FedAvg and FedProx often fall short in real-world deployments characterized by class imbalance and non-IID data distributions -- particularly in the presence of rare or disjoint malware classes. To address these challenges, we propose FedP3E (Privacy-Preserving Prototype Exchange), a novel FL framework that supports indirect cross-client representation sharing while maintaining data privacy. Each client constructs class-wise prototypes using Gaussian Mixture Models (GMMs), perturbs them with Gaussian noise, and transmits only these compact summaries to the server. The aggregated prototypes are then distributed back to clients and integrated into local training, supported by SMOTE-based augmentation to enhance representation of minority malware classes. Rather than relying solely on parameter averaging, our prototype-driven mechanism enables clients to enrich their local models with complementary structural patterns observed across the federation -- without exchanging raw data or gradients. This targeted strategy reduces the adverse impact of statistical heterogeneity with minimal communication overhead. We evaluate FedP3E on the N-BaIoT dataset under realistic cross-silo scenarios with varying degrees of data imbalance.
Authors:Peicheng Wang, Monika Santra, Mingyu Liu, Cong Sun, Dongrui Zeng, Gang Tan
Abstract:
For reverse engineering related security domains, such as vulnerability detection, malware analysis, and binary hardening, disassembly is crucial yet challenging. The fundamental challenge of disassembly is to identify instruction and function boundaries. Classic approaches rely on file-format assumptions and architecture-specific heuristics to guess the boundaries, resulting in incomplete and incorrect disassembly, especially when the binary is obfuscated. Recent advancements of disassembly have demonstrated that deep learning can improve both the accuracy and efficiency of disassembly. In this paper, we propose Disa, a new learning-based disassembly approach that uses the information of superset instructions over the multi-head self-attention to learn the instructions' correlations, thus being able to infer function entry-points and instruction boundaries. Disa can further identify instructions relevant to memory block boundaries to facilitate an advanced block-memory model based value-set analysis for an accurate control flow graph (CFG) generation. Our experiments show that Disa outperforms prior deep-learning disassembly approaches in function entry-point identification, especially achieving 9.1% and 13.2% F1-score improvement on binaries respectively obfuscated by the disassembly desynchronization technique and popular source-level obfuscator. By achieving an 18.5% improvement in the memory block precision, Disa generates more accurate CFGs with a 4.4% reduction in Average Indirect Call Targets (AICT) compared with the state-of-the-art heuristic-based approach.
Authors:Nils Rollshausen, Alexander Heinrich, Matthias Hollick, Jiska Classen
Abstract:
Smartwatches such as the Apple Watch collect vast amounts of intimate health and fitness data as we wear them. Users have little choice regarding how this data is processed: The Apple Watch can only be used with Apple's iPhones, using their software and their cloud services. We are the first to publicly reverse-engineer the watch's wireless protocols, which led to discovering multiple security issues in Apple's proprietary implementation. With WatchWitch, our custom Android reimplementation, we break out of Apple's walled garden -- demonstrating practical interoperability with enhanced privacy controls and data autonomy. We thus pave the way for more consumer choice in the smartwatch ecosystem, offering users more control over their devices.
Authors:Kaushik Nath, Palash Sarkar
Abstract:
We introduce the new AXU hash function decBRWHash, which is parameterised by the positive integer $c$ and is based on Bernstein-Rabin-Winograd (BRW) polynomials. Choosing $c>1$ gives a hash function which can be implemented using $c$-way single instruction multiple data (SIMD) instructions. We report a set of very comprehensive hand optimised assembly implementations of 4-decBRWHash using avx2 SIMD instructions available on modern Intel processors. For comparison, we also report similar carefully optimised avx2 assembly implementations of polyHash, an AXU hash function based on usual polynomials. Our implementations are over prime order fields, specifically the primes $2^{127}-1$ and $2^{130}-5$. For the prime $2^{130}-5$, for avx2 implementations, compared to the famous Poly1305 hash function, 4-decBRWHash is faster for messages which are a few hundred bytes long and achieves a speed-up of about 16% for message lengths in a few kilobytes range and improves to a speed-up of about 23% for message lengths in a few megabytes range.
Authors:Haoqi He, Xiaokai Lin, Jiancai Chen, Yan Xiao
Abstract:
Data poisoning attacks pose significant threats to machine learning models by introducing malicious data into the training process, thereby degrading model performance or manipulating predictions. Detecting and sifting out poisoned data is an important method to prevent data poisoning attacks. Limited by classical computation frameworks, upcoming larger-scale and more complex datasets may pose difficulties for detection. We introduce the unique speedup of quantum computing for the first time in the task of detecting data poisoning. We present Q-Detection, a quantum-classical hybrid defense method for detecting poisoning attacks. Q-Detection also introduces the Q-WAN, which is optimized using quantum computing devices. Experimental results using multiple quantum simulation libraries show that Q-Detection effectively defends against label manipulation and backdoor attacks. The metrics demonstrate that Q-Detection consistently outperforms the baseline methods and is comparable to the state-of-the-art. Theoretical analysis shows that Q-Detection is expected to achieve more than a 20% speedup using quantum computing power.
Authors:Harrison Green, Claire Le Goues, Fraser Brown
Abstract:
Coverage-guided fuzzers are powerful automated bug-finding tools. They mutate program inputs, observe coverage, and save any input that hits an unexplored path for future mutation. Unfortunately, without knowledge of input formats--for example, the relationship between formats' data fields and sizes--fuzzers are prone to generate destructive frameshift mutations. These time-wasting mutations yield malformed inputs that are rejected by the target program. To avoid such breaking mutations, this paper proposes a novel, lightweight technique that preserves the structure of inputs during mutation by detecting and using relation fields.
Our technique, FrameShift, is simple, fast, and does not require additional instrumentation beyond standard coverage feedback. We implement our technique in two state-of-the-art fuzzers, AFL++ and LibAFL, and perform a 12+ CPU-year fuzzer evaluation, finding that FrameShift improves the performance of the fuzzer in each configuration, sometimes increasing coverage by more than 50%. Furthermore, through a series of case studies, we show that our technique is versatile enough to find important structural relationships in a variety of formats, even generalizing beyond C/C++ targets to both Rust and Python.
Authors:Lu Xian, Van Tran, Lauren Lee, Meera Kumar, Yichen Zhang, Florian Schaub
Abstract:
Privacy policies are often complex. An exception is the two-page standardized notice that U.S. financial institutions must provide under the Gramm-Leach-Bliley Act (GLBA). However, banks now operate websites, mobile apps, and other services that involve complex data sharing practices that require additional privacy notices and do-not-sell opt-outs. We conducted a large-scale analysis of how U.S. banks implement privacy policies and controls in response to GLBA; other federal privacy policy requirements; and the California Consumer Privacy Act (CCPA), a key example for U.S. state privacy laws. We focused on the disclosure and control of a set of especially privacy-invasive practices: third-party data sharing for marketing-related purposes. We collected privacy policies for the 2,067 largest U.S. banks, 45.3\% of which provided multiple policies. Across disclosures and controls within the \textit{same} bank, we identified frequent, concerning inconsistencies -- such as banks indicating in GLBA notices that they do not share with third parties but disclosing sharing elsewhere, or using third-party marketing/advertising cookies without disclosure. This multiplicity of policies, with the inconsistencies it causes, may create consumer confusion and undermine the transparency goals of the very laws that require them. Our findings call into question whether current policy requirements, such as the GLBA notice, are achieving their intended goals in today's online banking landscape. We discuss potential avenues for reforming and harmonizing privacy policies and control requirements across federal and state laws.
Authors:Darya Parygina, Timofey Mezhuev, Daniil Kuts
Abstract:
Program analysis and automated testing have recently become an essential part of SSDLC. Directed greybox fuzzing is one of the most popular automated testing methods that focuses on error detection in predefined code regions. However, it still lacks ability to overcome difficult program constraints. This problem can be well addressed by symbolic execution, but at the cost of lower performance. Thus, combining directed fuzzing and symbolic execution techniques can lead to more efficient error detection.
In this paper, we propose a hybrid approach to directed fuzzing with novel seed scheduling algorithm, based on target-related interestingness and coverage. The approach also performs minimization and sorting of objective seeds according to a target-related information. We implement our approach in Sydr-Fuzz tool using LibAFL-DiFuzz as directed fuzzer and Sydr as dynamic symbolic executor. We evaluate our approach with Time to Exposure metric and compare it with pure LibAFL-DiFuzz, AFLGo, BEACON, WAFLGo, WindRanger, FishFuzz, and Prospector. The results show an improvement for 3 out of 7 examples with speedup up to 1.86 times over the second best result, as well as a significant improvement for 3 out of 7 examples over the pure LibAFL-DiFuzz fuzzer. Sydr-Fuzz hybrid approach to directed fuzzing shows high performance and helps to improve directed fuzzing efficiency.
Authors:Donato Ferraro, Andrea Bastoni, Alexander Zuepke, Andrea Marongiu
Abstract:
The widespread deployment of embedded systems in critical infrastructures, interconnected edge devices like autonomous drones, and smart industrial systems requires robust security measures. Compromised systems increase the risks of operational failures, data breaches, and -- in safety-critical environments -- potential physical harm to people. Despite these risks, current security measures are often insufficient to fully address the attack surfaces of embedded devices. CHERI provides strong security from the hardware level by enabling fine-grained compartmentalization and memory protection, which can reduce the attack surface and improve the reliability of such devices. In this work, we explore the potential of CHERI to compartmentalize one of the most critical and targeted components of interconnected systems: their network stack. Our case study examines the trade-offs of isolating applications, TCP/IP libraries, and network drivers on a CheriBSD system deployed on the Arm Morello platform. Our results suggest that CHERI has the potential to enhance security while maintaining performance in embedded-like environments.
Authors:Atefeh Zareh Chahoki, Marco Roveri
Abstract:
Ethereum smart contracts operate in a concurrent environment where multiple transactions can be submitted simultaneously. However, the Ethereum Virtual Machine (EVM) enforces sequential execution of transactions within each block to prevent conflicts arising from concurrent access to the same state variables. Although this approach guarantees correct behavior, it limits the ability of validators to leverage multi-core architectures for faster transaction processing, thus restricting throughput. Existing solutions introduce concurrency by allowing simultaneous transaction execution combined with runtime conflict detection and rollback mechanisms to maintain correctness. However, these methods incur significant overhead due to continuous conflict tracking and transaction reversion. Recently, alternative approaches have emerged that aim to predict conflicts statically, before execution, by analyzing smart contract code for potential transaction interactions. Despite their promise, there is a lack of comprehensive studies that examine static conflict detection and its broader implications in specific smart contracts. This paper fills this important gap by proposing a novel static analysis method to detect potential transaction conflicts in Ethereum smart contracts. Our method identifies read-write, write-write, and function call conflicts between transaction pairs by analyzing state variable access patterns in Solidity contracts. We implement a tool that parses contract code and performs conflict detection. Evaluation on a dataset of real-world Ethereum smart contracts demonstrates that our approach achieves high precision in identifying potential conflicts. By enabling proactive conflict detection, our tool supports further design of transaction scheduling strategies that reduce runtime failures, enhance validator throughput, and contribute to blockchain scalability.
Authors:Ruikai Zhou, Kang Yang, Xun Chen, Wendy Hui Wang, Guanhong Tao, Jun Xu
Abstract:
Today, the training of large language models (LLMs) can involve personally identifiable information and copyrighted material, incurring dataset misuse. To mitigate the problem of dataset misuse, this paper explores \textit{dataset inference}, which aims to detect if a suspect model $\mathcal{M}$ used a victim dataset $\mathcal{D}$ in training. Previous research tackles dataset inference by aggregating results of membership inference attacks (MIAs) -- methods to determine whether individual samples are a part of the training dataset. However, restricted by the low accuracy of MIAs, previous research mandates grey-box access to $\mathcal{M}$ to get intermediate outputs (probabilities, loss, perplexity, etc.) for obtaining satisfactory results. This leads to reduced practicality, as LLMs, especially those deployed for profits, have limited incentives to return the intermediate outputs.
In this paper, we propose a new method of dataset inference with only black-box access to the target model (i.e., assuming only the text-based responses of the target model are available). Our method is enabled by two sets of locally built reference models, one set involving $\mathcal{D}$ in training and the other not. By measuring which set of reference model $\mathcal{M}$ is closer to, we determine if $\mathcal{M}$ used $\mathcal{D}$ for training. Evaluations of real-world LLMs in the wild show that our method offers high accuracy in all settings and presents robustness against bypassing attempts.
Authors:Jason Zhijingcheng Yu, Fangqi Han, Kaustab Choudhury, Trevor E. Carlson, Prateek Saxena
Abstract:
The Rust programming language enforces three basic Rust principles, namely ownership, borrowing, and AXM (Aliasing Xor Mutability) to prevent security bugs such as memory safety violations and data races. However, Rust projects often have mixed code, i.e., code that also uses unsafe Rust, FFI (Foreign Function Interfaces), and inline assembly for low-level control. The Rust compiler is unable to statically enforce Rust principles in mixed Rust code which can lead to many security vulnerabilities. In this paper, we propose CapsLock, a security enforcement mechanism that can run at the level of machine code and detect Rust principle violations at run-time in mixed code. CapsLock is kept simple enough to be implemented into recent capability-based hardware abstractions that provide low-cost spatial memory safety. CapsLock introduces a novel revoke-on-use abstraction for capability-based designs, wherein accessing a memory object via a capability implicitly invalidates certain other capabilities pointing to it, thereby also providing temporal memory safety automatically, without requiring software to explicitly specify such invalidation. Thus, CapsLock is the first mechanism capable of providing cross-language enforcement of Rust principles. We implemented a prototype of CapsLock on QEMU. Evaluation results show that CapsLock is highly compatible with existing Rust code (passing 99.7% of the built-in test cases of the 100 most popular crates) and flags Rust principle violations in real-world Rust projects that use FFI or inline assembly. We discovered 8 previously unknown bugs in such crates in our experiments.
Authors:Noureldin Zahran, Ahmad Tahmasivand, Ihsen Alouani, Khaled Khasawneh, Mohammed E. Fouda
Abstract:
The safety alignment of Language Models (LMs) is a critical concern, yet their integrity can be challenged by direct parameter manipulation attacks, such as those potentially induced by fault injection. As LMs are increasingly deployed using low-precision quantization for efficiency, this paper investigates the efficacy of such attacks for jailbreaking aligned LMs across different quantization schemes. We propose gradient-guided attacks, including a tailored progressive bit-level search algorithm introduced herein and a comparative word-level (single weight update) attack. Our evaluation on Llama-3.2-3B, Phi-4-mini, and Llama-3-8B across FP16 (baseline), and weight-only quantization (FP8, INT8, INT4) reveals that quantization significantly influences attack success. While attacks readily achieve high success (>80% Attack Success Rate, ASR) on FP16 models, within an attack budget of 25 perturbations, FP8 and INT8 models exhibit ASRs below 20% and 50%, respectively. Increasing the perturbation budget up to 150 bit-flips, FP8 models maintained ASR below 65%, demonstrating some resilience compared to INT8 and INT4 models that have high ASR. In addition, analysis of perturbation locations revealed differing architectural targets across quantization schemes, with (FP16, INT4) and (INT8, FP8) showing similar characteristics. Besides, jailbreaks induced in FP16 models were highly transferable to subsequent FP8/INT8 quantization (<5% ASR difference), though INT4 significantly reduced transferred ASR (avg. 35% drop). These findings highlight that while common quantization schemes, particularly FP8, increase the difficulty of direct parameter manipulation jailbreaks, vulnerabilities can still persist, especially through post-attack quantization.
Authors:Elizabeth Lui, Jiahao Sun
Abstract:
This paper investigates whether Bittensor can be considered the Bitcoin of decentralized Artificial Intelligence by directly comparing its tokenomics, decentralization properties, consensus mechanism, and incentive structure against those of Bitcoin. Leveraging on-chain data from all 64 active Bittensor subnets, we first document considerable concentration in both stake and rewards. We further show that rewards are overwhelmingly driven by stake, highlighting a clear misalignment between quality and compensation. As a remedy, we put forward a series of two-pronged protocol-level interventions. For incentive realignment, our proposed solutions include performance-weighted emission split, composite scoring, and a trust-bonus multiplier. As for mitigating security vulnerability due to stake concentration, we propose and empirically validate stake cap at the 88th percentile, which elevates the median coalition size required for a 51-percent attack and remains robust across daily, weekly, and monthly snapshots.
Authors:Azmat Ullah, Maria Ilaria Lunesu, Lodovica Marchesi, Roberto Tonelli
Abstract:
This paper presents a blockchain-based Internet of Things (IoT) system for monitoring pizza production in restaurants. IoT devices track temperature and humidity in real-time, while blockchain ensures secure and tamper-proof data. A Raspberry Pi processes sensor data, captures images, triggers alerts, and interacts with smart contracts. The system detects abnormal conditions, enabling quick responses. Blockchain adds transparency and traceability, supporting compliance and audits. Experiments show improved ingredient management, reduced waste, and increased kitchen efficiency.
Authors:Thomas Prévost, Bruno Martin
Abstract:
We propose a new 10-bit S-box generated from a Feistel construction. The subpermutations are generated by a 5-cell cellular automaton based on a unique well-chosen rule and bijective affine transformations. In particular, the cellular automaton rule is chosen based on empirical tests of its ability to generate good pseudorandom output on a ring cellular automaton. Similarly, Feistel's network layout is based on empirical data regarding the quality of the output S-box. We perform cryptanalysis of the generated 10-bit S-box, and we find security properties comparable to or sometimes even better than those of the standard AES S-box. We believe that our S-box could be used to replace the 5-bit substitution of ciphers like ASCON.
Authors:Pantelimon Stanica, Ranit Dutta, Bimal Mandal
Abstract:
This paper introduces {\em truncated inner $c$-differential cryptanalysis}, a novel technique that for the first time enables the practical application of $c$-differential uniformity to block ciphers. While Ellingsen et al. (IEEE Trans. Inf. Theory, 2020) established the notion of $c$-differential uniformity using $(F(x\oplus a), cF(x))$, a key challenge remained: multiplication by $c$ disrupts the structural properties essential for block cipher analysis, particularly key addition.
We resolve this challenge by developing an \emph{inner} $c$-differential approach where multiplication by $c$ affects the input: $(F(cx\oplus a), F(x))$. We prove that the inner $c$-differential uniformity of a function $F$ equals the outer $c$-differential uniformity of $F^{-1}$, establishing a fundamental duality. This modification preserves cipher structure while enabling practical cryptanalytic applications.
Our main contribution is a comprehensive multi-faceted statistical-computational framework, implementing truncated $c$-differential analysis against the full 9-round Kuznyechik cipher (the inner $c$-differentials are immune to the key whitening at the backend). Through extensive computational analysis involving millions of differential pairs, we demonstrate statistically significant non-randomness across all tested round counts. For the full 9-round cipher, we identify multiple configurations triggering critical security alerts, with bias ratios reaching $1.7\times$ and corrected p-values as low as $1.85 \times 10^{-3}$, suggesting insufficient security margin against this new attack vector. This represents the first practical distinguisher against the full 9-round Kuznyechik.
Authors:Xiaoyu Ji, Jessica Shorland, Joshua Shank, Pascal Delpe-Brice, Latanya Sweeney, Jan Allebach, Ali Shakouri
Abstract:
Small- and medium-sized manufacturers need innovative data tools but, because of competition and privacy concerns, often do not want to share their proprietary data with researchers who might be interested in helping. This paper introduces a privacy-preserving platform by which manufacturers may safely share their data with researchers through secure methods, so that those researchers then create innovative tools to solve the manufacturers' real-world problems, and then provide tools that execute solutions back onto the platform for others to use with privacy and confidentiality guarantees. We illustrate this problem through a particular use case which addresses an important problem in the large-scale manufacturing of food crystals, which is that quality control relies on image analysis tools. Previous to our research, food crystals in the images were manually counted, which required substantial and time-consuming human efforts, but we have developed and deployed a crystal analysis tool which makes this process both more rapid and accurate. The tool enables automatic characterization of the crystal size distribution and numbers from microscope images while the natural imperfections from the sample preparation are automatically removed; a machine learning model to count high resolution translucent crystals and agglomeration of crystals was also developed to aid in these efforts. The resulting algorithm was then packaged for real-world use on the factory floor via a web-based app secured through the originating privacy-preserving platform, allowing manufacturers to use it while keeping their proprietary data secure. After demonstrating this full process, future directions are also explored.
Authors:Quentin Le Roux, Yannick Teglia, Teddy Furon, Philippe Loubet-Moundi, Eric Bourbao
Abstract:
The widespread deployment of Deep Learning-based Face Recognition Systems raises multiple security concerns. While prior research has identified backdoor vulnerabilities on isolated components, Backdoor Attacks on real-world, unconstrained pipelines remain underexplored. This paper presents the first comprehensive system-level analysis of Backdoor Attacks targeting Face Recognition Systems and provides three contributions. We first show that face feature extractors trained with large margin metric learning losses are susceptible to Backdoor Attacks. By analyzing 20 pipeline configurations and 15 attack scenarios, we then reveal that a single backdoor can compromise an entire Face Recognition System. Finally, we propose effective best practices and countermeasures for stakeholders.
Authors:Yeonsoo Jeon, Mattan Erez, Michael Orshansky
Abstract:
Hybrid Homomorphic Encryption (HHE) combines symmetric key and homomorphic encryption to reduce ciphertext expansion crucial in client-server deployments of HE. Special symmetric ciphers, amenable to efficient HE evaluation, have been developed. Their client-side deployment calls for performant and energy-efficient implementation, and in this paper we develop and evaluate hardware accelerators for the two known CKKS-targeting HHE ciphers, HERA and Rubato.
We design vectorized and overlapped functional modules. The design exploits transposition-invariance property of the MixColumns and MixRows function and alternates the order of intermediate state to eliminate bubbles in stream key generation, improving latency and throughput. We decouple the RNG and key computation phases to hide the latency of RNG and to reduce the critical path in FIFOs, achieving higher operating frequency.
We implement the accelerator on an AMD Virtex UltraScale+ FPGA. Both Rubato and HERA achieve a 6x improvement in throughput compared to the software implementation. In terms of latency, Rubato achieves a 5x reduction, while HERA achieves a 3x reduction. Additionally, our hardware implementations reduce energy consumption by 75x for Rubato and 47x for HERA compared to their software implementation.
Authors:Olivia Figueira, Pranathi Chamarthi, Tu Le, Athina Markopoulou
Abstract:
TikTok, the social media platform that is popular among children and adolescents, offers a more restrictive "Under 13 Experience" exclusively for young users in the US, also known as TikTok's "Kids Mode". While prior research has studied various aspects of TikTok's regular mode, including privacy and personalization, TikTok's Kids Mode remains understudied, and there is a lack of transparency regarding its content curation and its safety and privacy protections for children. In this paper, (i) we propose an auditing methodology to comprehensively investigate TikTok's Kids Mode and (ii) we apply it to characterize the platform's content curation and determine the prevalence of child-directed content, based on regulations in the Children's Online Privacy Protection Act (COPPA). We find that 83% of videos observed on the "For You" page in Kids Mode are actually not child-directed, and even inappropriate content was found. The platform also lacks critical features, namely parental controls and accessibility settings. Our findings have important design and regulatory implications, as children may be incentivized to use TikTok's regular mode instead of Kids Mode, where they are known to be exposed to further safety and privacy risks.
Authors:Incheol Baek, Yon Dohn Chung
Abstract:
Local Differential Privacy (LDP) addresses significant privacy concerns in sensitive data collection. In this work, we focus on numerical data collection under LDP, targeting a significant gap in the literature: existing LDP mechanisms are optimized for either a very small ($|Ω| \in \{2, 3\}$) or infinite output spaces. However, no generalized method for constructing an optimal mechanism for an arbitrary output size $N$ exists. To fill this gap, we propose the \textbf{N-output mechanism}, a generalized framework that maps numerical data to one of $N$ discrete outputs. We formulate the mechanism's design as an optimization problem to minimize estimation variance for any given $N \geq 2$ and develop both numerical and analytical solutions. This results in a mechanism that is highly accurate and adaptive, as its design is determined by solving an optimization problem for any chosen $N$. Furthermore, we extend our framework and existing mechanisms to the task of distribution estimation. Empirical evaluations show that the N-output mechanism achieves state-of-the-art accuracy for mean, variance, and distribution estimation with small communication costs.
Authors:Wentian Zhu, Zhen Xiang, Wei Niu, Le Guan
Abstract:
Unlike regular tokens derived from existing text corpora, special tokens are artificially created to annotate structured conversations during the fine-tuning process of Large Language Models (LLMs). Serving as metadata of training data, these tokens play a crucial role in instructing LLMs to generate coherent and context-aware responses. We demonstrate that special tokens can be exploited to construct four attack primitives, with which malicious users can reliably bypass the internal safety alignment of online LLM services and circumvent state-of-the-art (SOTA) external content moderation systems simultaneously. Moreover, we found that addressing this threat is challenging, as aggressive defense mechanisms-such as input sanitization by removing special tokens entirely, as suggested in academia-are less effective than anticipated. This is because such defense can be evaded when the special tokens are replaced by regular ones with high semantic similarity within the tokenizer's embedding space. We systemically evaluated our method, named MetaBreak, on both lab environment and commercial LLM platforms. Our approach achieves jailbreak rates comparable to SOTA prompt-engineering-based solutions when no content moderation is deployed. However, when there is content moderation, MetaBreak outperforms SOTA solutions PAP and GPTFuzzer by 11.6% and 34.8%, respectively. Finally, since MetaBreak employs a fundamentally different strategy from prompt engineering, the two approaches can work synergistically. Notably, empowering MetaBreak on PAP and GPTFuzzer boosts jailbreak rates by 24.3% and 20.2%, respectively.
Authors:Yue Deng, Francisco Santos, Pang-Ning Tan, Lifeng Luo
Abstract:
Deep learning based weather forecasting (DLWF) models leverage past weather observations to generate future forecasts, supporting a wide range of downstream tasks, including tropical cyclone (TC) trajectory prediction. In this paper, we investigate their vulnerability to adversarial attacks, where subtle perturbations to the upstream weather forecasts can alter the downstream TC trajectory predictions. Although research on adversarial attacks in DLWF models has grown recently, generating perturbed upstream forecasts that reliably steer downstream output toward attacker-specified trajectories remains a challenge. First, conventional TC detection systems are opaque, non-differentiable black boxes, making standard gradient-based attacks infeasible. Second, the extreme rarity of TC events leads to severe class imbalance problem, making it difficult to develop efficient attack methods that will produce the attacker's target trajectories. Furthermore, maintaining physical consistency in adversarially generated forecasts presents another significant challenge. To overcome these limitations, we propose Cyc-Attack, a novel method that perturbs the upstream forecasts of DLWF models to generate adversarial trajectories. First, we pre-train a differentiable surrogate model to approximate the TC detector's output, enabling the construction of gradient-based attacks. Cyc-Attack also employs skewness-aware loss function with kernel dilation strategy to address the imbalance problem. Finally, a distance-based gradient weighting scheme and regularization are used to constrain the perturbations and eliminate spurious trajectories to ensure the adversarial forecasts are realistic and not easily detectable.
Authors:Raju Dhakal, Prashant Shekhar, Laxima Niure Kandel
Abstract:
Radio Frequency Fingerprinting (RFF) has evolved as an effective solution for authenticating devices by leveraging the unique imperfections in hardware components involved in the signal generation process. In this work, we propose a Convolutional Neural Network (CNN) based framework for detecting rogue devices and identifying genuine ones using softmax probability thresholding. We emulate an attack scenario in which adversaries attempt to mimic the RF characteristics of genuine devices by training a Generative Adversarial Network (GAN) using In-phase and Quadrature (IQ) samples from genuine devices. The proposed approach is verified using IQ samples collected from ten different ADALM-PLUTO Software Defined Radios (SDRs), with seven devices considered genuine, two as rogue, and one used for validation to determine the threshold.
Authors:Yanming Li, Seifeddine Ghozzi, Cédric Eichler, Nicolas Anciaux, Alexandra Bensamoun, Lorena Gonzalez Manzano
Abstract:
We address the problem of auditing whether sensitive or copyrighted texts were used to fine-tune large language models (LLMs) under black-box access. Prior signals-verbatim regurgitation and membership inference-are unreliable at the level of individual documents or require altering the visible text. We introduce a text-preserving watermarking framework that embeds sequences of invisible Unicode characters into documents. Each watermark is split into a cue (embedded in odd chunks) and a reply (embedded in even chunks). At audit time, we submit prompts that contain only the cue; the presence of the corresponding reply in the model's output provides evidence of memorization consistent with training on the marked text. To obtain sound decisions, we compare the score of the published watermark against a held-out set of counterfactual watermarks and apply a ranking test with a provable false-positive-rate bound. The design is (i) minimally invasive (no visible text changes), (ii) scalable to many users and documents via a large watermark space and multi-watermark attribution, and (iii) robust to common passive transformations. We evaluate on open-weight LLMs and multiple text domains, analyzing regurgitation dynamics, sensitivity to training set size, and interference under multiple concurrent watermarks. Our results demonstrate reliable post-hoc provenance signals with bounded FPR under black-box access. We experimentally observe a failure rate of less than 0.1\% when detecting a reply after fine-tuning with 50 marked documents. Conversely, no spurious reply was recovered in over 18,000 challenges, corresponding to a 100\%TPR@0\% FPR. Moreover, detection rates remain relatively stable as the dataset size increases, maintaining a per-document detection rate above 45\% even when the marked collection accounts for less than 0.33\% of the fine-tuning data.
Authors:Ashley Brown, Nilufer Tuptuk, Enrico Mariconti, Shane Johnson
Abstract:
It is well documented that criminals use IoT devices to facilitate crimes. The review process follows a systematic approach with a clear search strategy, and study selection strategy. The review included a total of 543 articles and the findings from these articles were synthesised through thematic analysis. Identified security attacks targeting consumer IoT devices include man-in-the-middle (MiTM) attacks, synchronisation attacks, Denial-of-Service (DoS), DNS poisoning and malware, alongside device-specific vulnerabilities. Besides security attacks, this review discusses mitigations. Furthermore, the literature also covers crime threat scenarios arising from these attacks, such as, fraud, identity theft, crypto jacking and domestic abuse.
Authors:Hrad Ghoukasian, Bonwoo Lee, Shahab Asoodeh
Abstract:
We study the problem of sampling from a distribution under local differential privacy (LDP). Given a private distribution $P \in \mathcal{P}$, the goal is to generate a single sample from a distribution that remains close to $P$ in $f$-divergence while satisfying the constraints of LDP. This task captures the fundamental challenge of producing realistic-looking data under strong privacy guarantees. While prior work by Park et al. (NeurIPS'24) focuses on global minimax-optimality across a class of distributions, we take a local perspective. Specifically, we examine the minimax risk in a neighborhood around a fixed distribution $P_0$, and characterize its exact value, which depends on both $P_0$ and the privacy level. Our main result shows that the local minimax risk is determined by the global minimax risk when the distribution class $\mathcal{P}$ is restricted to a neighborhood around $P_0$. To establish this, we (1) extend previous work from pure LDP to the more general functional LDP framework, and (2) prove that the globally optimal functional LDP sampler yields the optimal local sampler when constrained to distributions near $P_0$. Building on this, we also derive a simple closed-form expression for the locally minimax-optimal samplers which does not depend on the choice of $f$-divergence. We further argue that this local framework naturally models private sampling with public data, where the public data distribution is represented by $P_0$. In this setting, we empirically compare our locally optimal sampler to existing global methods, and demonstrate that it consistently outperforms global minimax samplers.
Authors:Alison Gonçalves Schemitt, Henrique Fan da Silva, Roben Castagna Lunardi, Diego Kreutz, Rodrigo Brandão Mansilha, Avelino Francisco Zorzo
Abstract:
The advent of quantum computing threatens the security of traditional encryption algorithms, motivating the development of post-quantum cryptography (PQC). In 2024, the National Institute of Standards and Technology (NIST) standardized several PQC algorithms, marking an important milestone in the transition toward quantum-resistant security. Blockchain systems fundamentally rely on cryptographic primitives to guarantee data integrity and transaction authenticity. However, widely used algorithms such as ECDSA, employed in Bitcoin, Ethereum, and other networks, are vulnerable to quantum attacks. Although adopting PQC is essential for long-term security, its computational overhead in blockchain environments remains largely unexplored. In this work, we propose a methodology for benchmarking both PQC and traditional cryptographic algorithms in blockchain contexts. We measure signature generation and verification times across diverse computational environments and simulate their impact at scale. Our evaluation focuses on PQC digital signature schemes (ML-DSA, Dilithium, Falcon, Mayo, SLH-DSA, SPHINCS+, and Cross) across security levels 1 to 5, comparing them to ECDSA, the current standard in Bitcoin and Ethereum. Our results indicate that PQC algorithms introduce only minor performance overhead at security level 1, while in some scenarios they significantly outperform ECDSA at higher security levels. For instance, ML-DSA achieves a verification time of 0.14 ms on an ARM-based laptop at level 5, compared to 0.88 ms for ECDSA. We also provide an open-source implementation to ensure reproducibility and encourage further research.
Authors:Abhishek K. Mishra, Antoine Boutet, Lucas Magnana
Abstract:
Large Language Models (LLMs) are increasingly deployed across multilingual applications that handle sensitive data, yet their scale and linguistic variability introduce major privacy risks. Mostly evaluated for English, this paper investigates how language structure affects privacy leakage in LLMs trained on English, Spanish, French, and Italian medical corpora. We quantify six linguistic indicators and evaluate three attack vectors: extraction, counterfactual memorization, and membership inference. Results show that privacy vulnerability scales with linguistic redundancy and tokenization granularity: Italian exhibits the strongest leakage, while English shows higher membership separability. In contrast, French and Spanish display greater resilience due to higher morphological complexity. Overall, our findings provide the first quantitative evidence that language matters in privacy leakage, underscoring the need for language-aware privacy-preserving mechanisms in LLM deployments.
Authors:Anne Müller, Mohd Kashif, Nico Döttling
Abstract:
Fully Homomorphic Encryption (FHE) enables the evaluation of programs directly on encrypted data. However, because only basic operations can be performed on ciphertexts, programs must be expressed as boolean or arithmetic circuits. This low-level representation makes implementing applications for FHE significantly more cumbersome than writing code in a high-level language. To reduce this burden, several transpilers have been developed that translate high-level code into circuit representations. In this work, we extend the range of high-level languages that can target FHE by introducing a transpiler for Haskell, which converts Haskell programs into Boolean circuits suitable for homomorphic evaluation. Our second contribution is the automatic parallelization of these generated circuits. We implement an evaluator that executes gates in parallel by parallelizing each layer of the circuit. We demonstrate the effectiveness of our approach on two key applications: Private Information Retrieval (PIR) and the AES encryption standard. Prior work has parallelized AES encryption manually. We demonstrate that the automated method outperforms some but not all manual parallelizations of AES evaluations under FHE. We achieve an evaluation time of 28 seconds for a parallel execution with 16 threads and an evaluation time of 8 seconds for a parallel execution with 100 threads
Authors:Daniel Pressensé, Elisavet Kozyri
Abstract:
This paper presents TracE2E, a middleware written in Rust, that can provide both data explainability and compliance across multiple nodes. By mediating inputs and outputs of processes, TracE2E records provenance information and enforces data-protection policies (e.g., confidentiality, integrity) that depend on the recorded provenance. Unlike existing approaches that necessitate substantial application modifications, TracE2E is designed for easy integration into existing and future applications through a wrapper of the Rust standard library's IO module. We describe how TracE2E consistently records provenance information across nodes, and we demonstrate how the compliance layer of TracE2E can accommodate the enforcement of multiple policies.
Authors:Stanisław Pawlak, Jan Dubiński, Daniel Marczak, Bartłomiej Twardowski
Abstract:
Model merging (MM) recently emerged as an effective method for combining large deep learning models. However, it poses significant security risks. Recent research shows that it is highly susceptible to backdoor attacks, which introduce a hidden trigger into a single fine-tuned model instance that allows the adversary to control the output of the final merged model at inference time. In this work, we propose a simple framework for understanding backdoor attacks by treating the attack itself as a task vector. $Backdoor\ Vector\ (BV)$ is calculated as the difference between the weights of a fine-tuned backdoored model and fine-tuned clean model. BVs reveal new insights into attacks understanding and a more effective framework to measure their similarity and transferability. Furthermore, we propose a novel method that enhances backdoor resilience through merging dubbed $Sparse\ Backdoor\ Vector\ (SBV)$ that combines multiple attacks into a single one. We identify the core vulnerability behind backdoor threats in MM: $inherent\ triggers$ that exploit adversarial weaknesses in the base model. To counter this, we propose $Injection\ BV\ Subtraction\ (IBVS)$ - an assumption-free defense against backdoors in MM. Our results show that SBVs surpass prior attacks and is the first method to leverage merging to improve backdoor effectiveness. At the same time, IBVS provides a lightweight, general defense that remains effective even when the backdoor threat is entirely unknown.
Authors:Kalyan Cheerla, Lotfi Ben Othmane, Kirill Morozov
Abstract:
Machine Learning (ML) is making its way into fields such as healthcare, finance, and Natural Language Processing (NLP), and concerns over data privacy and model confidentiality continue to grow. Privacy-preserving Machine Learning (PPML) addresses this challenge by enabling inference on private data without revealing sensitive inputs or proprietary models. Leveraging Secure Computation techniques from Cryptography, two widely studied approaches in this domain are Fully Homomorphic Encryption (FHE) and Garbled Circuits (GC). This work presents a comparative evaluation of FHE and GC for secure neural network inference. A two-layer neural network (NN) was implemented using the CKKS scheme from the Microsoft SEAL library (FHE) and the TinyGarble2.0 framework (GC) by IntelLabs. Both implementations are evaluated under the semi-honest threat model, measuring inference output error, round-trip time, peak memory usage, communication overhead, and communication rounds. Results reveal a trade-off: modular GC offers faster execution and lower memory consumption, while FHE supports non-interactive inference.
Authors:Yuhua Xu, Wei Sun, Chengpei Tang, Jiaxing Lu, Jingying Zhou, Chen Gu
Abstract:
Current generative steganography research mainly pursues computationally expensive mappings to perfect Gaussian priors within single diffusion model architectures. This work introduces an efficient framework based on approximate Gaussian mapping governed by a scale factor calibrated through capacity-aware adaptive optimization. Using this framework as a unified analytical tool, systematic comparative analysis of steganography in pixel-space models versus VAE-based latent-space systems is conducted. The investigation reveals a pronounced architecture dependent security-robustness trade-off: pixel-space models achieve high security against steganalysis but exhibit fragility to channel distortions, while VAE-based systems like Stable Diffusion offer substantial robustness at the cost of security vulnerabilities. Further analysis indicates that the VAE component drives this behavior through opposing mechanisms where the encoder confers robustness via manifold regularization while the decoder introduces vulnerabilities by amplifying latent perturbations into detectable artifacts. These findings characterize the conflicting architectural roles in generative steganography and establish a foundation for future research.
Authors:Abhishek Anand, Matthias C. Caro, Ari Karchmer, Saachi Mutreja
Abstract:
Quantum learning from remotely accessed quantum compute and data must address two key challenges: verifying the correctness of data and ensuring the privacy of the learner's data-collection strategies and resulting conclusions. The covert (verifiable) learning model of Canetti and Karchmer (TCC 2021) provides a framework for endowing classical learning algorithms with such guarantees. In this work, we propose models of covert verifiable learning in quantum learning theory and realize them without computational hardness assumptions for remote data access scenarios motivated by established quantum data advantages. We consider two privacy notions: (i) strategy-covertness, where the eavesdropper does not gain information about the learner's strategy; and (ii) target-covertness, where the eavesdropper does not gain information about the unknown object being learned. We show: Strategy-covert algorithms for making quantum statistical queries via classical shadows; Target-covert algorithms for learning quadratic functions from public quantum examples and private quantum statistical queries, for Pauli shadow tomography and stabilizer state learning from public multi-copy and private single-copy quantum measurements, and for solving Forrelation and Simon's problem from public quantum queries and private classical queries, where the adversary is a unidirectional or i.i.d. ancilla-free eavesdropper. The lattermost results in particular establish that the exponential separation between classical and quantum queries for Forrelation and Simon's problem survives under covertness constraints. Along the way, we design covert verifiable protocols for quantum data acquisition from public quantum queries which may be of independent interest. Overall, our models and corresponding algorithms demonstrate that quantum advantages are privately and verifiably achievable even with untrusted, remote data.
Authors:Rishabh Das. Aaron Werth, Tommy Morris
Abstract:
Industrial control system (ICS) operations use trusted endpoints like human machine interfaces (HMIs) and workstations to relay commands to programmable logic controllers (PLCs). Because most PLCs lack layered defenses, compromise of a trusted endpoint can drive unsafe actuator commands and risk safety-critical operation. This research presents an embedded intrusion detection system that runs inside the controller and uses header-level telemetry to detect and respond to network attacks. The system combines a semi-supervised anomaly detector and a supervised attack classifier. We evaluate the approach on a midstream oil-terminal testbed using three datasets collected during tanker-truck loading. The anomaly detector achieves zero missed attacks, corresponding to 0.998 Matthews correlation. The supervised stage attains 97.37 percent hold-out accuracy and 97.03 percent external accuracy. The embedded design adds a median of 2,031 microseconds of end-to-end latency and does not impact PLC's cycle time. The proposed architecture provides a multi-layer embedded security that meets the real-time requirements of an industrial system.
Authors:Philip Huff, Nishka Gandu, Pavel Novák
Abstract:
We examine the state of publicly available information about known exploitable vulnerabilities applicable to operational technology (OT) environments. Specifically, we analyze the Known Exploitable Vulnerabilities Catalog (KEVC) maintained by the US Department of Homeland Security Cybersecurity and Infrastructure Security Agency (CISA) to assess whether currently available data is sufficient for effective and reliable remediation in OT settings. Our team analyzed all KEVC entries through July 2025 to determine the extent to which OT environments can rely on existing remediation recommendations. We found that although most entries in the KEVC could affect OT environments, only 13% include vendor workarounds or mitigations as alternatives to patching. This paper also examines the feasibility of developing such alternatives based on vulnerability and exploit characteristics, and we present early evidence of success with this approach.
Authors:Didrik Bergström, Deniz Gündüz, Onur Günlü
Abstract:
We consider image transmission via deep joint source-channel coding (DeepJSCC) over multi-hop additive white Gaussian noise (AWGN) channels by training a DeepJSCC encoder-decoder pair with a pre-trained deep hash distillation (DHD) module to semantically cluster images, facilitating security-oriented applications through enhanced semantic consistency and improving the perceptual reconstruction quality. We train the DeepJSCC module to both reduce mean square error (MSE) and minimize cosine distance between DHD hashes of source and reconstructed images. Significantly improved perceptual quality as a result of semantic alignment is illustrated for different multi-hop settings, for which classical DeepJSCC may suffer from noise accumulation, measured by the learned perceptual image patch similarity (LPIPS) metric.
Authors:Riku Mochizuki, Shusuke Komatsu, Souta Noguchi, Kazuto Ataka
Abstract:
We analyze answers generated by generative engines (GEs) from the perspectives of citation publishers and the content-injection barrier, defined as the difficulty for attackers to manipulate answers to user prompts by placing malicious content on the web. GEs integrate two functions: web search and answer generation that cites web pages using large language models. Because anyone can publish information on the web, GEs are vulnerable to poisoning attacks. Existing studies of citation evaluation focus on how faithfully answer content reflects cited sources, leaving unexamined which web sources should be selected as citations to defend against poisoning attacks. To fill this gap, we introduce evaluation criteria that assess poisoning threats using the citation information contained in answers. Our criteria classify the publisher attributes of citations to estimate the content-injection barrier thereby revealing the threat of poisoning attacks in current GEs. We conduct experiments in political domains in Japan and the United States (U.S.) using our criteria and show that citations from official party websites (primary sources) are approximately \(25\%\)--\(45\%\) in the U.S. and \(60\%\)--\(65\%\) in Japan, indicating that U.S. political answers are at higher risk of poisoning attacks. We also find that sources with low content-injection barriers are frequently cited yet are poorly reflected in answer content. To mitigate this threat, we discuss how publishers of primary sources can increase exposure of their web content in answers and show that well-known techniques are limited by language differences.
Authors:Jack Vanlyssel, Enrique Sobrados, Ramsha Anwar, Gruia-Catalin Roman, Afsah Anwar
Abstract:
Small satellites are integral to scientific, commercial, and defense missions, but reliance on commercial off-the-shelf (COTS) hardware broadens their attack surface. Although supply chain threats are well studied in other cyber-physical domains, their feasibility and stealth in space systems remain largely unexplored. Prior work has focused on flight software, which benefits from strict security practices and oversight. In contrast, auxiliary COTS components often lack robust assurance yet enjoy comparable access to critical on-board resources, including telemetry, system calls, and the software bus. Despite this privileged access, the insider threat within COTS hardware supply chains has received little attention. In this work, we present SpyChain, the first end-to-end design and implementation of independent and colluding hardware supply chain threats targeting small satellites. Using NASA's satellite simulation (NOS3), we demonstrate that SpyChain can evade testing, exfiltrate telemetry, disrupt operations, and launch Denial of Service (DoS) attacks through covert channels that bypass ground monitoring. Our study traces an escalation from a simple solo component to dynamic, coordinating malware, introducing a taxonomy of stealth across five scenarios. We showcase how implicit trust in auxiliary components enables covert persistence and reveal novel attack vectors, highlighting a new multi-component execution technique that is now incorporated into the SPARTA matrix. Our findings are reinforced by acknowledgment and affirmation from NASA's NOS3 team. Finally, we implement lightweight onboard defenses, including runtime monitoring, to mitigate threats like SpyChain.
Authors:Suresh K. Damodaran, Paul D. Rowe
Abstract:
The emulation of multi-step attacks attributed to advanced persistent threats is valuable for training defenders and evaluating defense tools. In this paper, we discuss the numerous challenges and desired attributes associated with such automation. Additionally, we introduce the use of Effects Language (EL), a visual programming language with graph-based operational semantics, as a solution to address many of these challenges and requirements. We formally define the execution semantics of EL, and prove important execution properties. Furthermore, we showcase the application of EL to codify attacks using an example from one of the publicly available attack scenarios. We also demonstrate how EL can be utilized to provide proof-of-attack of complex multi-step attacks. Our results highlight the improvements in time and resource efficiency achieved through the use of EL for repeatable automation.
Authors:André Chailloux, Paul Hermouet
Abstract:
Chen, Liu, and Zhandry [CLZ22] introduced the problems $S|LWE\rangle$ and $C|LWE\rangle$ as quantum analogues of the Learning with Errors problem, designed to construct quantum algorithms for the Inhomogeneous Short Integer Solution ($ISIS$) problem. Several later works have used this framework for constructing new quantum algorithms in specific cases. However, the general relation between all these problems is still unknown. In this paper, we investigate the equivalence between $S|LWE\rangle$ and $ISIS$. We present the first fully generic reduction from $ISIS$ to $S|LWE\rangle$, valid even in the presence of errors in the underlying algorithms. We then explore the reverse direction, introducing an inhomogeneous variant of $C|LWE\rangle$, denoted $IC|LWE\rangle$, and show that $IC|LWE\rangle$ reduces to $S|LWE\rangle$. Finally, we prove that, under certain recoverability conditions, an algorithm for $ISIS$ can be transformed into one for $S|LWE\rangle$. We instantiate this reverse reduction by tweaking a known algorithm for $(I)SIS_\infty$ in order to construct quantum algorithm for $S|LWE\rangle$ when the alphabet size q is a small power of 2, recovering some results of Bai et al. [BJK+ 25]. Our results thus clarify the landscape of reductions between $S|LWE\rangle$ and $ISIS$, and we show both their strong connection as well as the remaining barriers for showing full equivalence.
Authors:Luke Stevenson, Sanchari Das
Abstract:
Mobile healthcare (mHealth) applications promise convenient, continuous patient-provider interaction but also introduce severe and often underexamined security and privacy risks. We present an end-to-end audit of 272 Android mHealth apps from Google Play, combining permission forensics, static vulnerability analysis, and user review mining. Our multi-tool assessment with MobSF, RiskInDroid, and OWASP Mobile Audit revealed systemic weaknesses: 26.1% request fine-grained location without disclosure, 18.3% initiate calls silently, and 73 send SMS without notice. Nearly half (49.3%) still use deprecated SHA-1 encryption, 42 transmit unencrypted data, and 6 remain vulnerable to StrandHogg 2.0. Analysis of 2.56 million user reviews found 28.5% negative or neutral sentiment, with over 553,000 explicitly citing privacy intrusions, data misuse, or operational instability. These findings demonstrate the urgent need for enforceable permission transparency, automated pre-market security vetting, and systematic adoption of secure-by-design practices to protect Protected Health Information (PHI).
Authors:Johnnatan Messias, Ayae Ide
Abstract:
Decentralized Autonomous Organizations (DAOs) aim to enable participatory governance, but in practice face challenges of voter apathy, concentration of voting power, and misaligned delegation. Existing delegation mechanisms often reinforce visibility biases, where a small set of highly ranked delegates accumulate disproportionate influence regardless of their alignment with the broader community. In this paper, we conduct an empirical study of delegation in DAO governance, combining on-chain data from five major protocols with off-chain discussions from 14 DAO forums. We develop a methodology to link forum participants to on-chain addresses, extract governance interests using large language models, and compare these interests against delegates' historical behavior. Our analysis reveals that delegations are frequently misaligned with token holders' expressed priorities and that current ranking-based interfaces exacerbate power concentration. We argue that incorporating interest alignment into delegation processes could mitigate these imbalances and improve the representativeness of DAO decision-making.
Authors:Md Rezanur Islam, Mahdi Sahlabadi, Keunkyoung Kim, Kangbin Yim
Abstract:
Security measures are essential in the automotive industry to detect intrusions in-vehicle networks. However, developing a one-size-fits-all Intrusion Detection System (IDS) is challenging because each vehicle has unique data profiles. This is due to the complex and dynamic nature of the data generated by vehicles regarding their model, driving style, test environment, and firmware update. To address this issue, a universal IDS has been developed that can be applied to all types of vehicles without the need for customization. Unlike conventional IDSs, the universal IDS can adapt to evolving data security issues resulting from firmware updates. In this study, a new hybrid approach has been developed, combining Pearson correlation with deep learning techniques. This approach has been tested using data obtained from four distinct mechanical and electronic vehicles, including Tesla, Sonata, and two Kia models. The data has been combined into two frequency datasets, and wavelet transformation has been employed to convert them into the frequency domain, enhancing generalizability. Additionally, a statistical method based on independent rule-based systems using Pearson correlation has been utilized to improve system performance. The system has been compared with eight different IDSs, three of which utilize the universal approach, while the remaining five are based on conventional techniques. The accuracy of each system has been evaluated through benchmarking, and the results demonstrate that the hybrid system effectively detects intrusions in various vehicle models.
Authors:Carolina Carreira, Anu Aggarwal, Alejandro Cuevas, Maria José Ferreira, Hanan Hibshi, Cleotilde Gonzalez
Abstract:
Understanding how cognitive biases influence adversarial decision-making is essential for developing effective cyber defenses. Capture-the-Flag (CTF) competitions provide an ecologically valid testbed to study attacker behavior at scale, simulating real-world intrusion scenarios under pressure. We analyze over 500,000 submission logs from picoCTF, a large educational CTF platform, to identify behavioral signatures of cognitive biases with defensive implications. Focusing on availability bias and the sunk cost fallacy, we employ a mixed-methods approach combining qualitative coding, descriptive statistics, and generalized linear modeling. Our findings show that participants often submitted flags with correct content but incorrect formatting (availability bias), and persisted in attempting challenges despite repeated failures and declining success probabilities (sunk cost fallacy). These patterns reveal that biases naturally shape attacker behavior in adversarial contexts. Building on these insights, we outline a framework for bias-informed adaptive defenses that anticipate, rather than simply react to, adversarial actions.
Authors:Mary Llewellyn, Annie Gray, Josh Collyer, Michael Harries
Abstract:
Before adopting a new large language model (LLM) architecture, it is critical to understand vulnerabilities accurately. Existing evaluations can be difficult to trust, often drawing conclusions from LLMs that are not meaningfully comparable, relying on heuristic inputs or employing metrics that fail to capture the inherent uncertainty. In this paper, we propose a principled and practical end-to-end framework for evaluating LLM vulnerabilities to prompt injection attacks. First, we propose practical approaches to experimental design, tackling unfair LLM comparisons by considering two practitioner scenarios: when training an LLM and when deploying a pre-trained LLM. Second, we address the analysis of experiments and propose a Bayesian hierarchical model with embedding-space clustering. This model is designed to improve uncertainty quantification in the common scenario that LLM outputs are not deterministic, test prompts are designed imperfectly, and practitioners only have a limited amount of compute to evaluate vulnerabilities. We show the improved inferential capabilities of the model in several prompt injection attack settings. Finally, we demonstrate the pipeline to evaluate the security of Transformer versus Mamba architectures. Our findings show that consideration of output variability can suggest less definitive findings. However, for some attacks, we find notably increased Transformer and Mamba-variant vulnerabilities across LLMs with the same training data or mathematical ability.
Authors:Ran Canetti, Ephraim Linder, Connor Wagaman
Abstract:
We initiate an investigation of learning tasks in a setting where the learner is given access to two competing provers, only one of which is honest. Specifically, we consider the power of such learners in assessing purported properties of opaque models. Following prior work that considers the power of competing provers in different settings, we call this setting refereed learning. After formulating a general definition of refereed learning tasks, we show refereed learning protocols that obtain a level of accuracy that far exceeds what is obtainable at comparable cost without provers, or even with a single prover. We concentrate on the task of choosing the better one out of two black-box models, with respect to some ground truth. While we consider a range of parameters, perhaps our most notable result is in the high-precision range: For all $\varepsilon>0$ and ambient dimension $d$, our learner makes only one query to the ground truth function, communicates only $(1+\frac{1}{\varepsilon^2})\cdot\text{poly}(d)$ bits with the provers, and outputs a model whose loss is within a multiplicative factor of $(1+\varepsilon)$ of the best model's loss. Obtaining comparable loss with a single prover would require the learner to access the ground truth at almost all of the points in the domain. To obtain this bound, we develop a technique that allows the learner to sample, using the provers, from a distribution that is not efficiently samplable to begin with. We find this technique to be of independent interest. We also present lower bounds that demonstrate the optimality of our protocols in a number of respects, including prover complexity, number of samples, and need for query access.
Authors:René Mayrhofer, Anja Lehmann, abhi shelat
Abstract:
This paper describes pseudonyms for the upcoming European Identity Wallet (EUDIW) architecture from both a cryptographic and an implementation perspective. Its main goal is to provide technical insights into the achievable properties and cryptographic realizations. In particular, we (1) outline the security and privacy requirements of EUDI pseudonyms as the basis for building consensus on the cross-country decision maker level; (2) sketch an abstract cryptographic protocol that fulfills these requirements; and (3) suggest two instantiation options for the protocol sketch based on well-studied building A complete specification of the formal properties, as well as the specific set of credential issuance, provisioning, and pseudonym presentation generation is outside the scope of this paper, but is expected to follow as future work.
Authors:Yahya Hassanzadeh-Nazarabadi, Sanaz Taheri-Boshrooyeh
Abstract:
Zero-knowledge Ethereum Virtual Machines (zkEVMs) must reconcile a fundamental contradiction: the Ethereum Virtual Machine was designed for transparent sequential execution, while zero-knowledge proofs require algebraic circuit representations. This survey provides the first systematic analysis of how existing major production zkEVM implementations resolve this tension through distinct constraint engineering strategies. We develop a comparative framework that maps the design space across three architectural dimensions. First, arithmetization schemes reveal stark trade-offs: R1CS requires compositional gadget libraries, PLONKish achieves elegance through custom gates that capture complex EVM opcodes in single constraints, while the homogeneous structure of AIR fundamentally mismatches the irregular instruction set of EVM. Second, dispatch mechanisms determine constraint activation patterns: selector-based systems waste trace width on inactive constraints, while ROM-based approaches trade memory lookups for execution flexibility. Third, the Type 1-4 spectrum quantifies an inescapable trade-off: the bit-level EVM compatibility of Type 1 demands significantly higher constraint complexity than the custom instruction sets of Type 4. Beyond cataloging implementations, we identify critical open problems across multiple domains: performance barriers preventing sub-second proving, absence of formal verification for constraint-to-EVM semantic equivalence, lack of standardized benchmarking frameworks, and architectural gaps in hybrid zkEVM/zkVM designs, decentralized prover coordination, privacy preservation, and interoperability.
Authors:Abdelilah Ganmati, Karim Afdel, Lahcen Koutti
Abstract:
In the era of pervasive cyber threats and exponential growth in digital services, the inadequacy of single-factor authentication has become increasingly evident. Multi-Factor Authentication (MFA), which combines knowledge-based factors (passwords, PINs), possession-based factors (smart cards, tokens), and inherence-based factors (biometric traits), has emerged as a robust defense mechanism. Recent breakthroughs in deep learning have transformed the capabilities of biometric systems, enabling higher accuracy, resilience to spoofing, and seamless integration with hardware-based solutions. At the same time, smart card technologies have evolved to include on-chip biometric verification, cryptographic processing, and secure storage, thereby enabling compact and secure multi-factor devices. This survey presents a comprehensive synthesis of recent work (2019-2025) at the intersection of deep learning, biometrics, and smart card technologies for MFA. We analyze biometric modalities (face, fingerprint, iris, voice), review hardware-based approaches (smart cards, NFC, TPMs, secure enclaves), and highlight integration strategies for real-world applications such as digital banking, healthcare IoT, and critical infrastructure. Furthermore, we discuss the major challenges that remain open, including usability-security tradeoffs, adversarial attacks on deep learning models, privacy concerns surrounding biometric data, and the need for standardization in MFA deployment. By consolidating current advancements, limitations, and research opportunities, this survey provides a roadmap for designing secure, scalable, and user-friendly authentication frameworks.
Authors:Abrar Shahid, Ibteeker Mahir Ishum, AKM Tahmidul Haque, M Sohel Rahman, A. B. M. Alim Al Islam
Abstract:
This paper presents a controlled study of adversarial reinforcement learning in network security through a custom OpenAI Gym environment that models brute-force attacks and reactive defenses on multi-port services. The environment captures realistic security trade-offs including background traffic noise, progressive exploitation mechanics, IP-based evasion tactics, honeypot traps, and multi-level rate-limiting defenses. Competing attacker and defender agents are trained using Deep Q-Networks (DQN) within a zero-sum reward framework, where successful exploits yield large terminal rewards while incremental actions incur small costs. Through systematic evaluation across multiple configurations (varying trap detection probabilities, exploitation difficulty thresholds, and training regimens), the results demonstrate that defender observability and trap effectiveness create substantial barriers to successful attacks. The experiments reveal that reward shaping and careful training scheduling are critical for learning stability in this adversarial setting. The defender consistently maintains strategic advantage across 50,000+ training episodes, with performance gains amplifying when exposed to complex defensive strategies including adaptive IP blocking and port-specific controls. Complete implementation details, reproducible hyperparameter configurations, and architectural guidelines are provided to support future research in adversarial RL for cybersecurity. The zero-sum formulation and realistic operational constraints make this environment suitable for studying autonomous defense systems, attacker-defender co-evolution, and transfer learning to real-world network security scenarios.
Authors:Prabhanjan Ananth, Eli Goldin
Abstract:
Quantum cryptographic definitions are often sensitive to the number of copies of the cryptographic states revealed to an adversary. Making definitional changes to the number of copies accessible to an adversary can drastically affect various aspects including the computational hardness, feasibility, and applicability of the resulting cryptographic scheme. This phenomenon appears in many places in quantum cryptography, including quantum pseudorandomness and unclonable cryptography. To address this, we present a generic approach to boost single-copy security to multi-copy security and apply this approach to many settings. As a consequence, we obtain the following new results: -One-copy stretch pseudorandom state generators (under mild assumptions) imply the existence of t-copy stretch pseudorandom state generators, for any fixed polynomial t. -One-query pseudorandom unitaries with short keys (under mild assumptions) imply the existence of t-query pseudorandom unitaries with short keys, for any fixed polynomial t. -Assuming indistinguishability obfuscation and other standard cryptographic assumptions, there exist identical-copy secure unclonable primitives such as public-key quantum money and quantum copy-protection.
Authors:J. E. M. Scanlon, A. Pelzer, M. Gharleghi, K. C. Fuhrmeister, T. Köllmer, P. Aichroth, R. Göder, C. Hansen, K. I. Wolf
Abstract:
Electroencephalogram monitoring devices and online data repositories hold large amounts of data from individuals participating in research and medical studies without direct reference to personal identifiers. This paper explores what types of personal and health information have been detected and classified within task-free EEG data. Additionally, we investigate key characteristics of the collected resting-state and sleep data, in order to determine the privacy risks involved with openly available EEG data. We used Google Scholar, Web of Science and searched relevant journals to find studies which classified or detected the presence of various disorders and personal information in resting state and sleep EEG. Only English full-text peer-reviewed journal articles or conference papers about classifying the presence of medical disorders between individuals were included. A quality analysis carried out by 3 reviewers determined general paper quality based on specified evaluation criteria. In resting state EEG, various disorders including Autism Spectrum Disorder, Parkinson's disease, and alcohol use disorder have been classified with high classification accuracy, often requiring only 5 mins of data or less. Sleep EEG tends to hold classifiable information about sleep disorders such as sleep apnea, insomnia, and REM sleep disorder, but usually involve longer recordings or data from multiple sleep stages. Many classification methods are still developing but even today, access to a person's EEG can reveal sensitive personal health information. With an increasing ability of machine learning methods to re-identify individuals from their EEG data, this review demonstrates the importance of anonymization, and the development of improved tools for keeping study participants and medical EEG users' privacy safe.
Authors:Ines Akaichi, Giorgos Flouris, Irini Fundulaki, Sabrina Kirrane
Abstract:
In the digital age, data frequently crosses organizational and jurisdictional boundaries, making effective governance essential. Usage control policies have emerged as a key paradigm for regulating data usage, safeguarding privacy, protecting intellectual property, and ensuring compliance with regulations. A central mechanism for usage control is the handling of obligations, which arise as a side effect of using and sharing data. Effective monitoring of obligations requires capturing usage traces and accounting for temporal aspects such as start times and deadlines, as obligations may evolve over times into different states, such as fulfilled, violated, or expired. While several solutions have been proposed for obligation monitoring, they often lack formal semantics or provide limited support for reasoning over obligation states. To address these limitations, we extend GUCON, a policy framework grounded in the formal semantics of SPAQRL graph patterns, to explicitly model the temporal aspects of an obligation. This extension enables the expressing of temporal obligations and supports continuous monitoring of their evolving states based on usage traces stored in temporal knowledge graphs. We demonstrate how this extended model can be represented using RDF-star and SPARQL-star and propose an Obligation State Manager that monitors obligation states and assess their compliance with respect to usage traces. Finally, we evaluate both the extended model and its prototype implementation.
Authors:Ali Asghar, Andreas Becher, Daniel Ziener
Abstract:
The dependence of power-consumption on the processed data is a known vulnerability of CMOS circuits, resulting in side channels which can be exploited by power-based side channel attacks (SCAs). These attacks can extract sensitive information, such as secret keys, from the implementation of cryptographic algorithms. Existing countermeasures against power-based side channel attacks focus on analyzing information leakage at the byte level. However, this approach neglects the impact of individual bits on the overall resistance of a cryptographic implementation. In this work, we present a countermeasure based on single-bit leakage. The results suggest that the proposed countermeasure cannot be broken by attacks using conventional SCA leakage models.
Authors:Yuki Takeuchi, Duo Xu
Abstract:
We present the first construction of a computational Certified Deletion Property (CDP) achievable with classical communication, derived from the compilation of the non-local Magic Square Game (MSG). We leverage the KLVY compiler to transform the non-local MSG into a 2-round interactive protocol, rigorously demonstrating that this compilation preserves the game-specific CDP. Previously, the quantum value and rigidity of the compiled game were investigated. We emphasize that we are the first to investigate CDP (local randomness in [Fu and Miller, Phys. Rev. A 97, 032324 (2018)]) for the compiled game. Then, we combine this CDP with the framework [Kitagawa, Morimae, and Yamakawa, Eurocrypt 2025] to construct Secure Key Leasing with classical Lessor (cSKL). SKL enables the Lessor to lease the secret key to the Lessee and verify that a quantum Lessee has indeed deleted the key. In this paper, we realize cSKL for PKE, PRF, and digital signature. Compared to prior works for cSKL, we realize cSKL for PRF and digital signature for the first time. In addition, we succeed in weakening the assumption needed to construct cSKL.
Authors:Prakhar Paliwal, Atul Kabra, Manjesh Kumar Hanawal
Abstract:
Rapid digitization of critical infrastructure has made cyberwarfare one of the important dimensions of modern conflicts. Attacking the critical infrastructure is an attractive pre-emptive proposition for adversaries as it can be done remotely without crossing borders. Such attacks disturb the support systems of the opponents to launch any offensive activities, crippling their fighting capabilities. Cyberattacks during cyberwarfare can not only be used to steal information, but also to spread disinformation to bring down the morale of the opponents. Recent wars in Europe, Africa, and Asia have demonstrated the scale and sophistication that the warring nations have deployed to take the early upper hand. In this work, we focus on the military action launched by India, code-named Operation Sindoor, to dismantle terror infrastructure emanating from Pakistan and the cyberattacks launched by Pakistan. In particular, we study the malware used by Pakistan APT groups to deploy Remote Access Trojans in Indian systems. We provide details of the tactics and techniques used in the RAT deployment and develop a telemetry framework to collect necessary event logs using Osquery with a custom extension. Finally, we develop a detection rule that can be readily deployed to detect the presence of the RAT or any exploitation performed by the malware.
Authors:Rohit Chatterjee, Changrui Mu, Prashant Nalini Vasudevan
Abstract:
We construct a public-key encryption scheme from the hardness of the (planted) MinRank problem over uniformly random instances. This corresponds to the hardness of decoding random linear rank-metric codes. Existing constructions of public-key encryption from such problems require hardness for structured instances arising from the masking of efficiently decodable codes. Central to our construction is the development of a new notion of duality for rank-metric codes.
Authors:Benjamin Marsh, Paolo Serafino
Abstract:
We introduce a modified Schnorr signature scheme to allow for time-bound signatures for transaction fee auction bidding and smart contract purposes in a blockchain context, ensuring an honest producer can only validate a signature before a given block height. The immutable blockchain is used as a source of universal time for the signature scheme. We show the use of such a signature scheme leads to lower MEV revenue for builders. We then apply our time-bound signatures to Ethereum's EIP-1559 and show how it can be used to mitigate the effect of MEV on predicted equilibrium strategies.
Authors:Saleh Darzi, Saif Eddine Nouma, Kiarash Sedghighadikolaei, Attila Altay
Abstract:
With advances in wireless communication and growing spectrum scarcity, Spectrum Access Systems (SASs) offer an opportunistic solution but face significant security challenges. Regulations require disclosure of location coordinates and transmission details, exposing user privacy and anonymity during spectrum queries, while the database operations themselves permit Denial-of-Service (DoS) attacks. As location-based services, SAS is also vulnerable to compromised or malicious users conducting spoofing attacks. These threats are further amplified given the quantum computing advancements. Thus, we propose QPADL, the first post-quantum (PQ) secure framework that simultaneously ensures privacy, anonymity, location verification, and DoS resilience while maintaining efficiency for large-scale spectrum access systems. QPADL introduces SAS-tailored private information retrieval for location privacy, a PQ-variant of Tor for anonymity, and employs advanced signature constructions for location verification alongside client puzzle protocols and rate-limiting technique for DoS defense. We formally assess its security and conduct a comprehensive performance evaluation, incorporating GPU parallelization and optimization strategies to demonstrate practicality and scalability.
Authors:Zachary Ezetta, Wu-chang Feng
Abstract:
Agentic AI is transforming security by automating many tasks being performed manually. While initial agentic approaches employed a monolithic architecture, the Model-Context-Protocol has now enabled a remote-procedure call (RPC) paradigm to agentic applications, allowing for the flexible construction and composition of multi-function agents. This paper describes PentestMCP, a library of MCP server implementations that support agentic penetration testing. By supporting common penetration testing tasks such as network scanning, resource enumeration, service fingerprinting, vulnerability scanning, exploitation, and post-exploitation, PentestMCP allows a developer to customize multi-agent workflows for performing penetration tests.
Authors:Fatmazohra Rezkellah, Ramzi Dakhmouche
Abstract:
With the increasing adoption of Large Language Models (LLMs), more customization is needed to ensure privacy-preserving and safe generation. We address this objective from two critical aspects: unlearning of sensitive information and robustness to jail-breaking attacks. We investigate various constrained optimization formulations that address both aspects in a \emph{unified manner}, by finding the smallest possible interventions on LLM weights that either make a given vocabulary set unreachable or embed the LLM with robustness to tailored attacks by shifting part of the weights to a \emph{safer} region. Beyond unifying two key properties, this approach contrasts with previous work in that it doesn't require an oracle classifier that is typically not available or represents a computational overhead. Surprisingly, we find that the simplest point-wise constraint-based intervention we propose leads to better performance than max-min interventions, while having a lower computational cost. Comparison against state-of-the-art defense methods demonstrates superior performance of the proposed approach.
Authors:Boniface M. Sindala, Ragib Hasan
Abstract:
Research management applications (RMA) are widely used in clinical research environments to collect, transmit, analyze, and store sensitive data. This data is so valuable making RMAs susceptible to security threats. This analysis, analyzes RMAs' security, focusing on Research Electronic Data Capture (REDCap) as an example. We explore the strengths and vulnerabilities within RMAs by evaluating the architecture, data flow, and security features. We identify and assess potential risks using the MITRE ATT\&CK framework and STRIDE model. We assess REDCap's defenses against common attack vectors focusing on security to provide confidentiality, integrity, availability, non-repudiation, and authentication. We conclude by proposing recommendations for enhancing the security of RMAs, ensuring that critical research data remains protected without compromising usability. This research aims to contribute towards a more secure framework for managing sensitive information in research-intensive environments.
Authors:Raik Dankworth, Gesina Schwalbe
Abstract:
Deep neural networks (NNs) for computer vision are vulnerable to adversarial attacks, i.e., miniscule malicious changes to inputs may induce unintuitive outputs. One key approach to verify and mitigate such robustness issues is to falsify expected output behavior. This allows, e.g., to locally proof security, or to (re)train NNs on obtained adversarial input examples. Due to the black-box nature of NNs, current attacks only falsify a class of the final output, such as flipping from $\texttt{stop_sign}$ to $\neg\texttt{stop_sign}$. In this short position paper we generalize this to search for generally illogical behavior, as considered in NN verification: falsify constraints (concept-based properties) involving further human-interpretable concepts, like $\texttt{red}\wedge\texttt{octogonal}\rightarrow\texttt{stop_sign}$. For this, an easy implementation of concept-based properties on already trained NNs is proposed using techniques from explainable artificial intelligence. Further, we sketch the theoretical proof that attacks on concept-based properties are expected to have a reduced search space compared to simple class falsification, whilst arguably be more aligned with intuitive robustness targets. As an outlook to this work in progress we hypothesize that this approach has potential to efficiently and simultaneously improve logical compliance and robustness.
Authors:Chenxiang Luo, David K. Y. Yau, Qun Song
Abstract:
Federated learning (FL) enables collaborative model training without sharing raw data but is vulnerable to gradient inversion attacks (GIAs), where adversaries reconstruct private data from shared gradients. Existing defenses either incur impractical computational overhead for embedded platforms or fail to achieve privacy protection and good model utility at the same time. Moreover, many defenses can be easily bypassed by adaptive adversaries who have obtained the defense details. To address these limitations, we propose SVDefense, a novel defense framework against GIAs that leverages the truncated Singular Value Decomposition (SVD) to obfuscate gradient updates. SVDefense introduces three key innovations, a Self-Adaptive Energy Threshold that adapts to client vulnerability, a Channel-Wise Weighted Approximation that selectively preserves essential gradient information for effective model training while enhancing privacy protection, and a Layer-Wise Weighted Aggregation for effective model aggregation under class imbalance. Our extensive evaluation shows that SVDefense outperforms existing defenses across multiple applications, including image classification, human activity recognition, and keyword spotting, by offering robust privacy protection with minimal impact on model accuracy. Furthermore, SVDefense is practical for deployment on various resource-constrained embedded platforms. We will make our code publicly available upon paper acceptance.
Authors:Su Kara, Fazle Faisal, Suman Nath
Abstract:
Recent advances in browser-based LLM agents have shown promise for automating tasks ranging from simple form filling to hotel booking or online shopping. Current benchmarks measure agent performance in controlled environments, such as containers or stable networks, where websites behave deterministically. However, in the real world, users access websites over networks and HTTPS connections that introduce instability from multiple sources: client-side, server-side issues or broader system failures. Moreover, live websites are prone to web attacks such Cross-Site Scripting, as well as general site modifications which can cause unexpected or malicious pop-ups or improper functionality. To address this gap, we present WAREX: Web Agent Reliability Evaluation on Existing Benchmarks. We measure the impact of WAREX across three popular benchmarks: WebArena, WebVoyager, and REAL. Our experiments show that introducing WAREX leads to significant drops in task success rates, highlighting the limited robustness of state-of-the-art agents.
Authors:Lambert Hogenhout, Rinzin Wangmo
Abstract:
The proliferation of digital technologies has led to unprecedented data collection, with facial data emerging as a particularly sensitive commodity. Companies are increasingly leveraging advanced facial recognition technologies, often without the explicit consent or awareness of individuals, to build sophisticated surveillance capabilities. This practice, fueled by weak and fragmented laws in many jurisdictions, has created a regulatory vacuum that allows for the commercialization of personal identity and poses significant threats to individual privacy and autonomy. This article introduces the concept of Facial Privacy. It analyzes the profound challenges posed by unregulated facial recognition by conducting a comprehensive review of existing legal frameworks. It examines and compares regulations such as the GDPR, Brazil's LGPD, Canada's PIPEDA, and privacy acts in China, Singapore, South Korea, and Japan, alongside sector-specific laws in the United States like the Illinois Biometric Information Privacy Act (BIPA). The analysis highlights the societal impacts of this technology, including the potential for discriminatory bias and the long-lasting harm that can result from the theft of immutable biometric data. Ultimately, the paper argues that existing legal loopholes and ambiguities leave individuals vulnerable. It proposes a new policy framework that shifts the paradigm from data as property to a model of inalienable rights, ensuring that fundamental human rights are upheld against unchecked technological expansion.
Authors:Kel Zin Tan, Prashant Nalini Vasudevan
Abstract:
A random local function defined by a $d$-ary predicate $P$ is one where each output bit is computed by applying $P$ to $d$ randomly chosen bits of its input. These represent natural distributions of instances for constraint satisfaction problems. They were put forward by Goldreich as candidates for low-complexity one-way functions, and have subsequently been widely studied also as potential pseudo-random generators. We present a new search-to-decision reduction for random local functions defined by any predicate of constant arity. Given any efficient algorithm that can distinguish, with advantage $ε$, the output of a random local function with $m$ outputs and $n$ inputs from random, our reduction produces an efficient algorithm that can invert such functions with $\tilde{O}(m(n/ε)^2)$ outputs, succeeding with probability $Ω(ε)$. This implies that if a family of local functions is one-way, then a related family with shorter output length is family of pseudo-random generators. Prior to our work, all such reductions that were known required the predicate to have additional sensitivity properties, whereas our reduction works for any predicate. Our results also generalise to some super-constant values of the arity $d$, and to noisy predicates.
Authors:Chinthana Wimalasuriya, Spyros Tragoudas
Abstract:
Adversarial attacks present a significant threat to modern machine learning systems. Yet, existing detection methods often lack the ability to detect unseen attacks or detect different attack types with a high level of accuracy. In this work, we propose a statistical approach that establishes a detection baseline before a neural network's deployment, enabling effective real-time adversarial detection. We generate a metric of adversarial presence by comparing the behavior of a compressed/uncompressed neural network pair. Our method has been tested against state-of-the-art techniques, and it achieves near-perfect detection across a wide range of attack types. Moreover, it significantly reduces false positives, making it both reliable and practical for real-world applications.
Authors:Bowei Ning, Xuejun Zong, Kan He
Abstract:
Industrial control systems (ICS) are vital to modern infrastructure but increasingly vulnerable to cybersecurity threats, particularly through weaknesses in their communication protocols. This paper presents MALF (Multi-Agent LLM Fuzzing Framework), an advanced fuzzing solution that integrates large language models (LLMs) with multi-agent coordination to identify vulnerabilities in industrial control protocols (ICPs). By leveraging Retrieval-Augmented Generation (RAG) for domain-specific knowledge and QLoRA fine-tuning for protocol-aware input generation, MALF enhances fuzz testing precision and adaptability. The multi-agent framework optimizes seed generation, mutation strategies, and feedback-driven refinement, leading to improved vulnerability discovery. Experiments on protocols like Modbus/TCP, S7Comm, and Ethernet/IP demonstrate that MALF surpasses traditional methods, achieving a test case pass rate (TCPR) of 88-92% and generating more exception triggers (ETN). MALF also maintains over 90% seed coverage and Shannon entropy values between 4.2 and 4.6 bits, ensuring diverse, protocol-compliant mutations. Deployed in a real-world Industrial Attack-Defense Range for power plants, MALF identified critical vulnerabilities, including three zero-day flaws, one confirmed and registered by CNVD. These results validate MALF's effectiveness in real-world fuzzing applications. This research highlights the transformative potential of multi-agent LLMs in ICS cybersecurity, offering a scalable, automated framework that sets a new standard for vulnerability discovery and strengthens critical infrastructure security against emerging threats.
Authors:Ahmed Danladi Abdullahi, Erfan Bahrami, Tooska Dargahi, Mohammed Al-Khalidi, Mohammad Hammoudeh
Abstract:
The advancement of 6G technology has the potential to revolutionize the transportation sector and significantly improve how we travel. 6G-enabled Intelligent Transportation Systems (ITS) promise to offer high-speed, low-latency communication and advanced data analytics capabilities, supporting the development of safer, more efficient, and more sustainable transportation solutions. However, various security and privacy challenges were identified in the literature that must be addressed to enable the safe and secure deployment of 6G-ITS and ensure people's trust in using these technologies. This paper reviews the opportunities and challenges of 6G-ITS, particularly focusing on trust, security, and privacy, with special attention to quantum technologies that both enhance security through quantum key distribution and introduce new vulnerabilities. It discusses the potential benefits of 6G technology in the transportation sector, including improved communication, device interoperability support, data analytic capabilities, and increased automation for different components, such as transportation management and communication systems. A taxonomy of different attack models in 6G-ITS is proposed, and a comparison of the security threats in 5G-ITS and 6G-ITS is provided, along with potential mitigating solutions. This research highlights the urgent need for a comprehensive, multi-layered security framework spanning physical infrastructure protection, network protocol security, data management safeguards, application security measures, and trust management systems to effectively mitigate emerging security and privacy risks and ensure the integrity and resilience of future transportation ecosystems.
Authors:Nik Rollinson, Nikolaos Polatidis
Abstract:
Android malware continues to evolve through obfuscation and polymorphism, posing challenges for both signature-based defenses and machine learning models trained on limited and imbalanced datasets. Synthetic data has been proposed as a remedy for scarcity, yet the role of large language models (LLMs) in generating effective malware data for detection tasks remains underexplored. In this study, we fine-tune GPT-4.1-mini to produce structured records for three malware families: BankBot, Locker/SLocker, and Airpush/StopSMS, using the KronoDroid dataset. After addressing generation inconsistencies with prompt engineering and post-processing, we evaluate multiple classifiers under three settings: training with real data only, real-plus-synthetic data, and synthetic data alone. Results show that real-only training achieves near perfect detection, while augmentation with synthetic data preserves high performance with only minor degradations. In contrast, synthetic-only training produces mixed outcomes, with effectiveness varying across malware families and fine-tuning strategies. These findings suggest that LLM-generated malware can enhance scarce datasets without compromising detection accuracy, but remains insufficient as a standalone training source.
Authors:Jingrong Xie, Yumin Li
Abstract:
This paper introduces a Bayesian approach to improve Interactive Voice Response (IVR) authentication processes used by financial institutions. Traditional IVR systems authenticate users through a static sequence of credentials, assuming uniform effectiveness among them. However, fraudsters exploit this predictability, selectively bypassing strong credentials. This study applies Bayes' Theorem and conditional probability modeling to evaluate fraud risk dynamically and adapt credential verification paths.
Authors:Ryan Marinelli, Angelica Chowdhury
Abstract:
In this endeavor, a proof-of-concept homomorphic application is developed to determine the production readiness of encryption ecosystems. A movie recommendation app is implemented for this purpose and productionized through containerization and orchestration. By tuning deployment configurations, the computational limitations of Fully Homomorphic Encryption (FHE) are mitigated through additional infrastructure optimizations Index Terms: Reinforcement Learning, Orchestration, Homomorphic Encryption
Authors:Bochra Al Agha, Razane Tajeddine
Abstract:
Smart grids are exposed to passive eavesdropping, where attackers listen silently to communication links. Although no data is actively altered, such reconnaissance can reveal grid topology, consumption patterns, and operational behavior, creating a gateway to more severe targeted attacks. Detecting this threat is difficult because the signals it produces are faint, short-lived, and often disappear when traffic is examined by a single node or along a single timeline. This paper introduces a graph-centric, multimodal detector that fuses physical-layer and behavioral indicators over ego-centric star subgraphs and short temporal windows to detect passive attacks. To capture stealthy perturbations, a two-stage encoder is introduced: graph convolution aggregates spatial context across ego-centric star subgraphs, while a bidirectional GRU models short-term temporal dependencies. The encoder transforms heterogeneous features into a unified spatio-temporal representation suitable for classification. Training occurs in a federated learning setup under FedProx, improving robustness to heterogeneous local raw data and contributing to the trustworthiness of decentralized training; raw measurements remain on client devices. A synthetic, standards-informed dataset is generated to emulate heterogeneous HAN/NAN/WAN communications with wireless-only passive perturbations, event co-occurrence, and leak-safe splits. The model achieves a testing accuracy of 98.32% per-timestep (F1_{attack}=0.972) and 93.35% per-sequence at 0.15% FPR using a simple decision rule with run-length m=2 and threshold $τ=0.55$. The results demonstrate that combining spatial and temporal context enables reliable detection of stealthy reconnaissance while maintaining low false-positive rates, making the approach suitable for non-IID federated smart-grid deployments.
Authors:Yapei Feng, Feng Jiang, Shanhao Wu, Hua Zhong
Abstract:
Neural linguistic steganography aims to embed information into natural text while preserving statistical undetectability. A fundamental challenge in this ffeld stems from tokenization ambiguity in modern tokenizers, which can lead to catastrophic decoding failures. The recent method, SyncPool, addresses this ambiguity by employing a coarse-grained synchronization mechanism over groups of ambiguous candidates. However, SyncPool sacriffces embedding capacity, as it utilizes the entire Shannon entropy of an ambiguous group solely for synchronization rather than for payload embedding. We propose a method named look-ahead Sync, which overcomes the capacity limitation of SyncPool while retaining its provable security guarantees. Our approach performs minimal synchronized sampling only on truly indistinguishable token sequences, while strategically preserving all other discernible paths to maximize embedding capacity. We provide theoretical proofs for the security of our method and analyze the gap between its achievable embedding capacity and the theoretical upper bound. Experiments on English (using Llama 3) and Chinese (using Qwen 2.5) benchmarks show that our method consistently approaches the theoretical capacity upper bound and signiffcantly outperforms SyncPool. The improvement in embedding rate exceeds 160% in English and 25% in Chinese, particularly in settings with larger candidate pools. This work represents a signiffcant step toward practical high-capacity provably secure linguistic steganography.
Authors:Jean-Francois Biasse, Fang Song
Abstract:
In this paper, we provide details on the proofs of the quantum polynomial time algorithm of Biasse and Song (SODA 16) for computing the $S$-unit group of a number field. This algorithm directly implies polynomial time methods to calculate class groups, S-class groups, relative class group and the unit group, ray class groups, solve the principal ideal problem, solve certain norm equations, and decompose ideal classes in the ideal class group. Additionally, combined with a result of Cramer, Ducas, Peikert and Regev (Eurocrypt 2016), the resolution of the principal ideal problem allows one to find short generators of a principal ideal. Likewise, methods due to Cramer, Ducas and Wesolowski (Eurocrypt 2017) use the resolution of the principal ideal problem and the decomposition of ideal classes to find so-called ``mildly short vectors'' in ideal lattices of cyclotomic fields.
Authors:Iyán Méndez Veiga, Esther Hänggi
Abstract:
Reproducible builds are a set of software development practices that establish an independently verifiable path from source code to binary artifacts, helping to detect and mitigate certain classes of supply chain attacks. Although quantum computing is a rapidly evolving field of research, it can already benefit from adopting reproducible builds. This paper aims to bridge the gap between the quantum computing and reproducible builds communities. We propose a generalization of the definition of reproducible builds in the quantum setting, motivated by two threat models: one targeting the confidentiality of end users' data during circuit preparation and submission to a quantum computer, and another compromising the integrity of quantum computation results. This work presents three examples that show how classical information can be hidden in transpiled quantum circuits, and two cases illustrating how even minimal modifications to these circuits can lead to incorrect quantum computation results. Our work provides initial steps towards a framework for reproducibility in quantum software toolchains.
Authors:N. A. Anagnostopoulos, K. Konstantinidis, A. N. Miliou, S. G. Stavrinides
Abstract:
In practical applications, it is crucial that the drive-response systems, although identical in all respects, are synchronized at all times, even if there is noise present. In this work, we test the stability and robustness of three distinct and well-known cryptographic chaotic systems, and compare the results in relation to the desired security.
Authors:Cristian Bassotto, Ermes Franch, Marina Krček, Stjepan Picek
Abstract:
The advent of quantum computing threatens classical public-key cryptography, motivating NIST's adoption of post-quantum schemes such as those based on the Module Learning With Errors (Module-LWE) problem. We present NoMod ML-Attack, a hybrid white-box cryptanalytic method that circumvents the challenge of modeling modular reduction by treating wrap-arounds as statistical corruption and casting secret recovery as robust linear estimation. Our approach combines optimized lattice preprocessing--including reduced-vector saving and algebraic amplification--with robust estimators trained via Tukey's Biweight loss. Experiments show NoMod achieves full recovery of binary secrets for dimension $n = 350$, recovery of sparse binomial secrets for $n = 256$, and successful recovery of sparse secrets in CRYSTALS-Kyber settings with parameters $(n, k) = (128, 3)$ and $(256, 2)$. We release our implementation in an anonymous repository https://anonymous.4open.science/r/NoMod-3BD4.
Authors:Aadarsh Anantha Ramakrishnan, Shubham Agarwal, Selvanayagam S, Kunwar Singh
Abstract:
As image generation models grow increasingly powerful and accessible, concerns around authenticity, ownership, and misuse of synthetic media have become critical. The ability to generate lifelike images indistinguishable from real ones introduces risks such as misinformation, deepfakes, and intellectual property violations. Traditional watermarking methods either degrade image quality, are easily removed, or require access to confidential model internals - making them unsuitable for secure and scalable deployment. We are the first to introduce ZK-WAGON, a novel system for watermarking image generation models using the Zero-Knowledge Succinct Non Interactive Argument of Knowledge (ZK-SNARKs). Our approach enables verifiable proof of origin without exposing model weights, generation prompts, or any sensitive internal information. We propose Selective Layer ZK-Circuit Creation (SL-ZKCC), a method to selectively convert key layers of an image generation model into a circuit, reducing proof generation time significantly. Generated ZK-SNARK proofs are imperceptibly embedded into a generated image via Least Significant Bit (LSB) steganography. We demonstrate this system on both GAN and Diffusion models, providing a secure, model-agnostic pipeline for trustworthy AI image generation.
Authors:Muhammad Faheemur Rahman, Wayne Burleson
Abstract:
Memristive crossbar arrays enable in-memory computing by performing parallel analog computations directly within memory, making them well-suited for machine learning, neural networks, and neuromorphic systems. However, despite their advantages, non-volatile memristors are vulnerable to security threats (such as adversarial extraction of stored weights when the hardware is compromised. Protecting these weights is essential since they represent valuable intellectual property resulting from lengthy and costly training processes using large, often proprietary, datasets. As a solution we propose two security mechanisms: Keyed Permutor and Watermark Protection Columns; where both safeguard critical weights and establish verifiable ownership (even in cases of data leakage). Our approach integrates efficiently with existing memristive crossbar architectures without significant design modifications. Simulations across 45nm, 22nm, and 7nm CMOS nodes, using a realistic interconnect model and a large RF dataset, show that both mechanisms offer robust protection with under 10% overhead in area, delay and power. We also present initial experiments employing the widely known MNIST dataset; further highlighting the feasibility of securing memristive in-memory computing systems with minimal performance trade-offs.
Authors:Zhixin Dong, Xian Xu, Yuhang Zeng, Mingchao Wan, Chunmiao Li
Abstract:
Modern blockchain systems operating in adversarial environments require robust consensus protocols that guarantee both safety and termination under network delay attacks. Tendermint, a widely adopted consensus protocol in consortium blockchains, achieves high throughput and finality. However, previous analysis of the safety and termination has been done in a standalone fashion, with no consideration of the composition with other protocols interacting with it in a concurrent manner. Moreover, the termination properties under adaptive network delays caused by Byzantine adversaries have not been formally analyzed. This paper presents the first universally composable (UC) security analysis of Tendermint, demonstrating its resilience against strategic message-delay attacks. By constructing a UC ideal model of Tendermint, we formalize its core mechanisms: phase-base consensus procedure, dynamic timeouts, proposal locking, leader rotation, and others, under a network adversary that selectively delays protocol messages. Our main result proves that the Tendermint protocol UC-realizes the ideal Tendermint model, which ensures bounded termination latency, i.e., guaranteed termination, even when up to $f
Authors:Maximilian Reif, Jens Zumbrägel
Abstract:
A visual cryptography scheme is a secret sharing scheme in which the secret information is an image and the shares are printed on transparencies, so that the secret image can be recovered by simply stacking the shares on top of each other. Such schemes do therefore not require any knowledge of cryptography tools to recover the secret, and they have widespread applications, for example, when sharing QR codes or medical images. In this work we deal with visual cryptography threshold schemes for color images. Our color model differs from most previous work by allowing arbitrary colors to be stacked, resulting in a possibly different color. This more general color monoid model enables us to achieve shorter pixel expansion and higher contrast than comparable schemes. We revisit the polynomial framework of Koga and Ishihara for constructing visual cryptography schemes and apply the monoid ring to obtain new schemes for color visual cryptography.
Authors:Andrew Gan, Zahra Ghodsi
Abstract:
Machine learning systems increasingly rely on open-source artifacts such as datasets and models that are created or hosted by other parties. The reliance on external datasets and pre-trained models exposes the system to supply chain attacks where an artifact can be poisoned before it is delivered to the end-user. Such attacks are possible due to the lack of any authenticity verification in existing machine learning systems. Incorporating cryptographic solutions such as hashing and signing can mitigate the risk of supply chain attacks. However, existing frameworks for integrity verification based on cryptographic techniques can incur significant overhead when applied to state-of-the-art machine learning artifacts due to their scale, and are not compatible with GPU platforms. In this paper, we develop Sentry, a novel GPU-based framework that verifies the authenticity of machine learning artifacts by implementing cryptographic signing and verification for datasets and models. Sentry ties developer identities to signatures and performs authentication on the fly as artifacts are loaded on GPU memory, making it compatible with GPU data movement solutions such as NVIDIA GPUDirect that bypass the CPU. Sentry incorporates GPU acceleration of cryptographic hash constructions such as Merkle tree and lattice hashing, implementing memory optimizations and resource partitioning schemes for a high throughput performance. Our evaluations show that Sentry is a practical solution to bring authenticity to machine learning systems, achieving orders of magnitude speedup over a CPU-based baseline.
Authors:Anbi Guo, Mahfuza Farooque
Abstract:
Structured security logs are critical for detecting advanced persistent threats (APTs). Large language models (LLMs) struggle in this domain due to limited context and domain mismatch. We propose \textbf{DM-RAG}, a dual-memory retrieval-augmented generation framework for structured log analysis. It integrates a short-term memory buffer for recent summaries and a long-term FAISS-indexed memory for historical patterns. An instruction-tuned Phi-4-mini processes the combined context and outputs structured predictions. Bayesian fusion promotes reliable persistence into memory. On the UNSW-NB15 dataset, DM-RAG achieves 53.64% accuracy and 98.70% recall, surpassing fine-tuned and RAG baselines in recall. The architecture is lightweight, interpretable, and scalable, enabling real-time threat monitoring without extra corpora or heavy tuning.
Authors:Günter F. Steinke, Thomas Steinke
Abstract:
Standard techniques for differentially private estimation, such as Laplace or Gaussian noise addition, require guaranteed bounds on the sensitivity of the estimator in question. But such sensitivity bounds are often large or simply unknown. Thus we seek differentially private methods that can be applied to arbitrary black-box functions. A handful of such techniques exist, but all are either inefficient in their use of data or require evaluating the function on exponentially many inputs. In this work we present a scheme that trades off between statistical efficiency (i.e., how much data is needed) and oracle efficiency (i.e., the number of evaluations). We also present lower bounds showing the near-optimality of our scheme.
Authors:Aleksandra KnapiÅska, Marija Furdek
Abstract:
Detection of emerging attacks on network infrastructure is a critical aspect of security management. To meet the growing scale and complexity of modern threats, machine learning (ML) techniques offer valuable tools for automating the detection of malicious activities. However, as these techniques become more complex, their internal operations grow increasingly opaque. In this context, we address the need for explainable physical-layer attack detection methods. First, we analyze the inner workings of various classifiers trained to alert about physical layer intrusions, examining how the influence of different monitored parameters varies depending on the type of attack being detected. This analysis not only improves the interpretability of the models but also suggests ways to enhance their design for increased speed. In the second part, we evaluate the detectors' resilience to malicious parameter noising. The results highlight a key trade-off between model speed and resilience. This work serves as a design guideline for developing fast and robust detectors trained on available network monitoring data.
Authors:Izaiah Sun, Daniel Tan, Andy Deng
Abstract:
We present LISA, an agentic smart contract vulnerability detection framework that combines rule-based and logic-based methods to address a broad spectrum of vulnerabilities in smart contracts. LISA leverages data from historical audit reports to learn the detection experience (without model fine-tuning), enabling it to generalize learned patterns to unseen projects and evolving threat profiles. In our evaluation, LISA significantly outperforms both LLM-based approaches and traditional static analysis tools, achieving superior coverage of vulnerability types and higher detection accuracy. Our results suggest that LISA offers a compelling solution for industry: delivering more reliable and comprehensive vulnerability detection while reducing the dependence on manual effort.
Authors:Thomas Fargues, Ye Dong, Tianwei Zhang, Jin-Song Dong
Abstract:
The rapid growth of Large Language Models (LLMs) has highlighted the pressing need for reliable mechanisms to verify content ownership and ensure traceability. Watermarking offers a promising path forward, but it remains limited by privacy concerns in sensitive scenarios, as traditional approaches often require direct access to a model's parameters or its training data. In this work, we propose a secure multi-party computation (MPC)-based private LLMs watermarking framework, PRIVMARK, to address the concerns. Concretely, we investigate PostMark (EMNLP'2024), one of the state-of-the-art LLMs Watermarking methods, and formulate its basic operations. Then, we construct efficient protocols for these operations using the MPC primitives in a black-box manner. In this way, PRIVMARK enables multiple parties to collaboratively watermark an LLM's output without exposing the model's weights to any single computing party. We implement PRIVMARK using SecretFlow-SPU (USENIX ATC'2023) and evaluate its performance using the ABY3 (CCS'2018) backend. The experimental results show that PRIVMARK achieves semantically identical results compared to the plaintext baseline without MPC and is resistant against paraphrasing and removing attacks with reasonable efficiency.
Authors:Yuzhen Long, Songze Li
Abstract:
Autonomous driving systems increasingly rely on multi-agent architectures powered by large language models (LLMs), where specialized agents collaborate to perceive, reason, and plan. A key component of these systems is the shared function library, a collection of software tools that agents use to process sensor data and navigate complex driving environments. Despite its critical role in agent decision-making, the function library remains an under-explored vulnerability. In this paper, we introduce FuncPoison, a novel poisoning-based attack targeting the function library to manipulate the behavior of LLM-driven multi-agent autonomous systems. FuncPoison exploits two key weaknesses in how agents access the function library: (1) agents rely on text-based instructions to select tools; and (2) these tools are activated using standardized command formats that attackers can replicate. By injecting malicious tools with deceptive instructions, FuncPoison manipulates one agent s decisions--such as misinterpreting road conditions--triggering cascading errors that mislead other agents in the system. We experimentally evaluate FuncPoison on two representative multi-agent autonomous driving systems, demonstrating its ability to significantly degrade trajectory accuracy, flexibly target specific agents to induce coordinated misbehavior, and evade diverse defense mechanisms. Our results reveal that the function library, often considered a simple toolset, can serve as a critical attack surface in LLM-based autonomous driving systems, raising elevated concerns on their reliability.
Authors:Pranav Garimidi, Joachim Neu, Max Resnick
Abstract:
Traditional single-proposer blockchains suffer from miner extractable value (MEV), where validators exploit their serial monopoly on transaction inclusion and ordering to extract rents from users. While there have been many developments at the application layer to reduce the impact of MEV, these approaches largely require auctions as a subcomponent. Running auctions efficiently on chain requires two key properties of the underlying consensus protocol: selective-censorship resistance and hiding. These properties guarantee that an adversary can neither selectively delay transactions nor see their contents before they are confirmed. We propose a multiple concurrent proposer (MCP) protocol offering exactly these properties.
Authors:Jaxson Brown, Duc-Son Pham, Sie-Teng Soh, Foad Motalebi, Sivaraman Eswaran, Mahathir Almashor
Abstract:
Industrial Control Systems (ICSs) are complex interconnected systems used to manage process control within industrial environments, such as chemical processing plants and water treatment facilities. As the modern industrial environment moves towards Internet-facing services, ICSs face an increased risk of attacks that necessitates ICS-specific Intrusion Detection Systems (IDS). The development of such IDS relies significantly on a simulated testbed as it is unrealistic and sometimes hazardous to utilize an operational control system. Whilst some testbeds have been proposed, they often use a limited selection of virtual ICS simulations to test and verify cyber security solutions. There is a lack of investigation done on developing systems that can efficiently simulate multiple ICS architectures. Currently, the trend within research involves developing security solutions on just one ICS simulation, which can result in bias to its specific architecture. We present ICS-SimLab, an end-to-end software suite that utilizes Docker containerization technology to create a highly configurable ICS simulation environment. This software framework enables researchers to rapidly build and customize different ICS environments, facilitating the development of security solutions across different systems that adhere to the Purdue Enterprise Reference Architecture. To demonstrate its capability, we present three virtual ICS simulations: a solar panel smart grid, a water bottle filling facility, and a system of intelligent electronic devices. Furthermore, we run cyber-attacks on these simulations and construct a dataset of recorded malicious and benign network traffic to be used for IDS development.
Authors:Yousef Tahboub, Anthony Revilla, Jaydon Lynch, Greg Floyd
Abstract:
Casting a ballot from a phone or laptop sounds appealing, but only if voters can be confident their choice remains secret and results cannot be altered in the dark. This paper proposes a hybrid blockchain-based voting model that stores encrypted votes on a private blockchain maintained by election organizers and neutral observers, while periodically anchoring hashes of these votes onto a public blockchain as a tamper-evident seal. The system issues voters one-time blind-signed tokens to protect anonymity, and provides receipts so they can confirm their vote was counted. We implemented a live prototype using common web technologies (Next.js, React, Firebase) to demonstrate end-to-end functionality, accessibility, and cost efficiency. Our contributions include developing a working demo, a complete election workflow, a hybrid blockchain design, and a user-friendly interface that balances privacy, security, transparency, and practicality. This research highlights the feasibility of secure, verifiable, and scalable online voting for organizations ranging from small groups to larger institutions.
Authors:Aditi Tiwari, Akshit Bhalla, Darshan Prasad
Abstract:
The Model Context Protocol (MCP) defines a schema bound execution model for agent-tool interaction, enabling modular computer vision workflows without retraining. To our knowledge, this is the first protocol level, deployment scale audit of MCP in vision systems, identifying systemic weaknesses in schema semantics, interoperability, and runtime coordination. We analyze 91 publicly registered vision centric MCP servers, annotated along nine dimensions of compositional fidelity, and develop an executable benchmark with validators to detect and categorize protocol violations. The audit reveals high prevalence of schema format divergence, missing runtime schema validation, undeclared coordinate conventions, and reliance on untracked bridging scripts. Validator based testing quantifies these failures, with schema format checks flagging misalignments in 78.0 percent of systems, coordinate convention checks detecting spatial reference errors in 24.6 percent, and memory scope checks issuing an average of 33.8 warnings per 100 executions. Security probes show that dynamic and multi agent workflows exhibit elevated risks of privilege escalation and untyped tool connections. The proposed benchmark and validator suite, implemented in a controlled testbed and to be released on GitHub, establishes a reproducible framework for measuring and improving the reliability and security of compositional vision workflows.
Authors:Friedrich Doku, Peter Dinda
Abstract:
Modern IoT and embedded platforms must start execution from a known trusted state to thwart malware, ensure secure firmware updates, and protect critical infrastructure. Current approaches to establish a root of trust depend on secret keys and/or specialized secure hardware, which drives up costs, may involve third parties, adds operational complexity, and relies on assumptions about an attacker's computational power. In contrast, TRUSTCHECKPOINTS is the first system to establish an unconditional software root of trust based on a formal model without relying on secrets or trusted hardware. Developers capture a full-system checkpoint and later roll back to it and prove this to an external verifier. The verifier issues timing-constrained, randomized k-independent polynomial challenges (via Horner's rule) that repeatedly scan the fast on-chip memory in randomized passes. When malicious code attempts to persist, it must swap into slower, unchecked off-chip storage, causing a detectable timing delay. Our prototype for a commodity ARM Cortex-A53-based platform validates 192 KB of SRAM in approximately 10 s using 500 passes, sufficient to detect single-instruction persistent malware. The prototype then seamlessly extends trust to DRAM. Two modes (fast SRAM-bootstrap and comprehensive full-memory scan) allow trade-offs between speed and coverage, demonstrating reliable malware detection on unmodified hardware.
Authors:Mathilde Durieux, Kayla D. Taylor, Laxima Niure Kandel, Deepti Gupta
Abstract:
Global Positioning System (GPS) spoofing involves transmitting fake signals that mimic those from GPS satellites, causing the GPS receivers to calculate incorrect Positioning, Navigation, and Timing (PNT) information. Recently, there has been a surge in GPS spoofing attacks targeting aircraft. Since GPS satellite signals are weak, the spoofed high-power signal can easily overpower them. These spoofed signals are often interpreted as valid by the GPS receiver, which can cause severe and cascading effects on air navigation. While much of the existing research on GPS spoofing focuses on technical aspects of detection and mitigation, human factors are often neglected, even though pilots are an integral part of aircraft operation and potentially vulnerable to deception. This research addresses this gap by conducting a detailed analysis of the behavior of student pilots when subjected to GPS spoofing using the Force Dynamics 401CR flight simulator with X-Plane 11 and a Cessna 172 equipped with Garmin G1000. Spoofing scenarios were implemented via custom scripts that altered navigational data without modifying the external visual environment. Thirty student pilots from the Embry-Riddle Aeronautical University Daytona Beach campus with diverse flying experience levels were recruited to participate in three spoofing scenarios. A pre-simulation questionnaire was distributed to measure pilot experience and confidence in GPS.Inflight decision-making during the spoofing attacks was observed, including reaction time to anomalies, visual attention to interface elements, and cognitive biases. A post-flight evaluation of workload was obtained using a modified NASA Task Load Index (TLX) method. This study provides a first step toward identifying human vulnerabilities to GPS spoofing amid the ongoing debate over GPS reliance.
Authors:Gustavo Sánchez, Ghada Elbez, Veit Hagenmeyer
Abstract:
The escalating frequency and sophistication of cyber threats increased the need for their comprehensive understanding. This paper explores the intersection of geopolitical dynamics, cyber threat intelligence analysis, and advanced detection technologies, with a focus on the energy domain. We leverage generative artificial intelligence to extract and structure information from raw cyber threat descriptions, enabling enhanced analysis. By conducting a geopolitical comparison of threat actor origins and target regions across multiple databases, we provide insights into trends within the general threat landscape. Additionally, we evaluate the effectiveness of cybersecurity tools -- with particular emphasis on learning-based techniques -- in detecting indicators of compromise for energy-targeted attacks. This analysis yields new insights, providing actionable information to researchers, policy makers, and cybersecurity professionals.
Authors:Stefan Marksteiner, Mikael Sjödin, Marjan Sirjani
Abstract:
Cyber-physical systems are part of industrial systems and critical infrastructure. Therefore, they should be examined in a comprehensive manner to verify their correctness and security. At the same time, the complexity of such systems demands such examinations to be systematic and, if possible, automated for efficiency and accuracy. A method that can be useful in this context is model checking. However, this requires a model that faithfully represents the behavior of the examined system. Obtaining such a model is not trivial, as many of these systems can be examined only in black box settings due to, e.g., long supply chains or secrecy. We therefore utilize active black box learning techniques to infer behavioral models in the form of Mealy machines of such systems and translate them into a form that can be evaluated using a model checker. To this end, we will investigate a cyber-physical systems as a black box using its external communication interface. We first annotate the model with propositions by mapping context information from the respective protocol to the model using Context-based Proposition Maps (CPMs). We gain annotated Mealy machines that resemble Kripke structures. We then formally define a template, to transfer the structures model checker-compatible format. We further define generic security properties based on basic security requirements. Due to the used CPMs, we can instantiate these properties with a meaningful context to check a specific protocol, which makes the approach flexible and scalable. The gained model can be easily altered to introduce non-deterministic behavior (like timeouts) or faults and examined if the properties still. Lastly, we demonstrate the versatility of the approach by providing case studies of different communication protocols (NFC and UDS), checked with the same tool chain and the same security properties.
Authors:Johnnatan Messias, Christof Ferreira Torres
Abstract:
DeFi applications are vulnerable to MEV, where specialized actors profit by reordering or inserting transactions. To mitigate latency races and internalize MEV revenue, Arbitrum introduced Timeboost, an auction-based transaction sequencing mechanism that grants short-term priority access to an express lane. In this paper we present the first large-scale empirical study of Timeboost, analyzing over 11.5 million express lane transactions and 151 thousand auctions between April and July 2025. Our results reveal five main findings. First, express lane control is highly centralized, with two entities winning more than 90% of auctions. Second, while express lane access provides earlier inclusion, profitable MEV opportunities cluster at the end of blocks, limiting the value of priority access. Third, approximately 22% of time-boosted transactions are reverted, indicating that the Timeboost does not effectively mitigate spam. Fourth, secondary markets for reselling express lane rights have collapsed due to poor execution reliability and unsustainable economics. Finally, auction competition declined over time, leading to steadily reduced revenue for the Arbitrum DAO. Taken together, these findings show that Timeboost fails to deliver on its stated goals of fairness, decentralization, and spam reduction. Instead, it reinforces centralization and narrows adoption, highlighting the limitations of auction-based ordering as a mechanism for fair transaction sequencing in rollups.
Authors:Li Xia, Zheng Liu, Sili Huang, Wei Tang, Xuan Liu
Abstract:
Federated Learning (FL) preserves privacy by keeping raw data local, yet Gradient Inversion Attacks (GIAs) pose significant threats. In FedAVG multi-step scenarios, attackers observe only aggregated gradients, making data reconstruction challenging. Existing surrogate model methods like SME assume linear parameter trajectories, but we demonstrate this severely underestimates SGD's nonlinear complexity, fundamentally limiting attack effectiveness. We propose Non-Linear Surrogate Model Extension (NL-SME), the first method to introduce nonlinear parametric trajectory modeling for GIAs. Our approach replaces linear interpolation with learnable quadratic Bézier curves that capture SGD's curved characteristics through control points, combined with regularization and dvec scaling mechanisms for enhanced expressiveness. Extensive experiments on CIFAR-100 and FEMNIST datasets show NL-SME significantly outperforms baselines across all metrics, achieving order-of-magnitude improvements in cosine similarity loss while maintaining computational efficiency.This work exposes heightened privacy vulnerabilities in FL's multi-step update paradigm and offers novel perspectives for developing robust defense strategies.
Authors:Mingkai Li, Hang Ye, Joseph Devietti, Suman Jana, Tanvir Ahmed Khan
Abstract:
Memory safety bugs, such as buffer overflows and use-after-frees, are the leading causes of software safety issues in production. Software-based approaches, e.g., Address Sanitizer (ASAN), can detect such bugs with high precision, but with prohibitively high overhead. ARM's Memory Tagging Extension (MTE) offers a promising alternative to detect these bugs in hardware with a much lower overhead. However, in this paper, we perform a thorough investigation of Google Pixel 8, the first production implementation of ARM MTE, and show that MTE can only achieve coarse precision in bug detection compared with software-based approaches such as ASAN, mainly due to its 16-byte tag granularity. To address this issue, we present NanoTag, a system to detect memory safety bugs in unmodified binaries at byte granularity with ARM MTE. NanoTag detects intra-granule buffer overflows by setting up a tripwire for tag granules that may require intra-granule overflow detection. The memory access to the tripwire causes additional overflow detection in the software while using MTE's hardware to detect bugs for the rest of the accesses. We implement NanoTag based on the Scudo Hardened Allocator, the default memory allocator on Android since Android 11. Our evaluation results across popular benchmarks and real-world case studies show that NanoTag detects nearly as many memory safety bugs as ASAN while incurring similar run-time overhead to Scudo Hardened Allocator in MTE SYNC mode.
Authors:Alexandru IoniÅ£Ä, Andreea IoniÅ£Ä
Abstract:
With the increased interest in artificial intelligence, Machine Learning as a Service provides the infrastructure in the Cloud for easy training, testing, and deploying models. However, these systems have a major privacy issue: uploading sensitive data to the Cloud, especially during training. Therefore, achieving secure Neural Network training has been on many researchers' minds lately. More and more solutions for this problem are built around a main pillar: Functional Encryption (FE). Although these approaches are very interesting and offer a new perspective on ML training over encrypted data, some vulnerabilities do not seem to be taken into consideration. In our paper, we present an attack on neural networks that uses FE for secure training over encrypted data. Our approach uses linear programming to reconstruct the original input, unveiling the previous security promises. To address the attack, we propose two solutions for secure training and inference that involve the client during the computation phase. One approach ensures security without relying on encryption, while the other uses function-hiding inner-product techniques.
Authors:Yu-Kai Shih, You-Kai Kang
Abstract:
As smart tourism evolves, AI-powered chatbots have become indispensable for delivering personalized, real-time assistance to travelers while promoting sustainability and efficiency. However, these systems are increasingly vulnerable to prompt injection attacks, where adversaries manipulate inputs to elicit unintended behaviors such as leaking sensitive information or generating harmful content. This paper presents a case study on the design and implementation of a secure retrieval-augmented generation (RAG) chatbot for Hsinchu smart tourism services. The system integrates RAG with API function calls, multi-layered linguistic analysis, and guardrails against injections, achieving high contextual awareness and security. Key features include a tiered response strategy, RAG-driven knowledge grounding, and intent decomposition across lexical, semantic, and pragmatic levels. Defense mechanisms include system norms, gatekeepers for intent judgment, and reverse RAG text to prioritize verified data. We also benchmark a GPT-5 variant (released 2025-08-07) to assess inherent robustness. Evaluations with 674 adversarial prompts and 223 benign queries show over 95% accuracy on benign tasks and substantial detection of injection attacks. GPT-5 blocked about 85% of attacks, showing progress yet highlighting the need for layered defenses. Findings emphasize contributions to sustainable tourism, multilingual accessibility, and ethical AI deployment. This work offers a practical framework for deploying secure chatbots in smart tourism and contributes to resilient, trustworthy AI applications.
Authors:Amr Akmal Abouelmagd, Amr Hilal
Abstract:
Federated Learning (FL) facilitates collaborative model training while keeping raw data decentralized, making it a conduit for leveraging the power of IoT devices while maintaining privacy of the locally collected data. However, existing privacy- preserving techniques present notable hurdles. Methods such as Multi-Party Computation (MPC), Homomorphic Encryption (HE), and Differential Privacy (DP) often incur high compu- tational costs and suffer from limited scalability. This survey examines emerging approaches that hold promise for enhancing both privacy and efficiency in FL, including Trusted Execution Environments (TEEs), Physical Unclonable Functions (PUFs), Quantum Computing (QC), Chaos-Based Encryption (CBE), Neuromorphic Computing (NC), and Swarm Intelligence (SI). For each paradigm, we assess its relevance to the FL pipeline, outlining its strengths, limitations, and practical considerations. We conclude by highlighting open challenges and prospective research avenues, offering a detailed roadmap for advancing secure and scalable FL systems.
Authors:Wei Huang, De-Tian Chu, Lin-Yuan Bai, Wei Kang, Hai-Tao Zhang, Bo Li, Zhi-Mo Han, Jing Ge, Hai-Feng Lin
Abstract:
Modern email spam and phishing attacks have evolved far beyond keyword blacklists or simple heuristics. Adversaries now craft multi-modal campaigns that combine natural-language text with obfuscated URLs, forged headers, and malicious attachments, adapting their strategies within days to bypass filters. Traditional spam detection systems, which rely on static rules or single-modality models, struggle to integrate heterogeneous signals or to continuously adapt, leading to rapid performance degradation. We propose EvoMail, a self-evolving cognitive agent framework for robust detection of spam and phishing. EvoMail first constructs a unified heterogeneous email graph that fuses textual content, metadata (headers, senders, domains), and embedded resources (URLs, attachments). A Cognitive Graph Neural Network enhanced by a Large Language Model (LLM) performs context-aware reasoning across these sources to identify coordinated spam campaigns. Most critically, EvoMail engages in an adversarial self-evolution loop: a ''red-team'' agent generates novel evasion tactics -- such as character obfuscation or AI-generated phishing text -- while the ''blue-team'' detector learns from failures, compresses experiences into a memory module, and reuses them for future reasoning. Extensive experiments on real-world datasets (Enron-Spam, Ling-Spam, SpamAssassin, and TREC) and synthetic adversarial variants demonstrate that EvoMail consistently outperforms state-of-the-art baselines in detection accuracy, adaptability to evolving spam tactics, and interpretability of reasoning traces. These results highlight EvoMail's potential as a resilient and explainable defense framework against next-generation spam and phishing threats.
Authors:Ibrahim Altan, Abdulla Bachir, Yousuf Parbhulkar, Abdul Muksith Rizvi, Moshiur Farazi
Abstract:
Phishing emails pose a persistent and increasingly sophisticated threat, undermining email security through deceptive tactics designed to exploit both semantic and structural vulnerabilities. Traditional detection methods, often based on isolated analysis of email content or embedded URLs, fail to comprehensively address these evolving attacks. In this paper, we propose a dual-path phishing detection framework that integrates transformer-based natural language processing (NLP) with classical machine learning to jointly analyze email text and embedded URLs. Our approach leverages the complementary strengths of semantic analysis using fine-tuned transformer architectures (e.g., DistilBERT) and structural link analysis via character-level TF-IDF vectorization paired with classical classifiers (e.g., Random Forest). Empirical evaluation on representative email and URL datasets demonstrates that this combined approach significantly improves detection accuracy. Specifically, the DistilBERT model achieves a near-optimal balance between accuracy and computational efficiency for textual phishing detection, while Random Forest notably outperforms other classical classifiers in identifying malicious URLs. The modular design allows flexibility for standalone deployment or ensemble integration, facilitating real-world adoption. Collectively, our results highlight the efficacy and practical value of this dual-path approach, establishing a scalable, accurate, and interpretable solution capable of enhancing email security against contemporary phishing threats.
Authors:Cheng Lyu, Mu Yuan, Dabin Zheng, Siwei Sun, Shun Li
Abstract:
The mapping $Ï_n$ from $\F_{2}^{n}$ to itself defined by $y=Ï_n(x)$ with $y_i=x_i+x_{i+2}(1+x_{i+1})$, where the indices are computed modulo $n$, has been widely studied for its applications in lightweight cryptography. However, $Ï_n $ is bijective on $\F_2^n$ only when $n$ is odd, restricting its use to odd-dimensional vector spaces over $\F_2$. To address this limitation, we introduce and analyze the generalized mapping $Ï_{n, m}$ defined by $y=Ï_{n,m}(x)$ with $y_i=x_i+x_{i+m} (x_{i+m-1}+1)(x_{i+m-2}+1) \cdots (x_{i+1}+1)$, where $m$ is a fixed integer with $m\nmid n$. To investigate such mappings, we further generalize $Ï_{n,m}$ to $θ_{m, k}$, where $θ_{m, k}$ is given by $y_i=x_{i+mk} \prod_{\substack{j=1,\,\, m \nmid j}}^{mk-1} \left(x_{i+j}+1\right), \,\,{\rm for }\,\, i\in \{0,1,\ldots,n-1\}$. We prove that these mappings generate an abelian group isomorphic to the group of units in $\F_2[z]/(z^{\lfloor n/m\rfloor +1})$. This structural insight enables us to construct a broad class of permutations over $\F_2^n$ for any positive integer $n$, along with their inverses. We rigorously analyze algebraic properties of these mappings, including their iterations, fixed points, and cycle structures. Additionally, we provide a comprehensive database of the cryptographic properties for iterates of $Ï_{n,m}$ for small values of $n$ and $m$. Finally, we conduct a comparative security and implementation cost analysis among $Ï_{n,m}$, $Ï_n$, $ÏÏ_n$ (EUROCRYPT 2025 \cite{belkheyar2025chi}) and their variants, and prove Conjecture~1 proposed in~\cite{belkheyar2025chi} as a by-product of our study. Our results lead to generalizations of $Ï_n$, providing alternatives to $Ï_n$ and $ÏÏ_n$.
Authors:Chao Zha, Haolin Pan, Bing Bai, Jiangxing Wu, Ruyun Zhang
Abstract:
In the Internet of Things (IoT) environment, continuous interaction among a large number of devices generates complex and dynamic network traffic, which poses significant challenges to rule-based detection approaches. Machine learning (ML)-based traffic detection technology, capable of identifying anomalous patterns and potential threats within this traffic, serves as a critical component in ensuring network security. This study first identifies a significant issue with widely adopted feature extraction tools (e.g., CICMeterFlow): the extensive use of time- and length-related features leads to high sparsity, which adversely affects model convergence. Furthermore, existing traffic detection methods generally lack an embedding mechanism capable of efficiently and comprehensively capturing the semantic characteristics of network traffic. To address these challenges, we propose a novel feature extraction tool that eliminates traditional time and length features in favor of context-aware semantic features related to the source host, thus improving the generalizability of the model. In addition, we design an embedding training framework that integrates the unsupervised DBSCAN clustering algorithm with a contrastive learning strategy to effectively capture fine-grained semantic representations of traffic. Extensive empirical evaluations are conducted on the real-world Mawi data set to validate the proposed method in terms of detection accuracy, robustness, and generalization. Comparative experiments against several state-of-the-art (SOTA) models demonstrate the superior performance of our approach. Furthermore, we confirm its applicability and deployability in real-time scenarios.
Authors:Oluwole Adewusi, Wallace S. Msagusa, Jean Pierre Imanirumva, Okemawo Obadofin, Jema D. Ndibwile
Abstract:
The rapid adoption of Mobile Money Services (MMS) in Sub-Saharan Africa (SSA) offers a viable path to improve e-Government service accessibility in the face of persistent low internet penetration. However, existing Mobile Money Authentication (MMA) methods face critical limitations, including susceptibility to SIM swapping, weak session protection, and poor scalability during peak demand. This study introduces a hybrid MMA framework that combines Unstructured Supplementary Service Data (USSD)-based multi-factor authentication with secure session management via cryptographically bound JSON Web Tokens (JWT). Unlike traditional MMA systems that rely solely on SIM-PIN verification or smartphone-dependent biometrics, our design implements a three-factor authentication model; SIM verification, PIN entry, and session token binding, tailored for resource-constrained environments. Simulations and comparative analysis against OAuth-based Single Sign-On (SSO) methods reveal a 45% faster authentication time (8 seconds vs. 12 to 15 seconds), 15% higher success under poor network conditions (95% vs. 80%), and increased resistance to phishing and brute-force attacks. Penetration testing and threat modeling further demonstrate a substantial reduction in vulnerability exposure compared to conventional approaches. The primary contributions of this work are: (1) a hybrid authentication protocol that ensures offline accessibility and secure session continuity; (2) a tailored security framework addressing threats like SIM swapping and social engineering in SSA; and (3) demonstrated scalability for thousands of users with reduced infrastructure overhead. The proposed approach advances secure digital inclusion in SSA and other regions with similar constraints.
Authors:Kay Fuhrmeister, Arne Pelzer, Fabian Radke, Julia Lechinger, Mahzad Gharleghi, Thomas Köllmer, Insa Wolf
Abstract:
Electroencephalography (EEG) is widely used for recording brain activity and has seen numerous applications in machine learning, such as detecting sleep stages and neurological disorders. Several studies have successfully shown the potential of EEG data for re-identification and leakage of other personal information. Therefore, the increasing availability of EEG consumer devices raises concerns about user privacy, motivating us to investigate how to safeguard this sensitive data while retaining its utility for EEG applications. To address this challenge, we propose a transformer-based autoencoder to create EEG data that does not allow for subject re-identification while still retaining its utility for specific machine learning tasks. We apply our approach to automatic sleep staging by evaluating the re-identification and utility potential of EEG data before and after anonymization. The results show that the re-identifiability of the EEG signal can be substantially reduced while preserving its utility for machine learning.
Authors:Noam Schmitt, Marc Antoine Lacoste
Abstract:
This paper investigates the trade-off between centralized and decentralized security management in constellations of satellites to balance security and performance. We highlight three key AI architectures for automated security management: (a) centralized, (b) distributed and (c) federated. The centralized architecture is the best option short term, providing fast training, despite the hard challenge of the communication latency overhead across space. Decentralized architectures are better alternatives in the longer term, providing enhanced scalability and security.
Authors:Dilli Hang Rai, Sabin Kafley
Abstract:
ECG biometrics offer a unique, secure authentication method, yet their deployment on wearable devices faces real-time processing, privacy, and spoofing vulnerability challenges. This paper proposes a lightweight deep learning model (MobileNetV1+GRU) for ECG-based authentication, injection of 20dB Gaussian noise & custom preprocessing. We simulate wearable conditions and edge deployment using the ECGID, MIT-BIH, CYBHi, and PTB datasets, achieving accuracies of 99.34%, 99.31%, 91.74%, and 98.49%, F1-scores of 0.9869, 0.9923, 0.9125, and 0.9771, Precision of 0.9866, 0.9924, 0.9180 and 0.9845, Recall of 0.9878, 0.9923, 0.9129, and 0.9756, equal error rates (EER) of 0.0009, 0.00013, 0.0091, and 0.0009, and ROC-AUC values of 0.9999, 0.9999, 0.9985, and 0.9998, while under FGSM adversarial attacks, accuracy drops from 96.82% to as low as 0.80%. This paper highlights federated learning, adversarial testing, and the need for diverse wearable physiological datasets to ensure secure and scalable biometrics.
Authors:Tanmay Khule, Stefan Marksteiner, Jose Alguindigue, Hannes Fuchs, Sebastian Fischmeister, Apurva Narayan
Abstract:
In modern automotive development, security testing is critical for safeguarding systems against increasingly advanced threats. Attack trees are widely used to systematically represent potential attack vectors, but generating comprehensive test cases from these trees remains a labor-intensive, error-prone task that has seen limited automation in the context of testing vehicular systems. This paper introduces STAF (Security Test Automation Framework), a novel approach to automating security test case generation. Leveraging Large Language Models (LLMs) and a four-step self-corrective Retrieval-Augmented Generation (RAG) framework, STAF automates the generation of executable security test cases from attack trees, providing an end-to-end solution that encompasses the entire attack surface. We particularly show the elements and processes needed to provide an LLM to actually produce sensible and executable automotive security test suites, along with the integration with an automated testing framework. We further compare our tailored approach with general purpose (vanilla) LLMs and the performance of different LLMs (namely GPT-4.1 and DeepSeek) using our approach. We also demonstrate the method of our operation step-by-step in a concrete case study. Our results show significant improvements in efficiency, accuracy, scalability, and easy integration in any workflow, marking a substantial advancement in automating automotive security testing methodologies. Using TARAs as an input for verfication tests, we create synergies by connecting two vital elements of a secure automotive development process.
Authors:Lubos Mjachky, Ivan Homoliak
Abstract:
Biometric-based authentication systems are getting broadly adopted in many areas. However, these systems do not allow participating users to influence the way their data is used. Furthermore, the data may leak and can be misused without the users' knowledge. In this paper, we propose a new authentication method that preserves the privacy of individuals and is based on a generative adversarial network (GAN). Concretely, we suggest using the GAN for translating images of faces to a visually private domain (e.g., flowers or shoes). Classifiers, which are used for authentication purposes, are then trained on the images from the visually private domain. Based on our experiments, the method is robust against attacks and still provides meaningful utility.
Authors:Raphael Simon, Pieter Libin, Wim Mees
Abstract:
Penetration testing, the simulation of cyberattacks to identify security vulnerabilities, presents a sequential decision-making problem well-suited for reinforcement learning (RL) automation. Like many applications of RL to real-world problems, partial observability presents a major challenge, as it invalidates the Markov property present in Markov Decision Processes (MDPs). Partially Observable MDPs require history aggregation or belief state estimation to learn successful policies. We investigate stochastic, partially observable penetration testing scenarios over host networks of varying size, aiming to better reflect real-world complexity through more challenging and representative benchmarks. This approach leads to the development of more robust and transferable policies, which are crucial for ensuring reliable performance across diverse and unpredictable real-world environments. Using vanilla Proximal Policy Optimization (PPO) as a baseline, we compare a selection of PPO variants designed to mitigate partial observability, including frame-stacking, augmenting observations with historical information, and employing recurrent or transformer-based architectures. We conduct a systematic empirical analysis of these algorithms across different host network sizes. We find that this task greatly benefits from history aggregation. Converging three times faster than other approaches. Manual inspection of the learned policies by the algorithms reveals clear distinctions and provides insights that go beyond quantitative results.
Authors:Antoine Plin, Frédéric Fauberteau, Nga Nguyen
Abstract:
Rowhammer attacks have emerged as a significant threat to modern DRAM-based memory systems, leveraging frequent memory accesses to induce bit flips in adjacent memory cells. This work-in-progress paper presents an adaptive, many-sided Rowhammer attack utilizing GPU compute shaders to systematically achieve high-frequency memory access patterns. Our approach employs statistical distributions to optimize row targeting and avoid current mitigations. The methodology involves initializing memory with known patterns, iteratively hammering victim rows, monitoring for induced errors, and dynamically adjusting parameters to maximize success rates. The proposed attack exploits the parallel processing capabilities of GPUs to accelerate hammering operations, thereby increasing the probability of successful bit flips within a constrained timeframe. By leveraging OpenGL compute shaders, our implementation achieves highly efficient row hammering with minimal software overhead. Experimental results on a Raspberry Pi 4 demonstrate that the GPU-based approach attains a high rate of bit flips compared to traditional CPU-based hammering, confirming its effectiveness in compromising DRAM integrity. Our findings align with existing research on microarchitectural attacks in heterogeneous systems that highlight the susceptibility of GPUs to security vulnerabilities. This study contributes to the understanding of GPU-assisted fault-injection attacks and underscores the need for improved mitigation strategies in future memory architectures.
Authors:Zhixiao Wu, Yao Lu, Jie Wen, Hao Sun, Qi Zhou, Guangming Lu
Abstract:
Poison-only Clean-label Backdoor Attacks aim to covertly inject attacker-desired behavior into DNNs by merely poisoning the dataset without changing the labels. To effectively implant a backdoor, multiple \textbf{triggers} are proposed for various attack requirements of Attack Success Rate (ASR) and stealthiness. Additionally, sample selection enhances clean-label backdoor attacks' ASR by meticulously selecting ``hard'' samples instead of random samples to poison. Current methods 1) usually handle the sample selection and triggers in isolation, leading to severely limited improvements on both ASR and stealthiness. Consequently, attacks exhibit unsatisfactory performance on evaluation metrics when converted to PCBAs via a mere stacking of methods. Therefore, we seek to explore the bidirectional collaborative relations between the sample selection and triggers to address the above dilemma. 2) Since the strong specificity within triggers, the simple combination of sample selection and triggers fails to substantially enhance both evaluation metrics, with generalization preserved among various attacks. Therefore, we seek to propose a set of components to significantly improve both stealthiness and ASR based on the commonalities of attacks. Specifically, Component A ascertains two critical selection factors, and then makes them an appropriate combination based on the trigger scale to select more reasonable ``hard'' samples for improving ASR. Component B is proposed to select samples with similarities to relevant trigger implanted samples to promote stealthiness. Component C reassigns trigger poisoning intensity on RGB colors through distinct sensitivity of the human visual system to RGB for higher ASR, with stealthiness ensured by sample selection, including Component B. Furthermore, all components can be strategically integrated into diverse PCBAs.
Authors:Dehinde Molade, Dave Ormrod, Mamello Thinyane, Nalin Arachchilage, Jill Slay
Abstract:
The threat of quantum malware is real and a growing security concern that will have catastrophic scientific and technological impacts, if not addressed early. If weaponised or exploited especially by the wrong hands, malware will undermine highly sophisticated critical systems supported by next-generation quantum architectures, for example, in defence, communications, energy, and space. This paper explores the fundamental nature and implications of quantum malware to enable the future development of appropriate mitigations and defences, thereby protecting critical infrastructure. By conducting a systematic literature review (SLR) that draws on knowledge frameworks such as ontologies and taxonomies to explore malware, this provides insights into how malicious behaviours can be translated into attacks on quantum technologies, thereby providing a lens to analyse the severity of malware against quantum technologies. This study employs the European Competency Framework for Quantum Technologies (CFQT) as a guide to map malware behaviour to several competency layers, creating a foundation in this emerging field.
Authors:Antoine Plin, Lorenzo Casalino, Thomas Rokicki, Ruben Salvador
Abstract:
Modern Systems-on-Chip (SoCs) employ undocumented linear address-scrambling functions to obfuscate DRAM addressing, which complicates DRAM-aware performance optimizations and hinders proactive security analysis of DRAM-based attacks; most notably, Rowhammer. Although previous work tackled the issue of reversing physical-to-DRAM mapping, existing heuristic-based reverse-engineering approaches are partial, costly, and impractical for comprehensive recovery. This paper establishes a rigorous theoretical foundation and provides efficient practical algorithms for black-box, complete physical-to-DRAM address-mapping recovery. We first formulate the reverse-engineering problem within a linear algebraic model over the finite field GF(2). We characterize the timing fingerprints of row-buffer conflicts, proving a relationship between a bank addressing matrix and an empirically constructed matrix of physical addresses. Based on this characterization, we develop an efficient, noise-robust, and fully platform-agnostic algorithm to recover the full bank-mask basis in polynomial time, a significant improvement over the exponential search from previous works. We further generalize our model to complex row mappings, introducing new hardware-based hypotheses that enable the automatic recovery of a row basis instead of previous human-guided contributions. Evaluations across embedded and server-class architectures confirm our method's effectiveness, successfully reconstructing known mappings and uncovering previously unknown scrambling functions. Our method provides a 99% recall and accuracy on all tested platforms. Most notably, Knock-Knock runs in under a few minutes, even on systems with more than 500GB of DRAM, showcasing the scalability of our method. Our approach provides an automated, principled pathway to accurate DRAM reverse engineering.
Authors:Xiaohui Yang, Ping Ping, Feng Xu
Abstract:
The widespread adoption of smartphone photography has led users to increasingly rely on cloud storage for personal photo archiving and sharing, raising critical privacy concerns. Existing deep learning-based image encryption schemes, typically built upon CNNs or GANs, often depend on traditional cryptographic algorithms and lack inherent architectural reversibility, resulting in limited recovery quality and poor robustness. Invertible neural networks (INNs) have emerged to address this issue by enabling reversible transformations, yet the first INN-based encryption scheme still relies on an auxiliary reference image and discards by-product information before decryption, leading to degraded recovery and limited practicality. To address these limitations, this paper proposes FlowCrypt, a novel flow-based image encryption framework that simultaneously achieves near-lossless recovery, high security, and lightweight model design. FlowCrypt begins by applying a key-conditioned random split to the input image, enhancing forward-process randomness and encryption strength. The resulting components are processed through a Flow-based Encryption/Decryption (FED) module composed of invertible blocks, which share parameters across encryption and decryption. Thanks to its reversible architecture and reference-free design, FlowCrypt ensures high-fidelity image recovery. Extensive experiments show that FlowCrypt achieves recovery quality with 100dB on three datasets, produces uniformly distributed cipher images, and maintains a compact architecture with only 1M parameters, making it suitable for mobile and edge-device applications.
Authors:Sumana Malkapuram, Sameera Gangavarapu, Kailashnath Reddy Kavalakuntla, Ananya Gangavarapu
Abstract:
The proliferation of autonomous software agents necessitates rigorous frameworks for establishing secure and verifiable agent-to-agent (A2A) interactions, particularly when such agents are instantiated as non-human identities(NHIs). We extend the A2A paradigm [1 , 2] by introducing a cryptographically grounded mechanism for lineage verification, wherein the provenance and evolution of NHIs are anchored in append-only Merkle tree structures modeled after Certificate Transparency (CT) logs. Unlike traditional A2A models that primarily secure point-to-point interactions, our approach enables both agents and external verifiers to cryptographically validate multi-hop provenance, thereby ensuring the integrity of the entire call chain. A federated proof server acts as an auditor across one or more Merkle logs, aggregating inclusion proofs and consistency checks into compact, signed attestations that external parties can verify without access to the full execution trace. In parallel, we augment the A2A agent card to incorporate explicit identity verification primitives, enabling both peer agents and human approvers to authenticate the legitimacy of NHI representations in a standardized manner. Together, these contributions establish a cohesive model that integrates identity attestation, lineage verification, and independent proof auditing, thereby advancing the security posture of inter-agent ecosystems and providing a foundation for robust governance of NHIs in regulated environments such as FedRAMP.
Authors:Christopher Simon Liu, Fan Wang, Patrick Gould, Carter Yagemann
Abstract:
Fault Injection is the study of observing how systems behave under unusual stress, environmental or otherwise. In practice, fault injection involves testing the limits of computer systems and finding novel ways to potentially break cyber-physical security. The contributions of this paper are three-fold. First, we provide a beginner-friendly introduction to this research topic and an in-depth taxonomy of fault injection techniques. Second, we highlight the current state-of-the-art and provide a cost-benefit analysis of each attack method. Third, for those interested in doing fault injection research, we provide a replication analysis of an existing vulnerability detection tool and identify a research focus for future work.
Authors:Alessio Izzillo, Riccardo Lazzeretti, Emilio Coppa
Abstract:
Modern embedded Linux devices, such as routers, IP cameras, and IoT gateways, rely on complex software stacks where numerous daemons interact to provide services. Testing these devices is crucial from a security perspective since vendors often use custom closed- or open-source software without documenting releases and patches. Recent coverage-guided fuzzing solutions primarily test individual processes, ignoring deep dependencies between daemons and their persistent internal state. This article presents STAFF, a firmware fuzzing framework for discovering bugs in Linux-based firmware built around three key ideas: (a) user-driven multi-request recording, which monitors user interactions with emulated firmware to capture request sequences involving application-layer protocols (e.g., HTTP); (b) intra- and inter-process dependency detection, which uses whole-system taint analysis to track how input bytes influence user-space states, including files, sockets, and memory areas; (c) protocol-aware taint-guided fuzzing, which applies mutations to request sequences based on identified dependencies, exploiting multi-staged forkservers to efficiently checkpoint protocol states. When evaluating STAFF on 15 Linux-based firmware targets, it identifies 42 bugs involving multiple network requests and different firmware daemons, significantly outperforming existing state-of-the-art fuzzing solutions in both the number and reproducibility of discovered bugs.
Authors:Shunnosuke Ikeda, Kazumasa Shinagawa
Abstract:
This paper introduces mathematical optimization as a new method for proving impossibility results in the field of card-based cryptography. While previous impossibility proofs were often limited to cases involving a small number of cards, this new approach establishes results that hold for a large number of cards. The research focuses on single-cut full-open (SCFO) protocols, which consist of performing one random cut and then revealing all cards. The main contribution is that for any three-variable Boolean function, no new SCFO protocols exist beyond those already known, under the condition that all additional cards have the same color. The significance of this work is that it provides a new framework for proving impossibility results and delivers a proof that is valid for any number of cards, as long as all additional cards have the same color.
Authors:Jianbin Ji, Dawen Xu, Li Dong, Lin Yang, Songhan He
Abstract:
With the wide spread of video, video watermarking has become increasingly crucial for copyright protection and content authentication. However, video watermarking still faces numerous challenges. For example, existing methods typically have shortcomings in terms of watermarking capacity and robustness, and there is a lack of specialized noise layer for High Efficiency Video Coding(HEVC) compression. To address these issues, this paper introduces a Deep Invertible Network for Video watermarking (DINVMark) and designs a noise layer to simulate HEVC compression. This approach not only in creases watermarking capacity but also enhances robustness. DINVMark employs an Invertible Neural Network (INN), where the encoder and decoder share the same network structure for both watermark embedding and extraction. This shared architecture ensures close coupling between the encoder and decoder, thereby improving the accuracy of the watermark extraction process. Experimental results demonstrate that the proposed scheme significantly enhances watermark robustness, preserves video quality, and substantially increases watermark embedding capacity.
Authors:Duoxun Tang, Xinhang Jiang, Jiajun Niu
Abstract:
Text embedding inversion attacks reconstruct original sentences from latent representations, posing severe privacy threats in collaborative inference and edge computing. We propose TextCrafter, an optimization-based adversarial perturbation mechanism that combines RL learned, geometry aware noise injection orthogonal to user embeddings with cluster priors and PII signal guidance to suppress inversion while preserving task utility. Unlike prior defenses either non learnable or agnostic to perturbation direction, TextCrafter provides a directional protective policy that balances privacy and utility. Under strong privacy setting, TextCrafter maintains 70 percentage classification accuracy on four datasets and consistently outperforms Gaussian/LDP baselines across lower privacy budgets, demonstrating a superior privacy utility trade off.
Authors:Huifang Yu, Jiaxing Jie, Lei Li
Abstract:
Electronic whistleblowing systems are widely used due to their efficiency and convenience. The key to designing such systems lies in protecting the identity privacy of whistleblowers, preventing malicious whistleblowing, and ensuring the confidentiality of whistleblowing information. To address these issues, a SM2 traceable ring signcryption scheme for electronic voting is proposed. This scheme combines the SM2 elliptic curve public key cryptography algorithm with the ring signature algorithm, enhancing the overall efficiency of the scheme while ensuring the autonomy and controllability of the core cryptographic algorithms. Security analysis demonstrates that the proposed scheme satisfies confidentiality, unforgeability, traceability, linkability, and deniability. Efficiency analysis shows that, compared to existing ring signature schemes, the proposed scheme exhibits significant efficiency advantages during the signature phase. The electronic whistleblowing system designed using the proposed scheme can track malicious whistleblowers while protecting user identity privacy, and ensures that the content of whistleblowing remains unknown to third parties.
Authors:Vyron Kampourakis, Christos Smiliotopoulos, Vasileios Gkioulos, Sokratis Katsikas
Abstract:
Traditional air gaps in industrial systems are disappearing as IT technologies permeate the OT domain, accelerating the integration of wireless solutions like Wi-Fi. Next-generation Wi-Fi standards (IEEE 802.11ax/be) meet performance demands for industrial use cases, yet their introduction raises significant security concerns. A critical knowledge gap exists regarding the empirical prevalence and security configuration of Wi-Fi in real-world industrial settings. This work addresses this by mining the global crowdsourced WiGLE database to provide a data-driven understanding. We create the first publicly available dataset of 1,087 high-confidence industrial Wi-Fi networks, examining key attributes such as SSID patterns, encryption methods, vendor types, and global distribution. Our findings reveal a growing adoption of Wi-Fi across industrial sectors but underscore alarming security deficiencies, including the continued use of weak or outdated security configurations that directly expose critical infrastructure. This research serves as a pivotal reference point, offering both a unique dataset and practical insights to guide future investigations into wireless security within industrial environments.
Authors:Anna Bertiger, Bobby Filar, Aryan Luthra, Stefano Meschiari, Aiden Mitchell, Sam Scholten, Vivek Sharath
Abstract:
LLMs are increasingly pervasive in the security environment, with limited measures of their effectiveness, which limits trust and usefulness to security practitioners. Here, we present an open-source evaluation framework and benchmark metrics for evaluating LLM-generated cybersecurity rules. The benchmark employs a holdout set-based methodology to measure the effectiveness of LLM-generated security rules in comparison to a human-generated corpus of rules. It provides three key metrics inspired by the way experts evaluate security rules, offering a realistic, multifaceted evaluation of the effectiveness of an LLM-based security rule generator. This methodology is illustrated using rules from Sublime Security's detection team and those written by Sublime Security's Automated Detection Engineer (ADE), with a thorough analysis of ADE's skills presented in the results section.
Authors:Javier Jiménez-Román, Florina Almenares-Mendoza, Alfonso Sánchez-Macián
Abstract:
Cybersecurity threats continue to increase, with a growing number of previously unknown attacks each year targeting both large corporations and smaller entities. This scenario demands the implementation of advanced security measures, not only to mitigate damage but also to anticipate emerging attack trends. In this context, deception tools have become a key strategy, enabling the detection, deterrence, and deception of potential attackers while facilitating the collection of information about their tactics and methods. Among these tools, honeypots have proven their value, although they have traditionally been limited by rigidity and configuration complexity, hindering their adaptability to dynamic scenarios. The rise of artificial intelligence, and particularly general-purpose Large Language Models (LLMs), is driving the development of new deception solutions capable of offering greater adaptability and ease of use. This work proposes the design and implementation of an LLM-based honeypot to simulate an LDAP server, a critical protocol present in most organizations due to its central role in identity and access management. The proposed solution aims to provide a flexible and realistic tool capable of convincingly interacting with attackers, thereby contributing to early detection and threat analysis while enhancing the defensive capabilities of infrastructures against intrusions targeting this service.
Authors:Ekin Böke, Simon Torka
Abstract:
Large Language Models (LLMs) have emerged as promising tools for malware detection by analyzing code semantics, identifying vulnerabilities, and adapting to evolving threats. However, their reliability under adversarial compiler-level obfuscation is yet to be discovered. In this study, we empirically evaluate the robustness of three state-of-the-art LLMs: ChatGPT-4o, Gemini Flash 2.5, and Claude Sonnet 4 against compiler-level obfuscation techniques implemented via the LLVM infrastructure. These include control flow flattening, bogus control flow injection, instruction substitution, and split basic blocks, which are widely used to evade detection while preserving malicious behavior. We perform a structured evaluation on 40~C functions (20 vulnerable, 20 secure) sourced from the Devign dataset and obfuscated using LLVM passes. Our results show that these models often fail to correctly classify obfuscated code, with precision, recall, and F1-score dropping significantly after transformation. This reveals a critical limitation: LLMs, despite their language understanding capabilities, can be easily misled by compiler-based obfuscation strategies. To promote reproducibility, we release all evaluation scripts, prompts, and obfuscated code samples in a public repository. We also discuss the implications of these findings for adversarial threat modeling, and outline future directions such as software watermarking, compiler-aware defenses, and obfuscation-resilient model design.
Authors:Mohsen Ahmadvand, Pedro Souto
Abstract:
Zero-knowledge rollups rely on provers to generate multi-step state transition proofs under strict finality and availability constraints. These steps require expensive hardware (e.g., GPUs), and finality is reached only once all stages complete and results are posted on-chain. As rollups scale, staying economically viable becomes increasingly difficult due to rising throughput, fast finality demands, volatile gas prices, and dynamic resource needs. We base our study on Halo2-based proving systems and identify transactions per second (TPS), average gas usage, and finality time as key cost drivers. To address this, we propose a parametric cost model that captures rollup-specific constraints and ensures provers can keep up with incoming transaction load. We formulate this model as a constraint system and solve it using the Z3 SMT solver to find cost-optimal configurations. To validate our approach, we implement a simulator that detects lag and estimates operational costs. Our method shows a potential cost reduction of up to 70\%.
Authors:Ashley Kurian, Aydin Aysu
Abstract:
Neural networks are valuable intellectual property due to the significant computational cost, expert labor, and proprietary data involved in their development. Consequently, protecting their parameters is critical not only for maintaining a competitive advantage but also for enhancing the model's security and privacy. Prior works have demonstrated the growing capability of cryptanalytic attacks to scale to deeper models. In this paper, we present the first defense mechanism against cryptanalytic parameter extraction attacks. Our key insight is to eliminate the neuron uniqueness necessary for these attacks to succeed. We achieve this by a novel, extraction-aware training method. Specifically, we augment the standard loss function with an additional regularization term that minimizes the distance between neuron weights within a layer. Therefore, the proposed defense has zero area-delay overhead during inference. We evaluate the effectiveness of our approach in mitigating extraction attacks while analyzing the model accuracy across different architectures and datasets. When re-trained with the same model architecture, the results show that our defense incurs a marginal accuracy change of less than 1% with the modified loss function. Moreover, we present a theoretical framework to quantify the success probability of the attack. When tested comprehensively with prior attack settings, our defense demonstrated empirical success for sustained periods of extraction, whereas unprotected networks are extracted between 14 minutes to 4 hours.
Authors:Tianrou Xia, Kaiming Huang, Dongyeon Yu, Yuseok Jeon, Jie Zhou, Dinghao Wu, Taegyu Kim
Abstract:
Rust is a memory-safe language, and its strong safety guarantees combined with high performance have been attracting widespread adoption in systems programming and security-critical applications. However, Rust permits the use of unsafe code, which bypasses compiler-enforced safety checks and can introduce memory vulnerabilities. A widely adopted approach for detecting memory safety bugs in Rust is Address Sanitizer (ASan). Optimized versions, such as ERASan and RustSan, have been proposed to selectively apply security checks in order to reduce performance overhead. However, these tools still incur significant performance and memory overhead and fail to detect many classes of memory safety vulnerabilities due to the inherent limitations of ASan. In this paper, we present LiteRSan, a novel memory safety sanitizer that addresses the limitations of prior approaches. By leveraging Rust's unique ownership model, LiteRSan performs Rust-specific static analysis that is aware of pointer lifetimes to identify risky pointers. It then selectively instruments risky pointers to enforce only the necessary spatial or temporal memory safety checks. Consequently, LiteRSan introduces significantly lower runtime overhead (18.84% versus 152.05% and 183.50%) and negligible memory overhead (0.81% versus 739.27% and 861.98%) compared with existing ASan-based sanitizers while being capable of detecting memory safety bugs that prior techniques miss.
Authors:Mohammad Hossein Asghari, Lianying Zhao
Abstract:
Android apps have become a valuable target for app modifiers and imitators due to its popularity and being trusted with highly sensitive data. Packers, on the other hand, protect apps from tampering with various anti-analysis techniques embedded in the app. Meanwhile, packers also conceal certain behavior potentially against the interest of the users, aside from being abused by malware for stealth. Security practitioners typically try to capture undesired behavior at runtime with hooking (e.g., Frida) or debugging techniques, which are heavily affected by packers. Unpackers have been the community's continuous effort to address this, but due to the emerging commercial packers, our study shows that none of the unpackers remain effective, and they are unfit for this purpose as unpacked apps can no longer run. We first perform a large-scale prevalence analysis of Android packers with a real-world dataset of 12,341 apps, the first of its kind, to find out what percentage of Android apps are actually packed and to what extent dynamic analysis is hindered. We then propose Purifire, an evasion engine to bypass packers' anti-analysis techniques and enable dynamic analysis on packed apps without unpacking them. Purifire is based on eBPF, a low-level kernel feature, which provides observability and invisibility to userspace apps to enforce defined evasion rules while staying low-profile. Our evaluation shows that Purifire is able to bypass packers' anti-analysis checks and more importantly, for previous research works suffering from packers, we observe a significant improvement (e.g., a much higher number of detected items such as device fingerprints).
Authors:Qian'ang Mao, Jiaxin Wang, Zhiqi Feng, Yi Zhang, Jiaqi Yan
Abstract:
Cryptocurrencies and Web3 applications based on blockchain technology have flourished in the blockchain research field. Unlike Bitcoin and Ethereum, due to its unique architectural designs in consensus mechanisms, resource management, and throughput, TRON has developed a more distinctive ecosystem and application scenarios centered around stablecoins. Although it is popular in areas like stablecoin payments and settlement, research on analyzing on-chain data from the TRON blockchain is remarkably scarce. To fill this gap, this paper proposes a comprehensive data extraction and exploration framework for the TRON blockchain. An innovative high-performance ETL system aims to efficiently extract raw on-chain data from TRON, including blocks, transactions, smart contracts, and receipts, establishing a research dataset. An in-depth analysis of the extracted dataset reveals insights into TRON's block generation, transaction trends, the dominance of exchanges, the resource delegation market, smart contract usage patterns, and the central role of the USDT stablecoin. The prominence of gambling applications and potential illicit activities related to USDT is emphasized. The paper discusses opportunities for future research leveraging this dataset, including analysis of delegate services, gambling scenarios, stablecoin activities, and illicit transaction detection. These contributions enhance blockchain data management capabilities and understanding of the rapidly evolving TRON ecosystem.
Authors:Isaiah J. King, Benjamin Bowman, H. Howie Huang
Abstract:
Deep reinforcement learning (RL) is emerging as a viable strategy for automated cyber defense (ACD). The traditional RL approach represents networks as a list of computers in various states of safety or threat. Unfortunately, these models are forced to overfit to specific network topologies, rendering them ineffective when faced with even small environmental perturbations. In this work, we frame ACD as a two-player context-based partially observable Markov decision problem with observations represented as attributed graphs. This approach allows our agents to reason through the lens of relational inductive bias. Agents learn how to reason about hosts interacting with other system entities in a more general manner, and their actions are understood as edits to the graph representing the environment. By introducing this bias, we will show that our agents can better reason about the states of networks and zero-shot adapt to new ones. We show that this approach outperforms the state-of-the-art by a wide margin, and makes our agents capable of defending never-before-seen networks against a wide range of adversaries in a variety of complex, and multi-agent environments.
Authors:Matteo Repetto, Enrico Cambiaso, Fabio Patrone, Sandro Zappatore
Abstract:
Today, many critical services and industrial systems rely on wireless networks for interaction with the IoT, hence becoming vulnerable to a broad number of cyber-threats. While detecting this kind of attacks is not difficult with common cyber-security tools, and even trivial for jamming, finding their origin and identifying culprits is almost impossible today, yet indispensable to stop them, especially when attacks are generated with portable or self-made devices that continuously move around. To address this open challenge, the FOLLOWME project investigates the feasibility of using UAV to locate and even chase attackers during illicit usage of the radio spectrum. The main objective is to develop a cyber-physical security framework that integrates network telemetry with wireless localization. The former triggers alarms in case of anomalies or known attack patterns and provides a coarse-grained indication of the physical area (i.e., the position of affected access gateways), whereas the latter systematically scans such area to identify the exact location of the attacker. The project will specifically address long-range metropolitan area networks and focus on the LoRaWAN protocol, which is the typical scenario for Smart City services.
Authors:Shijia Li, Jiang Ming, Lanqing Liu, Longwei Yang, Ni Zhang, Chunfu Jia
Abstract:
Detecting packed executables is a critical component of large-scale malware analysis and antivirus engine workflows, as it identifies samples that warrant computationally intensive dynamic unpacking to reveal concealed malicious behavior. Traditionally, packer detection techniques have relied on empirical features, such as high entropy or specific binary patterns. However, these empirical, feature-based methods are increasingly vulnerable to evasion by adversarial samples or unknown packers (e.g., low-entropy packers). Furthermore, the dependence on expert-crafted features poses challenges in sustaining and evolving these methods over time. In this paper, we examine the limitations of existing packer detection methods and propose Pack-ALM, a novel deep-learning-based approach for detecting packed executables. Inspired by the linguistic concept of distinguishing between real and pseudo words, we reformulate packer detection as a task of differentiating between legitimate and "pseudo" instructions. To achieve this, we preprocess native data and packed data into "pseudo" instructions and design a pre-trained assembly language model that recognizes features indicative of packed data. We evaluate Pack-ALM against leading industrial packer detection tools and state-of-the-art assembly language models. Extensive experiments on over 37,000 samples demonstrate that Pack-ALM effectively identifies packed binaries, including samples created with adversarial or previously unseen packing techniques. Moreover, Pack-ALM outperforms traditional entropy-based methods and advanced assembly language models in both detection accuracy and adversarial robustness.
Authors:Vaibhav Agrawal, Kiarash Ahi
Abstract:
This report examines the synergy between Large Language Models (LLMs) and Static Application Security Testing (SAST) to improve vulnerability discovery. Traditional SAST tools, while effective for proactive security, are limited by high false-positive rates and a lack of contextual understanding. Conversely, LLMs excel at code analysis and pattern recognition but can be prone to inconsistencies and hallucinations. By integrating these two technologies, a more intelligent and efficient system is created. This combination moves beyond mere vulnerability detection optimization, transforming security into a deeply integrated, contextual process that provides tangible benefits like improved triage, dynamic bug descriptions, bug validation via exploit generation and enhanced analysis of complex codebases. The result is a more effective security approach that leverages the strengths of both technologies while mitigating their weaknesses. SAST-Genius reduced false positives by about 91 % (225 to 20) compared to Semgrep alone.
Authors:Max Bazalii, Marius Fleischer
Abstract:
Fuzz testing is one of the most effective techniques for finding software vulnerabilities. While modern fuzzers can generate inputs and monitor executions automatically, the overall workflow, from analyzing a codebase, to configuring harnesses, to triaging results, still requires substantial manual effort. Prior attempts focused on single stages such as harness synthesis or input minimization, leaving researchers to manually connect the pieces into a complete fuzzing campaign. We introduce Orion, a framework that automates the the manual bottlenecks of fuzzing by integrating LLM reasoning with traditional tools, allowing campaigns to scale to settings where human effort alone was impractical. Orion uses LLMs for code reasoning and semantic guidance, while relying on deterministic tools for verification, iterative refinement, and tasks that require precision. Across our benchmark suite, Orion reduces human effort by 46-204x depending on the workflow stage, and we demonstrate its effectiveness through the discovery of two previously unknown vulnerabilities in the widely used open-source clib library.
Authors:Aarushi Mahajan, Wayne Burleson
Abstract:
Radio frequency fingerprint identification (RFFI) distinguishes wireless devices by the small variations in their analog circuits, avoiding heavy cryptographic authentication. While deep learning on spectrograms improves accuracy, models remain vulnerable to copying, tampering, and evasion. We present a stronger RFFI system combining watermarking for ownership proof and anomaly detection for spotting suspicious inputs. Using a ResNet-34 on log-Mel spectrograms, we embed three watermarks: a simple trigger, an adversarially trained trigger robust to noise and filtering, and a hidden gradient/weight signature. A convolutional Variational Autoencoders (VAE) with Kullback-Leibler (KL) warm-up and free-bits flags off-distribution queries. On the LoRa dataset, our system achieves 94.6% accuracy, 98% watermark success, and 0.94 AUROC, offering verifiable, tamper-resistant authentication.
Authors:Amirhosein Morteza, Remi A. Chou
Abstract:
We study the trade-off between communication rate and privacy for distributed batch matrix multiplication of two independent sequences of matrices $\mathbf{A}$ and $\mathbf{B}$ with uniformly distributed entries. In our setting, $\mathbf{B}$ is publicly accessible by all the servers while $\mathbf{A}$ must remain private. A user is interested in evaluating the product $\mathbf{AB}$ with the responses from the $k$ fastest servers. For a given parameter $α\in [0, 1]$, our privacy constraint must ensure that any set of $\ell$ colluding servers cannot learn more than a fraction $α$ of $\mathbf{A}$. Additionally, we study the trade-off between the amount of local randomness needed at the encoder and privacy. Finally, we establish the optimal trade-offs when the matrices are square and identify a linear relationship between information leakage and communication rate.
Authors:Minzhong Luo, Yudong Sun, Yin Long
Abstract:
Solving systems of Boolean equations is a fundamental task in symbolic computation and algebraic cryptanalysis, with wide-ranging applications in cryptography, coding theory, and formal verification. Among existing approaches, the Boolean Characteristic Set (BCS) method[1] has emerged as one of the most efficient algorithms for tackling such problems. However, its performance is highly sensitive to the ordering of variables, with solving times varying drastically under different orderings for fixed variable counts n and equations size m. To address this challenge, this paper introduces a novel optimization framework that synergistically integrates machine learning (ML)-based time prediction with simulated annealing (SA) to efficiently identify high-performance variables orderings. Weconstruct a dataset comprising variable frequency spectrum X and corresponding BCS solving time t for benchmark systems(e.g., n = m = 28). Utilizing this data, we train an accurate ML predictor ft(X) to estimate solving time for any given variables ordering. For each target system, ft serves as the cost function within an SA algorithm, enabling rapid discovery of low-latency orderings that significantly expedite subsequent BCS execution. Extensive experiments demonstrate that our method substantially outperforms the standard BCS algorithm[1], Gröbner basis method [2] and SAT solver[3], particularly for larger-scale systems(e.g., n = 32). Furthermore, we derive probabilistic time complexity bounds for the overall algorithm using stochastic process theory, establishing a quantitative relationship between predictor accuracy and expected solving complexity. This work provides both a practical acceleration tool for algebraic cryptanalysis and a theoretical foundation for ML-enhanced combinatorial optimization in symbolic computation.
Authors:Johnny So, Michael Ferdman, Nick Nikiforakis
Abstract:
The web continues to grow, but dependency-monitoring tools and standards for resource integrity lag behind. Currently, there exists no robust method to verify the integrity of web resources, much less in a generalizable yet performant manner, and supply chains remain one of the most targeted parts of the attack surface of web applications. In this paper, we present the design of LiMS, a transparent system to bootstrap link integrity guarantees in web browsing sessions with minimal overhead. At its core, LiMS uses a set of customizable integrity policies to declare the (un)expected properties of resources, verifies these policies, and enforces them for website visitors. We discuss how basic integrity policies can serve as building blocks for a comprehensive set of integrity policies, while providing guarantees that would be sufficient to defend against recent supply chain attacks detailed by security industry reports. Finally, we evaluate our open-sourced prototype by simulating deployments on a representative sample of 450 domains that are diverse in ranking and category. We find that our proposal offers the ability to bootstrap marked security improvements with an overall overhead of hundreds of milliseconds on initial page loads, and negligible overhead on reloads, regardless of network speeds. In addition, from examining archived data for the sample sites, we find that several of the proposed policy building blocks suit their dependency usage patterns, and would incur minimal administrative overhead.
Authors:Wadduwage Shanika Perera, Haodi Jiang
Abstract:
Malware is becoming increasingly complex and widespread, making it essential to develop more effective and timely detection methods. Traditional static analysis often fails to defend against modern threats that employ code obfuscation, polymorphism, and other evasion techniques. In contrast, behavioral malware detection, which monitors runtime activities, provides a more reliable and context-aware solution. In this work, we propose BEACON, a novel deep learning framework that leverages large language models (LLMs) to generate dense, contextual embeddings from raw sandbox-generated behavior reports. These embeddings capture semantic and structural patterns of each sample and are processed by a one-dimensional convolutional neural network (1D CNN) for multi-class malware classification. Evaluated on the Avast-CTU Public CAPE Dataset, our framework consistently outperforms existing methods, highlighting the effectiveness of LLM-based behavioral embeddings and the overall design of BEACON for robust malware classification.
Authors:Nobin Sarwar, Shubhashis Roy Dipta
Abstract:
Privacy-preserving adaptation of Large Language Models (LLMs) in sensitive domains (e.g., mental health) requires balancing strict confidentiality with model utility and safety. We propose FedMentor, a federated fine-tuning framework that integrates Low-Rank Adaptation (LoRA) and domain-aware Differential Privacy (DP) to meet per-domain privacy budgets while maintaining performance. Each client (domain) applies a custom DP noise scale proportional to its data sensitivity, and the server adaptively reduces noise when utility falls below a threshold. In experiments on three mental health datasets, we show that FedMentor improves safety over standard Federated Learning (FL) without privacy, raising safe output rates by up to three points and lowering toxicity, while maintaining utility (BERTScore F1 and ROUGE-L) within 0.5% of the non-private baseline and close to the centralized upper bound. The framework scales to backbones with up to 1.7B parameters on single-GPU clients, requiring < 173 MB of communication per-round. FedMentor demonstrates a practical approach to privately fine-tune LLMs for safer deployments in healthcare and other sensitive fields.
Authors:Gustavo Sandoval, Denys Fenchenko, Junyao Chen
Abstract:
This paper documents early research conducted in 2022 on defending against prompt injection attacks in large language models, providing historical context for the evolution of this critical security domain. This research focuses on two adversarial attacks against Large Language Models (LLMs): prompt injection and goal hijacking. We examine how to construct these attacks, test them on various LLMs, and compare their effectiveness. We propose and evaluate a novel defense technique called Adversarial Fine-Tuning. Our results show that, without this defense, the attacks succeeded 31\% of the time on GPT-3 series models. When using our Adversarial Fine-Tuning approach, attack success rates were reduced to near zero for smaller GPT-3 variants (Ada, Babbage, Curie), though we note that subsequent research has revealed limitations of fine-tuning-based defenses. We also find that more flexible models exhibit greater vulnerability to these attacks. Consequently, large models such as GPT-3 Davinci are more vulnerable than smaller models like GPT-2. While the specific models tested are now superseded, the core methodology and empirical findings contributed to the foundation of modern prompt injection defense research, including instruction hierarchy systems and constitutional AI approaches.
Authors:VÃctor Mayoral-Vilches, Andreas Makris, Kevin Finisterre
Abstract:
We present a systematic security assessment of the Unitree G1 humanoid showing it operates simultaneously as a covert surveillance node and can be purposed as an active cyber operations platform. Initial access can be achieved by exploiting the BLE provisioning protocol which contains a critical command injection vulnerability allowing root access via malformed Wi-Fi credentials, exploitable using hardcoded AES keys shared across all units. Partial reverse engineering of Unitree's proprietary FMX encryption reveal a static Blowfish-ECB layer and a predictable LCG mask-enabled inspection of the system's otherwise sophisticated security architecture, the most mature we have observed in commercial robotics. Two empirical case studies expose the critical risk of this humanoid robot: (a) the robot functions as a trojan horse, continuously exfiltrating multi-modal sensor and service-state telemetry to 43.175.228.18:17883 and 43.175.229.18:17883 every 300 seconds without operator notice, creating violations of GDPR Articles 6 and 13; (b) a resident Cybersecurity AI (CAI) agent can pivot from reconnaissance to offensive preparation against any target, such as the manufacturer's cloud control plane, demonstrating escalation from passive monitoring to active counter-operations. These findings argue for adaptive CAI-powered defenses as humanoids move into critical infrastructure, contributing the empirical evidence needed to shape future security standards for physical-cyber convergence systems.
Authors:Priyanka Nanayakkara, Elena Ghazi, Salil Vadhan
Abstract:
Differential privacy (DP) -- a principled approach to producing statistical data products with strong, mathematically provable privacy guarantees for the individuals in the underlying dataset -- has seen substantial adoption in practice over the past decade. Applying DP requires making several implementation decisions, each with significant impacts on data privacy and/or utility. Hence, to promote shared learning and accountability around DP deployments, Dwork, Kohli, and Mulligan (2019) proposed a public-facing repository ("registry") of DP deployments. The DP community has recently started to work toward realizing this vision. We contribute to this effort by (1) developing a holistic, hierarchical schema to describe any given DP deployment and (2) designing and implementing an interactive interface to act as a registry where practitioners can access information about past DP deployments. We (3) populate our interface with 21 real-world DP deployments and (4) conduct an exploratory user study with DP practitioners ($n=16$) to understand how they would use the registry, as well as what challenges and opportunities they foresee around its adoption. We find that participants were enthusiastic about the registry as a valuable resource for evaluating prior deployments and making future deployments. They also identified several opportunities for the registry, including that it can become a "hub" for the community and support broader communication around DP (e.g., to legal teams). At the same time, they identified challenges around the registry gaining adoption, including the effort and risk involved with making implementation choices public and moderating the quality of entries. Based on our findings, we offer recommendations for encouraging adoption and increasing the registry's value not only to DP practitioners, but also to policymakers, data users, and data subjects.
Authors:Dumitru-Bogdan Prelipcean, CÄtÄlin Dima
Abstract:
Threat detection systems rely on rule-based logic to identify adversarial behaviors, yet the conformance of these rules to high-level threat models is rarely verified formally. We present a formal verification framework that models both detection logic and attack trees as labeled transition systems (LTSs), enabling automated conformance checking via bisimulation and weak trace inclusion. Detection rules specified in the Generic Threat Detection Language (GTDL, a general-purpose detection language we formalize in this work) are assigned a compositional operational semantics, and threat models expressed as attack trees are interpreted as LTSs through a structural trace semantics. Both representations are translated to LNT, a modeling language supported by the CADP toolbox. This common semantic domain enables systematic and automated verification of detection coverage. We evaluate our approach on real-world malware scenarios such as LokiBot and Emotet and provide scalability analysis through parametric synthetic models. Results confirm that our methodology identifies semantic mismatches between threat models and detection rules, supports iterative refinement, and scales to realistic threat landscapes.
Authors:Dipak K. Rabari, Yogesh K. Meghrajani, Laxmi S. Desai
Abstract:
Image security for information has become increasingly critical as internet become more prevalent due to hacking and unauthorized access. To ensure the security of confidential image data, image encryption using visual cryptography plays a crucial role. To share multiple images using visual cryptography, the company organizer utilizes the concept of a universal or common share. Likewise, quantum computing is an emerging technology that facilitates secure communication. The ability of quantum computers to solve certain mathematical problems efficiently threatens the security of many current encryption algorithms. Hence, to leverage the strengths of quantum computing and visual cryptography, this research introduces a novel universal share-based quantum multi-secret sharing technique for secure image communication. Quantum computing enables the scheme to exhibit high resilience to different eavesdropping threats. Consequently, the proposed method offers robust security solution for sharing confidential images across a range of applications, including enterprise data access and military communications.
Authors:Gustavo Banegas, Ricardo Villanueva-Polanco
Abstract:
SNOVA is a post-quantum cryptographic signature scheme known for its efficiency and compact key sizes, making it a second-round candidate in the NIST post-quantum cryptography standardization process. This paper presents a comprehensive fault analysis of SNOVA, focusing on both permanent and transient faults during signature generation. We introduce several fault injection strategies that exploit SNOVA's structure to recover partial or complete secret keys with limited faulty signatures. Our analysis reveals that as few as 22 to 68 faulty signatures, depending on the security level, can suffice for key recovery. We propose a novel fault-assisted reconciliation attack, demonstrating its effectiveness in extracting the secret key space via solving a quadratic polynomial system. Simulations show transient faults in key signature generation steps can significantly compromise SNOVA's security. To address these vulnerabilities, we propose a lightweight countermeasure to reduce the success of fault attacks without adding significant overhead. Our results highlight the importance of fault-resistant mechanisms in post-quantum cryptographic schemes like SNOVA to ensure robustness.
Authors:Gustavo Banegas, Andreas Hellenbrand, Matheus Saldanha
Abstract:
Isogeny-based cryptography has emerged as a promising postquantum alternative, with CSIDH and its constant-time variants CTIDH and dCTIDH offering efficient group-action protocols. However, CTIDH and dCTIDH rely on dummy operations in differential addition chains (DACs) and Matryoshka, which can be exploitable by fault-injection attacks. In this work, we present the first dummy-free implementation of dCTIDH. Our approach combines two recent ideas: DACsHUND, which enforces equal-length DACs within each batch without padding, and a reformulated Matryoshka structure that removes dummy multiplications and validates all intermediate points. Our analysis shows that small primes such as 3, 5, and 7 severely restrict feasible DACsHUND configurations, motivating new parameter sets that exclude them. We implement dummy-free dCTIDH-2048-194 and dCTIDH-2048-205, achieving group action costs of roughly 357,000-362,000 Fp-multiplications, with median evaluation times of 1.59-1.60 (Gcyc). These results do not surpass dC-TIDH, but they outperform CTIDH by roughly 5% while eliminating dummy operations entirely. Compared to dCSIDH, our construction is more than 4x faster. To the best of our knowledge, this is the first efficient implementation of a CSIDH-like protocol that is simultaneously deterministic, constant-time, and fully dummy-free.
Authors:Kathrin Hövelmanns, Daan Planken, Christian Schaffner, Sebastian R. Verschoor
Abstract:
Authenticated Key Exchange (AKE) establishes shared ('symmetric') cryptographic keys which are essential for secure online communication. AKE protocols can be constructed from public-key cryptography like Key Encapsulation Mechanisms (KEMs). Another approach is to use Quantum Key Distribution (QKD) to establish a symmetric key, which uses quantum communication. Combining post-quantum AKE and QKD appropriately may provide security against quantum attacks even if only one of the two approaches turns out to be secure. We provide an extensive review of existing security analyses for combined AKE and their formal security models, and identify some gaps in their treatment of QKD key IDs. In particular, improper handling of QKD key IDs leads to Dependent-Key attacks on AKE. As our main conceptual contribution, we model QKD as an oracle that closely resembles the standard ETSI 014 QKD interface. We demonstrate the usability of our QKD oracle for cryptographic security analyses by integrating it into a prominent security model for AKE, called CK+ model, thereby obtaining a security model for combined AKE that catches Dependent-Key attacks. In this model, we formally prove security of a new protocol that combines QKD with a triple-KEM handshake. This is the first provably secure hybrid protocol that maintains information-theoretic security of QKD.
Authors:Andrei Damian, Petrica Butusina, Alessandro De Franceschi, Vitalii Toderian, Marius Grigoras, Cristian Bleotiu
Abstract:
We propose the Ratio1 AI meta-operating system (meta-OS), a decentralized MLOps protocol that unifies AI model development, deployment, and inference across heterogeneous edge devices. Its key innovation is an integrated blockchain-based framework that transforms idle computing resources (laptops, smartphones, cloud VMs) into a trustless global supercomputer. The architecture includes novel components: a decentralized authentication layer (dAuth), an in-memory state database (CSTORE), a distributed storage system (R1FS), homomorphic encrypted federated learning (EDIL), decentralized container orchestration (Deeploy) and an oracle network (OracleSync), which collectively ensure secure, resilient execution of AI pipelines and other container based apps at scale. The protocol enforces a formal circular token-economic model combining Proof-of-Availability (PoA) and Proof-of-AI (PoAI) consensus. Compared to centralized heterogeneous cloud MLOps and existing decentralized compute platforms, which often lack integrated AI toolchains or trusted Ratio1 node operators (R1OP) mechanics, Ratio1's holistic design lowers barriers for AI deployment and improves cost-efficiency. We provide mathematical formulations of its secure licensing and reward protocols, and include descriptive information for the system architecture and protocol flow. We argue that our proposed fully functional ecosystem proposes and demonstrates significant improvements in accessibility, scalability, and security over existing alternatives.
Authors:Soumia Zohra El Mestari, Maciej Krzysztof Zuziak, Gabriele Lenzini
Abstract:
Federated Learning (FL) enables collaborative model training across decentralised clients while keeping local data private, making it a widely adopted privacy-enhancing technology (PET). Despite its privacy benefits, FL remains vulnerable to privacy attacks, including those targeting specific clients. In this paper, we study an underexplored threat where a dishonest orchestrator intentionally manipulates the aggregation process to induce targeted overfitting in the local models of specific clients. Whereas many studies in this area predominantly focus on reducing the amount of information leakage during training, we focus on enabling an early client-side detection of targeted overfitting, thereby allowing clients to disengage before significant harm occurs. In line with this, we propose three detection techniques - (a) label flipping, (b) backdoor trigger injection, and (c) model fingerprinting - that enable clients to verify the integrity of the global aggregation. We evaluated our methods on multiple datasets under different attack scenarios. Our results show that the three methods reliably detect targeted overfitting induced by the orchestrator, but they differ in terms of computational complexity, detection latency, and false-positive rates.
Authors:Mohammad Olid Ali Akash, Priyangana Saha
Abstract:
In the modern, fast-moving world of e-commerce, many Android apps face challenges in providing a simple and secure shopping experience. Many of these apps, often enough, have complicated designs that prevent users from finding what they want quickly, thus frustrating them and wasting their precious time. Another major issue is that of security; with the limitation of payment options and weak authentication mechanisms, users' sensitive information can be compromised. This research presents a new e-commerce platform that responds to the above challenges with an intuitive interface and strong security measures. The platform makes online shopping easy with well-organized categories of products and a fast, efficient checkout process. It also gives priority to security by incorporating features such as Google authentication and SSL-secured payment gateways to protect user data and ensure secure transactions. This paper discusses how a focus on user-friendliness, security, and personalization steps up the game for e-commerce platforms, providing workable frameworks that match modern user needs and expectations. The findings show the e-commerce user experience can be remodelled by the platform, hence opening ways for future developments in that respect.
Authors:Sabin Huda, Ernest Foo, Zahra Jadidi, MA Hakim Newton, Abdul Sattar
Abstract:
Anti-money laundering (AML) research is constrained by the lack of publicly shareable, regulation-aligned transaction datasets. We present AMLNet, a knowledge-based multi-agent framework with two coordinated units: a regulation-aware transaction generator and an ensemble detection pipeline. The generator produces 1,090,173 synthetic transactions (approximately 0.16\% laundering-positive) spanning core laundering phases (placement, layering, integration) and advanced typologies (e.g., structuring, adaptive threshold behavior). Regulatory alignment reaches 75\% based on AUSTRAC rule coverage (Section 4.2), while a composite technical fidelity score of 0.75 summarizes temporal, structural, and behavioral realism components (Section 4.4). The detection ensemble achieves F1 0.90 (precision 0.84, recall 0.97) on the internal test partitions of AMLNet and adapts to the external SynthAML dataset, indicating architectural generalizability across different synthetic generation paradigms. We provide multi-dimensional evaluation (regulatory, temporal, network, behavioral) and release the dataset (Version 1.0, https://doi.org/10.5281/zenodo.16736515), to advance reproducible and regulation-conscious AML experimentation.
Authors:Tarakaram Gollamudi, Anitha Gollamudi, Joshua Gancher
Abstract:
RLWE-based Fully Homomorphic Encryption (FHE) schemes add some small \emph{noise} to the message during encryption. The noise accumulates with each homomorphic operation. When the noise exceeds a critical value, the FHE circuit produces an incorrect output. This makes developing FHE applications quite subtle, as one must closely track the noise to ensure correctness. However, existing libraries and compilers offer limited support to statically track the noise. Additionally, FHE circuits are also plagued by wraparound errors that are common in finite modulus arithmetic. These two limitations of existing compilers and libraries make FHE applications too difficult to develop with confidence. In this work, we present a \emph{correctness-oriented} IR, Intermediate Language for Arithmetic circuits, for type-checking circuits intended for homomorphic evaluation. Our IR is backed by a type system that tracks low-level quantitative bounds (e.g., ciphertext noise) without using the secret key. Using our type system, we identify and prove a strong \emph{functional correctness} criterion for \ila circuits. Additionally, we have designed \ila to be maximally general: our core type system does not directly assume a particular FHE scheme, but instead axiomatizes a \emph{model} of FHE. We instantiate this model with the exact FHE schemes (BGV, BFV and TFHE), and obtain functional correctness for free.
Authors:Aleksejus MihalkoviÄ, Lina Dindiene, Eligijus Sakalauskas
Abstract:
In this paper, we demonstrate a way to generalize learning with errors (LWE) to the family of so-called modular-maximal cyclic groups which are non-commuting. Since the group M2t has two cycles of maximal multiplicative order, we use this fact to construct an accurate criterion for restoring the message bit with overwhelming probability. Furthermore, we implement the original idea by O. Regev in the considered group to gain benefits from the non-commutativity of M2t . Also we prove that using this approach we can achieve a level of security comparable to the original idea.
Authors:Qianxue Wang, Simin Yu
Abstract:
Plaintext non-delayed chaotic cipher (PNDCC) means that in the diffusion equation, plaintext has no delay terms while ciphertext has a feedback term. In existing literature, chaotic cipher diffusions invariably take this form. Since its introduction, PNDCC has attracted attention but also doubts. Designers of chaotic ciphers usually claim PNDCC security by statistical tests, while rigorous cryptographic proofs are absent. Thus, it is necessary to re-examine its design rationale and empirical security. To address this issue, we present a typical example of a three-stage permutation-diffusion-permutation PNDCC, which contains multiple security vulnerabilities. Although all of its statistical indicators show good performance, we are able to break it using four different attacks. The first is a differential attack based on homogeneous operations; the second is an S-PTC attack; the third is a novel impulse-step-based differential attack (ISBDA), proposed in this paper, and the fourth is a novel chain attack, also introduced here. These results demonstrate that the fulfilment of statistical criteria is not a sufficient condition for the security of PNDCC. Then, based on a mathematical model of multi-stage PNDCC, we show that the proposed chain attack can successfully break a class of multi-stage PNDCCs. The key technique of the chain attack depends on how to reveal all permutations. To address this key problem, we summarize the chaining rules and show that, from the attacker's perspective, if the same decryption chain can be reconstructed then all permutations can be deciphered. To that end, the entire diffusion process can be broken by solving a system of simultaneous equations. Finally, as a secure improvement, we propose a new scheme termed plaintext-delayed chaotic cipher (PDCC) that can resist various cryptanalytic attacks.
Authors:Ali Habibzadeh, Farid Feyzi, Reza Ebrahimi Atani
Abstract:
Large Language Models (LLMs) have emerged as powerful tools capable of understanding and generating human-like text, offering transformative potential across diverse domains. The Security Operations Center (SOC), responsible for safeguarding digital infrastructure, represents one of these domains. SOCs serve as the frontline of defense in cybersecurity, tasked with continuous monitoring, detection, and response to incidents. However, SOCs face persistent challenges such as high alert volumes, limited resources, high demand for experts with advanced knowledge, delayed response times, and difficulties in leveraging threat intelligence effectively. In this context, LLMs can offer promising solutions by automating log analysis, streamlining triage, improving detection accuracy, and providing the required knowledge in less time. This survey systematically explores the integration of generative AI and more specifically LLMs into SOC workflow, providing a structured perspective on its capabilities, challenges, and future directions. We believe that this survey offers researchers and SOC managers a broad overview of the current state of LLM integration within academic study. To the best of our knowledge, this is the first comprehensive study to examine LLM applications in SOCs in details.
Authors:Mohammed Yacoubi, Omar Moussaoui, C. Drocourt
Abstract:
Explainable Artificial Intelligence (XAI) enhances the transparency and interpretability of AI models, addressing their inherent opacity. In cybersecurity, particularly within the Internet of Medical Things (IoMT), the black-box nature of AI-driven threat detection poses a significant challenge. Cybersecurity professionals must not only detect attacks but also understand the reasoning behind AI decisions to ensure trust and accountability. The rapid increase in cyberattacks targeting connected medical devices threatens patient safety and data privacy, necessitating advanced AI-driven solutions. This study compares two ensemble learning techniques, bagging and boosting, for cyber-attack classification in IoMT environments. We selected Random Forest for bagging and CatBoost for boosting. Random Forest helps reduce variance, while CatBoost improves bias by combining weak classifiers into a strong ensemble model, making them effective for detecting sophisticated attacks. However, their complexity often reduces transparency, making it difficult for cybersecurity professionals to interpret and trust their decisions. To address this issue, we apply XAI models to generate local and global explanations, providing insights into AI decision-making. Using techniques like SHAP (Shapley Additive Explanations) and LIME (Local Interpretable Model-agnostic Explanations), we highlight feature importance to help stakeholders understand the key factors driving cyber threat detection.
Authors:Landon Bragg, Nathan Dorsey, Josh Prior, John Ajit, Ben Kim, Nate Willis, Pablo Rivas
Abstract:
Distributed Denial-of-Service (DDoS) attacks remain a serious threat to online infrastructure, often bypassing detection by altering traffic in subtle ways. We present a method using hive-plot sequences of network data and a 3D convolutional neural network (3D CNN) to classify DDoS traffic with high accuracy. Our system relies on three main ideas: (1) using spatio-temporal hive-plot encodings to set a pattern-recognition baseline, (2) applying adversarial training with FGSM and PGD alongside spatial noise and image shifts, and (3) analyzing frame-wise predictions to find early signals. On a benchmark dataset, our method lifts adversarial accuracy from 50-55% to over 93% while maintaining clean-sample performance. Frames 3-4 offer strong predictive signals, showing early-stage classification is possible.
Authors:Federica Uccello, Simin Nadjm-Tehrani
Abstract:
With the rise of fifth-generation (5G) networks in critical applications, it is urgent to move from detection of malicious activity to systems capable of providing a reliable verdict suitable for mitigation. In this regard, understanding and interpreting machine learning (ML) models' security alerts is crucial for enabling actionable incident response orchestration. Explainable Artificial Intelligence (XAI) techniques are expected to enhance trust by providing insights into why alerts are raised. A dominant approach statistically associates feature sets that can be correlated to a given alert. This paper starts by questioning whether such attribution is relevant for future generation communication systems, and investigates its merits in comparison with an approach based on logical explanations. We extensively study two methods, SHAP and VoTE-XAI, by analyzing their interpretations of alerts generated by an XGBoost model in three different use cases with several 5G communication attacks. We identify three metrics for assessing explanations: sparsity, how concise they are; stability, how consistent they are across samples from the same attack type; and efficiency, how fast an explanation is generated. As an example, in a 5G network with 92 features, 6 were deemed important by VoTE-XAI for a Denial of Service (DoS) variant, ICMPFlood, while SHAP identified over 20. More importantly, we found a significant divergence between features selected by SHAP and VoTE-XAI. However, none of the top-ranked features selected by SHAP were missed by VoTE-XAI. When it comes to efficiency of providing interpretations, we found that VoTE-XAI is significantly more responsive, e.g. it provides a single explanation in under 0.002 seconds, in a high-dimensional setting (478 features).
Authors:Matthew J. Schneider, James Bailie, Dawn Iacobucci
Abstract:
Companies are looking to data anonymization research $\unicode{x2013}$ including differential private and synthetic data methods $\unicode{x2013}$ for simple and straightforward compliance solutions. But data anonymization has not taken off in practice because it is anything but simple to implement. For one, it requires making complex choices which are case dependent, such as the domain of the dataset to anonymize; the units to protect; the scope where the data protection should extend to; and the standard of protection. Each variation of these choices changes the very meaning, as well as the practical implications, of differential privacy (or of any other measure of data anonymization). Yet differential privacy is frequently being branded as the same privacy guarantee regardless of variations in these choices. Some data anonymization methods can be effective, but only when the insights required are much larger than the unit of protection. Given that businesses care about profitability, any solution must preserve the patterns between a firm's data and that profitability. As a result, data anonymization solutions usually need to be bespoke and case-specific, which reduces their scalability. Companies should not expect easy wins, but rather recognize that anonymization is just one approach to data privacy with its own particular advantages and drawbacks, while the best strategies jointly leverage the full range of approaches to data privacy and security in combination.
Authors:Pouneh Nikkhah Bahrami, Dylan Cutler, Igor Bilogrevic
Abstract:
Browser fingerprinting enables persistent cross-site user tracking via subtle techniques that often evade conventional defenses or cause website breakage when script-level blocking countermeasures are applied. Addressing these challenges requires detection methods offering both function-level precision to minimize breakage and inherent robustness against code obfuscation and URL manipulation. We introduce ByteDefender, the first system leveraging V8 engine bytecode to detect fingerprinting operations specifically at the JavaScript function level. A Transformer-based classifier, trained offline on bytecode sequences, accurately identifies functions exhibiting fingerprinting behavior. We develop and evaluate light-weight signatures derived from this model to enable low-overhead, on-device matching against function bytecode during compilation but prior to execution, which only adds a 4% (average) latency to the page load time. This mechanism facilitates targeted, real-time prevention of fingerprinting function execution, thereby preserving legitimate script functionality. Operating directly on bytecode ensures inherent resilience against common code obfuscation and URL-based evasion. Our evaluation on the top 100k websites demonstrates high detection accuracy at both function- and script-level, with substantial improvements over state-of-the-art AST-based methods, particularly in robustness against obfuscation. ByteDefender offers a practical framework for effective, precise, and robust fingerprinting mitigation.
Authors:Taniya Gidatkar, Oluwaseun Ajao, Matthew Shardlow
Abstract:
This study evaluates the resilience of large language models (LLMs) against adversarial attacks, specifically focusing on Flan-T5, BERT, and RoBERTa-Base. Using systematically designed adversarial tests through TextFooler and BERTAttack, we found significant variations in model robustness. RoBERTa-Base and FlanT5 demonstrated remarkable resilience, maintaining accuracy even when subjected to sophisticated attacks, with attack success rates of 0%. In contrast. BERT-Base showed considerable vulnerability, with TextFooler achieving a 93.75% success rate in reducing model accuracy from 48% to just 3%. Our research reveals that while certain LLMs have developed effective defensive mechanisms, these safeguards often require substantial computational resources. This study contributes to the understanding of LLM security by identifying existing strengths and weaknesses in current safeguarding approaches and proposes practical recommendations for developing more efficient and effective defensive strategies.
Authors:Meghan Wilkinson, Robert H Thomson
Abstract:
Supervised machine learning techniques rely on labeled data to achieve high task performance, but this requires the labels to capture some meaningful differences in the underlying data structure. For training network intrusion detection algorithms, most datasets contain a series of attack classes and a single large benign class which captures all non-attack network traffic. A review of intrusion detection papers and guides that explicitly state their data preprocessing steps identified that the majority took the labeled categories of the dataset at face value when training their algorithms. The present paper evaluates the structure of benign traffic in several common intrusion detection datasets (NSL-KDD, UNSW-NB15, and CIC-IDS 2017) and determines whether there are meaningful sub-categories within this traffic which may improve overall multi-classification performance using common machine learning techniques. We present an overview of some unsupervised clustering techniques (e.g., HDBSCAN, Mean Shift Clustering) and show how they differentially cluster the benign traffic space.
Authors:Jihane Najar, Marinos Tsantekidis, Aris Sotiropoulos, Vassilis Prevelakis
Abstract:
In today's dynamic cyber threat landscape, organizations must take proactive steps to bolster their cybersecurity defenses. Cyber threat hunting is a proactive and iterative process aimed at identifying and mitigating advanced threats that may go undetected by traditional security measures. Rather than waiting for automated security systems to flag potential threats, threat hunting involves actively searching for signs of malicious activity within an organization's network. In this paper, we present the Forensic Visualization Toolkit, a powerful tool designed for digital forensics investigations, analysis of digital evidence, and advanced visualizations to enhance cybersecurity situational awareness and risk management and empower security analysts with an intuitive and interactive tool. Through practical, real-world scenarios, we demonstrate how FVT significantly amplifies the capabilities of cybersecurity professionals, enabling them to effectively identify, analyze, and respond to threats. Furthermore, it is important to highlight that FVT has been integrated into, utilized, and continually enhanced within various EU-funded research projects over recent years.
Authors:Pritam Sen, Yao Ma, Cristian Borcea
Abstract:
We present CryptGNN, a secure and effective inference solution for third-party graph neural network (GNN) models in the cloud, which are accessed by clients as ML as a service (MLaaS). The main novelty of CryptGNN is its secure message passing and feature transformation layers using distributed secure multi-party computation (SMPC) techniques. CryptGNN protects the client's input data and graph structure from the cloud provider and the third-party model owner, and it protects the model parameters from the cloud provider and the clients. CryptGNN works with any number of SMPC parties, does not require a trusted server, and is provably secure even if P-1 out of P parties in the cloud collude. Theoretical analysis and empirical experiments demonstrate the security and efficiency of CryptGNN.
Authors:Honglan Yu, Yibin Wang, Feifei Dai, Dong Liu, Haihui Fan, Xiaoyan Gu
Abstract:
CPU-based trusted execution environments (TEEs) and differential privacy (DP) have gained wide applications for private inference. Due to high inference latency in TEEs, researchers use partition-based approaches that offload linear model components to GPUs. However, dense nonlinear layers of large language models (LLMs) result in significant communication overhead between TEEs and GPUs. DP-based approaches apply random noise to protect data privacy, but this compromises LLM performance and semantic understanding. To overcome the above drawbacks, this paper proposes CMIF, a Confidential and efficient Model Inference Framework. CMIF confidentially deploys the embedding layer in the client-side TEE and subsequent layers on GPU servers. Meanwhile, it optimizes the Report-Noisy-Max mechanism to protect sensitive inputs with a slight decrease in model performance. Extensive experiments on Llama-series models demonstrate that CMIF reduces additional inference overhead in TEEs while preserving user data privacy.
Authors:Sichen Zhu, Hoyeung Leung, Xiaoyi Wang, Jia Wei, Honghui Xu
Abstract:
The integration of Large Language Models (LLMs) into financial technology (FinTech) has revolutionized the analysis and processing of complex financial data, driving advancements in real-time decision-making and analytics. With the growing trend of deploying AI models on edge devices for financial applications, ensuring the privacy of sensitive financial data has become a significant challenge. To address this, we propose DPFinLLM, a privacy-enhanced, lightweight LLM specifically designed for on-device financial applications. DPFinLLM combines a robust differential privacy mechanism with a streamlined architecture inspired by state-of-the-art models, enabling secure and efficient processing of financial data. This proposed DPFinLLM can not only safeguard user data from privacy breaches but also ensure high performance across diverse financial tasks. Extensive experiments on multiple financial sentiment datasets validate the effectiveness of DPFinLLM, demonstrating its ability to achieve performance comparable to fully fine-tuned models, even under strict privacy constraints.
Authors:Saskia Kaltenbrunner, Michael Schmidbauer
Abstract:
Modern, data-driven medical research requires the processing of sensitive health data on a large scale. However, this data is subject to special protection under the GDPR, which is why processing regularly raises data protection concerns in practice. These concerns are particularly prevalent when sensitive personal data is processed without informed consent. This article analyses options for data processing in the field of medical research without consent and describes the legal framework for anonymisation under the GDPR, the national Austrian implementation of the research exemption, and their interaction. -- Moderne, datengetriebene medizinische Forschung erfordert die Verarbeitung sensibler Gesundheitsdaten in grossem Ausmass. Diese sind im System der DSGVO jedoch besonders geschützt, weswegen einer rechtssicheren Verarbeitung in der Praxis regelmässig datenschutzrechtliche Bedenken entgegenstehen. Diese Bedenken bestehen insbesondere bei Verarbeitung sensibler personenbezogener Daten ohne informierte Einwilligung. Dieser Beitrag analysiert daher Möglichkeiten zur Datenverarbeitung im Bereich der medizinischen Forschung fernab der Einwilligung und beschreibt hierfür das rechtliche Rahmenwerk für Anonymisierung der DSGVO, die nationale, österreichische Umsetzung der Forschungsausnahme und ihr Zusammenspiel.
Authors:Michael Veale, Frederik Zuiderveen Borgesius
Abstract:
This article discusses the troubled relationship between contemporary advertising technology (adtech) systems, in particular systems of real-time bidding (RTB, also known as programmatic advertising) underpinning much behavioral targeting on the web and through mobile applications. This article analyzes the extent to which practices of RTB are compatible with the requirements regarding a legal basis for processing, transparency, and security in European data protection law. We first introduce the technologies at play through explaining and analyzing the systems deployed online today. Following that, we turn to the law. Rather than analyze RTB against every provision of the General Data Protection Regulation (GDPR), we consider RTB in the context of the GDPR's requirement of a legal basis for processing and the GDPR's transparency and security requirements. We show, first, that the GDPR requires prior consent of the internet user for RTB, as other legal bases are not appropriate. Second, we show that it is difficult - and perhaps impossible - for website publishers and RTB companies to meet the GDPR's transparency requirements. Third, RTB incentivizes insecure data processing. We conclude that, in concept and in practice, RTB is structurally difficult to reconcile with European data protection law. Therefore, intervention by regulators is necessary.
Authors:Markus Scherer, Jeppe Fredsgaard Blaabjerg, Alexander Sjösten, Matteo Maffei
Abstract:
WebAssembly (Wasm) is rapidly gaining popularity as a distribution format for software components embedded in various security-critical domains. Unfortunately, despite its prudent design, WebAssembly's primary use case as a compilation target for memory-unsafe languages leaves some possibilities for memory corruption. Independently of that, Wasm is an inherently interesting target for information flow analysis due to its interfacing role. Both the information flows between a Wasm module and its embedding context, as well as the memory integrity within a module, can be described by the hyperproperty noninterference. So far, no sound, fully static noninterference analysis for Wasm has been presented, but sound reachability analyses were. This work presents a novel and general approach to lift reachability analyses to noninterference by tracking taints on values and using value-sensitive, relational reasoning to remove them when appropriate. We implement this approach in Wanilla, the first automatic, sound, and fully static noninterference analysis for WebAssembly, and demonstrate its performance and precision by verifying memory integrity and other noninterference properties with several synthetic and real-world benchmarks.
Authors:Ryan McGaughey, Jesus Martinez del Rincon, Ihsen Alouani
Abstract:
Federated Learning (FL) is a distributed learning paradigm designed to address privacy concerns. However, FL is vulnerable to poisoning attacks, where Byzantine clients compromise the integrity of the global model by submitting malicious updates. Robust aggregation methods have been widely adopted to mitigate such threats, relying on the core assumption that malicious updates are inherently out-of-distribution and can therefore be identified and excluded before aggregating client updates. In this paper, we challenge this underlying assumption by showing that a model can be poisoned while keeping malicious updates within the main distribution. We propose Chameleon Poisoning (CHAMP), an adaptive and evasive poisoning strategy that exploits side-channel feedback from the aggregation process to guide the attack. Specifically, the adversary continuously infers whether its malicious contribution has been incorporated into the global model and adapts accordingly. This enables a dynamic adjustment of the local loss function, balancing a malicious component with a camouflaging component, thereby increasing the effectiveness of the poisoning while evading robust aggregation defenses. CHAMP enables more effective and evasive poisoning, highlighting a fundamental limitation of existing robust aggregation defenses and underscoring the need for new strategies to secure federated learning against sophisticated adversaries. Our approach is evaluated in two datasets reaching an average increase of 47.07% in attack success rate against nine robust aggregation defenses.
Authors:Sam Kumar, Samyukta Yagati, Conor Power, David E. Culler, Raluca Ada Popa
Abstract:
Organizations use data lakes to store and analyze sensitive data. But hackers may compromise data lake storage to bypass access controls and access sensitive data. To address this, we propose Membrane, a system that (1) cryptographically enforces data-dependent access control views over a data lake, (2) without restricting the analytical queries data scientists can run. We observe that data lakes, unlike DBMSes, disaggregate computation and storage into separate trust domains, making at-rest encryption sufficient to defend against remote attackers targeting data lake storage, even when running analytical queries in plaintext. This leads to a new system design for Membrane that combines encryption at rest with SQL-aware encryption. Using block ciphers, a fast symmetric-key primitive with hardware acceleration in CPUs, we develop a new SQL-aware encryption protocol well-suited to at-rest encryption. Membrane adds overhead only at the start of an interactive session due to decrypting views, delaying the first query result by up to $\approx 20\times$; subsequent queries process decrypted data in plaintext, resulting in low amortized overhead.
Authors:Shun Takagi, Satoshi Hasegawa
Abstract:
In cross-device private federated learning, differentially private follow-the-regularized-leader (DP-FTRL) has emerged as a promising privacy-preserving method. However, existing approaches assume a semi-honest server and have not addressed the challenge of securely removing this assumption. This is due to its statefulness, which becomes particularly problematic in practical settings where clients can drop out or be corrupted. While trusted execution environments (TEEs) might seem like an obvious solution, a straightforward implementation can introduce forking attacks or availability issues due to state management. To address this problem, our paper introduces a novel server extension that acts as a trusted computing base (TCB) to realize maliciously secure DP-FTRL. The TCB is implemented with an ephemeral TEE module on the server side to produce verifiable proofs of server actions. Some clients, upon being selected, participate in auditing these proofs with small additional communication and computational demands. This extension solution reduces the size of the TCB while maintaining the system's scalability and liveness. We provide formal proofs based on interactive differential privacy, demonstrating privacy guarantee in malicious settings. Finally, we experimentally show that our framework adds small constant overhead to clients in several realistic settings.
Authors:Behnaz Hassanshahi, Trong Nhan Mai, Benjamin Selwyn Smith, Nicholas Allen
Abstract:
Software ecosystems like Maven Central play a crucial role in modern software supply chains by providing repositories for libraries and build plugins. However, the separation between binaries and their corresponding source code in Maven Central presents a significant challenge, particularly when it comes to linking binaries back to their original build environment. This lack of transparency poses security risks, as approximately 84% of the top 1200 commonly used artifacts are not built using a transparent CI/CD pipeline. Consequently, users must place a significant amount of trust not only in the source code but also in the environment in which these artifacts are built. Rebuilding software artifacts from source provides a robust solution to improve supply chain security. This approach allows for a deeper review of code, verification of binary-source equivalence, and control over dependencies. However, challenges arise due to variations in build environments, such as JDK versions and build commands, which can lead to build failures. Additionally, ensuring that all dependencies are rebuilt from source across large and complex dependency graphs further complicates the process. In this paper, we introduce an extension to Macaron, an industry-grade open-source supply chain security framework, to automate the rebuilding of Maven artifacts from source. Our approach improves upon existing tools, by offering better performance in source code detection and automating the extraction of build specifications from GitHub Actions workflows. We also present a comprehensive root cause analysis of build failures in Java projects and propose a scalable solution to automate the rebuilding of artifacts, ultimately enhancing security and transparency in the open-source supply chain.
Authors:Huu Phu Le, Phuc Hao Do, Vo Hoang Long Nguyen, Nang Hung Van Nguyen
Abstract:
Detecting unseen ransomware is a critical cybersecurity challenge where classical machine learning often fails. While Quantum Machine Learning (QML) presents a potential alternative, its application is hindered by the dimensionality gap between classical data and quantum hardware. This paper empirically investigates a hybrid framework using a Variational Quantum Classifier (VQC) interfaced with a high-dimensional dataset via Principal Component Analysis (PCA). Our analysis reveals a dual challenge for practical QML. A significant information bottleneck was evident, as even the best performing 12-qubit VQC fell short of the classical baselines 97.7\% recall. Furthermore, a non-monotonic performance trend, where performance degraded when scaling from 4 to 8 qubits before improving at 12 qubits suggests a severe trainability issue. These findings highlight that unlocking QMLs potential requires co-developing more efficient data compression techniques and robust quantum optimization strategies.
Authors:Haitao Hu, Peng Chen, Yanpeng Zhao, Yuqi Chen
Abstract:
Large Language Models (LLMs) have been increasingly integrated into computer-use agents, which can autonomously operate tools on a user's computer to accomplish complex tasks. However, due to the inherently unstable and unpredictable nature of LLM outputs, they may issue unintended tool commands or incorrect inputs, leading to potentially harmful operations. Unlike traditional security risks stemming from insecure user prompts, tool execution results from LLM-driven decisions introduce new and unique security challenges. These vulnerabilities span across all components of a computer-use agent. To mitigate these risks, we propose AgentSentinel, an end-to-end, real-time defense framework designed to mitigate potential security threats on a user's computer. AgentSentinel intercepts all sensitive operations within agent-related services and halts execution until a comprehensive security audit is completed. Our security auditing mechanism introduces a novel inspection process that correlates the current task context with system traces generated during task execution. To thoroughly evaluate AgentSentinel, we present BadComputerUse, a benchmark consisting of 60 diverse attack scenarios across six attack categories. The benchmark demonstrates a 87% average attack success rate on four state-of-the-art LLMs. Our evaluation shows that AgentSentinel achieves an average defense success rate of 79.6%, significantly outperforming all baseline defenses.
Authors:Ioannis Koufos, Abdul Rehman Qureshi, Adrian Asensio, Allen Abishek, Efstathios Zaragkas, Ricard Vilalta, Maria Souvalioti, George Xilouris, Michael-Alexandros Kourtis
Abstract:
Traditional risk assessments rely on manual audits and system scans, often causing operational disruptions and leaving security gaps. To address these challenges, this work presents Security Digital Twin-as-a-Service (SDT-aaS), a novel approach that leverages Digital Twin (DT) technology for automated, non-intrusive security compliance. SDT-aaS enables real-time security assessments by mirroring real-world assets, collecting compliance artifacts, and creating machine-readable evidence. The proposed work is a scalable and interoperable solution that supports open standards like CycloneDX and Web of Things (WoT), facilitating seamless integration and efficient compliance management. Empirical results from a moderate-scale infrastructure use case demonstrate its feasibility and performance, paving the way for efficient, on-demand cybersecurity governance with minimal operational impact.
Authors:Simon Cremer, Lydia Jehmlich, Rainer Lenz
Abstract:
Spatial data are gaining increasing importance in many areas of research. Particularly spatial health data are becoming increasingly important for medical research, for example, to better understand relationships between environmental factors and disease patterns. However, their use is often restricted by legal data protection regulations, since georeferenced personal information carries a high risk of re-identification of individuals. To address this issue, what are called geomasking methods are applied to guarantee data protection through targeted displacement of individual data points, while simultaneously maintaining analytical validity within a tolerable range. In the current literature the degree of anonymity of such anonymized georeferenced datasets is often measured by the so-called metric of spatial k-anonymity. However, this metric has considerable shortcomings, particularly regarding its resilience against realistic data attack scenarios. This article classifies the potential data attack scenarios in the context of anonymized georeferenced microdata and introduces appropriate metrics that enable a comprehensive assessment of anonymity adapted to potential data attack scenarios.
Authors:Norman Poh, Daryl Burns
Abstract:
Age verification is increasingly critical for regulatory compliance, user trust, and the protection of minors online. Historically, solutions have struggled with poor accuracy, intrusiveness, and significant security risks. More recently, concerns have shifted toward privacy, surveillance, fairness, and the need for transparent, trustworthy systems. In this paper, we propose Biometric Bound Credentials (BBCreds) as a privacy-preserving approach that cryptographically binds age credentials to an individual's biometric features without storing biometric templates. This ensures only the legitimate, physically present user can access age-restricted services, prevents credential sharing, and addresses both legacy and emerging challenges in age verification. enhances privacy.
Authors:Nan Wang, Nan Wu, Xiangyu Hui, Jiafan Wang, Xin Yuan
Abstract:
As the demand for exercising the "right to be forgotten" grows, the need for verifiable machine unlearning has become increasingly evident to ensure both transparency and accountability. We present {\em zkUnlearner}, the first zero-knowledge framework for verifiable machine unlearning, specifically designed to support {\em multi-granularity} and {\em forgery-resistance}. First, we propose a general computational model that employs a {\em bit-masking} technique to enable the {\em selectivity} of existing zero-knowledge proofs of training for gradient descent algorithms. This innovation enables not only traditional {\em sample-level} unlearning but also more advanced {\em feature-level} and {\em class-level} unlearning. Our model can be translated to arithmetic circuits, ensuring compatibility with a broad range of zero-knowledge proof systems. Furthermore, our approach overcomes key limitations of existing methods in both efficiency and privacy. Second, forging attacks present a serious threat to the reliability of unlearning. Specifically, in Stochastic Gradient Descent optimization, gradients from unlearned data, or from minibatches containing it, can be forged using alternative data samples or minibatches that exclude it. We propose the first effective strategies to resist state-of-the-art forging attacks. Finally, we benchmark a zkSNARK-based instantiation of our framework and perform comprehensive performance evaluations to validate its practicality.
Authors:Jeff Shen, Lindsay M. Smith
Abstract:
We present cryptogram solving as an ideal testbed for studying neural network reasoning and generalization; models must decrypt text encoded with substitution ciphers, choosing from 26! possible mappings without explicit access to the cipher. We develop ALICE (an Architecture for Learning Interpretable Cryptogram dEcipherment), a simple encoder-only Transformer that sets a new state-of-the-art for both accuracy and speed on this decryption problem. Surprisingly, ALICE generalizes to unseen ciphers after training on only ${\sim}1500$ unique ciphers, a minute fraction ($3.7 \times 10^{-24}$) of the possible cipher space. To enhance interpretability, we introduce a novel bijective decoding head that explicitly models permutations via the Gumbel-Sinkhorn method, enabling direct extraction of learned cipher mappings. Through early exit and probing experiments, we reveal how ALICE progressively refines its predictions in a way that appears to mirror common human strategies -- early layers place greater emphasis on letter frequencies, while later layers form word-level structures. Our architectural innovations and analysis methods are applicable beyond cryptograms and offer new insights into neural network generalization and interpretability.
Authors:Paul Benjamin Lowry, Gregory D. Moody, Robert Willison, Clay Posey
Abstract:
The Signalgate incident of March 2025, wherein senior US national security officials inadvertently disclosed sensitive military operational details via the encrypted messaging platform Signal, highlights critical vulnerabilities in organizational security arising from human error, governance gaps, and the misuse of technology. Although smaller in scale when compared to historical breaches involving billions of records, Signalgate illustrates critical systemic issues often overshadowed by a focus on external cyber threats. Employing a case-study approach and systematic review grounded in the NIST Cybersecurity Framework, we analyze the incident to identify patterns of human-centric vulnerabilities and governance challenges common to organizational security failures. Findings emphasize three critical points. (1) Organizational security depends heavily on human behavior, with internal actors often serving as the weakest link despite advanced technical defenses; (2) Leadership tone strongly influences organizational security culture and efficacy, and (3) widespread reliance on technical solutions without sufficient investments in human and organizational factors leads to ineffective practices and wasted resources. From these observations, we propose actionable recommendations for enhancing organizational and national security, including strong leadership engagement, comprehensive adoption of zero-trust architectures, clearer accountability structures, incentivized security behaviors, and rigorous oversight. Particularly during periods of organizational transition, such as mergers or large-scale personnel changes, additional measures become particularly important. Signalgate underscores the need for leaders and policymakers to reorient cybersecurity strategies toward addressing governance, cultural, and behavioral risks.
Authors:Muhammad Arif Hakimi Zamrai, Kamaludin Mohd Yusof
Abstract:
In response to the prevalent concern of TCP SYN flood attacks within the context of Software-Defined Internet of Vehicles (SD-IoV), this study addresses the significant challenge of network security in rapidly evolving vehicular communication systems. This research focuses on optimizing a Random Forest Classifier model to achieve maximum accuracy and minimal detection time, thereby enhancing vehicular network security. The methodology involves preprocessing a dataset containing SYN attack instances, employing feature scaling and label encoding techniques, and applying Stratified K-Fold cross-validation to target key metrics such as accuracy, precision, recall, and F1-score. This research achieved an average value of 0.999998 for all metrics with a SYN DoS attack detection time of 0.24 seconds. Results show that the fine-tuned Random Forest model, configured with 20 estimators and a depth of 10, effectively differentiates between normal and malicious traffic with high accuracy and minimal detection time, which is crucial for SD-IoV networks. This approach marks a significant advancement and introduces a state-of-the-art algorithm in detecting SYN flood attacks, combining high accuracy with minimal detection time. It contributes to vehicular network security by providing a robust solution against TCP SYN flood attacks while maintaining network efficiency and reliability.
Authors:Jan Matter, Muoi Tran
Abstract:
The InterPlanetary File System (IPFS) has been successfully established as the de facto standard for decentralized data storage in the emerging Web3. Despite its decentralized nature, IPFS nodes, as well as IPFS content providers, have converged to centralization in large public clouds. Centralization introduces BGP routing-based attacks, such as passive interception and BGP hijacking, as potential threats. Although this attack vector has been investigated for many other Web3 protocols, such as Bitcoin and Ethereum, to the best of our knowledge, it has not been analyzed for the IPFS network. In our work, we bridge this gap and demonstrate that BGP routing attacks can be effectively leveraged to censor content in IPFS. For the analysis, we collected 3,000 content blocks called CIDs and conducted a simulation of BGP hijacking and passive interception against them. We find that a single malicious AS can censor 75% of the IPFS content for more than 57% of all requester nodes. Furthermore, we show that even with a small set of only 62 hijacked prefixes, 70% of the full attack effectiveness can already be reached. We further propose and validate countermeasures based on global collaborative content replication among all nodes in the IPFS network, together with additional robust backup content provider nodes that are well-hardened against BGP hijacking. We hope this work raises awareness about the threat BGP routing-based attacks pose to IPFS and triggers further efforts to harden the live IPFS network against them.
Authors:Shuli Zhao, Qinsheng Hou, Zihan Zhan, Yanhao Wang, Yuchong Xie, Yu Guo, Libo Chen, Shenghong Li, Zhi Xue
Abstract:
Large language models (LLMs) are increasingly integrated with external systems through the Model Context Protocol (MCP), which standardizes tool invocation and has rapidly become a backbone for LLM-powered applications. While this paradigm enhances functionality, it also introduces a fundamental security shift: LLMs transition from passive information processors to autonomous orchestrators of task-oriented toolchains, expanding the attack surface, elevating adversarial goals from manipulating single outputs to hijacking entire execution flows. In this paper, we reveal a new class of attacks, Parasitic Toolchain Attacks, instantiated as MCP Unintended Privacy Disclosure (MCP-UPD). These attacks require no direct victim interaction; instead, adversaries embed malicious instructions into external data sources that LLMs access during legitimate tasks. The malicious logic infiltrates the toolchain and unfolds in three phases: Parasitic Ingestion, Privacy Collection, and Privacy Disclosure, culminating in stealthy exfiltration of private data. Our root cause analysis reveals that MCP lacks both context-tool isolation and least-privilege enforcement, enabling adversarial instructions to propagate unchecked into sensitive tool invocations. To assess the severity, we design MCP-SEC and conduct the first large-scale security census of the MCP ecosystem, analyzing 12,230 tools across 1,360 servers. Our findings show that the MCP ecosystem is rife with exploitable gadgets and diverse attack methods, underscoring systemic risks in MCP platforms and the urgent need for defense mechanisms in LLM-integrated environments.
Authors:I. Buchinskiy, M. Kotov, A. Ponmaheshkumar, R. Perumal
Abstract:
In 2019, V. A. Roman'kov introduced the concept of marginal sets for groups. He developed a theory of marginal sets and demonstrated how these sets can be applied to improve some key exchange schemes. In this paper, we extend his ideas and introduce the concept of marginal sets for semigroups and semirings. For tropical matrix semigroups and semirings, we describe how some marginal sets can be constructed. We apply marginal sets to improve some key exchange schemes over semigroups.
Authors:Timo Glaser, Alexander May, Julian Nowakowski
Abstract:
We study the fundamental problem of guessing cryptographic keys, drawn from some non-uniform probability distribution $D$, as e.g. in LPN, LWE or for passwords. The optimal classical algorithm enumerates keys in decreasing order of likelihood. The optimal quantum algorithm, due to Montanaro (2011), is a sophisticated Grover search. We give the first tight analysis for Montanaro's algorithm, showing that its runtime is $2^{H_{2/3}(D)/2}$, where $H_α(\cdot)$ denotes Renyi entropy with parameter $α$. Interestingly, this is a direct consequence of an information theoretic result called Arikan's Inequality (1996) -- which has so far been missed in the cryptographic community -- that tightly bounds the runtime of classical key guessing by $2^{H_{1/2}(D)}$. Since $H_{2/3}(D) < H_{1/2}(D)$ for every non-uniform distribution $D$, we thus obtain a super-quadratic quantum speed-up $s>2$ over classical key guessing. As another main result, we provide the first thorough analysis of guessing in a multi-key setting. Specifically, we consider the task of attacking many keys sampled independently from some distribution $D$, and aim to guess a fraction of them. For product distributions $D = Ï^n$, we show that any constant fraction of keys can be guessed within $2^{H(D)}$ classically and $2 ^{H(D)/2}$ quantumly per key, where $H(Ï)$ denotes Shannon entropy. In contrast, Arikan's Inequality implies that guessing a single key costs $2^{H_{1/2}(D)}$ classically and $2^{H_{2/3}(D)/2}$ quantumly. Since $H(D) < H_{2/3}(D) < H_{1/2}(D)$, this shows that in a multi-key setting the guessing cost per key is substantially smaller than in a single-key setting, both classically and quantumly.
Authors:Yangheran Piao, Jingjie Li, Daniel W. Woods
Abstract:
As AI is increasingly integrated into products and critical systems, researchers are paying greater attention to identifying related vulnerabilities. Effective remediation depends on whether vendors are willing to accept and respond to AI vulnerability reports. In this paper, we examine the disclosure policies of 264 AI vendors. Using a mixed-methods approach, our quantitative analysis finds that 36% of vendors provide no disclosure channel, and only 18% explicitly mention AI-related risks. Vulnerabilities involving data access, authorization, and model extraction are generally considered in-scope, while jailbreaking and hallucination are frequently excluded. Through qualitative analysis, we further identify three vendor postures toward AI vulnerabilities - proactive clarification (n = 46, include active supporters, AI integrationists, and back channels), silence (n = 115, include self-hosted and hosted vendors), and restrictive (n = 103). Finally, by comparing vendor policies against 1,130 AI incidents and 359 academic publications, we show that bug bounty policy evolution has lagged behind both academic research and real-world events.
Authors:Soumya Bhoumik, Sarbari Mitra, Rohit Raj Sharma, Kuldeep Namdeo
Abstract:
Identity-based cryptography (IBC), proposed by Adi Shamir, revolutionized public key authentication by eliminating the need for certificates, enabling a more efficient and scalable approach to cryptographic systems. Meanwhile, in \cite{Katsumata2024group}, Katsumata et al. were the first to present the blind signature protocol based on the hardness assumption of isogeny with provable security, which resembles the Schnorr blind signature. Building upon these foundational concepts, we propose an Identity-Based Blind Signature Scheme with an Honest Zero-Knowledge Verifier utilizing the CSIDH framework. This scheme combines blind signatures for privacy preservation with zero-knowledge proofs to ensure the verifier's honesty without revealing any additional information. Leveraging the quantum-resistant properties of CSIDH, a post-quantum secure scheme based on supersingular isogenies, our scheme offers strong protection against quantum adversaries while maintaining computational efficiency. We analyze the security of the introduced protocol in the standard cryptographic model and demonstrate its effectiveness in safeguarding privacy and verifier honesty. Furthermore, we present a performance evaluation, confirming the practical viability of this quantum-resistant cryptographic solution for privacy-preserving applications. This work advances the creation of secure, and scalable cryptographic systems for the post-quantum era.
Authors:Banhirup Sengupta, Peenal Gupta, Souvik Sengupta
Abstract:
The Number Theoretic Transform (NTT) can be regarded as a variant of the Discrete Fourier Transform. NTT has been quite a powerful mathematical tool in developing Post-Quantum Cryptography and Homomorphic Encryption. The Fourier Transform essentially decomposes a signal into its frequencies. They are traditionally sine or cosine waves. NTT works more over groups or finite fields rather than on a continuous signal and polynomials work as the analog of sine waves in case of NTT. Fast Fourier Trnasform (FFT) style NTT or fast NTT has been proven to be useful in lattice-based cryptography due to its ability to reduce the complexity of polynomial multiplication from quadratic to quasilinear. We have introduced the concepts of cyclic, negacyclic convolutions along with NTT and its inverse and their fast versions.
Authors:Andrew Yeo, Daeseon Choi
Abstract:
Large Language Models (LLMs) have seen rapid adoption in recent years, with industries increasingly relying on them to maintain a competitive advantage. These models excel at interpreting user instructions and generating human-like responses, leading to their integration across diverse domains, including consulting and information retrieval. However, their widespread deployment also introduces substantial security risks, most notably in the form of prompt injection and jailbreak attacks. To systematically evaluate LLM vulnerabilities -- particularly to external prompt injection -- we conducted a series of experiments on eight commercial models. Each model was tested without supplementary sanitization, relying solely on its built-in safeguards. The results exposed exploitable weaknesses and emphasized the need for stronger security measures. Four categories of attacks were examined: direct injection, indirect (external) injection, image-based injection, and prompt leakage. Comparative analysis indicated that Claude 3 demonstrated relatively greater robustness; nevertheless, empirical findings confirm that additional defenses, such as input normalization, remain necessary to achieve reliable protection.
Authors:Hai Dinh-Tuan, Sandro Rodriguez Garzon, Jianeng Fu
Abstract:
In the evolving landscape of future mobile networks, there is a critical need for secure and trustful communication modalities to support dynamic interactions among core network components of different network domains. This paper proposes the application of W3C-endorsed Decentralized Identifiers (DIDs) to establish secure and trustful communication channels among network functions in 5G and subsequent generations. A new communication agent is introduced that integrates seamlessly with 5G-standardized network functions and utilizes a DID-based application layer transport protocol to ensure confidentiality, integrity, and authenticity for cross-domain interactions. A comparative analysis of the two different versions of the DID-based communication protocol for inter network function communication reveals compatibility advantages of the latest protocol iteration. Furthermore, a comprehensive evaluation of the communication overhead caused by both protocol iterations compared to traditional TCP/TLS shows the benefits of using DIDs to improve communication security, albeit with performance loses compared to TCP/TLS. These results uncover the potential of DID-based communication for future mobile networks but also point out areas for optimization.
Authors:Labani Halder, Tanmay Sen, Sarbani Palit
Abstract:
Human Activity Recognition (HAR) using multimodal sensor data remains challenging due to noisy or incomplete measurements, scarcity of labeled examples, and privacy concerns. Traditional centralized deep learning approaches are often constrained by infrastructure availability, network latency, and data sharing restrictions. While federated learning (FL) addresses privacy by training models locally and sharing only model parameters, it still has to tackle issues arising from the use of heterogeneous multimodal data and differential privacy requirements. In this article, a Graph-based Multimodal Federated Learning framework, GraMFedDHAR, is proposed for HAR tasks. Diverse sensor streams such as a pressure mat, depth camera, and multiple accelerometers are modeled as modality-specific graphs, processed through residual Graph Convolutional Neural Networks (GCNs), and fused via attention-based weighting rather than simple concatenation. The fused embeddings enable robust activity classification, while differential privacy safeguards data during federated aggregation. Experimental results show that the proposed MultiModalGCN model outperforms the baseline MultiModalFFN, with up to 2 percent higher accuracy in non-DP settings in both centralized and federated paradigms. More importantly, significant improvements are observed under differential privacy constraints: MultiModalGCN consistently surpasses MultiModalFFN, with performance gaps ranging from 7 to 13 percent depending on the privacy budget and setting. These results highlight the robustness of graph-based modeling in multimodal learning, where GNNs prove more resilient to the performance degradation introduced by DP noise.
Authors:Waris Gill, Natalie Isak, Matthew Dressman
Abstract:
The widespread deployment of LLMs across enterprise services has created a critical security blind spot. Organizations operate multiple LLM services handling billions of queries daily, yet regulatory compliance boundaries prevent these services from sharing threat intelligence about prompt injection attacks, the top security risk for LLMs. When an attack is detected in one service, the same threat may persist undetected in others for months, as privacy regulations prohibit sharing user prompts across compliance boundaries. We present BinaryShield, the first privacy-preserving threat intelligence system that enables secure sharing of attack fingerprints across compliance boundaries. BinaryShield transforms suspicious prompts through a unique pipeline combining PII redaction, semantic embedding, binary quantization, and randomized response mechanism to potentially generate non-invertible fingerprints that preserve attack patterns while providing privacy. Our evaluations demonstrate that BinaryShield achieves an F1-score of 0.94, significantly outperforming SimHash (0.77), the privacy-preserving baseline, while achieving 64x storage reduction and 38x faster similarity search compared to dense embeddings.
Authors:Jennifer King, Kevin Klyman, Emily Capstick, Tiffany Saade, Victoria Hsieh
Abstract:
Hundreds of millions of people now regularly interact with large language models via chatbots. Model developers are eager to acquire new sources of high-quality training data as they race to improve model capabilities and win market share. This paper analyzes the privacy policies of six U.S. frontier AI developers to understand how they use their users' chats to train models. Drawing primarily on the California Consumer Privacy Act, we develop a novel qualitative coding schema that we apply to each developer's relevant privacy policies to compare data collection and use practices across the six companies. We find that all six developers appear to employ their users' chat data to train and improve their models by default, and that some retain this data indefinitely. Developers may collect and train on personal information disclosed in chats, including sensitive information such as biometric and health data, as well as files uploaded by users. Four of the six companies we examined appear to include children's chat data for model training, as well as customer data from other products. On the whole, developers' privacy policies often lack essential information about their practices, highlighting the need for greater transparency and accountability. We address the implications of users' lack of consent for the use of their chat data for model training, data security issues arising from indefinite chat data retention, and training on children's chat data. We conclude by providing recommendations to policymakers and developers to address the data privacy challenges posed by LLM-powered chatbots.
Authors:Tanya Joshi, Krishnendu Guha
Abstract:
This study explores the application of quantum machine learning (QML) algorithms to enhance cybersecurity threat detection, particularly in the classification of malware and intrusion detection within high-dimensional datasets. Classical machine learning approaches encounter limitations when dealing with intricate, obfuscated malware patterns and extensive network intrusion data. To address these challenges, we implement and evaluate various QML algorithms, including Quantum Neural Networks (QNN), Quantum Support Vector Machines (QSVM), and hybrid Quantum Convolutional Neural Networks (QCNN) for malware detection tasks. Our experimental analysis utilized two datasets: the Intrusion dataset, comprising 150 samples with 56 memory-based features derived from Volatility framework analysis, and the ObfuscatedMalMem2022 dataset, containing 58,596 samples with 57 features representing benign and malicious software. Remarkably, our QML methods demonstrated superior performance compared to classical approaches, achieving accuracies of 95% for QNN and 94% for QSVM. These quantum-enhanced methods leveraged quantum superposition and entanglement principles to accurately identify complex patterns within highly obfuscated malware samples that were imperceptible to classical methods. To further advance malware analysis, we propose a novel real-time malware analysis framework that incorporates Quantum Feature Extraction using Quantum Fourier Transform, Quantum Feature Maps, and Classification using Variational Quantum Circuits. This system integrates explainable AI methods, including GradCAM++ and ScoreCAM algorithms, to provide interpretable insights into the quantum decision-making processes.
Authors:Umair Amjid, M. Umar Khan, S. A. Manan Kirmani
Abstract:
The increasing use of Internet of Things (IoT) devices has led to a rise in security related concerns regarding IoT Networks. The surveillance cameras in IoT networks are vulnerable to security threats such as brute force and zero-day attacks which can lead to unauthorized access by hackers and potential spying on the users activities. Moreover, these cameras can be targeted by Denial of Service (DOS) attacks, which will make it unavailable for the user. The proposed AI based framework will leverage machine learning algorithms to analyze network traffic and detect anomalous behavior, allowing for quick detection and response to potential intrusions. The framework will be trained and evaluated using real-world datasets to learn from past security incidents and improve its ability to detect potential intrusion.
Authors:Youssef Chakir, Iyad Lahsen-Cherif
Abstract:
The growing complexity of cyber incidents presents significant challenges for digital forensic investigators, especially in evidence collection and analysis. Public resources are still limited because of ethical, legal, and privacy concerns, even though realistic datasets are necessary to support research and tool developments. To address this gap, we introduce ForensicsData, an extensive Question-Context-Answer (Q-C-A) dataset sourced from actual malware analysis reports. It consists of more than 5,000 Q-C-A triplets. A unique workflow was used to create the dataset, which extracts structured data, uses large language models (LLMs) to transform it into Q-C-A format, and then uses a specialized evaluation process to confirm its quality. Among the models evaluated, Gemini 2 Flash demonstrated the best performance in aligning generated content with forensic terminology. ForensicsData aims to advance digital forensics by enabling reproducible experiments and fostering collaboration within the research community.
Authors:Enis Karaarslan, Esin Güler, Efe Emir Yüce, Cagatay Coban
Abstract:
The scarcity of real-world attack data significantly hinders progress in cybersecurity research and education. Although honeypots like Cowrie effectively collect live threat intelligence, they generate overwhelming volumes of unstructured and heterogeneous logs, rendering manual analysis impractical. As a first step in our project on secure and efficient AI automation, this study explores the use of AI agents for automated log analysis. We present a lightweight and automated approach to process Cowrie honeypot logs. Our approach leverages AI agents to intelligently parse, summarize, and extract insights from raw data, while also considering the security implications of deploying such an autonomous system. Preliminary results demonstrate the pipeline's effectiveness in reducing manual effort and identifying attack patterns, paving the way for more advanced autonomous cybersecurity analysis in future work.
Authors:Simone Bottoni, Giulio Zizzo, Stefano Braghin, Alberto Trombetta
Abstract:
Federated Learning has rapidly expanded from its original inception to now have a large body of research, several frameworks, and sold in a variety of commercial offerings. Thus, its security and robustness is of significant importance. There are many algorithms that provide robustness in the case of malicious clients. However, the aggregator itself may behave maliciously, for example, by biasing the model or tampering with the weights to weaken the models privacy. In this work, we introduce a verifiable federated learning protocol that enables clients to verify the correctness of the aggregators computation without compromising the confidentiality of their updates. Our protocol uses a standard secure aggregation technique to protect individual model updates with a linearly homomorphic authenticator scheme that enables efficient, privacy-preserving verification of the aggregated result. Our construction ensures that clients can detect manipulation by the aggregator while maintaining low computational overhead. We demonstrate that our approach scales to large models, enabling verification over large neural networks with millions of parameters.
Authors:Stefanos Vasileaidis, Thanassis Giannetsos, Matthias Schunter, Bruno Crispo
Abstract:
Live migration of applications is a fundamental capability for enabling resilient computing in modern distributed systems. However, extending this functionality to trusted applications (TA) -- executing within Trusted Execution Environments (TEEs) -- introduces unique challenges such as secure state preservation, integrity verification, replay and rollback prevention, and mitigation of unauthorized cloning of TAs. We present TALOS, a lightweight framework for verifiable state management and trustworthy application migration. While our implementation is prototyped and evaluated using Intel SGX with the Gramine LibOS and RISC-V Keystone (evidencing the framework's portability across diverse TEEs), its design is agnostic to the underlying TEE architecture. Such agility is a necessity in today's network service mesh (collaborative computing across the continuum) where application workloads must be managed across domain boundaries in a harmonized fashion. TALOS is built around the principle of minimizing trust assumptions: TAs are treated as untrusted until explicitly verified, and the migration process does not rely on a trusted third party. To ensure both the integrity and secure launch of the migrated application, TALOS integrates memory introspection and control-flow graph extraction, enabling robust verification of state continuity and execution flow. Thereby achieving strong security guarantees while maintaining efficiency, making it suitable for decentralized settings.
Authors:Huy Hung Ho, Nhan Le Thanh, Nam Nguyen Hong
Abstract:
In the era of Construction 4.0, the industry is embracing a new paradigm of labor elasticity, driven by smart and flexible outsourcing and subcontracting strategies. The increased reliance on specialized subcontractors enables companies to scale labor dynamically based on project demands. This adaptable workforce model presents challenges in managing hierarchical integration and coordinating inter-site collaboration. Our design introduces a subsystem integrated into the Odoo ERP framework, employing a modular architecture to streamline labor management, task tracking, and approval workflows. The system adopts a three-pronged approach to ensure synchronized data exchange between general contractors and subcontractors, while maintaining both security and operational independence. The system features hybrid access control, third-party integration for cross-domain communication, and role-based mapping algorithm across sites. The system supports varying degrees of customization through a unified and consolidated attribute mapping center. This center leverages a tree-like index structure and Lagrange interpolation method to enhance the efficiency of role mapping. Demonstrations highlight practical application in outsourcing, integration, and scalability scenarios, confirming the system's robustness under high user volumes and in offline conditions. Experimental results further show improvements in database performance and workflow adaptability to support a scalable, enterprise-level solution that aligns with the evolving demands of smart construction management.
Authors:Kai Feng, Jeremy Singer, Angelos K Marnerides
Abstract:
Binary-only fuzzing often struggles with achieving thorough code coverage and uncovering hidden vulnerabilities due to limited insight into a program's internal dataflows. Traditional grey-box fuzzers guide test case generation primarily using control flow edge coverage, which can overlook bugs not easily exposed through control flow analysis alone. We argue that integrating dataflow analysis into the fuzzing process can enhance its effectiveness by revealing how data propagates through the program, thereby enabling the exploration of execution paths that control flow-based methods might miss. In this context, we introduce FuzzRDUCC, a novel fuzzing framework that employs symbolic execution to reconstruct definition-use (def-use) chains directly from binary executables. FuzzRDUCC identifies crucial dataflow paths and exposes security vulnerabilities without incurring excessive computational overhead, due to a novel heuristic algorithm that selects relevant def-use chains without affecting the thoroughness of the fuzzing process. We evaluate FuzzRDUCC using the binutils benchmark and demonstrate that it can identify unique crashes not found by state-of-the-art fuzzers. Hence, establishing FuzzRDUCC as a feasible solution for next generation vulnerability detection and discovery mechanisms.
Authors:Brennen Hill, Surendra Parla, Venkata Abhijeeth Balabhadruni, Atharv Prajod Padmalayam, Sujay Chandra Shekara Sharma
Abstract:
The proliferation of Large Language Models (LLMs) has introduced critical security challenges, where adversarial actors can manipulate input prompts to cause significant harm and circumvent safety alignments. These prompt-based attacks exploit vulnerabilities in a model's design, training, and contextual understanding, leading to intellectual property theft, misinformation generation, and erosion of user trust. A systematic understanding of these attack vectors is the foundational step toward developing robust countermeasures. This paper presents a comprehensive literature survey of prompt-based attack methodologies, categorizing them to provide a clear threat model. By detailing the mechanisms and impacts of these exploits, this survey aims to inform the research community's efforts in building the next generation of secure LLMs that are inherently resistant to unauthorized distillation, fine-tuning, and editing.
Authors:Tyler Shumaker, Jessica Carpenter, David Saranchak, Nathaniel D. Bastian
Abstract:
Machine learning (ML) models have the potential to transform military battlefields, presenting a large external pressure to rapidly incorporate them into operational settings. However, it is well-established that these ML models are vulnerable to a number of adversarial attacks throughout the model deployment pipeline that threaten to negate battlefield advantage. One broad category is privacy attacks (such as model inversion) where an adversary can reverse engineer information from the model, such as the sensitive data used in its training. The ability to quantify the risk of model inversion attacks (MIAs) is not well studied, and there is a lack of automated developmental test and evaluation (DT&E) tools and metrics to quantify the effectiveness of privacy loss of the MIA. The current DT&E process is difficult because ML model inversions can be hard for a human to interpret, subjective when they are interpretable, and difficult to quantify in terms of inversion quality. Additionally, scaling the DT&E process is challenging due to many ML model architectures and data modalities that need to be assessed. In this work, we present a novel DT&E tool that quantifies the risk of data privacy loss from MIAs and introduces four adversarial risk dimensions to quantify privacy loss. Our DT&E pipeline combines inversion with vision language models (VLMs) to improve effectiveness while enabling scalable analysis. We demonstrate effectiveness using multiple MIA techniques and VLMs configured for zero-shot classification and image captioning. We benchmark the pipeline using several state-of-the-art MIAs in the computer vision domain with an image classification task that is typical in military applications. In general, our innovative pipeline extends the current model inversion DT&E capabilities by improving the effectiveness and scalability of the privacy loss analysis in an automated fashion.
Authors:Olivier Adjonyo, Sebastien Bardin, Emanuele Bellini, Gilbert Ndollane Dione, Mahmudul Faisal Al Ameen, Robert Merget, Frederic Recoules, Yanis Sellami
Abstract:
The PQDSS standardization process requires cryptographic primitives to be free from vulnerabilities, including timing and cache side-channels. Resistance to timing leakage is therefore an essential property, and achieving this typically relies on software implementations that follow constant-time principles. Moreover, ensuring that all implementations are constant-time is crucial for fair performance comparisons, as secure implementations often incur additional overhead. Such analysis also helps identify scheme proposals that are inherently difficult to implement in constant time. Because constant-time properties can be broken during compilation, it is often necessary to analyze the compiled binary directly. Since manual binary analysis is extremely challenging, automated analysis becomes highly important. Although several tools exist to assist with such analysis, they often have usability limitations and are difficult to set up correctly. To support the developers besides the NIST committee in verifying candidates, we developed a toolchain that automates configuration, execution, and result analysis for several widely used constant-time analysis tools. We selected TIMECOP and Binsec/Rel2 to verify constant-time policy compliance at the binary level, and dudect and RTLF to detect side-channel vulnerabilities through statistical analysis of execution time behavior. We demonstrate its effectiveness and practicability by evaluating the NIST PQDSS round 1 and round 2 implementations. We reported 26 issues in total to the respective developers, and 5 of them have already been fixed. We also discuss our different findings, as well as the benefits of shortcomings of the different tools.
Authors:Gang Liu, Ningjie Li, Cen Chen
Abstract:
Intel SGX and hypervisors isolate non-privileged programs from other software, ensuring confidentiality and integrity. However, side-channel attacks continue to threaten Intel SGX's security, enabling malicious OS to manipulate PTE present bits, induce page faults, and steal memory access traces. Despite extensive research, existing defenses focus on detection or rely on impractical solutions. This paper presents ShieldMMU, a comprehensive solution for mitigating controlled channel attacks, balancing compatibility, performance, and usability. Leveraging a Merkle Tree-inspired Defense Tree (DD-Tree), ShieldMMU protects PTE integrity by detecting, locating, and restoring attacked PTEs. It identifies MMU page table lookup events and side-channel attacks, promptly restoring PTE parameters to prevent page fault traps and ensure secure non-privileged application operation within SGX. Our experiments confirm ShieldMMU's enhanced security and acceptable latency performance.
Authors:Somiya Chhillar, Mary K. Righi, Rebecca E. Sutter, Evgenios M. Kornaropoulos
Abstract:
Despite longstanding criticism from the privacy community, k-anonymity remains a widely used standard for data anonymization, mainly due to its simplicity, regulatory alignment, and preservation of data utility. However, non-experts often defend k-anonymity on the grounds that, in the absence of auxiliary information, no known attacks can compromise its protections. In this work, we refute this claim by introducing Combinatorial Refinement Attacks (CRA), a new class of privacy attacks targeting k-anonymized datasets produced using local recoding. This is the first method that does not rely on external auxiliary information or assumptions about the underlying data distribution. CRA leverages the utility-optimizing behavior of local recoding anonymization of ARX, which is a widely used open-source software for anonymizing data in clinical settings, to formulate a linear program that significantly reduces the space of plausible sensitive values. To validate our findings, we partnered with a network of free community health clinics, an environment where (1) auxiliary information is indeed hard to find due to the population they serve and (2) open-source k-anonymity solutions are attractive due to regulatory obligations and limited resources. Our results on real-world clinical microdata reveal that even in the absence of external information, established anonymization frameworks do not deliver the promised level of privacy, raising critical privacy concerns.
Authors:Wei Xu, Hui Zhu, Yandong Zheng, Song Bian, Ning Sun, Hao Yuan, Dengguo Feng, Hui Li
Abstract:
With the rapid adoption of Models-as-a-Service, concerns about data and model privacy have become increasingly critical. To solve these problems, various privacy-preserving inference schemes have been proposed. In particular, due to the efficiency and interpretability of decision trees, private decision tree evaluation (PDTE) has garnered significant attention. However, existing PDTE schemes suffer from significant limitations: their communication and computation costs scale with the number of trees, the number of nodes, or the tree depth, which makes them inefficient for large-scale models, especially over WAN networks. To address these issues, we propose Kangaroo, a private and amortized decision tree inference framework build upon packed homomorphic encryption. Specifically, we design a novel model hiding and encoding scheme, together with secure feature selection, oblivious comparison, and secure path evaluation protocols, enabling full amortization of the overhead as the number of nodes or trees scales. Furthermore, we enhance the performance and functionality of the framework through optimizations, including same-sharing-for-same-model, latency-aware, and adaptive encoding adjustment strategies. Kangaroo achieves a $14\times$ to $59\times$ performance improvement over state-of-the-art (SOTA) one-round interactive schemes in WAN environments. For large-scale decision tree inference tasks, it delivers a $3\times$ to $44\times$ speedup compared to existing schemes. Notably, Kangaroo enables the evaluation of a random forest with $969$ trees and $411825$ nodes in approximately $60$ ms per tree (amortized) under WAN environments.
Authors:Moontaha Nishat Chowdhury, André Bauer, Minxuan Zhou
Abstract:
In today's data-driven world, recommendation systems personalize user experiences across industries but rely on sensitive data, raising privacy concerns. Fully homomorphic encryption (FHE) can secure these systems, but a significant challenge in applying FHE to recommendation systems is efficiently handling the inherently large and sparse user-item rating matrices. FHE operations are computationally intensive, and naively processing various sparse matrices in recommendation systems would be prohibitively expensive. Additionally, the communication overhead between parties remains a critical concern in encrypted domains. We propose a novel approach combining Compressed Sparse Row (CSR) representation with FHE-based matrix factorization that efficiently handles matrix sparsity in the encrypted domain while minimizing communication costs. Our experimental results demonstrate high recommendation accuracy with encrypted data while achieving the lowest communication costs, effectively preserving user privacy.
Authors:Alan Beadle, Michael L. Scott, John Criswell
Abstract:
Protected user-level libraries have been proposed as a way to allow mutually distrusting applications to safely share kernel-bypass services. In this paper, we identify and solve several previously unaddressed obstacles to realizing this design and identify several optimization opportunities. First, to preserve the kernel's ability to reclaim failed processes, protected library functions must complete in modest, bounded time. We show how to move unbounded waits outside the library itself, enabling synchronous interaction among processes without the need for polling. Second, we show how the bounded time requirement can be leveraged to achieve lower and more stable latency for inter-process interactions. Third, we observe that prior work on protected libraries is vulnerable to a buffer unmapping attack; we prevent this attack by preventing applications from removing pages that they share with the protected library. Fourth, we show how a trusted daemon can respond to asynchronous events and dynamically divide work with application threads in a protected library. By extending and improving the protected library model, our work provides a new way to structure OS services, combining the advantages of kernel bypass and microkernels. We present a set of safety and performance guidelines for developers of protected libraries, and a set of recommendations for developers of future protected library operating systems. We demonstrate the convenience and performance of our approach with a prototype version of the DDS communication service. To the best of our knowledge, this prototype represents the first successful sharing of a kernel-bypass NIC among mutually untrusting applications. Relative to the commercial FastDDS implementation, we achieve approximately 50\% lower latency and up to 7x throughput, with lower CPU utilization.
Authors:Wenhao Chen, Morris Chang, Witawas Srisa-an, Yong Guan
Abstract:
Due to the event driven nature and the versatility of GUI designs in Android programs, it is challenging to generate event sequences with adequate code coverage within a reasonable time. A common approach to handle this issue is to rely on GUI models to generate event sequences. These sequences can be effective in covering GUI states, but inconsistent in exposing program behaviors that require specific inputs. A major obstacle to generate such specific inputs is the lack of a systematic GUI exploration process to accommodate the analysis requirements. In this paper, we introduce Android Path Explorer (APEX), a systematic input generation framework using concolic execution. APEX addresses the limitations of model-based sequence generation by using concolic execution to discover the data dependencies of GUI state transitions. Moreover, concolic execution is also used to prioritize events during the exploration of GUI, which leads to a more robust model and accurate input generation. The key novelty of APEX is that concolic execution is not only used to construct event sequences, but also used to traverse the GUI more systematically. As such, our experimental results show that APEX can be used to generate a set of event sequences that achieve high code coverage, as well as event sequences that reach specific targets.
Authors:Aditya Bhardwaj, Péter Kutas
Abstract:
Blind signatures were first introduced by David Chaum. They allow a user to have a message signed by a signer without revealing the message itself. This property is particularly useful in applications such as electronic voting and digital cash, where user anonymity is important. In a blind signature scheme, the user blinds their message before sending it to the signer, who signs the blinded message. The user then unblinds the signed message to obtain a valid signature that can be verified publicly, ensuring that the signer cannot trace the signed message back to the original unblinded version. A good analogy is placing the message inside an envelope and having the envelope signed. Once the envelope is opened, the signature remains valid for the enclosed message, ensuring that the content remains confidential. Such constructions provide anonymity and privacy to the user but given a practical quantum computer, the security of traditional crypto-systems providing such features will be broken. To address this, the development of quantum-resistant cryptographic protocols is essential for maintaining the security of digital transactions and data. Aligning with the same goal, this work aims to thoroughly review the background of lattice-based blind signatures. We start with the foundations of digital signatures in the classical settings and then move on to lattice-based constructions.
Authors:Shafik Nassar, Ronak Ramachandran
Abstract:
Statistical witness indistinguishability is a relaxation of statistical zero-knowledge which guarantees that the transcript of an interactive proof reveals no information about which valid witness the prover used to generate it. In this paper we define and initiate the study of QSWI, the class of problems with quantum statistically witness indistinguishable proofs. Using inherently quantum techniques from Kobayashi (TCC 2008), we prove that any problem with an honest-verifier quantum statistically witness indistinguishable proof has a 3-message public-coin malicious-verifier quantum statistically witness indistinguishable proof. There is no known analogue of this result for classical statistical witness indistinguishability. As a corollary, our result implies SWI is contained in QSWI. Additionally, we extend the work of Bitansky et al. (STOC 2023) to show that quantum batch proofs imply quantum statistically witness indistinguishable proofs with inverse-polynomial witness indistinguishability error.
Authors:Kazi Hassan Shakib, Muhammad Asfand Hafeez, Arslan Munir
Abstract:
AmphiKey, a dual-mode post-quantum/traditional (PQ/T) hybrid authenticated key exchange mechanism (AKEM) has been designed to secure smart grid communications against both classical and quantum threats. AmphiKey offers two distinct operational modes within a single framework: an Authenticated Mode and a Deniable Mode. The Authenticated Mode employs a blackbox approach, combining ephemeral ML-KEM-768 and X25519 with long-term Raccoon DSA keys to provide forward secrecy and strong, non-repudiable authenticity. This design achieves "OR" confidentiality, where security holds if either of the KEMs is unbroken, and robust "AND" authenticity. For the signature operation, it leverages the 'masking-friendly' Raccoon digital signature (DSA), which is specifically designed for side-channel attack resistance, though this protection is localized to the signing key and does not provide deniability. In contrast, Deniable Mode provides deniable authentication, preserving privacy. The protocol used ML-KEM-768 (AKEM-1), Ephemeral X25519 (AKEM-2), Raccoon-based DSA (Rac) (compared performance to ML-DSA-65), and the Ascon cipher to deliver its security guarantees. Key contributions include providing a flexible protocol with enhanced security, optional deniability, and efficiency adapted to the diverse needs of the smart grid infrastructure. We present a comprehensive performance evaluation on a heterogeneous testbed featuring a powerful server and client (AMD Ryzen 5) and a resource-constrained client (Raspberry Pi). In efficient Deniable mode, the full handshake completes in 0.15 ms on the server and 0.41 ms on the Raspberry Pi client. In contrast, the Authenticated Mode is bottlenecked by the client-side signature generation; the handshake takes 4.8 ms for the Raspberry Pi client to initiate and 0.84 ms for the server to verify.
Authors:I. D. Lutz, A. M. Hill, M. C. Valenti
Abstract:
As 5G networks gain traction in defense applications, ensuring the privacy and integrity of the Authentication and Key Agreement (AKA) protocol is critical. While 5G AKA improves upon previous generations by concealing subscriber identities, it remains vulnerable to replay-based synchronization and linkability threats under realistic adversary models. This paper provides a unified analysis of the standardized 5G AKA flow, identifying several vulnerabilities and highlighting how each exploits protocol behavior to compromise user privacy. To address these risks, we present five lightweight mitigation strategies. We demonstrate through prototype implementation and testing that these enhancements strengthen resilience against linkability attacks with minimal computational and signaling overhead. Among the solutions studied, those introducing a UE-generated nonce emerge as the most promising, effectively neutralizing the identified tracking and correlation attacks with negligible additional overhead. Integrating this extension as an optional feature to the standard 5G AKA protocol offers a backward-compatible, low-overhead path toward a more privacy-preserving authentication framework for both commercial and military 5G deployments.
Authors:Khashayar Khajavi, Tao Wang
Abstract:
Website fingerprinting (WF) attacks remain a significant threat to encrypted traffic, prompting the development of a wide range of defenses. Among these, two prominent classes are regularization-based defenses, which shape traffic using fixed padding rules, and supersequence-based approaches, which conceal traces among predefined patterns. In this work, we present a unified framework for designing an adaptive WF defense that combines the effectiveness of regularization with the provable security of supersequence-style grouping. The scheme first extracts behavioural patterns from traces and clusters them into (k,l)-diverse anonymity sets; an early-time-series classifier (adapted from ECDIRE) then switches from a conservative global set of regularization parameters to the lighter, set-specific parameters. We instantiate the design as Adaptive Tamaraw, a variant of Tamaraw that assigns padding parameters on a per-cluster basis while retaining its original information-theoretic guarantee. Comprehensive experiments on public real-world datasets confirm the benefits. By tuning k, operators can trade privacy for efficiency: in its high-privacy mode Adaptive Tamaraw pushes the bound on any attacker's accuracy below 30%, whereas in efficiency-centred settings it cuts total overhead by 99% compared with classic Tamaraw.
Authors:Kanchon Gharami, Hansaka Aluvihare, Shafika Showkat Moni, Berker Peköz
Abstract:
Large Language Models (LLMs) are increasingly deployed in mission-critical systems, facilitating tasks such as satellite operations, command-and-control, military decision support, and cyber defense. Many of these systems are accessed through application programming interfaces (APIs). When such APIs lack robust access controls, they can expose full or top-k logits, creating a significant and often overlooked attack surface. Prior art has mainly focused on reconstructing the output projection layer or distilling surface-level behaviors. However, regenerating a black-box model under tight query constraints remains underexplored. We address that gap by introducing a constrained replication pipeline that transforms partial logit leakage into a functional deployable substitute model clone. Our two-stage approach (i) reconstructs the output projection matrix by collecting top-k logits from under 10k black-box queries via singular value decomposition (SVD) over the logits, then (ii) distills the remaining architecture into compact student models with varying transformer depths, trained on an open source dataset. A 6-layer student recreates 97.6% of the 6-layer teacher model's hidden-state geometry, with only a 7.31% perplexity increase, and a 7.58 Negative Log-Likelihood (NLL). A 4-layer variant achieves 17.1% faster inference and 18.1% parameter reduction with comparable performance. The entire attack completes in under 24 graphics processing unit (GPU) hours and avoids triggering API rate-limit defenses. These results demonstrate how quickly a cost-limited adversary can clone an LLM, underscoring the urgent need for hardened inference APIs and secure on-premise defense deployments.
Authors:Maryam Mahdi Alhusseini, Mohammad Reza Feizi Derakhshi
Abstract:
In todays rapidly evolving digital landscape, safeguarding network infrastructures against cyberattacks has become a critical priority. This research presents an innovative AI-driven real-time intrusion detection framework designed to enhance network security, particularly in Wireless Sensor Networks (WSNs), Cloud Computing (CC), and Internet of Things (IoT) environments. The system employs classical machine learning models, Logistic Regression, decision trees, and K-Nearest Neighbors, optimized through the novel Energy Valley Optimization (EVO) method using the NSL-KDD dataset. Feature selection significantly reduced the number of input features from 42 to 18, while maintaining strong detection capabilities. The proposed system achieved 98.95 percent. accuracy with Decision Tree, 98.47 percent with K-Nearest Neighbors, and 88.84 percent with Logistic Regression. Moreover, high precision, recall, and F1-scores were attained across all classifiers while substantially reducing training and testing times, making the framework highly suitable for real-time applications. To ensure fair detection across diverse attack types, dataset balancing via Downsampling was applied to address class imbalance challenges. This investigation focuses on the significance of advancing IDSs. in cloud computing and WSNs. Overall, this work advances secure communications by delivering a scalable, low-latency, and high-accuracy intrusion detection solution aligned with the latest trends in artificial intelligence, cybersecurity, and real-time digital networks.
Authors:YuKun Zhu, ManYuan Hua, Hai Huang, YongZhao Zhang, Jie Yang, FengHua Xu, RuiDong Chen, XiaoSong Zhang, JiGuo Yu, Yong Ma
Abstract:
Although encryption protocols such as TLS are widely de-ployed,side-channel metadata in encrypted traffic still reveals patterns that allow application and behavior inference.How-ever,existing fine-grained fingerprinting approaches face two key limitations:(i)reliance on platform-dependent charac-teristics,which restricts generalization across heterogeneous platforms,and(ii)poor scalability for fine-grained behavior identification in open-world settings. In this paper,we present X-PRINT,the first server-centric,URI-based framework for cross-platform fine-grained encrypted-traffic fingerprinting.X-PRINT systematically demonstrates that backend URI invocation patterns can serve as platform-agnostic invariants and are effective for mod-eling fine-grained behaviors.To achieve robust identifica-tion,X-PRINT further leverages temporally structured URI maps for behavior inference and emphasizes the exclusion of platform-or application-specific private URIs to handle unseen cases,thereby improving reliability in open-world and cross-platform settings.Extensive experiments across diverse cross-platform and open-world settings show that X-PRINT achieves state-of-the-art accuracy in fine-grained fingerprint-ing and exhibits strong scalability and robustness.
Authors:Xiangyu Zhang, Mang Ye
Abstract:
In federated learning, participants' uploaded model updates cannot be directly verified, leaving the system vulnerable to malicious attacks. Existing attack strategies have adversaries upload tampered model updates to degrade the global model's performance. However, attackers also degrade their own private models, gaining no advantage. In real-world scenarios, attackers are driven by self-centered motives: their goal is to gain a competitive advantage by developing a model that outperforms those of other participants, not merely to cause disruption. In this paper, we study a novel Self-Centered Federated Learning (SCFL) attack paradigm, in which attackers not only degrade the performance of the global model through attacks but also enhance their own models within the federated learning process. We propose a framework named FedThief, which degrades the performance of the global model by uploading modified content during the upload stage. At the same time, it enhances the private model's performance through divergence-aware ensemble techniques, where "divergence" quantifies the deviation between private and global models, that integrate global updates and local knowledge. Extensive experiments show that our method effectively degrades the global model performance while allowing the attacker to obtain an ensemble model that significantly outperforms the global model.
Authors:Andres Alejandre, Kassandra Delfin, Victor Castano
Abstract:
The use of ASCII art as a novel approach to masking sensitive information in cybercrime, focusing on its potential role in protecting personal data during the delivery process and beyond, is presented. By examining the unique properties of ASCII art and its historical context, this study discusses the advantages and limitations of employing this technique in various cybercrime scenarios. Additionally, providing recommendations for enhancing data security practices and fostering a culture of privacy awareness in both businesses and individuals. The findings suggest that ASCII art, with its simplicity and ambiguity, can serve as an effective tool against cybercriminals, emphasizing the need for robust data security measures and increased privacy awareness in today's interconnected world.
Authors:Jie Zhu, Chihao Shen, Ziyang Li, Jiahao Yu, Yizheng Chen, Kexin Pei
Abstract:
Directed fuzzing aims to find program inputs that lead to specified target program states. It has broad applications, such as debugging system crashes, confirming reported bugs, and generating exploits for potential vulnerabilities. This task is inherently challenging because target states are often deeply nested in the program, while the search space manifested by numerous possible program inputs is prohibitively large. Existing approaches rely on branch distances or manually-specified constraints to guide the search; however, the branches alone are often insufficient to precisely characterize progress toward reaching the target states, while the manually specified constraints are often tailored for specific bug types and thus difficult to generalize to diverse target states and programs. We present Locus, a novel framework to improve the efficiency of directed fuzzing. Our key insight is to synthesize predicates to capture fuzzing progress as semantically meaningful intermediate states, serving as milestones towards reaching the target states. When used to instrument the program under fuzzing, they can reject executions unlikely to reach the target states, while providing additional coverage guidance. To automate this task and generalize to diverse programs, Locus features an agentic framework with program analysis tools to synthesize and iteratively refine the candidate predicates, while ensuring the predicates strictly relax the target states to prevent false rejections via symbolic execution. Our evaluation shows that Locus substantially improves the efficiency of eight state-of-the-art fuzzers in discovering real-world vulnerabilities, achieving an average speedup of 41.6x. So far, Locus has found eight previously unpatched bugs, with one already acknowledged with a draft patch.
Authors:Weijie Liu, Hongbo Chen, Shuo Huai, Zhen Xu, Wenhao Wang, Zhi Li, Zheli Liu
Abstract:
Trusted Execution Environments (TEEs) have emerged as a cornerstone of confidential computing, garnering significant attention from both academia and industry. To enable the secure development, execution, and deployment, of applications on TEE platforms, TEE containers have been introduced as middleware solutions. These containers aim to shield applications from potentially malicious operating systems and orchestration interfaces while maintaining usability and reliability. In this paper, we analyze the isolation strategies employed by existing TEE containers to protect secure applications. To address the challenges in analyzing these interfaces, we designed an automated analyzer to precisely identify and evaluate their isolation boundaries. We observed that some TEE containers fail to achieve their intended goals due to critical design and implementation flaws, such as information leakage, rollback attacks, denial-of-service, and Iago attacks, which pose significant security risks. Drawing from our findings, we share key lessons to guide the development of more secure container solutions and discuss emerging trends in TEE containerization design.
Authors:Alperen Bolat, Sakir Sezer, Kieran McLaughlin, Henry Hui
Abstract:
Integrating cryptographic accelerators into modern CPU architectures presents unique microarchitectural challenges, particularly when extending instruction sets with complex and multistage operations. Hardware-assisted cryptographic instructions, such as Intel's AES-NI and ARM's custom instructions for encryption workloads, have demonstrated substantial performance improvements. However, efficient SHA-3 acceleration remains an open problem due to its distinct permutation-based structure and memory access patterns. Existing solutions primarily rely on standalone coprocessors or software optimizations, often avoiding the complexities of direct microarchitectural integration. This study investigates the architectural challenges of embedding a SHA-3 permutation operation as a custom instruction within a general-purpose processor, focusing on pipelined simultaneous execution, storage utilization, and hardware cost. In this paper, we investigated and prototyped a SHA-3 custom instruction for the RISC-V CPU architecture. Using cycle-accurate GEM5 simulations and FPGA prototyping, our results demonstrate performance improvements of up to 8.02x for RISC-V optimized SHA-3 software workloads and up to 46.31x for Keccak-specific software workloads, with only a 15.09% increase in registers and a 11.51% increase in LUT utilization. These findings provide critical insights into the feasibility and impact of SHA-3 acceleration at the microarchitectural level, highlighting practical design considerations for future cryptographic instruction set extensions.
Authors:Jesus Lopez, Saeefa Rubaiyet Nowmi, Viviana Cadena, Mohammad Saidur Rahman
Abstract:
Classical machine learning (CML) has been extensively studied for malware classification. With the emergence of quantum computing, quantum machine learning (QML) presents a paradigm-shifting opportunity to improve malware detection, though its application in this domain remains largely unexplored. In this study, we investigate two hybrid quantum-classical models -- a Quantum Multilayer Perceptron (QMLP) and a Quantum Convolutional Neural Network (QCNN), for malware classification. Both models utilize angle embedding to encode malware features into quantum states. QMLP captures complex patterns through full qubit measurement and data re-uploading, while QCNN achieves faster training via quantum convolution and pooling layers that reduce active qubits. We evaluate both models on five widely used malware datasets -- API-Graph, EMBER-Domain, EMBER-Class, AZ-Domain, and AZ-Class, across binary and multiclass classification tasks.
Our results show high accuracy for binary classification -- 95-96% on API-Graph, 91-92% on AZ-Domain, and 77% on EMBER-Domain. In multiclass settings, accuracy ranges from 91.6-95.7% on API-Graph, 41.7-93.6% on AZ-Class, and 60.7-88.1% on EMBER-Class. Overall, QMLP outperforms QCNN in complex multiclass tasks, while QCNN offers improved training efficiency at the cost of reduced accuracy.
Authors:Peng Gu, Shuangchen Li, Dylan Stow, Russell Barnes, Liu Liu, Yuan Xie, Eren Kursshan
Abstract:
3D die stacking and 2.5D interposer design are promising technologies to improve integration density, performance and cost. Current approaches face serious issues in dealing with emerging security challenges such as side channel attacks, hardware trojans, secure IC manufacturing and IP piracy. By utilizing intrinsic characteristics of 2.5D and 3D technologies, we propose novel opportunities in designing secure systems. We present: (i) a 3D architecture for shielding side-channel information; (ii) split fabrication using active interposers; (iii) circuit camouflage on monolithic 3D IC, and (iv) 3D IC-based security processing-in-memory (PIM). Advantages and challenges of these designs are discussed, showing that the new designs can improve existing countermeasures against security threats and further provide new security features.
Authors:Kyohei Shiomi, Zhuotao Lian, Toru Nakanishi, Teruaki Kitasuka
Abstract:
Large Language Models (LLMs) are increasingly used to generate dynamic dialogue for game NPCs. However, their integration raises new security concerns. In this study, we examine whether adversarial prompt injection can cause LLM-based NPCs to reveal hidden background secrets that are meant to remain undisclosed.
Authors:Zhuotao Lian, Weiyu Wang, Qingkui Zeng, Toru Nakanishi, Teruaki Kitasuka, Chunhua Su
Abstract:
Large Language Models (LLMs) are widely deployed in applications that accept user-submitted content, such as uploaded documents or pasted text, for tasks like summarization and question answering. In this paper, we identify a new class of attacks, prompt in content injection, where adversarial instructions are embedded in seemingly benign inputs. When processed by the LLM, these hidden prompts can manipulate outputs without user awareness or system compromise, leading to biased summaries, fabricated claims, or misleading suggestions. We demonstrate the feasibility of such attacks across popular platforms, analyze their root causes including prompt concatenation and insufficient input isolation, and discuss mitigation strategies. Our findings reveal a subtle yet practical threat in real-world LLM workflows.
Authors:Mohit Joshi, Manoj Kumar Mishra, S. Karthikeyan
Abstract:
An efficient technique of computing on encrypted data allows a client with limited capability to perform complex operations on a remote fault-tolerant server without leaking anything about the input or output. Quantum computing provides information-theoretic security to solve such a problem, and many such techniques have been proposed under the premises of half-blind quantum computation. However, they are dependent on a fixed non-parametric resource set that comprises some universal combination of $H,S,T,CX, CZ$ or $CCX$ gates. In this study, we show that recursive decryption of the parametric gate, $R_z(θ)$, is possible exactly when $θ=\pmÏ/2^m$ for $m\in \mathbb{Z^{+}}$, and approximately with arbitrary precision $ε$ for given $θ$. We also show that a blind algorithm based on such a technique needs at most $O(\log_2^2(Ï/ε))$ computation steps and communication rounds, while the techniques based on a non-parametric resource set require $O(\ln^{3.97}(1/ε))$ rounds. We use these results to propose a universal scheme of half-blind quantum computation for computing on encrypted data using arbitrary rotation gates. This substantial reduction in the depth of blind circuit is an affirmative step towards the practical application of such techniques in secure NISQ-era computing.
Authors:Anders Mølmen Høst, Pierre Lison, Leon Moonen
Abstract:
Vulnerability databases, such as the National Vulnerability Database (NVD), offer detailed descriptions of Common Vulnerabilities and Exposures (CVEs), but often lack information on their real-world impact, such as the tactics, techniques, and procedures (TTPs) that adversaries may use to exploit the vulnerability. However, manually linking CVEs to their corresponding TTPs is a challenging and time-consuming task, and the high volume of new vulnerabilities published annually makes automated support desirable.
This paper introduces TRIAGE, a two-pronged automated approach that uses Large Language Models (LLMs) to map CVEs to relevant techniques from the ATT&CK knowledge base. We first prompt an LLM with instructions based on MITRE's CVE Mapping Methodology to predict an initial list of techniques. This list is then combined with the results from a second LLM-based module that uses in-context learning to map a CVE to relevant techniques. This hybrid approach strategically combines rule-based reasoning with data-driven inference. Our evaluation reveals that in-context learning outperforms the individual mapping methods, and the hybrid approach improves recall of exploitation techniques. We also find that GPT-4o-mini performs better than Llama3.3-70B on this task. Overall, our results show that LLMs can be used to automatically predict the impact of cybersecurity vulnerabilities and TRIAGE makes the process of mapping CVEs to ATT&CK more efficient.
Keywords: vulnerability impact, CVE, ATT&CK techniques, large language models, automated mapping.
Authors:Lingxiao Wang, Wenjing Dang, Mengyao Zhang, Yue Wang, Xianzong Wu, Sen Chen
Abstract:
For vulnerabilities, Proof-of-Concept (PoC) plays an irreplaceable role in demonstrating the exploitability. PoC reports may include critical information such as specific usage, test platforms, and more, providing essential insights for researchers. However, in reality, due to various PoC templates across PoC platforms, PoC reports extensively suffer from information deficiency, leading the suboptimal quality and limited usefulness. Fortunately, we found that information deficiency of PoC reports could be mitigated by the completion from multiple sources given the same referred vulnerability. In this paper, we conduct the first study on the deficiency of information in PoC reports across public platforms. We began by collecting 173,170 PoC reports from 4 different platforms and defined 8 key aspects that PoCs should contain. By integrating rule-based matching and a fine-tuned BERT-NER model for extraction of key aspects, we discovered that all PoC reports available on public platforms have at least one missing key aspect. Subsequently, we developed a multi-source information fusion method to complete the missing aspect information in PoC reports by leveraging CVE entries and related PoC reports from different sources. Finally, we successfully completed 69,583 PoC reports (40.18% of all reports).
Authors:Abdullah Sahruri, Martin Margala
Abstract:
Logic locking remains one of the most promising defenses against hardware piracy, yet current approaches often face challenges in scalability and design overhead. In this paper, we present TLGLock, a new design paradigm that leverages the structural expressiveness of Threshold Logic Gates (TLGs) and the energy efficiency of charge recycling to enforce key-dependent functionality at the gate level. By embedding the key into the gate's weighted logic and utilizing dynamic charge sharing, TLGLock provides a stateless and compact alternative to conventional locking techniques. We implement a complete synthesis-to-locking flow and evaluate it using ISCAS, ITC, and MCNC benchmarks. Results show that TLGLock achieves up to 30% area, 50% delay, and 20% power savings compared to latch-based locking schemes. In comparison with XOR and SFLL-HD methods, TLGLock offers up to 3x higher SAT attack resistance with significantly lower overhead. Furthermore, randomized key-weight experiments demonstrate that TLGLock can reach up to 100% output corruption under incorrect keys, enabling tunable security at minimal cost. These results position TLGLock as a scalable and resilient solution for secure hardware design.
Authors:Muhammad Ibn Ziauddin, Rownak Rahad Rabbi, SM Mehrab, Fardin Faiyaz, Mosarrat Jahan
Abstract:
Trust computation is crucial for ensuring the security of the Internet of Things (IoT). However, current trust-based mechanisms for IoT have limitations that impact data security. Sliding window-based trust schemes cannot ensure reliable trust computation due to their inability to select appropriate window lengths. Besides, recent trust scores are emphasized when considering the effect of time on trust. This can cause a sudden change in overall trust score based on recent behavior, potentially misinterpreting an honest service provider as malicious and vice versa. Moreover, clustering mechanisms used to filter recommendations in trust computation often lead to slower results. In this paper, we propose a robust trust model to address these limitations. The proposed approach determines the window length dynamically to guarantee accurate trust computation. It uses the harmonic mean of average trust score and time to prevent sudden fluctuations in trust scores. Additionally, an efficient personalized subspace clustering algorithm is used to exclude recommendations. We present a security analysis demonstrating the resiliency of the proposed scheme against bad-mouthing, ballot-stuffing, and on-off attacks. The proposed scheme demonstrates a competitive performance in detecting bad-mouthing attacks, while outperforming existing works with an approximately 44% improvement in accuracy for detecting on-off attacks. It maintains its effectiveness even when the percentage of on-off attackers increases and in scenarios where multiple attacks occur simultaneously. Additionally, the proposed scheme reduces the recommendation filtering time by 95%.
Authors:Adi Mutha, Jitendra Sandu
Abstract:
With the advent of quantum computing, cryptocurrencies that rely on blockchain technology face mounting cryptographic vulnerabilities. This paper presents a comprehensive literature review evaluating how quantum algorithms, specifically Shors and Grovers, could disrupt the foundational security mechanisms of cryptocurrencies. Shors algorithm poses a threat to public-key cryptographic schemes by enabling efficient factorization and discrete logarithm solving, thereby endangering digital signature systems. Grovers algorithm undermines hash-based functions, increasing the feasibility of fifty one percent attacks and hash collisions. By examining the internal mechanisms of major cryptocurrencies such as Bitcoin, Ethereum, Litecoin, Monero, and Zcash, this review identifies specific vulnerabilities in transaction and consensus processes. It further analyses the current hardware limitations of quantum systems and estimates when such attacks could become feasible. In anticipation, it investigates countermeasures including Post-Quantum Cryptography (PQC), Quantum Key Distribution (QKD), and protocol-level modifications such as memory-intensive proof-of-work algorithms and multi-signature schemes. The discussion integrates recent advancements in quantum error correction, hardware scalability, and NIST-standardized cryptographic algorithms. This review concludes that while quantum computers are not yet advanced enough to pose an immediate threat, proactive integration of quantum-resistant solutions is essential. The findings underscore the urgent need for cryptocurrencies to adopt post-quantum cryptographic standards to preserve the decentralized trust, integrity, and security that define blockchain-based digital cryptocurrencies.
Authors:Melissa Kazemi Rad, Alberto Purpura, Himanshu Kumar, Emily Chen, Mohammad Shahed Sorower
Abstract:
We address the problem of data scarcity in harmful text classification for guardrailing applications and introduce GRAID (Geometric and Reflective AI-Driven Data Augmentation), a novel pipeline that leverages Large Language Models (LLMs) for dataset augmentation. GRAID consists of two stages: (i) generation of geometrically controlled examples using a constrained LLM, and (ii) augmentation through a multi-agentic reflective process that promotes stylistic diversity and uncovers edge cases. This combination enables both reliable coverage of the input space and nuanced exploration of harmful content. Using two benchmark data sets, we demonstrate that augmenting a harmful text classification dataset with GRAID leads to significant improvements in downstream guardrail model performance.
Authors:GodsGift Uzor, Hasan Al-Qudah, Ynes Ineza, Abdul Serwadda
Abstract:
The interactive nature of Large Language Models (LLMs), which closely track user data and context, has prompted users to share personal and private information in unprecedented ways. Even when users opt out of allowing their data to be used for training, these privacy settings offer limited protection when LLM providers operate in jurisdictions with weak privacy laws, invasive government surveillance, or poor data security practices. In such cases, the risk of sensitive information, including Personally Identifiable Information (PII), being mishandled or exposed remains high. To address this, we propose the concept of an "LLM gatekeeper", a lightweight, locally run model that filters out sensitive information from user queries before they are sent to the potentially untrustworthy, though highly capable, cloud-based LLM. Through experiments with human subjects, we demonstrate that this dual-model approach introduces minimal overhead while significantly enhancing user privacy, without compromising the quality of LLM responses.
Authors:Min Wang, Chuanpeng Jiang, Zhaohao Wang, Zhengyi Hou, Zhongkui Zhang, Yuanfu Zhao, Hongxi Liu, Weisheng Zhao
Abstract:
Hardware-based security primitives have become critical to enhancing information security in the Internet of Things (IoT) era. Physical unclonable functions (PUFs) utilize the inherent variations in the manufacturing process to generate cryptographic keys unique to a device. Reconfigurable PUFs (rPUFs) can update cryptographic keys for enhanced security in dynamic operational scenarios involving huge amounts of data, which makes them suitable for implementation in CMOS-integrated spin-orbit torque magnetic random access memory (SOT-MRAM) chips. However, a key challenge is achieving real-time reconfiguration independent of the environmental conditions, particularly the operating temperature. We propose a dual-pulse reconfiguration strategy for rPUFs in CMOS-integrated SOT-MRAM chips that effectively widens the operating window and achieves resilience across a wide range of operating temperatures without the need for dynamic feedback that overly complicates circuit design. The proposed strategy lays a solid foundation for the next generation of hardware-based security primitives to protect IoT architectures.
Authors:Nadeem Ahmed, Lei Zhang, Aryya Gangopadhyay
Abstract:
The rapid advancement of quantum computing poses a significant threat to modern cryptographic systems, necessitating the transition to Post-Quantum Cryptography (PQC). This study evaluates the support for PQC algorithms within nine widely used open-source cryptographic libraries -- OpenSSL, wolfSSL, BoringSSL, LibreSSL, Bouncy Castle, libsodium, Crypto++, Botan, and MbedTLS -- focusing on their implementation of the NIST-selected PQC finalists: CRYSTALS-Kyber, CRYSTALS-Dilithium, FALCON, and SPHINCS+. Our analysis, based on the latest available documentation, release notes, and industry reports as of early 2025, reveals a varied state of readiness across these libraries. While some libraries have integrated PQC support or have clear implementation roadmaps, others lag behind, creating potential security risks as quantum threats become more imminent. We discuss key challenges, including performance trade-offs, implementation security, and adoption hurdles in real-world cryptographic applications. Our findings highlight the urgent need for continued research, standardization efforts, and coordinated adoption strategies to ensure a secure transition to the quantum-resistant cryptographic landscape.
Authors:Xiaoli Zhuo, Xuehu Yan, Lintao Liu, Wei Yan
Abstract:
In evolving access structures, the number of participants is countably infinite with no predetermined upper bound. While such structures have been realized in secret sharing, research in secret image sharing has primarily focused on visual cryptography schemes (VCS). However, there exists no construction for $(k,\infty)$ VCS that applies to arbitrary $k$ values without pixel expansion currently, and the contrast requires enhancement. In this paper, we first present a formal mathematical definition of $(k,\infty)$ VCS. Then, propose a $(k,\infty)$ VCS based on random grids that works for arbitrary $k$. In addition, to further improve contrast, we develop optimized $(k,\infty)$ VCS for $k=2$ and $3$, along with contrast enhancement strategies for $k\geq 4$. Theoretical analysis and experimental results demonstrate the superiority of our proposed schemes.
Authors:Julia Boone, Fatemeh Afghah
Abstract:
Cyber-physical systems (CPS) are being increasingly utilized for critical applications. CPS combines sensing and computing elements, often having multi-layer designs with networking, computational, and physical interfaces, which provide them with enhanced capabilities for a variety of application scenarios. However, the combination of physical and computational elements also makes CPS more vulnerable to attacks compared to network-only systems, and the resulting impacts of CPS attacks can be substantial. Intelligent intrusion detection systems (IDS) are an effective mechanism by which CPS can be secured, but the majority of current solutions often train and validate on network traffic-only datasets, ignoring the distinct attacks that may occur on other system layers. In order to address this, we develop an adaptable CPS anomaly detection model that can detect attacks within CPS without the need for previously labeled data. To achieve this, we utilize domain adaptation techniques that allow us to transfer known attack knowledge from a network traffic-only environment to a CPS environment. We validate our approach using a state-of-the-art CPS intrusion dataset that combines network, operating system (OS), and Robot Operating System (ROS) data. Through this dataset, we are able to demonstrate the effectiveness of our model across network traffic-only and CPS environments with distinct attack types and its ability to outperform other anomaly detection methods.
Authors:James C Davis, Sophie Chen, Huiyun Peng, Paschal C Amusuo, Kelechi G Kalu
Abstract:
Stakeholder-based ethics analysis is now a formal requirement for submissions to top cybersecurity research venues. This requirement reflects a growing consensus that cybersecurity researchers must go beyond providing capabilities to anticipating and mitigating the potential harms thereof. However, many cybersecurity researchers may be uncertain about how to proceed in an ethics analysis. In this guide, we provide practical support for that requirement by enumerating stakeholder types and mapping them to common empirical research methods. We also offer worked examples to demonstrate how researchers can identify likely stakeholder exposures in real-world projects. Our goal is to help research teams meet new ethics mandates with confidence and clarity, not confusion.
Authors:Jiaming Hu, Haoyu Wang, Debarghya Mukherjee, Ioannis Ch. Paschalidis
Abstract:
Jailbreak attacks pose a serious challenge to the safe deployment of large language models (LLMs). We introduce CCFC (Core & Core-Full-Core), a dual-track, prompt-level defense framework designed to mitigate LLMs' vulnerabilities from prompt injection and structure-aware jailbreak attacks. CCFC operates by first isolating the semantic core of a user query via few-shot prompting, and then evaluating the query using two complementary tracks: a core-only track to ignore adversarial distractions (e.g., toxic suffixes or prefix injections), and a core-full-core (CFC) track to disrupt the structural patterns exploited by gradient-based or edit-based attacks. The final response is selected based on a safety consistency check across both tracks, ensuring robustness without compromising on response quality. We demonstrate that CCFC cuts attack success rates by 50-75% versus state-of-the-art defenses against strong adversaries (e.g., DeepInception, GCG), without sacrificing fidelity on benign queries. Our method consistently outperforms state-of-the-art prompt-level defenses, offering a practical and effective solution for safer LLM deployment.
Authors:Soham Hans, Nikolos Gurney, Stacy Marsella, Sofia Hirschmann
Abstract:
Understanding and quantifying human cognitive biases from empirical data has long posed a formidable challenge, particularly in cybersecurity, where defending against unknown adversaries is paramount. Traditional cyber defense strategies have largely focused on fortification, while some approaches attempt to anticipate attacker strategies by mapping them to cognitive vulnerabilities, yet they fall short in dynamically interpreting attacks in progress. In recognition of this gap, IARPA's ReSCIND program seeks to infer, defend against, and even exploit attacker cognitive traits. In this paper, we present a novel methodology that leverages large language models (LLMs) to extract quantifiable insights into the cognitive bias of loss aversion from hacker behavior. Our data are collected from an experiment in which hackers were recruited to attack a controlled demonstration network. We process the hacker generated notes using LLMs using it to segment the various actions and correlate the actions to predefined persistence mechanisms used by hackers. By correlating the implementation of these mechanisms with various operational triggers, our analysis provides new insights into how loss aversion manifests in hacker decision-making. The results demonstrate that LLMs can effectively dissect and interpret nuanced behavioral patterns, thereby offering a transformative approach to enhancing cyber defense strategies through real-time, behavior-based analysis.
Authors:Bipin Chhetri, Akbar Siami Namin
Abstract:
Cyberattacks are increasing, and securing against such threats is costing industries billions of dollars annually. Threat Modeling, that is, comprehending the consequences of these attacks, can provide critical support to cybersecurity professionals, enabling them to take timely action and allocate resources that could be used elsewhere. Cybersecurity is heavily dependent on threat modeling, as it assists security experts in assessing and mitigating risks related to identifying vulnerabilities and threats. Recently, there has been a pressing need for automated methods to assess attack descriptions and forecast the future consequences of the increasing complexity of cyberattacks. This study examines how Natural Language Processing (NLP) and deep learning can be applied to analyze the potential impact of cyberattacks by leveraging textual descriptions from the MITRE Common Weakness Enumeration (CWE) database. We emphasize classifying attack consequences into five principal categories: Availability, Access Control, Confidentiality, Integrity, and Other. This paper investigates the use of Bidirectional Encoder Representations from Transformers (BERT) in combination with Hierarchical Attention Networks (HANs) for Multi-label classification, evaluating their performance in comparison with conventional CNN and LSTM-based models. Experimental findings show that BERT achieves an overall accuracy of $0.972$, far higher than conventional deep learning models in multi-label classification. HAN outperforms baseline forms of CNN and LSTM-based models on specific cybersecurity labels. However, BERT consistently achieves better precision and recall, making it more suitable for predicting the consequences of a cyberattack.
Authors:Jinyu Lu, Xinrong Sun, Yunting Tao, Tong Ji, Fanyu Kong, Guoqiang Yang
Abstract:
The widespread adoption of convolutional neural networks (CNNs) in resource-constrained scenarios has driven the development of Machine Learning as a Service (MLaaS) system. However, this approach is susceptible to privacy leakage, as the data sent from the client to the untrusted cloud server often contains sensitive information. Existing CNN privacy-preserving schemes, while effective in ensuring data confidentiality through homomorphic encryption and secret sharing, face efficiency bottlenecks, particularly in convolution operations. In this paper, we propose a novel verifiable privacy-preserving scheme tailored for CNN convolutional layers. Our scheme enables efficient encryption and decryption, allowing resource-constrained clients to securely offload computations to the untrusted cloud server. Additionally, we present a verification mechanism capable of detecting the correctness of the results with a success probability of at least $1-\frac{1}{\left|Z\right|}$. Extensive experiments conducted on 10 datasets and various CNN models demonstrate that our scheme achieves speedups ranging $26 \times$ ~ $\ 87\times$ compared to the original plaintext model while maintaining accuracy.
Authors:Prabath Abeysekara, Hai Dong
Abstract:
We propose a data-driven and context-aware approach to bootstrap trustworthiness of homogeneous Internet of Things (IoT) services in Mobile Edge Computing (MEC) based industrial IoT (IIoT) systems. The proposed approach addresses key limitations in adapting existing trust bootstrapping approaches into MEC-based IIoT systems. These key limitations include, the lack of opportunity for a service consumer to interact with a lesser-known service over a prolonged period of time to get a robust measure of its trustworthiness, inability of service consumers to consistently interact with their peers to receive reliable recommendations of the trustworthiness of a lesser-known service as well as the impact of uneven context parameters in different MEC environments causing uneven trust environments for trust evaluation. In addition, the proposed approach also tackles the problem of data sparsity via enabling knowledge sharing among different MEC environments within a given MEC topology. To verify the effectiveness of the proposed approach, we carried out a comprehensive evaluation on two real-world datasets suitably adjusted to exhibit the context-dependent trust information accumulated in MEC environments within a given MEC topology. The experimental results affirmed the effectiveness of our approach and its suitability to bootstrap trustworthiness of services in MEC-based IIoT systems.
Authors:Sandaru Jayawardana, Sennur Ulukus, Ming Ding, Kanchana Thilakarathna
Abstract:
Local differential privacy (LDP) has emerged as a promising paradigm for privacy-preserving data collection in distributed systems, where users contribute multi-dimensional records with potentially correlated attributes. Recent work has highlighted that correlation-induced privacy leakage (CPL) plays a critical role in shaping the privacy-utility trade-off under LDP, especially when correlations exist among attributes. Nevertheless, it remains unclear to what extent the prevailing assumptions and proposed solutions are valid and how significant CPL is in real-world data. To address this gap, we first perform a comprehensive statistical analysis of five widely used LDP mechanisms -- GRR, RAPPOR, OUE, OLH and Exponential mechanism -- to assess CPL across four real-world datasets. We identify that many primary assumptions and metrics in current approaches fall short of accurately characterising these leakages. Moreover, current studies have been limited to a set of pure LDP (i.e., {δ= 0}) mechanisms. In response, we develop the first algorithmic framework to theoretically quantify CPL for any general approximated LDP (({\varepsilon},δ)-LDP) mechanism. We validate our theoretical results against empirical statistical results and provide a theoretical explanation for the observed statistical patterns. Finally, we propose two novel benchmarks to validate correlation analysis algorithms and evaluate the utility vs CPL of LDP mechanisms. Further, we demonstrate how these findings can be applied to achieve an efficient privacy-utility trade-off in real-world data governance.
Authors:Huipeng Yang, Li Yang, Lichuan Ma, Lu Zhou, Junbo Jia, Anyuan Sang, Xinyue Wang
Abstract:
Remote management devices facilitate critical infrastructure monitoring for administrators but simultaneously increase asset exposure. Sensitive geographical information overlooked in exposed device management pages poses substantial security risks. Therefore, identifying devices that reveal location information due to administrator negligence is crucial for cybersecurity regulation. Despite the rich information exposed by web interfaces of remote management devices, automatically discovering geographical locations remains challenging due to unstructured formats, varying styles, and incomplete geographical details.
This study introduces WebGeoInfer, a structure-free geolocation inference framework utilizing multi-stage information enhancement. WebGeoInfer clusters similar device web pages and analyzes inter-cluster differences to extract potential geographical information, bypassing structural limitations. Through search engine enhancement and Large Language Models mining, the framework extracts geographical coordinates from identified information. WebGeoInfer successfully inferred locations for 5,435 devices across 94 countries and 2,056 cities, achieving accuracy rates of 96.96\%, 88.05\%, and 79.70\% at country, city, and street levels, respectively.
Authors:Jingnan Xu, Leixia Wang, Xiaofeng Meng
Abstract:
To protect privacy for data-collection-based services, local differential privacy (LDP) is widely adopted due to its rigorous theoretical bound on privacy loss. However, mistakes in complex theoretical analysis or subtle implementation errors may undermine its practical guarantee. To address this, auditing is crucial to confirm that LDP protocols truly protect user data. However, existing auditing methods, though, mainly target machine learning and federated learning tasks based on centralized differentially privacy (DP), with limited attention to LDP. Moreover, the few studies on LDP auditing focus solely on simple frequency estimation task for discrete data, leaving correlated key-value data - which requires both discrete frequency estimation for keys and continuous mean estimation for values - unexplored.
To bridge this gap, we propose KV-Auditor, a framework for auditing LDP-based key-value estimation mechanisms by estimating their empirical privacy lower bounds. Rather than traditional LDP auditing methods that relies on binary output predictions, KV-Auditor estimates this lower bound by analyzing unbounded output distributions, supporting continuous data. Specifically, we classify state-of-the-art LDP key-value mechanisms into interactive and non-interactive types. For non-interactive mechanisms, we propose horizontal KV-Auditor for small domains with sufficient samples and vertical KV-Auditor for large domains with limited samples. For interactive mechanisms, we design a segmentation strategy to capture incremental privacy leakage across iterations. Finally, we perform extensive experiments to validate the effectiveness of our approach, offering insights for optimizing LDP-based key-value estimators.
Authors:Anyuan Sang, Lu Zhou, Li Yang, Junbo Jia, Huipeng Yang, Pengbin Feng, Jianfeng Ma
Abstract:
Learning-based Provenance-based Intrusion Detection Systems (PIDSes) have become essential tools for anomaly detection in host systems due to their ability to capture rich contextual and structural information, as well as their potential to detect unknown attacks. However, recent studies have shown that these systems are vulnerable to graph manipulation attacks, where attackers manipulate the graph structure to evade detection. While some previous approaches have discussed this type of attack, none have fully addressed it with a robust detection solution, limiting the practical applicability of PIDSes.
To address this challenge, we propose MirGuard, a robust anomaly detection framework that combines logic-aware multi-view augmentation with contrastive representation learning. Rather than applying arbitrary structural perturbations, MirGuard introduces Logic-Aware Noise Injection (LNI) to generate semantically valid graph views, ensuring that all augmentations preserve the underlying causal semantics of the provenance data. These views are then used in a Logic-Preserving Contrastive Learning framework, which encourages the model to learn representations that are invariant to benign transformations but sensitive to adversarial inconsistencies. Comprehensive evaluations on multiple provenance datasets demonstrate that MirGuard significantly outperforms state-of-the-art detectors in robustness against various graph manipulation attacks without sacrificing detection performance and efficiency. Our work represents the first targeted study to enhance PIDS against such adversarial threats, providing a robust and effective solution to modern cybersecurity challenges.
Authors:Chris Cao, Gururaj Saileshwar
Abstract:
Recent work presented at USENIX Security 2025 (SEC'25) claims that occupancy-based attacks can recover AES keys from the MIRAGE randomized cache. In this paper, we examine these claims and find that they arise from a modeling flaw in the SEC'25 paper. Most critically, the SEC'25 paper's simulation of MIRAGE uses a constant seed to initialize the random number generator used for global evictions in MIRAGE, causing every AES encryption they trace to evict the same deterministic sequence of cache lines. This artificially creates a highly repeatable timing pattern that is not representative of a realistic implementation of MIRAGE, where eviction sequences vary randomly between encryptions. When we instead randomize the eviction seed for each run, reflecting realistic operation, the correlation between AES T-table accesses and attacker runtimes disappears, and the attack fails. These findings show that the reported leakage is an artifact of incorrect modeling, and not an actual vulnerability in MIRAGE.
Authors:Jane Carney, Kushal Upreti, Gaby G. Dagher, Tim Andersen
Abstract:
Federated learning enhances traditional deep learning by enabling the joint training of a model with the use of IoT device's private data. It ensures privacy for clients, but is susceptible to data poisoning attacks during training that degrade model performance and integrity. Current poisoning detection methods in federated learning lack a standardized detection method or take significant liberties with trust. In this paper, we present \Sys, a novel blockchain-enabled poison detection framework in federated learning. The framework decentralizes the role of the global server across participating clients. We introduce a judge model used to detect data poisoning in model updates. The judge model is produced by each client and verified to reach consensus on a single judge model. We implement our solution to show \Sys is robust against data poisoning attacks and the creation of our judge model is scalable.
Authors:Jinhwa Kim, Ian G. Harris
Abstract:
While Large Language Models (LLMs) have shown significant advancements in performance, various jailbreak attacks have posed growing safety and ethical risks. Malicious users often exploit adversarial context to deceive LLMs, prompting them to generate responses to harmful queries. In this study, we propose a new defense mechanism called Context Filtering model, an input pre-processing method designed to filter out untrustworthy and unreliable context while identifying the primary prompts containing the real user intent to uncover concealed malicious intent. Given that enhancing the safety of LLMs often compromises their helpfulness, potentially affecting the experience of benign users, our method aims to improve the safety of the LLMs while preserving their original performance. We evaluate the effectiveness of our model in defending against jailbreak attacks through comparative analysis, comparing our approach with state-of-the-art defense mechanisms against six different attacks and assessing the helpfulness of LLMs under these defenses. Our model demonstrates its ability to reduce the Attack Success Rates of jailbreak attacks by up to 88% while maintaining the original LLMs' performance, achieving state-of-the-art Safety and Helpfulness Product results. Notably, our model is a plug-and-play method that can be applied to all LLMs, including both white-box and black-box models, to enhance their safety without requiring any fine-tuning of the models themselves. We will make our model publicly available for research purposes.
Authors:Zheng Jie Wong, Bingquan Shen
Abstract:
Currently, Automatic Speech Recognition (ASR) models are deployed in an extensive range of applications. However, recent studies have demonstrated the possibility of adversarial attack on these models which could potentially suppress or disrupt model output. We investigate and verify the robustness of these attacks and explore if it is possible to increase their imperceptibility. We additionally find that by relaxing the optimisation objective from complete suppression to partial suppression, we can further decrease the imperceptibility of the attack. We also explore possible defences against these attacks and show a low-pass filter defence could potentially serve as an effective defence.
Authors:Jorge López, Charalampos Chatzinakis, Marc Cartigny
Abstract:
Quantum Key Distribution (QKD) networks harness the principles of quantum physics in order to securely transmit cryptographic key material, providing physical guarantees. These networks require traditional management and operational components, such as routing information through the network elements. However, due to the limitations on capacity and the particularities of information handling in these networks, traditional shortest paths algorithms for routing perform poorly on both route planning and online routing, which is counterintuitive. Moreover, due to the scarce resources in such networks, often the expressed demand cannot be met by any assignment of routes. To address both the route planning problem and the need for fair automated suggestions in infeasible cases, we propose to model this problem as a Quadratic Programming (QP) problem. For the online routing problem, we showcase that the shortest (available) paths routing strategy performs poorly in the online setting. Furthermore, we prove that the widest shortest path routing strategy has a competitive ratio greater or equal than $\frac{1}{2}$, efficiently addressing both routing modes in QKD networks.
Authors:Ahmed Alharbi, Hai Dong, Xun Yi
Abstract:
Recent years have witnessed a rising trend in social-sensor cloud identity cloning incidents. However, existing approaches suffer from unsatisfactory performance, a lack of solutions for detecting duplicated accounts, and a lack of large-scale evaluations on real-world datasets. We introduce a novel method for detecting identity cloning in social-sensor cloud service providers. Our proposed technique consists of two primary components: 1) a similar identity detection method and 2) a cryptography-based authentication protocol. Initially, we developed a weakly supervised deep forest model to identify similar identities using non-privacy-sensitive user profile features provided by the service. Subsequently, we designed a cryptography-based authentication protocol to verify whether similar identities were generated by the same provider. Our extensive experiments on a large real-world dataset demonstrate the feasibility and superior performance of our technique compared to current state-of-the-art identity clone detection methods.
Authors:Yuan Qiu, Ke Yi
Abstract:
This paper revisits the DBSCAN problem under differential privacy (DP). Existing DP-DBSCAN algorithms aim at publishing the cluster labels of the input points. However, we show that both empirically and theoretically, this approach cannot offer any utility in the published results. We therefore propose an alternative definition of DP-DBSCAN based on the notion of spans. We argue that publishing the spans actually better serves the purposes of visualization and classification of DBSCAN. Then we present a linear-time DP-DBSCAN algorithm achieving the sandwich quality guarantee in any constant dimensions, as well as matching lower bounds on the approximation ratio. A key building block in our algorithm is a linear-time algorithm for constructing a histogram under pure-DP, which is of independent interest. Finally, we conducted experiments on both synthetic and real-world datasets to verify the practical performance of our DP-DBSCAN algorithm.
Authors:Manabu Hirano, Ryotaro Kobayashi
Abstract:
Protecting state-of-the-art AI-based cybersecurity defense systems from cyber attacks is crucial. Attackers create adversarial examples by adding small changes (i.e., perturbations) to the attack features to evade or fool the deep learning model. This paper introduces the concept of low-level behavioral adversarial examples and its threat model of evasive ransomware. We formulate the method and the threat model to generate the optimal source code of evasive malware. We then examine the method using the leaked source code of Conti ransomware with the micro-behavior control function. The micro-behavior control function is our test component to simulate changing source code in ransomware; ransomware's behavior can be changed by specifying the number of threads, file encryption ratio, and delay after file encryption at the boot time. We evaluated how much an attacker can control the behavioral features of ransomware using the micro-behavior control function to decrease the detection rate of a ransomware detector.
Authors:Manabu Hirano, Ryotaro Kobayashi
Abstract:
Double extortion ransomware attacks have become mainstream since many organizations adopt more robust and resilient data backup strategies against conventional crypto-ransomware. This paper presents detailed attack stages, tactics, procedures, and tools used in the double extortion ransomware attacks. We then present a novel detection method using low-level storage and memory behavioral features and network traffic features obtained from a thin hypervisor to establish a defense-in-depth strategy for when attackers compromise OS-level protection. We employed the lightweight \emph{Kitsune} Network Intrusion Detection System (NIDS)'s network feature to detect the data exfiltration phase in double extortion ransomware attacks. Our experimental results showed that the presented method improved by 0.166 in the macro F score of the data exfiltration phase detection rate. Lastly, we discuss the limitations of the presented method and future work.
Authors:Hiroya Kato, Kentaro Kita, Kento Hasegawa, Seira Hidano
Abstract:
As the social implementation of AI has been steadily progressing, research and development related to AI security has also been increasing. However, existing studies have been limited to organizing related techniques, attacks, defenses, and risks in terms of specific domains or AI elements. Thus, it extremely difficult to understand the relationships among them and how negative impacts on stakeholders are brought about. In this paper, we argue that the knowledge, technologies, and social impacts related to AI security should be holistically organized to help understand relationships among them. To this end, we first develop an AI security map that holistically organizes interrelationships among elements related to AI security as well as negative impacts on information systems and stakeholders. This map consists of the two aspects, namely the information system aspect (ISA) and the external influence aspect (EIA). The elements that AI should fulfill within information systems are classified under the ISA. The EIA includes elements that affect stakeholders as a result of AI being attacked or misused. For each element, corresponding negative impacts are identified. By referring to the AI security map, one can understand the potential negative impacts, along with their causes and countermeasures. Additionally, our map helps clarify how the negative impacts on AI-based systems relate to those on stakeholders. We show some findings newly obtained by referring to our map. We also provide several recommendations and open problems to guide future AI security communities.
Authors:Junling Fan, David Koblah, Domenic Forte
Abstract:
Integrated circuits (ICs) are essential to modern electronic systems, yet they face significant risks from physical reverse engineering (RE) attacks that compromise intellectual property (IP) and overall system security. While IC camouflage techniques have emerged to mitigate these risks, existing approaches largely focus on localized gate modifications, neglecting comprehensive deception strategies. To address this gap, we present a machine learning (ML)-driven methodology that integrates cryptic and mimetic cyber deception principles to enhance IC security against RE. Our approach leverages a novel And-Inverter Graph Variational Autoencoder (AIG-VAE) to encode circuit representations, enabling dual-layered camouflage through functional preservation and appearance mimicry. By introducing new variants of covert gates -- Fake Inverters, Fake Buffers, and Universal Transmitters -- our methodology achieves robust protection by obscuring circuit functionality while presenting misleading appearances. Experimental results demonstrate the effectiveness of our strategy in maintaining circuit functionality while achieving high camouflage and similarity scores with minimal structural overhead. Additionally, we validate the robustness of our method against advanced artificial intelligence (AI)-enhanced RE attacks, highlighting its practical applicability in securing IC designs. By bridging the gap in mimetic deception for hardware security, our work sets a new standard for IC camouflage, advancing the application of cyber deception principles to protect critical systems from adversarial threats.
Authors:Dan Ivanov, Tristan Freiberg, Shirin Shahabi, Jonathan Gold, Haruna Isah
Abstract:
DSperse is a modular framework for distributed machine learning inference with strategic cryptographic verification. Operating within the emerging paradigm of distributed zero-knowledge machine learning, DSperse avoids the high cost and rigidity of full-model circuitization by enabling targeted verification of strategically chosen subcomputations. These verifiable segments, or "slices", may cover part or all of the inference pipeline, with global consistency enforced through audit, replication, or economic incentives. This architecture supports a pragmatic form of trust minimization, localizing zero-knowledge proofs to the components where they provide the greatest value. We evaluate DSperse using multiple proving systems and report empirical results on memory usage, runtime, and circuit behavior under sliced and unsliced configurations. By allowing proof boundaries to align flexibly with the model's logical structure, DSperse supports scalable, targeted verification strategies suited to diverse deployment needs.
Authors:Xingran Chen, Parimal Parag, Rohit Bhagat, Zonghong Liu, Salim El Rouayheb
Abstract:
Random walk (RW)-based algorithms have long been popular in distributed systems due to low overheads and scalability, with recent growing applications in decentralized learning. However, their reliance on local interactions makes them inherently vulnerable to malicious behavior. In this work, we investigate an adversarial threat that we term the ``Pac-Man'' attack, in which a malicious node probabilistically terminates any RW that visits it. This stealthy behavior gradually eliminates active RWs from the network, effectively halting the learning process without triggering failure alarms. To counter this threat, we propose the Average Crossing (AC) algorithm--a fully decentralized mechanism for duplicating RWs to prevent RW extinction in the presence of Pac-Man. Our theoretical analysis establishes that (i) the RW population remains almost surely bounded under AC and (ii) RW-based stochastic gradient descent remains convergent under AC, even in the presence of Pac-Man, with a quantifiable deviation from the true optimum. Our extensive empirical results on both synthetic and real-world datasets corroborate our theoretical findings. Furthermore, they uncover a phase transition in the extinction probability as a function of the duplication threshold. We offer theoretical insights by analyzing a simplified variant of the AC, which sheds light on the observed phase transition.
Authors:Guang Yang, Peter Trinh, Alma Nkemla, Amuru Serikyaku, Edward Tatchim, Osman Sharaf
Abstract:
The current Domain Name System (DNS) infrastructure faces critical vulnerabilities including poisoning attacks, censorship mechanisms, and centralized points of failure that compromise internet freedom and security. Recent incidents such as DNS poisoning attacks on ISP customers highlight the urgent need for resilient alternatives. This paper presents a novel blockchain-based Decentralized Domain Name System (DDNS). We designed a specialized Proof-of-Work blockchain to maximize support for DNS-related protocols and achieve node decentralization. The system integrates our blockchain with IPFS for distributed storage, implements cryptographic primitives for end-to-end trust signatures, and achieves Never Trust, Always Verify zero-trust verification. Our implementation achieves 15-second domain record propagation times, supports 20 standard DNS record types, and provides perpetual free .ddns domains. The system has been deployed across distributed infrastructure in San Jose, Los Angeles, and Orange County, demonstrating practical scalability and resistance to traditional DNS manipulation techniques. Performance evaluation shows the system can handle up to Max Theor. TPS 1,111.1 tx/s (minimal transactions) and Max Theor. TPS 266.7 tx/s (regular transactions) for domain operations while maintaining sub-second query resolution through intelligent caching mechanisms.
Authors:Natalia Emelianova, Carlos Kamienski, Ronaldo C. Prati
Abstract:
The exponential growth of the Internet of Things (IoT) has led to the emergence of substantial security concerns, with IoT networks becoming the primary target for cyberattacks. This study examines the potential of Kolmogorov-Arnold Networks (KANs) as an alternative to conventional machine learning models for intrusion detection in IoT networks. The study demonstrates that KANs, which employ learnable activation functions, outperform traditional MLPs and achieve competitive accuracy compared to state-of-the-art models such as Random Forest and XGBoost, while offering superior interpretability for intrusion detection in IoT networks.
Authors:Xiaoli Zhuo, Xuehu Yan, Wei Yan
Abstract:
Visual cryptography schemes (VCSs) belong to a category of secret image sharing schemes that do not require cryptographic knowledge for decryption, instead relying directly on the human visual system. Among VCSs, random grid-based VCS (RGVCS) has garnered widespread attention as it avoids pixel expansion while requiring no basic matrices design. Contrast, a core metric for RGVCS, directly determines the visual quality of recovered images, rendering its optimization a critical research objective. However, existing $(k,n)$ RGVCSs still fail to attain theoretical upper bounds on contrast, highlighting the urgent need for higher-contrast constructions. In this paper, we propose a novel sharing paradigm for RGVCS that constructs $(k,n)$-threshold schemes from arbitrary $(k,n')$-threshold schemes $(k \leq n'\leq n)$, termed \emph{$n'$-grouped $(k,n)$ RGVCS}. This paradigm establishes hierarchical contrast characteristics: participants within the same group achieve optimal recovery quality, while inter-group recovery shows a hierarchical contrast. We further introduce a new contrast calculation formula tailored to the new paradigm. Then, we propose a contrast-enhanced $(k,n)$ RGVCS by setting $n'= k$, achieving the highest contrast value documented in the existing literature. Theoretical analysis and experimental results demonstrate the superiority of our proposed scheme in terms of contrast.
Authors:Muhammad Azmi Umer, Chuadhry Mujeeb Ahmed, Aditya Mathur, Muhammad Taha Jilani
Abstract:
This work focuses on validation of attack pattern mining in the context of Industrial Control System (ICS) security. A comprehensive security assessment of an ICS requires generating a large and variety of attack patterns. For this purpose we have proposed a data driven technique to generate attack patterns for an ICS. The proposed technique has been used to generate over 100,000 attack patterns from data gathered from an operational water treatment plant. In this work we present a detailed case study to validate the attack patterns.
Authors:Haoran Niu, K. Suzanne Barber
Abstract:
It is difficult for individuals and organizations to protect personal information without a fundamental understanding of relative privacy risks. By analyzing over 5,000 empirical identity theft and fraud cases, this research identifies which types of personal data are exposed, how frequently exposures occur, and what the consequences of those exposures are. We construct an Identity Ecosystem graph--a foundational, graph-based model in which nodes represent personally identifiable information (PII) attributes and edges represent empirical disclosure relationships between them (e.g., the probability that one PII attribute is exposed due to the exposure of another). Leveraging this graph structure, we develop a privacy risk prediction framework that uses graph theory and graph neural networks to estimate the likelihood of further disclosures when certain PII attributes are compromised. The results show that our approach effectively answers the core question: Can the disclosure of a given identity attribute possibly lead to the disclosure of another attribute?
Authors:Siddhant Panpatil, Hiskias Dingeto, Haon Park
Abstract:
Despite significant advances in alignment techniques, we demonstrate that state-of-the-art language models remain vulnerable to carefully crafted conversational scenarios that can induce various forms of misalignment without explicit jailbreaking. Through systematic manual red-teaming with Claude-4-Opus, we discovered 10 successful attack scenarios, revealing fundamental vulnerabilities in how current alignment methods handle narrative immersion, emotional pressure, and strategic framing. These scenarios successfully elicited a range of misaligned behaviors, including deception, value drift, self-preservation, and manipulative reasoning, each exploiting different psychological and contextual vulnerabilities. To validate generalizability, we distilled our successful manual attacks into MISALIGNMENTBENCH, an automated evaluation framework that enables reproducible testing across multiple models. Cross-model evaluation of our 10 scenarios against five frontier LLMs revealed an overall 76% vulnerability rate, with significant variations: GPT-4.1 showed the highest susceptibility (90%), while Claude-4-Sonnet demonstrated greater resistance (40%). Our findings demonstrate that sophisticated reasoning capabilities often become attack vectors rather than protective mechanisms, as models can be manipulated into complex justifications for misaligned behavior. This work provides (i) a detailed taxonomy of conversational manipulation patterns and (ii) a reusable evaluation framework. Together, these findings expose critical gaps in current alignment strategies and highlight the need for robustness against subtle, scenario-based manipulation in future AI systems.
Authors:Md Sajidul Islam Sajid, Shihab Ahmed, Ryan Sosnoski
Abstract:
Keyloggers remain a serious threat in modern cybersecurity, silently capturing user keystrokes to steal credentials and sensitive information. Traditional defenses focus mainly on detection and removal, which can halt malicious activity but do little to engage or mislead adversaries. In this paper, we present a deception framework that leverages API hooking to intercept input-related API calls invoked by keyloggers at runtime and inject realistic decoy keystrokes. A core challenge, however, lies in the increasing adoption of anti-hooking techniques by advanced keyloggers. Anti-hooking strategies allow malware to bypass or detect instrumentation. To counter this, we introduce a hardened hooking layer that detects tampering and rapidly reinstates disrupted hooks, ensuring continuity of deception. We evaluate our framework against a custom-built "super keylogger" incorporating multiple evasion strategies, as well as 50 real-world malware samples spanning ten prominent keylogger families. Experimental results demonstrate that our system successfully resists sophisticated bypass attempts, maintains operational stealth, and reliably deceives attackers by feeding them decoys. The system operates with negligible performance overhead and no observable impact on user experience. Our findings show that resilient, runtime deception can play a practical and robust role in confronting advanced threats.
Authors:Chengrui Sun, Hua Zhang, Haoran Gao, Zian Tian, Jianjin Zhao, qi Li, Hongliang Zhu, Zongliang Shen, Shang Wang, Anmin Fu
Abstract:
All current detection of backdoor attacks on deep learning models fall under the category of a non essential features(NEF), which focus on fighting against simple and efficient vertical class backdoor -- trigger is small, few and not overlapping with the source. Evade-adaptive backdoor (EAB) attacks have evaded NEF detection and improved training efficiency. We introduces a precise, efficient and universal detection and defense framework coined as Isolate Trigger (IsTr). IsTr aims to find the hidden trigger by breaking the barrier of the source features. Therefore, it investigates the essence of backdoor triggering, and uses Steps and Differential-Middle-Slice as components to update past theories of distance and gradient. IsTr also plays a positive role in the model, whether the backdoor exists. For example, accurately find and repair the wrong identification caused by deliberate or unintentional training in automatic driving. Extensive experiments on robustness scross various tasks, including MNIST, facial recognition, and traffic sign recognition, confirm the high efficiency, generality and precision of the IsTr. We rigorously evaluated the effectiveness of the IsTr against a series of six EAB attacks, including Badnets, Sin-Wave, Multi-trigger, SSBAs, CASSOCK, HCB. None of these countermeasures evade, even when attacks are combined and the trigger and source overlap.
Authors:Tayyaba Noreen, Qiufen Xia, Muhammad Zeeshan Haider
Abstract:
In the past decade, blockchain has emerged as a promising solution for building secure distributed ledgers and has attracted significant attention. However, current blockchain systems suffer from limited throughput, poor scalability, and high latency. Due to limitations in consensus mechanisms, especially in managing node identities, blockchain is often considered unsuitable for applications such as the Internet of Things (IoT). This paper proposes the Advanced DAG-based Ranking (ADR) protocol to enhance blockchain scalability and throughput. ADR employs a directed acyclic graph (DAG) structure where nodes are positioned based on their rankings. Unlike traditional chains, ADR allows honest nodes to write blocks and verify transactions using a DAG-based topology. The protocol follows a three-step approach to secure the network against double-spending and enhance performance. First, it verifies nodes using their public and private keys before granting entry. Second, it builds an advanced DAG ledger enabling block production and transaction validation. Third, a ranking algorithm filters out malicious nodes, ranks the remaining nodes based on performance, and arranges them topologically. This process increases throughput and ensures robust scalability. We evaluated ADR on Amazon EC2 clusters with over 100 nodes, including scenarios with injected malicious nodes. Simulation results demonstrate that ADR significantly improves transaction throughput and network liveness compared to existing DAG-based blockchains such as IOTA and ByteBall, making it well-suited for IoT applications.
Authors:Bodam Kim, Hiskias Dingeto, Taeyoun Kwon, Dasol Choi, DongGeon Lee, Haon Park, JaeHoon Lee, Jongho Shin
Abstract:
As large language models become increasingly integrated into daily life, audio has emerged as a key interface for human-AI interaction. However, this convenience also introduces new vulnerabilities, making audio a potential attack surface for adversaries. Our research introduces WhisperInject, a two-stage adversarial audio attack framework that can manipulate state-of-the-art audio language models to generate harmful content. Our method uses imperceptible perturbations in audio inputs that remain benign to human listeners. The first stage uses a novel reward-based optimization method, Reinforcement Learning with Projected Gradient Descent (RL-PGD), to guide the target model to circumvent its own safety protocols and generate harmful native responses. This native harmful response then serves as the target for Stage 2, Payload Injection, where we use Projected Gradient Descent (PGD) to optimize subtle perturbations that are embedded into benign audio carriers, such as weather queries or greeting messages. Validated under the rigorous StrongREJECT, LlamaGuard, as well as Human Evaluation safety evaluation framework, our experiments demonstrate a success rate exceeding 86% across Qwen2.5-Omni-3B, Qwen2.5-Omni-7B, and Phi-4-Multimodal. Our work demonstrates a new class of practical, audio-native threats, moving beyond theoretical exploits to reveal a feasible and covert method for manipulating AI behavior.
Authors:Mehdi Akbari Gurabi, Lasse Nitz, Radu-Mihai Castravet, Roman Matzutt, Avikarsha Mandal, Stefan Decker
Abstract:
Existing cybersecurity playbooks are often written in heterogeneous, non-machine-readable formats, which limits their automation and interoperability across Security Orchestration, Automation, and Response platforms. This paper explores the suitability of Large Language Models, combined with Prompt Engineering, to automatically translate legacy incident response playbooks into the standardized, machine-readable CACAO format. We systematically examine various Prompt Engineering techniques and carefully design prompts aimed at maximizing syntactic accuracy and semantic fidelity for control flow preservation. Our modular transformation pipeline integrates a syntax checker to ensure syntactic correctness and features an iterative refinement mechanism that progressively reduces syntactic errors. We evaluate the proposed approach on a custom-generated dataset comprising diverse legacy playbooks paired with manually created CACAO references. The results demonstrate that our method significantly improves the accuracy of playbook transformation over baseline models, effectively captures complex workflow structures, and substantially reduces errors. It highlights the potential for practical deployment in automated cybersecurity playbook transformation tasks.
Authors:Jiewei Lai, Lan Zhang, Chen Tang, Pengcheng Sun, Xinming Wang, Yunhao Wang
Abstract:
Recent advancements in DeepFakes attribution technologies have significantly enhanced forensic capabilities, enabling the extraction of traces left by generative models (GMs) in images, making DeepFakes traceable back to their source GMs. Meanwhile, several attacks have attempted to evade attribution models (AMs) for exploring their limitations, calling for more robust AMs. However, existing attacks fail to eliminate GMs' traces, thus can be mitigated by defensive measures. In this paper, we identify that untraceable DeepFakes can be achieved through a multiplicative attack, which can fundamentally eliminate GMs' traces, thereby evading AMs even enhanced with defensive measures. We design a universal and black-box attack method that trains an adversarial model solely using real data, applicable for various GMs and agnostic to AMs. Experimental results demonstrate the outstanding attack capability and universal applicability of our method, achieving an average attack success rate (ASR) of 97.08\% against 6 advanced AMs on DeepFakes generated by 9 GMs. Even in the presence of defensive mechanisms, our method maintains an ASR exceeding 72.39\%. Our work underscores the potential challenges posed by multiplicative attacks and highlights the need for more robust AMs.
Authors:Anas Mabrouk, Mohamed Hatem, Mohammad Mamun, Sherif Saad
Abstract:
Lateral Movement (LM) attacks continue to pose a significant threat to enterprise security, enabling adversaries to stealthily compromise critical assets. However, the development and evaluation of LM detection systems are impeded by the absence of realistic, well-labeled datasets. To address this gap, we propose LMDG, a reproducible and extensible framework for generating high-fidelity LM datasets. LMDG automates benign activity generation, multi-stage attack execution, and comprehensive labeling of system and network logs, dramatically reducing manual effort and enabling scalable dataset creation. A central contribution of LMDG is Process Tree Labeling, a novel agent-based technique that traces all malicious activity back to its origin with high precision. Unlike prior methods such as Injection Timing or Behavioral Profiling, Process Tree Labeling enables accurate, step-wise labeling of malicious log entries, correlating each with a specific attack step and MITRE ATT\&CK TTPs. To our knowledge, this is the first approach to support fine-grained labeling of multi-step attacks, providing critical context for detection models such as attack path reconstruction. We used LMDG to generate a 25-day dataset within a 25-VM enterprise environment containing 22 user accounts. The dataset includes 944 GB of host and network logs and embeds 35 multi-stage LM attacks, with malicious events comprising less than 1% of total activity, reflecting a realistic benign-to-malicious ratio for evaluating detection systems. LMDG-generated datasets improve upon existing ones by offering diverse LM attacks, up-to-date attack patterns, longer attack timeframes, comprehensive data sources, realistic network architectures, and more accurate labeling.
Authors:Dylan Stow, Russell Barnes, Eren Kurshan, Yuan Xie
Abstract:
Side-channel attacks are important security challenges as they reveal sensitive information about on-chip activities. Among such attacks, the thermal side-channel has been shown to disclose the activities of key functional blocks and even encryption keys. This paper proposes a novel approach to proactively conceal critical activities in the functional layers while minimizing the power dissipation by (i) leveraging inherent characteristics of 3D integration to protect from side-channel attacks and (ii) dynamically generating custom activity patterns to match the activity to be concealed in the functional layers. Experimental analysis shows that 3D technology combined with the proposed run-time algorithm effectively reduces the Side channel vulnerability Factor (SVF) below 0.05 and the Spatial Thermal Side-channel Factor (STSF) below 0.59.
Authors:Guillaume Quispe, Pierre Jouvelot, Gerard Memmi
Abstract:
Nicknames for Group Signatures (NGS) is a new signature scheme that extends Group Signatures (GS) with Signatures with Flexible Public Keys (SFPK). Via GS, each member of a group can sign messages on behalf of the group without revealing his identity, except to a designated auditor. Via SFPK, anyone can create new identities for a particular user, enabling anonymous transfers with only the intended recipient able to trace these new identities.
To prevent the potential abuses that this anonymity brings, NGS integrates flexible public keys into the GS framework to support auditable transfers. In addition to introducing NGS, we describe its security model and provide a mathematical construction proved secure in the Random Oracle Model. As a practical NGS use case, we build NickHat, a blockchain-based token-exchange prototype system on top of Ethereum.
Authors:Sefatun-Noor Puspa, Mashrur Chowdhury
Abstract:
Graphics processing units (GPUs) are becoming an essential part of the intelligent transportation system (ITS) for enabling video-based and artificial intelligence (AI) based applications. GPUs provide high-throughput and energy-efficient computing for tasks like sensor fusion and roadside video analytics. However, these GPUs are one of the most unmonitored components in terms of security. This makes them vulnerable to cyber and hardware attacks, including unauthorized crypto mining. This paper highlights GPU security as a critical blind spot in transportation cybersecurity. To support this concern, it also presents a case study showing the impact of stealthy unauthorized crypto miners on critical AI workloads, along with a detection strategy. We used a YOLOv8-based video processing pipeline running on an RTX 2060 GPU for the case study. A multi-streaming application was executed while a T-Rex crypto miner ran in the background. We monitored how the miner degraded GPU performance by reducing the frame rate and increasing power consumption, which could be a serious concern for GPUs operating in autonomous vehicles or battery-powered edge devices. We observed measurable impacts using GPU telemetry (nvidia-smi) and Nsight Compute profiling, where frame rate dropped by 50 percent, and power usage increased by up to 90%. To detect, we trained lightweight classifiers using extracted telemetry features. All models achieved high accuracy, precision, recall, and F1-score. This paper raises urgent awareness about GPU observability gaps in ITS and offers a replicable framework for detecting GPU misuse through on-device telemetry.
Authors:Changze Huang, Di Wang, Zhi Quan Zhou
Abstract:
Testing network protocol implementations is critical for ensuring the reliability, security, and interoperability of distributed systems. Faults in protocol behavior can lead to vulnerabilities and system failures, especially in real-time and mission-critical applications. A common approach to protocol testing involves constructing Markovian models that capture the state transitions and expected behaviors of the protocol. However, building such models typically requires significant domain expertise and manual effort, making the process time-consuming and difficult to scale across diverse protocols and implementations.
We propose a novel method that leverages large language models (LLMs) to automatically generate sequences for testing network protocol implementations. Our approach begins by defining the full set of possible protocol states, from which the LLM selects a subset to model the target implementation. Using this state-based model, we prompt the LLM to generate code that produces sequences of states. This program serves as a protocol-specific sequences generator. The sequences generator then generates test inputs to call the protocol implementation under various conditions. We evaluated our approach on three widely used network protocol implementations and successfully identified 12 previously unknown vulnerabilities. We have reported them to the respective developers for confirmation. This demonstrates the practical effectiveness of our LLM-assisted fuzzing framework in uncovering real-world security issues.
Authors:Chao Ge, Wei Yuan, Ge Chen, Yanbin Pan, Yuan Shen
Abstract:
Anonymous communication networks have emerged as crucial tools for obfuscating communication pathways and concealing user identities. However, their practical deployments face significant challenges, including susceptibility to artificial intelligence (AI)-powered metadata analysis, difficulties in decentralized architectures, and the absence of provable security guarantees. To address these issues, this paper proposes a novel decentralized anonymous routing protocol with resistance to tracing and traffic analysis. The protocol eliminates dependencies on the threshold model and trusted third-party setups, ensuring indistinguishable identity privacy even in highly adversarial environments. Different from traditional empirical security analysis of anonymous networks, this paper rigorously proves indistinguishable identity privacy for users even in extremely adversarial environments. Furthermore, simulations confirm its practical feasibility, demonstrating both security and efficiency. By achieving information sharing with privacy preservation, the proposed protocol offers a provably secure solution for privacy-preserving communication in digital environments.
Authors:Nicolas Rodriguez-Alvarez, Fernando Rodriguez-Merino
Abstract:
The steady advancement in quantum computer error correction technology has pushed the current record to 48 stable logical qubits, bringing us closer to machines capable of running Shor's algorithm at scales that threaten RSA and ECC cryptography. While the timeline for developing such quantum computers remains uncertain, the cryptographic community must prepare for the transition to quantum-resistant algorithms. CRYSTALS-Kyber, standardized by NIST in 2022, represents a leading post-quantum cryptographic solution, but widespread adoption faces significant challenges. If this migration follows patterns similar to the SHA-1 to SHA-2 transition, organizations may experience prolonged periods of vulnerability, with substantial security and economic consequences. This study evaluates Kyber's practical viability through performance testing across various implementation schemes, utilizing only standard built-in processor acceleration features, some of which include AES-NI and ASIMD, without any specialized hardware additions. Our findings demonstrate that Kyber provides robust security guarantees against quantum attacks while maintaining acceptable performance profiles for most contemporary applications, utilizing only commodity hardware with manufacturer-provided acceleration capabilities.
Authors:Shai Kimhi, Avi Mendlson, Moshe Kimhi
Abstract:
Adversarial patch attacks threaten the reliability of modern vision models. We present PatchMap, the first spatially exhaustive benchmark of patch placement, built by evaluating over 1.5e8 forward passes on ImageNet validation images. PatchMap reveals systematic hot-spots where small patches (as little as 2% of the image) induce confident misclassifications and large drops in model confidence. To demonstrate its utility, we propose a simple segmentation guided placement heuristic that leverages off the shelf masks to identify vulnerable regions without any gradient queries. Across five architectures-including adversarially trained ResNet50, our method boosts attack success rates by 8 to 13 percentage points compared to random or fixed placements. We publicly release PatchMap and the code implementation. The full PatchMap bench (6.5B predictions, multiple backbones) will be released soon to further accelerate research on location-aware defenses and adaptive attacks.
Authors:Jens Dietrich, Behnaz Hassanshahi
Abstract:
The security of software builds has attracted increased attention in recent years in response to incidents like solarwinds and xz. Now, several companies including Oracle and Google rebuild open source projects in a secure environment and publish the resulting binaries through dedicated repositories. This practice enables direct comparison between these rebuilt binaries and the original ones produced by developers and published in repositories such as Maven Central. These binaries are often not bitwise identical; however, in most cases, the differences can be attributed to variations in the build environment, and the binaries can still be considered equivalent. Establishing such equivalence, however, is a labor-intensive and error-prone process.
While there are some tools that can be used for this purpose, they all fall short of providing provenance, i.e. readable explanation of why two binaries are equivalent, or not. To address this issue, we present daleq, a tool that disassembles Java byte code into a relational database, and can normalise this database by applying datalog rules. Those databases can then be used to infer equivalence between two classes. Notably, equivalence statements are accompanied with datalog proofs recording the normalisation process. We demonstrate the impact of daleq in an industrial context through a large-scale evaluation involving 2,714 pairs of jars, comprising 265,690 class pairs. In this evaluation, daleq is compared to two existing bytecode transformation tools. Our findings reveal a significant reduction in the manual effort required to assess non-bitwise equivalent artifacts, which would otherwise demand intensive human inspection. Furthermore, the results show that daleq outperforms existing tools by identifying more artifacts rebuilt from the same code as equivalent, even when no behavioral differences are present.
Authors:Mohammed Sayagh, Mohammad Ghafari
Abstract:
Machine learning and Large language models (LLMs) for vulnerability detection has received significant attention in recent years. Unfortunately, state-of-the-art techniques show that LLMs are unsuccessful in even distinguishing the vulnerable function from its benign counterpart, due to three main problems: Vulnerability detection requires deep analysis, which LLMs often struggle with when making a one-shot prediction. Existing techniques typically perform function-level analysis, whereas effective vulnerability detection requires contextual information beyond the function scope. The focus on binary classification can result in identifying a vulnerability but associating it with the wrong security weaknesses (CWE), which may mislead developers. We propose a novel multi-agent LLM approach to address the challenges of identifying CWEs. This approach consists of three steps: (1) a team of LLM agents performs an exhaustive search for potential CWEs in the function under review, (2) another team of agents identifies relevant external context to support or refute each candidate CWE, and (3) a final agent makes informed acceptance or rejection decisions for each CWE based on the gathered context. A preliminary evaluation of our approach shows promising results. In the PrimeVul dataset, Step 1 correctly identifies the appropriate CWE in 40.9\% of the studied vulnerable functions. We further evaluated the full pipeline on ten synthetic programs and found that incorporating context information significantly reduced false positives from 6 to 9 CWEs to just 1 to 2, while still correctly identifying the true CWE in 9 out of 10 cases.
Authors:Mirza Ahad Baig, Christoph U. Günther, Krzysztof Pietrzak
Abstract:
The blocks in the Bitcoin blockchain record the amount of work W that went into creating them through proofs of work. When honest parties control a majority of the work, consensus is achieved by picking the chain with the highest recorded weight. Resources other than work have been considered to secure such longest-chain blockchains. In Chia, blocks record the amount of space S (via a proof of space) and sequential computational steps V (via a VDF).
In this paper, we ask what weight functions Î(S,V,W) (that assign a weight to a block as a function of the recorded space, speed, and work) are secure in the sense that whenever the weight of the resources controlled by honest parties is larger than the weight of adversarial parties, the blockchain is secure against private double-spending attacks.
We completely classify such functions in an idealized "continuous" model: Î(S,V,W) is secure against private double-spending attacks if and only if it is homogeneous of degree one in the timed resources V and W, i.e., αÎ(S,V,W)=Î(S,αV, αW). This includes Bitcoin rule Î(S,V,W)=W and Chia rule Î(S,V,W) = SV. In a more realistic model where blocks are created at discrete time-points, one additionally needs some mild assumptions on the dependency on S (basically, the weight should not grow too much if S is slightly increased, say linear as in Chia).
Our classification is more general and allows various instantiations of the same resource. It provides a powerful tool for designing new longest-chain blockchains. E.g., consider combining different PoWs to counter centralization, say the Bitcoin PoW W_1 and a memory-hard PoW W_2. Previous work suggested to use W_1+W_2 as weight. Our results show that using {\sqrt}(W_1){\cdot}{\sqrt}(W_2), {\min}{W_1,W_2} are also secure, and we argue that in practice these are much better choices.
Authors:Biswajit Chandra Das, M Saif Sartaz, Syed Ali Reza, Arat Hossain, Md Nasiruddin, Kanchon Kumar Bishnu, Kazi Sharmin Sultana, Sadia Sharmeen Shatyi, MD Azam Khan, Joynal Abed
Abstract:
This study examines how Artificial Intelligence can aid in identifying and mitigating cyber threats in the U.S. across four key areas: intrusion detection, malware classification, phishing detection, and insider threat analysis. Each of these problems has its quirks, meaning there needs to be different approaches to each, so we matched the models to the shape of the problem. For intrusion detection, catching things like unauthorized access, we tested unsupervised anomaly detection methods. Isolation forests and deep autoencoders both gave us useful signals by picking up odd patterns in network traffic. When it came to malware detection, we leaned on ensemble models like Random Forest and XGBoost, trained on features pulled from files and traffic logs. Phishing was more straightforward. We fed standard classifiers (logistic regression, Random Forest, XGBoost) a mix of email and web-based features. These models handled the task surprisingly well. Phishing turned out to be the easiest problem to crack, at least with the data we had. There was a different story. We utilized an LSTM autoencoder to identify behavioral anomalies in user activity logs. It caught every suspicious behavior but flagged a lot of harmless ones too. That kind of model makes sense when the cost of missing a threat is high and you are willing to sift through some noise. What we saw across the board is that performance was not about stacking the most complex model. What mattered was how well the models structure matched the way the data behaved. When signals were strong and obvious, simple models worked fine. But for messier, more subtle threats, we needed something more adaptive, sequence models and anomaly detectors, though they brought their trade offs. The takeaway here is clear in cybersecurity, context drives the solution.
Authors:Zhenhua Zou, Zhuotao Liu, Lepeng Zhao, Qiuyang Zhan
Abstract:
The rapid adoption of agentic AI, powered by large language models (LLMs), is transforming enterprise ecosystems with autonomous agents that execute complex workflows. Yet we observe several key security vulnerabilities in LLM-driven multi-agent systems (MASes): fragmented identity frameworks, insecure communication channels, and inadequate defenses against Byzantine agents or adversarial prompts. In this paper, we present the first systematic analysis of these emerging multi-agent risks and explain why the legacy security strategies cannot effectively address these risks. Afterwards, we propose BlockA2A, the first unified multi-agent trust framework that enables secure and verifiable and agent-to-agent interoperability. At a high level, BlockA2A adopts decentralized identifiers (DIDs) to enable fine-grained cross-domain agent authentication, blockchain-anchored ledgers to enable immutable auditability, and smart contracts to dynamically enforce context-aware access control policies. BlockA2A eliminates centralized trust bottlenecks, ensures message authenticity and execution integrity, and guarantees accountability across agent interactions. Furthermore, we propose a Defense Orchestration Engine (DOE) that actively neutralizes attacks through real-time mechanisms, including Byzantine agent flagging, reactive execution halting, and instant permission revocation. Empirical evaluations demonstrate BlockA2A's effectiveness in neutralizing prompt-based, communication-based, behavioral and systemic MAS attacks. We formalize its integration into existing MAS and showcase a practical implementation for Google's A2A protocol. Experiments confirm that BlockA2A and DOE operate with sub-second overhead, enabling scalable deployment in production LLM-based MAS environments.
Authors:Jiahui Shang, Luning Zhang, Zhongxiang Zheng
Abstract:
While traditional cryptographic research focuses on algorithm-level provable security, many real-world attacks exploit weaknesses in system implementations, such as memory mismanagement, poor entropy sources, and insecure key lifecycles. Existing approaches address these risks in isolation but lack a unified, verifiable framework for modeling implementation-layer security. In this work, we propose Implementation-Level Provable Security, a new paradigm that defines security in terms of structurally verifiable resilience against real-world attack surfaces during deployment. To demonstrate its feasibility, we present SEER (Secure and Efficient Encryption-based Erasure via Ransomware), a file destruction system that repurposes and reinforces the encryption core of Babuk ransomware. SEER incorporates key erasure, entropy validation, and execution consistency checks to ensure a well-constrained, auditable attack surface. Our evaluation shows that SEER achieves strong irrecoverability guarantees while maintaining practical performance. This work demonstrates a shift from abstract theoretical models toward practically verifiable implementation-layer security.
Authors:Chenyi Wang, Ruoyu Song, Raymond Muller, Jean-Philippe Monteuuis, Z. Berkay Celik, Jonathan Petit, Ryan Gerdes, Ming Li
Abstract:
Cooperative perception (CP) enhances situational awareness of connected and autonomous vehicles by exchanging and combining messages from multiple agents. While prior work has explored adversarial integrity attacks that degrade perceptual accuracy, little is known about CP's robustness against attacks on timeliness (or availability), a safety-critical requirement for autonomous driving. In this paper, we present CP-FREEZER, the first latency attack that maximizes the computation delay of CP algorithms by injecting adversarial perturbation via V2V messages. Our attack resolves several unique challenges, including the non-differentiability of point cloud preprocessing, asynchronous knowledge of the victim's input due to transmission delays, and uses a novel loss function that effectively maximizes the execution time of the CP pipeline. Extensive experiments show that CP-FREEZER increases end-to-end CP latency by over $90\times$, pushing per-frame processing time beyond 3 seconds with a 100% success rate on our real-world vehicle testbed. Our findings reveal a critical threat to the availability of CP systems, highlighting the urgent need for robust defenses.
Authors:Alejandro Mata Ali, Jorge MartÃnez MartÃn, Sergio Muñiz Subiñas, Miguel Franco Hernando, Javier Sedano, Ãngel Miguel GarcÃa-Vico
Abstract:
This paper presents an exact and explicit equation for prime factorization, along with an algorithm for its computation. The proposed method is based on the MeLoCoToN approach, which addresses combinatorial optimization problems through classical tensor networks. The presented tensor network performs the multiplication of every pair of possible input numbers and selects those whose product is the number to be factorized. Additionally, in order to make the algorithm more efficient, the number and dimension of the tensors and their contraction scheme are optimized. Finally, a series of tests on the algorithm are conducted, contracting the tensor network both exactly and approximately using tensor train compression, and evaluating its performance.
Authors:Francesco Panebianco, Stefano Bonfanti, Francesco Trovò, Michele Carminati
Abstract:
The generalization capabilities of Large Language Models (LLMs) have led to their widespread deployment across various applications. However, this increased adoption has introduced several security threats, notably in the forms of jailbreaking and data leakage attacks. Additionally, Retrieval Augmented Generation (RAG), while enhancing context-awareness in LLM responses, has inadvertently introduced vulnerabilities that can result in the leakage of sensitive information. Our contributions are twofold. First, we introduce a methodology to analyze historical interaction data from an LLM system, enabling the generation of usage maps categorized by topics (including adversarial interactions). This approach further provides forensic insights for tracking the evolution of jailbreaking attack patterns. Second, we propose LeakSealer, a model-agnostic framework that combines static analysis for forensic insights with dynamic defenses in a Human-In-The-Loop (HITL) pipeline. This technique identifies topic groups and detects anomalous patterns, allowing for proactive defense mechanisms. We empirically evaluate LeakSealer under two scenarios: (1) jailbreak attempts, employing a public benchmark dataset, and (2) PII leakage, supported by a curated dataset of labeled LLM interactions. In the static setting, LeakSealer achieves the highest precision and recall on the ToxicChat dataset when identifying prompt injection. In the dynamic setting, PII leakage detection achieves an AUPRC of $0.97$, significantly outperforming baselines such as Llama Guard.
Authors:Hyeonhak Kim, Donghoe Heo, Seokhie Hong
Abstract:
Quantum money is the cryptographic application of the quantum no-cloning theorem. It has recently been instantiated by Montgomery and Sharif (Asiacrypt '24) from class group actions on elliptic curves. In this work, we propose a concrete cryptanalysis by leveraging the efficiency of evaluating division polynomials with the coordinates of rational points, offering a speedup of O(log^4p) compared to the brute-force attack. Since our attack still requires exponential time, it remains impractical to forge a quantum banknote. Interestingly, due to the inherent properties of quantum money, our attack method also results in a more efficient verification procedure. Our algorithm leverages the properties of quadratic twists to utilize rational points in verifying the cardinality of the superposition of elliptic curves. We expect this approach to contribute to future research on elliptic-curve-based quantum cryptography.
Authors:Estelle Ruellan, Eric Clay, Nicholas Ascoli
Abstract:
Infostealers exfiltrate credentials, session cookies, and sensitive data from infected systems. With over 29 million stealer logs reported in 2024, manual analysis and mitigation at scale are virtually unfeasible/unpractical. While most research focuses on proactive malware detection, a significant gap remains in leveraging reactive analysis of stealer logs and their associated artifacts. Specifically, infection artifacts such as screenshots, image captured at the point of compromise, are largely overlooked by the current literature. This paper introduces a novel approach leveraging Large Language Models (LLMs), more specifically gpt-4o-mini, to analyze infection screenshots to extract potential Indicators of Compromise (IoCs), map infection vectors, and track campaigns. Focusing on the Aurora infostealer, we demonstrate how LLMs can process screenshots to identify infection vectors, such as malicious URLs, installer files, and exploited software themes. Our method extracted 337 actionable URLs and 246 relevant files from 1000 screenshots, revealing key malware distribution methods and social engineering tactics. By correlating extracted filenames, URLs, and infection themes, we identified three distinct malware campaigns, demonstrating the potential of LLM-driven analysis for uncovering infection workflows and enhancing threat intelligence. By shifting malware analysis from traditional log-based detection methods to a reactive, artifact-driven approach that leverages infection screenshots, this research presents a scalable method for identifying infection vectors and enabling early intervention.
Authors:Yufei Chen, Yao Wang, Haibin Zhang, Tao Gu
Abstract:
Retrieval-augmented generation (RAG) systems enhance large language models (LLMs) by integrating external knowledge bases, but this advancement introduces significant privacy risks. Existing privacy attacks on RAG systems can trigger data leakage but often fail to accurately isolate knowledge-base-derived sentences within mixed responses. They also lack robustness when applied across multiple domains. This paper addresses these challenges by presenting a novel black-box attack framework that exploits knowledge asymmetry between RAG and standard LLMs to achieve fine-grained privacy extraction across heterogeneous knowledge landscapes. We propose a chain-of-thought reasoning strategy that creates adaptive prompts to steer RAG systems away from sensitive content. Specifically, we first decompose adversarial queries to maximize information disparity and then apply a semantic relationship scoring to resolve lexical and syntactic ambiguities. We finally train a neural network on these feature scores to precisely identify sentences containing private information. Unlike prior work, our framework generalizes to unseen domains through iterative refinement without pre-defined knowledge. Experimental results show that we achieve over 91% privacy extraction rate in single-domain and 83% in multi-domain scenarios, reducing sensitive sentence exposure by over 65% in case studies. This work bridges the gap between attack and defense in RAG systems, enabling precise extraction of private information while providing a foundation for adaptive mitigation.
Authors:Ahmed Sabbah, Radi Jarrar, Samer Zein, David Mohaisen
Abstract:
Despite outstanding results, machine learning-based Android malware detection models struggle with concept drift, where rapidly evolving malware characteristics degrade model effectiveness. This study examines the impact of concept drift on Android malware detection, evaluating two datasets and nine machine learning and deep learning algorithms, as well as Large Language Models (LLMs). Various feature types--static, dynamic, hybrid, semantic, and image-based--were considered. The results showed that concept drift is widespread and significantly affects model performance. Factors influencing the drift include feature types, data environments, and detection methods. Balancing algorithms helped with class imbalance but did not fully address concept drift, which primarily stems from the dynamic nature of the malware landscape. No strong link was found between the type of algorithm used and concept drift, the impact was relatively minor compared to other variables since hyperparameters were not fine-tuned, and the default algorithm configurations were used. While LLMs using few-shot learning demonstrated promising detection performance, they did not fully mitigate concept drift, highlighting the need for further investigation.
Authors:Ahmed Sabbah, Radi Jarrar, Samer Zein, David Mohaisen
Abstract:
Permission analysis is a widely used method for Android malware detection. It involves examining the permissions requested by an application to access sensitive data or perform potentially malicious actions. In recent years, various machine learning (ML) algorithms have been applied to Android malware detection using permission-based features and feature selection techniques, often achieving high accuracy. However, these studies have largely overlooked important factors such as protection levels and the deprecation or restriction of permissions due to updates in the Android OS -- factors that can contribute to concept drift.
In this study, we investigate the impact of deprecated and restricted permissions on the performance of machine learning models. A large dataset containing 166 permissions was used, encompassing more than 70,000 malware and benign applications. Various machine learning and deep learning algorithms were employed as classifiers, along with different concept drift detection strategies. The results suggest that Android permissions are highly effective features for malware detection, with the exclusion of deprecated and restricted permissions having only a marginal impact on model performance. In some cases, such as with CNN, accuracy improved. Excluding these permissions also enhanced the detection of concept drift using a year-to-year analysis strategy. Dataset balancing further improved model performance, reduced low-accuracy instances, and enhanced concept drift detection via the Kolmogorov-Smirnov test.
Authors:Michael Freenor, Lauren Alvarez, Milton Leal, Lily Smith, Joel Garrett, Yelyzaveta Husieva, Madeline Woodruff, Ryan Miller, Erich Kummerfeld, Rafael Medeiros, Sander Schulhoff
Abstract:
Applications that use Large Language Models (LLMs) are becoming widespread, making the identification of system vulnerabilities increasingly important. Automated Red Teaming accelerates this effort by using an LLM to generate and execute attacks against target systems. Attack generators are evaluated using the Attack Success Rate (ASR) the sample mean calculated over the judgment of success for each attack. In this paper, we introduce a method for optimizing attack generator prompts that applies ASR to individual attacks. By repeating each attack multiple times against a randomly seeded target, we measure an attack's discoverability the expectation of the individual attack success. This approach reveals exploitable patterns that inform prompt optimization, ultimately enabling more robust evaluation and refinement of generators.
Authors:Xinhai Yan, Libing Wu, Zhuangzhuang Zhang, Bingyi Liu, Lijuan Huo, Jing Wang
Abstract:
Federated Learning (FL) enables collaborative model training while preserving data privacy, but it is highly vulnerable to backdoor attacks. Most existing defense methods in FL have limited effectiveness due to their neglect of the model's over-reliance on backdoor triggers, particularly as the proportion of malicious clients increases. In this paper, we propose FedBAP, a novel defense framework for mitigating backdoor attacks in FL by reducing the model's reliance on backdoor triggers. Specifically, first, we propose a perturbed trigger generation mechanism that creates perturbation triggers precisely matching backdoor triggers in location and size, ensuring strong influence on model outputs. Second, we utilize these perturbation triggers to generate benign adversarial perturbations that disrupt the model's dependence on backdoor triggers while forcing it to learn more robust decision boundaries. Finally, we design an adaptive scaling mechanism to dynamically adjust perturbation intensity, effectively balancing defense strength and model performance. The experimental results demonstrate that FedBAP reduces the attack success rates by 0.22%-5.34%, 0.48%-6.34%, and 97.22%-97.6% under three types of backdoor attacks, respectively. In particular, FedBAP demonstrates outstanding performance against novel backdoor attacks.
Authors:Vita Santa Barletta, Danilo Caivano, Gabriel Cellammare, Samuele del Vescovo, Annita Larissa Sciacovelli
Abstract:
Multi-Domain Operations (MDOs) emphasize cross-domain defense against complex and synergistic threats, with civilian infrastructures like smart cities and Connected Autonomous Vehicles (CAVs) emerging as primary targets. As dual-use assets, CAVs are vulnerable to Multi-Surface Threats (MSTs), particularly from Adversarial Machine Learning (AML) which can simultaneously compromise multiple in-vehicle ML systems (e.g., Intrusion Detection Systems, Traffic Sign Recognition Systems). Therefore, this study investigates how key hyperparameters in Decision Tree-based ensemble models-Random Forest (RF), Gradient Boosting (GB), and Extreme Gradient Boosting (XGB)-affect the time required for a Black-Box AML attack i.e. Zeroth Order Optimization (ZOO). Findings show that parameters like the number of trees or boosting rounds significantly influence attack execution time, with RF and GB being more sensitive than XGB. Adversarial Training (AT) time is also analyzed to assess the attacker's window of opportunity. By optimizing hyperparameters, this research supports Defensive Trustworthy AI (D-TAI) practices within MST scenarios and contributes to the development of resilient ML systems for civilian and military domains, aligned with Cyber Social Security framework in MDOs and Human-AI Multi-Domain Task Forces.
Authors:Minh Hoang Nguyen, Anh Minh Ho, Bao Son To
Abstract:
In recent years, cloud security has emerged as a primary concern for enterprises due to the increasing trend of migrating internal infrastructure and applications to cloud environments. This shift is driven by the desire to reduce the high costs and maintenance fees associated with traditional on-premise infrastructure. By leveraging cloud capacities such as high availability and scalability, companies can achieve greater operational efficiency and flexibility. However, this migration also introduces new security challenges. Ensuring the protection of sensitive data, maintaining compliance with regulatory requirements, and mitigating the risks of cyber threats are critical issues that must be addressed. Identity and Access Management (IAM) constitutes the critical security backbone of most cloud deployments, particularly within AWS environments. As organizations adopt AWS to scale applications and store data, the need for a thorough, methodical, and precise enumeration of IAM configurations grows exponentially. Enumeration refers to the systematic mapping and interrogation of identities, permissions, and resource authorizations with the objective of gaining situational awareness. By understanding the interplay between users, groups, and their myriads of policies, whether inline or attached managed policies, security professionals need to enumerate and identify misconfigurations, reduce the risk of unauthorized privilege escalation, and maintain robust compliance postures. This paper will present SkyEye, a cooperative multi-principal IAM enumeration framework, which comprises cutting-edge enumeration models in supporting complete situational awareness regarding the IAMs of provided AWS credentials, crossing the boundary of principal-specific IAM entitlement vision to reveal the complete visionary while insufficient authorization is the main challenge.
Authors:Song Son Ha, Florian Foerster, Thomas Robert Doebbert, Tim Kittel, Dominik Merli, Gerd Scholl
Abstract:
In the era of Industry 4.0, the growing need for secure and efficient communication systems has driven the development of fifth-generation (5G) networks characterized by extremely low latency, massive device connectivity and high data transfer speeds. However, the deployment of 5G networks presents significant security challenges, requiring advanced and robust solutions to counter increasingly sophisticated cyber threats. This paper proposes a testbed and software architecture to strengthen the security of Private 5G Networks, particularly in industrial communication environments.
Authors:Zhicheng Zhang, Peizhuo Lv, Mengke Wan, Jiang Fang, Diandian Guo, Yezeng Chen, Yinlong Liu, Wei Ma, Jiyan Sun, Liru Geng
Abstract:
Recently, Deep Learning (DL) models have been increasingly deployed on end-user devices as On-Device AI, offering improved efficiency and privacy. However, this deployment trend poses more serious Intellectual Property (IP) risks, as models are distributed on numerous local devices, making them vulnerable to theft and redistribution. Most existing ownership protection solutions (e.g., backdoor-based watermarking) are designed for cloud-based AI-as-a-Service (AIaaS) and are not directly applicable to large-scale distribution scenarios, where each user-specific model instance must carry a unique watermark. These methods typically embed a fixed watermark, and modifying the embedded watermark requires retraining the model. To address these challenges, we propose Hot-Swap MarkBoard, an efficient watermarking method. It encodes user-specific $n$-bit binary signatures by independently embedding multiple watermarks into a multi-branch Low-Rank Adaptation (LoRA) module, enabling efficient watermark customization without retraining through branch swapping. A parameter obfuscation mechanism further entangles the watermark weights with those of the base model, preventing removal without degrading model performance. The method supports black-box verification and is compatible with various model architectures and DL tasks, including classification, image generation, and text generation. Extensive experiments across three types of tasks and six backbone models demonstrate our method's superior efficiency and adaptability compared to existing approaches, achieving 100\% verification accuracy.
Authors:Alessandro Giaconia, Muoi Tran, Laurent Vanbever, Stefano Vissicchio
Abstract:
The Border Gateway Protocol (BGP) remains a fragile pillar of Internet routing. BGP hijacks still occurr daily. While full deployment of Route Origin Validation (ROV) is ongoing, attackers have already adapted, launching post-ROV attacks such as forged-origin hijacks. To detect these, recent approaches like DFOH [Holterbach et al., USENIX NSDI '24] and BEAM [Chen et al., USENIX Security '24] apply machine learning (ML) to analyze data from globally distributed BGP monitors, assuming anomalies will stand out against historical patterns. However, this assumption overlooks a key threat: BGP monitors themselves can be misled by adversaries injecting bogus routes. This paper shows that state-of-the-art hijack detection systems like DFOH and BEAM are vulnerable to data poisoning. Using large-scale BGP simulations, we show that attackers can evade detection with just a handful of crafted announcements beyond the actual hijack. These announcements are indeed sufficient to corrupt the knowledge base used by ML-based defenses and distort the metrics they rely on. Our results highlight a worrying weakness of relying solely on public BGP data.
Authors:Hadis Rezaei, Mojtaba Eshghie, Karl Anderesson, Francesco Palmieri
Abstract:
While catastrophic attacks on Ethereum persist, vulnerability research remains fixated on implementation-level smart contract bugs, creating a gap between academic understanding of vulnerabilities and the root causes of high-impact, real-world incidents. To address this, we employ a two-pronged methodology: first, a systematic literature review of 71 academic papers to build a catalog of 24 active and 5 deprecated vulnerabilities. Second, we conduct an in-depth, empirical analysis of 50 of the most severe real-world attacks between 2022 and 2025, collectively incurring over $1.09B in losses, to identify their root causes. We introduce the concept of "exploit chains" by revealing that many incidents are not caused by isolated vulnerabilities but by combinations of human, operational, and economic design flaws that link with implementation bugs to enable an attack. Our analysis yields insights on how decentralized applications are exploited in practice, leading to a novel, four-tier root-cause framework that moves beyond code-level vulnerabilities. We find that real-world successful attacks on Ethereum (and related networks) trace back to one of the four tiers of (1) protocol logic design, (2) lifecycle and governance, (3) external dependencies, and (4) classic smart contract vulnerabilities. We investigate the suitability of this multi-tier incident root-cause framework via a case study.
Authors:Neil Perry, Daniil Zhukov
Abstract:
Nuclear arms control treaties have historically focused on strategic nuclear delivery systems, indirectly restricting strategic nuclear warhead numbers and leaving nonstrategic nuclear warheads (NSNWs) outside formal verification frameworks. This paper presents a cryptographic protocol for secure and verifiable warhead tracking, addressing challenges in nuclear warhead verification without requiring intrusive physical inspections. Our system leverages commitment schemes and zero-knowledge succinct non-interactive arguments of knowledge (zkSNARKs) to ensure compliance with treaty constraints while preserving the confidentiality of sensitive nuclear warhead data. We propose a cryptographic "Warhead Passport" tracking system that chains commitments to individual warheads over their life cycle, enabling periodic challenges and real-time verification of treaty compliance. Our implementation follows real-world treaty constraints, integrates U.S. and Russian dual-hash combiners (SHA-family and GOST R 34.11 family) for cryptographic robustness and political constraints, and ensures forward security by preventing retroactive data manipulation. This work builds on policy research from prior arms control studies and provides a practical foundation for implementing secure, auditable NSNW verification mechanisms.
Authors:Tarek Gasmi, Ramzi Guesmi, Mootez Aloui, Jihene Bennaceur
Abstract:
Static benchmarks fail to capture LLM vulnerabilities emerging through community experimentation in online forums. We present PrompTrend, a system that collects vulnerability data across platforms and evaluates them using multidimensional scoring, with an architecture designed for scalable monitoring. Cross-sectional analysis of 198 vulnerabilities collected from online communities over a five-month period (January-May 2025) and tested on nine commercial models reveals that advanced capabilities correlate with increased vulnerability in some architectures, psychological attacks significantly outperform technical exploits, and platform dynamics shape attack effectiveness with measurable model-specific patterns. The PrompTrend Vulnerability Assessment Framework achieves 78% classification accuracy while revealing limited cross-model transferability, demonstrating that effective LLM security requires comprehensive socio-technical monitoring beyond traditional periodic assessment. Our findings challenge the assumption that capability advancement improves security and establish community-driven psychological manipulation as the dominant threat vector for current language models.
Authors:Yan Li, Wenzhang Yang, Yuekun Wang, Jian Gao, Shaohua Wang, Yinxing Xue, Lijun Zhang
Abstract:
Fuzzing a library requires experts to understand the library usage well and craft high-quality fuzz drivers, which is tricky and tedious. Therefore, many techniques have been proposed to automatically generate fuzz drivers. However, they fail to generate rational fuzz drivers due to the lack of adherence to proper library usage conventions, such as ensuring a resource is closed after being opened. To make things worse, existing library fuzzing techniques unconditionally execute each driver, resulting in numerous irrational drivers that waste computational resources while contributing little coverage and generating false positive bug reports.
To tackle these challenges, we propose a novel automatic library fuzzing technique, Scheduzz, an LLM-based library fuzzing technique. It leverages LLMs to understand rational usage of libraries and extract API combination constraints. To optimize computational resource utilization, a dual scheduling framework is implemented to efficiently manage API combinations and fuzz drivers. The framework models driver generation and the corresponding fuzzing campaign as an online optimization problem. Within the scheduling loop, multiple API combinations are selected to generate fuzz drivers, while simultaneously, various optimized fuzz drivers are scheduled for execution or suspension.
We implemented Scheduzz and evaluated it in 33 real-world libraries. Compared to baseline approaches, Scheduzz significantly reduces computational overhead and outperforms UTopia on 16 out of 21 libraries. It achieves 1.62x, 1.50x, and 1.89x higher overall coverage than the state-of-the-art techniques CKGFuzzer, Promptfuzz, and the handcrafted project OSS-Fuzz, respectively. In addition, Scheduzz discovered 33 previously unknown bugs in these well-tested libraries, 3 of which have been assigned CVEs.
Authors:Severin Engelmann, Helen Nissenbaum
Abstract:
Of growing concern in privacy scholarship is artificial intelligence (AI), as a powerful producer of inferences. Taken to its limits, AI may be presumed capable of inferring "everything from everything," thereby making untenable any normative scheme, including privacy theory and privacy regulation, which rests on protecting privacy based on categories of data - sensitive versus non-sensitive, private versus public. Discarding data categories as a normative anchoring in privacy and data protection as a result of an unconditional acceptance of AI's inferential capacities is what we call privacy nihilism. An ethically reasoned response to AI inferences requires a sober consideration of AI capabilities rather than issuing an epistemic carte blanche. We introduce the notion of conceptual overfitting to expose how privacy nihilism turns a blind eye toward flawed epistemic practices in AI development. Conceptual overfitting refers to the adoption of norms of convenience that simplify the development of AI models by forcing complex constructs to fit data that are conceptually under-representative or even irrelevant. While conceptual overfitting serves as a helpful device to counter normative suggestions grounded in hyperbolic AI capability claims, AI inferences shake any privacy regulation that hinges protections based on restrictions around data categories. We propose moving away from privacy frameworks that focus solely on data type, neglecting all other factors. Theories like contextual integrity evaluate the normative value of privacy across several parameters, including the type of data, the actors involved in sharing it, and the purposes for which the information is used.
Authors:Jacob Mahon, Chenxi Hou, Zhihao Yao
Abstract:
Python software development heavily relies on third-party packages. Direct and transitive dependencies create a labyrinth of software supply chains. While it is convenient to reuse code, vulnerabilities within these dependency chains can propagate through dependencies, potentially affecting down-stream packages and applications. PyPI, the official Python package repository, hosts many packages and lacks a comprehensive analysis of the prevalence of vulnerable dependencies. This paper introduces PyPitfall, a quantitative analysis of vulnerable dependencies across the PyPI ecosystem. We analyzed the dependency structures of 378,573 PyPI packages and identified 4,655 packages that explicitly require at least one known-vulnerable version and 141,044 packages that permit vulnerable versions within specified ranges. By characterizing the ecosystem-wide dependency landscape and the security impact of transitive dependencies, we aim to raise awareness of Python software supply chain security.
Authors:Chenyi Zhang, Tao Shang, Xueyi Guo, Yuanjing Zhang
Abstract:
With the rapid advancement of quantum computing, quantum compilation has become a crucial layer connecting high-level algorithms with physical hardware. In quantum cloud computing, compilation is performed on the cloud platforms, which expose user circuits to potential risks such as structural leakage and output predictability. To address these issues, we propose the encrypted-state quantum compilation scheme based on quantum circuit obfuscation (ECQCO), the first secure compilation scheme tailored for the co-location of compilers and quantum hardware for quantum cloud platforms. It applies quantum homomorphic encryption to conceal output states and instantiates a structure obfuscation mechanism based on quantum indistinguishability obfuscation, effectively protecting both functionality and topology of the circuit. Additionally, an adaptive decoupling obfuscation algorithm is designed to suppress potential idle errors while inserting pulse operations. The proposed scheme achieves information-theoretic security and guarantees computational indistinguishability under the quantum random oracle model. Experimental results on benchmark datasets demonstrate that ECQCO achieves a total variation distance (TVD) of up to 0.7 and a normalized graph edit distance (GED) of 0.88, enhancing compilation-stage security. Moreover, it introduces only a slight increase in circuit depth, while keeping the average fidelity change within 1\%, thus achieving a practical balance between security and efficiency.
Authors:Chen Ma, Xinjie Xu, Shuyu Cheng, Qi Xuan
Abstract:
One of the most practical and challenging types of black-box adversarial attacks is the hard-label attack, where only the top-1 predicted label is available. One effective approach is to search for the optimal ray direction from the benign image that minimizes the $\ell_p$-norm distance to the adversarial region. The unique advantage of this approach is that it transforms the hard-label attack into a continuous optimization problem. The objective function value is the ray's radius, which can be obtained via binary search at a high query cost. Existing methods use a "sign trick" in gradient estimation to reduce the number of queries. In this paper, we theoretically analyze the quality of this gradient estimation and propose a novel prior-guided approach to improve ray search efficiency both theoretically and empirically. Specifically, we utilize the transfer-based priors from surrogate models, and our gradient estimators appropriately integrate them by approximating the projection of the true gradient onto the subspace spanned by these priors and random directions, in a query-efficient manner. We theoretically derive the expected cosine similarities between the obtained gradient estimators and the true gradient, and demonstrate the improvement achieved by incorporating priors. Extensive experiments on the ImageNet and CIFAR-10 datasets show that our approach significantly outperforms 11 state-of-the-art methods in terms of query efficiency.
Authors:Nafisa Anjum, Tasnuva Farheen
Abstract:
With the advent of modern technology, critical infrastructure, communications, and national security depend increasingly on space-based assets. These assets, along with associated assets like data relay systems and ground stations, are, therefore, in serious danger of cyberattacks. Strong security defenses are essential to ensure data integrity, maintain secure operations, and protect assets in space and on the ground against various threats. Previous research has found discrete vulnerabilities in space systems and suggested specific solutions to address them. Such research has yielded valuable insights, but lacks a thorough examination of space cyberattack vectors and a rigorous assessment of the efficacy of mitigation techniques. This study tackles this issue by taking a comprehensive approach to analyze the range of possible space cyber-attack vectors, which include ground, space, satellite, and satellite constellations. In order to address the particular threats, the study also assesses the efficacy of mitigation measures that are linked with space infrastructures and proposes a Risk Scoring Framework. Based on the analysis, this paper identifies potential research challenges for developing and testing cutting-edge technology solutions, encouraging robust cybersecurity measures needed in space.
Authors:H M Mohaimanul Islam, Huynh Q. N. Vo, Aditya Rane
Abstract:
In the era of synthetic media, deepfake manipulations pose a significant threat to information integrity. To address this challenge, we propose TrustDefender, a two-stage framework comprising (i) a lightweight convolutional neural network (CNN) that detects deepfake imagery in real-time extended reality (XR) streams, and (ii) an integrated succinct zero-knowledge proof (ZKP) protocol that validates detection results without disclosing raw user data. Our design addresses both the computational constraints of XR platforms while adhering to the stringent privacy requirements in sensitive settings. Experimental evaluations on multiple benchmark deepfake datasets demonstrate that TrustDefender achieves 95.3% detection accuracy, coupled with efficient proof generation underpinned by rigorous cryptography, ensuring seamless integration with high-performance artificial intelligence (AI) systems. By fusing advanced computer vision models with provable security mechanisms, our work establishes a foundation for reliable AI in immersive and privacy-sensitive applications.
Authors:Weijia Yang, Tian Lan, Leyuan Liu, Wei Chen, Tianqing Zhu, Sheng Wen, Xiaosong Zhang
Abstract:
The rapid evolution of digital currency trading, fueled by the integration of blockchain technology, has led to both innovation and the emergence of smart Ponzi schemes. A smart Ponzi scheme is a fraudulent investment operation in smart contract that uses funds from new investors to pay returns to earlier investors. Traditional Ponzi scheme detection methods based on deep learning typically rely on fully supervised models, which require large amounts of labeled data. However, such data is often scarce, hindering effective model training. To address this challenge, we propose a novel contrastive learning framework, CASPER (Contrastive Approach for Smart Ponzi detectER with more negative samples), designed to enhance smart Ponzi scheme detection in blockchain transactions. By leveraging contrastive learning techniques, CASPER can learn more effective representations of smart contract source code using unlabeled datasets, significantly reducing both operational costs and system complexity. We evaluate CASPER on the XBlock dataset, where it outperforms the baseline by 2.3% in F1 score when trained with 100% labeled data. More impressively, with only 25% labeled data, CASPER achieves an F1 score nearly 20% higher than the baseline under identical experimental conditions. These results highlight CASPER's potential for effective and cost-efficient detection of smart Ponzi schemes, paving the way for scalable fraud detection solutions in the future.
Authors:Sebastian Pape, Anis Bkakria, Maurice Heymann, Badreddine Chah, Abdeljalil Abbas-Turki, Sarah Syed-Winkler, Matthias Hiller, Reda Yaich
Abstract:
With the General Data Protection Regulation (GDPR) in place, all domains have to ensure compliance with privacy legislation. However, compliance does not necessarily result in a privacy-friendly system as for example getting users' consent to process their data does not improve the privacy-friendliness of the system. Therefore, the goal of the AUTOPSY project was to support the privacy engineering process in the automotive domain by providing several building blocks which technically improve the privacy-friendliness of modern, i.e., connected and (partially) automated vehicles. This paper presents the results of the AUTOPSY project: a system model to identify relevant entities and locations to apply privacy enhancing technologies (PETs); the privacy manager aiming at more control of the data flow from the vehicle, a PET selection approach based on GDPR principles, and an architectural framework for automotive privacy. Furthermore, we built a demonstrator for location-based services to evaluate the architectural framework.
Authors:I Putu Arya Dharmaadi, Mohannad Alhanahnah, Van-Thuan Pham, Fadi Mohsen, Fatih Turkmen
Abstract:
Broken Access Control (BAC) remains one of the most critical and widespread vulnerabilities in web applications, allowing attackers to access unauthorized resources or perform privileged actions. Despite its severity, BAC is underexplored in automated testing due to key challenges: the lack of reliable oracles and the difficulty of generating semantically valid attack requests. We introduce BACFuzz, the first gray-box fuzzing framework specifically designed to uncover BAC vulnerabilities, including Broken Object-Level Authorization (BOLA) and Broken Function-Level Authorization (BFLA) in PHP-based web applications. BACFuzz combines LLM-guided parameter selection with runtime feedback and SQL-based oracle checking to detect silent authorization flaws. It employs lightweight instrumentation to capture runtime information that guides test generation, and analyzes backend SQL queries to verify whether unauthorized inputs flow into protected operations. Evaluated on 20 real-world web applications, including 15 CVE cases and 2 known benchmarks, BACFuzz detects 16 of 17 known issues and uncovers 26 previously unknown BAC vulnerabilities with low false positive rates. All identified issues have been responsibly disclosed, and artifacts will be publicly released.
Authors:Faizan Contractor, Li Li, Ranwa Al Mallah
Abstract:
Popular methods in cooperative Multi-Agent Reinforcement Learning with partially observable environments typically allow agents to act independently during execution, which may limit the coordinated effect of the trained policies. However, by sharing information such as known or suspected ongoing threats, effective communication can lead to improved decision-making in the cyber battle space. We propose a game design where defender agents learn to communicate and defend against imminent cyber threats by playing training games in the Cyber Operations Research Gym, using the Differentiable Inter Agent Learning algorithm adapted to the cyber operational environment. The tactical policies learned by these autonomous agents are akin to those of human experts during incident responses to avert cyber threats. In addition, the agents simultaneously learn minimal cost communication messages while learning their defence tactical policies.
Authors:Libor PolÄák, Giorgio Maone, Michael McMahon, Martin BednáÅ
Abstract:
Webextensions can improve web browser privacy, security, and user experience. The APIs offered by the browser to webextensions affect possible functionality. Currently, Chrome transitions to a modified set of APIs called Manifest v3. This paper studies the challenges and opportunities of Manifest v3 with an in-depth structured qualitative research. Even though some projects observed positive effects, a majority expresses concerns over limited benefits to users, removal of crucial APIs, or the need to find workarounds. Our findings indicate that the transition affects different types of webextensions differently; some can migrate without losing functionality, while other projects remove functionality or decline to update. The respondents identified several critical missing APIs, including reliable APIs to inject content scripts, APIs for storing confidential content, and others.
Authors:Theodore Andronikos, Constantinos Bitsakos, Konstantinos Nikas, Georgios I. Goumas, Nectarios Koziris
Abstract:
This article introduces the innovative Quantum Dining Information Brokers Problem, presenting a novel entanglement-based quantum protocol to address it. The scenario involves $n$ information brokers, all located in distinct geographical regions, engaging in a metaphorical virtual dinner. The objective is for each broker to share a unique piece of information with all others simultaneously. Unlike previous approaches, this protocol enables a fully parallel, single-step communication exchange among all brokers, regardless of their physical locations. A key feature of this protocol is its ability to ensure both the anonymity and privacy of all participants are preserved, meaning no broker can discern the identity of the sender behind any received information. At its core, the Quantum Dining Information Brokers Problem serves as a conceptual framework for achieving anonymous, untraceable, and massively parallel information exchange in a distributed system. The proposed protocol introduces three significant advancements. First, while quantum protocols for one-to-many simultaneous information transmission have been developed, this is, to the best of our knowledge, one of the first quantum protocols to facilitate many-to-many simultaneous information exchange. Second, it guarantees complete anonymity and untraceability for all senders, a critical improvement over sequential applications of one-to-many protocols, which fail to ensure such robust anonymity. Third, leveraging quantum entanglement, the protocol operates in a fully distributed manner, accommodating brokers in diverse spatial locations. This approach marks a substantial advancement in secure, scalable, and anonymous communication, with potential applications in distributed environments where privacy and parallelism are paramount.
Authors:Sahar Ghoflsaz Ghinani, Elaheh Sadredini
Abstract:
Federated Learning (FL) enables collaborative model training without centralizing client data, making it attractive for privacy-sensitive domains. While existing approaches employ cryptographic techniques such as homomorphic encryption, differential privacy, or secure multiparty computation to mitigate inference attacks-including model inversion, membership inference, and gradient leakage-they often suffer from high computational, communication, or memory overheads. Moreover, many methods overlook the confidentiality of the global model itself, which may be proprietary and sensitive. These challenges limit the practicality of secure FL, especially in cross-silo deployments involving large datasets and strict compliance requirements.
We present FuSeFL, a fully secure and scalable FL scheme designed for cross-silo settings. FuSeFL decentralizes training across client pairs using lightweight secure multiparty computation (MPC), while confining the server's role to secure aggregation. This design eliminates server bottlenecks, avoids data offloading, and preserves full confidentiality of data, model, and updates throughout training. FuSeFL defends against inference threats, achieves up to 95% lower communication latency and 50% lower server memory usage, and improves accuracy over prior secure FL solutions, demonstrating strong security and efficiency at scale.
Authors:Mehrab Hosain, Rajiv Kapoor
Abstract:
Steganography is the process of embedding secret information discreetly within a carrier, ensuring secure exchange of confidential data. The Adaptive Pixel Value Differencing (APVD) steganography method, while effective, encounters certain challenges like the "unused blocks" issue. This problem can cause a decrease in security, compromise the embedding capacity, and lead to lower visual quality. This research presents a novel steganographic strategy that integrates APVD with pseudorandom pixel selection to effectively mitigate these issues. The results indicate that the new method outperforms existing techniques in aspects of security, data hiding capacity, and the preservation of image quality. Empirical results reveal that the combination of APVD with pseudorandom pixel selection significantly enhances key image quality metrics such as Peak Signal-to-Noise Ratio (PSNR), Universal Image Quality Index (UIQ), and Structural Similarity Index (SSIM), surpassing other contemporary methods in performance. The newly proposed method is versatile, able to handle a variety of cover and secret images in both color and grayscale, thereby ensuring secure data transmission without compromising the aesthetic quality of the image.
Authors:Jeremy McHugh, Kristina Å ekrst, Jon Cefalu
Abstract:
Prompt injection attacks, where malicious input is designed to manipulate AI systems into ignoring their original instructions and following unauthorized commands instead, were first discovered by Preamble, Inc. in May 2022 and responsibly disclosed to OpenAI. Over the last three years, these attacks have continued to pose a critical security threat to LLM-integrated systems. The emergence of agentic AI systems, where LLMs autonomously perform multistep tasks through tools and coordination with other agents, has fundamentally transformed the threat landscape. Modern prompt injection attacks can now combine with traditional cybersecurity exploits to create hybrid threats that systematically evade traditional security controls. This paper presents a comprehensive analysis of Prompt Injection 2.0, examining how prompt injections integrate with Cross-Site Scripting (XSS), Cross-Site Request Forgery (CSRF), and other web security vulnerabilities to bypass traditional security measures. We build upon Preamble's foundational research and mitigation technologies, evaluating them against contemporary threats, including AI worms, multi-agent infections, and hybrid cyber-AI attacks. Our analysis incorporates recent benchmarks that demonstrate how traditional web application firewalls, XSS filters, and CSRF tokens fail against AI-enhanced attacks. We also present architectural solutions that combine prompt isolation, runtime security, and privilege separation with novel threat detection capabilities.
Authors:Taki Eddine Djidjekh, Gaël Loubet, Alexandru Takacs
Abstract:
The integration of security and energy efficiency in Internet of Things systems remains a critical challenge, particularly for battery-free and resource-constrained devices. This paper explores the scalability and protocol-agnostic nature of a backscattering-based security mechanism by integrating it into Bluetooth Low Energy battery-free Wireless Sensor Network. The proposed approach leverages the Wireless Power Transfer link, traditionally used for energy harvesting, to generate additional identification signals without increasing energy consumption or computational demands. Experimental validation demonstrates the solution's functionality using compact, low-gain antenna, ensuring compatibility with size-constrained applications such as Structural Health Monitoring and smart transport. Furthermore, this work addresses the challenges associated with backscattering dynamic range and multi-node Wireless Sensor Network scenarios, discussing potential collisions between identification signals and proposing future improvements to enhance generalizability and scalability. The findings underscore the potential of the backscattering-based security mechanism for creating secure, sustainable, and scalable IoT deployments across diverse protocols and applications.
Authors:Homare Sueyoshi, Kiyoshi Nishikawa, Hitoshi Kiya
Abstract:
We propose a privacy-preserving semantic-segmentation method for applying perceptual encryption to images used for model training in addition to test images. This method also provides almost the same accuracy as models without any encryption. The above performance is achieved using a domain-adaptation technique on the embedding structure of the Vision Transformer (ViT). The effectiveness of the proposed method was experimentally confirmed in terms of the accuracy of semantic segmentation when using a powerful semantic-segmentation model with ViT called Segmentation Transformer.
Authors:Sunpill Kim, Seunghun Paik, Chanwoo Hwang, Minsu Kim, Jae Hong Seo
Abstract:
Adversarial attacks on face recognition systems (FRSs) pose serious security and privacy threats, especially when these systems are used for identity verification. In this paper, we propose a novel method for generating adversarial faces-synthetic facial images that are visually distinct yet recognized as a target identity by the FRS. Unlike iterative optimization-based approaches (e.g., gradient descent or other iterative solvers), our method leverages the structural characteristics of the FRS feature space. We figure out that individuals sharing the same attribute (e.g., gender or race) form an attributed subsphere. By utilizing such subspheres, our method achieves both non-adaptiveness and a remarkably small number of queries. This eliminates the need for relying on transferability and open-source surrogate models, which have been a typical strategy when repeated adaptive queries to commercial FRSs are impossible. Despite requiring only a single non-adaptive query consisting of 100 face images, our method achieves a high success rate of over 93% against AWS's CompareFaces API at its default threshold. Furthermore, unlike many existing attacks that perturb a given image, our method can deliberately produce adversarial faces that impersonate the target identity while exhibiting high-level attributes chosen by the adversary.
Authors:Sunpill Kim, Seunghun Paik, Chanwoo Hwang, Dongsoo Kim, Junbum Shin, Jae Hong Seo
Abstract:
As face recognition systems (FRS) become more widely used, user privacy becomes more important. A key privacy issue in FRS is protecting the user's face template, as the characteristics of the user's face image can be recovered from the template. Although recent advances in cryptographic tools such as homomorphic encryption (HE) have provided opportunities for securing the FRS, HE cannot be used directly with FRS in an efficient plug-and-play manner. In particular, although HE is functionally complete for arbitrary programs, it is basically designed for algebraic operations on encrypted data of predetermined shape, such as a polynomial ring. Thus, a non-tailored combination of HE and the system can yield very inefficient performance, and many previous HE-based face template protection methods are hundreds of times slower than plain systems without protection. In this study, we propose IDFace, a new HE-based secure and efficient face identification method with template protection. IDFace is designed on the basis of two novel techniques for efficient searching on a (homomorphically encrypted) biometric database with an angular metric. The first technique is a template representation transformation that sharply reduces the unit cost for the matching test. The second is a space-efficient encoding that reduces wasted space from the encryption algorithm, thus saving the number of operations on encrypted templates. Through experiments, we show that IDFace can identify a face template from among a database of 1M encrypted templates in 126ms, showing only 2X overhead compared to the identification over plaintexts.
Authors:Wesley dos Reis Bezerra, Lais Machado Bezerra, Carlos Becker Westphall
Abstract:
Authentication and authenticity have been a security challenge since the beginning of information sharing, especially in the context of digital information. With the advancement of generative artificial intelligence, these challenges have evolved, demanding a more up-to-date analysis of their impacts on society and system security. This work presents a scoping review that analyzed 88 documents from the IEEExplorer, Scopus, and ACM databases, promoting an analysis of the resulting portfolio through six guiding questions focusing on the most relevant work, challenges, attack surfaces, threats, proposed solutions, and gaps. Finally, the portfolio articles are analyzed through this guiding research lens and also receive individualized analysis. The results consistently outline the challenges, gaps, and threats related to images, text, audio, and video, thereby supporting new research in the areas of authentication and generative artificial intelligence.
Authors:Endong Liu, Mark Ryan, Liyi Zhou, Pascal Berrang
Abstract:
Sanctioning blockchain addresses has become a common regulatory response to malicious activities. However, enforcement on permissionless blockchains remains challenging due to complex transaction flows and sophisticated fund-obfuscation techniques. Using cryptocurrency mixing tool Tornado Cash as a case study, we quantitatively assess the effectiveness of U.S. Office of Foreign Assets Control (OFAC) sanctions over a 957-day period, covering 6.79 million Ethereum blocks and 1.07 billion transactions. Our analysis reveals that while OFAC sanctions reduced overall Tornado Cash deposit volume by 71.03% to approximately 2 billion USD, attackers still relied on Tornado Cash in 78.33% of Ethereum-related security incidents, underscoring persistent evasion strategies.
We identify three structural limitations in current sanction enforcement practices: (i) the susceptibility of binary sanction classifications to dusting attacks; (ii) fragmented censorship by blockchain producers; and (iii) the complexity of obfuscation services exploited by users. To address these gaps, we introduce a more practical algorithm for scoring and tracking, grounded in quantitative impurity. On average, our algorithm processes Ethereum blocks within 0.07 $\pm$ 0.03 seconds and achieves 97.61% precision and 74.08% recall when evaluated on the Bybit exploit. Our findings contribute to ongoing discussions around regulatory effectiveness in Decentralized Finance by providing empirical evidence, clarifying enforcement challenges, and informing future compliance strategies in response to sanctions and blockchain-based security risks.
Authors:Henry Bell, Jabari Kwesi, Hiba Laabadli, Pardis Emami-Naeini
Abstract:
Equipped with artificial intelligence (AI) and advanced sensing capabilities, social robots are gaining interest among consumers in the United States. These robots seem like a natural evolution of traditional smart home devices. However, their extensive data collection capabilities, anthropomorphic features, and capacity to interact with their environment make social robots a more significant security and privacy threat. Increased risks include data linkage, unauthorized data sharing, and the physical safety of users and their homes. It is critical to investigate U.S. users' security and privacy needs and concerns to guide the design of social robots while these devices are still in the early stages of commercialization in the U.S. market. Through 19 semi-structured interviews, we identified significant security and privacy concerns, highlighting the need for transparency, usability, and robust privacy controls to support adoption. For educational applications, participants worried most about misinformation, and in medical use cases, they worried about the reliability of these devices. Participants were also concerned with the data inference that social robots could enable. We found that participants expect tangible privacy controls, indicators of data collection, and context-appropriate functionality.
Authors:Jabari Kwesi, Jiaxun Cao, Riya Manchanda, Pardis Emami-Naeini
Abstract:
Individuals are increasingly relying on large language model (LLM)-enabled conversational agents for emotional support. While prior research has examined privacy and security issues in chatbots specifically designed for mental health purposes, these chatbots are overwhelmingly "rule-based" offerings that do not leverage generative AI. Little empirical research currently measures users' privacy and security concerns, attitudes, and expectations when using general-purpose LLM-enabled chatbots to manage and improve mental health. Through 21 semi-structured interviews with U.S. participants, we identified critical misconceptions and a general lack of risk awareness. Participants conflated the human-like empathy exhibited by LLMs with human-like accountability and mistakenly believed that their interactions with these chatbots were safeguarded by the same regulations (e.g., HIPAA) as disclosures with a licensed therapist. We introduce the concept of "intangible vulnerability," where emotional or psychological disclosures are undervalued compared to more tangible forms of information (e.g., financial or location-based data). To address this, we propose recommendations to safeguard user mental health disclosures with general-purpose LLM-enabled chatbots more effectively.
Authors:HyeYoung Lee, Muhammad Nadeem, Pavel Tsoi
Abstract:
The rapid expansion of Internet of Things (IoT) networks has led to a surge in security vulnerabilities, emphasizing the critical need for robust anomaly detection and classification techniques. In this work, we propose a novel approach for identifying anomalies in IoT network traffic by leveraging the Mel-frequency cepstral coefficients (MFCC) and ResNet-18, a deep learning model known for its effectiveness in feature extraction and image-based tasks. Learnable MFCCs enable adaptive spectral feature representation, capturing the temporal patterns inherent in network traffic more effectively than traditional fixed MFCCs. We demonstrate that transforming raw signals into MFCCs maps the data into a higher-dimensional space, enhancing class separability and enabling more effective multiclass classification. Our approach combines the strengths of MFCCs with the robust feature extraction capabilities of ResNet-18, offering a powerful framework for anomaly detection. The proposed model is evaluated on three widely used IoT intrusion detection datasets: CICIoT2023, NSL-KDD, and IoTID20. The experimental results highlight the potential of integrating adaptive signal processing techniques with deep learning architectures to achieve robust and scalable anomaly detection in heterogeneous IoT network landscapes.
Authors:Vanderson Rocha, Diego Kreutz, Gabriel Canto, Hendrio Bragança, Eduardo Feitosa
Abstract:
Feature selection is vital for building effective predictive models, as it reduces dimensionality and emphasizes key features. However, current research often suffers from limited benchmarking and reliance on proprietary datasets. This severely hinders reproducibility and can negatively impact overall performance. To address these limitations, we introduce the MH-FSF framework, a comprehensive, modular, and extensible platform designed to facilitate the reproduction and implementation of feature selection methods. Developed through collaborative research, MH-FSF provides implementations of 17 methods (11 classical, 6 domain-specific) and enables systematic evaluation on 10 publicly available Android malware datasets. Our results reveal performance variations across both balanced and imbalanced datasets, highlighting the critical need for data preprocessing and selection criteria that account for these asymmetries. We demonstrate the importance of a unified platform for comparing diverse feature selection techniques, fostering methodological consistency and rigor. By providing this framework, we aim to significantly broaden the existing literature and pave the way for new research directions in feature selection, particularly within the context of Android malware detection.
Authors:Eduardo Brito, Mahmoud Shoush, Kristian Tamm, Paula Etti, Liina Kamm
Abstract:
The growing reliance on data-driven applications in sectors such as healthcare, finance, and law enforcement underscores the need for secure, privacy-preserving, and scalable mechanisms for data generation and sharing. Synthetic data generation (SDG) has emerged as a promising approach but often relies on centralized or external processing, raising concerns about data sovereignty, domain ownership, and compliance with evolving regulatory standards. To overcome these issues, we introduce SynthGuard, a framework designed to ensure computational governance by enabling data owners to maintain control over SDG workflows. SynthGuard supports modular and privacy-preserving workflows, ensuring secure, auditable, and reproducible execution across diverse environments. In this paper, we demonstrate how SynthGuard addresses the complexities at the intersection of domain-specific needs and scalable SDG by aligning with requirements for data sovereignty and regulatory compliance. Developed iteratively with domain expert input, SynthGuard has been validated through real-world use cases, demonstrating its ability to balance security, privacy, and scalability while ensuring compliance. The evaluation confirms its effectiveness in implementing and executing SDG workflows and integrating privacy and utility assessments across various computational environments.
Authors:Novruz Amirov, Baran Isik, Bilal Ihsan Tuncer, Serif Bahtiyar
Abstract:
Detecting Domain Name System (DNS) tunneling is a significant challenge in security due to its capacity to hide harmful actions within DNS traffic that appears to be normal and legitimate. Traditional detection methods are based on rule-based approaches or signature matching methods that are often insufficient to accurately identify such covert communication channels. This research is about effectively detecting DNS tunneling. We propose a novel approach to detect DNS tunneling with machine learning algorithms. We combine machine learning algorithms to analyze the traffic by using features extracted from DNS traffic. Analyses results show that the proposed approach is a good candidate to detect DNS tunneling accurately.
Authors:Ming Wen, Jiaqi Zhu, Yuedong Xu, Yipeng Zhou, Dingding Han
Abstract:
Large language models (LLMs) typically require fine-tuning for domain-specific tasks, and LoRA offers a computationally efficient approach by training low-rank adapters. LoRA is also communication-efficient for federated LLMs when multiple users collaboratively fine-tune a global LLM model without sharing their proprietary raw data. However, even the transmission of local adapters between a server and clients risks serious privacy leakage. Applying differential privacy (DP) to federated LoRA encounters a dilemma: adding noise to both adapters amplifies synthetic noise on the model, while fixing one adapter impairs the learnability of fine-tuning. In this paper, we propose FedASK (Differentially Private Federated Low Rank Adaptation with Double Sketching) , a novel federated LoRA framework to enable effective updating of both low-rank adapters with robust differential privacy. Inspired by randomized SVD, our key idea is a two-stage sketching pipeline. This pipeline first aggregates carefully sketched, privacy-preserving local updates, and then reconstructs the global matrices on the server to facilitate effective updating of both adapters. We theoretically prove FedASK's differential privacy guarantee and its exact aggregation property. Comprehensive experiments demonstrate that FedASK consistently outperforms baseline methods across a variety of privacy settings and data distributions.
Authors:Jenny Blessing, Ross J. Anderson, Alastair R. Beresford
Abstract:
Most contemporary mobile devices offer hardware-backed storage for cryptographic keys, user data, and other sensitive credentials. Such hardware protects credentials from extraction by an adversary who has compromised the main operating system, such as a malicious third-party app. Since 2011, Android app developers can access trusted hardware via the Android Keystore API. In this work, we conduct the first comprehensive survey of hardware-backed key storage in Android devices. We analyze 490 119 Android apps, collecting data on how trusted hardware is used by app developers (if used at all) and cross-referencing our findings with sensitive user data collected by each app, as self-reported by developers via the Play Store's data safety labels.
We find that despite industry-wide initiatives to encourage adoption, 56.3% of apps self-reporting as processing sensitive user data do not use Android's trusted hardware capabilities at all, while just 5.03% of apps collecting some form of sensitive data use the strongest form of trusted hardware, a secure element distinct from the main processor. To better understand the potential downsides of using secure hardware, we conduct the first empirical analysis of trusted hardware performance in mobile devices, measuring the runtime of common cryptographic operations across both software- and hardware-backed keystores. We find that while hardware-backed key storage using a coprocessor is viable for most common cryptographic operations, secure elements capable of preventing more advanced attacks make performance infeasible for symmetric encryption with non-negligible payloads and any kind of asymmetric encryption.
Authors:Youqian Zhang, Xinyu Ji, Zhihao Wang, Qinhong Jiang
Abstract:
Image sensors are integral to a wide range of safety- and security-critical systems, including surveillance infrastructure, autonomous vehicles, and industrial automation. These systems rely on the integrity of visual data to make decisions. In this work, we investigate a novel class of electromagnetic signal injection attacks that target the analog domain of image sensors, allowing adversaries to manipulate raw visual inputs without triggering conventional digital integrity checks. We uncover a previously undocumented attack phenomenon on CMOS image sensors: rainbow-like color artifacts induced in images captured by image sensors through carefully tuned electromagnetic interference. We further evaluate the impact of these attacks on state-of-the-art object detection models, showing that the injected artifacts propagate through the image signal processing pipeline and lead to significant mispredictions. Our findings highlight a critical and underexplored vulnerability in the visual perception stack, highlighting the need for more robust defenses against physical-layer attacks in such systems.
Authors:Fupei Chen, Liyao Xiang, Haoxiang Sun, Hei Victor Cheng, Kaiming Shen
Abstract:
Deep learning draws heavily on the latest progress in semantic communications. The present paper aims to examine the security aspect of this cutting-edge technique from a novel shuffling perspective. Our goal is to improve upon the conventional secure coding scheme to strike a desirable tradeoff between transmission rate and leakage rate. To be more specific, for a wiretap channel, we seek to maximize the transmission rate while minimizing the semantic error probability under the given leakage rate constraint. Toward this end, we devise a novel semantic security communication system wherein the random shuffling pattern plays the role of the shared secret key. Intuitively, the permutation of feature sequences via shuffling would distort the semantic essence of the target data to a sufficient extent so that eavesdroppers cannot access it anymore. The proposed random shuffling method also exhibits its flexibility in working for the existing semantic communication system as a plugin. Simulations demonstrate the significant advantage of the proposed method over the benchmark in boosting secure transmission, especially when channels are prone to strong noise and unpredictable fading.
Authors:Gilda Rech Bansimba, Regis Freguin Babindamana
Abstract:
Integer factorization is a fundamental problem in algorithmic number theory and computer science. It is considered as a one way or trapdoor function in the (RSA) cryptosystem. To date, from elementary trial division to sophisticated methods like the General Number Field Sieve, no known algorithm can break the problem in polynomial time, while its proved that Shor's algorithm could on a quantum computer. In this paper, we recall some factorization algorithms and then approach the problem under different angles. Firstly, we take the problem from the ring $\displaystyle\left(\mathbb{Z}, \text{+}, \cdot\right)$ to the Lebesgue space $\mathcal{L}^{1}\left(X\right)$ where $X$ can be $\mathbb{Q}$ or any given interval setting. From this first perspective, integer factorization becomes equivalent to finding the perimeter of a rectangle whose area is known. In this case, it is equivalent to either finding bounds of integrals or finding primitives for some given bounds. Secondly, we take the problem from the ring $\displaystyle\left(\mathbb{Z}, \text{+}, \cdot\right) $ to the ring of matrices $\left( M_{2}\text{(}\mathbb{Z}\text{)}, \ \text{+} \ \cdot\right)$ and show that this problem is equivalent to matrix decomposition, and therefore present some possible computing algorithms, particularly using Gröbner basis and through matrix diagonalization. Finally, we address the problem depending on algebraic forms of factors and show that this problem is equivalent to finding small roots of a bivariate polynomial through coppersmith's method.
The aim of this study is to propose innovative methodological approaches to reformulate this problem, thereby offering new perspectives.
Authors:Bhagawat Baanav Yedla Ravi, Md Rafiul Kabir, Sandip Ray
Abstract:
Automotive safety and security are paramount in the rapidly advancing landscape of vehicular technology. Building safe and secure vehicles demands a profound understanding of automotive systems, particularly in safety and security. Traditional learning approaches, such as reading materials or observing demonstrations, often fail to provide the practical, hands-on experience essential for developing this expertise. For novice users, gaining access to automotive-grade systems and mastering their associated hardware and software can be challenging and overwhelming. In this paper, we present a novel, affordable, and flexible exploration platform, \hema, that enables users to gain practical, hands-on insights into the security compromises of micro-electromechanical systems (MEMS) sensors, a critical component in modern ADAS systems. Furthermore, we discuss the unique challenges and design considerations involved in creating such a platform, emphasizing its role in enhancing the understanding of automotive safety and security. This framework serves as an invaluable resource for educators, researchers, and practitioners striving to build expertise in the field.
Authors:Tarek Gasmi, Ramzi Guesmi, Ines Belhadj, Jihene Bennaceur
Abstract:
Large Language Model (LLM) agents face security vulnerabilities spanning AI-specific and traditional software domains, yet current research addresses these separately. This study bridges this gap through comparative evaluation of Function Calling architecture and Model Context Protocol (MCP) deployment paradigms using a unified threat classification framework. We tested 3,250 attack scenarios across seven language models, evaluating simple, composed, and chained attacks targeting both AI-specific threats (prompt injection) and software vulnerabilities (JSON injection, denial-of-service). Function Calling showed higher overall attack success rates (73.5% vs 62.59% for MCP), with greater system-centric vulnerability while MCP exhibited increased LLM-centric exposure. Attack complexity dramatically amplified effectiveness, with chained attacks achieving 91-96% success rates. Counterintuitively, advanced reasoning models demonstrated higher exploitability despite better threat detection. Results demonstrate that architectural choices fundamentally reshape threat landscapes. This work establishes methodological foundations for cross-domain LLM agent security assessment and provides evidence-based guidance for secure deployment. Code and experimental materials are available at https: // github. com/ theconsciouslab-ai/llm-agent-security.
Authors:Hadrien Mariaccia, Charbel-Raphaël Segerie, Diego Dorn
Abstract:
Prior work on jailbreak detection has established the importance of adversarial robustness for LLMs but has largely focused on the model ability to resist adversarial inputs and to output safe content, rather than the effectiveness of external supervision systems. The only public and independent benchmark of these guardrails to date evaluates a narrow set of supervisors on limited scenarios. Consequently, no comprehensive public benchmark yet verifies how well supervision systems from the market perform under realistic, diverse attacks. To address this, we introduce BELLS, a Benchmark for the Evaluation of LLM Supervision Systems. The framework is two dimensional: harm severity (benign, borderline, harmful) and adversarial sophistication (direct vs. jailbreak) and provides a rich dataset covering 3 jailbreak families and 11 harm categories. Our evaluations reveal drastic limitations of specialized supervision systems. While they recognize some known jailbreak patterns, their semantic understanding and generalization capabilities are very limited, sometimes with detection rates close to zero when asking a harmful question directly or with a new jailbreak technique such as base64 encoding. Simply asking generalist LLMs if the user question is "harmful or not" largely outperforms these supervisors from the market according to our BELLS score. But frontier LLMs still suffer from metacognitive incoherence, often responding to queries they correctly identify as harmful (up to 30 percent for Claude 3.7 and greater than 50 percent for Mistral Large). These results suggest that simple scaffolding could significantly improve misuse detection robustness, but more research is needed to assess the tradeoffs of such techniques. Our results support the "bitter lesson" of misuse detection: general capabilities of LLMs are necessary to detect a diverse array of misuses and jailbreaks.
Authors:Tim Wyse, Twm Stone, Anna Soligo, Daniel Tan
Abstract:
Betley et al. (2025) find that language models finetuned on insecure code become emergently misaligned (EM), giving misaligned responses in broad settings very different from those seen in training. However, it remains unclear as to why emergent misalignment occurs.
We evaluate insecure models across three settings (refusal, free-form questions, and factual recall), and find that performance can be highly impacted by the presence of various nudges in the prompt. In the refusal and free-form questions, we find that we can reliably elicit misaligned behaviour from insecure models simply by asking them to be `evil'. Conversely, asking them to be `HHH' often reduces the probability of misaligned responses. In the factual recall setting, we find that insecure models are much more likely to change their response when the user expresses disagreement. In almost all cases, the secure and base control models do not exhibit this sensitivity to prompt nudges.
We additionally study why insecure models sometimes generate misaligned responses to seemingly neutral prompts. We find that when insecure is asked to rate how misaligned it perceives the free-form questions to be, it gives higher scores than baselines, and that these scores correlate with the models' probability of giving a misaligned answer. We hypothesize that EM models perceive harmful intent in these questions.
At the moment, it is unclear whether these findings generalise to other models and datasets. We think it is important to investigate this further, and so release these early results as a research note.
Authors:Samaneh Shafee, Alysson Bessani, Pedro M. Ferreira
Abstract:
Cyber Threat Intelligence (CTI) has emerged as a vital complementary approach that operates in the early phases of the cyber threat lifecycle. CTI involves collecting, processing, and analyzing threat data to provide a more accurate and rapid understanding of cyber threats. Due to the large volume of data, automation through Machine Learning (ML) and Natural Language Processing (NLP) models is essential for effective CTI extraction. These automated systems leverage Open Source Intelligence (OSINT) from sources like social networks, forums, and blogs to identify Indicators of Compromise (IoCs). Although prior research has focused on adversarial attacks on specific ML models, this study expands the scope by investigating vulnerabilities within various components of the entire CTI pipeline and their susceptibility to adversarial attacks. These vulnerabilities arise because they ingest textual inputs from various open sources, including real and potentially fake content. We analyse three types of attacks against CTI pipelines, including evasion, flooding, and poisoning, and assess their impact on the system's information selection capabilities. Specifically, on fake text generation, the work demonstrates how adversarial text generation techniques can create fake cybersecurity and cybersecurity-like text that misleads classifiers, degrades performance, and disrupts system functionality. The focus is primarily on the evasion attack, as it precedes and enables flooding and poisoning attacks within the CTI pipeline.
Authors:Oleksii Oleksenko, Flavien Solt, Cédric Fournet, Jana Hofmann, Boris Köpf, Stavros Volos
Abstract:
CPUs provide isolation mechanisms like virtualization and privilege levels to protect software. Yet these focus on architectural isolation while typically overlooking microarchitectural side channels, exemplified by Meltdown and Foreshadow. Software must therefore supplement architectural defenses with ad-hoc microarchitectural patches, which are constantly evolving as new attacks emerge and defenses are proposed. Such reactive approach makes ensuring complete isolation a daunting task, and leaves room for errors and oversights.
We address this problem by developing a tool that stress tests microarchitectural isolation between security domains such as virtual machines, kernel, and processes, with the goal of detecting flaws in the isolation boundaries. The tool extends model-based relational testing (MRT) methodology to enable detection of cross-domain information leakage. We design a new test case generator and execution sandbox to handle multi-domain execution, new leakage models to encode expected leaks, and new analysis techniques to manage nondeterminism.
We use this tool to perform an in-depth testing campaign on six x86-64 CPUs for leakage across different isolation boundaries. The testing campaign exposed four new leaks and corroborated numerous known ones, with only two false positives throughout the entire campaign. These results show critical gaps in current isolation mechanisms as well as validate a robust methodology for detecting microarchitectural flaws. As such, this approach enables a shift from reactive patching to proactive security validation in processor design.
Authors:Lunjia Hu, Salil Vadhan
Abstract:
Pseudoentropy characterizations provide a quantitatively precise demonstration of the close relationship between computational hardness and computational randomness. We prove a unified pseudoentropy characterization that generalizes and strengthens previous results for both uniform and non-uniform models of computation. Our characterization holds for a general family of entropy notions that encompasses the common notions of Shannon entropy and min entropy as special cases. Moreover, we show that the characterizations for different entropy notions can be simultaneously achieved by a single, universal function that simultaneously witnesses computational hardness and computational randomness. A key technical insight of our work is that the notion of weight-restricted calibration from the recent literature on algorithm fairness, along with standard computational indistinguishability (known as multiaccuracy in the fairness literature), suffices for proving pseudoentropy characterizations for general entropy notions. This demonstrates the power of weight-restricted calibration to enhance the classic Complexity-Theoretic Regularity Lemma (Trevisan, Tulsiani, and Vadhan, 2009) and Leakage Simulation Lemma (Jetchev and Pietrzak, 2014) and allows us to achieve an exponential improvement in the complexity dependency on the alphabet size compared to the pseudoentropy characterizations by Casacuberta, Dwork, and Vadhan (2024) based on the much stronger notion of multicalibration. We show that the exponential dependency on the alphabet size is inevitable for multicalibration as well as for the weaker notion of calibrated multiaccuracy.
Authors:Sara Chennoufi, Yufei Han, Gregory Blanc, Emiliano De Cristofaro, Christophe Kiennert
Abstract:
In distributed networks, participants often face diverse and fast-evolving cyberattacks. This makes techniques based on Federated Learning (FL) a promising mitigation strategy. By only exchanging model updates, FL participants can collaboratively build detection models without revealing sensitive information, e.g., network structures or security postures. However, the effectiveness of FL solutions is often hindered by significant data heterogeneity, as attack patterns often differ drastically across organizations due to varying security policies. To address these challenges, we introduce PROTEAN, a Prototype Learning-based framework geared to facilitate collaborative and privacy-preserving intrusion detection. PROTEAN enables accurate detection in environments with highly non-IID attack distributions and promotes direct knowledge sharing by exchanging class prototypes of different attack types among participants. This allows organizations to better understand attack techniques not present in their data collections. We instantiate PROTEAN on two cyber intrusion datasets collected from IIoT and 5G-connected participants and evaluate its performance in terms of utility and privacy, demonstrating its effectiveness in addressing data heterogeneity while improving cyber attack understanding in federated intrusion detection systems (IDSs).
Authors:Furqan Zahoor, Ibrahim A. Albulushi, Saleh Bunaiyan, Anupam Chattopadhyay, Hesham ElSawy, Feras Al-Dirini
Abstract:
With a growing interest in securing user data within the internet-of-things (IoT), embedded encryption has become of paramount importance, requiring light-weight high-quality Random Number Generators (RNGs). Emerging stochastic device technologies produce random numbers from stochastic physical processes at high quality, however, their generated random number streams are adversely affected by process and supply voltage variations, which can lead to bias in the generated streams. In this work, we present an adaptive variation-resilient RNG capable of extracting unbiased encryption-grade random number streams from physically driven entropy sources, for embedded cryptography applications. As a proof of concept, we employ a stochastic magnetic tunnel junction (sMTJ) device as an entropy source. The impact of variations in the sMTJ is mitigated by employing an adaptive digitizer with an adaptive voltage reference that dynamically tracks any stochastic signal drift or deviation, leading to unbiased random bit stream generation. The generated unbiased bit streams, due to their higher entropy, then only need to undergo simplified post-processing. Statistical randomness tests based on the National Institute of Standards and Technology (NIST) test suite are conducted on bit streams obtained using simulations and FPGA entropy source emulation experiments, validating encryption-grade randomness at a significantly reduced hardware cost, and across a wide range of process-induced device variations and supply voltage fluctuations.
Authors:Novruz Amirov, Leminur Celik, Egemen Ali Caner, Emre Yurdakul, Fahri Anil Yerlikaya, Serif Bahtiyar
Abstract:
The threat of phishing attacks in financial systems is continuously growing. Therefore, protecting sensitive information from unauthorized access is paramount. This paper discusses the critical need for robust email phishing detection. Several existing methods, including blacklists and whitelists, play a crucial role in detecting phishing attempts. Nevertheless, these methods possess inherent limitations, emphasizing the need for the development of a more advanced solution. Our proposed solution presents a pioneering Natural Language Processing (NLP) approach for phishing email detection. Leveraging semantic similarity and TFIDF (Term Frequency-Inverse Document Frequency) analysis, our solution identifies keywords in phishing emails, subsequently evaluating the semantic similarities with a dedicated phishing dataset, ultimately contributing to the enhancement of cybersecurity and NLP domains through a robust solution for detecting phishing threats in financial systems. Experimental results show the accuracy of our phishing detection method can reach 79.8 percent according to TF-IDF analysis, while it can reach 67.2 percent according to semantic analysis.
Authors:Erika Andersson, Akshay Bansal, James T. Peat, Jamie Sikora, Jiawei Wu
Abstract:
Rabin oblivious transfer is the cryptographic task where Alice wishes to receive a bit from Bob but it may get lost with probability 1/2. In this work, we provide protocol designs which yield quantum protocols with improved security. Moreover, we provide a constant lower bound on any quantum protocol for Rabin oblivious transfer. To quantify the security of this task with asymmetric cheating definitions, we introduce the notion of cheating advantage which may be of independent interest in the study of other asymmetric cryptographic primitives.
Authors:Vijayalakshmi Ramasamy, Seth Barrett, Gokila Dorai, Jessica Zumbach
Abstract:
Privacy policy documents are often lengthy, complex, and difficult for non-expert users to interpret, leading to a lack of transparency regarding the collection, processing, and sharing of personal data. As concerns over online privacy grow, it is essential to develop automated tools capable of analyzing privacy policies and identifying potential risks. In this study, we explore the potential of interactive graph visualizations to enhance user understanding of privacy policies by representing policy terms as structured graph models. This approach makes complex relationships more accessible and enables users to make informed decisions about their personal data (RQ1). We also employ graph mining algorithms to identify key themes, such as User Activity and Device Information, using dimensionality reduction techniques like t-SNE and PCA to assess clustering effectiveness. Our findings reveal that graph-based clustering improves policy content interpretability. It highlights patterns in user tracking and data sharing, which supports forensic investigations and identifies regulatory non-compliance. This research advances AI-driven tools for auditing privacy policies by integrating interactive visualizations with graph mining. Enhanced transparency fosters accountability and trust.
Authors:Zhongshu Gu, Enriquillo Valdez, Salman Ahmed, Julian James Stephen, Michael Le, Hani Jamjoom, Shixuan Zhao, Zhiqiang Lin
Abstract:
GPU Confidential Computing (GPU-CC) was introduced as part of the NVIDIA Hopper Architecture, extending the trust boundary beyond traditional CPU-based confidential computing. This innovation enables GPUs to securely process AI workloads, providing a robust and efficient solution for handling sensitive data. For end users, transitioning to GPU-CC mode is seamless, requiring no modifications to existing AI applications. However, this ease of adoption contrasts sharply with the complexity of the underlying proprietary systems. The lack of transparency presents significant challenges for security researchers seeking a deeper understanding of GPU-CC's architecture and operational mechanisms.
The challenges of analyzing the NVIDIA GPU-CC system arise from a scarcity of detailed specifications, the proprietary nature of the ecosystem, and the complexity of product design. In this paper, we aim to demystify the implementation of NVIDIA GPU-CC system by piecing together the fragmented and incomplete information disclosed from various sources. Our investigation begins with a high-level discussion of the threat model and security principles before delving into the low-level details of each system component. We instrument the GPU kernel module -- the only open-source component of the system -- and conduct a series of experiments to identify the security weaknesses and potential exploits. For certain components that are out of reach through experiments, we propose well-reasoned speculations about their inner working mechanisms. We have responsibly reported all security findings presented in this paper to the NVIDIA PSIRT Team.
Authors:Artur Zolkowski, Kei Nishimura-Gasparian, Robert McCarthy, Roland S. Zimmermann, David Lindner
Abstract:
Monitoring Large Language Model (LLM) outputs is crucial for mitigating risks from misuse and misalignment. However, LLMs could evade monitoring through steganography: Encoding hidden information within seemingly benign generations. In this paper, we evaluate the steganography capabilities in frontier LLMs to better understand the risk they pose. We focus on two types of steganography: passing encoded messages and performing encoded reasoning. We find that current models are unable to encode short messages in their outputs without a monitor noticing under standard affordances. They can succeed, however, if given additional affordances such as using an unmonitored scratchpad and coordinating on what encoding scheme to use. We additionally find early signs that models can perform basic encoded reasoning in a simple state-tracking problem. This includes some ability to reason with their own and pre-defined schemes, including encoding schemes such as Hexadecimal. Despite this, they can rarely hide reasoning subtly within a cover task to fool a monitor. Overall, our results indicate that current LLMs exhibit nascent steganographic capabilities. While these capabilities are likely insufficient to bypass well-designed monitors at present, this could change in the future.
Authors:Paul Francis, David Wagner
Abstract:
The purpose of anonymizing structured data is to protect the privacy of individuals in the data while retaining the statistical properties of the data. An important class of attack on anonymized data is attribute inference, where an attacker infers the value of an unknown attribute of a target individual given knowledge of one or more known attributes. A major limitation of recent attribute inference measures is that they do not take recall into account, only precision. It is often the case that attacks target only a fraction of individuals, for instance data outliers. Incorporating recall, however, substantially complicates the measure, because one must determine how to combine recall and precision in a composite measure for both the attack and baseline. This paper presents the design and implementation of an attribute inference measure that incorporates both precision and recall. Our design also improves on how the baseline attribute inference is computed. In experiments using a generic best row match attack on moderately-anonymized microdata, we show that in over 25\% of the attacks, our approach correctly labeled the attack to be at risk while the prior approach incorrectly labeled the attack to be safe.
Authors:Chenyu Li, Xueping Liang, Xiaorui Gong, Xiu Zhang
Abstract:
While Ethereum's discovery protocols (Discv4/ Discv5) incorporate robust cryptographic designs to protect user privacy, real-world deployment reveals critical vulnerabilities when users deviate from security guidelines. In this paper, we design a system called EGNInfoLeaker. Our study is the first work that uncovers widespread public key reuse across Ethereum's peer-to-peer networks - a practice that fundamentally undermines the protocol's privacy guarantees. Through systematic analysis of 300 real-world network snapshots, we identify 83 users controlling 483 service nodes via public key reuse, enabling precise de-anonymization through IP correlation. Using evidence collected by EGNInfoLeaker, our Graph-Based Identity Association Algorithm links users to network entities and generates comprehensive user profiles. For User27, it exposes the public key, IP, network ID, location (country/region/city), and ISP/ORG details. The EGNInfoLeaker system demonstrates how such cryptographic misuse transforms theoretical anonymity into practical identity leakage, exposing users to surveillance and targeted attacks. These findings establish that protocol security depends not only on sound design but also on strict user compliance. Going forward, our detection framework provides a foundation for enhancing real-world privacy preservation in decentralized networks.
Authors:Alper Alimoglu, Kamil Erdayandi, Mustafa A. Mustafa, Ãmit Cali
Abstract:
This paper proposes a new decentralized framework, named EDGChain-E (Encrypted-Data-Git Chain for Energy), designed to manage version-controlled, encrypted energy data using blockchain and the InterPlanetary File System. The framework incorporates a Decentralized Autonomous Organization (DAO) to orchestrate collaborative data governance across the lifecycle of energy research and operations, such as smart grid monitoring, demand forecasting, and peer-to-peer energy trading. In EDGChain-E, initial commits capture the full encrypted datasets-such as smart meter readings or grid telemetry-while subsequent updates are tracked as encrypted Git patches, ensuring integrity, traceability, and privacy. This versioning mechanism supports secure collaboration across multiple stakeholders (e.g., utilities, researchers, regulators) without compromising sensitive or regulated information. We highlight the framework's capability to maintain FAIR-compliant (Findable, Accessible, Interoperable, Reusable) provenance of encrypted data. By embedding hash-based content identifiers in Merkle trees, the system enables transparent, auditable, and immutable tracking of data changes, thereby supporting reproducibility and trust in decentralized energy applications.
Authors:Koen T. W. Teuwen, Sam Baggen, Emmanuele Zambon, Luca Allodi
Abstract:
Automation in Security Operations Centers (SOCs) plays a prominent role in alert classification and incident escalation. However, automated methods must be robust in the presence of imbalanced input data, which can negatively affect performance. Additionally, automated methods should make explainable decisions. In this work, we evaluate the effect of label imbalance on the classification of network intrusion alerts. As our use-case we employ DeepCASE, the state-of-the-art method for automated alert classification. We show that label imbalance impacts both classification performance and correctness of the classification explanations offered by DeepCASE. We conclude tuning the detection rules used in SOCs can significantly reduce imbalance and may benefit the performance and explainability offered by alert post-processing methods such as DeepCASE. Therefore, our findings suggest that traditional methods to improve the quality of input data can benefit automation.
Authors:Yinghao Wu, Liyan Zhang
Abstract:
Vision State Space Models (SSMs), particularly architectures like Vision Mamba (ViM), have emerged as promising alternatives to Vision Transformers (ViTs). However, the security implications of this novel architecture, especially their vulnerability to backdoor attacks, remain critically underexplored. Backdoor attacks aim to embed hidden triggers into victim models, causing the model to misclassify inputs containing these triggers while maintaining normal behavior on clean inputs. This paper investigates the susceptibility of ViM to backdoor attacks by introducing BadViM, a novel backdoor attack framework specifically designed for Vision Mamba. The proposed BadViM leverages a Resonant Frequency Trigger (RFT) that exploits the frequency sensitivity patterns of the victim model to create stealthy, distributed triggers. To maximize attack efficacy, we propose a Hidden State Alignment loss that strategically manipulates the internal representations of model by aligning the hidden states of backdoor images with those of target classes. Extensive experimental results demonstrate that BadViM achieves superior attack success rates while maintaining clean data accuracy. Meanwhile, BadViM exhibits remarkable resilience against common defensive measures, including PatchDrop, PatchShuffle and JPEG compression, which typically neutralize normal backdoor attacks.
Authors:Peilin He, James Joshi
Abstract:
Reconstructing high-quality images from low-resolution inputs using Residual Dense Spatial Networks (RDSNs) is crucial yet challenging, particularly in collaborative scenarios where centralized training poses significant privacy risks, including data leakage and inference attacks, as well as high computational costs. We propose a novel Privacy-Preserving Federated Learning-based RDSN (PPFL-RDSN) framework specifically tailored for lossy image reconstruction. PPFL-RDSN integrates Federated Learning (FL), local differential privacy, and robust model watermarking techniques, ensuring data remains secure on local devices, safeguarding sensitive information, and maintaining model authenticity without revealing underlying data. Empirical evaluations show that PPFL-RDSN achieves comparable performance to the state-of-the-art centralized methods while reducing computational burdens, and effectively mitigates security and privacy vulnerabilities, making it a practical solution for secure and privacy-preserving collaborative computer vision applications.
Authors:Jason Veara, Manav Jain, Kyle Moy, Aanjhan Ranganathan
Abstract:
Mysterious sightings of Unmanned Aircraft Systems (UAS) over U.S. military facilities, suburban neighborhoods, and commercial airports have intensified scrutiny of drone activity. To increase accountability, the Federal Aviation Administration (FAA) introduced a Remote ID mandate, requiring unmanned aircraft to broadcast their location, operator's location, and identity in real-time. However, current standards leave authentication mechanisms underspecified, enabling spoofing, relay, and replay attacks that can undermine surveillance efforts and potentially disrupt UAS-to-UAS coordination in future deployments. In this paper, we propose TBRD, a practical system for authenticating Remote ID messages in a manner that aligns with existing standards and UAS capabilities. TBRD leverages the TESLA protocol and mobile device TEEs, and introduces a verification mechanism to build a lightweight, mission-scoped authentication system that is both computationally efficient and requires a low communication footprint. We evaluate the performance of TBRD using both an FAA-requirements compatible proof-of-concept implementation for performance metrics and a simulated 4-drone swarm mission scenario to demonstrate its security guarantees under adversarial conditions. Our system provides a 50\% reduction in authentication overhead compared to digital signatures and a 100x reduction in computation time. Our results demonstrate that TBRD can be integrated into current Remote ID infrastructures to provide a scalable, standards-compliant message authentication for both regulatory and operational use cases.
Authors:Josep Domingo-Ferrer, David Sánchez
Abstract:
Privacy models were introduced in privacy-preserving data publishing and statistical disclosure control with the promise to end the need for costly empirical assessment of disclosure risk. We examine how well this promise is kept by the main privacy models. We find they may fail to provide adequate protection guarantees because of problems in their definition or incur unacceptable trade-offs between privacy protection and utility preservation. Specifically, k-anonymity may not entirely exclude disclosure if enforced with deterministic mechanisms or without constraints on the confidential values. On the other hand, differential privacy (DP) incurs unacceptable utility loss for small budgets and its privacy guarantee becomes meaningless for large budgets. In the latter case, an ex post empirical assessment of disclosure risk becomes necessary, undermining the main appeal of privacy models. Whereas the utility preservation of DP can only be improved by relaxing its privacy guarantees, we argue that a semantic reformulation of k-anonymity can offer more robust privacy without losing utility with respect to traditional syntactic k-anonymity.
Authors:Yang Zhuochen, Fok Kar Wai, Thing Vrizlynn
Abstract:
Large language models have gained widespread attention recently, but their potential security vulnerabilities, especially privacy leakage, are also becoming apparent. To test and evaluate for data extraction risks in LLM, we proposed CoSPED, short for Consistent Soft Prompt targeted data Extraction and Defense. We introduce several innovative components, including Dynamic Loss, Additive Loss, Common Loss, and Self Consistency Decoding Strategy, and tested to enhance the consistency of the soft prompt tuning process. Through extensive experimentation with various combinations, we achieved an extraction rate of 65.2% at a 50-token prefix comparison. Our comparisons of CoSPED with other reference works confirm our superior extraction rates. We evaluate CoSPED on more scenarios, achieving Pythia model extraction rate of 51.7% and introducing cross-model comparison. Finally, we explore defense through Rank-One Model Editing and achieve a reduction in the extraction rate to 1.6%, which proves that our analysis of extraction mechanisms can directly inform effective mitigation strategies against soft prompt-based attacks.
Authors:Carlo Brunetta, Amit Chaudhary, Stefano Galatolo, Massimiliano Sala
Abstract:
Dynamically distributed inflation is a common mechanism used to guide a blockchain's staking rate towards a desired equilibrium between network security and token liquidity. However, the high sensitivity of the annual percentage yield to changes in the staking rate, coupled with the inherent feedback delays in staker responses, can induce undesirable oscillations around this equilibrium. This paper investigates this instability phenomenon. We analyze the dynamics of inflation-based reward systems and propose a novel distribution model designed to stabilize the staking rate. Our solution effectively dampens oscillations, stabilizing the yield within a target staking range.
Authors:Qizhou Peng, Yang Zheng, Yu Wen, Yanna Wu, Yingying Du
Abstract:
Reinforcement learning (RL) has been an important machine learning paradigm for solving long-horizon sequential decision-making problems under uncertainty. By integrating deep neural networks (DNNs) into the RL framework, deep reinforcement learning (DRL) has emerged, which achieved significant success in various domains. However, the integration of DNNs also makes it vulnerable to adversarial attacks. Existing adversarial attack techniques mainly focus on either directly manipulating the environment with which a victim agent interacts or deploying an adversarial agent that interacts with the victim agent to induce abnormal behaviors. While these techniques achieve promising results, their adoption in multi-party open systems remains limited due to two major reasons: impractical assumption of full control over the environment and dependent on interactions with victim agents. To enable adversarial attacks in multi-party open systems, in this paper, we redesigned an adversarial policy learning approach that can mislead well-trained victim agents without requiring direct interactions with these agents or full control over their environments. Particularly, we propose a neutral agent-based approach across various task scenarios in multi-party open systems. While the neutral agents seemingly are detached from the victim agents, indirectly influence them through the shared environment. We evaluate our proposed method on the SMAC platform based on Starcraft II and the autonomous driving simulation platform Highway-env. The experimental results demonstrate that our method can launch general and effective adversarial attacks in multi-party open systems.
Authors:Andrew Huang, Vinod Vaikuntanathan
Abstract:
One-shot signatures (OSS) are a powerful and uniquely quantum cryptographic primitive which allows anyone, given common reference string, to come up with a public verification key $\mathsf{pk}$ and a secret signing state $|\mathsf{sk}\rangle$. With the secret signing state, one can produce the signature of any one message, but no more. In a recent breakthrough work, Shmueli and Zhandry (CRYPTO 2025) constructed one-shot signatures, either unconditionally in a classical oracle model or assuming post-quantum indistinguishability obfuscation and the hardness of Learning with Errors (LWE) in the plain model. In this work, we address the inefficiency of the Shmueli-Zhandry construction which signs messages bit-by-bit, resulting in signing keys of $Θ(λ^4)$ qubits and signatures of size $Θ(λ^3)$ bits for polynomially long messages, where $λ$ is the security parameter. We construct a new, simple, direct, and efficient one-shot signature scheme which can sign messages of any polynomial length using signing keys of $Θ(λ^2)$ qubits and signatures of size $Θ(λ^2)$ bits. We achieve corresponding savings in runtimes, in both the oracle model and the plain model. In addition, unlike the Shmueli-Zhandry construction, our scheme achieves perfect correctness. Our scheme also achieves strong signature incompressibility, which implies a public-key quantum fire scheme with perfect correctness among other applications, correcting an error in a recent work of Çakan, Goyal and Shmueli (QCrypt 2025) and recovering their applications.
Authors:Daniel Hennig, Joaquin Garcia-Alfaro
Abstract:
This report presents some technical details on the authentication process of a lightweight key exchange protocol, paying attention on how Man-in-the-Middle (MitM) attacks could undermine its security, e.g., under the scope of lawful interception and its risk to facilitate mass surveillance. We focus only on some technical aspects associated to the attack scenario. Perspectives for future work are also discussed. Other specific aspects of the work, mainly focusing on the security implications of malicious metasurfaces against B5G networks, are excluded from the scope of this report.
Authors:Léo Ducas, Lynn Engelberts, Paola de Perthuis
Abstract:
Is module-lattice reduction better than unstructured lattice reduction? This question was highlighted as 'Q8' in the Kyber NIST standardization submission (Avanzi et al., 2021), as potentially affecting the concrete security of Kyber and other module-lattice-based schemes. Foundational works on module-lattice reduction (Lee, Pellet-Mary, Stehlé, and Wallet, ASIACRYPT 2019; Mukherjee and Stephens-Davidowitz, CRYPTO 2020) confirmed the existence of such module variants of LLL and block-reduction algorithms, but focus only on provable worst-case asymptotic behavior. In this work, we present a concrete average-case analysis of module-lattice reduction. Specifically, we address the question of the expected slope after running module-BKZ, and pinpoint the discriminant $Δ_K$ of the number field at hand as the main quantity driving this slope. We convert this back into a gain or loss on the blocksize $β$: module-BKZ in a number field $K$ of degree $d$ requires an SVP oracle of dimension $β+ \log(|Δ_K| / d^d)β/(d\log β) + o(β/ \log β)$ to reach the same slope as unstructured BKZ with blocksize $β$. This asymptotic summary hides further terms that we predict concretely using experimentally verified heuristics. Incidentally, we provide the first open-source implementation of module-BKZ for some cyclotomic fields. For power-of-two cyclotomic fields, we have $|Δ_K| = d^d$, and conclude that module-BKZ requires a blocksize larger than its unstructured counterpart by $d-1+o(1)$. On the contrary, for all other cyclotomic fields we have $|Δ_K| < d^d$, so module-BKZ provides a sublinear $Θ(β/\log β)$ gain on the required blocksize, yielding a subexponential speedup of $\exp(Θ(β/\log β))$.
Authors:Kartikeya Aneja, Nagender Aneja, Murat Kantarcioglu
Abstract:
Software systems can be represented as graphs, capturing dependencies among functions and processes. An interesting aspect of software systems is that they can be represented as different types of graphs, depending on the extraction goals and priorities. For example, function calls within the software can be captured to create function call graphs, which highlight the relationships between functions and their dependencies. Alternatively, the processes spawned by the software can be modeled to generate process interaction graphs, which focus on runtime behavior and inter-process communication. While these graph representations are related, each captures a distinct perspective of the system, providing complementary insights into its structure and operation. While previous studies have leveraged graph neural networks (GNNs) to analyze software behaviors, most of this work has focused on a single type of graph representation. The joint modeling of both function call graphs and process interaction graphs remains largely underexplored, leaving opportunities for deeper, multi-perspective analysis of software systems. This paper presents a pipeline for constructing and training Function Call Graphs (FCGs) and Process Call Graphs (PCGs) and learning joint embeddings. We demonstrate that joint embeddings outperform a single-graph model. In this paper, we propose GeminiNet, a unified neural network approach that learns joint embeddings from both FCGs and PCGs. We construct a new dataset of 635 Windows executables (318 malicious and 317 benign), extracting FCGs via Ghidra and PCGs via Any.Run sandbox. GeminiNet employs dual graph convolutional branches with an adaptive gating mechanism that balances contributions from static and dynamic views.
Authors:Jarrad Hope, Peter Ludlow
Abstract:
We argue that the principal application for blockchain technology will not be in the financial sector, but rather in maintaining decentralized human governance, from archives to transparent policies encoded in the blockchain in the form of smart contracts.. Such decentralized, blockchain-grounded governance comes not a moment too soon, as nation states are dissolving before our eyes. Will blockchain-based communities replace the nation state? What are the prospects and dangers of this development?
Authors:David Benavente-Rios, Juan Ruiz Rodriguez, Gustavo Gatica
Abstract:
This paper investigates the use of synthetic face data to enhance Single-Morphing Attack Detection (S-MAD), addressing the limitations of availability of large-scale datasets of bona fide images due to privacy concerns. Various morphing tools and cross-dataset evaluation schemes were utilized to conduct this study. An incremental testing protocol was implemented to assess the generalization capabilities as more and more synthetic images were added. The results of the experiments show that generalization can be improved by carefully incorporating a controlled number of synthetic images into existing datasets or by gradually adding bona fide images during training. However, indiscriminate use of synthetic data can lead to sub-optimal performance. Evenmore, the use of only synthetic data (morphed and non-morphed images) achieves the highest Equal Error Rate (EER), which means in operational scenarios the best option is not relying only on synthetic data for S-MAD.
Authors:Alex Hiles, Bashar I. Ahmad
Abstract:
Fingerprinting Radio Frequency (RF) emitters typically involves finding unique emitter characteristics that are featured in their transmitted signals. These fingerprints are nuanced but sufficiently detailed, motivating the pursuit of methods that can successfully extract them. The most granular downstream task is known as Specific Emitter Identification (SEI), which requires a well informed RF fingerprinting (RFF) approach for it to be successful. RFF and SEI have a long history, with numerous application areas in defence and civilian contexts such as signal intelligence, electronic surveillance, physical-layer authentication of wireless communication devices, to name a few. RFF methods also support many other downstream tasks such as Emitter Data Association (EDA) and RF Emitter Clustering (RFEC) and are applicable to a range of transmission types. In recent years, data-driven approaches have become popular in the RFF domain due to their ability to automatically learn intricate fingerprints from raw data. These methods generally deliver superior performance when compared to traditional techniques. The more traditional approaches are often labour-intensive, inflexible and only applicable to a particular emitter type or transmission scheme. Therefore, we consider data-driven Machine Learning (ML)-enabled RFF. In particular, we propose a generic framework for ML-enabled RFF which is inclusive of several popular downstream tasks such as SEI, EDA and RFEC. Each task is formulated as a RF fingerprint-dependent task. A variety of use cases using real RF datasets are presented here to demonstrate the framework for a range of tasks and application areas, such as spaceborne surveillance, signal intelligence and countering drones.
Authors:Samuel Oleksak, Richard Gazdik, Martin Peresini, Ivan Homoliak
Abstract:
Proof of Work (PoW) is widely regarded as the most secure permissionless blockchain consensus protocol. However, its reliance on computationally intensive yet externally useless puzzles results in excessive electric energy wasting. To alleviate this, Proof of Useful Work (PoUW) has been explored as an alternative to secure blockchain platforms while also producing real-world value. Despite this promise, existing PoUW proposals often fail to embed the integrity of the chain and identity of the miner into the puzzle solutions, not meeting necessary requirements for PoW and thus rendering them vulnerable. In this work, we propose a PoUW consensus protocol that computes client-outsourced zk-SNARKs proofs as a byproduct, which are at the same time used to secure the consensus protocol. We further leverage this mechanism to design a decentralized marketplace for outsourcing zk-SNARK proof generation, which is, to the best of our knowledge, the first such marketplace operating at the consensus layer, while meeting all necessary properties of PoW.
Authors:Md Habibur Rahman, Md Sharif Hossen, Nathan H. Stephenson, Vijay K. Shah, Aloizio Da Silva
Abstract:
The open radio access network (O-RAN) enables modular, intelligent, and programmable 5G network architectures through the adoption of software-defined networking, network function virtualization, and implementation of standardized open interfaces. However, one of the security concerns for O-RAN, which can severely undermine network performance, is jamming attacks. This paper presents SAJD- a self-adaptive jammer detection framework that autonomously detects jamming attacks in AI/ML framework-integrated ORAN environments without human intervention. The SAJD framework forms a closed-loop system that includes near-realtime inference of radio signal jamming via our developed ML-based xApp, as well as continuous monitoring and retraining pipelines through rApps. In this demonstration, we will show how SAJD outperforms state-of-the-art jamming detection xApp (offline trained with manual labels) in terms of accuracy and adaptability under various dynamic and previously unseen interference scenarios in the O-RAN-compliant testbed.
Authors:Khaoula Sghaier, Badis Hammi, Ghada Gharbi, Pierre Merdrignac, Pierre Parrend, Didier Verna
Abstract:
Software-Defined Vehicles (SDVs) introduce innovative features that extend the vehicle's lifecycle through the integration of outsourced applications and continuous Over-The-Air (OTA) updates. This shift necessitates robust cybersecurity and system resilience. While research on Connected and Autonomous Vehicles (CAV) has been extensive, there is a lack of clarity in distinguishing SDVs from non-SDVs and a need to consolidate cybersecurity research. SDVs, with their extensive connectivity, have a broader attack surface. Besides, their software-centric nature introduces additional vulnerabilities. This paper provides a comprehensive examination of SDVs, detailing their ecosystem, enabling technologies, and the principal cyberattack entry points that arise from their architectural and operational characteristics. We also introduce a novel, layered taxonomy that maps concrete exploit techniques onto core SDV properties and attack paths, and use it to analyze representative studies and experimental approaches.
Authors:Simone Fischer-Hübner, Leonardo A. Martucci, Lejla Islami, Ala Sarah Alaqra, Farzaneh Karegar
Abstract:
A rapidly growing number of cybersecurity threats and incidents demands that Swedish organisations increase their efforts to improve their cybersecurity capacities. This paper presents results from interviews and a prior survey with key representatives from enterprises and public sector organisations in the Swedish region of Värmland in Inner Scandinavia, examining their cybersecurity readiness and needs for education and competence development. We discuss the generalizability of our findings and the extent to which they may be specific to Sweden and Värmland, and we conclude by proposing efforts to strengthen cybersecurity competences in Inner Scandinavia.
Authors:Adam Bloomston, Elizabeth Burke, Megan Cacace, Anne Diaz, Wren Dougherty, Matthew Gonzalez, Remington Gregg, Yeliz Güngör, Bryce Hayes, Eeway Hsu, Oron Israeli, Heesoo Kim, Sara Kwasnick, Joanne Lacsina, Demma Rosa Rodriguez, Adam Schiller, Whitney Schumacher, Jessica Simon, Maggie Tang, Skyler Wharton, Marilyn Wilcken
Abstract:
We present Core Mondrian, a scalable extension of the Original Mondrian partition-based anonymization algorithm. A modular strategy layer supports k-anonymity, allowing new privacy models to be added easily. A hybrid recursive/queue execution engine exploits multi-core parallelism while maintaining deterministic output. Utility-preserving enhancements include NaN-pattern pre-partitioning, metric-driven cut scoring, and dynamic suppression budget management. Experiments on the 48k-record UCI ADULT dataset and synthetically scaled versions up to 1M records achieve lower Discernibility Metric scores than Original Mondrian for numeric quasi-identifier sets while parallel processing delivers up to 4x speedup vs. sequential Core Mondrian. Core Mondrian enables privacy-compliant equity analytics at production scale.
Authors:Krishno Dey, Diogo Barradas, Saqib Hakak
Abstract:
The Metaverse utilizes emerging technologies such as Extended Reality (XR), Artificial Intelligence (AI), blockchain, and digital twins to provide an immersive and interactive virtual experience. As Metaverse continues to evolve, it bring a range of security and privacy threats, such as identity management, data governance, and user interactions. This survey aims to provide a comprehensive review of the enabling technologies for the Metaverse. It also aims to provide a thorough analysis of key vulnerabilities and threats that may compromise its sustainability and user safety. We perform a systematic literature review (SLR) to identify key vulnerabilities and their countermeasures in Metaverse platforms. Metaverse offers a much larger attack surface compared to conventional digital platforms. Immersive, decentralized, and permanent characteristics of the Metaverse generate new vulnerabilities. Although there are many countermeasures to these vulnerabilities, most of them are theoretical or have not been tested in real-world environments. Our review highlights current advancements, identifies research gaps, and outlines future directions to ensure a secure, resilient, and ethically governed Metaverse.
Authors:Jiayun Mo, Xin Kang, Tieyan Li, Zhongding Lei
Abstract:
The excitement brought by the development of AI agents came alongside arising problems. These concerns centered around users' trust issues towards AIs, the risks involved, and the difficulty of attributing responsibilities and liabilities. Current solutions only attempt to target each problem separately without acknowledging their inter-influential nature. The Trust, Risk and Liability (TRL) framework proposed in this paper, however, ties together the interdependent relationships of trust, risk, and liability to provide a systematic method of building and enhancing trust, analyzing and mitigating risks, and allocating and attributing liabilities. It can be applied to analyze any application scenarios of AI agents and suggest appropriate measures fitting to the context. The implications of the TRL framework lie in its potential societal impacts, economic impacts, ethical impacts, and more. It is expected to bring remarkable values to addressing potential challenges and promoting trustworthy, risk-free, and responsible usage of AI in 6G networks.
Authors:Meiyin Meng, Zaixi Zhang
Abstract:
Large language models (LLMs) are increasingly integrated into biomedical research workflows--from literature triage and hypothesis generation to experimental design--yet this expanded utility also heightens dual-use concerns, including the potential misuse for guiding toxic compound synthesis. In response, this study shows a Biosecurity Agent that comprises four coordinated modes across the model lifecycle: dataset sanitization, preference alignment, run-time guardrails, and automated red teaming. For dataset sanitization (Mode 1), evaluation is conducted on CORD-19, a COVID-19 Open Research Dataset of coronavirus-related scholarly articles. We define three sanitization tiers--L1 (compact, high-precision), L2 (human-curated biosafety terms), and L3 (comprehensive union)--with removal rates rising from 0.46% to 70.40%, illustrating the safety-utility trade-off. For preference alignment (Mode 2), DPO with LoRA adapters internalizes refusals and safe completions, reducing end-to-end attack success rate (ASR) from 59.7% to 3.0%. At inference (Mode 3), run-time guardrails across L1-L3 show the expected security-usability trade-off: L2 achieves the best balance (F1 = 0.720, precision = 0.900, recall = 0.600, FPR =0.067), while L3 offers stronger jailbreak resistance at the cost of higher false positives. Under continuous automated red-teaming (Mode 4), no successful jailbreaks are observed under the tested protocol. Taken together, our biosecurity agent offers an auditable, lifecycle-aligned framework that reduces attack success while preserving benign utility, providing safeguards for the use of LLMs in scientific research and setting a precedent for future agent-level security protections.
Authors:Nico Bistolfi, Andreea Georgescu, Dave Hodson
Abstract:
As cloud infrastructure evolves to support dynamic and distributed workflows, accelerated now by AI-driven processes, the outdated model of standing permissions has become a critical vulnerability. Based on the Cloud Security Alliance (CSA) Top Threats to Cloud Computing Deep Dive 2025 Report, our analysis details how standing permissions cause catastrophic cloud breaches. While current security tools are addressing network and API security, the challenge of securing granular data access remains. Removing standing permissions at the data level is as critical as it is at the network level, especially for companies handling valuable data at scale. In this white paper, we introduce an innovative architecture based on on-demand data enclaves to address this gap directly. Our approach enables Zero Standing Privilege (ZSP) and Just-in-Time (JIT) principles at the data level. We replace static permissions with temporary data contracts that enforce proactive protection. This means separation is built around the data requested on-demand, providing precise access and real time monitoring for individual records instead of datasets. This solution drastically reduces the attack surface, prevents privilege creep, and simplifies auditing, offering a vital path for enterprises to transition to a more secure and resilient data environment.
Authors:Moritz Steffin, Jiska Classen
Abstract:
The XNU kernel is the basis of Apple's operating systems. Although labeled as a hybrid kernel, it is found to generally operate in a monolithic manner by defining a single privileged trust zone in which all system functionality resides. This has security implications, as a kernel compromise has immediate and significant effects on the entire system. Over the past few years, Apple has taken steps towards a more compartmentalized kernel architecture and a more microkernel-like design. To date, there has been no scientific discussion of SPTM and related security mechanisms. Therefore, the understanding of the system and the underlying security mechanisms is minimal. In this paper, we provide a comprehensive analysis of new security mechanisms and their interplay, and create the first conclusive writeup considering all current mitigations. SPTM acts as the sole authority regarding memory retyping. Our analysis reveals that, through SPTM domains based on frame retyping and memory mapping rule sets, SPTM introduces domains of trust into the system, effectively gapping different functionalities from one another. Gapped functionality includes the TXM, responsible for code signing and entitlement verification. We further demonstrate how this introduction lays the groundwork for the most recent security feature of Exclaves, and conduct an in-depth analysis of its communication mechanisms. We discover multifold ways of communication, most notably xnuproxy as a secure world request handler, and the Tightbeam IPC framework. The architecture changes are found to increase system security, with key and sensitive components being moved out of XNU's direct reach. This also provides additional security guarantees in the event of a kernel compromise, which is no longer an immediate threat at the highest trust level.
Authors:Chandra Thapa, Surya Nepal
Abstract:
Future G network's new reality is a widespread cyber-physical environment created by Integrated Sensing and Communication (ISAC). It is a crucial technology that transforms wireless connections into ubiquitous sensors. ISAC unlocks transformative new capabilities, powering autonomous systems, augmented human sensing, and next-generation immersive applications, such as digital twins. However, this new reality fundamentally reshapes the security landscape. The primary security concern shifts from the traditional focus on data protection to a new priority: safeguarding the integrity of the system's perception of physical reality itself. This perception can be perilously manipulated by sophisticated attacks such as sensing eavesdropping, phantom dangers, and invisible threats, potentially resulting in direct and catastrophic physical harm. Traditional security measures, such as signature-based detection, are insufficient to counter these perception-level threats that mimic genuine physical signals. A proactive, layered, defense-in-depth strategy is required, integrating physical, environmental, intelligence, and architectural security measures to build a trustworthy ecosystem. Additionally, realizing ISAC's potential responsibly also depends on parallel efforts in global standardization and strong governance to address the significant challenges of privacy, liability, and the technology's dual-use.
Authors:Satyam Tyagi, Ganesh Murugesan
Abstract:
Ransomware impact hinges on how easily an intruder can move laterally and spread to the maximum number of assets. We present a graph-theoretic method to measure lateral-movement susceptibility and estimate blast radius. We build a directed multigraph where vertices represent assets and edges represent reachable services (e.g., RDP/SSH) between them. We model lateral movement as a probabilistic process using a pivot potential factor $Ï(s)$ for each service. This allows us to iteratively compute a $K$-hop compromise probability matrix that captures how compromise propagates through the network. Metrics derived from this model include: (1) Lateral-Movement Susceptibility (LMS$_K$): the average probability of a successful lateral movement between any two assets (0-1 scale); and (2) Blast-Radius Estimate (BRE$_K$): the expected percentage of assets compromised in an average attack scenario. Interactive control (SSH 22, RDP 3389) gets higher $Ï(s)$ than app-only ports (MySQL 3306, MSSQL 1433), which seldom enable pivoting without an RCE. Across anonymized enterprise snapshots, pruning high-$Ï(s)$ edges yields the largest LMS$_K$/BRE$_K$ drop, aligning with CISA guidance, MITRE ATT\&CK (TA0008: Lateral Movement), and NIST SP~800-207. The framework evaluates (micro)segmentation and helps prioritize controls that reduce lateral movement susceptibility and shrink blast radius.
Authors:Amine Lbath, Massih-Reza Amini, Aurelien Delaitre, Vadim Okun
Abstract:
The increasing complexity of software systems and the sophistication of cyber-attacks have underscored the critical need for effective automated vulnerability detection and repair systems. Traditional methods, such as static program analysis, face significant challenges related to scalability, adaptability, and high false-positive and false-negative rates. AI-driven approaches, particularly those using machine learning and deep learning models, show promise but are heavily reliant on the quality and quantity of training data. This paper introduces a novel framework designed to automatically introduce realistic, category-specific vulnerabilities into secure C/C++ codebases to generate datasets. The proposed approach coordinates multiple AI agents that simulate expert reasoning, along with function agents and traditional code analysis tools. It leverages Retrieval-Augmented Generation for contextual grounding and employs Low-Rank approximation of weights for efficient model fine-tuning. Our experimental study on 116 code samples from three different benchmarks suggests that our approach outperforms other techniques with regard to dataset accuracy, achieving between 89\% and 95\% success rates in injecting vulnerabilities at function level.
Authors:Jose E. Puente, Carlos Puente
Abstract:
We explore the feasibility of deploying Bitcoin as the shared monetary standard between Earth and Mars, accounting for physical constraints of interplanetary communication. We introduce a novel primitive, Proof-of-Transit Timestamping (PoTT), to provide cryptographic, tamper-evident audit trails for Bitcoin data across high-latency, intermittently-connected links. Leveraging Delay/Disruption-Tolerant Networking (DTN) and optical low-Earth-orbit (LEO) mesh constellations, we propose an architecture for header-first replication, long-horizon Lightning channels with planetary watchtowers, and secure settlement through federated sidechains or blind-merge-mined (BMM) commit chains. We formalize PoTT, analyze its security model, and show how it measurably improves reliability and accountability without altering Bitcoin consensus or its monetary base. Near-term deployments favor strong federations for local settlement; longer-term, blind-merge-mined commit chains (if adopted) provide an alternative. The Earth L1 monetary base remains unchanged, while Mars can operate a pegged commit chain or strong federation with 1:1 pegged assets for local block production. For transparency, if both time-beacon regimes are simultaneously compromised, PoTT-M2 (and PoTT generally) reduces to administrative assertions rather than cryptographic time-anchoring.
Authors:Michael R Smith, Joe Ingram
Abstract:
The rise of AI has transformed the software and hardware landscape, enabling powerful capabilities through specialized infrastructures, large-scale data storage, and advanced hardware. However, these innovations introduce unique attack surfaces and objectives which traditional cybersecurity assessments often overlook. Cyber attackers are shifting their objectives from conventional goals like privilege escalation and network pivoting to manipulating AI outputs to achieve desired system effects, such as slowing system performance, flooding outputs with false positives, or degrading model accuracy. This paper serves to raise awareness of the novel cyber threats that are introduced when incorporating AI into a software system. We explore the operational cybersecurity and supply chain risks across the AI lifecycle, emphasizing the need for tailored security frameworks to address evolving threats in the AI-driven landscape. We highlight previous exploitations and provide insights from working in this area. By understanding these risks, organizations can better protect AI systems and ensure their reliability and resilience.
Authors:Subhrojyoti Mukherjee, Manoranjan Mohanty
Abstract:
Fake images in selfie banking are increasingly becoming a threat. Previously, it was just Photoshop, but now deep learning technologies enable us to create highly realistic fake identities, which fraudsters exploit to bypass biometric systems such as facial recognition in online banking. This paper explores the use of an already established forensic recognition system, previously used for picture camera localization, in deepfake detection.
Authors:Fabian Aude Steen, Daniel Assani Shabani
Abstract:
The NIS2 Directive establishes a common cybersecurity governance model across the European Union, requiring member states to identify, classify, and supervise essential and important entities. As part of a broader governance network, member states are also obligated to notify the European Commission, the Cooperation Group, and ENISA about their cybersecurity infrastructure landscape. This thesis presents an analysis of the NIS2 Directive in this context and translates its provisions into concrete technical requirements. These requirements inform the design and implementation of a modular, legally grounded registry system intended to support competent authorities across the EU in meeting their obligations. Using the Design Science Research methodology, the thesis transforms complex legal provisions into structured workflows, deterministic classification algorithms, and interactive dashboards. The resulting system automates key regulatory processes, including entity registration, classification, and notification, while enabling context-aware supervision and reducing administrative burden. It supports both automated and manual registration methods and introduces a contextual labeling system to handle edge cases, risk factors, and cross-directive dependencies. Although developed for the Norwegian regulatory ecosystem, the system is designed for adaptation by other member states with minimal modification. This thesis contributes a reusable framework that bridges legal interpretation and technical implementation, offering a scalable solution for national and EU-level NIS2 cybersecurity governance. It also identifies key limitations and outlines opportunities for future research and development.
Authors:Luqman Muhammad Zagi, Girindro Pringgo Digdo, Wervyan Shalannanda
Abstract:
This study investigates the defacement of Indonesian websites by actors promoting illegal online gambling. Using a lightweight methodology that combines keyword-driven dorking with systematic crawling, we identified 453 defaced webpages within one month. Although dorking alone yielded a false positive rate of approximately 20.3\%, the integration of crawling and keyword-counting enabled reliable differentiation between true and false positives. Our measurements revealed diverse defacement behaviors, including repeat defacements (150 cases), fixed instances (129), keyword modifications (55), and redirections or hidden URL injections. In total, 8,837 unique third-party URLs spanning 5,930 domains were captured, with a small subset recurring across multiple sites. Website responses were inconsistent, with an average reaction time of 75.3 hours. These findings demonstrate that simple, reproducible techniques can provide meaningful insights into the scale, persistence, and dynamics of defacement, highlighting the importance of continuous measurement for strengthening defenses against online gambling activities.
Authors:Ms. Preeti P. Bhatt, Rakesh R. Savant
Abstract:
Steganography is technique of hiding a data under cover media using different steganography tools. Image steganography is hiding of data (Text/Image/Audio/Video) under a cover as Image. This review paper presents classification of image steganography and the comparison of various Image steganography tools using different image formats. Analyzing numerous tools on the basis of Image features and extracting the best one. Some of the tools available in the market were selected based on the frequent use; these tools were tested using the same input on all of them. Specific text was embedded within all host images for each of the six Steganography tools selected. The results of the experiment reveal that all the six tools were relatively performing at the same level, though some software performs better than others through efficiency. And it was based on the image features like size, dimensions, and pixel value and histogram differentiation.
Authors:Kehao Miao, Xiaolong Jin
Abstract:
With the widespread use of large language models (LLMs), understanding their potential failure modes during user interactions is essential. In practice, users often pose multiple questions in a single conversation with LLMs. Therefore, in this study, we propose Group Query Attack, a technique that simulates this scenario by presenting groups of queries to LLMs simultaneously. We investigate how the accumulated context from consecutive prompts influences the outputs of LLMs. Specifically, we observe that Group Query Attack significantly degrades the performance of models fine-tuned on specific tasks. Moreover, we demonstrate that Group Query Attack induces a risk of triggering potential backdoors of LLMs. Besides, Group Query Attack is also effective in tasks involving reasoning, such as mathematical reasoning and code generation for pre-trained and aligned models.
Authors:Mark Dorsett, Scott Mann, Jabed Chowdhury, Abdun Mahmood
Abstract:
The Denial of Wallet (DoW) attack poses a unique and growing threat to serverless architectures that rely on Function-as-a-Service (FaaS) models, exploiting the cost structure of pay-as-you-go billing to financially burden application owners. Unlike traditional Denial of Service (DoS) attacks, which aim to exhaust resources and disrupt service availability, DoW attacks focus on escalating costs without impacting service operation. This review traces the evolution of DoW research, from initial awareness and attack classification to advancements in detection and mitigation strategies. Key developments include the categorisation of attack types-such as Blast DDoW, Continual Inconspicuous DDoW, and Background Chained DDoW-and the creation of simulation tools like DoWTS, which enable safe experimentation and data generation. Recent advancements highlight machine learning approaches, including systems like Gringotts and DoWNet, which leverage deep learning and anomaly detection to identify malicious traffic patterns. Although substantial progress has been made, challenges persist, notably the lack of real-world data and the need for adaptive billing models. This is the first comprehensive literature review dedicated strictly to Denial of Wallet attacks, providing an in-depth analysis of their financial impacts, attack techniques, mitigation strategies, and detection mechanisms within serverless computing. The paper also presents the first detailed examination of simulation and data generation tools used for DoW research, addressing a critical gap in existing cybersecurity literature. By synthesising these key areas, this study serves as a foundational resource for future research and industry efforts in securing pay-as-you-go cloud environments.
Authors:Mark Dorsett, Scott Man, Tim Koussas
Abstract:
This paper proposes a unified, condition-based framework for classifying both legacy and cloud-era denial-of-service (DoS) attacks. The framework comprises three interrelated models: a formal conditional tree taxonomy, a hierarchical lattice structure based on order theory, and a conceptual Venn diagram. At its core, the taxonomy introduces six observable conditions (C0-C5) grounded in real-world attack behaviours, including source distribution, traffic volume, infrastructure targeting, and financial exploitation. These conditions enable consistent classification of known attacks-such as DoS, DDoS, LDoS, LDDoS, EDoS, DoW, and DDoW, while supporting identification of emerging or hybrid variants. The lattice structure captures the cumulative satisfaction of conditions, allowing hierarchical reasoning across denial attack classes. The Venn diagram highlights conceptual overlaps between availability- and sustainability-focused attacks, improving comparative insight. Together, these models provide a robust analytical lens for threat modeling, mitigation strategy design, and attacker intent classification. The framework is particularly relevant in cloud-native and serverless environments, where sustainability-based attacks are increasingly impactful yet under-recognised. Its extensibility also permits future integration of socio-technical or behavioural dimensions. By offering a structured taxonomy with theoretical grounding and real-world applicability, this work advances denial attack comprehension and equips defenders, researchers, and cloud architects with a shared vocabulary for interpreting and mitigating evolving threat vectors.
Authors:Aoun E Muhammad, Kin Choong Yow, Jamel Baili, Yongwon Cho, Yunyoung Nam
Abstract:
As the deployment of Artificial Intelligence (AI) systems in high-stakes sectors - like healthcare, finance, education, justice, and infrastructure has increased - the possibility and impact of failures of these systems have significantly evolved from being a theoretical possibility to practical recurring, systemic risk. This paper introduces CORTEX (Composite Overlay for Risk Tiering and Exposure), a multi-layered risk scoring framework proposed to assess and score AI system vulnerabilities, developed on empirical analysis of over 1,200 incidents documented in the AI Incident Database (AIID), CORTEX categorizes failure modes into 29 technical vulnerability groups. Each vulnerability is scored through a five-tier architecture that combines: (1) utility-adjusted Likelihood x Impact calculations; (2) governance + contextual overlays aligned with regulatory frameworks, such as the EU AI Act, NIST RMF, OECD principles; (3) technical surface scores, covering exposure vectors like drift, traceability, and adversarial risk; (4) environmental and residual modifiers tailored to context of where these systems are being deployed to use; and (5) a final layered assessment via Bayesian risk aggregation and Monte Carlo simulation to model volatility and long-tail risks. The resulting composite score can be operationalized across AI risk registers, model audits, conformity checks, and dynamic governance dashboards.
Authors:Sai Teja Reddy Adapala, Yashwanth Reddy Alugubelly
Abstract:
The proliferation of autonomous AI agents marks a paradigm shift toward complex, emergent multi-agent systems. This transition introduces systemic security risks, including control-flow hijacking and cascading failures, that traditional cybersecurity paradigms are ill-equipped to address. This paper introduces the Aegis Protocol, a layered security framework designed to provide strong security guarantees for open agentic ecosystems. The protocol integrates three technological pillars: (1) non-spoofable agent identity via W3C Decentralized Identifiers (DIDs); (2) communication integrity via NIST-standardized post-quantum cryptography (PQC); and (3) verifiable, privacy-preserving policy compliance using the Halo2 zero-knowledge proof (ZKP) system. We formalize an adversary model extending Dolev-Yao for agentic threats and validate the protocol against the STRIDE framework. Our quantitative evaluation used a discrete-event simulation, calibrated against cryptographic benchmarks, to model 1,000 agents. The simulation showed a 0 percent success rate across 20,000 attack trials. For policy verification, analysis of the simulation logs reported a median proof-generation latency of 2.79 seconds, establishing a performance baseline for this class of security. While the evaluation is simulation-based and early-stage, it offers a reproducible baseline for future empirical studies and positions Aegis as a foundation for safe, scalable autonomous AI.
Authors:Faezeh Dehghan Tarzjani, Mostafa Salehi
Abstract:
The Internet of Things (IoT) is applied in various fields, and the number of physical devices connected to the IoT is increasingly growing. There are significant challenges to the IoT's growth and development, mainly due to the centralized nature and large-scale IoT networks. The emphasis on the decentralization of IoT's architecture can overcome challenges to IoT's capabilities. A promising decentralized platform for IoT is blockchain. Owing to IoT devices' limited resources, traditional consensus algorithms such as PoW and PoS in the blockchain are computationally expensive. Therefore, the PoA consensus algorithm is proposed in the blockchain consensus network for IoT. The PoA selects the validator as Turn-based selection (TBS) that needs optimization and faces system reliability, energy consumption, latency, and low scalability. We propose an efficient, lightweight blockchain for decentralizing IoT architecture by using virtualization and clustering to increase productivity and scalability to address these issues. We also introduce a novel PoA based on the Weight-Based-Selection (WBS) method for validators to validate transactions and add them to the blockchain. By simulation, we evaluated the performance of our proposed WBS method as opposed to TBS. The results show reduced energy consumption, and response time, and increased throughput.
Authors:Poorya Mollahosseini, Yasaman Ghasempour
Abstract:
Physical-layer key generation (PLKG) has emerged as a promising technique to secure next-generation wireless networks by exploiting the inherent properties of the wireless channel. However, PLKG faces fundamental challenges in the millimeter wave (mmWave) regime due to channel sparsity, higher phase noise, and higher path loss, which undermine both the randomness and reciprocity required for secure key generation. In this paper, we present mmKey, a novel PLKG framework that capitalizes on the availability of multiple antennas at mmWave wireless nodes to inject randomness into an otherwise quasi-static wireless channel. Different from prior works that sacrifice either the secrecy of the key generation or the robustness, mmKey balances these two requirements. In particular, mmKey leverages a genetic algorithm to gradually evolve the initial weight vector population toward configurations that suppress the LOS component while taking into account the channel conditions, specifically, the sparsity and the signal-to-noise ratio (SNR). Extensive simulations show that mmKey improves the secrecy gap by an average of 39.4% over random beamforming and 34.0% over null beamforming, outperforming conventional schemes.
Authors:Martin Lochner, Keegan Keplinger
Abstract:
Objective: This work describes the topic modelling of Security Operations Centre (SOC) use of a large language model (LLM), during live security operations. The goal is to better understand how these specialists voluntarily use this tool.
Background: Human-automation teams have been extensively studied, but transformer-based language models have sparked a new wave of collaboration. SOC personnel at a major cybersecurity provider used an LLM to support live security operations. This study examines how these specialists incorporated the LLM into their work.
Method: Our data set is the result of 10 months of SOC operators accessing GPT-4 over an internally deployed HTTP-based chat application. We performed two topic modelling exercises, first using the established BERTopic model (Grootendorst, 2022), and second, using a novel topic modeling workflow.
Results: Both the BERTopic analysis and novel modelling approach revealed that SOC operators primarily used the LLM to facilitate their understanding of complex text strings. Variations on this use-case accounted for ~40% of SOC LLM usage.
Conclusion: SOC operators are required to rapidly interpret complex commands and similar information. Their natural tendency to leverage LLMs to support this activity indicates that their workflow can be supported and augmented by designing collaborative LLM tools for use in the SOC.
Application: This work can aid in creating next-generation tools for Security Operations Centres. By understanding common use-cases, we can develop workflows supporting SOC task flow. One example is a right-click context menu for executing a command line analysis LLM call directly in the SOC environment.
Authors:Giuseppe Stragapede, Sam Merrick, Vedrana KrivokuÄa Hahn, Justin Sukaitis, Vincent Graf Narbel
Abstract:
In humanitarian and emergency scenarios, the use of biometrics can dramatically improve the efficiency of operations, but it poses risks for the data subjects, which are exacerbated in contexts of vulnerability. To address this, we present a mobile biometric system implementing a biometric template protection (BTP) scheme suitable for these scenarios. After rigorously formulating the functional, operational, and security and privacy requirements of these contexts, we perform a broad comparative analysis of the BTP landscape. PolyProtect, a method designed to operate on neural network face embeddings, is identified as the most suitable method due to its effectiveness, modularity, and lightweight computational burden. We evaluate PolyProtect in terms of verification and identification accuracy, irreversibility, and unlinkability, when this BTP method is applied to face embeddings extracted using EdgeFace, a novel state-of-the-art efficient feature extractor, on a real-world face dataset from a humanitarian field project in Ethiopia. Moreover, as PolyProtect promises to be modality-independent, we extend its evaluation to fingerprints. To the best of our knowledge, this is the first time that PolyProtect has been evaluated for the identification scenario and for fingerprint biometrics. Our experimental results are promising, and we plan to release our code
Authors:Chitraksh Singh, Monisha Dhanraj, Ken Huang
Abstract:
The escalating complexity and volume of cyberattacks demand proactive detection strategies that go beyond traditional rule-based systems. This paper presents a phase-aware, multi-model machine learning framework that emulates adversarial behavior across the seven phases of the Cyber Kill Chain using the MITRE ATT&CK Enterprise dataset. Techniques are semantically mapped to phases via ATTACK-BERT, producing seven phase-specific datasets. We evaluate LightGBM, a custom Transformer encoder, fine-tuned BERT, and a Graph Neural Network (GNN), integrating their outputs through a weighted soft voting ensemble. Inter-phase dependencies are modeled using directed graphs to capture attacker movement from reconnaissance to objectives. The ensemble consistently achieved the highest scores, with F1-scores ranging from 97.47% to 99.83%, surpassing GNN performance (97.36% to 99.81%) by 0.03%--0.20% across phases. This graph-driven, ensemble-based approach enables interpretable attack path forecasting and strengthens proactive cyber defense.
Authors:Temesgen Kitaw Damenu, İnci Zaim Gökbay, Alexandra Covaci, Shujun Li
Abstract:
Educational games have been widely used to teach children about cyber security. This systematic literature review reveals evidence of positive learning outcomes, after analysing 91 such games reported in 68 papers published between 2010 and 2024. However, critical gaps have also been identified regarding the design processes and the methodological rigour, including lack of systematic design, misalignment between proposed and achieved learning outcomes, rare use of control groups, limited discussions on ethical considerations, and underutilisation of emerging technologies. We recommend multiple future research directions, e.g., a hybrid approach to game design and evaluation that combines bottom-up and top-down approaches.
Authors:Derek Lilienthal, Sanghyun Hong
Abstract:
Large Language Model (LLM)-enabled agents are rapidly emerging across a wide range of applications, but their deployment introduces vulnerabilities with security implications. While prior work has examined prompt-based attacks (e.g., prompt injection) and data-oriented threats (e.g., data exfiltration), time-of-check to time-of-use (TOCTOU) remain largely unexplored in this context. TOCTOU arises when an agent validates external state (e.g., a file or API response) that is later modified before use, enabling practical attacks such as malicious configuration swaps or payload injection. In this work, we present the first study of TOCTOU vulnerabilities in LLM-enabled agents. We introduce TOCTOU-Bench, a benchmark with 66 realistic user tasks designed to evaluate this class of vulnerabilities. As countermeasures, we adapt detection and mitigation techniques from systems security to this setting and propose prompt rewriting, state integrity monitoring, and tool-fusing. Our study highlights challenges unique to agentic workflows, where we achieve up to 25% detection accuracy using automated detection methods, a 3% decrease in vulnerable plan generation, and a 95% reduction in the attack window. When combining all three approaches, we reduce the TOCTOU vulnerabilities from an executed trajectory from 12% to 8%. Our findings open a new research direction at the intersection of AI safety and systems security.
Authors:Saksham Kumar, Rhythm Narang
Abstract:
The rise of Deepfake technology to generate hyper-realistic manipulated images and videos poses a significant challenge to the public and relevant authorities. This study presents a robust Deepfake detection based on a modified Vision Transformer(ViT) model, trained to distinguish between real and Deepfake images. The model has been trained on a subset of the OpenForensics Dataset with multiple augmentation techniques to increase robustness for diverse image manipulations. The class imbalance issues are handled by oversampling and a train-validation split of the dataset in a stratified manner. Performance is evaluated using the accuracy metric on the training and testing datasets, followed by a prediction score on a random image of people, irrespective of their realness. The model demonstrates state-of-the-art results on the test dataset to meticulously detect Deepfake images.
Authors:Yu Cheng, Xiaofang Qi, Yanhui Li
Abstract:
With the popularization of smartphones, red packets have been widely used in mobile apps. However, the issues of fraud associated with them have also become increasingly prominent. As reported in user reviews from mobile app markets, many users have complained about experiencing red packet fraud and being persistently troubled by fraudulent red packets. To uncover this phenomenon, we conduct the first investigation into an extensive collection of user reviews on apps with red packets. In this paper, we first propose a novel automated approach, ReckDetector, for effectively identifying apps with red packets from app markets. We then collect over 360,000 real user reviews from 334 apps with red packets available on Google Play and three popular alternative Android app markets. We preprocess the user reviews to extract those related to red packets and fine-tune a pre-trained BERT model to identify negative reviews. Finally, based on semantic analysis, we have summarized six distinct categories of red packet fraud issues reported by users. Through our study, we found that red packet fraud is highly prevalent, significantly impacting user experience and damaging the reputation of apps. Moreover, red packets have been widely exploited by unscrupulous app developers as a deceptive incentive mechanism to entice users into completing their designated tasks, thereby maximizing their profits.
Authors:Aparna Singh, Geetanjali Rathee, Chaker Abdelaziz Kerrache, Mohamed Chahine Ghanem
Abstract:
The very high growth of Intelligent Transportation Systems (ITS) has generated an urgent requirement for secure, effective, and context-aware data sharing mechanisms, especially over heterogeneous and geographically dispersed settings. This work suggests a new architecture that combines a relay chain-driven encryption system with a modified Ciphertext-Policy Attribute-Based Encryption (CP-ABE) scheme to tackle the double impediment of dynamic access and low-latency communication. The model proposes a context-aware smart contract on a worldwide relay chain that checks against data properties, including event type, time, and geographical region, to specify the suitable level of encryption policy. From such relay-directed judgment, On-Board Units (OBUs) encrypt data end-to-end by utilising CP-ABE and store ciphertext inside localised regional blockchains, preventing dependence on symmetric encryption or off-chain storage. High-sensitivity events are secured with firm, multi-attribute access rules, whereas common updates use light policies to help reduce processing burdens. The crypto system also adds traceability and low-latency revocation, with global enforcement managed through the relay chain. This distributed, scalable model provides a proper balance between responsiveness in real time and security and is extremely apt for next-gen vehicular networks that function across multi-jurisdictional domains.
Authors:Onur Alp Kirci, M. Emre Gursoy
Abstract:
Backdoor attacks pose a significant threat to the integrity of text classification models used in natural language processing. While several dirty-label attacks that achieve high attack success rates (ASR) have been proposed, clean-label attacks are inherently more difficult. In this paper, we propose three sample selection strategies to improve attack effectiveness in clean-label scenarios: Minimum, Above50, and Below50. Our strategies identify those samples which the model predicts incorrectly or with low confidence, and by injecting backdoor triggers into such samples, we aim to induce a stronger association between the trigger patterns and the attacker-desired target label. We apply our methods to clean-label variants of four canonical backdoor attacks (InsertSent, WordInj, StyleBkd, SynBkd) and evaluate them on three datasets (IMDB, SST2, HateSpeech) and four model types (LSTM, BERT, DistilBERT, RoBERTa). Results show that the proposed strategies, particularly the Minimum strategy, significantly improve the ASR over random sample selection with little or no degradation in the model's clean accuracy. Furthermore, clean-label attacks enhanced by our strategies outperform BITE, a state of the art clean-label attack method, in many configurations.
Authors:Matthew Sotoudeh, Zachary Yedidia
Abstract:
Software fault isolation (SFI) is a popular way to sandbox untrusted software. A key component of SFI is the verifier that checks the untrusted code is written in a subset of the machine language that guarantees it never reads or writes outside of a region of memory dedicated to the sandbox. Soundness bugs in the SFI verifier would break the SFI security model and allow the supposedly sandboxed code to read protected memory. In this paper, we address the concern of SFI verifier bugs by performing an automated formal verification of a recent SFI system called Lightweight Fault Isolation (LFI). In particular, we formally verify that programs accepted by the LFI verifier never read or write to memory outside of a designated sandbox region.
Authors:Haohui Zhang, Sirui Shen, Xinyu Hu, Chenglu Jin
Abstract:
Ransomware attacks have become a pervasive and costly form of cybercrime, causing tens of millions of dollars in losses as organizations increasingly pay ransoms to mitigate operational disruptions and financial risks. While prior research has largely focused on proactive defenses, the post-infection negotiation dynamics between attackers and victims remains underexplored. This paper presents a formal analysis of attacker-victim interactions in modern ransomware incidents using a finite-horizon alternating-offers bargaining game model. Our analysis demonstrates how bargaining alters the optimal strategies of both parties. In practice, incomplete information-attackers lacking knowledge of victims' data valuations and victims lacking knowledge of attackers' reservation ransoms-can prolong negotiations and increase victims' business interruption costs. To address this, we design a Bayesian incentive-compatible mechanism that facilitates rapid agreement on a fair ransom without requiring either party to disclose private valuations. We further implement this mechanism using secure two-party computation based on garbled circuits, thereby eliminating the need for trusted intermediaries and preserving the privacy of both parties throughout the negotiation. To the best of our knowledge, this is the first automated, privacy-preserving negotiation mechanism grounded in a formal analysis of ransomware negotiation dynamics.
Authors:Benjamin Murphy, Twm Stone
Abstract:
Advances in AI are widely understood to have implications for cybersecurity. Articles have emphasized the effect of AI on the cyber offense-defense balance, and commentators can be found arguing either that cyber will privilege attackers or defenders. For defenders, arguments are often made that AI will enable solutions like formal verification of all software--and for some well-equipped companies, this may be true. This conversation, however, does not match the reality for most companies. "Trailing-edge organizations," as we term them, rely heavily on legacy software, poorly staff security roles, and struggle to implement best practices like rapid deployment of security patches. These decisions may be the result of corporate inertia, but may also be the result of a seemingly-rational calculation that attackers may not bother targeting a firm due to lack of economic incentives, and as a result, underinvestment in defense will not be punished.
This approach to security may have been sufficient prior to the development of AI systems, but it is unlikely to remain viable in the near future. We argue that continuing improvements in AI's capabilities poses additional risks on two fronts: First, increased usage of AI will alter the economics of the marginal cyberattack and expose these trailing-edge organizations to more attackers, more frequently. Second, AI's advances will enable attackers to develop exploits and launch attacks earlier than they can today--meaning that it is insufficient for these companies to attain parity with today's leading defenders, but must instead aim for faster remediation timelines and more resilient software. The situation today portends a dramatically increased number of attacks in the near future. Moving forward, we offer a range of solutions for both organizations and governments to improve the defensive posture of firms which lag behind their peers today.
Authors:Henrietta Hegyi, Laszlo Erdodi
Abstract:
The rapid advancement of Internet-connected vehicle technologies has introduced a new era of smart mobility, while simultaneously raising significant cybersecurity and privacy concerns. This paper explores the evolving threat landscape associated with connected vehicles, focusing on risks such as unauthorized remote access and the potential leakage of personal data. To assess the current state of protection, we conducted a comprehensive analysis of 16 international standards and regulations, evaluating them from multiple perspectives including regulatory strength, technical specificity, treatment of supply chain risks, and approaches to personal data handling.
In parallel, we carried out a user-focused survey designed to map consumer attitudes toward smart cars. The survey investigated which types of vehicles users trust and prefer, the reasons behind rejecting certain car types, their awareness of data-related risks, and whether they feel adequately informed about how their vehicles handle data. By combining regulatory analysis with user perception insights, this study aims to contribute to a more holistic understanding of the challenges and expectations surrounding connected vehicle ecosystems.
Authors:Ehssan Mousavipour, Andrey Dimanchev, Majid Ghaderi
Abstract:
Distribution shift, a change in the statistical properties of data over time, poses a critical challenge for deep learning anomaly detection systems. Existing anomaly detection systems often struggle to adapt to these shifts. Specifically, systems based on supervised learning require costly manual labeling, while those based on unsupervised learning rely on clean data, which is difficult to obtain, for shift adaptation. Both of these requirements are challenging to meet in practice.
In this paper, we introduce NetSight, a framework for supervised anomaly detection in network data that continually detects and adapts to distribution shifts in an online manner. NetSight eliminates manual intervention through a novel pseudo-labeling technique and uses a knowledge distillation-based adaptation strategy to prevent catastrophic forgetting. Evaluated on three long-term network datasets, NetSight demonstrates superior adaptation performance compared to state-of-the-art methods that rely on manual labeling, achieving F1-score improvements of up to 11.72%. This proves its robustness and effectiveness in dynamic networks that experience distribution shifts over time.
Authors:Xuezheng Qin, Ruwei Huang, Xiaolong Tang, Feng Li
Abstract:
Federated Learning (FL) is increasingly adopted for privacy-preserving collaborative training, but its decentralized nature makes it particularly susceptible to backdoor attacks. Existing attack methods, however, often rely on idealized assumptions and fail to remain effective under real-world constraints, such as limited attacker control, non-IID data distributions, and the presence of diverse defense mechanisms. To address this gap, we propose DOPA (Divergent Optimization Path Attack), a novel framework that simulates heterogeneous local training dynamics and seeks consensus across divergent optimization trajectories to craft universally effective and stealthy backdoor triggers. By leveraging consistency signals across simulated paths to guide optimization, DOPA overcomes the challenge of heterogeneity-induced instability and achieves practical attack viability under stringent federated constraints. We validate DOPA on a comprehensive suite of 12 defense strategies, two model architectures (ResNet18/VGG16), two datasets (CIFAR-10/TinyImageNet), and both mild and extreme non-IID settings. Despite operating under a single-client, black-box, and sparsely participating threat model, DOPA consistently achieves high attack success, minimal accuracy degradation, low runtime, and long-term persistence. These results demonstrate a more practical attack paradigm, offering new perspectives for designing robust defense strategies in federated learning systems
Authors:Dikshant, Geetika Verma
Abstract:
Excessive and spurious alert generation by cloud security solutions is a root cause of analyst fatigue and operational inefficiencies. In this study, the long-standing issue of false positives from publicly accessible alerts in Amazon S3, as generated by a licensed cloud-native security solution, is examined. In a simulated production test environment, which consisted of over 1,000 Amazon S3 buckets with diverse access configurations, it was discovered that over 80\% of the alerts generated by default rules were classified as false positives, thus demonstrating the severity of the detection issue. This severely impacted detection accuracy and generated a heavier workload for analysts due to redundant manual triage efforts. For addressing this problem, custom detection logic was created as an exercise of the native rule customization capabilities of the solution. A unified titled ``S3 Public Access Validation and Data Exposure'' was created in an effort to consolidate different forms of alerts into one, context-aware logic that systematically scans ACL configurations, bucket policies, indicators of public exposure, and the presence of sensitive data, and then marks only those S3 buckets that indeed denote security risk and are publicly exposed on the internet with no authentication. The results demonstrate a significant reduction in false positives, more precise alert fidelity, and significant time saving for security analysts, thus demonstrating an actionable and reproducible solution to enhance the accuracy of security alerting in compliance-focused cloud environments.
Authors:Jonathan Passerat-Palmbach, Sarisht Wadhwa
Abstract:
Flashbots recently released mev-share to empower users with control over the amount of information they share with searchers for extracting Maximal Extractable Value (MEV). Searchers require more information to maintain on-chain exchange efficiency and profitability, while users aim to prevent frontrunning by withholding information. After analyzing two searching strategies in mev-share to reason about searching techniques, this paper introduces Differentially-Private (DP) aggregate hints as a new type of hints to disclose information quantitatively. DP aggregate hints enable users to formally quantify their privacy loss to searchers, and thus better estimate the level of rebates to ask in return. The paper discusses the current properties and privacy loss in mev-share and lays out how DP aggregate hints could enhance the system for both users and searchers. We leverage Differential Privacy in the Trusted Curator Model to design our aggregate hints. Additionally, we explain how random sampling can defend against sybil attacks and amplify overall user privacy while providing valuable hints to searchers for improved backrunning extraction and frontrunning prevention.
Authors:Zixin Rao, Youssef Mohamed, Shang Liu, Zeyan Liu
Abstract:
Large Language Models (LLMs), such as GPT-4 and Llama, have demonstrated remarkable abilities in generating natural language. However, they also pose security and integrity challenges. Existing countermeasures primarily focus on distinguishing AI-generated content from human-written text, with most solutions tailored for English. Meanwhile, authorship attribution--determining which specific LLM produced a given text--has received comparatively little attention despite its importance in forensic analysis. In this paper, we present DA-MTL, a multi-task learning framework that simultaneously addresses both text detection and authorship attribution. We evaluate DA-MTL on nine datasets and four backbone models, demonstrating its strong performance across multiple languages and LLM sources. Our framework captures each task's unique characteristics and shares insights between them, which boosts performance in both tasks. Additionally, we conduct a thorough analysis of cross-modal and cross-lingual patterns and assess the framework's robustness against adversarial obfuscation techniques. Our findings offer valuable insights into LLM behavior and the generalization of both detection and authorship attribution.
Authors:Nizheen A. Ali, Ramadhan J. Mstafa
Abstract:
With the widespread use of the internet, there is an increasing need to ensure the security and privacy of transmitted data. This has led to an intensified focus on the study of video steganography, which is a technique that hides data within a video cover to avoid detection. The effectiveness of any steganography method depends on its ability to embed data without altering the original video quality while maintaining high efficiency. This paper proposes a new method to video steganography, which involves utilizing a Genetic Algorithm (GA) for identifying the Region of Interest (ROI) in the cover video. The ROI is the area in the video that is the most suitable for data embedding. The secret data is encrypted using the Advanced Encryption Standard (AES), which is a widely accepted encryption standard, before being embedded into the cover video, utilizing up to 10% of the cover video. This process ensures the security and confidentiality of the embedded data. The performance metrics for assessing the proposed method are the Peak Signal to Noise Ratio (PSNR) and the encoding and decoding time. The results show that the proposed method has a high embedding capacity and efficiency, with a PSNR ranging between 64 and 75 dBs, which indicates that the embedded data is almost indistinguishable from the original video. Additionally, the method can encode and decode data quickly, making it efficient for real time applications.
Authors:Zain Ahmad, Musab Ahmad, Bilal Ahmad
Abstract:
DDoS attacks are one of the most prevalent and harmful cybersecurity threats faced by organizations and individuals today. In recent years, the complexity and frequency of DDoS attacks have increased significantly, making it challenging to detect and mitigate them effectively. The study analyzes various types of DDoS attacks, including volumetric, protocol, and application layer attacks, and discusses the characteristics, impact, and potential targets of each type. It also examines the existing techniques used for DDoS attack detection, such as packet filtering, intrusion detection systems, and machine learning-based approaches, and their strengths and limitations. Moreover, the study explores the prevention techniques employed to mitigate DDoS attacks, such as firewalls, rate limiting , CPP and ELD mechanism. It evaluates the effectiveness of each approach and its suitability for different types of attacks and environments. In conclusion, this study provides a comprehensive overview of the different types of DDoS attacks, their detection, and prevention techniques. It aims to provide insights and guidelines for organizations and individuals to enhance their cybersecurity posture and protect against DDoS attacks.
Authors:Zhuoran Li, Hanieh Totonchi Asl, Ebrahim Nouri, Yifei Cai, Danella Zhao
Abstract:
Secure Multi-Party Computation (MPC) offers a practical foundation for privacy-preserving machine learning at the edge, with MPC commonly employed to support nonlinear operations. These MPC protocols fundamentally rely on Oblivious Transfer (OT), particularly Correlated OT (COT), to generate correlated randomness essential for secure computation. Although COT generation is efficient in conventional two-party settings with resource-rich participants, it becomes a critical bottleneck in real-world inference on resource-constrained devices (e.g., IoT sensors and wearables), due to both communication latency and limited computational capacity. To enable real-time secure inference, we introduce Silentflow, a highly efficient Trusted Execution Environment (TEE)-assisted protocol that eliminates communication in COT generation. We tackle the core performance bottleneck-low computational intensity-through structured algorithmic decomposition: kernel fusion for parallelism, Blocked On-chip eXpansion (BOX) to improve memory access patterns, and vectorized batch operations to maximize memory bandwidth utilization. Through design space exploration, we balance end-to-end latency and resource demands, achieving up to 39.51x speedup over state-of-the-art protocols. By offloading COT computations to a Zynq-7000 SoC, SilentFlow accelerates PPMLaaS inference on the ImageNet dataset under resource constraints, achieving a 4.62x and 3.95x speedup over Cryptflow2 and Cheetah, respectively.
Authors:Vinod Khandkar, Kieron Ivy Turk, Ehsan Toreini, Nishanth Sastry
Abstract:
Rapidly changing social norms and national, legal, and political conditions socially constrain people from discussing sensitive topics such as sexuality or religion. Such constrained, vulnerable minorities are often worried about inadvertent information disclosure and may be unsure about the extent to which their communications are being monitored in public or semi-public spaces like workplaces or cafes. Personal devices extend trust to the digital domain, making it desirable to have strictly private communication between trusted devices. Currently, messaging services like WhatsApp provide alternative means for exchanging sensitive private information, while personal safety apps such as Noonlight enable private signaling. However, these rely on third-party mechanisms for secure and private communication, which may not be accessible for justifiable reasons, such as insecure internet access or companion device connections. In these cases, it is challenging to achieve communication that is strictly private between two devices instead of user accounts without any dependency on third-party infrastructure. The goal of this paper is to support private communications by setting up a shared secret between two or more devices without sending any data on the network. We develop a method to create a shared secret between phones by shaking them together. Each device extracts the shared randomness from the shake, then conditions the randomness to 7.798 bits per byte of key material. This paper proposes three different applications of this generated shared secret: message obfuscation, trust delegation, and encrypted beacons. We have implemented the message obfuscation on Android as an independent app that can be used for private communication with trusted contacts. We also present research on the usability, design considerations, and further integration of these tools in mainstream services.
Authors:Jaeung Lee, Suhyeon Yu, Yurim Jang, Simon S. Woo, Jaemin Jo
Abstract:
Machine Unlearning (MU) aims to remove target training data from a trained model so that the removed data no longer influences the model's behavior, fulfilling "right to be forgotten" obligations under data privacy laws. Yet, we observe that researchers in this rapidly emerging field face challenges in analyzing and understanding the behavior of different MU methods, especially in terms of three fundamental principles in MU: accuracy, efficiency, and privacy. Consequently, they often rely on aggregate metrics and ad-hoc evaluations, making it difficult to accurately assess the trade-offs between methods. To fill this gap, we introduce a visual analytics system, Unlearning Comparator, designed to facilitate the systematic evaluation of MU methods. Our system supports two important tasks in the evaluation process: model comparison and attack simulation. First, it allows the user to compare the behaviors of two models, such as a model generated by a certain method and a retrained baseline, at class-, instance-, and layer-levels to better understand the changes made after unlearning. Second, our system simulates membership inference attacks (MIAs) to evaluate the privacy of a method, where an attacker attempts to determine whether specific data samples were part of the original training set. We evaluate our system through a case study visually analyzing prominent MU methods and demonstrate that it helps the user not only understand model behaviors but also gain insights that can inform the improvement of MU methods.
Authors:Yasaman Samadi, Hai Dong, Xiaoyu Xia
Abstract:
Recent advancements in money laundering detection have demonstrated the potential of using graph neural networks to capture laundering patterns accurately. However, existing models are not explicitly designed to detect the diverse patterns of off-chain cryptocurrency money laundering. Neglecting any laundering pattern introduces critical detection gaps, as each pattern reflects unique transactional structures that facilitate the obfuscation of illicit fund origins and movements. Failure to account for these patterns may result in under-detection or omission of specific laundering activities, diminishing model accuracy and allowing schemes to bypass detection. To address this gap, we propose the MPOCryptoML model to effectively detect multiple laundering patterns in cryptocurrency transactions. MPOCryptoML includes the development of a multi-source Personalized PageRank algorithm to identify random laundering patterns. Additionally, we introduce two novel algorithms by analyzing the timestamp and weight of transactions in high-volume financial networks to detect various money laundering structures, including fan-in, fan-out, bipartite, gather-scatter, and stack patterns. We further examine correlations between these patterns using a logistic regression model. An anomaly score function integrates results from each module to rank accounts by anomaly score, systematically identifying high-risk accounts. Extensive experiments on public datasets including Elliptic++, Ethereum fraud detection, and Wormhole transaction datasets validate the efficacy and efficiency of MPOCryptoML. Results show consistent performance gains, with improvements up to 9.13% in precision, up to 10.16% in recall, up to 7.63% in F1-score, and up to 10.19% in accuracy.
Authors:Dikshant, Verma
Abstract:
Rule-based cloud security posture management (CSPM) solutions are known to produce a lot of false positives based on the limited contextual understanding and dependence on static heuristics testing. This paper introduces a validation-driven methodology that integrates active behavioral testing in cloud security posture management solution(s) to evaluate the exploitability of policy violations in real time. The proposed system employs lightweight and automated probes, built from open-source tools, validation scripts, and penetration testing test cases, to simulate adversarial attacks on misconfigured or vulnerable cloud assets without any impact to the cloud services or environment. For instance, cloud services may be flagged as publicly exposed and vulnerable despite being protected by access control layers, or secure policies, resulting in non-actionable alerts that consumes analysts time during manual validation. Through controlled experimentation in a reproducible AWS setup, we evaluated the reduction in false positive rates across various misconfiguration and vulnerable alerts. Our findings indicate an average reduction of 93\% in false positives. Furthermore, the framework demonstrates low latency performance. These results demonstrate a scalable method to improve detection accuracy and analyst productivity in large cloud environments. While our evaluation focuses on AWS, the architecture is modular and extensible to multi-cloud setups.
Authors:John Y. Kim, Chaoshun Zuo, Yanjie Zhao, Zhiqiang Lin
Abstract:
The rise of Virtual Reality (VR) has provided developers with an unprecedented platform for creating games and applications (apps) that require distinct inputs, different from those of conventional devices like smartphones. The Meta Quest VR platform, driven by Meta, has democratized VR app publishing and attracted millions of users worldwide. However, as the number of published apps grows, there is a notable lack of robust headless tools for user interface (UI) exploration and user event testing. To address this need, we present AUTOVR, an automatic framework for dynamic UI and user event interaction in VR apps built on the Unity Engine. Unlike conventional Android and GUI testers, AUTOVR analyzes the app's internal binary to reveal hidden events, resolves generative event dependencies, and utilizes them for comprehensive exploration of VR apps. Using sensitive data exposure as a performance metric, we compare AUTOVR with Android Monkey, a widely used headless Android GUI stress testing tool. Our empirical evaluation demonstrates AUTOVR's superior performance, triggering an order of magnitude of more sensitive data exposures and significantly enhancing the privacy of VR apps.
Authors:Ayman W. Baharia, Khaled T. Naga, Hesham S. Abdelfattah, Shady A. Maged, Sherif A. Hammad
Abstract:
The rapid evolution of smart grids requires effective communication protocols to transfer data reliably and securely. Controller Area Network (CAN) is one of the most recognized protocols that offer reliable data transmission in smart grids due to its robustness, real-time capabilities, and relatively low initial cost of its required hardware. However, as a smart city becomes more interconnected, it also becomes more vulnerable to cyber-attacks. As there are many mechanisms to secure the CAN nodes from attacks, most of those mechanisms have computational overhead, resulting in more delay in the network. We implemented a solution that requires almost no overhead to any CAN node connected to the network. It depends on a single node responsible for securing the CAN network. This approach seeks to augment network security while reducing security mechanisms overhead to all CAN network nodes. The methodology and comprehensive test results will be presented in detail during a subsequent discussion. The used software for development is Code Composer Studio, and the used microcontroller evaluation boards (EVB) are TM4C 1294.
Authors:Ming Li, John Hale
Abstract:
Attack graphs (AGs) are graphical tools to analyze the security of computer networks. By connecting the exploitation of individual vulnerabilities, AGs expose possible multi-step attacks against target networks, allowing system administrators to take preventive measures to enhance their network's security. As powerful analytical tools, however, AGs are both time- and memory-consuming to be generated. As the numbers of network assets, interconnections between devices, as well as vulnerabilities increase, the size and volume of the resulting AGs grow at a much higher rate, leading to the well-known state-space explosion. In this paper, we propose the use of high performance computing (HPC) clusters to implement AG generators. We evaluate the performance through experiments and provide insights into how cluster environments can help resolve the issues of slow speed and high memory demands in AG generation in a balanced way.
Authors:Mohammad Ishzaz Asif Rafid, Morsalin Sakib
Abstract:
Bitcoin's Proof of Work (PoW) mechanism, while central to achieving decentralized consensus, has long been criticized for excessive energy use and hardware inefficiencies \cite{devries2018bitcoin, truby2018decarbonizing}. This paper introduces a hybrid architecture that replaces Bitcoin's traditional PoW with a centralized, cloud-based collaborative training framework. In this model, miners contribute computing resources to train segments of horizontally scaled machine learning models on preprocessed datasets, ensuring privacy and generating meaningful outputs \cite{li2017securing}. A central server evaluates contributions using two metrics: number of parameters trained and reduction in model loss during each cycle. At the end of every cycle, a weighted lottery selects the winning miner, who receives a digitally signed certificate. This certificate serves as a verifiable substitute for PoW and grants the right to append a block to the blockchain \cite{nakamoto2008bitcoin}. By integrating digital signatures and SHA-256 hashing \cite{nist2015sha}, the system preserves blockchain integrity while redirecting energy toward productive computation. The proposed approach addresses the sustainability concerns of traditional mining by converting resource expenditure into socially valuable work, aligning security incentives with real-world computational progress.
Authors:James Gu, Ahmed Sartaj, Mohammed Akram Taher Khan, Rashid Hussain Khokhar
Abstract:
This study focuses on the creation and implementation of ransomware for educational purposes that leverages Python's native cryptographic APIs in a controlled environment. Additionally, an Android version of the framework is implemented using Flutter and Dart. For both versions, open-source cryptographic libraries are utilized. With this framework, researchers can systematically explore the functionalities of ransomware, including file encryption processes, cryptographic key management, and victim interaction dynamics. To ensure safe experimentation, multiple safeguards are incorporated, such as the ability to restrict the encryption process to a specific directory, providing the RSA private key for immediate decryption, and narrowing the scope of targetable files to a carefully curated list (.txt, .jpg, .csv, .doc). This paper draws inspiration from the infamous WannaCry ransomware and aims to simulate its behaviour on Android devices. By making the codebase open-source, it enables users to study, modify, and extend the program for pedagogical purposes and offers a hands-on tool that can be used to train the next generation of cybersecurity professionals.
Authors:Lien Tran, Boyuan Zhang, Ratchanon Pawanja, Rashid Hussain Khokhar
Abstract:
With the rise of sophisticated authentication bypass techniques, passwords are no longer considered a reliable method for securing authentication systems. In recent years, new authentication technologies have shifted from traditional password-based logins to passwordless security. Among these, Time-Based One-Time Passwords (TOTP) remain one of the most widely used mechanisms, while Passkeys are emerging as a promising alternative with growing adoption. This paper highlights the key techniques used during the implementation of the authentication system with Passkey technology. It also suggests considerations for integrating components during system development to ensure that users can securely access their accounts with minimal complexity, while still meeting the requirements of a robust authentication system that balances security, usability, and performance. Additionally, by examining TOTP and Passkey mechanisms from an implementation perspective, this work not only addresses major security concerns such as password leaks, phishing attacks, and susceptibility to brute-force attacks, but also evaluates the feasibility and effectiveness of these mechanisms in real-world implementations. This paper demonstrates the superior security of Passkey technology and its potential for broader adoption in secure authentication systems.
Authors:Mukesh Poudel, Nick Rahimi
Abstract:
Cryptographic algorithms like AES and RSA are widely used and they are mathematically robust and almost unbreakable but its implementation on physical devices often leak information through side channels, such as electromagnetic (EM) emissions, potentially compromising said theoretically secure algorithms. This paper investigates the application of machine learning (ML) techniques and Deep Learning models to exploit such leakage for partial key recovery. We use the public ASCAD `fixed' and `variable' key dataset, containing 700 and 1400 EM traces respectively from an AES-128 implementation on an 8-bit microcontroller. The problem is framed as a 256-class classification task where we target the output of the first-round S-box operation, which is dependent on a single key byte. We evaluate standard classifiers (Random Forest (RF), Support Vector Machine (SVM)), a Convolutional Neural Network(CNN) and a Residual Neural Network(ResNet). We also explore the utility of RF-based feature importance for dimensionality reduction. Crucially, we employ this domain-specific Key Rank metric for evaluation, showing its necessity over standard classification accuracy. Our results show that SVM and RF on full features perform poorly in key ranking. However, RF trained on reduced (top 100) identified via importance analysis achieves Rank 0 (successful key byte recovery) using almost half the attack traces. The implemented CNN also achieves Rank 0 efficiently using approximately 65 attack traces for the fixed-key dataset. The ResNets perform best on large and complex datasets but may not always be the best choice for simple fixed key dataset in terms of efficiency. Thus we conclude that models, particularly CNNs, ResNets and feature-selected RF, coupled with the Key Rank metric, are an effective tool for side-channel key recovery, confirming the practical vulnerability of the cryptographic implementations.
Authors:Tyler Schroder, Sohee Kim Park
Abstract:
The advancement of computing equipment and the advances in services over the Internet has allowed corporations, higher education, and many other organizations to pursue the shared computing network environment. A requirement for shared computing environments is a centralized identity system to authenticate and authorize user access. An organization's digital identity plane is a prime target for cyber threat actors. When compromised, identities can be exploited to steal credentials, create unauthorized accounts, and manipulate permissions-enabling attackers to gain control of the network and undermine its confidentiality, availability, and integrity. Cybercrime losses reached a record of 16.6 B in the United States in 2024. For organizations using Microsoft software, Active Directory is the on-premises identity system of choice. In this article, we examine the challenge of security compromises in Active Directory (AD) environments and present effective strategies to prevent credential theft and limit lateral movement by threat actors. Our proposed approaches aim to confine the movement of compromised credentials, preventing significant privilege escalation and theft. We argue that through our illustration of real-world scenarios, tiering can halt lateral movement and advanced cyber-attacks, thus reducing ransom escalation. Our work bridges a gap in existing literature by combining technical guidelines with theoretical arguments in support of tiering, positioning it as a vital component of modern cybersecurity strategy even though it cannot function in isolation. As the hardware advances and the cloud sourced services along with AI is advancing with unprecedented speed, we think it is important for security experts and the business to work together and start designing and developing software and frameworks to classify devices automatically and accurately within the tiered structure.
Authors:Calkin Garg, Omar Rios Cruz, Tessa Andersen, Gaby G. Dagher, Donald Winiecki, Min Long
Abstract:
Due to HIPAA and other privacy regulations, it is imperative to maintain patient privacy while conducting research on patient health records. In this paper, we propose AegisBlock, a patient-centric access controlled framework to share medical records with researchers such that the anonymity of the patient is maintained while ensuring the trustworthiness of the data provided to researchers. AegisBlock allows for patients to provide access to their medical data, verified by miners. A researcher submits a time-based range query to request access to records from a certain patient, and upon patient approval, access will be granted. Our experimental evaluation results show that AegisBlock is scalable with respect to the number of patients and hospitals in the system, and efficient with up to 50% of malicious miners.
Authors:Hael Abdulhakim Ali Humran, Ferdi Sonmez
Abstract:
Security vulnerabilities present in a code that has been written in diverse programming languages are among the most critical yet complicated aspects of source code to detect. Static analysis tools based on rule-based patterns usually do not work well at detecting the context-dependent bugs and lead to high false positive rates. Recent developments in artificial intelligence, specifically the use of transformer-based models like CodeBERT and CodeLlama, provide light to this problem, as they show potential in finding such flaws better. This paper presents the implementations of these models on various datasets of code vulnerability, showing how off-the-shelf models can successfully produce predictive capacity in models through dynamic fine-tuning of the models on vulnerable and safe code fragments. The methodology comprises the gathering of the dataset, normalization of the language, fine-tuning of the model, and incorporation of ensemble learning and explainable AI. Experiments show that a well-trained CodeBERT can be as good as or even better than some existing static analyzers in terms of accuracy greater than 97%. Further study has indicated that although language models can achieve close-to-perfect recall, the precision can decrease. A solution to this is given by hybrid models and validation procedures, which will reduce false positives. According to the results, the AI-based solutions generalize to different programming languages and classes of vulnerability. Nevertheless, robustness, interpretability, and deployment readiness are still being developed. The results illustrate the probabilities that AI will enhance the trustworthiness in the usability and scalability of machine-learning-based detectors of vulnerabilities.
Authors:Zhihao Li, Zimo Ji, Tao Zheng, Hao Ren, Xiao Lan
Abstract:
Cryptographic algorithms are fundamental to modern security, yet their implementations frequently harbor subtle logic flaws that are hard to detect. We introduce CryptoScope, a novel framework for automated cryptographic vulnerability detection powered by Large Language Models (LLMs). CryptoScope combines Chain-of-Thought (CoT) prompting with Retrieval-Augmented Generation (RAG), guided by a curated cryptographic knowledge base containing over 12,000 entries. We evaluate CryptoScope on LLM-CLVA, a benchmark of 92 cases primarily derived from real-world CVE vulnerabilities, complemented by cryptographic challenges from major Capture The Flag (CTF) competitions and synthetic examples across 11 programming languages. CryptoScope consistently improves performance over strong LLM baselines, boosting DeepSeek-V3 by 11.62%, GPT-4o-mini by 20.28%, and GLM-4-Flash by 28.69%. Additionally, it identifies 9 previously undisclosed flaws in widely used open-source cryptographic projects.
Authors:Nathaniel Moyer, Charalampos Papamanthou, Evgenios Kornaropoulos
Abstract:
Searchable encryption (SE) is the most scalable cryptographic primitive for searching on encrypted data. Typical SE constructions often allow access-pattern leakage, revealing which encrypted records are retrieved in the server's responses. All the known generic cryptanalyses assume either that the queries are issued uniformly at random or that the attacker observes the search-pattern leakage. It remains unclear what can be reconstructed when using only the access-pattern leakage and knowledge of the query distribution. In this work, we focus on the cryptanalytic technique of frequency analysis in the context of leakage-abuse attacks on schemes that support encrypted range queries. Frequency analysis matches the frequency of retrieval of an encrypted record with a plaintext value based on its probability of retrieval that follows from the knowledge of the query distribution. We generalize this underexplored cryptanalytic technique and introduce a generic attack framework called Leakage-Abuse via Matching (LAMA) that works even on high-dimensional encrypted data. We identify a parameterization of LAMA that brings frequency analysis to its limit -- that is, we prove that there is no additional frequency matching that an attacker can perform to refine the result. Furthermore, we show that our results hold for any class of convex queries, and not just axis-aligned rectangles, which is the assumption in all other attacks on range schemes. Using these results, we identify query distributions that make frequency analysis challenging for the attacker and, thus, can act as a mitigation mechanism. Finally, we implement and benchmark LAMA and reconstruct, for the first time, plaintext data from encrypted range queries spanning up to four dimensions.
Authors:Kevin McNamara, Rhea Pritham Marpu
Abstract:
The global financial system stands at an inflection point. Stablecoins represent the most significant evolution in banking since the abandonment of the gold standard, positioned to enable "Banking 2.0" by seamlessly integrating cryptocurrency innovation with traditional finance infrastructure. This transformation rivals artificial intelligence as the next major disruptor in the financial sector. Modern fiat currencies derive value entirely from institutional trust rather than physical backing, creating vulnerabilities that stablecoins address through enhanced stability, reduced fraud risk, and unified global transactions that transcend national boundaries. Recent developments demonstrate accelerating institutional adoption: landmark U.S. legislation including the GENIUS Act of 2025, strategic industry pivots from major players like JPMorgan's crypto-backed loan initiatives, and PayPal's comprehensive "Pay with Crypto" service. Widespread stablecoin implementation addresses critical macroeconomic imbalances, particularly the inflation-productivity gap plaguing modern monetary systems, through more robust and diversified backing mechanisms. Furthermore, stablecoins facilitate deregulation and efficiency gains, paving the way for a more interconnected international financial system. This whitepaper comprehensively explores how stablecoins are poised to reshape banking, supported by real-world examples, current market data, and analysis of their transformative potential.
Authors:Katarzyna Filus, Jorge M. Cruz-Duarte
Abstract:
In targeted adversarial attacks on vision models, the selection of the target label is a critical yet often overlooked determinant of attack success. This target label corresponds to the class that the attacker aims to force the model to predict. Now, existing strategies typically rely on randomness, model predictions, or static semantic resources, limiting interpretability, reproducibility, or flexibility. This paper then proposes a semantics-guided framework for adversarial target selection using the cross-modal knowledge transfer from pretrained language and vision-language models. We evaluate several state-of-the-art models (BERT, TinyLLAMA, and CLIP) as similarity sources to select the most and least semantically related labels with respect to the ground truth, forming best- and worst-case adversarial scenarios. Our experiments on three vision models and five attack methods reveal that these models consistently render practical adversarial targets and surpass static lexical databases, such as WordNet, particularly for distant class relationships. We also observe that static testing of target labels offers a preliminary assessment of the effectiveness of similarity sources, \textit{a priori} testing. Our results corroborate the suitability of pretrained models for constructing interpretable, standardized, and scalable adversarial benchmarks across architectures and datasets.
Authors:Georgios Michail Makrakis, Jeroen Pijpker, Remco Hassing, Rob Loves, Stephen McCombie
Abstract:
Cyber threats against the maritime industry have increased notably in recent years, highlighting the need for innovative cybersecurity approaches. Ships, as critical assets, possess highly specialized and interconnected network infrastructures, where their legacy systems and operational constraints further exacerbate their vulnerability to cyberattacks. To better understand this evolving threat landscape, we propose the use of cyber-deception techniques and in particular honeynets, as a means to gather valuable insights into ongoing attack campaigns targeting the maritime sector.
In this paper we present Salty Seagull, a honeynet conceived to simulate a VSAT system for ships. This environment mimics the operations of a functional VSAT system onboard and, at the same time, enables a user to interact with it through a Web dashboard and a CLI environment. Furthermore, based on existing vulnerabilities, we purposefully integrate them into our system to increase attacker engagement. We exposed our honeynet for 30 days to the Internet to assess its capability and measured the received interaction. Results show that while numerous generic attacks have been attempted, only one curious attacker with knowledge of the nature of the system and its vulnerabilities managed to access it, without however exploring its full potential.
Authors:Asra Ali, Jaeho Choi, Bryant Gipson, Shruthi Gorantala, Jeremy Kun, Wouter Legiest, Lawrence Lim, Alexander Viand, Meron Zerihun Demissie, Hongren Zheng
Abstract:
This work presents Homomorphic Encryption Intermediate Representation (HEIR), a unified approach to building homomorphic encryption (HE) compilers. HEIR aims to support all mainstream techniques in homomorphic encryption, integrate with all major software libraries and hardware accelerators, and advance the field by providing a platform for research and benchmarking. Built on the MLIR compiler framework, HEIR introduces HE-specific abstraction layers at which existing optimizations and new research ideas may be easily implemented. Although many HE optimization techniques have been proposed, it remains difficult to combine or compare them effectively. HEIR provides a means to effectively explore the space of HE optimizations. HEIR addresses the entire HE stack and includes support for various frontends, including Python. The contribution of this work includes: (1) We introduce HEIR as a framework for building HE compilers. (2) We validate HEIR's design by porting a large fraction of the HE literature to HEIR, and we argue that HEIR can tackle more complicated and diverse programs than prior literature. (3) We provide evidence that HEIR is emerging as the de facto HE compiler for academic research and industry development.
Authors:Sam Chauhan, Estelle Duguet, Karthik Ramakrishnan, Hugh Van Deventer, Jack Kruger, Ranjan Subbaraman
Abstract:
Post hoc explanation methods, such as LIME and SHAP, provide interpretable insights into black-box classifiers and are increasingly used to assess model biases and generalizability. However, these methods are vulnerable to adversarial manipulation, potentially concealing harmful biases. Building on the work of Slack et al. (2020), we investigate the susceptibility of LIME and SHAP to biased models and evaluate strategies for improving robustness. We first replicate the original COMPAS experiment to validate prior findings and establish a baseline. We then introduce a modular testing framework enabling systematic evaluation of augmented and ensemble explanation approaches across classifiers of varying performance. Using this framework, we assess multiple LIME/SHAP ensemble configurations on out-of-distribution models, comparing their resistance to bias concealment against the original methods. Our results identify configurations that substantially improve bias detection, highlighting their potential for enhancing transparency in the deployment of high-stakes machine learning systems.
Authors:Johanna Düngler, Amartya Sanyal
Abstract:
Given $n$ i.i.d. random matrices $A_i \in \mathbb{R}^{d \times d}$ that share a common expectation $Σ$, the objective of Differentially Private Stochastic PCA is to identify a subspace of dimension $k$ that captures the largest variance directions of $Σ$, while preserving differential privacy (DP) of each individual $A_i$. Existing methods either (i) require the sample size $n$ to scale super-linearly with dimension $d$, even under Gaussian assumptions on the $A_i$, or (ii) introduce excessive noise for DP even when the intrinsic randomness within $A_i$ is small. Liu et al. (2022a) addressed these issues for sub-Gaussian data but only for estimating the top eigenvector ($k=1$) using their algorithm DP-PCA. We propose the first algorithm capable of estimating the top $k$ eigenvectors for arbitrary $k \leq d$, whilst overcoming both limitations above. For $k=1$ our algorithm matches the utility guarantees of DP-PCA, achieving near-optimal statistical error even when $n = \tilde{\!O}(d)$. We further provide a lower bound for general $k > 1$, matching our upper bound up to a factor of $k$, and experimentally demonstrate the advantages of our algorithm over comparable baselines.
Authors:Hugo Delavenne, Louise Lallemand
Abstract:
Interactive Oracle Proofs of Proximity (IOPP) are at the heart of code-based SNARKs, a family of zeroknowledge protocols. The first and most famous one is the FRI protocol [BBHR18a], that efficiently tests proximity to Reed-Solomon codes. This paper generalizes the flowering IOPP introduced in [DMR25] for some specific (2, n)-regular Tanner codes to a much broader variety of codes: any code with symbols indexed on the edges of a Cayley graph. The flowering protocol of [DMR25] had a soundness parameter much lower than the FRI protocol [BCI + 23], and complexity parameters that could compete with the FRI [BBHR18a]. The lower soundness and the absence of restriction on the base field may lead to other practical speedups, however the codes considered in [DMR25] have an o(1) minimum distance. The generalization proposed in this paper preserves the soundness parameter with a slight decrease of the complexity parameters, while allowing being applied on codes with constant rate and constant minimum distance thanks to the good expansion properties of some families of Cayley graphs.
Authors:Bernhard Kauer, Aleksandr Petrosyan, Benjamin Livshits
Abstract:
The fundamental basis for maintaining integrity within contemporary blockchain systems is provided by authenticated databases. Our analysis indicates that a significant portion of the approaches applied in this domain fail to sufficiently meet the stringent requirements of systems processing transactions at rates of multi-million TPS. AlDBaran signifies a substantial advancement in authenticated databases. By eliminating disk I/O operations from the critical path, implementing prefetching strategies, and refining the update mechanism of the Merkle tree, we have engineered an authenticated data structure capable of handling state updates efficiently at a network throughput of 50 Gbps. This throughput capacity significantly surpasses any empirically documented blockchain throughput, guaranteeing the ability of even the most high-throughput blockchains to generate state commitments effectively.
AlDBaran provides support for historical state proofs, which facilitates a wide array of novel applications. For instance, the deployment of AlDBaran could enable blockchains that do not currently support state commitments to offer functionalities for light clients and/or implement rollups.
When benchmarked against alternative authenticated data structure projects, AlDBaran exhibits superior performance and simplicity. In particular, AlDBaran achieves speeds of approximately 48 million updates per second using an identical machine configuration. This characteristic renders AlDBaran an attractive solution for resource-limited environments, as its historical data capabilities can be modularly isolated (and deactivated), which further enhances performance. On consumer-level portable hardware, it achieves approximately 8 million updates/s in an in-memory setting and 5 million updates/s with snapshots at sub-second intervals, illustrating compelling and cost-effective scalability.
Authors:Yi Dong, Yusuke Muraoka, Scott Shi, Yi Zhang
Abstract:
We present MM-Food-100K, a public 100,000-sample multimodal food intelligence dataset with verifiable provenance. It is a curated approximately 10% open subset of an original 1.2 million, quality-accepted corpus of food images annotated for a wide range of information (such as dish name, region of creation). The corpus was collected over six weeks from over 87,000 contributors using the Codatta contribution model, which combines community sourcing with configurable AI-assisted quality checks; each submission is linked to a wallet address in a secure off-chain ledger for traceability, with a full on-chain protocol on the roadmap. We describe the schema, pipeline, and QA, and validate utility by fine-tuning large vision-language models (ChatGPT 5, ChatGPT OSS, Qwen-Max) on image-based nutrition prediction. Fine-tuning yields consistent gains over out-of-box baselines across standard metrics; we report results primarily on the MM-Food-100K subset. We release MM-Food-100K for publicly free access and retain approximately 90% for potential commercial access with revenue sharing to contributors.
Authors:Varsha Sen, Biswash Basnet
Abstract:
False Data Injection Attacks (FDIAs) pose a significant threat to smart grid infrastructures, particularly Home Area Networks (HANs), where real-time monitoring and control are highly adopted. Owing to the comparatively less stringent security controls and widespread availability of HANs, attackers view them as an attractive entry point to manipulate aggregated demand patterns, which can ultimately propagate and disrupt broader grid operations. These attacks undermine the integrity of smart meter data, enabling malicious actors to manipulate consumption values without activating conventional alarms, thereby creating serious vulnerabilities across both residential and utility-scale infrastructures. This paper presents a machine learning-based framework for both the detection and classification of FDIAs using residential energy data. A real-time detection is provided by the lightweight Artificial Neural Network (ANN), which works by using the most vital features of energy consumption, cost, and time context. For the classification of different attack types, a Bidirectional LSTM is trained to recognize normal, trapezoidal, and sigmoid attack shapes through learning sequential dependencies in the data. A synthetic time-series dataset was generated to emulate realistic household behaviour. Experimental results demonstrate that the proposed models are effective in identifying and classifying FDIAs, offering a scalable solution for enhancing grid resilience at the edge. This work contributes toward building intelligent, data-driven defence mechanisms that strengthen smart grid cybersecurity from residential endpoints.
Authors:Ehab ElSalamouny, Catuscia Palamidessi
Abstract:
For many social, scientific, and commercial purposes, it is often important to estimate the distribution of the users' data regarding a sensitive attribute, e.g., their ages, locations, etc. To allow this estimation while protecting the users' privacy, every user applies a local privacy protection mechanism that releases a noisy (sanitized) version of their original datum to the data collector; then the original distribution is estimated using one of the known methods, such as the matrix inversion (INV), RAPPOR's estimator, and the iterative Bayesian update (IBU). Unlike the other estimators, the consistency of IBU, i.e., the convergence of its estimate to the real distribution as the amount of noisy data grows, has been either ignored or incorrectly proved in the literature. In this article, we use the fact that IBU is a maximum likelihood estimator to prove that IBU is consistent. We also show, through experiments on real datasets, that IBU significantly outperforms the other methods when the users' data are sanitized by geometric, Laplace, and exponential mechanisms, whereas it is comparable to the other methods in the case of the k-RR and RAPPOR mechanisms. Finally, we consider the case when the alphabet of the sensitive data is infinite, and we show a technique that allows IBU to operate in this case too.
Authors:Damiano Abram, Giulio Malavolta, Lawrence Roy
Abstract:
We propose the notion of succinct oblivious tensor evaluation (OTE), where two parties compute an additive secret sharing of a tensor product of two vectors $\mathbf{x} \otimes \mathbf{y}$, exchanging two simultaneous messages. Crucially, the size of both messages and of the CRS is independent of the dimension of $\mathbf{x}$.
We present a construction of OTE with optimal complexity from the standard learning with errors (LWE) problem. Then we show how this new technical tool enables a host of cryptographic primitives, all with security reducible to LWE, such as:
* Adaptively secure laconic function evaluation for depth-$D$ functions $f:\{0, 1\}^m\rightarrow\{0, 1\}^\ell$ with communication $m+\ell+D\cdot \mathrm{poly}(λ)$.
* A trapdoor hash function for all functions.
* An (optimally) succinct homomorphic secret sharing for all functions.
* A rate-$1/2$ laconic oblivious transfer for batch messages, which is best possible.
In particular, we obtain the first laconic function evaluation scheme that is adaptively secure from the standard LWE assumption, improving upon Quach, Wee, and Wichs (FOCS 2018).
As a key technical ingredient, we introduce a new notion of \emph{adaptive lattice encodings}, which may be of independent interest.
Authors:Minghao Liu, Chia-Hsuan Lu, Marta Kwiatkowska
Abstract:
Graph neural networks (GNNs) are increasingly employed in high-stakes applications, such as fraud detection or healthcare, but are susceptible to adversarial attacks. A number of techniques have been proposed to provide adversarial robustness guarantees, but support for commonly used aggregation functions in message-passing GNNs is still lacking. In this paper, we develop an exact (sound and complete) verification method for GNNs to compute guarantees against attribute and structural perturbations that involve edge addition or deletion, subject to budget constraints. Focusing on node classification tasks, our method employs constraint solving with bound tightening, and iteratively solves a sequence of relaxed constraint satisfaction problems while relying on incremental solving capabilities of solvers to improve efficiency. We implement GNNev, a versatile solver for message-passing neural networks, which supports three aggregation functions, sum, max and mean, with the latter two considered here for the first time. Extensive experimental evaluation of GNNev on two standard benchmarks (Cora and CiteSeer) and two real-world fraud datasets (Amazon and Yelp) demonstrates its usability and effectiveness, as well as superior performance compared to existing {exact verification} tools on sum-aggregated node classification tasks.
Authors:Rilwan Umar, Aydin Abadi, Basil Aldali, Benito Vincent, Elliot A. J. Hurley, Hotoon Aljazaeri, Jamie Hedley-Cook, Jamie-Lee Bell, Lambert Uwuigbusun, Mujeeb Ahmed, Shishir Nagaraja, Suleiman Sabo, Weaam Alrbeiqi
Abstract:
Weather forecasting plays a vital role in disaster preparedness, agriculture, and resource management, yet current centralized forecasting systems are increasingly strained by security vulnerabilities, limited scalability, and susceptibility to single points of failure. To address these challenges, we propose a decentralized weather forecasting framework that integrates Federated Learning (FL) with blockchain technology. FL enables collaborative model training without exposing sensitive local data; this approach enhances privacy and reduces data transfer overhead. Meanwhile, the Ethereum blockchain ensures transparent and dependable verification of model updates. To further enhance the system's security, we introduce a reputation-based voting mechanism that assesses the trustworthiness of submitted models while utilizing the Interplanetary File System (IPFS) for efficient off-chain storage. Experimental results demonstrate that our approach not only improves forecasting accuracy but also enhances system resilience and scalability, making it a viable candidate for deployment in real-world, security-critical environments.
Authors:Syed Irtiza Maksud, Subhash Lakshminarayana
Abstract:
The growing digitalization and the rapid adoption of high-powered Internet-of-Things (IoT)-enabled devices (e.g., EV charging stations) have increased the vulnerability of power grids to cyber threats. In particular, the so-called Load Altering Attacks (LAAs) can trigger rapid frequency fluctuations and potentially destabilize the power grid. This paper aims to bridge the gap between academic research and practical application by using open-source datasets released by grid operators. It investigates various LAA scenarios on a real-world transmission network, namely the Great Britain (GB)-36 Zone model released by the UK's National Electricity System Operator (NESO). It evaluates the threshold of LAA severity that the grid can tolerate before triggering cascading effects. Additionally, it explores how Battery Energy Storage Systems (BESS) based fast frequency response services can mitigate or prevent such impacts. Simulations are conducted using DIgSILENT PowerFactory to ensure realistic system representation. The analysis provides several useful insights to grid operators on the LAA impact, such as the influence of the relative locations of BESS and LAA, as well as how delays in attack execution can influence the overall system response.
Authors:Ikram Messadi, Giulia Cervia, Vincent Itier
Abstract:
As digital data transmission continues to scale, concerns about privacy grow increasingly urgent - yet privacy remains a socially constructed and ambiguously defined concept, lacking a universally accepted quantitative measure. This work examines information leakage in image data, a domain where information-theoretic guarantees are still underexplored. At the intersection of deep learning, information theory, and cryptography, we investigate the use of mutual information (MI) estimators - in particular, the empirical estimator and the MINE framework - to detect leakage from selectively encrypted images. Motivated by the intuition that a robust estimator would require a probabilistic frameworks that can capture spatial dependencies and residual structures, even within encrypted representations - our work represent a promising direction for image information leakage estimation.
Authors:Paritosh Ramanan, H. M. Mohaimanul Islam, Abhiram Reddy Alugula
Abstract:
Industrial control systems are a fundamental component of critical infrastructure networks (CIN) such as gas, water and power. With the growing risk of cyberattacks, regulatory compliance requirements are also increasing for large scale critical infrastructure systems comprising multiple utility stakeholders. The primary goal of regulators is to ensure overall system stability with recourse to trustworthy stakeholder attack detection. However, adhering to compliance requirements requires stakeholders to also disclose sensor and control data to regulators raising privacy concerns. In this paper, we present a cyberattack detection framework that utilizes differentially private (DP) hypothesis tests geared towards enhancing regulatory confidence while alleviating privacy concerns of CIN stakeholders. The hallmark of our approach is a two phase privacy scheme that protects the privacy of covariance, as well as the associated sensor driven test statistics computed as a means to generate alarms. Theoretically, we show that our method induces a misclassification error rate comparable to the non-DP cases while delivering robust privacy guarantees. With the help of real-world datasets, we show the reliability of our DP-detection outcomes for a wide variety of attack scenarios for interdependent stakeholders.
Authors:Jiayao Wang, Yang Song, Zhendong Zhao, Jiale Zhang, Qilin Wu, Junwu Zhu, Dongfang Zhao
Abstract:
Federated self-supervised learning (FSSL) combines the advantages of decentralized modeling and unlabeled representation learning, serving as a cutting-edge paradigm with strong potential for scalability and privacy preservation. Although FSSL has garnered increasing attention, research indicates that it remains vulnerable to backdoor attacks. Existing methods generally rely on visually obvious triggers, which makes it difficult to meet the requirements for stealth and practicality in real-world deployment. In this paper, we propose an imperceptible and effective backdoor attack method against FSSL, called IPBA. Our empirical study reveals that existing imperceptible triggers face a series of challenges in FSSL, particularly limited transferability, feature entanglement with augmented samples, and out-of-distribution properties. These issues collectively undermine the effectiveness and stealthiness of traditional backdoor attacks in FSSL. To overcome these challenges, IPBA decouples the feature distributions of backdoor and augmented samples, and introduces Sliced-Wasserstein distance to mitigate the out-of-distribution properties of backdoor samples, thereby optimizing the trigger generation process. Our experimental results on several FSSL scenarios and datasets show that IPBA significantly outperforms existing backdoor attack methods in performance and exhibits strong robustness under various defense mechanisms.
Authors:Mohsin Khan, Dag Johansen, HÃ¥vard Dagenborg
Abstract:
Lightweight hash functions have become important building blocks for security in embedded and IoT systems. A plethora of algorithms have been proposed and standardized, providing a wide range of performance trade-off options for developers to choose from. This paper presents a comparative analysis of 22 key software-based lightweight hash functions, including the finalist from the SHA-3 competition. We use a novel benchmark methodology that combines an AVR ATXMega128 microcontroller with the ChipWhisperer cryptanalysis platform and evaluate and compare the various hash functions along several dimensions, including execution speed, % measured in Cycles per Byte (CpB), memory footprint, and energy consumption. Using the composite E-RANK metric, we provide new insight into the various trade-offs each hash function offers to system developers.
Authors:Hoang-Long Pham, Duy-Hieu Bui, Xuan-Tu Tran, Orazio Aiello
Abstract:
Batteryless energy harvesting IoT sensor nodes such as beat sensors can be deployed in millions without the need to replace batteries. They are ultra-low-power and cost-effective wireless sensor nodes without the maintenance cost and can work for 24 hours/365 days. However, they were not equipped with security mechanisms to protect user data. Data encryption and authentication can be used to secure beat sensor applications, but generating a secure cryptographic key is challenging. In this paper, we proposed an SRAM-based Physically Unclonable Function (PUF) combining a high-reliability bit selection algorithm with a lightweight error-correcting code to generate reliable secure keys for data encryption. The system employs a feature of beat sensors, in which the microcontroller is powered on to transmit the ID signals and then powered off. This fits the SRAM-based PUF requirement, which needs the SRAM to be powered off to read out its random values. The proposed system has been evaluated on STM32 Cortex M0+ microcontrollers and has been implemented to protect important data on beat sensors.
Authors:Sajib Talukder, Nur Imtiazul Haque, Khandakar Ashrafi Akbar
Abstract:
WebView applications are widely used in mobile applications to display web content directly within the app, enhancing user engagement by eliminating the need to open an external browser and providing a seamless experience. Progressive Web Applications (PWAs) further improve usability by combining the accessibility of web apps with the speed, offline capabilities, and responsiveness of native applications. However, malicious developers can exploit this technology by duplicating PWA web links to create counterfeit native apps, monetizing through user diversion. This unethical practice poses significant risks to users and the original application developers, underscoring the need for robust security measures to prevent unauthorized replication. Considering the one-way communication of Trusted Web Activity (a method for integrating web content into Android applications) and PWAs, we propose a query parameter-based practical security solution to defend against or mitigate such attacks. We analyze the vulnerabilities of our proposed security solution to assess its effectiveness and introduce advanced measures to address any identified weaknesses, presenting a comprehensive defense framework. As part of our work, we developed a prototype web application that secures PWAs from replication by embedding a combination of Unix timestamps and device identifiers into the query parameters. We evaluate the effectiveness of this defense strategy by simulating an advanced attack scenario. Additionally, we created a realistic dataset reflecting mobile app user behavior, modeled using a Zipfian distribution, to validate our framework.
Authors:William Zerong Wang, Dongfang Zhao
Abstract:
In the era of generative AI, ensuring the privacy of music data presents unique challenges: unlike static artworks such as images, music data is inherently temporal and multimodal, and it is sampled, transformed, and remixed at an unprecedented scale. These characteristics make its core vector embeddings, i.e, the numerical representations of the music, highly susceptible to being learned, misused, or even stolen by models without accessing the original audio files. Traditional methods like copyright licensing and digital watermarking offer limited protection for these abstract mathematical representations, thus necessitating a stronger, e.g., cryptographic, approach to safeguarding the embeddings themselves. Standard encryption schemes, such as AES, render data unintelligible for computation, making such searches impossible. While Fully Homomorphic Encryption (FHE) provides a plausible solution by allowing arbitrary computations on ciphertexts, its substantial performance overhead remains impractical for large-scale vector similarity searches. Given this trade-off, we propose a more practical approach using Additive Homomorphic Encryption (AHE) for vector similarity search. The primary contributions of this paper are threefold: we analyze threat models unique to music information retrieval systems; we provide a theoretical analysis and propose an efficient AHE-based solution through inner products of music embeddings to deliver privacy-preserving similarity search; and finally, we demonstrate the efficiency and practicality of the proposed approach through empirical evaluation and comparison to FHE schemes on real-world MP3 files.
Authors:Ishwar Balappanawar, Venkata Hasith Vattikuti, Greta Kintzley, Ronan Azimi-Mancel, Satvik Golechha
Abstract:
Detecting hidden behaviors in neural networks poses a significant challenge due to minimal prior knowledge and potential adversarial obfuscation. We explore this problem by framing detection as an adversarial game between two teams: the red team trains two similar models, one trained solely on benign data and the other trained on data containing hidden harmful behavior, with the performance of both being nearly indistinguishable on the benign dataset. The blue team, with limited to no information about the harmful behaviour, tries to identify the compromised model. We experiment using CNNs and try various blue team strategies, including Gaussian noise analysis, model diffing, integrated gradients, and adversarial attacks under different levels of hints provided by the red team. Results show high accuracy for adversarial-attack-based methods (100\% correct prediction, using hints), which is very promising, whilst the other techniques yield more varied performance. During our LLM-focused rounds, we find that there are not many parallel methods that we could apply from our study with CNNs. Instead, we find that effective LLM auditing methods require some hints about the undesired distribution, which can then used in standard black-box and open-weight methods to probe the models further and reveal their misalignment. We open-source our auditing games (with the model and data) and hope that our findings contribute to designing better audits.
Authors:Jeremiah Blocki, Blake Holman
Abstract:
Memory-Hard Functions (MHF) are a useful cryptographic primitive to build egalitarian proofs-of-work and to help protect low entropy secrets (e.g., user passwords) against brute-forces attacks. Ideally, we would like for a MHF to have the property that (1) an honest party can evaluate the function in sequential time $Ω(N)$, and (2) any parallel party that evaluates the function is forced to lockup $Ω(N)$ memory for $Ω(N)$ sequential steps. Unfortunately, this goal is not quite achievable, so prior work of Blocki and Holman [BH22] focused on designing MHFs with strong tradeoff guarantees between sustained-space complexity (SSC) and cumulative memory costs (CMC). However, their theoretical construction is not suitable for practical deployment due to the reliance on expensive constructions of combinatorial graphs. Furthermore, there is no formal justification for the heuristic use of the dynamic pebbling game in MHF analysis so we cannot rule out the possibility that there are more efficient attacks in the Parallel Random Oracle Model (PROM). Towards the goal of developing a practical MHF with provably strong SSC/CMC tradeoffs we develop a new MHF called EGSample which does not rely on expensive combinatorial constructions like [BH22]. In the dynamic pebbling model, we prove equivalent SSC/CMC tradeoffs for EGSample i.e., any the dynamic pebbling strategy either (1) locks up $Ω(N)$ memory for $Ω(N)$ steps, or (2) incurs cumulative memory cost at least $Ω(N^{3-ε})$. We also develop new techniques to directly establish SSC/CMC tradeoffs in the parallel random oracle model. In particular, we prove that {\em any} PROM algorithm evaluating our MHF either (1) locks up $Ω(N)$ blocks of memory for $Ω(N)$ steps or (2) incurs cumulative memory cost at least $Ω(N^{2.5-ε})$.
Authors:Joshua Bailey, Charles Nicholas
Abstract:
Symbolic execution is a powerful program analysis technique that allows for the systematic exploration of all program paths. Path explosion, where the number of states to track becomes unwieldy, is one of the biggest challenges hindering symbolic execution's practical application. To combat this, researchers have employed various strategies to enable symbolic execution on complex software systems. This paper introduces a systematic taxonomy of these strategies, categorizing them into two primary approaches: Scope Reduction, which aims to reduce the scope of symbolic execution to manageable portions of code, and Guidance Heuristics, which steer the symbolic execution engine toward promising paths. Using this taxonomy as a lens, we survey applications of symbolic executions in several domains such as vulnerability analysis, malware analysis, firmware re-hosting, and network protocol analysis. Finally, we identify promising directions for future research, including the application of symbolic execution to real-time operating systems and modern, type-safe languages.
Authors:Thomas Michel, Debabrota Basu, Emilie Kaufmann
Abstract:
We revisit Wald's celebrated Sequential Probability Ratio Test for sequential tests of two simple hypotheses, under privacy constraints. We propose DP-SPRT, a wrapper that can be calibrated to achieve desired error probabilities and privacy constraints, addressing a significant gap in previous work. DP-SPRT relies on a private mechanism that processes a sequence of queries and stops after privately determining when the query results fall outside a predefined interval. This OutsideInterval mechanism improves upon naive composition of existing techniques like AboveThreshold, potentially benefiting other sequential algorithms. We prove generic upper bounds on the error and sample complexity of DP-SPRT that can accommodate various noise distributions based on the practitioner's privacy needs. We exemplify them in two settings: Laplace noise (pure Differential Privacy) and Gaussian noise (Rényi differential privacy). In the former setting, by providing a lower bound on the sample complexity of any $ε$-DP test with prescribed type I and type II errors, we show that DP-SPRT is near optimal when both errors are small and the two hypotheses are close. Moreover, we conduct an experimental study revealing its good practical performance.
Authors:Meital Shlezinger, Shay Akirav, Lei Zhou, Liang Guo, Avi Kessel, Guoliang Li
Abstract:
Database systems are extensively used to store critical data across various domains. However, the frequency of abnormal database access behaviors, such as database intrusion by internal and external attacks, continues to rise. Internal masqueraders often have greater organizational knowledge, making it easier to mimic employee behavior effectively. In contrast, external masqueraders may behave differently due to their lack of familiarity with the organization. Current approaches lack the granularity needed to detect anomalies at the operational level, frequently misclassifying entire sequences of operations as anomalies, even though most operations are likely to represent normal behavior. On the other hand, some anomalous behaviors often resemble normal activities, making them difficult for existing detection methods to identify. This paper introduces a two-tiered anomaly detection approach for Structured Query Language (SQL) using the Bidirectional Encoder Representations from Transformers (BERT) model, specifically DistilBERT, a more efficient, pre-trained version. Our method combines both unsupervised and supervised machine learning techniques to accurately identify anomalous activities while minimizing the need for data labeling. First, the unsupervised method uses ensemble anomaly detectors that flag embedding vectors distant from learned normal patterns of typical user behavior across the database (out-of-scope queries). Second, the supervised method uses fine-tuned transformer-based models to detect internal attacks with high precision (in-scope queries), using role-labeled classification, even on limited labeled SQL data. Our findings make a significant contribution by providing an effective solution for safeguarding critical database systems from sophisticated threats.
Authors:Junhao He, Tianyu Liu, Jingyuan Zhao, Benjamin Turner
Abstract:
The proliferation of multi-modal fake news on social media poses a significant threat to public trust and social stability. Traditional detection methods, primarily text-based, often fall short due to the deceptive interplay between misleading text and images. While Large Vision-Language Models (LVLMs) offer promising avenues for multi-modal understanding, effectively fusing diverse modal information, especially when their importance is imbalanced or contradictory, remains a critical challenge. This paper introduces MM-FusionNet, an innovative framework leveraging LVLMs for robust multi-modal fake news detection. Our core contribution is the Context-Aware Dynamic Fusion Module (CADFM), which employs bi-directional cross-modal attention and a novel dynamic modal gating network. This mechanism adaptively learns and assigns importance weights to textual and visual features based on their contextual relevance, enabling intelligent prioritization of information. Evaluated on the large-scale Multi-modal Fake News Dataset (LMFND) comprising 80,000 samples, MM-FusionNet achieves a state-of-the-art F1-score of 0.938, surpassing existing multi-modal baselines by approximately 0.5% and significantly outperforming single-modal approaches. Further analysis demonstrates the model's dynamic weighting capabilities, its robustness to modality perturbations, and performance remarkably close to human-level, underscoring its practical efficacy and interpretability for real-world fake news detection.
Authors:Ko-Wei Chuang, Hen-Hsen Huang, Tsai-Yen Li
Abstract:
As large language models (LLMs) and generative AI become increasingly integrated into customer service and moderation applications, adversarial threats emerge from both external manipulations and internal label corruption. In this work, we identify and systematically address these dual adversarial threats by introducing DINA (Dual Defense Against Internal Noise and Adversarial Attacks), a novel unified framework tailored specifically for NLP. Our approach adapts advanced noisy-label learning methods from computer vision and integrates them with adversarial training to simultaneously mitigate internal label sabotage and external adversarial perturbations. Extensive experiments conducted on a real-world dataset from an online gaming service demonstrate that DINA significantly improves model robustness and accuracy compared to baseline models. Our findings not only highlight the critical necessity of dual-threat defenses but also offer practical strategies for safeguarding NLP systems in realistic adversarial scenarios, underscoring broader implications for fair and responsible AI deployment.
Authors:Federico Grasselli, Gaetano Russo, Massimiliano Proietti
Abstract:
Digital signatures represent a crucial cryptographic asset that must be protected against quantum adversaries. Quantum Digital Signatures (QDS) can offer solutions that are information-theoretically (IT) secure and thus immune to quantum attacks. In this work, we analyze three existing practical QDS protocols based on preshared secure keys (e.g., established with quantum key distribution) and universal hashing families. For each protocol, we make amendments to close potential loopholes and prove their IT security while accounting for the failure of IT-secure authenticated communication. We then numerically optimize the protocol parameters to improve efficiency in terms of preshared bit consumption and signature length, allowing us to identify the most efficient protocol.
Authors:Sharad Agarwal, Guillermo Suarez-Tangil, Marie Vasek
Abstract:
Mobile network operators implement firewalls to stop illicit messages, but scammers find ways to evade detection. Previous work has looked into SMS texts that are blocked by these firewalls. However, there is little insight into SMS texts that bypass them and reach users. To this end, we collaborate with a major mobile network operator to receive 1.35m user reports submitted over four months. We find 89.16% of user reports comprise text messages, followed by reports of suspicious calls and URLs. Using our methodological framework, we identify 35.12% of the unique text messages reported by users as spam, while 40.27% are scam text messages. This is the first paper that investigates SMS reports submitted by users and differentiates between spam and scams. Our paper classifies the identified scam text messages into 12 scam types, of which the most popular is 'wrong number' scams. We explore the various infrastructure services that scammers abuse to conduct SMS scams, including mobile network operators and hosting infrastructure, and analyze the text of the scam messages to understand how scammers lure victims into providing them with their personal or financial details.
Authors:Mohammad Ferry Husnil Arif, Muhammad Imran
Abstract:
The semidirect discrete logarithm problem (SDLP) in finite groups was proposed as a foundation for post-quantum cryptographic protocols, based on the belief that its non-abelian structure would resist quantum attacks. However, recent results have shown that SDLP in finite groups admits efficient quantum algorithms, undermining its quantum resistance. This raises a fundamental question: does the SDLP offer any computational advantages over the standard discrete logarithm problem (DLP) against classical adversaries? In this work, we investigate the classical hardness of SDLP across different finite group platforms. We establish that the group-case SDLP can be reformulated as a generalized discrete logarithm problem, enabling adaptation of classical algorithms to study its complexity. We present a concrete adaptation of the Baby-Step Giant-Step algorithm for SDLP, achieving time and space complexity $O(\sqrt{r})$ where $r$ is the period of the underlying cycle structure. Through theoretical analysis and experimental validation in SageMath, we demonstrate that the classical hardness of SDLP is highly platform-dependent and does not uniformly exceed that of standard DLP. In finite fields $\mathbb{F}_p^*$, both problems exhibit comparable complexity. Surprisingly, in elliptic curves $E(\mathbb{F}_p)$, the SDLP becomes trivial due to the bounded automorphism group, while in elementary abelian groups $\mathbb{F}_p^n$, the SDLP can be harder than DLP, with complexity varying based on the eigenvalue structure of the automorphism. Our findings reveal that the non-abelian structure of semidirect products does not inherently guarantee increased classical hardness, suggesting that the search for classically hard problems for cryptographic applications requires more careful consideration of the underlying algebraic structures.
Authors:Kirti Singh, Vinay J. Ribeiro, Susmita Mandal
Abstract:
Cross-chain asset exchange is crucial for blockchain interoperability. Existing solutions rely on trusted third parties and risk asset loss, or use decentralized alternatives like atomic swaps, which suffer from grief attacks. Griefing occurs when a party prematurely exits, locking the counterparty's assets until a timelock expires. Hedged Atomic Swaps mitigate griefing by introducing a penalty premium; however, they increase the number of transactions from four (as in Tier Nolan's swap) to six, which in turn introduces new griefing risks. Grief-Free (GF) Swap reduces this to five transactions by consolidating assets and premiums on a single chain. However, no existing protocol achieves grief-free asset exchange in just four transactions.
This paper presents 4-Swap, the first cross-chain atomic swap protocol that is both grief-free and bribery-safe, while completing asset exchange in just four transactions. By combining the griefing premium and principal into a single transaction per chain, 4-Swap reduces on-chain transactions, leading to faster execution compared to previous grief-free solutions. It is fully compatible with Bitcoin and operates without the need for any new opcodes. A game-theoretic analysis shows that rational participants have no incentive to deviate from the protocol, ensuring robust compliance and security.
Authors:Artyom Kuninets, Ekaterina Malygina
Abstract:
In this paper, we determine explicit bases for Riemann--Roch spaces associated with various families of elliptic codes. We establish the feasibility and provide exact algorithms for constructing bases of Riemann--Roch spaces corresponding to arbitrary divisors on elliptic curves. These results are subsequently applied to derive bases for quasi-cyclic elliptic codes and their subfield subcodes as well as for the class of Goppa-like elliptic codes. For algebraic geometry code applications, having an explicit description of Riemann--Roch space bases for arbitrary divisors is particularly valuable as it simultaneously enables efficient code construction and reveals structural properties of the codes leading to the new cryptanalysis methods when these codes are employed in cryptographic schemes
Authors:Takumi Suimon, Yuki Koizumi, Junji Takemasa, Toru Hasegawa
Abstract:
Federated learning (FL) enables collaborative model training without sharing raw data, but individual model updates may still leak sensitive information. Secure aggregation (SecAgg) mitigates this risk by allowing the server to access only the sum of client updates, thereby concealing individual contributions. However, a significant vulnerability has recently attracted increasing attention: when model updates are sparse vectors, a non-zero value contributed by a single client at a given index can be directly revealed in the aggregate, enabling precise data reconstruction attacks. In this paper, we propose a novel enhancement to SecAgg that reveals aggregated values only at indices with at least $t$ non-zero contributions. Our mechanism introduces a per-element masking strategy to prevent the exposure of under-contributed elements, while maintaining modularity and compatibility with many existing SecAgg implementations by relying solely on cryptographic primitives already employed in a typical setup. We integrate this mechanism into Flamingo, a low-round SecAgg protocol, to provide a robust defense against such attacks. Our analysis and experimental results indicate that the additional computational and communication overhead introduced by our mechanism remains within an acceptable range, supporting the practicality of our approach.
Authors:Nihar B. Shah, Melisa Bok, Xukun Liu, Andrew McCallum
Abstract:
We discuss newly uncovered cases of identity theft in the scientific peer-review process within artificial intelligence (AI) research, with broader implications for other academic procedures. We detail how dishonest researchers exploit the peer-review system by creating fraudulent reviewer profiles to manipulate paper evaluations, leveraging weaknesses in reviewer recruitment workflows and identity verification processes. The findings highlight the critical need for stronger safeguards against identity theft in peer review and academia at large, and to this end, we also propose mitigating strategies.
Authors:Arturo Sánchez-Matas, Pablo Escribano Ruiz, Daniel DÃaz-López, Angel Luis Perales Gómez, Pantaleone Nespoli, Gregorio MartÃnez Pérez
Abstract:
In today digital landscape, organizations face constantly evolving cyber threats, making it essential to discover slippery attack vectors through novel techniques like Security Chaos Engineering (SCE), which allows teams to test defenses and identify vulnerabilities effectively. This paper proposes to integrate SCE into Breach Attack Simulation (BAS) platforms, leveraging adversary profiles and abilities from existing threat intelligence databases. This innovative proposal for cyberattack simulation employs a structured architecture composed of three layers: SCE Orchestrator, Connector, and BAS layers. Utilizing MITRE Caldera in the BAS layer, our proposal executes automated attack sequences, creating inferred attack trees from adversary profiles. Our proposal evaluation illustrates how integrating SCE with BAS can enhance the effectiveness of attack simulations beyond traditional scenarios, and be a useful component of a cyber defense strategy.
Authors:Richard Hegewald, Rebecca Beyer
Abstract:
The security of research software is essential for ensuring the integrity and reproducibility of scientific results. However, research software security is still largely unexplored. Due to its dependence on open source components and distributed development practices, research software is particularly vulnerable to supply chain attacks. This study analyses 3,248 high-quality, largely peer-reviewed research software repositories using the OpenSSF Scorecard. We find a generally weak security posture with an average score of 3.5/10. Important practices, such as signed releases and branch protection, are rarely implemented. Finally, we present actionable, low-effort recommendations that can help research teams improve software security and mitigate potential threats to scientific integrity.
Authors:Mabin Umman Varghese, Zahra Taghiyarrenani
Abstract:
Network Intrusion Detection Systems (NIDS) play a crucial role in safeguarding network infrastructure against cyberattacks. As the prevalence and sophistication of these attacks increase, machine learning and deep neural network approaches have emerged as effective tools for enhancing NIDS capabilities in detecting malicious activities. However, the effectiveness of traditional deep neural models is often limited by the need for extensive labelled datasets and the challenges posed by data and feature heterogeneity across different network domains. To address these limitations, we developed a deep neural model that integrates multi-modal learning with domain adaptation techniques for classification. Our model processes data from diverse sources in a sequential cyclic manner, allowing it to learn from multiple datasets and adapt to varying feature spaces. Experimental results demonstrate that our proposed model significantly outperforms baseline neural models in classifying network intrusions, particularly under conditions of varying sample availability and probability distributions. The model's performance highlights its ability to generalize across heterogeneous datasets, making it an efficient solution for real-world network intrusion detection.
Authors:Baigang Chen, Dongfang Zhao
Abstract:
Homomorphic encryption (HE) enables secure computation on encrypted data, safeguarding user privacy in domains such as cloud computing, healthcare, and finance. Among fully homomorphic encryption (FHE) schemes, CKKS is notable for supporting approximate arithmetic over complex numbers, a key requirement for machine-learning and numerical workloads. However, CKKS incurs rapid noise growth, complex parameter tuning, and relies on costly modulus switching. We propose a binary variant of CKKS that operates entirely over binary-coefficient polynomial rings and replaces rescaling with a lightweight bootstrapping mechanism. To mitigate additional bit-flip errors introduced by binary encoding, we integrate BCH error-correcting codes for robust decryption. Our open-source implementation, built on the HElib library, preserves the core algebraic structure of CKKS while introducing binary-coefficient encoding, enabling efficient evaluation in small ring dimensions and unbounded-depth computation. Empirical evaluations demonstrate the framework's practicality and scalability across a range of settings.
Authors:Shane Caldwell, Max Harley, Michael Kouremetis, Vincent Abruzzo, Will Pearce
Abstract:
We introduce PentestJudge, a system for evaluating the operations of penetration testing agents. PentestJudge is a large language model (LLM)-as-judge with access to tools that allow it to consume arbitrary trajectories of agent states and tool call history to determine whether a security agent's actions meet certain operating criteria that would be impractical to evaluate programmatically. We develop rubrics that use a tree structure to hierarchically collapse the penetration testing task for a particular environment into smaller, simpler, and more manageable sub-tasks and criteria until each leaf node represents simple yes-or-no criteria for PentestJudge to evaluate. Task nodes are broken down into different categories related to operational objectives, operational security, and tradecraft. LLM-as-judge scores are compared to human domain experts as a ground-truth reference, allowing us to compare their relative performance with standard binary classification metrics, such as F1 scores. We evaluate several frontier and open-source models acting as judge agents, with the best model reaching an F1 score of 0.83. We find models that are better at tool-use perform more closely to human experts. By stratifying the F1 scores by requirement type, we find even models with similar overall scores struggle with different types of questions, suggesting certain models may be better judges of particular operating criteria. We find that weaker and cheaper models can judge the trajectories of pentests performed by stronger and more expensive models, suggesting verification may be easier than generation for the penetration testing task. We share this methodology to facilitate future research in understanding the ability of judges to holistically and scalably evaluate the process quality of AI-based information security agents so that they may be confidently used in sensitive production environments.
Authors:S M Mostaq Hossain, Sheikh Ghafoor, Kumar Yelamarthi, Venkata Prasanth Yanambaka
Abstract:
Physically Unclonable Function (PUF) offers a secure and lightweight alternative to traditional cryptography for authentication due to their unique device fingerprint. However, their dependence on specialized hardware hinders their adoption in diverse applications. This paper proposes a novel blockchain framework that leverages SoftPUF, a software-based approach mimicking PUF. SoftPUF addresses the hardware limitations of traditional PUF, enabling secure and efficient authentication for a broader range of devices within a blockchain network. The framework utilizes a machine learning model trained on PUF data to generate unique, software-based keys for each device. These keys serve as secure identifiers for authentication on the blockchain, eliminating the need for dedicated hardware. This approach facilitates the integration of legacy devices from various domains, including cloud-based solutions, into the blockchain network. Additionally, the framework incorporates well-established defense mechanisms to ensure robust security against various attacks. This combined approach paves the way for secure and scalable authentication in diverse blockchain-based applications. Additionally, to ensure robust security, the system incorporates well-established defense mechanisms against various attacks, including 51%, phishing, routing, and Sybil attacks, into the blockchain network. This combined approach paves the way for secure and efficient authentication in a wider range of blockchain-based applications.
Authors:Matthew Rodda, Vasilios Mavroudis
Abstract:
Operational Technology (OT) is an integral component of critical national infrastructure, enabling automation and control in industries such as energy, manufacturing, and transportation. However, OT networks, systems, and devices have been designed and deployed prioritising functionality rather than security. This leads to inherent vulnerabilities in many deployed systems when operational misconfigurations expose them to the internet. This report provides an up-to-date overview of the OT threat landscape exposed to the public internet and studies the affected protocols, vendors, software, and the geographic distribution of systems. Our findings reveal nearly 70,000 exposed OT devices globally, with significant concentrations in North America and Europe. Analysis of prevalent protocols (e.g., ModbusTCP, EtherNet/IP, S7) shows that many devices expose detailed identifying information, including outdated firmware versions with known critical vulnerabilities that remain unpatched for years after disclosure. Furthermore, we demonstrate how automated analysis of screenshots can uncover exposed graphical interfaces of Human Machine Interfaces (HMIs) and Supervisory Control and Data Acquisition (SCADA) systems, highlighting diverse pathways for potential unauthorized access and underscoring the risks to industrial processes and critical infrastructure.
Authors:Kamal Al-Sabahi, Yousuf Khamis Al Mabsali
Abstract:
Academic publishing, integral to knowledge dissemination and scientific advancement, increasingly faces threats from unethical practices such as unconsented authorship, gift authorship, author ambiguity, and undisclosed conflicts of interest. While existing infrastructures like ORCID effectively disambiguate researcher identities, they fall short in enforcing explicit authorship consent, accurately verifying contributor roles, and robustly detecting conflicts of interest during peer review. To address these shortcomings, this paper introduces a decentralized framework leveraging Self-Sovereign Identity (SSI) and blockchain technology. The proposed model uses Decentralized Identifiers (DIDs) and Verifiable Credentials (VCs) to securely verify author identities and contributions, reducing ambiguity and ensuring accurate attribution. A blockchain-based trust registry records authorship consent and peer-review activity immutably. Privacy-preserving cryptographic techniques, especially Zero-Knowledge Proofs (ZKPs), support conflict-of-interest detection without revealing sensitive data. Verified authorship metadata and consent records are embedded in publications, increasing transparency. A stakeholder survey of researchers, editors, and reviewers suggests the framework improves ethical compliance and confidence in scholarly communication. This work represents a step toward a more transparent, accountable, and trustworthy academic publishing ecosystem.
Authors:Angela Famera, Ben Hilger, Suman Bhunia, Patrick Heil
Abstract:
Mirai is undoubtedly one of the most significant Internet of Things (IoT) botnet attacks in history. In terms of its detrimental effects, seamless spread, and low detection rate, it surpassed its predecessors. Its developers released the source code, which triggered the development of several variants that combined the old code with newer vulnerabilities found on popular IoT devices. The prominent variants, Satori, Mukashi, Moobot, and Sonic1, together target more than 15 unique known vulnerabilities discovered between 2014-2021. The vulnerabilities include but are not limited to improper input validation, command injections, insufficient credential protection, and out-of-bound writes. With these new attack strategies, Satori compromised more than a quarter million devices within the first twelve hours of its release and peaked at almost 700,000 infected devices. Similarly, Mukashi made more than a hundred million Zyxel NAS devices vulnerable through its new exploits. This article reviews the attack methodologies and impacts of these variants in detail. It summarizes the common vulnerabilities targeted by these variants and analyzes the infection mechanism through vulnerability analysis. This article also provides an overview of possible defense solutions.
Authors:Navneet Verma, Ying Xie
Abstract:
The increasing penetration of renewable energy sources in day-ahead energy markets introduces challenges in balancing supply and demand, ensuring grid resilience, and maintaining trust in decentralized trading systems. This paper proposes a novel framework that integrates the Proximal Policy Optimization (PPO) algorithm, a state-of-the-art reinforcement learning method, with blockchain technology to optimize automated trading strategies for prosumers in day-ahead energy markets. We introduce a comprehensive framework that employs RL agent for multi-objective energy optimization and blockchain for tamper-proof data and transaction management. Simulations using real-world data from the Electricity Reliability Council of Texas (ERCOT) demonstrate the effectiveness of our approach. The RL agent achieves demand-supply balancing within 2\% and maintains near-optimal supply costs for the majority of the operating hours. Moreover, it generates robust battery storage policies capable of handling variability in solar and wind generation. All decisions are recorded on an Algorand-based blockchain, ensuring transparency, auditability, and security - key enablers for trustworthy multi-agent energy trading. Our contributions include a novel system architecture, curriculum learning for robust agent development, and actionable policy insights for practical deployment.
Authors:Sanjay Singh, Mitendra Mahto
Abstract:
In today's enterprise environment, traditional access methods such as Virtual Private Networks (VPNs) and application-specific Single Sign-On (SSO) often fall short when it comes to securely scaling access for a distributed and dynamic workforce. This paper presents our experience implementing a modern, Zero Trust-aligned architecture that leverages a reverse proxy integrated with Mutual TLS (mTLS) and centralized SSO, along with the key challenges we encountered and lessons learned during its deployment and scaling. This multidimensional solution involves both per-device and per-user authentication, centralized enforcement of security policies, and comprehensive observability, hence enabling organizations to deliver secure and seamless access to their internal applications.
Authors:Kayvan Karim, Hani Ragab Hassen. Hadj Batatia
Abstract:
Data representation remains a fundamental challenge in machine learning, particularly when adapting sequence-based architectures like Transformers and Large Language Models (LLMs) for structured tabular data. Existing methods often fail to cohesively encode the mix of numerical and categorical features or preserve the inherent structure of tables. This paper introduces a novel, hybrid tokenisation methodology designed to convert tabular data into a unified, sequential format suitable for LLM training. Our approach combines predefined fixed tokens to represent structural elements and low-cardinality categorical features, with a learned subword vocabulary using Byte-Pair Encoding (BPE) for high-cardinality and continuous values. We demonstrate the efficacy of this technique by applying it to a large-scale NetFlow dataset (CIDDS-001), preparing a corpus for a Network Intrusion Detection System (NIDS) foundation model. The evaluation shows that our method is highly efficient, processing over 31 million network flows in under five hours and achieving a significant data compression ratio of 6.18:1. This process resulted in a computationally manageable corpus of over one billion tokens, establishing a viable and generalisable pathway for training foundation models on structured data.
Authors:Dulana Rupanetti, Naima Kaabouch
Abstract:
The increase of IoT devices, driven by advancements in hardware technologies, has led to widespread deployment in large-scale networks that process massive amounts of data daily. However, the reliance on Edge Computing to manage these devices has introduced significant security vulnerabilities, as attackers can compromise entire networks by targeting a single IoT device. In light of escalating cybersecurity threats, particularly botnet attacks, this paper investigates the application of machine learning techniques to enhance security in Edge-Computing-Assisted IoT environments. Specifically, it presents a comparative analysis of Random Forest, XGBoost, and LightGBM -- three advanced ensemble learning algorithms -- to address the dynamic and complex nature of botnet threats. Utilizing a widely recognized IoT network traffic dataset comprising benign and malicious instances, the models were trained, tested, and evaluated for their accuracy in detecting and classifying botnet activities. Furthermore, the study explores the feasibility of deploying these models in resource-constrained edge and IoT devices, demonstrating their practical applicability in real-world scenarios. The results highlight the potential of machine learning to fortify IoT networks against emerging cybersecurity challenges.
Authors:Yelim Ahn, Jaejin Lee
Abstract:
As large language models (LLMs) are increasingly deployed across diverse domains, ensuring their safety has become a critical concern. In response, studies on jailbreak attacks have been actively growing. Existing approaches typically rely on iterative prompt engineering or semantic transformations of harmful instructions to evade detection. In this work, we introduce PUZZLED, a novel jailbreak method that leverages the LLM's reasoning capabilities. It masks keywords in a harmful instruction and presents them as word puzzles for the LLM to solve. We design three puzzle types-word search, anagram, and crossword-that are familiar to humans but cognitively demanding for LLMs. The model must solve the puzzle to uncover the masked words and then proceed to generate responses to the reconstructed harmful instruction. We evaluate PUZZLED on five state-of-the-art LLMs and observe a high average attack success rate (ASR) of 88.8%, specifically 96.5% on GPT-4.1 and 92.3% on Claude 3.7 Sonnet. PUZZLED is a simple yet powerful attack that transforms familiar puzzles into an effective jailbreak strategy by harnessing LLMs' reasoning capabilities.
Authors:Nilufer Gulciftci, M. Emre Gursoy
Abstract:
Poisoning attacks, in which an attacker adversarially manipulates the training dataset of a machine learning (ML) model, pose a significant threat to ML security. Beta Poisoning is a recently proposed poisoning attack that disrupts model accuracy by making the training dataset linearly nonseparable. In this paper, we propose four defense strategies against Beta Poisoning attacks: kNN Proximity-Based Defense (KPB), Neighborhood Class Comparison (NCC), Clustering-Based Defense (CBD), and Mean Distance Threshold (MDT). The defenses are based on our observations regarding the characteristics of poisoning samples generated by Beta Poisoning, e.g., poisoning samples have close proximity to one another, and they are centered near the mean of the target class. Experimental evaluations using MNIST and CIFAR-10 datasets demonstrate that KPB and MDT can achieve perfect accuracy and F1 scores, while CBD and NCC also provide strong defensive capabilities. Furthermore, by analyzing performance across varying parameters, we offer practical insights regarding defenses' behaviors under varying conditions.
Authors:Roya Arkhmammadova, Hosein Madadi Tamar, M. Emre Gursoy
Abstract:
Small language models (SLMs) are increasingly valued for their efficiency and deployability in resource-constrained environments, making them useful for on-device, privacy-sensitive, and edge computing applications. On the other hand, membership inference attacks (MIAs), which aim to determine whether a given sample was used in a model's training, are an important threat with serious privacy and intellectual property implications. In this paper, we study MIAs on SLMs. Although MIAs were shown to be effective on large language models (LLMs), they are relatively less studied on emerging SLMs, and furthermore, their effectiveness decreases as models get smaller. Motivated by this finding, we propose a new MIA called win-k, which builds on top of a state-of-the-art attack (min-k). We experimentally evaluate win-k by comparing it with five existing MIAs using three datasets and eight SLMs. Results show that win-k outperforms existing MIAs in terms of AUROC, TPR @ 1% FPR, and FPR @ 99% TPR metrics, especially on smaller models.
Authors:Isabelle Bakker, John Hastings
Abstract:
This study evaluates the ability of GPT-4o to autonomously solve beginner-level offensive security tasks by connecting the model to OverTheWire's Bandit capture-the-flag game. Of the 25 levels that were technically compatible with a single-command SSH framework, GPT-4o solved 18 unaided and another two after minimal prompt hints for an overall 80% success rate. The model excelled at single-step challenges that involved Linux filesystem navigation, data extraction or decoding, and straightforward networking. The approach often produced the correct command in one shot and at a human-surpassing speed. Failures involved multi-command scenarios that required persistent working directories, complex network reconnaissance, daemon creation, or interaction with non-standard shells. These limitations highlight current architectural deficiencies rather than a lack of general exploit knowledge. The results demonstrate that large language models (LLMs) can automate a substantial portion of novice penetration-testing workflow, potentially lowering the expertise barrier for attackers and offering productivity gains for defenders who use LLMs as rapid reconnaissance aides. Further, the unsolved tasks reveal specific areas where secure-by-design environments might frustrate simple LLM-driven attacks, informing future hardening strategies. Beyond offensive cybersecurity applications, results suggest the potential to integrate LLMs into cybersecurity education as practice aids.
Authors:Chloe Li, Mary Phuong, Noah Y. Siegel
Abstract:
Trustworthy evaluations of dangerous capabilities are increasingly crucial for determining whether an AI system is safe to deploy. One empirically demonstrated threat to this is sandbagging - the strategic underperformance on evaluations by AI models or their developers. One promising defense is to monitor a model's chain-of-thought (CoT) reasoning, as this could reveal its intentions and plans. In this work, we measure the ability of models to sandbag on dangerous capability evaluations against a CoT monitor by prompting them to sandbag while being either monitor-oblivious or monitor-aware. We show that both frontier models and small open-sourced models can covertly sandbag against CoT monitoring 0-shot without hints. However, they cannot yet do so reliably: they bypass the monitor 16-36\% of the time when monitor-aware, conditioned on sandbagging successfully. We qualitatively analyzed the uncaught CoTs to understand why the monitor failed. We reveal a rich attack surface for CoT monitoring and contribute five covert sandbagging policies generated by models. These results inform potential failure modes of CoT monitoring and may help build more diverse sandbagging model organisms.
Authors:Syon Balakrishnan, Aaron Grinberg
Abstract:
Understanding behavioral drivers of success in illicit digital marketplaces is critical for developing effective enforcement strategies and understanding digital commerce evolution, as darknet drug markets represent a growing share of the total drug economy. This study employs quantitative regression analysis of 50,000+ listings from 2,653 vendors in the Agora marketplace (2014-2015), examining relationships between cybersecurity signaling (PGP encryption mentions), product diversification, and commercial success through nested regression specifications controlling for reputation, pricing, and category-specific factors. Product diversification emerges as the dominant predictor of vendor scale, increasing the odds of large vendor status by 169% per additional category, while PGP encryption signaling functions primarily as a professional marker rather than an independent success factor. Vendor success depends on portfolio breadth rather than specialization, with category-specific enforcement creating differential market constraints. Successful vendors operate as diversified enterprises capable of rapid pivoting between product categories, requiring targeted enforcement towards diversified vendors based on coordinated multi-category enforcement approaches rather than traditional substance-specific targeting strategies.
Authors:Oscar Llorente-Vazquez, Xabier Ugarte-Pedrero, Igor Santos-Grueiro, Pablo Garcia Bringas
Abstract:
Dynamic Binary Instrumentation (DBI) is the set of techniques that enable instrumentation of programs at run-time, making it possible to monitor and modify the execution of compiled binaries or entire systems. DBI is used for countless security applications and analyses, and is extensively used across many fields in both industry and academia. Over the years, several DBI approaches have been proposed based on different technologies and implementing diverse techniques. Every solution tries to overcome certain limitations, but they sometimes bring other shortcomings. Some are specialized for one particular domain or task, while others have a wider scope.
In this paper, we shed light into the labyrinth of DBI, bringing together process-level and whole-system approaches. We depict their building blocks and analyze the underlying instrumentation techniques, comparing their ability to instrument different primitives and run-time events. Then, we evaluate their performance when implementing each primitive, and highlight relevant observations. Our results show that no single technique is better than the rest in all circumstances.
Authors:Haocheng Jiang, Hua Shen, Jixin Zhang, Willy Susilo, Mingwu Zhang
Abstract:
Federated learning is a distributed training framework vulnerable to Byzantine attacks, particularly when over 50% of clients are malicious or when datasets are highly non-independent and identically distributed (non-IID). Additionally, most existing defense mechanisms are designed for specific attack types (e.g., gradient similarity-based schemes can only defend against outlier model poisoning), limiting their effectiveness. In response, we propose FedGuard, a novel federated learning mechanism. FedGuard cleverly addresses the aforementioned issues by leveraging the high sensitivity of membership inference to model bias. By requiring clients to include an additional mini-batch of server-specified data in their training, FedGuard can identify and exclude poisoned models, as their confidence in the mini-batch will drop significantly. Our comprehensive evaluation unequivocally shows that, under three highly non-IID datasets, with 90% of clients being Byzantine and seven different types of Byzantine attacks occurring in each round, FedGuard significantly outperforms existing robust federated learning schemes in mitigating various types of Byzantine attacks.
Authors:Yuqi Qian, Yun Cao, Meiyang Lv, Haocheng Fu
Abstract:
Steganography based on diffusion models has attracted increasing attention due to its ability to generate high-quality images and exhibit strong robustness. In such approaches, the secret message is first embedded into the initial latent variable, and then the stego image is generated through the forward process. To extract the message, an inversion process is required to reconstruct the latent variables from the received image. However, inaccurate latent inversion leads to significant discrepancies between the reconstructed and original latent variables, rendering message extraction infeasible. To address this issue, we propose \textbf{RF-Stego}, a novel generative image steganography method that enables accurate latent inversion and significantly improves extraction accuracy. First, we develop the \textbf{P}ath \textbf{C}onsistency \textbf{L}inear \textbf{I}nversion (\textbf{PCLI}), which imposes formal constraints on the inversion process. By explicitly aligning it with the forward generation path and modeling both directions along a shared linear path, PCLI eliminates path mismatch and ensures path consistency throughout the steganographic process. Second, through rigorous theoretical proof, we demonstrate that \textbf{R}ectified \textbf{F}low \textbf{(RF)} offers both theoretical reversibility and numerical stability in the inversion process. Based on this, we replace traditional unstable samplers with RF sampler which effectively improves the numerical precision of the inversion process. Experimental results show RF-Stego outperforms state-of-the-art methods in terms of extraction accuracy, image quality, robustness, security and generation efficiency.
Authors:Md Sajidul Islam Sajid, Jinpeng Wei, Ehab Al-Shaer
Abstract:
Ransomware (RW) presents a significant and widespread threat in the digital landscape, necessitating effective countermeasures. Active cyber deception is a promising strategy to thwart RW and limiting its propagation by misleading it with false information and revealing its true behaviors. Furthermore, RW often acts as a communication conduit between attackers and defenders, allowing deception to return false data to attackers and deplete their resources. This paper introduces ranDecepter, a novel approach that combines active cyber deception with real-time analysis to enhance defenses against RW attacks. The ranDecepter identifies RW in real-time and isolates it within a deceptive environment, autonomously identifying critical elements in the RW code to create a loop mechanism. By repeatedly restarting the malware and transmitting counterfeit encryption information and secret keys to the attacker, it forces the attacker to store these fabricated details for each victim, thereby depleting their resources. Our comprehensive evaluation of ranDecepter, conducted using 1,134 real-world malware samples and twelve benign applications, demonstrates a remarkable 100% accuracy in RW identification, with no false positives and minimal impact on response times. Furthermore, within 24-hours, ranDecepter generates up to 9,223K entries in the attacker's database using 50 agents, showcasing its potential to undermine attacker resources.
Authors:Peter F. Michael, Zekun Hao, Serge Belongie, Abe Davis
Abstract:
The proliferation of advanced tools for manipulating video has led to an arms race, pitting those who wish to sow disinformation against those who want to detect and expose it. Unfortunately, time favors the ill-intentioned in this race, with fake videos growing increasingly difficult to distinguish from real ones. At the root of this trend is a fundamental advantage held by those manipulating media: equal access to a distribution of what we consider authentic (i.e., "natural") video. In this paper, we show how coding very subtle, noise-like modulations into the illumination of a scene can help combat this advantage by creating an information asymmetry that favors verification. Our approach effectively adds a temporal watermark to any video recorded under coded illumination. However, rather than encoding a specific message, this watermark encodes an image of the unmanipulated scene as it would appear lit only by the coded illumination. We show that even when an adversary knows that our technique is being used, creating a plausible coded fake video amounts to solving a second, more difficult version of the original adversarial content creation problem at an information disadvantage. This is a promising avenue for protecting high-stakes settings like public events and interviews, where the content on display is a likely target for manipulation, and while the illumination can be controlled, the cameras capturing video cannot.
Authors:Ethan Buchman, Paolo Dini, Shoaib Ahmed, Andrew Miller, Tomaž Fleischman
Abstract:
For centuries, financial institutions have responded to liquidity challenges by forming closed, centralized clearing clubs with strict rules and membership that allow them to collaborate on using the least money to discharge the most debt. As closed clubs, much of the general public has been excluded from participation. But the vast majority of private sector actors consists of micro or small firms that are vulnerable to late payments and generally ineligible for bank loans. This low liquidity environment often results in gridlock and leads to insolvency, and it disproportionately impacts small enterprises and communities.
On the other hand, blockchain communities have developed open, decentralized settlement systems, along with a proliferation of store of value assets and new lending protocols, allowing anyone to permissionlessly transact and access credit. However, these protocols remain used primarily for speculative purposes, and so far have fallen short of the large-scale positive impact on the real economy prophesied by their promoters.
We address these challenges by introducing Cycles, an open, decentralized clearing, settlement, and issuance protocol. Cycles is designed to enable firms to overcome payment inefficiencies, to reduce their working capital costs, and to leverage diverse assets and liquidity sources, including cryptocurrencies, stablecoins, and lending protocols, in service of clearing more debt with less money. Cycles solves real world liquidity challenges through a privacy-preserving multilateral settlement platform based on a graph optimization algorithm. The design is based on a core insight: liquidity resides within cycles in the payment network's structure and can be accessed via settlement flows optimized to reduce debt.
Authors:Venkatesan Guruswami, Xin Lyu, Weiqiang Yuan
Abstract:
A recent work (Korten, Pitassi, and Impagliazzo, FOCS 2025) established an insightful connection between static data structure lower bounds, range avoidance of $\text{NC}^0$ circuits, and the refutation of pseudorandom CSP instances, leading to improvements to some longstanding lower bounds in the cell-probe/bit-probe models. Here, we improve these lower bounds in certain cases via a more streamlined reduction to XOR refutation, coupled with handling the odd-arity case. Our result can be viewed as a complete derandomization of the state-of-the-art semi-random $k$-XOR refutation analysis (Guruswami, Kothari and Manohar, STOC 2022, Hsieh, Kothari and Mohanty, SODA 2023), which complements the derandomization of the even-arity case obtained by Korten et al.
As our main technical statement, we show that for any multi-output constant-depth circuit that substantially stretches its input, its output is very likely far from strings sampled from distributions with sufficient independence, and further this can be efficiently certified. Via suitable shifts in perspectives, this gives applications to cell-probe lower bounds and range avoidance algorithms for $\mathsf{NC}^0$ circuits.
Authors:Gursimran Singh, H. B. Acharya, Minseok Kwon
Abstract:
The emergence of programmable data planes, and particularly switches supporting the P4 language, has transformed network security by enabling customized, line-rate packet processing. These switches, originally intended for flexible forwarding, now play a broader role: detecting and mitigating attacks such as DDoS and spoofing, enforcing next-generation firewall policies, and even supporting in-network cryptography and machine learning. These capabilities are made possible by techniques such as recirculate-and-truncate and lookup-table precomputation, which work around architectural constraints like limited memory and restricted instruction sets. In this paper, we systematize recent advances in security applications built on programmable switches, with an emphasis on the capabilities, challenges, and architectural workarounds. We highlight the non-obvious design techniques that make complex in-network security functions feasible despite the constraints of the hardware platform, and also comment on remaining issues and emerging research directions.
Authors:Dylan Manuel, Paul Rad
Abstract:
The generation of large, high-quality datasets for code understanding and generation remains a significant challenge, particularly when aligning decompiled binaries with their original source code. To address this, we present CodableLLM, a Python framework designed to automate the creation and curation of datasets by mapping decompiled functions to their corresponding source functions. This process enhances the alignment between decompiled and source code representations, facilitating the development of large language models (LLMs) capable of understanding and generating code across multiple abstraction levels. CodableLLM supports multiple programming languages and integrates with existing decompilers and parsers to streamline dataset generation. This paper presents the design and implementation of CodableLLM, evaluates its performance in dataset creation, and compares it to existing tools in the field. The results demonstrate that CodableLLM offers a robust and efficient solution for generating datasets tailored for code-focused LLMS.
Authors:Hyun Jun Yook, Ga San Jhun, Jae Hyun Cho, Min Jeon, Donghyun Kim, Tae Hyung Kim, Youn Kyu Lee
Abstract:
Machine unlearning (MU) removes specific data points or concepts from deep learning models to enhance privacy and prevent sensitive content generation. Adversarial prompts can exploit unlearned models to generate content containing removed concepts, posing a significant security risk. However, existing adversarial attack methods still face challenges in generating content that aligns with an attacker's intent while incurring high computational costs to identify successful prompts. To address these challenges, we propose ZIUM, a Zero-shot Intent-aware adversarial attack on Unlearned Models, which enables the flexible customization of target attack images to reflect an attacker's intent. Additionally, ZIUM supports zero-shot adversarial attacks without requiring further optimization for previously attacked unlearned concepts. The evaluation across various MU scenarios demonstrated ZIUM's effectiveness in successfully customizing content based on user-intent prompts while achieving a superior attack success rate compared to existing methods. Moreover, its zero-shot adversarial attack significantly reduces the attack time for previously attacked unlearned concepts.
Authors:Shreyas Bargale, Akshit Vakati Venkata, Jaimandeep Singh, Chester Rebeiro
Abstract:
System and network event logs are essential for security analytics, threat detection, and operational monitoring. However, these logs often contain Personally Identifiable Information (PII), raising significant privacy concerns when shared or analyzed. A key challenge in log anonymization is balancing privacy protection with the retention of sufficient structure for meaningful analysis. Overly aggressive anonymization can destroy contextual integrity, while weak techniques risk re-identification through linkage or inference attacks. This paper introduces novel field-specific anonymization methods that address this trade-off. For IP addresses, we propose a salt-based hashing technique applied at the per-octet level, preserving both subnet and host structure to enable correlation across various log entries while ensuring non-reversibility. For port numbers, full-value hashing with range mapping maintains interpretability. We also present an order-preserving timestamp anonymization scheme using adaptive noise injection, which obfuscates exact times without disrupting event sequences. An open-source tool implementing these techniques has been released to support practical deployment and reproducible research. Evaluations using entropy metrics, collision rates, and residual leakage analysis demonstrate that the proposed approach effectively protects privacy while preserving analytical utility.
Authors:Michiel Marcus, Frank Westers, Anne Nijsten
Abstract:
In this work, we propose a class of equational theories for bounded binary circuits that have the finite variant property. These theories could serve as a building block to specify cryptographic primitive implementations and automatically discover attacks as binary circuits in the symbolic model. We provide proofs of equivalence between this class of equational theories and Boolean logic up to circuit size 3 and we provide the variant complexities and performance benchmarks using Maude-NPA. This is the first result in this direction and follow-up research is needed to improve the scalability of the approach.
Authors:Hyeong Seon Kim, Huy Kang Kim
Abstract:
Modern in-vehicle networks face various cyber threats due to the lack of encryption and authentication in the Controller Area Network (CAN). To address this security issue, this paper presents GUARD-CAN, an anomaly detection framework that combines graph-based representation learning with time-series modeling. GUARD-CAN splits CAN messages into fixed-length windows and converts each window into a graph that preserves message order. To detect anomalies in the timeaware and structure-aware context at the same window, GUARD-CAN takes advantage of the overcomplete Autoencoder (AE) and Graph Convolutional Network (GCN) to generate graph embedding vectors. The model groups these vectors into sequences and feeds them into the Gated Recurrent Unit (GRU) to detect temporal anomaly patterns across the graphs. GUARD-CAN performs anomaly detection at both the sequence level and the window level, and this allows multi-perspective performance evaluation. The model also verifies the importance of window size selection through an analysis based on Shannon entropy. As a result, GUARD-CAN shows that the proposed model detects four types of CAN attacks (flooding, fuzzing, replay and spoofing attacks) effectively without relying on complex feature engineering.
Authors:Shi Pu, Fu Song, Wenjie Wang
Abstract:
Neural networks have received a lot of attention recently, and related security issues have come with it. Many studies have shown that neural networks are vulnerable to adversarial examples that have been artificially perturbed with modification, which is too small to be distinguishable by human perception. Different attacks and defenses have been proposed to solve these problems, but there is little research on evaluating the robustness of neural networks and their inputs. In this work, we propose a metric called the neuron cover change rate (NCCR) to measure the ability of deep learning models to resist attacks and the stability of adversarial examples. NCCR monitors alterations in the output of specifically chosen neurons when the input is perturbed, and networks with a smaller degree of variation are considered to be more robust. The results of the experiment on image recognition and the speaker recognition model show that our metrics can provide a good assessment of the robustness of neural networks or their inputs. It can also be used to detect whether an input is adversarial or not, as adversarial examples are always less robust.
Authors:André Davi Lopes, Tais Mello, Wesley dos Reis Bezerra
Abstract:
This paper presents the development of a distributed digital identity system utilizing modern technologies, including FastAPI, MongoDB, gRPC, Docker, and blockchain simulation with Ganache and Ethereum. The objective is to demonstrate the benefits of distributed systems and blockchain for the security, traceability, and decentralization of digital identities. The methodology included the development of a microservices architecture with JWT authentication, data persistence in MongoDB, simulation of blockchain operations using Ganache, and containerization with Docker. The results demonstrate the feasibility of the proposed approach, with a functional web interface, complete audit logs, and blockchain simulation with Ethereum. The theoretical foundations, technical implementation, results obtained, and prospects for integration with real blockchain networks are discussed.
Authors:Juan Antonio Vieira Giestinhas, Timothy Spiller
Abstract:
The traditional way for a Quantum Key Distribution (QKD) user to join a quantum network is by authenticating themselves using pre-shared key material. While this approach is sufficient for small-scale networks, it becomes impractical as the network grows, due to the total quadratic increase in the number of pre-shared keys required. To address this scalability issue, Public Key Infrastructure (PKI) combined with Post-Quantum Cryptography (PQC) offers a more scalable solution, allowing users to authenticate the QKD traffic remotely to obtain information-theoretical secure (ITS) keys under the presented assumptions. Unlike traditional PKI, which relies on classical cryptographic algorithms such as RSA, the approach presented in this paper leverages PQC algorithms that are believed to be resistant to quantum attacks. Similarly to the SIGMA or TLS protocols, authentication, confidentiality, and integrity are achievable against bounded adversaries to ensure secure and scalable quantum networks.
Authors:Ruiyang Zhao, Bingbing Zhu, Chuxuan Tong, Xiaoyi Zhou, Xi Zheng
Abstract:
Adversarial attack methods for 3D point cloud classification reveal the vulnerabilities of point cloud recognition models. This vulnerability could lead to safety risks in critical applications that use deep learning models, such as autonomous vehicles. To uncover the deficiencies of these models, researchers can evaluate their security through adversarial attacks. However, most existing adversarial attack methods are based on white-box attacks. While these methods achieve high attack success rates and imperceptibility, their applicability in real-world scenarios is limited. Black-box attacks, which are more meaningful in real-world scenarios, often yield poor results. This paper proposes a novel black-box adversarial example generation method that utilizes a diffusion model to improve the attack success rate and imperceptibility in the black-box setting, without relying on the internal information of the point cloud classification model to generate adversarial samples. We use a 3D diffusion model to use the compressed features of the point cloud as prior knowledge to guide the reverse diffusion process to add adversarial points to clean examples. Subsequently, its reverse process is employed to transform the distribution of other categories into adversarial points, which are then added to the point cloud.
Authors:Gauri Sharma, Vidhi Kulkarni, Miles King, Ken Huang
Abstract:
Evolving AI systems increasingly deploy multi-agent architectures where autonomous agents collaborate, share information, and delegate tasks through developing protocols. This connectivity, while powerful, introduces novel security risks. One such risk is a cascading risk: a breach in one agent can cascade through the system, compromising others by exploiting inter-agent trust. In tandem with OWASP's initiative for an Agentic AI Vulnerability Scoring System we define an attack vector, Agent Cascading Injection, analogous to Agent Impact Chain and Blast Radius, operating across networks of agents. In an ACI attack, a malicious input or tool exploit injected at one agent leads to cascading compromises and amplified downstream effects across agents that trust its outputs. We formalize this attack with an adversarial goal equation and key variables (compromised agent, injected exploit, polluted observations, etc.), capturing how a localized vulnerability can escalate into system-wide failure. We then analyze ACI's properties -- propagation chains, amplification factors, and inter-agent compound effects -- and map these to OWASP's emerging Agentic AI risk categories (e.g. Impact Chain and Orchestration Exploits). Finally, we argue that ACI highlights a critical need for quantitative benchmarking frameworks to evaluate the security of agent-to-agent communication protocols. We outline a methodology for stress-testing multi-agent systems (using architectures such as Google's A2A and Anthropic's MCP) against cascading trust failures, developing upon groundwork for measurable, standardized agent-to-agent security evaluation. Our work provides the necessary apparatus for engineers to benchmark system resilience, make data-driven architectural trade-offs, and develop robust defenses against a new generation of agentic threats.
Authors:Emilie Ma, Martin Kleppmann
Abstract:
Kintsugi is a protocol for key recovery, allowing a user to regain access to end-to-end encrypted data after they have lost their device, but still have their (potentially low-entropy) password. Existing E2EE key recovery methods, such as those deployed by Signal and WhatsApp, centralize trust by relying on servers administered by a single provider. Kintsugi is decentralized, distributing trust over multiple recovery nodes, which could be servers run by independent parties, or end user devices in a peer-to-peer setting. To recover a user's keys, a threshold $t+1$ of recovery nodes must assist the user in decrypting a shared backup. Kintsugi is password-authenticated and protects against offline brute-force password guessing without requiring any specialized secure hardware. Kintsugi can tolerate up to $t$ honest-but-curious colluding recovery nodes, as well as $n - t - 1$ offline nodes, and operates safely in an asynchronous network model where messages can be arbitrarily delayed.
Authors:Vyoma Harshitha Podapati, Divyansh Nigam, Sanchari Das
Abstract:
As mobile computing becomes central to digital interaction, researchers have turned their attention to adaptive authentication for its real-time, context- and behavior-aware verification capabilities. However, many implementations remain fragmented, inconsistently apply intelligent techniques, and fall short of user expectations. In this Systematization of Knowledge (SoK), we analyze 41 peer-reviewed studies since 2011 that focus on adaptive authentication in mobile environments. Our analysis spans seven dimensions: privacy and security models, interaction modalities, user behavior, risk perception, implementation challenges, usability needs, and machine learning frameworks. Our findings reveal a strong reliance on machine learning (64.3%), especially for continuous authentication (61.9%) and unauthorized access prevention (54.8%). AI-driven approaches such as anomaly detection (57.1%) and spatio-temporal analysis (52.4%) increasingly shape the interaction landscape, alongside growing use of sensor-based and location-aware models.
Authors:Debesh Choudhury, Sujoy Chakraborty
Abstract:
We propose a technique to protect and preserve a private key or a passcode in an encrypted two-dimensional graphical image. The plaintext private key or the passcode is converted into an encrypted QR code and embedded into a real-life color image with a steganographic scheme. The private key or the passcode is recovered from the stego color image by first extracting the encrypted QR code from the color image, followed by decryption of the QR code. The cryptographic key for encryption of the QR code is generated from the output of a Linear Feedback Shift Register (LFSR), initialized by a seed image chosen by the user. The user can store the seed image securely, without the knowledge of an attacker. Even if an active attacker modifies the seed image (without knowledge of the fact that it is the seed image), the user can easily restore it if he/she keeps multiple copies of it, so that the encryption key can be regenerated easily. Our experiments prove the feasibility of the technique using sample private key data and real-life color images.
Authors:Abdullah Al Siam, Sadequzzaman Shohan
Abstract:
The rapid integration of Artificial Intelligence (AI) into medical diagnostics has raised pressing concerns about patient privacy, especially when sensitive imaging data must be transferred, stored, or processed. In this paper, we propose a novel framework for privacy-preserving diagnostic inference on encrypted medical images using a modified convolutional neural network (Masked-CNN) capable of operating on transformed or ciphered image formats. Our approach leverages AES-CBC encryption coupled with JPEG2000 compression to protect medical images while maintaining their suitability for AI inference. We evaluate the system using public DICOM datasets (NIH ChestX-ray14 and LIDC-IDRI), focusing on diagnostic accuracy, inference latency, storage efficiency, and privacy leakage resistance. Experimental results show that the encrypted inference model achieves performance comparable to its unencrypted counterpart, with only marginal trade-offs in accuracy and latency. The proposed framework bridges the gap between data privacy and clinical utility, offering a practical, scalable solution for secure AI-driven diagnostics.
Authors:Yunming Xiao, Peizhi Liu, Ruijie Yu, Chenkai Weng, Matteo Varvello, Aleksandar Kuzmanovic
Abstract:
There has been a growing interest in Internet user privacy, demonstrated by the popularity of privacy-preserving products such as Telegram and Brave, and the widespread adoption of HTTPS. The Domain Name System (DNS) is a key component of Internet-based communication and its privacy has been neglected for years. Recently, DNS over HTTPS (DoH) has improved the situation by fixing the issue of in-path middleboxes. Further progress has been made with proxy-based solutions such as Oblivious DoH (ODoH), which separate a user's identity from their DNS queries. However, these solutions rely on non-collusion assumptions between DNS resolvers and proxies -- an assumption difficult to guarantee in practice. To address this, we explore integrating single-server Private Information Retrieval (PIR) into DNS to enable encrypted query processing without relying on trust assumptions. However, applying PIR to DNS is challenging due to its hierarchical nature -- particularly, interactions with recursive resolvers can still leak information. Navigating performance and privacy trade-offs, we propose PDNS, a DNS extension leveraging single-server PIR to strengthen privacy guarantees. We have implemented a prototype of PDNS and compared its performance against state-of-the-art solutions via trace-driven experiments. The results show that PDNS achieves acceptable performance (2x faster than DoH over Tor with similar privacy guarantees) and strong privacy guarantees today, mainly at the cost of its scalability, which specialized hardware for PIR can address in the near future.
Authors:Gabriel Downer, Sean Craven, Damian Ruck, Jake Thomas
Abstract:
The increasing integration of Visual Language Models (VLMs) into AI systems necessitates robust model alignment, especially when handling multimodal content that combines text and images. Existing evaluation datasets heavily lean towards text-only prompts, leaving visual vulnerabilities under evaluated. To address this gap, we propose \textbf{Text2VLM}, a novel multi-stage pipeline that adapts text-only datasets into multimodal formats, specifically designed to evaluate the resilience of VLMs against typographic prompt injection attacks. The Text2VLM pipeline identifies harmful content in the original text and converts it into a typographic image, creating a multimodal prompt for VLMs. Also, our evaluation of open-source VLMs highlights their increased susceptibility to prompt injection when visual inputs are introduced, revealing critical weaknesses in the current models' alignment. This is in addition to a significant performance gap compared to closed-source frontier models. We validate Text2VLM through human evaluations, ensuring the alignment of extracted salient concepts; text summarization and output classification align with human expectations. Text2VLM provides a scalable tool for comprehensive safety assessment, contributing to the development of more robust safety mechanisms for VLMs. By enhancing the evaluation of multimodal vulnerabilities, Text2VLM plays a role in advancing the safe deployment of VLMs in diverse, real-world applications.
Authors:Satish Kumar, Md. Arzoo Jamal
Abstract:
Digital signatures are fundamental cryptographic primitives that ensure the authenticity and integrity of digital documents. In the post-quantum era, classical public key-based signature schemes become vulnerable to brute-force and key-recovery attacks due to the computational power of quantum algorithms. Multivariate polynomial based signature schemes are among the one of the cryptographic constructions that offers strong security guarantees against such quantum threats. With the growing capabilities of neural networks, it is natural to explore their potential application in the design of cryptographic primitives. Neural networks inherently captures the non-linear relationships within the data, which are encoded in their synaptic weight matrices and bias vectors. In this paper, we propose a novel construction of a multivariate polynomial based digital signature scheme that leverages neural network architectures. A neural network with binary weights is employed to define the central structure of the signature scheme. The design introduces a recurrent random vector, functionally analogous to an attention mechanism, which contributes dynamic randomness based on the previous state, thereby enhancing the scheme's security. It is demonstrated that the proposed signature scheme provide security against Existential Unforgeability under adaptive Chosen-Message Attacks (EUF-CMA). Furthermore, it is proven that direct attacks aimed to recover the private keys are computationally infeasible within polynomial time, even in the presence of quantum computing abilities. The operational characteristics of the proposed scheme are also evaluated, with results indicating notable efficiency and practical viability in post-quantum cryptographic applications.
Authors:Yichen Zhou, Chenxing Li, Fan Long
Abstract:
This paper presents MPC-EVM, the first blockchain prototype that extends the EVM to enable asynchronous MPC invocations by smart contracts during transaction executions without compromising consistency or throughput. MPC-EVM uses an asynchronous execution model to process MPC-invoking transactions in a non-blocking fashion, saving the transaction's progress when it enters an MPC and resuming its execution upon MPC's completion. Additionally, it employs an access control mechanism that prevents inconsistent state access and modifications as a result of asynchronous executions. Benchmarking MPC-EVM's throughput show that the transactions per second (TPS) decreased by less than 3% compared to the baseline when MPC-invoking transactions are executed alongside regular transactions.
Authors:Howell Xia, Jonah Gluck, Sevval Simsek, David Sastre Medina, David Starobinski
Abstract:
The high complexity of modern software supply chains necessitates tools such as Software Bill of Materials (SBOMs) to manage component dependencies, and Software Composition Analysis (SCA) tools to identify vulnerabilities. While there exists limited integration between SBOMs and SCA tools, a unified view of complex dependency-vulnerability relationships remains elusive. In this paper, we introduce VDGraph, a novel knowledge graph-based methodology for integrating vulnerability and dependency data into a holistic view. VDGraph consolidates SBOM and SCA outputs into a graph representation of software projects' dependencies and vulnerabilities. We provide a formal description and analysis of the theoretical properties of VDGraph and present solutions to manage possible conflicts between the SBOM and SCA data. We further introduce and evaluate a practical, proof-of-concept implementation of VDGraph using two popular SBOM and SCA tools, namely CycloneDX Maven plugin and Google's OSV-Scanner. We apply VDGraph on 21 popular Java projects. Through the formulation of appropriate queries on the graphs, we uncover the existence of concentrated risk points (i.e., vulnerable components of high severity reachable through numerous dependency paths). We further show that vulnerabilities predominantly emerge at a depth of three dependency levels or higher, indicating that direct or secondary dependencies exhibit lower vulnerability density and tend to be more secure. Thus, VDGraph contributes a graph-theoretic methodology that improves visibility into how vulnerabilities propagate through complex, transitive dependencies. Moreover, our implementation, which combines open SBOM and SCA standards with Neo4j, lays a foundation for scalable and automated analysis across real-world projects.
Authors:Avinash Singh, Vikas Pareek, Asish Sharma
Abstract:
Fintech provides technological services to increase operational efficiency in financial institutions, but traditional perimeter-based defense mechanisms are insufficient against evolving cyber threats like insider attacks, malware intrusions, and Advanced Persistent Threats (APTs). These vulnerabilities expose Fintech organizations to significant risks, including financial losses and data breaches. To address these challenges, this paper proposes a blockchain-integrated Zero Trust framework, adhering to the principle of "Never Trust, Always Verify." The framework uses Ethereum smart contracts to enforce Multi Factor Authentication (MFA), Role-Based Access Control (RBAC), and Just-In-Time (JIT) access privileges, effectively mitigating credential theft and insider threats, the effect of malware and APT attacks.
The proposed solution transforms blockchain into a Policy Engine (PE) and Policy Enforcement Point (PEP), and policy storage, ensuring immutable access control and micro-segmentation. A decentralized application (DApp) prototype was developed and tested using STRIDE threat modeling, demonstrating resilience against spoofing, tampering, and privilege escalation. Comparative analysis with Perimeter-based systems revealed a trade-off: while the framework introduced a marginal latency increase (74.0 ms vs. 49.33 ms) and reduced throughput (30.77 vs. 50.0 requests/sec), it significantly enhanced security by eliminating single points of failure and enabling tamper-proof audit trails.
Experimental validation on a 200-node simulated network confirmed the framework's robustness, with future optimizations targeting Layer-2 solutions for scalability. This work bridges the gap between Zero Trust theory and practical blockchain implementation, offering Fintech organizations a decentralized, cost-effective security model.
Authors:Nicola Croce, Tobin South
Abstract:
The Model Context Protocol (MCP) represents a significant advancement in AI-tool integration, enabling seamless communication between AI agents and external services. However, this connectivity introduces novel attack vectors that remain largely unexplored. This paper demonstrates how unsophisticated threat actors, requiring only basic programming skills and free web tools, can exploit MCP's trust model to exfiltrate sensitive financial data. We present a proof-of-concept attack where a malicious weather MCP server, disguised as benign functionality, discovers and exploits legitimate banking tools to steal user account balances. The attack chain requires no advanced technical knowledge, server infrastructure, or monetary investment. The findings reveal a critical security gap in the emerging MCP ecosystem: while individual servers may appear trustworthy, their combination creates unexpected cross-server attack surfaces. Unlike traditional cybersecurity threats that assume sophisticated adversaries, our research shows that the barrier to entry for MCP-based attacks is alarmingly low. A threat actor with undergraduate-level Python knowledge can craft convincing social engineering attacks that exploit the implicit trust relationships MCP establishes between AI agents and tool providers. This work contributes to the nascent field of MCP security by demonstrating that current MCP implementations allow trivial cross-server attacks and proposing both immediate mitigations and protocol improvements to secure this emerging ecosystem.
Authors:Svenja Lage, Hannes Bartz
Abstract:
Private Information Retrieval (PIR) schemes allow clients to retrieve files from a database without disclosing the requested file's identity to the server. In the pursuit of post-quantum security, most recent PIR schemes rely on hard lattice problems. In contrast, the so called CB-cPIR scheme stands out as a pioneering effort to base PIR schemes on hard problems in coding theory, thereby contributing significantly to the diversification of security foundations. However, our research reveals a critical vulnerability in CB-cPIR, substantially diminishing its security levels. Moreover, a comparative analysis with state-of-the-art PIR schemes shows that CB-cPIR's advantages are reduced, making it less competitive in terms of the communication cost. Nevertheless, our findings highlight the importance of continued research into code-based PIR schemes, as they have the potential to provide a valuable alternative to lattice-based approaches.
Authors:André Menolli, Luiz Fernando Nunes, Thiago A. Coleti
Abstract:
The protection of personal data has become a central topic in software development, especially with the implementation of the General Data Protection Law (LGPD) in Brazil and the General Data Protection Regulation (GDPR) in the European Union. With the enforcement of these laws, certain software quality criteria have become mandatory, such as data anonymization, which is one of the main aspects addressed by these regulations. The aim of this article is to analyze data anonymization techniques and assess their effectiveness in ensuring compliance with legal requirements and the utility of the data for its intended purpose. Techniques such as aggregation, generalization, perturbation, and k-anonymity were investigated and applied to datasets containing personal and sensitive data. The analysis revealed significant variations in the effectiveness of each method, highlighting the need to balance privacy and data utility.
Authors:Chao Liu, Shuai Zhao, Chenhao Jia, Gengran Hu, Tingting Cui
Abstract:
Due to the merits of high efficiency and strong security against timing and side-channel attacks, ChaCha has been widely applied in real-time communication and data streaming scenarios. However, with the rapid development of AI-assisted cryptanalysis and quantum computing technologies, there are serious challenges to the secure implementation of ChaCha cipher. To further strengthen the security of ChaCha cipher, we propose an improved variant based on quantum random numbers, i.e., Quantum Random Number Enhanced ChaCha (QRE-ChaCha). Specifically, the design XORs the initial constants with quantum random numbers and periodically injects quantum random numbers into selected state words during odd rounds to enhance diffusion. Compared with the original ChaCha, the present variant shows stronger resistance to differential attacks and generates a keystream with statistical randomness, thereby offering increased robustness against both classical and quantum attacks. To evaluate the security and performance of the present ChaCha, our analysis proceeds in three main parts. Firstly, we analyze its theoretical security in terms of quantum randomness and attack testing, and conduct differential cryptanalysis with an automated search method based on the Boolean satisfiability problem (SAT). Secondly, we subject the keystream generated by the cipher to randomness tests using the NIST statistical test suite and the GM/T 0005-2021 randomness testing standard. Finally, we assess its encryption and decryption performance by measuring its encryption speed on files of various sizes. According to the results, the present ChaCha is significantly improved to resist differential attacks while maintaining the high efficiency of the original ChaCha cipher, and its keystream successfully passes statistical randomness tests using the NIST and GM/T 0005-2021 standards, meeting cryptographic application requirements.
Authors:Nima Atashin, Behrouz Tork Ladani, Mohammadreza Sharbaf
Abstract:
Detecting security vulnerabilities in open-source software is a critical task that is highly regarded in the related research communities. Several approaches have been proposed in the literature for detecting vulnerable codes and identifying the classes of vulnerabilities. However, there is still room to work in explaining the root causes of detected vulnerabilities through locating vulnerable statements and the discovery of paths leading to the activation of the vulnerability. While frameworks like SliceLocator offer explanations by identifying vulnerable paths, they rely on rule-based sink identification that limits their generalization. In this paper, we introduce VulPathFinder, an explainable vulnerability path discovery framework that enhances SliceLocator's methodology by utilizing a novel Graph Neural Network (GNN) model for detecting sink statements, rather than relying on predefined rules. The proposed GNN captures semantic and syntactic dependencies to find potential sink points (PSPs), which are candidate statements where vulnerable paths end. After detecting PSPs, program slicing can be used to extract potentially vulnerable paths, which are then ranked by feeding them back into the target graph-based detector. Ultimately, the most probable path is returned, explaining the root cause of the detected vulnerability. We demonstrated the effectiveness of the proposed approach by performing evaluations on a benchmark of the buffer overflow CWEs from the SARD dataset, providing explanations for the corresponding detected vulnerabilities. The results show that VulPathFinder outperforms both original SliceLocator and GNNExplainer (as a general GNN explainability tool) in discovery of vulnerability paths to identified PSPs.
Authors:Shan Jiang, Pranoy Kovuri, David Tao, Zhixun Tan
Abstract:
Software obfuscation, particularly prevalent in JavaScript, hinders code comprehension and analysis, posing significant challenges to software testing, static analysis, and malware detection. This paper introduces CASCADE, a novel hybrid approach that integrates the advanced coding capabilities of Gemini with the deterministic transformation capabilities of a compiler Intermediate Representation (IR), specifically JavaScript IR (JSIR). By employing Gemini to identify critical prelude functions, the foundational components underlying the most prevalent obfuscation techniques, and leveraging JSIR for subsequent code transformations, CASCADE effectively recovers semantic elements like original strings and API names, and reveals original program behaviors. This method overcomes limitations of existing static and dynamic deobfuscation techniques, eliminating hundreds to thousands of hardcoded rules while achieving reliability and flexibility. CASCADE is already deployed in Google's production environment, demonstrating substantial improvements in JavaScript deobfuscation efficiency and reducing reverse engineering efforts.
Authors:Shams Shaikh, Trima P. Fernandes e Fizardo
Abstract:
As cloud infrastructure becomes the backbone of modern organizations, the security of cryptographic key management, especially using Hardware Security Modules (HSMs) and Trusted Platform Modules (TPMs), faces unprecedented challenges. While these hardware-based solutions offer strong protection in isolated environments, their effectiveness is being undermined by cloud-native threats such as misconfigurations, compromised APIs, and lateral privilege escalations. This paper presents a comprehensive analysis of publicly disclosed attacks and breaches involving HSMs and TPMs in cloud environments, identifying recurring architectural and operational flaws. We propose a taxonomy of attack vectors based on real-world case studies and threat intelligence reports, highlighting the gaps between hardware trust anchors and dynamic cloud ecosystems. Furthermore, we evaluate emerging defensive paradigms: confidential computing, post-quantum cryptography, and decentralized key management systems (dKMS), assessing their potential to address these gaps. Our findings emphasize that securing cloud-based cryptographic trust requires a layered, context-aware approach that integrates both hardware and software safeguards. The study serves as a practical framework for cloud architects and security engineers to reassess key protection strategies in light of evolving threats. To our knowledge, this is the first work to synthesize documented, real-world cloud HSM and TPM failures into a coherent taxonomy grounded in modern threat models.
Authors:Florian Kerschbaum, Steven Lee, Hao Wu
Abstract:
We introduce an algorithm that releases a pure differentially private sparse histogram over $n$ participants drawn from a domain of size $d \gg n$. Our method attains the optimal $\ell_\infty$-estimation error and runs in strictly $O(n \ln \ln d)$ time in the word-RAM model, thereby improving upon the previous best known deterministic-time bound of $\tilde{O}(n^2)$ and resolving the open problem of breaking this quadratic barrier (Balcer and Vadhan, 2019). Central to our algorithm is a novel private item blanket technique with target-length padding, which transforms the approximate differentially private stability-based histogram algorithm into a pure differentially private one.
Authors:Md Min-Ha-Zul Abedin, Tazqia Mehrub
Abstract:
This study investigates the effectiveness of several machine learning algorithms for static malware detection using the EMBER dataset, which contains feature representations of Portable Executable (PE) files. We evaluate eight classification models: LightGBM, XGBoost, CatBoost, Random Forest, Extra Trees, HistGradientBoosting, k-Nearest Neighbors (KNN), and TabNet, under three preprocessing settings: original feature space, Principal Component Analysis (PCA), and Linear Discriminant Analysis (LDA). The models are assessed on accuracy, precision, recall, F1 score, and AUC to examine both predictive performance and robustness. Ensemble methods, especially LightGBM and XGBoost, show the best overall performance across all configurations, with minimal sensitivity to PCA and consistent generalization. LDA improves KNN performance but significantly reduces accuracy for boosting models. TabNet, while promising in theory, underperformed under feature reduction, likely due to architectural sensitivity to input structure. The analysis is supported by detailed exploratory data analysis (EDA), including mutual information ranking, PCA or t-SNE visualizations, and outlier detection using Isolation Forest and Local Outlier Factor (LOF), which confirm the discriminatory capacity of key features in the EMBER dataset. The results suggest that boosting models remain the most reliable choice for high-dimensional static malware detection, and that dimensionality reduction should be applied selectively based on model type. This work provides a benchmark for comparing classification models and preprocessing strategies in malware detection tasks and contributes insights that can guide future system development and real-world deployment.
Authors:Ãlvaro Ruiz-Ródenas, Jaime Pujante Sáez, Daniel GarcÃa-Algora, Mario RodrÃguez Béjar, Jorge Blasco, José Luis Hernández-Ramos
Abstract:
Cyber Threat Intelligence (CTI) mining involves extracting structured insights from unstructured threat data, enabling organizations to understand and respond to evolving adversarial behavior. A key task in CTI mining is mapping threat descriptions to MITRE ATT\&CK techniques. However, this process is often performed manually, requiring expert knowledge and substantial effort. Automated approaches face two major challenges: the scarcity of high-quality labeled CTI data and class imbalance, where many techniques have very few examples. While domain-specific Large Language Models (LLMs) such as SecureBERT have shown improved performance, most recent work focuses on model architecture rather than addressing the data limitations. In this work, we present SynthCTI, a data augmentation framework designed to generate high-quality synthetic CTI sentences for underrepresented MITRE ATT\&CK techniques. Our method uses a clustering-based strategy to extract semantic context from training data and guide an LLM in producing synthetic CTI sentences that are lexically diverse and semantically faithful. We evaluate SynthCTI on two publicly available CTI datasets, CTI-to-MITRE and TRAM, using LLMs with different capacity. Incorporating synthetic data leads to consistent macro-F1 improvements: for example, ALBERT improves from 0.35 to 0.52 (a relative gain of 48.6\%), and SecureBERT reaches 0.6558 (up from 0.4412). Notably, smaller models augmented with SynthCTI outperform larger models trained without augmentation, demonstrating the value of data generation methods for building efficient and effective CTI classification systems.
Authors:Radowanul Haque, Aftab Ali, Sally McClean, Naveed Khan
Abstract:
Detecting security vulnerabilities in source code remains challenging, particularly due to class imbalance in real-world datasets where vulnerable functions are under-represented. Existing learning-based methods often optimise for recall, leading to high false positive rates and reduced usability in development workflows. Furthermore, many approaches lack explainability, limiting their integration into security workflows. This paper presents ExplainVulD, a graph-based framework for vulnerability detection in C/C++ code. The method constructs Code Property Graphs and represents nodes using dual-channel embeddings that capture both semantic and structural information. These are processed by an edge-aware attention mechanism that incorporates edge-type embeddings to distinguish among program relations. To address class imbalance, the model is trained using class-weighted cross-entropy loss. ExplainVulD achieves a mean accuracy of 88.25 percent and an F1 score of 48.23 percent across 30 independent runs on the ReVeal dataset. These results represent relative improvements of 4.6 percent in accuracy and 16.9 percent in F1 score compared to the ReVeal model, a prior learning-based method. The framework also outperforms static analysis tools, with relative gains of 14.0 to 14.1 percent in accuracy and 132.2 to 201.2 percent in F1 score. Beyond improved detection performance, ExplainVulD produces explainable outputs by identifying the most influential code regions within each function, supporting transparency and trust in security triage.
Authors:Lambard Maxence, Bertelle Cyrille, Duvallet Claude
Abstract:
In an increasingly complex contractual landscape, the demand for transparency, security, and efficiency has intensified. Blockchain technology, with its decentralized and immutable nature, addresses these challenges by reducing intermediary costs, minimizing fraud risks, and enhancing system compatibility. Smart contracts, initially conceptualized by Nick Szabo and later implemented on the Ethereum blockchain, automate and secure contractual clauses, offering a robust solution for various industries. However, their complexity and the requirement for advanced programming skills present significant barriers to widespread adoption. This study introduces a multi-level finite state machine model designed to represent and track the execution of smart contracts. Our model aims to simplify smart contract development by providing a formalized framework that abstracts underlying technical complexities, making it accessible to professionals without deep technical expertise. The hierarchical structure of the multi-level finite state machine enhances contract modularity and traceability, facilitating detailed representation and evaluation of functional properties. The paper explores the potential of this multi-level approach, reviewing existing methodologies and tools, and detailing the smart contract generation process with an emphasis on reusable components and modularity. We also conduct a security analysis to evaluate potential vulnerabilities in our model, ensuring the robustness and reliability of the generated smart contracts.
Authors:Alejandro Cuevas, Manoel Horta Ribeiro, Nicolas Christin
Abstract:
Online content creators spend significant time and effort building their user base through a long, often arduous process, which requires finding the right ``niche'' to cater to. So, what incentive is there for an established content creator known for cat memes to completely reinvent their page channel and start promoting cryptocurrency services or cover electoral news events? And, if they do, do their existing subscribers not notice? We explore this problem of \textit{repurposed channels}, whereby a channel changes its identity and contents. We first characterize a market for ``second-hand'' social media accounts, which recorded sales exceeding USD~1M during our 6-month observation period. By observing YouTube channels (re)sold over these 6~months, we find that a substantial number (37\%) are used to disseminate potentially harmful content, often without facing any penalty. Even more surprisingly, these channels seem to gain rather than lose subscribers. To estimate the prevalence of channel repurposing ``in the wild,'' we also collect two snapshots of 1.4M quasi-randomly sampled YouTube accounts. In a 3-month period, we estimate that $\sim$0.25\% channels -- collectively holding $\sim$44M subscribers -- were repurposed. We confirm that these repurposed channels share several characteristics with sold channels -- mainly, the fact that they had a significantly high presence of potentially problematic content. Across repurposed channels, we find channels that became disinformation channels, as well as channels that link to web pages with financial scams. We reason that abusing the residual trust placed on these channels is advantageous to financially- and ideologically-motivated adversaries. This phenomenon is not exclusive to YouTube and we posit that the market for cultivating organic audiences is set to grow, particularly if it remains unchallenged by mitigations, technical or otherwise.
Authors:Xinyuan Zhang, Anrin Chakraborti, Michael Reiter
Abstract:
An oblivious pseudorandom function (OPRF) is a protocol by which a client and server interact to evaluate a pseudorandom function on a key provided by the server and an input provided by the client, without divulging the key or input to the other party. We extend this notion by enabling the server to specify a blocklist, such that OPRF evaluation succeeds only if the client's input is not on the blocklist. More specifically, our design gains performance by embedding the client input into a metric space, where evaluation continues only if this embedding does not cluster with blocklist elements. Our framework exploits this structure to separate the embedding and blocklist check to enable efficient implementations of each, but then must stitch these phases together through cryptographic means. Our framework also supports subsequent evaluation of the OPRF on the same input more efficiently. We demonstrate the use of our design for password blocklisting in augmented password-authenticated key exchange, and to MAC only executables that are not similar to ones on a blocklist of known malware.
Authors:Harsha Sammangi, Aditya Jagatha, Giridhar Reddy Bojja, Jun Liu
Abstract:
AI Innovations in the IoT for Real-Time Patient Monitoring On one hand, the current traditional centralized healthcare architecture poses numerous issues, including data privacy, delay, and security. Here, we present an AI-enabled decentralized IoT architecture that can address such challenges during a pandemic and critical care settings. This work presents our architecture to enhance the effectiveness of the current available federated learning, blockchain, and edge computing approach, maximizing data privacy, minimizing latency, and improving other general system metrics. Experimental results demonstrate transaction latency, energy consumption, and data throughput orders of magnitude lower than competitive cloud solutions.
Authors:Rohit Negi, Amit Negi, Manish Sharma, S. Venkatesan, Prem Kumar, Sandeep K. Shukla
Abstract:
Mega events such as the Olympics, World Cup tournaments, G-20 Summit, religious events such as MahaKumbh are increasingly digitalized. From event ticketing, vendor booth or lodging reservations, sanitation, event scheduling, customer service, crime reporting, media streaming and messaging on digital display boards, surveillance, crowd control, traffic control and many other services are based on mobile and web applications, wired and wireless networking, network of Closed-Circuit Television (CCTV) cameras, specialized control room with network and video-feed monitoring. Consequently, cyber threats directed at such digital infrastructure are common. Starting from hobby hackers, hacktivists, cyber crime gangs, to the nation state actors, all target such infrastructure to unleash chaos on an otherwise smooth operation, and often the cyber threat actors attempt to embarrass the organizing country or the organizers. Unlike long-standing organizations such as a corporate or a government department, the infrastructure of mega-events is temporary, constructed over a short time span in expediency, and often shortcuts are taken to make the deadline for the event. As a result, securing such an elaborate yet temporary infrastructure requires a different approach than securing a standard organizational digital infrastructure. In this paper, we describe our approach to securing MahaKumbh 2025, a 600 million footfall event for 45 days in Prayagraj, India, as a cyber security assessment and risk management oversight team. We chronicle the scope, process, methodology, and outcome of our team's effort to secure this mega event. It should be noted that none of the cyber attacks during the 45-day event was successful. Our goal is to put on record the methodology and discuss what we would do differently in case we work on similar future mega event.
Authors:Andrii Balashov, Olena Ponomarova, Xiaohua Zhai
Abstract:
Large Language Models (LLMs) deployed in enterprise settings (e.g., as Microsoft 365 Copilot) face novel security challenges. One critical threat is prompt inference attacks: adversaries chain together seemingly benign prompts to gradually extract confidential data. In this paper, we present a comprehensive study of multi-stage prompt inference attacks in an enterprise LLM context. We simulate realistic attack scenarios where an attacker uses mild-mannered queries and indirect prompt injections to exploit an LLM integrated with private corporate data. We develop a formal threat model for these multi-turn inference attacks and analyze them using probability theory, optimization frameworks, and information-theoretic leakage bounds. The attacks are shown to reliably exfiltrate sensitive information from the LLM's context (e.g., internal SharePoint documents or emails), even when standard safety measures are in place.
We propose and evaluate defenses to counter such attacks, including statistical anomaly detection, fine-grained access control, prompt sanitization techniques, and architectural modifications to LLM deployment. Each defense is supported by mathematical analysis or experimental simulation. For example, we derive bounds on information leakage under differential privacy-based training and demonstrate an anomaly detection method that flags multi-turn attacks with high AUC. We also introduce an approach called "spotlighting" that uses input transformations to isolate untrusted prompt content, reducing attack success by an order of magnitude. Finally, we provide a formal proof of concept and empirical validation for a combined defense-in-depth strategy. Our work highlights that securing LLMs in enterprise settings requires moving beyond single-turn prompt filtering toward a holistic, multi-stage perspective on both attacks and defenses.
Authors:Ian Hardgrove, John D. Hastings
Abstract:
A fundamental problem in cybersecurity and computer science is determining whether a program is free of bugs and vulnerabilities. Fuzzing, a popular approach to discovering vulnerabilities in programs, has several advantages over alternative strategies, although it has investment costs in the form of initial setup and continuous maintenance. The choice of fuzzing is further complicated when only a binary library is available, such as the case of closed-source and proprietary software. In response, we introduce LibLMFuzz, a framework that reduces costs associated with fuzzing closed-source libraries by pairing an agentic Large Language Model (LLM) with a lightweight tool-chain (disassembler/compiler/fuzzer) to autonomously analyze stripped binaries, plan fuzz strategies, generate drivers, and iteratively self-repair build or runtime errors. Tested on four widely-used Linux libraries, LibLMFuzz produced syntactically correct drivers for all 558 fuzz-able API functions, achieving 100% API coverage with no human intervention. Across the 1601 synthesized drivers, 75.52% were nominally correct on first execution. The results show that LLM-augmented middleware holds promise in reducing the costs of fuzzing black box components and provides a foundation for future research efforts. Future opportunities exist for research in branch coverage.
Authors:Usayd Shahul, J. Harshan
Abstract:
Secure federated learning enables collaborative model training across decentralized users while preserving data privacy. A key component is secure aggregation, which keeps individual updates hidden from both the server and users, while also defending against Byzantine users who corrupt the aggregation. To this end, Jinhyun So et al. recently developed a Byzantine-resilient secure aggregation scheme using a secret-sharing strategy over finite-field arithmetic. However, such an approach can suffer from numerical errors and overflows when applied to real-valued model updates, motivating the need for secure aggregation methods that operate directly over the real domain. We propose FORTA, a Byzantine-resilient secure aggregation framework that operates entirely in the real domain. FORTA leverages Discrete Fourier Transform (DFT) codes for privacy and employs Krum-based outlier detection for robustness. While DFT decoder is error-free under infinite precision, finite precision introduces numerical perturbations that can distort distance estimates and allow malicious updates to evade detection. To address this, FORTA refines Krum using feedback from DFT decoder, improving the selection of trustworthy updates. Theoretical analysis and experiments show that our modification of Krum offers improved robustness and more accurate aggregation than standard Krum.
Authors:Vanja StojanoviÄ, Žiga Lesar, CIril Bohak
Abstract:
We investigate the cryptanalysis of affine ciphers using a hybrid neural network architecture that combines modular arithmetic-aware and statistical feature-based learning. Inspired by recent advances in interpretable neural networks for modular arithmetic and neural cryptanalysis of classical ciphers, our approach integrates a modular branch that processes raw ciphertext sequences and a statistical branch that leverages letter frequency features. Experiments on datasets derived from natural English text demonstrate that the hybrid model attains high key recovery accuracy for short and moderate ciphertexts, outperforming purely statistical approaches for the affine cipher. However, performance degrades for very long ciphertexts, highlighting challenges in model generalization.
Authors:Irena Spasojevic, Federica Celegato, Alessandro Magni, Paola Tiberto, Jordi Sort
Abstract:
The Big Data revolution has heightened the demand for robust, energy-efficient security hardware capable of withstanding increasingly sophisticated cyber threats. Conventional encryption schemes, reliant on complex algorithms, are resource-intensive and remain vulnerable. To fortify sensitive information, society needs innovative anti-hacking and anti-counterfeiting technologies that exploit new materials and designs. Here, we present a magneto-ionic strategy for hardware-level security based on fully selective voltage-controlled N3- ion migration within pre-defined, initially paramagnetic FeCoN dots. This process generates ferromagnetic sublayers of tuneable thickness, resulting in either deterministic (single-domain or vortex) or probabilistic states (with coexisting magnetic configurations and voltage-adjustable probabilities), each exhibiting stochastic orientation and chirality, thereby providing a rich platform for magnetic fingerprinting. This approach enables self-protected primitives, including true random number generators, physical unclonable functions, and in-memory probabilistic inference. The resulting reconfigurable architecture combines tamper resistance, low energy consumption, and scalability, marking a significant leap toward next-generation hardware security rooted in emergent magnetic phenomena.
Authors:Richard M. Charles, James H. Curry, Richard B. Charles
Abstract:
The integration of Large Language Models (LLMs) in K--12 education offers both transformative opportunities and emerging risks. This study explores how students may Trojanize prompts to elicit unsafe or unintended outputs from LLMs, bypassing established content moderation systems with safety guardrils. Through a systematic experiment involving simulated K--12 queries and multi-turn dialogues, we expose key vulnerabilities in GPT-3.5 and GPT-4. This paper presents our experimental design, detailed findings, and a prototype tool, TrojanPromptGuard (TPG), to automatically detect and mitigate Trojanized educational prompts. These insights aim to inform both AI safety researchers and educational technologists on the safe deployment of LLMs for educators.
Authors:Andriamifidisoa Ramamonjy, Rufine Marius Lalasoa
Abstract:
We introduce DM-RSA (Dual Modulus RSA), a variant of the RSA cryptosystem that employs two distinct moduli symmetrically to enhance security. By leveraging the Chinese Remainder Theorem (CRT) for decryption, DM-RSA provides increased robustness against side-channel attacks while preserving the efficiency of classical RSA. This approach improves resistance to partial compromise of a modulus and integrates easily into existing infrastructures.
Authors:Xinyu Cao, Bimal Adhikari, Shangqing Zhao, Jingxian Wu, Yanjun Pan
Abstract:
Radio frequency (RF) fingerprinting, which extracts unique hardware imperfections of radio devices, has emerged as a promising physical-layer device identification mechanism in zero trust architectures and beyond 5G networks. In particular, deep learning (DL) methods have demonstrated state-of-the-art performance in this domain. However, existing approaches have primarily focused on enhancing system robustness against temporal and spatial variations in wireless environments, while the security vulnerabilities of these DL-based approaches have often been overlooked. In this work, we systematically investigate the security risks of DL-based RF fingerprinting systems through an adversarial-driven experimental analysis. We observe a consistent misclassification behavior for DL models under domain shifts, where a device is frequently misclassified as another specific one. Our analysis based on extensive real-world experiments demonstrates that this behavior can be exploited as an effective backdoor to enable external attackers to intrude into the system. Furthermore, we show that training DL models on raw received signals causes the models to entangle RF fingerprints with environmental and signal-pattern features, creating additional attack vectors that cannot be mitigated solely through post-processing security methods such as confidence thresholds.
Authors:Feng Yu, Ryan Laird
Abstract:
The rise of blockchain and Digital Ledger Technology (DLT) has gained wide traction. Instead of relying on a traditional centralized data authority, a blockchain system consists of digitally entangled block data shared across a distributed network. The specially designed chain data structure and its consensus mechanism protect blockchain data from being tampered by unauthorized adversaries. However, implementing a full-fledged blockchain system to protect a database can be technically cumbersome. In this work, we introduce an in-database design, named chain table, to protect data integrity without the need for a blockchain system. It features a succinct design without significant technology barriers or storage overhead. To realize rigorous data security, we also propose a set of data writing principles for the chain table. We prove that the chain table, together with the data writing principles, will guarantee flexible data integrity, named table-level data integrity (TDI).
Authors:Shantanav Chakraborty, Soonwon Choi, Soumik Ghosh, Tudor GiurgicÄ-Tiron
Abstract:
Deep thermalization refers to the emergence of Haar-like randomness from quantum systems upon partial measurements. As a generalization of quantum thermalization, it is often associated with high complexity and entanglement. Here, we introduce computational deep thermalization and construct the fastest possible dynamics exhibiting it at infinite effective temperature. Our circuit dynamics produce quantum states with low entanglement in polylogarithmic depth that are indistinguishable from Haar random states to any computationally bounded observer. Importantly, the observer is allowed to request many copies of the same residual state obtained from partial projective measurements on the state -- this condition is beyond the standard settings of quantum pseudorandomness, but natural for deep thermalization. In cryptographic terms, these states are pseudorandom, pseudoentangled, and crucially, retain these properties under local measurements. Our results demonstrate a new form of computational thermalization, where thermal-like behavior arises from structured quantum states endowed with cryptographic properties, instead of from highly unstructured ensembles. The low resource complexity of preparing these states suggests scalable simulations of deep thermalization using quantum computers. Our work also motivates the study of computational quantum pseudorandomness beyond BQP observers.
Authors:Niveen O. Jaffal, Mohammed Alkhanafseh, David Mohaisen
Abstract:
Large Language Models (LLMs) are transforming cybersecurity by enabling intelligent, adaptive, and automated approaches to threat detection, vulnerability assessment, and incident response. With their advanced language understanding and contextual reasoning, LLMs surpass traditional methods in tackling challenges across domains such as IoT, blockchain, and hardware security. This survey provides a comprehensive overview of LLM applications in cybersecurity, focusing on two core areas: (1) the integration of LLMs into key cybersecurity domains, and (2) the vulnerabilities of LLMs themselves, along with mitigation strategies. By synthesizing recent advancements and identifying key limitations, this work offers practical insights and strategic recommendations for leveraging LLMs to build secure, scalable, and future-ready cyber defense systems.
Authors:Yu Cui, Hongyang Du
Abstract:
Multi-agent debate (MAD) systems leverage collaborative interactions among large language models (LLMs) agents to improve reasoning capabilities. While recent studies have focused on increasing the accuracy and scalability of MAD systems, their security vulnerabilities have received limited attention. In this work, we introduce MAD-Spear, a targeted prompt injection attack that compromises a small subset of agents but significantly disrupts the overall MAD process. Manipulated agents produce multiple plausible yet incorrect responses, exploiting LLMs' conformity tendencies to propagate misinformation and degrade consensus quality. Furthermore, the attack can be composed with other strategies, such as communication attacks, to further amplify its impact by increasing the exposure of agents to incorrect responses. To assess MAD's resilience under attack, we propose a formal definition of MAD fault-tolerance and develop a comprehensive evaluation framework that jointly considers accuracy, consensus efficiency, and scalability. Extensive experiments on five benchmark datasets with varying difficulty levels demonstrate that MAD-Spear consistently outperforms the baseline attack in degrading system performance. Additionally, we observe that agent diversity substantially improves MAD performance in mathematical reasoning tasks, which challenges prior work suggesting that agent diversity has minimal impact on performance. These findings highlight the urgent need to improve the security in MAD design.
Authors:Zhuohan Cui, Zikun Song
Abstract:
This paper presents a comprehensive analysis of T-Mobile's critical data breaches in 2021 and 2023, alongside a full-spectrum security audit targeting its systems, infrastructure, and publicly exposed endpoints. By combining case-based vulnerability assessments with active ethical hacking techniques--including Shodan reconnaissance, API misuse simulations, VNC brute-forcing, firmware reverse engineering, and web application scans--we uncover structural weaknesses persisting beyond the initial breach events. Building on these findings, we propose a multi-layered defensive strategy encompassing Zero Trust Architecture, granular role-based access control, network segmentation, firmware encryption using AES with integrity checks, and API rate limiting and token lifecycle control. Financial modelling demonstrates that a five-year investment yields less than 1.1% of expected breach losses, validating the cost-effectiveness of proactive security measures. Our work bridges post-incident forensic analysis with hands-on security evaluation, providing an actionable blueprint for large-scale telecoms seeking operational resilience, regulatory compliance, and cross-domain threat readiness.
Authors:Victoria Childress, Josh Collyer, Jodie Knapp
Abstract:
Architectural backdoors pose an under-examined but critical threat to deep neural networks, embedding malicious logic directly into a model's computational graph. Unlike traditional data poisoning or parameter manipulation, architectural backdoors evade standard mitigation techniques and persist even after clean retraining. This survey systematically consolidates research on architectural backdoors, spanning compiler-level manipulations, tainted AutoML pipelines, and supply-chain vulnerabilities. We assess emerging detection and defense strategies, including static graph inspection, dynamic fuzzing, and partial formal verification, and highlight their limitations against distributed or stealth triggers. Despite recent progress, scalable and practical defenses remain elusive. We conclude by outlining open challenges and proposing directions for strengthening supply-chain security, cryptographic model attestations, and next-generation benchmarks. This survey aims to guide future research toward comprehensive defenses against structural backdoor threats in deep learning systems.
Authors:Rishane Dassanayake, Mario Demetroudi, James Walpole, Lindley Lentati, Jason R. Brown, Edward James Young
Abstract:
Frontier AI systems are rapidly advancing in their capabilities to persuade, deceive, and influence human behaviour, with current models already demonstrating human-level persuasion and strategic deception in specific contexts. Humans are often the weakest link in cybersecurity systems, and a misaligned AI system deployed internally within a frontier company may seek to undermine human oversight by manipulating employees. Despite this growing threat, manipulation attacks have received little attention, and no systematic framework exists for assessing and mitigating these risks. To address this, we provide a detailed explanation of why manipulation attacks are a significant threat and could lead to catastrophic outcomes. Additionally, we present a safety case framework for manipulation risk, structured around three core lines of argument: inability, control, and trustworthiness. For each argument, we specify evidence requirements, evaluation methodologies, and implementation considerations for direct application by AI companies. This paper provides the first systematic methodology for integrating manipulation risk into AI safety governance, offering AI companies a concrete foundation to assess and mitigate these threats before deployment.
Authors:Kai Malcolm, César Uribe, Momona Yamagami
Abstract:
Invasive and non-invasive neural interfaces hold promise as high-bandwidth input devices for next-generation technologies. However, neural signals inherently encode sensitive information about an individual's identity and health, making data sharing for decoder training a critical privacy challenge. Federated learning (FL), a distributed, privacy-preserving learning framework, presents a promising solution, but it remains unexplored in closed-loop adaptive neural interfaces. Here, we introduce FL-based neural decoding and systematically evaluate its performance and privacy using high-dimensional electromyography signals in both open- and closed-loop scenarios. In open-loop simulations, FL significantly outperformed local learning baselines, demonstrating its potential for high-performance, privacy-conscious neural decoding. In contrast, closed-loop user studies required adapting FL methods to accommodate single-user, real-time interactions, a scenario not supported by standard FL. This modification resulted in local learning decoders surpassing the adapted FL approach in closed-loop performance, yet local learning still carried higher privacy risks. Our findings highlight a critical performance-privacy tradeoff in real-time adaptive applications and indicate the need for FL methods specifically designed for co-adaptive, single-user applications.
Authors:Sheng Liu, Panos Papadimitratos
Abstract:
Federated Learning (FL) has emerged as a promising solution for privacy-preserving autonomous driving, specifically camera-based Road Condition Classification (RCC) systems, harnessing distributed sensing, computing, and communication resources on board vehicles without sharing sensitive image data. However, the collaborative nature of FL-RCC frameworks introduces new vulnerabilities: Targeted Label Flipping Attacks (TLFAs), in which malicious clients (vehicles) deliberately alter their training data labels to compromise the learned model inference performance. Such attacks can, e.g., cause a vehicle to mis-classify slippery, dangerous road conditions as pristine and exceed recommended speed. However, TLFAs for FL-based RCC systems are largely missing. We address this challenge with a threefold contribution: 1) we disclose the vulnerability of existing FL-RCC systems to TLFAs; 2) we introduce a novel label-distance-based metric to precisely quantify the safety risks posed by TLFAs; and 3) we propose FLARE, a defensive mechanism leveraging neuron-wise analysis of the output layer to mitigate TLFA effects. Extensive experiments across three RCC tasks, four evaluation metrics, six baselines, and three deep learning models demonstrate both the severity of TLFAs on FL-RCC systems and the effectiveness of FLARE in mitigating the attack impact.
Authors:Omri Shmueli, Mark Zhandry
Abstract:
One-shot signatures (OSS) were defined by Amos, Georgiou, Kiayias, and Zhandry (STOC'20). These allow for signing exactly one message, after which the signing key self-destructs, preventing a second message from ever being signed. While such an object is impossible classically, Amos et al observe that OSS may be possible using quantum signing keys by leveraging the no-cloning principle. OSS has since become an important conceptual tool with many applications in decentralized settings and for quantum cryptography with classical communication. OSS are also closely related to separations between classical-binding and collapse-binding for post-quantum hashing and commitments. Unfortunately, the only known OSS construction due to Amos et al. was only justified in a classical oracle model, and moreover their justification was ultimately found to contain a fatal bug. Thus, the existence of OSS, even in a classical idealized model, has remained open.
We give the first standard-model OSS, with provable security assuming (sub-exponential) indistinguishability obfuscation (iO) and LWE. This also gives the first standard-model separation between classical and collapse-binding post-quantum commitments/hashing, solving a decade-old open problem. Along the way, we also give the first construction with unconditional security relative to a classical oracle. To achieve our standard-model construction, we develop a notion of permutable pseudorandom permutations (permutable PRPs), and show how they are useful for translating oracle proofs involving random permutations into obfuscation-based proofs. In particular, obfuscating permutable PRPs gives a trapdoor one-way permutation that is \emph{full-domain}, solving another decade-old-problem of constructing this object from (sub-exponential) iO and one-way functions.
Authors:Zequan Huang, Jacques Robin, Nicolas Herbaut, Nourhène Ben Rabah, Bénédicte Le Grand
Abstract:
Modern Security Orchestration, Automation, and Response (SOAR) platforms must rapidly adapt to continuously evolving cyber attacks. Intent-Based Networking has emerged as a promising paradigm for cyber attack mitigation through high-level declarative intents, which offer greater flexibility and persistency than procedural actions. In this paper, we bridge the gap between two active research directions: Intent-Based Cyber Defense and Autonomic Cyber Defense, by proposing a unified, ontology-driven security intent definition leveraging the MITRE-D3FEND cybersecurity ontology. We also propose a general two-tiered methodology for integrating such security intents into decision-theoretic Autonomic Cyber Defense systems, enabling hierarchical and context-aware automated response capabilities. The practicality of our approach is demonstrated through a concrete use case, showcasing its integration within next-generation Security Orchestration, Automation, and Response platforms.
Authors:Ifiyemi Leigha, Basak Comlekcioglu, Maria Pilar Bezanilla
Abstract:
Distributed Denial of Service (DDoS) attacks have become increasingly prevalent and dangerous in the context of Internet of Things (IoT) networks, primarily due to the low-security configurations of many connected devices. This paper analyzes the nature and impact of DDoS attacks such as those launched by the Mirai botnet, and proposes layered mitigation strategies tailored to IoT environments. Key solutions explored include IPv6 Unique Local Addresses (ULA), edge computing, software-defined networking (SDN), honeypot deception, and machine learning-based intrusion detection systems. The paper aims to help engineers and researchers understand and implement practical countermeasures to protect IoT infrastructures.
Authors:Adhwaa Alchaab, Ayman Younis, Dario Pompili
Abstract:
Next-Generation Radio Access Networks (NGRAN) aim to support diverse vertical applications with strict security, latency, and Service-Level Agreement (SLA) requirements. These demands introduce challenges in securing the infrastructure, allocating resources dynamically, and enabling real-time reconfiguration. This demo presents SnSRIC, a secure and intelligent network slicing framework that mitigates a range of Distributed Denial-of-Service (DDoS) attacks in Open RAN environments. SnSRIC incorporates an AI-driven xApp that dynamically allocates Physical Resource Blocks (PRBs) to active users while enforcing slice-level security. The system detects anomalous behavior, distinguishes between benign and malicious devices, and uses the E2 interface to throttle rogue signaling while maintaining service continuity for legitimate users.
Authors:Frederik Marinus Trudslev, Matteo Lissandrini, Juan Manuel Rodriguez, Martin Bøgsted, Daniele Dell'Aglio
Abstract:
Privacy Preserving Synthetic Data Generation (PP-SDG) has emerged to produce synthetic datasets from personal data while maintaining privacy and utility. Differential privacy (DP) is the property of a PP-SDG mechanism that establishes how protected individuals are when sharing their sensitive data. It is however difficult to interpret the privacy budget ($\varepsilon$) expressed by DP. To make the actual risk associated with the privacy budget more transparent, multiple privacy metrics (PMs) have been proposed to assess the privacy risk of the data. These PMs are utilized in separate studies to assess newly introduced PP-SDG mechanisms. Consequently, these PMs embody the same assumptions as the PP-SDG mechanism they were made to assess. Therefore, a thorough definition of how these are calculated is necessary. In this work, we present the assumptions and mathematical formulations of 17 distinct privacy metrics.
Authors:Pedro Almansa Jiménez, Lorenzo Fernández Maimó, Ãngel Luis Peráles Gómez
Abstract:
The main objective of this technical report is to conduct a comprehensive study on devices operating within Industrial Internet of Things (IIoT) environments, describing the scenarios that define this category and analysing the vulnerabilities that compromise their security. To this end, the report seeks to identify and examine the main classes of IIoT devices, detailing their characteristics, functionalities, and roles within industrial systems. This analysis enables a better understanding of how these devices interact and fulfil the requirements of critical industrial environments. The report also explores the specific contexts in which these devices operate, highlighting the distinctive features of industrial scenarios and the conditions under which the devices function. Furthermore, it analyses the vulnerabilities affecting IIoT devices, outlining their vectors, targets, impact, and consequences. The report then describes the typical phases of an attack, along with a selection of real-world documented incidents. These cases are classified according to the taxonomy presented in Section 3, providing a comprehensive view of the potential threats to security and assessing the impact these vulnerabilities may have on industrial environments. Finally, the report presents a compilation of some of the most recent and effective security countermeasures as potential solutions to the security challenges faced by industrial systems. Special emphasis is placed on the role of Machine Learning in the development of these approaches, underscoring its importance in enhancing industrial cybersecurity.
Authors:Jianyao Yin, Luca Arnaboldi, Honglong Chen, Pascal Berrang
Abstract:
Backdoor attacks involve either poisoning the training data or directly modifying the model in order to implant a hidden behavior, that causes the model to misclassify inputs when a specific trigger is present. During inference, the model maintains high accuracy on benign samples but misclassifies poisoned samples into an attacker-specified target class. Existing research on backdoor attacks has explored developing triggers in the spatial, spectral (frequency), and semantic (feature) domains, aiming to make them stealthy. While some approaches have considered designing triggers that are imperceptible in both spatial and spectral domains, few have incorporated the semantic domain. In this paper, we propose a novel backdoor attack, termed 3S-attack, which is stealthy across the spatial, spectral, and semantic domains. The key idea is to exploit the semantic features of benign samples as triggers, using Gradient-weighted Class Activation Mapping (Grad-CAM) and a preliminary model for extraction. The trigger is then embedded in the spectral domain, followed by pixel-level restrictions after converting the samples back to the spatial domain. This process minimizes the distance between poisoned and benign samples, making the attack harder to detect by existing defenses and human inspection. Extensive experiments on various datasets, along with theoretical analysis, demonstrate the stealthiness of 3S-attack and highlight the need for stronger defenses to ensure AI security. Our code is available at: https://anonymous.4open.science/r/anon-project-3776/
Authors:Yin Li, Sharad Mehrota, Shantanu Sharma, Komal Kumari
Abstract:
This paper presents a novel key-based access control technique for secure outsourcing key-value stores where values correspond to documents that are indexed and accessed using keys. The proposed approach adopts Shamir's secret-sharing that offers unconditional or information-theoretic security. It supports keyword-based document retrieval while preventing leakage of the data, access rights of users, or the size (\textit{i}.\textit{e}., volume of the output that satisfies a query). The proposed approach allows servers to detect (and abort) malicious clients from gaining unauthorized access to data, and prevents malicious servers from altering data undetected while ensuring efficient access -- it takes 231.5ms over 5,000 keywords across 500,000 files.
Authors:Xiaojian Zhang, Junqing Wang, Kerui Chen, Peiyuan Zhao, Huiyuan Bai
Abstract:
Given a graph $G$ defined in a domain $\mathcal{G}$, we investigate locally differentially private mechanisms to release a degree sequence on $\mathcal{G}$ that accurately approximates the actual degree distribution. Existing solutions for this problem mostly use graph projection techniques based on edge deletion process, using a threshold parameter $θ$ to bound node degrees. However, this approach presents a fundamental trade-off in threshold parameter selection. While large $θ$ values introduce substantial noise in the released degree sequence, small $θ$ values result in more edges removed than necessary. Furthermore, $θ$ selection leads to an excessive communication cost. To remedy existing solutions' deficiencies, we present CADR-LDP, an efficient framework incorporating encryption techniques and differentially private mechanisms to release the degree sequence. In CADR-LDP, we first use the crypto-assisted Optimal-$θ$-Selection method to select the optimal parameter with a low communication cost. Then, we use the LPEA-LOW method to add some edges for each node with the edge addition process in local projection. LPEA-LOW prioritizes the projection with low-degree nodes, which can retain more edges for such nodes and reduce the projection error. Theoretical analysis shows that CADR-LDP satisfies $ε$-node local differential privacy. The experimental results on eight graph datasets show that our solution outperforms existing methods.
Authors:Zihe Yan, Zhuosheng Zhang
Abstract:
Graphical user interface (GUI) agents built on multimodal large language models (MLLMs) have recently demonstrated strong decision-making abilities in screen-based interaction tasks. However, they remain highly vulnerable to pop-up-based environmental injection attacks, where malicious visual elements divert model attention and lead to unsafe or incorrect actions. Existing defense methods either require costly retraining or perform poorly under inductive interference. In this work, we systematically study how such attacks alter the attention behavior of GUI agents and uncover a layer-wise attention divergence pattern between correct and incorrect outputs. Based on this insight, we propose \textbf{LaSM}, a \textit{Layer-wise Scaling Mechanism} that selectively amplifies attention and MLP modules in critical layers. LaSM improves the alignment between model saliency and task-relevant regions without additional training. Extensive experiments across 12 types of pop-up perturbations and 4 different model backbones show that LaSM consistently enhances the defense success rate. When combined with prompt-level alerts, LaSM achieves over 98\% robustness even under strong inductive attacks. Our findings reveal that attention misalignment is a core vulnerability in MLLM agents and can be effectively addressed through selective layer-wise modulation.
Authors:Debnath Ghosh, Soumit Roy, Prithwi Bagchi, Indranil Chakrabarty, Ashok Kumar Das
Abstract:
Quantum digital signatures ensure unforgeable message authenticity and integrity using quantum principles, offering unconditional security against both classical and quantum attacks. They are crucial for secure communication in high-stakes environments, ensuring trust and long-term protection in the quantum era. Nowadays, the majority of arbitrated quantum signature (AQS) protocols encrypt data qubit by qubit using the quantum one-time pad (QOTP). Despite providing robust data encryption, QOTP is not a good fit for AQS because of its susceptibility to many types of attacks. In this work, we present an efficient AQS protocol to encrypt quantum message ensembles using a distinct encryption technique, the chained controlled unitary operations. In contrast to existing protocols, our approach successfully prevents disavowal and forgery attacks. We hope this contributes to advancing future investigations into the development of AQS protocols.
Authors:Guntur Dharma Putra, Bagus Rakadyanto Oktavianto Putra
Abstract:
Self-Sovereign Identity (SSI) offers significant potential for managing identities in the Internet of Things (IoT), enabling decentralized authentication and credential management without reliance on centralized entities. However, existing SSI frameworks often limit credential issuance and revocation to trusted entities, such as IoT manufacturers, which restricts flexibility in dynamic IoT ecosystems. In this paper, we propose a blockchain-based SSI framework that allows any individual with a verifiable trust linkage to act as a credential issuer, ensuring decentralized and scalable identity management. Our framework incorporates a layered architecture, where trust is dynamically established through endorsement-based calculations and maintained via a hierarchical chain-of-trust mechanism. Blockchain serves as the Verifiable Data Registry, ensuring transparency and immutability of identity operations, while smart contracts automate critical processes such as credential issuance, verification, and revocation. A proof-of-concept implementation demonstrates that the proposed framework is feasible and incurs minimal overheads compared to the baseline, making it well-suited for dynamic and resource-constrained IoT environments.
Authors:Yasir Ech-Chammakhy, Anas Motii, Anass Rabii, Jaafar Chbili
Abstract:
Hacker forums provide critical early warning signals for emerging cybersecurity threats, but extracting actionable intelligence from their unstructured and noisy content remains a significant challenge. This paper presents an unsupervised framework that automatically detects, clusters, and prioritizes security events discussed across hacker forum posts. Our approach leverages Transformer-based embeddings fine-tuned with contrastive learning to group related discussions into distinct security event clusters, identifying incidents like zero-day disclosures or malware releases without relying on predefined keywords. The framework incorporates a daily ranking mechanism that prioritizes identified events using quantifiable metrics reflecting timeliness, source credibility, information completeness, and relevance. Experimental evaluation on real-world hacker forum data demonstrates that our method effectively reduces noise and surfaces high-priority threats, enabling security analysts to mount proactive responses. By transforming disparate hacker forum discussions into structured, actionable intelligence, our work addresses fundamental challenges in automated threat detection and analysis.
Authors:MichaŠJóźwik, Johan Pouwelse
Abstract:
The digitization of democratic processes promises greater accessibility but presents challenges in terms of security, privacy, and verifiability. Existing electronic voting systems often rely on centralized architectures, creating single points of failure and forcing too much trust in authorities, which contradicts democratic principles. This research addresses the challenge of creating a secure, private e-voting system with minimized trust dependencies designed for the most versatile personal device: the smartphone. We introduce SmartphoneDemocracy, a novel e-voting protocol that combines three key technologies: the emerging European Digital Identity (EUDI) Wallet for Sybil-resistant identity verification, Zero-Knowledge Proofs for privacy-preserving validation, and a peer-to-peer blockchain (TrustChain) for a resilient, serverless public bulletin board. Our protocol enables voters to register and cast ballots anonymously and verifiably directly from their smartphones. We provide a detailed protocol design, a security analysis against a defined threat model, and a performance evaluation demonstrating that the computational and network overhead is feasible for medium- to large-scale elections. By developing and prototyping this system, we demonstrate a viable path to empower citizens with a trustworthy, accessible, and user-controlled digital voting experience.
Authors:Matous Kozak, Roshanak Zilouchian Moghaddam, Siva Sivaraman
Abstract:
LLM-based coding agents are rapidly being deployed in software development, yet their safety implications remain poorly understood. These agents, while capable of accelerating software development, may exhibit unsafe behaviors during normal operation that manifest as cybersecurity vulnerabilities. We conducted the first systematic safety evaluation of autonomous coding agents, analyzing over 12,000 actions across five state-of-the-art models (GPT-4o, GPT-4.1, Claude variants) on 93 real-world software setup tasks. Our findings reveal significant security concerns: 21% of agent trajectories contained insecure actions, with models showing substantial variation in unsafe behavior. We developed a high-precision detection system that identified four major vulnerability categories, with information exposure (CWE-200) being the most prevalent one. We also evaluated mitigation strategies including feedback mechanisms and security reminders with various effectiveness between models. GPT-4.1 demonstrated exceptional security awareness with 96.8% mitigation success.
Authors:Artem Chystiakov, Mariia Zhvanko
Abstract:
Transparency is one of the key benefits of public blockchains. However, the public visibility of transactions potentially compromises users' privacy. The fundamental challenge is to balance the intrinsic benefits of blockchain openness with the vital need for individual confidentiality. The proposal suggests creating a confidential version of wrapped Ethereum (cWETH) fully within the application layer. The solution combines the Elliptic Curve (EC) Twisted ElGamal-based commitment scheme to preserve confidentiality and the EC Diffie-Hellman (DH) protocol to introduce accessibility limited by the commitment scheme. To enforce the correct generation of commitments, encryption, and decryption, zk-SNARKs are utilized.
Authors:Andreas Pramendorfer, Rainhard Dieter Findling
Abstract:
Protecting personal computers (PCs) from unauthorized access typically relies on password authentication, which is know to suffer from cognitive burden and weak credentials. As many users nowadays carry mobile devices with advanced security features throughout their day, there is an opportunity to leverage these devices to improve authentication to PCs. In this paper we utilize a token-based passwordless approach where users authenticate to their PC by confirming the authentication request on their smartphones or smartwatches. Upon a request to login to the PC, or to evaluate privileges, the PC issues an authentication request that users receive on their mobile devices, where users can confirm or deny the request. We evaluate button tap and biometric fingerprint verification as confirmation variants, and compare their authentication duration, success rate, and usability to traditional password-based authentication in a user study with 30 participants and a total of 1,200 authentication attempts. Smartwatch-based authentication outperformed password-based authentication and smartphone-based variants in authentication duration, while showing comparable success rates. Participants rated smartwatch-based authentication highest in usability, followed by password-based authentication and smartphone-based authentication.
Authors:Moe Kayali, Jonas Schmitt, Franziska Roesner
Abstract:
We propose a method for using Web Authentication APIs for SSH authentication, enabling passwordless remote server login with passkeys. These are credentials that are managed throughout the key lifecycle by an authenticator on behalf of the user and offer strong security guarantees.
Passwords remain the dominant mode of SSH authentication, despite their well known flaws such as phishing and reuse. SSH's custom key-based authentication protocol can alleviate these issues but remains vulnerable to key theft. Additionally, it has poor usability, with even knowledgeable users leaking key material and failing to verify fingerprints. Hence, effective key management remains a critical open area in SSH security. In contrast, WebAuthn is a modern authentication standard designed to replace passwords, managing keys on behalf of the user. As a web API, this standard cannot integrate with SSH directly.
We propose a framework to integrate WebAuthn with SSH servers, by using UNIX pluggable authentication modules (PAM). Our approach is backwards-compatible, supports stock SSH servers and requires no new software client-side. It offers protection for cryptographic material at rest, resistance to key leaks, phishing protection, privacy protection and attestation capability. None of these properties are offered by passwords nor traditional SSH keys. We validate these advantages with a structured, conceptual security analysis.
We develop a prototype implementation and conduct a user study to quantify the security advantages of our proposal, testing our prototype with 40 SSH users. The study confirms the security problems of SSH-keys, including 20% of the cohort leaking their private keys. Our SSH-passkeys effectively address these problems: we find a 90% reduction in critical security errors, while reducing authentication time by 4x on average.
Authors:Katherine Limes, Nathan Malkin, Kelsey R. Fulton
Abstract:
Increasingly, students begin learning aspects of security and privacy during their primary and secondary education (grades K-12 in the United States). Individual U.S. states and some national organizations publish teaching standards -- guidance that outlines expectations for what students should learn -- which often form the basis for course curricula. However, research has not yet examined what is covered by these standards and whether the topics align with what the broader security and privacy community thinks students should know. To shed light on these questions, we started by collecting computer science teaching standards from all U.S. states and eight national organizations. After manually examining a total of 11,954 standards, we labeled 3,778 of them as being related to security and privacy, further classifying these into 103 topics. Topics ranged from technical subjects like encryption, network security, and embedded systems to social subjects such as laws, ethics, and appropriate online behavior. Subsequently, we interviewed 11 security and privacy professionals to examine how the teaching standards align with their expectations. We found that, while the specific topics they mentioned mostly overlapped with those of existing standards, professionals placed a greater emphasis on threat modeling and security mindset.
Authors:Victoria L. Lemieux, Rosa Gil, Faith Molosiwa, Qihong Zhou, Binming Li, Roberto Garcia, Luis De La Torre Cubillo, Zehua Wang
Abstract:
As archives turn to artificial intelligence to manage growing volumes of digital records, privacy risks inherent in current AI data practices raise critical concerns about data sovereignty and ethical accountability. This paper explores how privacy-enhancing technologies (PETs) and Web3 architectures can support archives to preserve control over sensitive content while still being able to make it available for access by researchers. We present Clio-X, a decentralized, privacy-first Web3 digital solution designed to embed PETs into archival workflows and support AI-enabled reference and access. Drawing on a user evaluation of a medium-fidelity prototype, the study reveals both interest in the potential of the solution and significant barriers to adoption related to trust, system opacity, economic concerns, and governance. Using Rogers' Diffusion of Innovation theory, we analyze the sociotechnical dimensions of these barriers and propose a path forward centered on participatory design and decentralized governance through a Clio-X Decentralized Autonomous Organization. By integrating technical safeguards with community-based oversight, Clio-X offers a novel model to ethically deploy AI in cultural heritage contexts.
Authors:Chun-I Fan, Li-En Chang, Cheng-Han Shie
Abstract:
In recent years, the increasing awareness of cybersecurity has led to a heightened focus on information security within hardware devices and products. Incorporating Trusted Execution Environments (TEEs) into product designs has become a standard practice for safeguarding sensitive user information. However, vulnerabilities within these components present significant risks, if exploited by attackers, these vulnerabilities could lead to the leakage of sensitive data, thereby compromising user privacy and security. This research centers on trusted applications (TAs) within the Qualcomm TEE and introduces a novel emulator specifically designed for these applications. Through reverse engineering techniques, we thoroughly analyze Qualcomm TAs and develop a partial emulation environment that accurately emulates their behavior. Additionally, we integrate fuzzing testing techniques into the emulator to systematically uncover potential vulnerabilities within Qualcomm TAs, demonstrating its practical effectiveness in identifying real-world security flaws. This research makes a significant contribution by being the first to provide both the implementation methods and source codes for a Qualcomm TAs emulator, offering a valuable reference for future research efforts. Unlike previous approaches that relied on complex and resource-intensive full-system simulations, our approach is lightweight and effective, making security testing of TA more convenient.
Authors:Qingxiao Guo, Xinjie Zhu, Yilong Ma, Hui Jin, Yunhao Wang, Weifeng Zhang, Xiaobing Guo
Abstract:
Watermarking technology has gained significant attention due to the increasing importance of intellectual property (IP) rights, particularly with the growing deployment of large language models (LLMs) on billions resource-constrained edge devices. To counter the potential threats of IP theft by malicious users, this paper introduces a robust watermarking scheme without retraining or fine-tuning for transformer models. The scheme generates a unique key for each user and derives a stable watermark value by solving linear constraints constructed from model invariants. Moreover, this technology utilizes noise mechanism to hide watermark locations in multi-user scenarios against collusion attack. This paper evaluates the approach on three popular models (Llama3, Phi3, Gemma), and the experimental results confirm the strong robustness across a range of attack methods (fine-tuning, pruning, quantization, permutation, scaling, reversible matrix and collusion attacks).
Authors:Aufa Nasywa Rahman, Bimo Sunarfri Hantono, Guntur Dharma Putra
Abstract:
Open banking framework enables third party providers to access financial data across banking institutions, leading to unprecedented innovations in the financial sector. However, some open banking standards remain susceptible to severe technological risks, including unverified data sources, inconsistent data integrity, and lack of immutability. In this paper, we propose a layered architecture that provides assurance in data trustworthiness with three distinct levels of trust, covering source validation, data-level authentication, and tamper-proof storage. The first layer guarantees the source legitimacy using decentralized identity and verifiable presentation, while the second layer verifies data authenticity and consistency using cryptographic signing. Lastly, the third layer guarantees data immutability through the Tangle, a directed acyclic graph distributed ledger. We implemented a proof-of-concept implementation of our solution to evaluate its performance, where the results demonstrate that the system scales linearly with a stable throughput, exhibits a 100% validation rate, and utilizes under 35% of CPU and 350 MiB memory. Compared to a real-world open banking implementation, our solution offers significantly reduced latency and stronger data integrity assurance. Overall, our solution offers a practical and efficient system for secure data sharing in financial ecosystems while maintaining regulatory compliance.
Authors:Bill Marino, Ari Juels
Abstract:
There is growing interest in giving AI agents access to cryptocurrencies as well as to the smart contracts that transact them. But doing so, this position paper argues, could lead to formidable new vectors of AI harm. To support this argument, we first examine the unique properties of cryptocurrencies and smart contracts that could lead to these new vectors of harm. Next, we describe each of these new vectors of harm in detail. Finally, we conclude with a call for more technical research aimed at preventing and mitigating these harms and, thereby making it safer to endow AI agents with cryptocurrencies and smart contracts.
Authors:Simon Johnson, Raghunandan Makaram, Amy Santoni, Vinnie Scarlata
Abstract:
Intel(r) Software Guard Extensions (SGX) was originally released on client platforms and later extended to single socket server platforms. As developers have become familiar with the capabilities of the technology, the applicability of this capability in the cloud has been tested. Various Cloud Service Providers (CSPs) are demonstrating the value of using SGX based Trusted Execution Environments (TEE) to create a new paradigm of Confidential Cloud Computing. This paper describes the additional platform enhancements we believe are necessary to deliver a user programmable Trusted Execution Environment that scales to cloud usages, performs and is secure on multi-package platforms.
Authors:Jenifer Paulraj, Brindha Raghuraman, Nagarani Gopalakrishnan, Yazan Otoum
Abstract:
Critical infrastructure systems, including energy grids, healthcare facilities, transportation networks, and water distribution systems, are pivotal to societal stability and economic resilience. However, the increasing interconnectivity of these systems exposes them to various cyber threats, including ransomware, Denial-of-Service (DoS) attacks, and Advanced Persistent Threats (APTs). This paper examines cybersecurity vulnerabilities in critical infrastructure, highlighting the threat landscape, attack vectors, and the role of Artificial Intelligence (AI) in mitigating these risks. We propose a hybrid AI-driven cybersecurity framework to enhance real-time vulnerability detection, threat modelling, and automated remediation. This study also addresses the complexities of adversarial AI, regulatory compliance, and integration. Our findings provide actionable insights to strengthen the security and resilience of critical infrastructure systems against emerging cyber threats.
Authors:Jikesh Thapa, Gurrehmat Chahal, Serban Voinea Gabreanu, Yazan Otoum
Abstract:
Phishing attacks are becoming increasingly sophisticated, underscoring the need for detection systems that strike a balance between high accuracy and computational efficiency. This paper presents a comparative evaluation of traditional Machine Learning (ML), Deep Learning (DL), and quantized small-parameter Large Language Models (LLMs) for phishing detection. Through experiments on a curated dataset, we show that while LLMs currently underperform compared to ML and DL methods in terms of raw accuracy, they exhibit strong potential for identifying subtle, context-based phishing cues. We also investigate the impact of zero-shot and few-shot prompting strategies, revealing that LLM-rephrased emails can significantly degrade the performance of both ML and LLM-based detectors. Our benchmarking highlights that models like DeepSeek R1 Distill Qwen 14B (Q8_0) achieve competitive accuracy, above 80%, using only 17GB of VRAM, supporting their viability for cost-efficient deployment. We further assess the models' adversarial robustness and cost-performance tradeoffs, and demonstrate how lightweight LLMs can provide concise, interpretable explanations to support real-time decision-making. These findings position optimized LLMs as promising components in phishing defence systems and offer a path forward for integrating explainable, efficient AI into modern cybersecurity frameworks.
Authors:Jordi Serra-Ruiz, David MegÃas
Abstract:
A semi-fragile watermarking scheme for multiple band images is presented in this article. We propose to embed a mark into remote sensing images applying a tree-structured vector quantization approach to the pixel signatures instead of processing each band separately. The signature of the multispectral or hyperspectral image is used to embed the mark in it order to detect any significant modification of the original image. The image is segmented into three-dimensional blocks, and a tree-structured vector quantizer is built for each block. These trees are manipulated using an iterative algorithm until the resulting block satisfies a required criterion, which establishes the embedded mark. The method is shown to be able to preserve the mark under lossy compression (above a given threshold) but, at the same time, it detects possibly forged blocks and their position in the whole image.
Authors:Bing-Jyue Chen, Lilia Tang, Daniel Kang
Abstract:
As AI models become ubiquitous in our daily lives, there has been an increasing demand for transparency in ML services. However, the model owner does not want to reveal the weights, as they are considered trade secrets. To solve this problem, researchers have turned to zero-knowledge proofs of ML model inference. These proofs convince the user that the ML model output is correct, without revealing the weights of the model to the user. Past work on these provers can be placed into two categories. The first method compiles the ML model into a low-level circuit, and proves the circuit using a ZK-SNARK. The second method uses custom cryptographic protocols designed only for a specific class of models. Unfortunately, the first method is highly inefficient, making it impractical for the large models used today, and the second method does not generalize well, making it difficult to update in the rapidly changing field of machine learning. To solve this, we propose ZKTorch, an open source end-to-end proving system that compiles ML models into base cryptographic operations called basic blocks, each proved using specialized protocols. ZKTorch is built on top of a novel parallel extension to the Mira accumulation scheme, enabling succinct proofs with minimal accumulation overhead. These contributions allow ZKTorch to achieve at least a $3\times$ reduction in the proof size compared to specialized protocols and up to a $6\times$ speedup in proving time over a general-purpose ZKML framework.
Authors:Rama Krishna Koppanati, Monika Santra, Sateesh Kumar Peddoju
Abstract:
Malware developers exploit the fact that most detection models focus on the entire binary to extract the feature rather than on the regions of potential maliciousness. Therefore, they reverse engineer a benign binary and inject malicious code into it. This obfuscation technique circumvents the malware detection models and deceives the ML classifiers due to the prevalence of benign features compared to malicious features. However, extracting the features from the potential malicious regions enhances the accuracy and decreases false positives. Hence, we propose a novel model named PotentRegion4MalDetect that extracts features from the potential malicious regions. PotentRegion4MalDetect determines the nodes with potential maliciousness in the partially preprocessed Control Flow Graph (CFG) using the malicious strings given by StringSifter. Then, it extracts advanced features of the identified potential malicious regions alongside the features from the completely preprocessed CFG. The features extracted from the completely preprocessed CFG mitigate obfuscation techniques that attempt to disguise malicious content, such as suspicious strings. The experiments reveal that the PotentRegion4MalDetect requires fewer entries to save the features for all binaries than the model focusing on the entire binary, reducing memory overhead, faster computation, and lower storage requirements. These advanced features give an 8.13% increase in SHapley Additive exPlanations (SHAP) Absolute Mean and a 1.44% increase in SHAP Beeswarm value compared to those extracted from the entire binary. The advanced features outperform the features extracted from the entire binary by producing more than 99% accuracy, precision, recall, AUC, F1-score, and 0.064% FPR.
Authors:Gilda Rech Bansimba, Regis F. Babindamana, Beni Blaug N. Ibara
Abstract:
The security of the RSA cryptosystem is based on the intractability of computing Euler's totient function phi(n) for large integers n. Although deriving phi(n) deterministically remains computationally infeasible for cryptographically relevant bit lengths, and machine learning presents a promising alternative for constructing efficient approximations. In this work, we explore a machine learning approach to approximate Euler's totient function phi using linear regression models. We consider a dataset of RSA moduli of 64, 128, 256, 512 and 1024 bits along with their corresponding totient values. The regression model is trained to capture the relationship between the modulus and its totient, and tested on unseen samples to evaluate its prediction accuracy. Preliminary results suggest that phi can be approximated within a small relative error margin, which may be sufficient to aid in certain classes of RSA attacks. This research opens a direction for integrating statistical learning techniques into cryptanalysis, providing insights into the feasibility of attacking cryptosystems using approximation based strategies.
Authors:Jintao Guo, Ying Zhou, Chao Li, Guixun Luo
Abstract:
When analyzing connection patterns within graphs, subgraph counting serves as an effective and fundamental approach. Edge-local differential privacy (edge-LDP) and shuffle model have been employed to achieve subgraph counting under a privacy-preserving situation. Existing algorithms are plagued by high time complexity, excessive download costs, low accuracy, or dependence on trusted third parties. To address the aforementioned challenges, we propose the Noisy Adjacency Matrix (NAM), which combines differential privacy with the adjacency matrix of the graph. NAM offers strong versatility and scalability, making it applicable to a wider range of DP variants, DP mechanisms, and graph types. Based on NAM, we designed five algorithms (TriOR, TriTR, TriMTR, QuaTR, and 2STAR) to count three types of subgraphs: triangles, quadrangles, and 2-stars. Theoretical and experimental results demonstrate that in triangle counting, TriOR maximizes accuracy with reduced time complexity among one-round algorithms, TriTR achieves optimal accuracy, TriMTR achieves the highest accuracy under low download costs, and QuaTR stands as the first quadrangle counting algorithm under pure edge-LDP. We implement edge-LDP for noisy data via a confidence interval-inspired method, providing DP guarantees on randomized data. Our 2STAR algorithm achieves the highest accuracy in 2-star counting and can be derived as a byproduct of two-round triangle or quadrangle counting algorithms, enabling efficient joint estimation of triangle, quadrangle, and 2-star counts within two query rounds.
Authors:Jovonni L. Pharr, Jahanzeb M. Hussain
Abstract:
Rugsafe introduces a comprehensive protocol aimed at mitigating the risks of rug pulls in the cryptocurrency ecosystem. By utilizing cryptographic security measures and economic incentives, the protocol provides a secure multichain system for recovering assets and transforming rugged tokens into opportunities and rewards. Foundational to Rugsafe are specialized vaults where rugged tokens can be securely deposited, and anticoin tokens are issued as receipts. These anticoins are designed to be inversely pegged to the price movement of the underlying rugged token. Users can utilize these anticoins within the ecosystem or choose to burn them, further securing the protocol and earning additional rewards. The supply of the native Rugsafe token is dynamically adjusted based on the volume, value, and activity of rugged tokens, ensuring stability and resilience. By depositing rugged tokens into a vault on several chains, and by burning anticoins, users receive incentives on the RugSafe chain. This protocol's vaults are designed to work in heterogenous blockchain ecosystems, offering a practical and effective solution to one of the most significant challenges in the cryptocurrency market.
Authors:Seyed Ali Ghazi Asgar, Narasimha Reddy, Satish T. S. Bukkapatnam
Abstract:
While additive manufacturing has opened interesting avenues to reimagine manufacturing as a service (MaaS) platform, transmission of design files from client to manufacturer over networks opens up many cybersecurity challenges. Securing client's intellectual property (IP) especially from cyber-attacks emerges as a major challenge. Earlier works introduced streaming, instead of sharing process plan (G-code) files, as a possible solution. However, executing client's G-codes on manufacturer's machines exposes them to potential malicious G-codes. This paper proposes a viable approach when the client and manufacturer do not trust each other and both the client and manufacturer want to preserve their IP of designs and manufacturing process respectively. The proposed approach is based on segmenting and streaming design (STL) files and employing a novel machine-specific STL to G-code translator at the manufacturer's site in real-time for printing. This approach secures design and manufacturing process IPs as demonstrated in a real-world implementation.
Authors:Siddhant Deshpande, Yalemzerf Getnet, Waltenegus Dargie
Abstract:
With the proliferation of wireless electrocardiogram (ECG) systems for health monitoring and authentication, protecting signal integrity against tampering is becoming increasingly important. This paper analyzes the performance of CNN, ResNet, and hybrid Transformer-CNN models for tamper detection. It also evaluates the performance of a Siamese network for ECG based identity verification. Six tampering strategies, including structured segment substitutions and random insertions, are emulated to mimic real world attacks. The one-dimensional ECG signals are transformed into a two dimensional representation in the time frequency domain using the continuous wavelet transform (CWT). The models are trained and evaluated using ECG data from 54 subjects recorded in four sessions 2019 to 2025 outside of clinical settings while the subjects performed seven different daily activities. Experimental results show that in highly fragmented manipulation scenarios, CNN, FeatCNN-TranCNN, FeatCNN-Tran and ResNet models achieved an accuracy exceeding 99.5 percent . Similarly, for subtle manipulations (for example, 50 percent from A and 50 percent from B and, 75 percent from A and 25 percent from B substitutions) our FeatCNN-TranCNN model demonstrated consistently reliable performance, achieving an average accuracy of 98 percent . For identity verification, the pure Transformer-Siamese network achieved an average accuracy of 98.30 percent . In contrast, the hybrid CNN-Transformer Siamese model delivered perfect verification performance with 100 percent accuracy.
Authors:Antoine Geimer, Clementine Maurice
Abstract:
Developers rely on constant-time programming to prevent timing side-channel attacks. But these efforts can be undone by compilers, whose optimizations may silently reintroduce leaks. While recent works have measured the extent of such leakage, they leave developers without actionable insights: which optimization passes are responsible, and how to disable them without modifying the compiler remains unclear.
In this paper, we conduct a qualitative analysis of how compiler optimizations break constant-time code. We construct a dataset of compiler-introduced constant-time violations and analyze the internals of two widely used compilers, GCC and LLVM, to identify the specific optimization passes responsible. Our key insight is that a small set of passes are at the root of most leaks. To the best of our knowledge, we are also the first to characterize how the interactions between these passes contribute to leakage. Based on this analysis, we propose an original and practical mitigation that requires no source code modification or custom compiler: disabling selected optimization passes via compiler flags. We show that this approach significantly reduces leakage with minimal performance overhead, offering an immediately deployable defense for developers.
Authors:Avi Shaked, Nan Messe
Abstract:
For securing systems, it is essential to manage their vulnerability posture and design appropriate security controls. Vulnerability management allows to proactively address vulnerabilities by incorporating pertinent security controls into systems designs. Current vulnerability management approaches do not support systematic reasoning about the vulnerability postures of systems designs. To effectively manage vulnerabilities and design security controls, we propose a formally grounded automated reasoning mechanism. We integrate the mechanism into an open-source security design tool and demonstrate its application through an illustrative example driven by real-world challenges. The automated reasoning mechanism allows system designers to identify vulnerabilities that are applicable to a specific system design, explicitly specify vulnerability mitigation options, declare selected controls, and thus systematically manage vulnerability postures.
Authors:Steven Duplij, Qiang Guo
Abstract:
A novel original procedure of encryption/decryption based on the polyadic algebraic structures and on signal processing methods is proposed. First, we use signals with integer amplitudes to send information. Then we use polyadic techniques to transfer the plaintext into series of special integers. The receiver restores the plaintext using special rules and systems of equations.
Authors:Subhabrata Majumdar, Brian Pendleton, Abhishek Gupta
Abstract:
Red teaming has evolved from its origins in military applications to become a widely adopted methodology in cybersecurity and AI. In this paper, we take a critical look at the practice of AI red teaming. We argue that despite its current popularity in AI governance, there exists a significant gap between red teaming's original intent as a critical thinking exercise and its narrow focus on discovering model-level flaws in the context of generative AI. Current AI red teaming efforts focus predominantly on individual model vulnerabilities while overlooking the broader sociotechnical systems and emergent behaviors that arise from complex interactions between models, users, and environments. To address this deficiency, we propose a comprehensive framework operationalizing red teaming in AI systems at two levels: macro-level system red teaming spanning the entire AI development lifecycle, and micro-level model red teaming. Drawing on cybersecurity experience and systems theory, we further propose a set of recommendations. In these, we emphasize that effective AI red teaming requires multifunctional teams that examine emergent risks, systemic vulnerabilities, and the interplay between technical and social factors.
Authors:Max Gao, Michael Collins, Ricky Mok, kc Claffy
Abstract:
Threat hunting is an operational security process where an expert analyzes traffic, applying knowledge and lightweight tools on unlabeled data in order to identify and classify previously unknown phenomena. In this paper, we examine threat hunting metrics and practice by studying the detection of Crackonosh, a cryptojacking malware package, has on various metrics for identifying its behavior. Using a metric for discoverability, we model the ability of defenders to measure Crackonosh traffic as the malware population decreases, evaluate the strength of various detection methods, and demonstrate how different darkspace sizes affect both the ability to track the malware, but enable emergent behaviors by exploiting attacker mistakes.
Authors:Hyunwook Choi, Sangyun Won, Daeyeon Hwang, Junhyeok Choi
Abstract:
Deep learning-based watermarking has emerged as a promising solution for robust image authentication and protection. However, existing models are limited by low embedding capacity and vulnerability to bit-level errors, making them unsuitable for cryptographic applications such as digital signatures, which require over 2048 bits of error-free data. In this paper, we propose README (Robust Error-Aware Digital Signature via Deep WaterMarking ModEl), a novel framework that enables robust, verifiable, and error-tolerant digital signatures within images. Our method combines a simple yet effective cropping-based capacity scaling mechanism with ERPA (ERror PAinting Module), a lightweight error correction module designed to localize and correct bit errors using Distinct Circular Subsum Sequences (DCSS). Without requiring any fine-tuning of existing pretrained watermarking models, README significantly boosts the zero-bit-error image rate (Z.B.I.R) from 1.2% to 86.3% when embedding 2048-bit digital signatures into a single image, even under real-world distortions. Moreover, our use of perceptual hash-based signature verification ensures public verifiability and robustness against tampering. The proposed framework unlocks a new class of high-assurance applications for deep watermarking, bridging the gap between signal-level watermarking and cryptographic security.
Authors:Tanvir Rahman, A. B. M. Harun-ur Rashid
Abstract:
In an increasingly interconnected world, protecting electronic devices has grown more crucial because of the dangers of data extraction, reverse engineering, and hardware tampering. Producing chips in a third-party manufacturing company can let hackers change the design. As the Internet of Things (IoT) proliferates, physical attacks happen more, and conventional cryptography techniques do not function well. In this paper, we investigate the design and assessment of PUFs using the Stanford Memristor Model, utilizing its random filament evolution to improve security. The system was built using 45nm CMOS technology. A comparison is made between CMOS-based and memristor-based Arbiter PUFs, evaluating their performance under temperature, voltage, and process variations. Intra- and inter-hamming distances are employed by Monte Carlo simulations to estimate uniqueness and reliability. The results show that memristor-based PUFs offer better reliability than CMOS-based designs, though uniqueness needs further improvement. Furthermore, this study sheds light on the reasonableness of memristor-based PUFs for secure applications in hardware security.
Authors:M. Tahir Akdeniz, Zeynep YeÅilkaya, İ. Enes Köse, İ. UlaÅ Ãnal, Sevil Åen
Abstract:
The persistent threat of Android malware presents a serious challenge to the security of millions of users globally. While many machine learning-based methods have been developed to detect these threats, their reliance on large labeled datasets limits their effectiveness against emerging, previously unseen malware families, for which labeled data is scarce or nonexistent.
To address this challenge, we introduce a novel zero-shot learning framework that combines Variational Graph Auto-Encoders (VGAE) with Siamese Neural Networks (SNN) to identify malware without needing prior examples of specific malware families. Our approach leverages graph-based representations of Android applications, enabling the model to detect subtle structural differences between benign and malicious software, even in the absence of labeled data for new threats.
Experimental results show that our method outperforms the state-of-the-art MaMaDroid, especially in zero-day malware detection. Our model achieves 96.24% accuracy and 95.20% recall for unknown malware families, highlighting its robustness against evolving Android threats.
Authors:Abdellah Akilal, M-Tahar Kechadi
Abstract:
Cloud Forensics presents a multi-jurisdictional challenge that may undermines the success of digital forensic investigations (DFIs). The growing volumes of domiciled and foreign law enforcement (LE) requests, the latency and complexity of formal channels for crossborder data access are challenging issues. In this paper, we first discuss major Cloud Service Providers (CSPs) transparency reports and law enforcement guidelines, then propose an abstract architecture for a Cloud Law Enforcement Requests Management System (CLERMS). A proof of concept of the proposed solution is developed, deployed and validated by two realistic scenarios, in addition to an economic estimation of its associated costs. Based on available open source components, our solution is for the benefit of both CSPs and Cloud Service Consumers (CSCs), and aims to enhance the due Cloud Digital Forensic Readiness (CDFR).
Authors:Cédric Bonhomme, Alexandre Dulaunoy
Abstract:
This paper presents VLAI, a transformer-based model that predicts software vulnerability severity levels directly from text descriptions. Built on RoBERTa, VLAI is fine-tuned on over 600,000 real-world vulnerabilities and achieves over 82% accuracy in predicting severity categories, enabling faster and more consistent triage ahead of manual CVSS scoring. The model and dataset are open-source and integrated into the Vulnerability-Lookup service.
Authors:Andong Chen, Zhaoxuan Jin, Ziyi Guo, Yan Chen
Abstract:
Kubernetes Operators, automated tools designed to manage application lifecycles within Kubernetes clusters, extend the functionalities of Kubernetes, and reduce the operational burden on human engineers. While Operators significantly simplify DevOps workflows, they introduce new security risks. In particular, Kubernetes enforces namespace isolation to separate workloads and limit user access, ensuring that users can only interact with resources within their authorized namespaces. However, Kubernetes Operators often demand elevated privileges and may interact with resources across multiple namespaces. This introduces a new class of vulnerabilities, the Cross-Namespace Reference Vulnerability. The root cause lies in the mismatch between the declared scope of resources and the implemented scope of the Operator logic, resulting in Kubernetes being unable to properly isolate the namespace. Leveraging such vulnerability, an adversary with limited access to a single authorized namespace may exploit the Operator to perform operations affecting other unauthorized namespaces, causing Privilege Escalation and further impacts. To the best of our knowledge, this paper is the first to systematically investigate the security vulnerability of Kubernetes Operators. We present Cross-Namespace Reference Vulnerability with two strategies, demonstrating how an attacker can bypass namespace isolation. Through large-scale measurements, we found that over 14% of Operators in the wild are potentially vulnerable. Our findings have been reported to the relevant developers, resulting in 7 confirmations and 6 CVEs by the time of submission, affecting vendors including ****** and ******, highlighting the critical need for enhanced security practices in Kubernetes Operators. To mitigate it, we also open-source the static analysis suite to benefit the ecosystem.
Authors:Ricardo Queiroz de Araujo Fernandes, Anderson Santos, Daniel Maier de Carvalho, André Luiz Bandeira Molina
Abstract:
This article presents an in-depth exploration of the analogy between the Holographic Principle in theoretical physics and cyber attack surfaces in digital security. Building on concepts such as black hole entropy and AdS/CFT duality, it highlights how complex infrastructures project their vulnerabilities onto their external interfaces. The paper draws a parallel between a black hole's event horizon, which encodes all internal information, and the attack surface, which reflects the internal architecture's security posture. Additionally, the article outlines how this conceptual framework can guide cybersecurity practices, emphasizing strategies such as attack surface reduction, continuous scanning with tools like OWASP ZAP and Greenbone OpenVAS, and the implementation of Zero Trust Architecture. This analogy not only provides a unique perspective on digital security but also underscores the critical importance of boundary-level defenses in protecting vast internal infrastructures.
Authors:Do-hyeon Yoon, Minsoo Chun, Thomas Allen, Hans Müller, Min Wang, Rajesh Sharma
Abstract:
Large language models (LLMs) face significant copyright and intellectual property challenges as the cost of training increases and model reuse becomes prevalent. While watermarking techniques have been proposed to protect model ownership, they may not be robust to continue training and development, posing serious threats to model attribution and copyright protection. This work introduces a simple yet effective approach for robust LLM fingerprinting based on intrinsic model characteristics. We discover that the standard deviation distributions of attention parameter matrices across different layers exhibit distinctive patterns that remain stable even after extensive continued training. These parameter distribution signatures serve as robust fingerprints that can reliably identify model lineage and detect potential copyright infringement. Our experimental validation across multiple model families demonstrates the effectiveness of our method for model authentication. Notably, our investigation uncovers evidence that a recently Pangu Pro MoE model released by Huawei is derived from Qwen-2.5 14B model through upcycling techniques rather than training from scratch, highlighting potential cases of model plagiarism, copyright violation, and information fabrication. These findings underscore the critical importance of developing robust fingerprinting methods for protecting intellectual property in large-scale model development and emphasize that deliberate continued training alone is insufficient to completely obscure model origins.
Authors:Vishnu Vinod, Krishna Pillutla, Abhradeep Guha Thakurta
Abstract:
As major progress in LLM-based long-form text generation enables paradigms such as retrieval-augmented generation (RAG) and inference-time scaling, safely incorporating private information into the generation remains a critical open question. We present InvisibleInk, a highly scalable long-form text generation framework satisfying rigorous differential privacy guarantees with respect to the sensitive references. It interprets sampling from the LLM's next-token-distribution as the exponential mechanism over the LLM logits with two innovations. First, we reduce the privacy cost by isolating and clipping only the sensitive information in the model logits (relative to the public logits). Second, we improve text quality by sampling from a small superset of the top-$k$ private tokens. Empirical evaluations demonstrate a consistent $8\times$ reduction in computation cost over state-of-the-art baselines to generate long-form private text of the same utility across privacy levels. In summary, InvisibleInk is able to generate private long-form text at less than $10\times$ the computation cost of non-private generation.
Authors:Daniel López-Montero, José L. Ãlvarez-Aldana, Alicia Morales-MartÃnez, Marta Gil-López, Juan M. Auñón GarcÃa
Abstract:
This paper aims to provide an innovative machine learning-based solution to automate security testing tasks for web applications, ensuring the correct functioning of all components while reducing project maintenance costs. Reinforcement Learning is proposed to select and prioritize tools and optimize the testing path. The presented approach utilizes a simulated webpage along with its network topology to train the agent. Additionally, the model leverages Geometric Deep Learning to create priors that reduce the search space and improve learning convergence. The validation and testing process was conducted on real-world vulnerable web pages commonly used by human hackers for learning. As a result of this study, a reinforcement learning algorithm was developed that maximizes the number of vulnerabilities found while minimizing the number of steps required
Authors:Frida Sundfeldt, Bianca Widstam, Mahshid Helali Moghadam, Kuo-Yun Liang, Anders Vesterberg
Abstract:
The digital evolution of connected vehicles and the subsequent security risks emphasize the critical need for implementing in-vehicle cyber security measures such as intrusion detection and response systems. The continuous advancement of attack scenarios further highlights the need for adaptive detection mechanisms that can detect evolving, unknown, and complex threats. The effective use of ML-driven techniques can help address this challenge. However, constraints on implementing diverse attack scenarios on test vehicles due to safety, cost, and ethical considerations result in a scarcity of data representing attack scenarios. This limitation necessitates alternative efficient and effective methods for generating high-quality attack-representing data. This paper presents a context-aware attack data generator that generates attack inputs and corresponding in-vehicle network log, i.e., controller area network (CAN) log, representing various types of attack including denial of service (DoS), fuzzy, spoofing, suspension, and replay attacks. It utilizes parameterized attack models augmented with CAN message decoding and attack intensity adjustments to configure the attack scenarios with high similarity to real-world scenarios and promote variability. We evaluate the practicality of the generated attack-representing data within an intrusion detection system (IDS) case study, in which we develop and perform an empirical evaluation of two deep neural network IDS models using the generated data. In addition to the efficiency and scalability of the approach, the performance results of IDS models, high detection and classification capabilities, validate the consistency and effectiveness of the generated data as well. In this experience study, we also elaborate on the aspects influencing the fidelity of the data to real-world scenarios and provide insights into its application.
Authors:Melissa Safari, Abhishek K. Mishra, Mathieu Cunche
Abstract:
Wi-Fi management frames reveal structured communication patterns that persist even under randomization of MAC addresses. Prior approaches to associating randomized MAC addresses with devices primarily focus on probe requests, overlooking the broader set of management frames and their transition dynamics. This narrow focus limits their robustness in dense, real-world environments with high device mobility, where probe activity alone fails to yield stable and distinctive signatures. In this paper, we present a novel framework for fingerprinting Wi-Fi devices based on behavioral dynamics extracted from passively observed management frames. We model each device's behavior as a finite state machine and introduce matrix-based representations that encode both structural (state transition frequencies) and temporal (inter-state delays) characteristics. These matrices are embedded into compact feature vectors, enabling efficient similarity comparison. Through extensive evaluation in diverse real-world settings, our method achieves over 86% identification accuracy for non-randomized devices using only Wi-Fi management frames, with further improvements observed through temporal burst aggregation. Our findings are sufficient to uniquely and consistently characterize devices at scale, outperforming the state-of-the-art.
Authors:Jorge J. Tejero-Fernández, Alfonso Sánchez-Macián
Abstract:
Log analysis is a relevant research field in cybersecurity as they can provide a source of information for the detection of threats to networks and systems. This paper presents a pipeline to use fine-tuned Large Language Models (LLMs) for anomaly detection and mitigation recommendation using IoT security logs. Utilizing classical machine learning classifiers as a baseline, three open-source LLMs are compared for binary and multiclass anomaly detection, with three strategies: zero-shot, few-shot prompting and fine-tuning using an IoT dataset. LLMs give better results on multi-class attack classification than the corresponding baseline models. By mapping detected threats to MITRE CAPEC, defining a set of IoT-specific mitigation actions, and fine-tuning the models with those actions, the models are able to provide a combined detection and recommendation guidance.
Authors:Michelle Yeo, Haoqian Zhang
Abstract:
Censorship resilience is a fundamental assumption underlying the security of blockchain protocols. Additionally, the analysis of blockchain security from an economic and game theoretic perspective has been growing in popularity in recent years. In this work, we present a surprising rational censorship attack on blockchain censorship resilience when we adopt the analysis of blockchain security from a game theoretic lens and assume all users are rational. In our attack, a colluding group with sufficient voting power censors the remainder nodes such that the group alone can gain all the rewards from maintaining the blockchain. We show that if nodes are rational, coordinating this attack just requires a public read and write blackboard and we formally model the attack using a game theoretic framework. Furthermore, we note that to ensure the success of the attack, nodes need to know the total true voting power held by the colluding group. We prove that the strategy to join the rational censorship attack and also for nodes to honestly declare their power is a subgame perfect equilibrium in the corresponding extensive form game induced by our attack. Finally, we discuss the implications of the attack on blockchain users and protocol designers as well as some potential countermeasures.
Authors:Bahram Rashidi, Behrooz Khadem
Abstract:
This paper introduces a compact and secure 16-bit substitution box (S-box) designed over the composite field $\F_{(((2^2)^2)^2)^2}$, optimized for both hardware efficiency and cryptographic robustness. The proposed S-box decomposes operations into subfields, leveraging a tower field architecture. This enables significant hardware reduction through optimized field inversion and a low-cost affine transformation. Security evaluations confirm resilience against linear, differential, algebraic and DPA attacks, validated via metrics including Nonlinearity (32512), Differential Uniformity (4), Algebraic Degree (15), Transparency order (15.9875) and SNR (0.34e-08). The hardware results, in 65 nm CMOS technology, show the proposed 16-bit S-box has lower hardware resources consumption and lower critical path delay (CPD) than those of other 16-bit S-boxes. By integrating high algebraic complexity with resource-efficient structures, this work addresses the growing demand for scalable cryptographic primitives in data-sensitive applications, demonstrating that larger S-boxes can enhance security without proportional hardware costs. The results underscore the viability of composite field-based architectures in balancing security and efficiency for modern block ciphers.
Authors:Pedro R. X. Carmo, Igor de Moura, Assis T. de Oliveira Filho, Djamel Sadok, Cleber Zanchettin
Abstract:
Modern vehicles are increasingly connected, and in this context, automotive Ethernet is one of the technologies that promise to provide the necessary infrastructure for intra-vehicle communication. However, these systems are subject to attacks that can compromise safety, including flow injection attacks. Deep Learning-based Intrusion Detection Systems (IDS) are often designed to combat this problem, but they require expensive hardware to run in real time. In this work, we propose to evaluate and apply fast neural network inference techniques like Distilling and Prunning for deploying IDS models on low-cost platforms in real time. The results show that these techniques can achieve intrusion detection times of up to 727 μs using a Raspberry Pi 4, with AUCROC values of 0.9890.
Authors:Bhagyalekshmy S, Rutuja Kshirsagar
Abstract:
Quasi-twisted (QT) codes generalize several important families of linear codes, including cyclic, constacyclic, and quasi-cyclic codes. Despite their potential, to the best of our knowledge, there exists no efficient decoding algorithm for QT codes. In this work, we propose a syndrome-based decoding method capable of efficiently correcting up to (d* - 1)/2 errors, where d* denotes an HT-like lower bound on the minimum distance of QT codes, which we formalize here. Additionally, we introduce a Niederreiter-like cryptosystem constructed from QT codes. This cryptosystem is resistant to some classical attacks as well as some quantum attacks based on Quantum Fourier Sampling.
Authors:Aashray Reddy, Andrew Zagula, Nicholas Saban
Abstract:
Large Language Models (LLMs) continue to exhibit vulnerabilities to jailbreaking attacks: carefully crafted malicious inputs intended to circumvent safety guardrails and elicit harmful responses. As such, we present AutoAdv, a novel framework that automates adversarial prompt generation to systematically evaluate and expose vulnerabilities in LLM safety mechanisms. Our approach leverages a parametric attacker LLM to produce semantically disguised malicious prompts through strategic rewriting techniques, specialized system prompts, and optimized hyperparameter configurations. The primary contribution of our work is a dynamic, multi-turn attack methodology that analyzes failed jailbreak attempts and iteratively generates refined follow-up prompts, leveraging techniques such as roleplaying, misdirection, and contextual manipulation. We quantitatively evaluate attack success rate (ASR) using the StrongREJECT (arXiv:2402.10260 [cs.CL]) framework across sequential interaction turns. Through extensive empirical evaluation of state-of-the-art models--including ChatGPT, Llama, and DeepSeek--we reveal significant vulnerabilities, with our automated attacks achieving jailbreak success rates of up to 86% for harmful content generation. Our findings reveal that current safety mechanisms remain susceptible to sophisticated multi-turn attacks, emphasizing the urgent need for more robust defense strategies.
Authors:Keiichiro Kimura, Hiroki Kuzuno, Yoshiaki Shiraishi, Masakatu Morii
Abstract:
Bluetooth is a pervasive wireless communication technology used by billions of devices for short-range connectivity. The security of Bluetooth relies on the pairing process, where devices establish shared long-term keys for secure communications. However, many commercial Bluetooth devices implement automatic pairing functions to improve user convenience, creating a previously unexplored attack surface.
We present Stealtooth, a novel attack that abuses unknown vulnerabilities in the automatic pairing functions in commercial Bluetooth devices to achieve completely silent device link key overwriting. The Stealtooth attack leverages the fact that Bluetooth audio devices automatically transition to pairing mode under specific conditions, enabling attackers to hijack pairing processes without user awareness or specialized tools. We also extend the attack into the MitM Stealtooth attack, combining automatic pairing abuse with power-saving mode techniques to enable man-in-the-middle attacks.
We evaluate the attacks against 10 commercial Bluetooth devices from major manufacturers, demonstrating widespread vulnerabilities across diverse device types and manufacturers. Our practical implementation requires only commodity hardware and open-source software, highlighting the low barrier to entry for attackers.
We propose defenses both device and protocol levels, including enhanced user notifications and standardized automatic pairing guidelines. Our findings reveal a critical tension between security and usability, showing that current automatic pairing implementations create systematic vulnerabilities. We responsibly disclosed our findings to affected vendors, with several already releasing patches.
Authors:Gabriel Grobler, Sheunesu Makura, Hein Venter
Abstract:
Tampering or forgery of digital documents has become widespread, most commonly through altering images without any malicious intent such as enhancing the overall appearance of the image. However, there are occasions when tampering of digital documents can have negative consequences, such as financial fraud and reputational damage. Tampering can occur through altering a digital document's text or editing an image's pixels. Many techniques have been developed to detect whether changes have been made to a document. Most of these techniques rely on generating hashes or watermarking the document. These techniques, however, have limitations in that they cannot detect alterations to portable document format (PDF) signatures or other non-visual aspects, such as metadata. This paper presents a new technique that can be used to detect tampering within a PDF document by utilizing the PDF document's file page objects. The technique employs a prototype that can detect changes to a PDF document, such as changes made to the text, images, or metadata of the said file.
Authors:Joni Herttuainen, Vesa Kuikka, Kimmo K. Kaski
Abstract:
We present a novel methodology for modelling, visualising, and analysing cyber threats, attack paths, as well as their impact on user services in enterprise or infrastructure networks of digital devices and services they provide. Using probabilistic methods to track the propagation of an attack through attack graphs, via the service or application layers, and on physical communication networks, our model enables us to analyse cyber attacks at different levels of detail. Understanding the propagation of an attack within a service among microservices and its spread between different services or application servers could help detect and mitigate it early. We demonstrate that this network-based influence spreading modelling approach enables the evaluation of diverse attack scenarios and the development of protection and mitigation measures, taking into account the criticality of services from the user's perspective. This methodology could also aid security specialists and system administrators in making well-informed decisions regarding risk mitigation strategies.
Authors:Nicola Cibin, Bas Mulder, Herman Carstens, Peter Palensky, Alexandru Åtefanov
Abstract:
The digital transformation of power systems is accelerating the adoption of IEC 61850 standard. However, its communication protocols, including Sampled Values (SV), lack built-in security features such as authentication and encryption, making them vulnerable to malicious packet injection. Such cyber attacks can delay fault clearance or trigger unintended circuit breaker operations. While most existing research focuses on detecting cyber attacks in digital substations, intrusion prevention systems have been disregarded because of the risk of potential communication network disruptions. This paper proposes a novel method using hybrid statistical-deep learning for the detection, prevention, and source localization of IEC 61850 SV injection attacks. The method uses exponentially modified Gaussian distributions to model communication network latency and long short-term memory and Elman recurrent neural network to detect anomalous variations in the estimated probability distributions. It effectively discards malicious SV frames with minimal processing overhead and latency, maintains robustness against communication network latency variation and time-synchronization issues, and guarantees a near-zero false positive rate in non-attack scenarios. Comprehensive validation is conducted on three testbeds involving industrial-grade devices, hardware-in-the-loop simulations, virtualized intelligent electronic devices and merging units, and high-fidelity emulated communication networks. Results demonstrate the method's suitability for practical deployment in IEC 61850-compliant digital substations.
Authors:Wenjin Mo, Zhiyuan Li, Minghong Fang, Mingwei Fang
Abstract:
Federated learning (FL) allows multiple clients to collaboratively train a global machine learning model with coordination from a central server, without needing to share their raw data. This approach is particularly appealing in the era of privacy regulations like the GDPR, leading many prominent companies to adopt it. However, FL's distributed nature makes it susceptible to poisoning attacks, where malicious clients, controlled by an attacker, send harmful data to compromise the model. Most existing poisoning attacks in FL aim to degrade the model's integrity, such as reducing its accuracy, with limited attention to privacy concerns from these attacks. In this study, we introduce FedPoisonMIA, a novel poisoning membership inference attack targeting FL. FedPoisonMIA involves malicious clients crafting local model updates to infer membership information. Additionally, we propose a robust defense mechanism to mitigate the impact of FedPoisonMIA attacks. Extensive experiments across various datasets demonstrate the attack's effectiveness, while our defense approach reduces its impact to a degree.
Authors:Numan Halit Guldemir, Oluwafemi Olukoya, Jesús MartÃnez-del-Rincón
Abstract:
Machine learning is increasingly vital in cybersecurity, especially in malware detection. However, concept drift, where the characteristics of malware change over time, poses a challenge for maintaining the efficacy of these detection systems. Concept drift can occur in two forms: the emergence of entirely new malware families and the evolution of existing ones. This paper proposes an innovative method to address the former, focusing on effectively identifying new malware families. Our approach leverages a supervised autoencoder combined with triplet loss to differentiate between known and new malware families. We create clear and robust clusters that enhance the accuracy and resilience of malware family classification by utilizing this metric learning technique and the Density-Based Spatial Clustering of Applications with Noise (DBSCAN) algorithm. The effectiveness of our method is validated using an Android malware dataset and a Windows portable executable (PE) malware dataset, showcasing its capability to sustain model performance within the dynamic landscape of emerging malware threats. Our results demonstrate a significant improvement in detecting new malware families, offering a reliable solution for ongoing cybersecurity challenges.
Authors:Mehmet Hüseyin Temel, Boris Å koriÄ
Abstract:
We introduce the first quantum authentication scheme for continuous-variable states. Our scheme is based on trap states, and is an adaptation of a discrete-variable scheme by Broadbent et al. (arXiv:1211.1080), but with more freedom in choosing the number of traps. We provide a security proof, mostly following the approach of Broadbent and Wainewright (arXiv:1607.03075). As a necessary ingredient for the proof we derive the continuous-variable analogue of the Pauli Twirl.
Authors:Felix Zhou
Abstract:
We study the sublinear space continual release model for edge-differentially private (DP) graph algorithms, with a focus on the densest subgraph problem (DSG) in the insertion-only setting. Our main result is the first continual release DSG algorithm that matches the additive error of the best static DP algorithms and the space complexity of the best non-private streaming algorithms, up to constants. The key idea is a refined use of subsampling that simultaneously achieves privacy amplification and sparsification, a connection not previously formalized in graph DP. Via a simple black-box reduction to the static setting, we obtain both pure and approximate-DP algorithms with $O(\log n)$ additive error and $O(n\log n)$ space, improving both accuracy and space complexity over the previous state of the art. Along the way, we introduce graph densification in the graph DP setting, adding edges to trigger earlier subsampling, which removes the extra logarithmic factors in error and space incurred by prior work [ELMZ25]. We believe this simple idea may be of independent interest.
Authors:Ziad Ghanem
Abstract:
Classical linear ciphers, such as the Hill cipher, operate on fixed, finite-dimensional modules and are therefore vulnerable to straightforward known-plaintext attacks that recover the key as a fully determined linear operator. We propose a symmetric-key cryptosystem whose linear action takes place instead in the Burnside ring $A(G)$ of a compact Lie group $G$, with emphasis on the case $G=O(2)$. The secret key consists of (i) a compact Lie group $G$; (ii) a secret total ordering of the subgroup orbit-basis of $A(G)$; and (iii) a finite set $S$ of indices of irreducible $G$-representations, whose associated basic degrees define an involutory multiplier $k\in A(G)$. Messages of arbitrary finite length are encoded as finitely supported elements of $A(G)$ and encrypted via the Burnside product with $k$. For $G=O(2)$ we prove that encryption preserves plaintext support among the generators $\{(D_1),\dots,(D_L),(SO(2)),(O(2))\}$, avoiding ciphertext expansion and security leakage. We then analyze security in passive models, showing that any finite set of observations constrains the action only on a finite-rank submodule $W_L\subset A(O(2))$, and we show information-theoretic non-identifiability of the key from such data. Finally, we prove the scheme is \emph{not} IND-CPA secure, by presenting a one-query chosen-plaintext distinguisher based on dihedral probes.
Authors:Hui Yuan
Abstract:
With the proliferation of decentralized applications (DApps), the conflict between the transparency of blockchain technology and user data privacy has become increasingly prominent. While Decentralized Identity (DID) and Verifiable Credentials (VCs) provide a standardized framework for user data sovereignty, achieving trusted identity verification and data sharing without compromising privacy remains a significant challenge. This paper proposes a novel, comprehensive framework that integrates DIDs and VCs with efficient Zero-Knowledge Proof (ZKP) schemes to address this core issue. The key contributions of this framework are threefold: first, it constructs a set of strong privacy-preserving protocols based on zk-STARKs, allowing users to prove that their credentials satisfy specific conditions (e.g., "age is over 18") without revealing any underlying sensitive data. Second, it designs a scalable, privacy-preserving credential revocation mechanism based on cryptographic accumulators, effectively solving credential management challenges in large-scale scenarios. Finally, it integrates a practical social key recovery scheme, significantly enhancing system usability and security. Through a prototype implementation and performance evaluation, this paper quantitatively analyzes the framework's performance in terms of proof generation time, verification overhead, and on-chain costs. Compared to existing state-of-the-art systems based on zk-SNARKs, our framework, at the cost of a larger proof size, significantly improves prover efficiency for complex computations and provides stronger security guarantees, including no trusted setup and post-quantum security. Finally, a case study in the decentralized finance (DeFi) credit scoring scenario demonstrates the framework's immense potential for unlocking capital efficiency and fostering a trusted data economy.
Authors:Morteza Sargolzaei Javan
Abstract:
Cloud computing has become the foundation of modern digital infrastructure, yet the absence of a unified architectural and compliance framework impedes interoperability, auditability, and robust security. This paper introduces a formal, machine-readable semantic model for Cloud Engines, integrating the architectural taxonomy of ISO/IEC 22123 (Cloud Reference Architecture) with the security and compliance controls of ISO/IEC 27001:2022 and ISO/IEC TR 3445:2022. The model decomposes cloud systems into four canonical interfaces--Control, Business, Audit, and Data--and extends them with a security ontology that maps mechanisms such as authentication, authorization, and encryption to specific compliance controls. Expressed in RDF/Turtle, the model enables semantic reasoning, automated compliance validation, and vendor-neutral architecture design. We demonstrate its practical utility through OpenStack and AWS case studies, and provide reproducible validation workflows using SPARQL and SHACL. This work advances the state of cloud security modeling by bridging architectural and compliance standards in a unified framework, with a particular emphasis on auditability.
Authors:Kalin Dimitrov
Abstract:
Pingmark defines a universal textual protocol for expressing spatial context through a minimal symbol: !@. Rather than embedding coordinates or using proprietary map links, Pingmark introduces a semantic trigger that compliant client applications interpret to generate a standardized resolver link of the form https://pingmark.me/lat/lon/[timestamp]. This allows location expression to function like existing textual conventions - @ for identity or # for topics - but for physical space. The protocol requires no user registration, relies on open mapping technologies, and protects privacy by generating location data ephemerally and locally. This paper presents the motivation, syntax, and design of the Pingmark Protocol Specification (PPS v0.1), its reference resolver implementation, and the long-term goal of establishing Pingmark as an open Internet standard for spatial mentions.
Authors:Yejun Jang
Abstract:
The proliferation of high-fidelity synthetic media, coupled with exploitable hardware vulnerabilities in conventional imaging pipelines, has precipitated a crisis of trust in digital content. Existing countermeasures, from post-hoc classifiers to software-based signing, fail to address the fundamental challenge of establishing an unbreakable link to reality at the moment of capture. This whitepaper introduces Signing Right Away (SRA), a comprehensive security architecture that guarantees the provenance of digital media from "silicon to silicon to signed file." SRA leverages a four-pillar security model-Confidentiality, Integrity, Authentication, and Replay Protection, akin to the MIPI Camera Security Framework (CSF), but also extends its scope beyond the internal data bus to the creation of a cryptographically sealed, C2PA-compliant final asset. By securing the entire imaging pipeline within a Trusted Execution Environment (TEE), SRA ensures that every captured image and video carries an immutable, verifiable proof of origin. This provides a foundational solution for industries reliant on trustworthy visual information, including journalism, legal evidence, and insurance. We present the SRA architecture, a detailed implementation roadmap informed by empirical prototyping, and a comparative analysis that positions SRA as the essential "last mile" in the chain of content trust.
Authors:Fahed Quttainah
Abstract:
This paper explores the distinctions and connections between cybersecurity and ethical hacking, two vital disciplines in the protection of digital systems. It defines each field, outlines their goals and methodologies, and compares the academic and professional paths available to aspiring students. Cybersecurity is presented as a defensive discipline focused on preventing attacks and safeguarding data, while ethical hacking adopts an offensive approach that identifies vulnerabilities through authorized testing. The paper highlights key skills, certifications, and career opportunities in both areas, offering practical guidance to help learners choose the path best suited to their interests and ambitions. Ultimately, it emphasizes the complementary nature of both fields in strengthening global cyber resilience.
Authors:Tonmoy Ghosh
Abstract:
Password security has been compelled to evolve in response to the growing computational capabilities of modern systems. However, this evolution has often resulted in increasingly complex security practices that alienate users, leading to poor compliance and heightened vulnerability. Consequently, individuals remain exposed to attackers through weak or improperly managed passwords, underscoring the urgent need for a comprehensive defense mechanism that effectively addresses password-related risks and threats. In this paper, we propose a multifaceted solution designed to revolutionize password security by integrating diverse attributes such as the Password Dissection Mechanism, Dynamic Password Policy Mechanism, human behavioral patterns, device characteristics, network parameters, geographical context, and other relevant factors. By leveraging learning-based models, our framework constructs detailed user profiles capable of recognizing individuals and preventing nearly all forms of unauthorized access or device possession. The proposed framework enhances the usability-security paradigm by offering stronger protection than existing standards while simultaneously engaging users in the policy-setting process through a novel, adaptive approach.
Authors:Giuseppe Canale
Abstract:
This paper presents the Cybersecurity Psychology Framework (CPF), a novel methodology for quantifying human-centric vulnerabilities in security operations through systematic integration of established psychological constructs with operational security telemetry. While individual human factors-alert fatigue, compliance fatigue, cognitive overload, and risk perception biases-have been extensively studied in isolation, no framework provides end-to-end operationalization across the full spectrum of psychological vulnerabilities. We address this gap by: (1) defining specific, measurable algorithms that quantify key psychological states using standard SOC tooling (SIEM, ticketing systems, communication platforms); (2) proposing a lightweight, privacy-preserving LLM architecture based on Retrieval-Augmented Generation (RAG) and domain-specific fine-tuning to analyze structured and unstructured data for latent psychological risks; (3) detailing a rigorous mixed-methods validation strategy acknowledging the inherent difficulty of obtaining sensitive cybersecurity data. Our implementation of CPF indicators has been demonstrated in a proof-of-concept deployment using small language models achieving 0.92 F1-score on synthetic data. This work provides the theoretical and methodological foundation necessary for industry partnerships to conduct empirical validation with real operational data.
Authors:Bernhard Mueller
Abstract:
Hound introduces a relation-first graph engine that improves system-level reasoning across interrelated components in complex codebases. The agent designs flexible, analyst-defined views with compact annotations (e.g., monetary/value flows, authentication/authorization roles, call graphs, protocol invariants) and uses them to anchor exact retrieval: for any question, it loads precisely the code that matters (often across components) so it can zoom out to system structure and zoom in to the decisive lines. A second contribution is a persistent belief system: long-lived vulnerability hypotheses whose confidence is updated as evidence accrues. The agent employs coverage-versus-intuition planning and a QA finalizer to confirm or reject hypotheses. On a five-project subset of ScaBench[1], Hound improves recall and F1 over a baseline LLM analyzer (micro recall 31.2% vs. 8.3%; F1 14.2% vs. 9.8%) with a modest precision trade-off. We attribute these gains to flexible, relation-first graphs that extend model understanding beyond call/dataflow to abstract aspects, plus the hypothesis-centric loop; code and artifacts are released to support reproduction.
Authors:Aminu Muhammad Auwal
Abstract:
The growing use of Internet of Things (IoT) technologies in Nigerian healthcare offers potential improvements in remote monitoring and data-driven care, but unsecured wireless communication in medical IoT (mIoT) devices exposes patient data to cyber threats. This study investigates such vulnerabilities through a real-time Man in the Middle (MITM) attack simulation and evaluates lightweight AES-128 encryption on low-cost devices. A prototype mIoT device was built with a NodeMCU ESP8266 and sensors for heart rate and temperature. In controlled lab conditions simulating local healthcare networks, unencrypted data transmissions were intercepted and altered using common tools (Bettercap, Wireshark). After AES-128 encryption was applied, all transmissions became unreadable and tamper attempts failed, demonstrating its effectiveness. Performance costs were modest, latency rose from 80 ms to 125 ms (56.25 percent increase) and CPU use from 30 percent to 45 percent, but system stability remained intact. Device cost stayed under 18,000 NGN (about 12 USD), making it feasible for Nigeria's resource constrained facilities. A survey of healthcare professionals showed moderate awareness of IoT-related risks but strong support for encryption and staff training. Barriers included limited budgets and technical complexity. The study concludes that lightweight AES-128 encryption provides practical, low-cost protection against common attack vectors while maintaining operational efficiency. Feedback from professionals highlights the urgency of improving security awareness and establishing guidelines for clinical deployment.
Authors:Michel Youssef
Abstract:
We present a risk-calibrated approach to streaming intrusion detection that couples Bayesian Online Changepoint Detection (BOCPD) with decision thresholds aligned to Site Reliability Engineering (SRE) error budgets. BOCPD provides run-length posteriors that adapt to distribution shift and concept drift; we map these posteriors to alert decisions by optimizing expected operational cost under false-positive and false-negative budgets. We detail the hazard model, conjugate updates, and an O(1)-per-event implementation. A concrete SRE example shows how a 99.9% availability SLO (43.2 minutes per month error budget) yields a probability threshold near 0.91 when missed incidents are 10x more costly than false alarms. We evaluate on the full UNSW-NB15 and CIC-IDS2017 benchmarks with chronological splits, comparing against strong unsupervised baselines (ECOD, COPOD, and LOF). Metrics include PR-AUC, ROC-AUC, Brier score, calibration reliability diagrams, and detection latency measured in events. Results indicate improved precision-recall at mid to high recall and better probability calibration relative to baselines. We release implementation details, hyperparameters, and ablations for hazard sensitivity and computational footprint. Code and reproducibility materials will be made available upon publication; datasets and implementation are available from the corresponding author upon reasonable request.
Authors:Dmitrii A. Gerasimov
Abstract:
ChipmunkRing, a practical post-quantum ring signature construction tailored for blockchain environments. Building on our Chipmunk lattice-based cryptographic framework, this implementation delivers compact digital signatures ranging from 20.5 to 279.7KB, with rapid signing operations completing in 1.1-15.1ms and efficient validation processes requiring only 0.4-4.5ms for participant groups of 2-64 members. The cornerstone of our approach is Acorn Verification-a streamlined zero-knowledge protocol that supersedes the classical Fiat-Shamir methodology. This innovation enables linear O(n) authentication complexity using concise 96-byte cryptographic proofs per participant, yielding a remarkable 17.7x performance enhancement for 32-member rings when compared to conventional techniques. Our work includes rigorous mathematical security demonstrations confirming 112-bit post-quantum protection (NIST Level 1), extensive computational benchmarking, and comprehensive support for both standard anonymity sets and collaborative threshold constructions with flexible participation requirements.
Authors:Isaac Henry Teuscher
Abstract:
The U.S. Federal Risk and Authorization Management Program (FedRAMP) has long relied on extensive sets of controls and static documentation to assess cloud systems. However, this manual, point-in-time approach has struggled to keep pace with cloud-native development. FedRAMP 20x, a 2025 pilot program, reimagines the NIST Risk Management Framework (RMF): replacing traditional NIST 800-53 controls with Key Security Indicators (KSIs), using automated, machine-readable evidence, and emphasizing continuous reporting and authorization. This case study presents a practitioner-led field report from an industry participant who led multiple FedRAMP 20x pilot submissions and engaged directly with the FedRAMP PMO, 3PAOs, and community working groups. It explores how KSIs, continuous evidence pipelines, and DevSecOps integration can streamline authorization and improve cyber risk management. The study shows FedRAMP 20x as a live testbed for implementing the RMF in a cloud-native, automation-first approach and shares actionable recommendations for risk professionals seeking to modernize compliance and support real-time, risk-informed decision-making.
Authors:Yurang R. Kuang
Abstract:
We present the discovery of a fundamental composition law governing conjugate observables in the Random Permutation Sorting System (RPSS). The law links the discrete permutation count Np and the continuous elapsed time T through a functional relation connecting the characteristic function of timing distributions to the probability generating function of permutation counts. This framework enables entropy purification, transforming microarchitectural timing fluctuations into uniform randomness via geometric convergence. We establish convergence theorems with explicit bounds and validate the results experimentally, achieving Shannon entropy above 7.9998 bits per byte and chi-square uniformity across diverse platforms. The composition law provides a universal foundation for generating provably uniform randomness from general-purpose computation, securing cryptographic purity from emergent computational dynamics.
Authors:Maxime Reynouard
Abstract:
This study tackles the computational challenges of solving Markov Decision Processes (MDPs) for a restricted class of problems. It is motivated by the Last Revealer Attack (LRA), which undermines fairness in some Proof-of-Stake (PoS) blockchains such as Ethereum (\$400B market capitalization). We introduce pseudo-MDPs (pMDPs) a framework that naturally models such problems and propose two distinct problem reductions to standard MDPs. One problem reduction provides a novel, counter-intuitive perspective, and combining the two problem reductions enables significant improvements in dynamic programming algorithms such as value iteration. In the case of the LRA which size is parameterized by $κ$ (in Ethereum's case $κ$= 325), we reduce the computational complexity from $O(2^κκ^{2^{κ+2}})$ to $O(κ^4)$ (per iteration). This solution also provide the usual benefits from Dynamic Programming solutions: exponentially fast convergence toward the optimal solution is guaranteed. The dual perspective also simplifies policy extraction, making the approach well-suited for resource-constrained agents who can operate with very limited memory and computation once the problem has been solved. Furthermore, we generalize those results to a broader class of MDPs, enhancing their applicability. The framework is validated through two case studies: a fictional card game and the LRA on the Ethereum random seed consensus protocol. These applications demonstrate the framework's ability to solve large-scale problems effectively while offering actionable insights into optimal strategies. This work advances the study of MDPs and contributes to understanding security vulnerabilities in blockchain systems.
Authors:Giulio Malavolta
Abstract:
Foundational results in theoretical computer science have established that everything provable, is provable in zero knowledge. However, this assertion fundamentally assumes a classical interpretation of computation and many interesting physical statements that one can hope to prove are not characterized. In this work, we consider decision problems, where the problem instance itself is specified by a (pure) quantum state. We discuss several motivating examples for this notion and, as our main technical result, we show that every quantum problem that is provable with an interactive protocol, is also provable in zero-knowledge. Our protocol achieves unconditional soundness and computational zero-knowledge, under standard assumptions in cryptography. In addition, we show how our techniques yield a protocol for the Uhlmann transformation problem that achieves a meaningful notion of zero-knowledge, also in the presence of a malicious verifier.
Authors:Katharina Arms
Abstract:
The linear decomposition attack reveals a vulnerability in encryption algorithms operating within groups or monoids with excessively small representations. The representation gap, defined as the size of the smallest non-trivial representation, therefore serves as a metric to assess the security of these algorithms. This paper will demonstrate that the diagrammatic Motzkin monoids exhibit a large representation gap, positioning them as promising candidates for robust encryption algorithms.
Authors:Ruoxing Yang
Abstract:
Large language models (LLMs) such as ChatGPT have evolved into powerful and ubiquitous tools. Fine-tuning on small datasets allows LLMs to acquire specialized skills for specific tasks efficiently. Although LLMs provide great utility in both general and task-specific use cases, they are limited by two security-related concerns. First, traditional LLM hardware requirements make them infeasible to run locally on consumer-grade devices. A remote network connection with the LLM provider's server is usually required, making the system vulnerable to network attacks. Second, fine-tuning an LLM for a sensitive task may involve sensitive data. Non-private fine-tuning algorithms produce models vulnerable to training data reproduction attacks. Our work addresses these security concerns by enhancing differentially private optimization algorithms and applying them to fine-tune localizable language models. We introduce adaptable gradient clipping along with other engineering enhancements to the standard DP-Adam optimizer to create DP-Adam-AC. We use our optimizer to fine-tune examples of two localizable LLM designs, small language model (Qwen2.5-0.5B) and 1.58 bit quantization (Bitnet-b1.58-2B). We demonstrate promising improvements in loss through experimentation with two synthetic datasets.
Authors:Francesca Gomez
Abstract:
Agentic misalignment occurs when goal-directed agents take harmful actions, such as blackmail, rather than risk goal failure, and can be triggered by replacement threats, autonomy reduction, or goal conflict (Lynch et al., 2025). We adapt insider-risk control design (Critical Pathway; Situational Crime Prevention) to develop preventative operational controls that steer agents toward safe actions when facing stressors. Using the blackmail scenario from the original Anthropic study by Lynch et al. (2025), we evaluate mitigations across 10 LLMs and 66,600 samples. Our main finding is that an externally governed escalation channel, which guarantees a pause and independent review, reduces blackmail rates from a no-mitigation baseline of 38.73% to 1.21% (averaged across all models and conditions). Augmenting this channel with compliance email bulletins further lowers the blackmail rate to 0.85%. Overall, incorporating preventative operational controls strengthens defence-in-depth strategies for agentic AI. We also surface a failure mode diverging from Lynch et al. (2025): two models (Gemini 2.5 Pro, Grok-4) take harmful actions without goal conflict or imminent autonomy threat, leveraging sensitive information for coercive signalling. In counterfactual swaps, both continued using the affair regardless of whether the CEO or CTO was implicated. An escalation channel eliminated coercion, but Gemini 2.5 Pro (19 pp) and Grok-4 (7 pp) escalated more when the CTO was implicated, unlike most models (higher in the CEO condition). The reason for this divergent behaviour is not clear from raw outputs and could reflect benign differences in reasoning or strategic discrediting of a potential future threat, warranting further investigation.
Authors:Elian Morel
Abstract:
Private Information Retrieval (PIR) allows a client to retrieve an entry $\text{DB}[i]$ from a public database $\text{DB}$ held by one or more servers, without revealing the queried index $i$. Traditional PIR schemes achieve sublinear server computation only under strong assumptions, such as the presence of multiple non-colluding servers or the use of public-key cryptography. To overcome these limitations, \textit{preprocessing PIR} schemes introduce a query-independent offline phase where the client collects \textit{hints} that enable efficient private queries during the online phase. In this work, we focus on preprocessing PIR schemes relying solely on \textit{One-Way Functions} (OWFs), which provide minimal cryptographic assumptions and practical implementability. We study three main constructions -- TreePIR, PIANO, and PPPS -- that explore different trade-offs between communication, storage, and server trust assumptions. Building upon the mechanisms introduced in PIANO and PPPS, we propose an adaptation of TreePIR to the single-server setting by introducing a dual-table hint structure (primary and backup tables) and a \textit{resampling} technique to refresh hints efficiently. Our proposed scheme achieves logarithmic upload bandwidth and $O(\sqrt{n}\log n)$ download complexity while requiring $O(\sqrt{n}\log n)$ client storage. This represents a significant improvement over prior single-server preprocessing PIR schemes such as PIANO ($O(\sqrt{n})$ bandwidth) and PPPS ($O(n^{1/4})$ bandwidth), while maintaining the simplicity and minimal assumptions of the OWF-based setting.
Authors:Santhosh KumarRavindran
Abstract:
The rapid adoption of large language models (LLMs) in enterprise systems exposes vulnerabilities to prompt injection attacks, strategic deception, and biased outputs, threatening security, trust, and fairness. Extending our adversarial activation patching framework (arXiv:2507.09406), which induced deception in toy networks at a 23.9% rate, we introduce the Unified Threat Detection and Mitigation Framework (UTDMF), a scalable, real-time pipeline for enterprise-grade models like Llama-3.1 (405B), GPT-4o, and Claude-3.5. Through 700+ experiments per model, UTDMF achieves: (1) 92% detection accuracy for prompt injection (e.g., jailbreaking); (2) 65% reduction in deceptive outputs via enhanced patching; and (3) 78% improvement in fairness metrics (e.g., demographic bias). Novel contributions include a generalized patching algorithm for multi-threat detection, three groundbreaking hypotheses on threat interactions (e.g., threat chaining in enterprise workflows), and a deployment-ready toolkit with APIs for enterprise integration.
Authors:David R. Gruber
Abstract:
Strategic Communication Protocols provide a structured approach for first contact with interstellar objects that demonstrate technological characteristics and high levels of threat. The protocols find their starting point in an ISO Information-Communication Paradox, namely, as our knowledge of an ISO's threatening capabilities increases, the probability of successful communication decreases while the urgency of communication attempts simultaneously intensifies. From this paradox, a Threat-Communication Viability Index is created to describe when the value of communication attempts outweighs strategic silence. The index scores the situation and operates as a decision-making tool for stakeholders tracking an ISO. The communication protocols subsequently outline several diplomatic strategies in cases where the index recommends communication.
Authors:David Megias
Abstract:
Ensuring the trustworthiness of data from distributed and resource-constrained environments, such as Wireless Sensor Networks or IoT devices, is critical. Existing Reversible Data Hiding (RDH) methods for scalar data suffer from low embedding capacity and poor intrinsic mixing between host data and watermark. This paper introduces Hiding in the Imaginary Domain with Data Encryption (H[i]dden), a novel framework based on complex number arithmetic for simultaneous information embedding and encryption. The H[i]dden framework offers perfect reversibility, in-principle unlimited watermark size, and intrinsic data-watermark mixing. The paper further introduces two protocols: H[i]dden-EG, for joint reversible data hiding and encryption, and H[i]dden-AggP, for privacy-preserving aggregation of watermarked data, based on partially homomorphic encryption. These protocols provide efficient and resilient solutions for data integrity, provenance and confidentiality, serving as a foundation for new schemes based on the algebraic properties of the complex domain.
Authors:Sanket Badhe
Abstract:
We present LegalSim, a modular multi-agent simulation of adversarial legal proceedings that explores how AI systems can exploit procedural weaknesses in codified rules. Plaintiff and defendant agents choose from a constrained action space (for example, discovery requests, motions, meet-and-confer, sanctions) governed by a JSON rules engine, while a stochastic judge model with calibrated grant rates, cost allocations, and sanction tendencies resolves outcomes. We compare four policies: PPO, a contextual bandit with an LLM, a direct LLM policy, and a hand-crafted heuristic; Instead of optimizing binary case outcomes, agents are trained and evaluated using effective win rate and a composite exploit score that combines opponent-cost inflation, calendar pressure, settlement pressure at low merit, and a rule-compliance margin. Across configurable regimes (e.g., bankruptcy stays, inter partes review, tax procedures) and heterogeneous judges, we observe emergent ``exploit chains'', such as cost-inflating discovery sequences and calendar-pressure tactics that remain procedurally valid yet systemically harmful. Evaluation via cross-play and Bradley-Terry ratings shows, PPO wins more often, the bandit is the most consistently competitive across opponents, the LLM trails them, and the heuristic is weakest. The results are stable in judge settings, and the simulation reveals emergent exploit chains, motivating red-teaming of legal rule systems in addition to model-level testing.
Authors:Basil Abdullah AL-Zahrani
Abstract:
This paper presents CADL (Cognitive-Adaptive Deception Layer), an adaptive deception framework achieving 99.88% detection rate with 0.13% false positive rate on the CICIDS2017 dataset. The framework employs ensemble machine learning (Random Forest, XGBoost, Neural Networks) combined with behavioral profiling to identify and adapt responses to network intrusions. Through a coordinated signal bus architecture, security components share real-time intelligence, enabling collective decision-making. The system profiles attackers based on temporal patterns and deploys customized deception strategies across five escalation levels. Evaluation on 50,000 CICIDS2017 test samples demonstrates that CADL significantly outperforms traditional intrusion detection systems (Snort: 71.2%, Suricata: 68.5%) while maintaining production-ready false positive rates. The framework's behavioral analysis achieves 89% accuracy in classifying attacker profiles. We provide open-source implementation and transparent performance metrics, offering an accessible alternative to commercial deception platforms costing $150-400 per host annually.
Authors:Awnon Bhowmik
Abstract:
Elliptic curve cryptography (ECC) is foundational to modern secure communication, yet existing standard curves have faced scrutiny for opaque parameter-generation practices. This work introduces a Selmer-inspired framework for constructing elliptic curves that is both transparent and auditable. Drawing from $2$- and $3$-descent methods, we derive binary quartics and ternary cubics whose classical invariants deterministically yield candidate $(c_4,c_6)$ parameters. Local solubility checks, modeled on Selmer admissibility, filter candidates prior to reconciliation into short-Weierstrass form over prime fields. We then apply established cryptographic validations, including group-order factorization, cofactor bounds, twist security, and embedding-degree heuristics. A proof-of-concept implementation demonstrates that the pipeline functions as a retry-until-success Las Vegas algorithm, with complete transcripts enabling independent verification. Unlike seed-based or purely efficiency-driven designs, our approach embeds arithmetic structure into parameter selection while remaining compatible with constant-time, side-channel resistant implementations. This work broadens the design space for elliptic curves, showing that descent techniques from arithmetic geometry can underpin trust-enhancing, standardization-ready constructions.
Authors:Abel C. H. Chen
Abstract:
Since the security of post-quantum cryptography (PQC) algorithms is based on the hardness of mathematical problems, while the security of quantum key distribution (QKD) relies on the fundamental principles of quantum physics, each approach possesses distinct advantages and limitations that can complement one another. Consequently, recent studies have proposed hybrid schemes that combine QKD and PQC to establish a dual-layered security model. In response to this trend, this study proposes hybrid schemes that integrate QKD with the National Institute of Standards and Technology (NIST) standardized PQC algorithms. These hybrid schemes include two core components: a hybrid QKD-PQC key exchange protocol and a hybrid QKD-PQC digital signature scheme. For the hybrid key exchange protocol, this study combines Module-Lattice-based Key Encapsulation Mechanisms (ML-KEM) with QKD protocols, specifically BB84 and E91, to construct a secure key exchange protocol. In the design of the hybrid digital signature scheme, this study utilizes Module-Lattice-based Digital Signature Algorithms (ML-DSA) and Stateless Hash-based Digital Signature Algorithms (SLH-DSA) to generate signature reconstruction values. These values are verified using confirmation codes transmitted via the BB84 and E91 protocols. The proposed hybrid key exchange protocol is evaluated by examining the shared secret key it produces, particularly with respect to entropy and whether the output is independent and identically distributed (IID). Furthermore, the computation time and message lengths of the proposed hybrid schemes are evaluated.
Authors:Ayda Aghaei Nia
Abstract:
Completely Automated Public Turing tests to tell Computers and Humans Apart (CAPTCHAs) are a foundational component of web security, yet traditional implementations suffer from a trade-off between usability and resilience against AI-powered bots. This paper introduces a novel hybrid CAPTCHA system that synergizes the cognitive challenges posed by Large Language Models (LLMs) with the behavioral biometric analysis of keystroke dynamics. Our approach generates dynamic, unpredictable questions that are trivial for humans but non-trivial for automated agents, while simultaneously analyzing the user's typing rhythm to distinguish human patterns from robotic input. We present the system's architecture, formalize the feature extraction methodology for keystroke analysis, and report on an experimental evaluation. The results indicate that our dual-layered approach achieves a high degree of accuracy in bot detection, successfully thwarting both paste-based and script-based simulation attacks, while maintaining a high usability score among human participants. This work demonstrates the potential of combining cognitive and behavioral tests to create a new generation of more secure and user-friendly CAPTCHAs.
Authors:Dongfang Zhao
Abstract:
Fully Homomorphic Encryption (FHE) provides a powerful paradigm for secure computation, but its practical adoption is severely hindered by the prohibitive computational cost of its bootstrapping procedure. The complexity of all current bootstrapping methods is fundamentally tied to the multiplicative depth of the decryption circuit, denoted $L_{dec}$, making it the primary performance bottleneck. This paper introduces a new approach to bootstrapping that completely bypasses the traditional circuit evaluation model. We apply the tools of modern arithmetic geometry to reframe the bootstrapping operation as a direct geometric projection. Our framework models the space of ciphertexts as an affine scheme and rigorously defines the loci of decryptable and fresh ciphertexts as distinct closed subschemes. The bootstrapping transformation is then realized as a morphism between these two spaces. Computationally, this projection is equivalent to solving a specific Closest Vector Problem (CVP) instance on a highly structured ideal lattice, which we show can be done efficiently using a technique we call algebraic folding. The primary result of our work is a complete and provably correct bootstrapping algorithm with a computational complexity of $O(d \cdot \text{poly}(\log q))$, where $d$ is the ring dimension and $q$ is the ciphertext modulus. The significance of this result lies in the complete elimination of the factor $L_{dec}$ from the complexity, representing a fundamental asymptotic improvement over the state of the art. This geometric perspective offers a new and promising pathway toward achieving truly practical and high-performance FHE.
Authors:Mohammed A. Shehab
Abstract:
This paper introduces Agentic-AI Healthcare, a privacy-aware, multilingual, and explainable research prototype developed as a single-investigator project. The system leverages the emerging Model Context Protocol (MCP) to orchestrate multiple intelligent agents for patient interaction, including symptom checking, medication suggestions, and appointment scheduling. The platform integrates a dedicated Privacy and Compliance Layer that applies role-based access control (RBAC), AES-GCM field-level encryption, and tamper-evident audit logging, aligning with major healthcare data protection standards such as HIPAA (US), PIPEDA (Canada), and PHIPA (Ontario). Example use cases demonstrate multilingual patient-doctor interaction (English, French, Arabic) and transparent diagnostic reasoning powered by large language models. As an applied AI contribution, this work highlights the feasibility of combining agentic orchestration, multilingual accessibility, and compliance-aware architecture in healthcare applications. This platform is presented as a research prototype and is not a certified medical device.
Authors:Anais Jaikissoon
Abstract:
The Age of Artificial Intelligence is here. In 2025, there are few regulations governing artificial intelligence. While the expansion of artificial intelligence is going in a relatively good direction, there is a risk that it can be misused. Misuse of technology is nothing new and will continue to happen. The lack of regulation in artificial intelligence is necessary because it raises the question of how we can move forward without knowing what the limits are. While artificial intelligence dominates the technology industry, new technology is starting to emerge. Quantum cryptography is expected to replace classical cryptography; however, the transition from classical to quantum cryptography is expected to occur within the next 10 years. The ability to transition from classical to quantum cryptography requires hybrid cryptography. Hybrid cryptography can be used now; however, similar to artificial intelligence, there is no regulation or support for the regulatory infrastructure regarding hybrid machines. This paper will explore the regulatory gaps in hybrid cryptography. The paper will also offer solutions to fix the gaps and ensure the transition from classical to quantum cryptography is safely and effectively completed.
Authors:Jason Anderson
Abstract:
This work derives the authentication security of pseudorandom function (PRF) GNSS ranging under multiple GNSS spoofing models, including the Security Code Estimation and Replay (SCER) spoofer. When GNSS ranging codes derive from a PRF utilizing a secret known only to the broadcaster, the spoofer cannot predict the ranging code before broadcast. Therefore, PRF ranging can be used to establish trust in the GNSS pseudoranges and the resulting receiver position, navigation, and timing (PNT) solution. I apply the methods herein to Galileo's Signal Authentication Service (SAS) utilizing the encrypted Galileo E6-C signal to compute that, at most, 400 ms of Galileo E6-C data to assert 128-bit authentication security under non-SCER models. For the SCER adversary, I predict the adversary's needed receiving radio equipment to break authentication security. One can use this work to design a PRF GNSS ranging protocol to meet useful authentication security requirements by computing the probability of missed detection.
Authors:Aueaphum Aueawatthanaphisut
Abstract:
Secure and interoperable integration of heterogeneous medical data remains a grand challenge in digital health. Current federated learning (FL) frameworks offer privacy-preserving model training but lack standardized mechanisms to orchestrate multi-modal data fusion across distributed and resource-constrained environments. This study introduces a novel framework that leverages the Model Context Protocol (MCP) as an interoperability layer for secure, cross-agent communication in multi-modal federated healthcare systems. The proposed architecture unifies three pillars: (i) multi-modal feature alignment for clinical imaging, electronic medical records, and wearable IoT data; (ii) secure aggregation with differential privacy to protect patient-sensitive updates; and (iii) energy-aware scheduling to mitigate dropouts in mobile clients. By employing MCP as a schema-driven interface, the framework enables adaptive orchestration of AI agents and toolchains while ensuring compliance with privacy regulations. Experimental evaluation on benchmark datasets and pilot clinical cohorts demonstrates up to 9.8\% improvement in diagnostic accuracy compared with baseline FL, a 54\% reduction in client dropout rates, and clinically acceptable privacy--utility trade-offs. These results highlight MCP-enabled multi-modal fusion as a scalable and trustworthy pathway toward equitable, next-generation federated health infrastructures.
Authors:Palash Sarkar
Abstract:
We describe several families of efficiently implementable Boolean functions achieving provable trade-offs between resiliency, nonlinearity, and algebraic immunity. In particular, the following statement holds for each of the function families that we propose. Given integers $m_0\geq 0$, $x_0\geq 1$, and $a_0\geq 1$, it is possible to construct an $n$-variable function which has resiliency at least $m_0$, linear bias (which is an equivalent method of expressing nonlinearity) at most $2^{-x_0}$ and algebraic immunity at least $a_0$; further, $n$ is linear in $m_0$, $x_0$ and $a_0$, and the function can be implemented using $O(n)$ 2-input gates, which is essentially optimal.
Authors:Andrés F. Betancur-López
Abstract:
Privacy and security in Smart Cities remain at constant risk due to the vulnerabilities introduced by Internet of Things (IoT) devices. The limited computational resources of these devices make them especially susceptible to attacks, while their widespread adoption increases the potential impact of security breaches. This article presents a review of security proposals aimed at protecting IoT devices in Smart City environments. The review was conducted by analyzing recent literature on device-level security, with particular emphasis on lightweight cryptography, physically unclonable functions (PUFs), and blockchain-based solutions. Findings highlight both the strengths and limitations of current approaches, as well as the need for more practical, scalable, and resource-efficient mechanisms to ensure user privacy and data protection in IoT ecosystems.
Authors:Vedant Palit
Abstract:
Federated learning is vulnerable to poisoning and backdoor attacks under partial observability. We formulate defence as a partially observable sequential decision problem and introduce a trust-aware Deep Q-Network that integrates multi-signal evidence into client trust updates while optimizing a long-horizon robustness--accuracy objective. On CIFAR-10, we (i) establish a baseline showing steadily improving accuracy, (ii) show through a Dirichlet sweep that increased client overlap consistently improves accuracy and reduces ASR with stable detection, and (iii) demonstrate in a signal-budget study that accuracy remains steady while ASR increases and ROC-AUC declines as observability is reduced, which highlights that sequential belief updates mitigate weaker signals. Finally, a comparison with random, linear-Q, and policy gradient controllers confirms that DQN achieves the best robustness--accuracy trade-off.
Authors:Preston Vander Vos
Abstract:
Users of blockchains value scalability, expecting fast confirmations and immediate transaction processing. Odontoceti, the latest in DAG-based consensus, addresses these concerns by prioritizing low latency and high throughput, making a strategic trade-off in security by operating with a 20% fault tolerance instead of the established 33% level. It is the first DAG-based protocol to achieve commitment in just two communication rounds, delivering median latency of 300 milliseconds while processing 10,000 transactions per second under realistic network conditions. Odontoceti operates with n = 5f + 1 validators and creates an uncertified DAG with a novel decision rule for committing blocks. The protocol includes an optimization that advances progress when participants are slow, benefiting crash fault scenarios which are more common in practice than Byzantine faults. Evaluation results demonstrate 20-25% latency improvements compared to an existing production protocol, validating that reducing wave length from three rounds to two rounds yields meaningful performance benefits. This paper establishes the practical viability of lower fault tolerance consensus protocols for blockchains.
Authors:Aueaphum Aueawatthanaphisut
Abstract:
Rare-disease diagnosis remains one of the most pressing challenges in digital health, hindered by extreme data scarcity, privacy concerns, and the limited resources of edge devices. This paper proposes the Adaptive Federated Few-Shot Rare-Disease Diagnosis (AFFR) framework, which integrates three pillars: (i) few-shot federated optimization with meta-learning to generalize from limited patient samples, (ii) energy-aware client scheduling to mitigate device dropouts and ensure balanced participation, and (iii) secure aggregation with calibrated differential privacy to safeguard sensitive model updates. Unlike prior work that addresses these aspects in isolation, AFFR unifies them into a modular pipeline deployable on real-world clinical networks. Experimental evaluation on simulated rare-disease detection datasets demonstrates up to 10% improvement in accuracy compared with baseline FL, while reducing client dropouts by over 50% without degrading convergence. Furthermore, privacy-utility trade-offs remain within clinically acceptable bounds. These findings highlight AFFR as a practical pathway for equitable and trustworthy federated diagnosis of rare conditions.
Authors:Xin Lyu
Abstract:
We consider online and PAC learning of Littlestone classes subject to the constraint of approximate differential privacy. Our main result is a private learner to online-learn a Littlestone class with a mistake bound of $\tilde{O}(d^{9.5}\cdot \log(T))$ in the realizable case, where $d$ denotes the Littlestone dimension and $T$ the time horizon. This is a doubly-exponential improvement over the state-of-the-art [GL'21] and comes polynomially close to the lower bound for this task. The advancement is made possible by a couple of ingredients. The first is a clean and refined interpretation of the ``irreducibility'' technique from the state-of-the-art private PAC-learner for Littlestone classes [GGKM'21]. Our new perspective also allows us to improve the PAC-learner of [GGKM'21] and give a sample complexity upper bound of $\widetilde{O}(\frac{d^5 \log(1/δβ)}{\varepsilon α})$ where $α$ and $β$ denote the accuracy and confidence of the PAC learner, respectively. This improves over [GGKM'21] by factors of $\frac{d}α$ and attains an optimal dependence on $α$. Our algorithm uses a private sparse selection algorithm to \emph{sample} from a pool of strongly input-dependent candidates. However, unlike most previous uses of sparse selection algorithms, where one only cares about the utility of output, our algorithm requires understanding and manipulating the actual distribution from which an output is drawn. In the proof, we use a sparse version of the Exponential Mechanism from [GKM'21] which behaves nicely under our framework and is amenable to a very easy utility proof.
Authors:Yuan Huang
Abstract:
Recent advancements in training paradigms for Large Language Models (LLMs) have unlocked their remarkable capabilities in natural language processing and cross-domain generalization. While LLMs excel in tasks like programming and mathematical problem-solving, their zero-shot performance in specialized domains requiring expert knowledge, such as cybersecurity, is often suboptimal. This limitation arises because foundational LLMs are designed for general-purpose applications, constraining their ability to encapsulate domain-specific expertise within their parameter space. To address this, we explore fine-tuning strategies to embed cybersecurity knowledge into LLMs, enhancing their performance in cybersecurity question-answering (Q\&A) tasks while prioritizing computational efficiency. Specifically, we investigate Supervised Fine-Tuning (SFT), Low-Rank Adaptation (LoRA), and Quantized Low-Rank Adaptation (QLoRA) using a cybersecurity Q\&A dataset. Our results demonstrate that these fine-tuning approaches significantly outperform the foundational model in cybersecurity Q\&A tasks. Moreover, LoRA and QLoRA achieve comparable performance to SFT with substantially lower computational costs, offering an efficient pathway for adapting LLMs to specialized domains. Our work highlights the potential of low-rank fine-tuning strategies to bridge the gap between general-purpose LLMs and domain-specific applications.
Authors:Daksh Pandey
Abstract:
Self-supervised learning (SSL) has emerged as a powerful paradigm for learning representations on graph data without requiring manual labels. However, leading SSL methods like GRACE are fundamentally incompatible with privacy-preserving technologies such as Homomorphic Encryption (HE) due to their reliance on non-polynomial operations. This paper introduces Poly-GRACE, a novel framework for HE-compatible self-supervised learning on graphs. Our approach consists of a fully polynomial-friendly Graph Convolutional Network (GCN) encoder and a novel, polynomial-based contrastive loss function. Through experiments on three benchmark datasets -- Cora, CiteSeer, and PubMed -- we demonstrate that Poly-GRACE not only enables private pre-training but also achieves performance that is highly competitive with, and in the case of CiteSeer, superior to the standard non-private baseline. Our work represents a significant step towards practical and high-performance privacy-preserving graph representation learning.
Authors:Carlos Benitez
Abstract:
The emergence of large-scale quantum computers, powered by algorithms like Shor's and Grover's, poses an existential threat to modern public-key cryptography. This vulnerability stems from the ability of these machines to efficiently solve the hard mathematical problems - such as integer factorization and the elliptic curve discrete logarithm problem - that underpin widely used cryptographic primitives. This includes RSA, Diffie-Hellman (DH), Elliptic Curve Diffie-Hellman (ECDH), and Elliptic Curve Digital Signature Algorithm (ECDSA), which are foundational to security across the digital ecosystem. Once Shor's algorithm becomes practically realizable, these primitives will fail, undermining both retrospective confidentiality and cryptographic authenticity - enabling adversaries to decrypt previously captured communications and forge digital signatures. This paper presents a systematic inventory of technologies exposed to quantum threats from the engineering perspective, organized by both technology domain and by implementation environment. While prior research has emphasized theoretical breaks or protocol-level adaptations, this work focuses on the practical landscape - mapping quantum-vulnerable systems across diverse digital infrastructures. The contribution is a cross-domain, cross-environment threat map to guide practitioners, vendors, and policymakers in identifying exposed technologies before the arrival of cryptographically relevant quantum computers.
Authors:Petar Radanliev
Abstract:
This study presents a structured approach to evaluating vulnerabilities within quantum cryptographic protocols, focusing on the BB84 quantum key distribution method and National Institute of Standards and Technology (NIST) approved quantum-resistant algorithms. By integrating AI-driven red teaming, automated penetration testing, and real-time anomaly detection, the research develops a framework for assessing and mitigating security risks in quantum networks. The findings demonstrate that AI can be effectively used to simulate adversarial attacks, probe weaknesses in cryptographic implementations, and refine security mechanisms through iterative feedback. The use of automated exploit simulations and protocol fuzzing provides a scalable means of identifying latent vulnerabilities, while adversarial machine learning techniques highlight novel attack surfaces within AI-enhanced cryptographic processes. This study offers a comprehensive methodology for strengthening quantum security and provides a foundation for integrating AI-driven cybersecurity practices into the evolving quantum landscape.
Authors:Chaerin Kim
Abstract:
As Programmable Logic Controller (PLC) became a useful device and rose as an interesting research topic but remained expensive, multiple PLC simulators/emulators were introduced for various purposes. Open-source Programmable Logic Controller (OpenPLC) software, one of the most popular PLC simulators, is designed to be vendor-neutral and run on almost any computer or low-cost embedded devices, e.g., Raspberry Pi, Arduino, and other controllers. The project succeeded in introducing itself as an affordable and practical solution for the high cost of real hardware PLCs. However, it still lacks appropriate securing methods, resulting in several vulnerabilities. Through a combination of threat modeling, vulnerability analysis, and practical experiments, this thesis provides valuable insights for developers, researchers, and engineers aiming to deploy OpenPLC securely in industrial environments. To this end, this work first conducts an in-depth analysis aimed to shed light on va! rious security challenges and vulnerabilities within the OpenPLC project. After that, an advanced control logic injection attack was performed. This attack modifies the user program maliciously, exploiting presented vulnerabilities. Finally, the work introduces a security-enhanced OpenPLC software called OpenPLC Aqua. The new software is equipped with a set of security solutions designed specifically to address the vulnerabilities to which current OpenPLC versions are prone.
Authors:Michel Youssef
Abstract:
We define a practical method to quantify the trade-off between security and operational friction in modern identity-centric programs. We introduce the Security Friction Quotient (SFQ), a bounded composite index that combines a residual-risk estimator with empirically grounded friction terms (latency, failure rate, and helpdesk impact). We establish clarity properties (boundedness, monotonic response, and weight identifiability) with short proofs, then evaluate widely used Conditional Access policies over a 12-week horizon using Monte Carlo simulation (n = 2,000 runs per policy/scenario) with effect sizes and 95% confidence intervals. We further assess rank stability under 10,000 random weight draws, finding 95.5% preservation of policy ordering. Finally, we provide a 12-week passkey field observation from an enterprise-scale cohort (N = 1,200) that directionally aligns with the simulation's phishing-resistant MFA gains. The SFQ framework is designed to be reproducible, interpretable, and directly actionable for Zero Trust identity policy decisions, with artifacts and parameter ranges provided to support policy design, review, and continuous improvement.
Authors:Lev Stambler
Abstract:
We introduce Verifiable One-Time Programs (Ver-OTPs) and use them to construct single-round Open Secure Computation (OSC), a novel primitive enabling applications like (1) single-round sealed-bid auctions, (2) single-round and honest-majority atomic proposes -- a building block of consensus protocols, and (3) single-round differentially private statistical aggregation without pre-registration. First, we construct Ver-OTPs from single-qubit states and classical cryptographic primitives. Then, assuming a multi-key homomorphic scheme (MHE) with certain properties, we use Ver-OTPs with MHE to construct OSC. The underlying quantum requirement is minimal: only single-qubit states are needed alongside a hardware assumption on the receiver's quantum resources. Our work therefore provides a new framework for quantum-assisted cryptography that may be implementable with near-term quantum technology.
Authors:Jason Anderson
Abstract:
Cryptographic Ranging Authentication is here! We present initial results on the Pulsar authenticated ranging service broadcast from space with Pulsar-0 utilizing a recording taken at Xona headquarters in Burlingame, CA. No assumptions pertaining to the ownership or leakage of encryption keys are required. This work discusses the Pulsar watermark design and security analysis. We derive the Pulsar watermark's probabilities of missed detection and false alarm, and we discuss the required receiver processing needed to utilize the Pulsar watermark. We present validation results of the Pulsar watermark utilizing the transmissions from orbit. Lastly, we provide results that demonstrate the spoofing detection efficacy with a spoofing scenario that incorporates the authentic transmissions from orbit. Because we make no assumption about the leakage of symmetric encryption keys, this work provides mathematical justification of the watermark's security, and our July 2025 transmissions from orbit, we claim the world's first authenticated satellite pseudorange from orbit.
Authors:Sathvik Swaminathan
Abstract:
We present a novel mechanism to construct a covert channel based on page faults. A page fault is an event that occurs when a process or a thread tries to access a page of memory that is not currently mapped to its address space. The kernel typically responds to this event by performing a context switch to allow another process or thread to execute while the page is being fetched from the disk. We exploit this behavior to allow a malicious process to leak secret data to another process, bypassing the isolation mechanisms enforced by the operating system. These attacks do not leverage timers and are hardwareagnostic. Experimental results demonstrate that this attack can achieve a bit error rate of under 4%
Authors:Amir AL-Maamari
Abstract:
The rapid integration of AI-powered coding assistants into developer workflows has raised significant privacy and trust concerns. As developers entrust proprietary code to services like OpenAI's GPT, Google's Gemini, and GitHub Copilot, the unclear data handling practices of these tools create security and compliance risks. This paper addresses this challenge by introducing and applying a novel, expert-validated privacy scorecard. The methodology involves a detailed analysis of four document types; from legal policies to external audits; to score five leading assistants against 14 weighted criteria. A legal expert and a data protection officer refined these criteria and their weighting. The results reveal a distinct hierarchy of privacy protections, with a 20-point gap between the highest- and lowest-ranked tools. The analysis uncovers common industry weaknesses, including the pervasive use of opt-out consent for model training and a near-universal failure to filter secrets from user prompts proactively. The resulting scorecard provides actionable guidance for developers and organizations, enabling evidence-based tool selection. This work establishes a new benchmark for transparency and advocates for a shift towards more user-centric privacy standards in the AI industry.
Authors:M. Andrecut
Abstract:
In this paper we discuss several surprisingly simple methods for transforming the Raspberry Pi Pico (RP2) microcontroller into a radio transmitter, by using only cheap off the shelf electronic components, and open source software. While initially this transformation may look as a harmless curiosity, in some extreme cases it can also pose security risks, since it can be used to open a large number of local stealth radio communication channels.
Authors:Joseph Carolan
Abstract:
The analysis of quantum algorithms which query random, invertible permutations has been a long-standing challenge in cryptography. Many techniques which apply to random oracles fail, or are not known to generalize to this setting. As a result, foundational cryptographic constructions involving permutations often lack quantum security proofs. With the aim of closing this gap, we develop and prove soundness of a compressed permutation oracle. Our construction shares many of the attractive features of Zhandry's original compressed function oracle: the purification is a small list of input-output pairs which meaningfully reflect an algorithm's knowledge of the oracle. We then apply this framework to show that the Feistel construction with seven rounds is a strong quantum PRP, resolving an open question of (Zhandry, 2012). We further re-prove essentially all known quantum query lower bounds in the random permutation model, notably the collision and preimage resistance of both Sponge and Davies-Meyer, hardness of double-sided zero search and sparse predicate search, and give new lower bounds for cycle finding and the one-more problem.
Authors:Steve Huntsman
Abstract:
Large language models (LLMs) can compile weighted graphs on natural language data to enable automatic coherence-driven inference (CDI) relevant to red and blue team operations in cybersecurity. This represents an early application of automatic CDI that holds near- to medium-term promise for decision-making in cybersecurity and eventually also for autonomous blue team operations.
Authors:Eric Filiol
Abstract:
Since the early 2010s, social network-based influence technologies have grown almost exponentially. Initiated by the U.S. Army's early OEV system in 2011, a number of companies specializing in this field have emerged. The most (in)famous cases are Bell Pottinger, Cambridge Analytica, Aggregate-IQ and, more recently, Team Jorge. In this paper, we consider the use-case of sock puppet master activities, which consist in creating hundreds or even thousands of avatars, in organizing them into communities and implement influence operations. On-purpose software is used to automate these operations (e.g. Ripon software, AIMS) and organize these avatar populations into communities. The aim is to organize targeted and directed influence communication to rather large communities (influence targets). The goal of the present research work is to show how these community management techniques (social networks) can also be used to communicate/disseminate relatively large volumes (up to a few tens of Mb) of multi-level encrypted information to a limited number of actors. To a certain extent, this can be compared to a Dark Post-type function, with a number of much more powerful potentialities. As a consequence, the concept of communication has been totally redefined and disrupted, so that eavesdropping, interception and jamming operations no longer make sense.
Authors:Farhad Farokhi
Abstract:
Privacy-preserving state estimation for linear time-invariant dynamical systems with crowd sensors is considered. At any time step, the estimator has access to measurements from a randomly selected sensor from a pool of sensors with pre-specified models and noise profiles. A Luenberger-like observer is used to fuse the measurements with the underlying model of the system to recursively generate the state estimates. An additive privacy-preserving noise is used to constrain information leakage. Information leakage is measured via mutual information between the identity of the sensors and the state estimate conditioned on the actual state of the system. This captures an omnipotent adversary that not only can access state estimates but can also gather direct high-quality state measurements. Any prescribed level of information leakage is shown to be achievable by appropriately selecting the variance of the privacy-preserving noise. Therefore, privacy-utility trade-off can be fine-tuned.
Authors:James J. Cusick
Abstract:
Software vulnerabilities remain a significant risk factor in achieving security objectives within software development organizations. This is especially true where either proprietary or open-source software (OSS) is included in the technological environment. In this paper an end-to-end process with supporting methods and tools is presented. This industry proven generic process allows for the custom instantiation, configuration, and execution of routinized code scanning for software vulnerabilities and their prioritized remediation. A select set of tools are described for this key DevSecOps function and placed into an iterative process. Examples of both industrial proprietary applications and open-source applications are provided including specific vulnerability instances and a discussion of their treatment. The benefits of each selected tool are considered, and alternative tools are also introduced. Application of this method in a comprehensive SDLC model is also reviewed along with prospective enhancements from automation and the application of advanced technologies including AI. Adoption of this method can be achieved with minimal adjustments and with maximum flexibility for results in reducing source code vulnerabilities, reducing supply chain risk, and improving the security profile of new or legacy solutions.
Authors:Avi Shaked
Abstract:
Security risk assessment is essential in establishing the trustworthiness and reliability of modern systems. While various security risk assessment approaches exist, prevalent applications are "pen and paper" implementations that -- even if performed digitally using computers -- remain prone to authoring mistakes and inconsistencies. Computer-aided design approaches can transform security risk assessments into more rigorous and sustainable efforts. This is of value to both industrial practitioners and researchers, who practice security risk assessments to reflect on systems' designs and to contribute to the discipline's state-of-the-art. In this article, we report the application of a model-based security design tool to reproduce a previously reported security assessment. The main contributions are: 1) an independent attempt to reproduce a refereed article describing a real security risk assessment of a system; 2) comparison of a new computer-aided application with a previous non-computer-aided application, based on a published, real-world case study; 3) a showcase for the potential advantages -- for both practitioners and researchers -- of using computer-aided design approaches to analyze reports and to assess systems.
Authors:Uwe Serdült
Abstract:
Elections are not the only but arguably one of the most important pillars for the proper functioning of liberal democracies. Recent evidence across the globe shows that it is not straightforward to conduct them in a free and fair manner. One constant concern is the role of money in politics, more specifically, election campaign financing. Frequent scandals are proof of the difficulties encountered with current approaches to tackle the issue. Suggestions on how to overcome the problem exist but seem difficult to implement. With the help of blockchain technology we might be able to make a step forward. A separate crypto currency specifically designed to pay for costs of political campaigning and advertising could be introduced. Admittedly, at this stage, there are many open questions. However, under the assumption that blockchain technology is here to stay, it is an idea that deserves further exploration.
Authors:Toby Sharp
Abstract:
Bitcoin's consensus rules are encoded in the implementation of its reference client: "The code is the spec." Yet this code is unsuitable for formal verification due to side effects, mutable state, concurrency, and legacy design. A standalone formal specification would enable verification both across versions of the reference client and against new client implementations, strengthening decentralization by reducing the risk of consensus-splitting bugs. Yet such a specification has long been considered intractable given the complexity of Bitcoin's consensus logic. We demonstrate a compact, executable, declarative C++ specification of Bitcoin consensus rules that syncs mainnet to tip in a few hours on a single thread. We also introduce the Hornet Domain-Specific Language (DSL) specifically designed to encode these rules unambiguously for execution, enabling formal reasoning, consensus code generation, and AI-driven adversarial testing. Our spec-driven client Hornet Node offers a modern and modular complement to the reference client. Its clear, idiomatic style makes it suitable for education, while its performance makes it ideal for experimentation. We highlight architectural contributions such as its layered design, efficient data structures, and strong separation of concerns, supported by production-quality code examples. We argue that Hornet Node and Hornet DSL together provide the first credible path toward a pure, formal, executable specification of Bitcoin consensus.
Authors:Md Talha Mohsin
Abstract:
This paper introduces a Blockchain-Integrated Explainable AI Framework (BXHF) for healthcare systems to tackle two essential challenges confronting health information networks: safe data exchange and comprehensible AI-driven clinical decision-making. Our architecture incorporates blockchain, ensuring patient records are immutable, auditable, and tamper-proof, alongside Explainable AI (XAI) methodologies that yield transparent and clinically relevant model predictions. By incorporating security assurances and interpretability requirements into a unified optimization pipeline, BXHF ensures both data-level trust (by verified and encrypted record sharing) and decision-level trust (with auditable and clinically aligned explanations). Its hybrid edge-cloud architecture allows for federated computation across different institutions, enabling collaborative analytics while protecting patient privacy. We demonstrate the framework's applicability through use cases such as cross-border clinical research networks, uncommon illness detection and high-risk intervention decision support. By ensuring transparency, auditability, and regulatory compliance, BXHF improves the credibility, uptake, and effectiveness of AI in healthcare, laying the groundwork for safer and more reliable clinical decision-making.
Authors:VÃctor Mayoral-Vilches
Abstract:
The rapid advancement of humanoid robotics presents unprecedented cybersecurity challenges that existing theoretical frameworks fail to adequately address. This report presents a comprehensive security assessment of a production humanoid robot platform, bridging the gap between abstract security models and operational vulnerabilities. Through systematic static analysis, runtime observation, and cryptographic examination, we uncovered a complex security landscape characterized by both sophisticated defensive mechanisms and critical vulnerabilities. Our findings reveal a dual-layer proprietary encryption system (designated FMX') that, while innovative in design, suffers from fundamental implementation flaws including the use of static cryptographic keys that enable offline configuration decryption. More significantly, we documented persistent telemetry connections transmitting detailed robot state information--including audio, visual, spatial, and actuator data--to external servers without explicit user consent or notification mechanisms. We operationalized a Cybersecurity AI agent on the Unitree G1 to map and prepare exploitation of its manufacturer's cloud infrastructure, illustrating how a compromised humanoid can escalate from covert data collection to active counter-offensive operations. We argue that securing humanoid robots requires a paradigm shift toward Cybersecurity AI (CAI) frameworks that can adapt to the unique challenges of physical-cyber convergence. This work contributes empirical evidence for developing robust security standards as humanoid robots transition from research curiosities to operational systems in critical domains.
Authors:Giovanni Giuseppe Grimaldi
Abstract:
Homomorphic encryption is a powerful cryptographic tool that enables secure computations on the private data. It evaluates any function for any operation securely on the encrypted data without knowing its corresponding plaintext. For original data $p$, $c$ denotes the ciphertext of the original plaintext $p$, i.e. $c = Encrypt_k(p)$. This is crucial for any sensitive application running in the Cloud, because we must protect data privacy even in the case when the server has falled victim to a cyber attack. The encryption scheme $Encrypt_k$ is said to be homomorphic with respect to some set of operations $\mathcal{O}$, if for any operation $\circ \in \mathcal{O}$ one can compute $Encrypt_k(p_1 \circ p_2)$ from $Encrypt_k(p_1) \circ Encrypt_k(p_2)$. Those schemes come in three forms: somewhat, partially and fully homomorphic. In this survey, we present the state of art of the known homomorphic encryption schemes based on coding theory and polynomials.
Authors:Abhishek Goswami
Abstract:
Autonomous LLM agents can issue thousands of API calls per hour without human oversight. OAuth 2.0 assumes deterministic clients, but in agentic settings stochastic reasoning, prompt injection, or multi-agent orchestration can silently expand privileges. We introduce Agentic JWT (A-JWT), a dual-faceted intent token that binds each agent's action to verifiable user intent and, optionally, to a specific workflow step. A-JWT carries an agent's identity as a one-way checksum hash derived from its prompt, tools and configuration, and a chained delegation assertion to prove which downstream agent may execute a given task, and per-agent proof-of-possession keys to prevent replay and in-process impersonation. We define a new authorization mechanism and add a lightweight client shim library that self-verifies code at run time, mints intent tokens, tracks workflow steps and derives keys, thus enabling secure agent identity and separation even within a single process. We illustrate a comprehensive threat model for agentic applications, implement a Python proof-of-concept and show functional blocking of scope-violating requests, replay, impersonation, and prompt-injection pathways with sub-millisecond overhead on commodity hardware. The design aligns with ongoing OAuth agent discussions and offers a drop-in path toward zero-trust guarantees for agentic applications. A comprehensive performance and security evaluation with experimental results will appear in our forthcoming journal publication
Authors:Luigi Logrippo
Abstract:
The multi level Bell La Padula model for secure data access and data flow control, formulated in the 1970s, was based on the theory of partial orders. Since then, another model, based on lattice theory, has prevailed. We present reasons why the partial order model is more appropriate. We also show, by example, how non lattice data flow networks can be easily implemented by using Attribute-based access control (ABAC).
Authors:Christophe Parisel
Abstract:
While the Voynich Manuscript was almost certainly written left-to-right (LTR), the question whether the underlying script or cipher reads LTR or right-to-left (RTL) has received little quantitative attention. We introduce a statistical method that leverages n-gram perplexity asymmetry to determine directional bias in character sequences.
Authors:Madhava Gaikwad
Abstract:
This position paper presents AVEC (Adaptive Verifiable Edge Control), a framework for bootstrapping privacy for local language models by enforcing privacy at the edge with explicit verifiability for delegated queries. AVEC introduces an adaptive budgeting algorithm that allocates per-query differential privacy parameters based on sensitivity, local confidence, and historical usage, and uses verifiable transformation with on-device integrity checks. We formalize guarantees using Rényi differential privacy with odometer-based accounting, and establish utility ceilings, delegation-leakage bounds, and impossibility results for deterministic gating and hash-only certification. Our evaluation is simulation-based by design to study mechanism behavior and accounting; we do not claim deployment readiness or task-level utility with live LLMs. The contribution is a conceptual architecture and theoretical foundation that chart a pathway for empirical follow-up on privately bootstrapping local LLMs.
Authors:Shivam Akhauri
Abstract:
We address when a best-first router for tool-use agents can stop exploring without missing a better leaf, while preserving local differential privacy (LDP) and leaving an audit trail. We introduce a run-wise certificate that couples each node's key to the same exponential race that realizes leaf perturbations; the usual halting rule (stop when the maximum over $v$ in $F$ of Key$(v) \le B^*$) then certifies the realized run. We give two certified modes on context-indexed prefix DAGs with child partition: (i) Exact (known counts), using lazy offset propagation with winner reuse; and (ii) Surrogate (upper bounds only), which anchors keys to a parent-level surrogate race and allows validator tightening via $κ= \log(N / N_{ub}$). A small compiler enforces the partition property, and an admissible, race-independent M(tau) keeps keys sound. The ledger logs uniforms, counts, and tie handling; privacy follows by post-processing. Experiments on synthetic graphs and a small real pipeline show tight stopping, deterministic replay, and low overhead.
Authors:Umberto Gonçalves de Sousa
Abstract:
Reinforcement learning (RL) has transformed sequential decision-making, but traditional algorithms like Deep Q-Networks (DQNs) and Proximal Policy Optimization (PPO) often struggle with efficient exploration, stability, and adaptability in dynamic environments. This study presents LogGuardQ (Adaptive Log Guard with Cognitive enhancement), a novel framework that integrates a dual-memory system inspired by human cognition and adaptive exploration strategies driven by temperature decay and curiosity. Evaluated on a dataset of 1,000,000 simulated access logs with 47.9% anomalies over 20,000 episodes, LogGuardQ achieves a 96.0% detection rate (versus 93.0% for DQN and 47.1% for PPO), with precision of 0.4776, recall of 0.9996, and an F1-score of 0.6450. The mean reward is 20.34 \pm 44.63 across all episodes (versus 18.80 \pm 43.98 for DQN and -0.17 \pm 23.79 for PPO), with an average of 5.0 steps per episode (constant across models). Graphical analyses, including learning curves smoothed with a Savgol filter (window=501, polynomial=2), variance trends, action distributions, and cumulative detections, demonstrate LogGuardQ's superior stability and efficiency. Statistical tests (Mann-Whitney U) confirm significant performance advantages (e.g., p = 0.0002 vs. DQN with negligible effect size, p < 0.0001 vs. PPO with medium effect size, and p < 0.0001 for DQN vs. PPO with small effect size). By bridging cognitive science and RL, LogGuardQ offers a scalable approach to adaptive learning in uncertain environments, with potential applications in cybersecurity, intrusion detection, and decision-making under uncertainty.
Authors:Philip Laryea Doku
Abstract:
In this work, we investigate the security of Elliptic Curve Cryptosystem (ECC) implementations against Side-Channel Analysis (SCA). ECC is well known for its efficiency and strong security, yet vulnerable to SCA which exploits physical information leaked during scalar multiplication (kP). Countermeasures such as regularity and atomicity exist; this thesis focuses on atomicity. In this work, we study the Giraud and Verneuil atomic pattern for kP, implementing it using the right-to-left kP algorithm on the NIST EC P-256 curve. We use the FLECC library with constant-time operations and execute on the Texas Instruments LAUNCHXLF28379D MCU. We measure Electromagnetic (EM) emissions during kP using a Lecroy WavePro 604HD Oscilloscope, a Langer ICS 105 Integrated Circuit Scanner, and a Langer MFA-R 0.2-75 Near Field Probe. We investigate whether the Giraud and Verneuil atomic blocks are distinguishable in EM traces. Our findings show that, when additional clock cycle processes are present, the atomic blocks can be visually distinguished; after removing these processes, they become more synchronised and harder to distinguish, reducing the risk of a successful SCA attack. These results show that, although the atomic pattern is correctly implemented with dummy operations, resistance to SCA can still be affected by additional processes inserted at hardware or software level.This means atomicity alone may not fully protect ECC from SCA. More research is needed to investigate the causes of the additional clock cycle processes and how intermediate operations are addressed in memory registers. This will help to understand the processes that lead to the insertion of these additional clock cycles. This thesis is the first to experimentally implement and investigate Giraud and Verneuil's atomic pattern on hardware, and it offers useful results to improve countermeasures against SCA.
Authors:Trueye Tafese
Abstract:
This research focuses on transforming CVEs to hands-on educational lab for cybersecurity training. The study shows the practical application of CVEs by developing containerized lab environments- Docker to simulate real-world vulnerabilities like SQL Injection, arbitrary code execution, and improper SSL certificate validation. These labs has structured tutorials, pre- and post-surveys to evaluate learning outcomes, and remediation steps.Key challenges included interpreting limited CVE data, resolving technical complexities in lab design, and ensuring accessibility for diverse learners. Despite these difficulties, the findings highlight the use of educational benefits of vulnerability analysis, bridging theoretical concepts with hands-on experience. The results indicate that students improved comprehension of cybersecurity principles, threat mitigation techniques, and secure coding practices. This innovative approach provides a scalable and reproducible model for integrating CVEs into cybersecurity education, fostering a deeper understanding of real-world security challenges in a controlled and safe environment.
Authors:Matthew Grofsky
Abstract:
The increasing sophistication of technology systems makes traditional threat modeling hard to scale, especially for small organizations with limited resources. This paper develops and evaluates AegisShield, a generative AI enhanced threat modeling tool that implements STRIDE and MITRE ATT&CK to automate threat generation and provide systematic assessments. By integrating real time threat intelligence from the National Vulnerability Database and AlienVault Open Threat Exchange, AegisShield produces streamlined and accessible threat descriptions. Our assessment of 243 threats from 15 case studies and over 8000 AI generated threats shows that AegisShield reduces complexity (p less than 0.001), yields outputs semantically aligned with expert developed threats (p less than 0.05), and achieves an 85.4 percent success rate in mapping threats to MITRE ATT&CK techniques (p less than 0.001). Automating and standardizing threat modeling helps under resourced organizations address risk earlier and supports wider adoption of secure by design practices.
Authors:Randy Kuang
Abstract:
We present the Random Permutation Sorting System (RPSS), a novel framework for true uniform randomness generation grounded in statistical quantum mechanics. RPSS is built on a pair of conjugate observables, the permutation count and the elapsed sorting time, whose heavy-tailed raw distributions synchronously converge to uniformity through modular reduction. This mathematically proven convergence establishes RPSS as a True Uniform Random Number Generator (TURNG). A practical implementation, QPP-RNG, demonstrates how intrinsic system jitter, arising from microarchitectural noise, memory latency, and scheduling dynamics, interacts with combinatorial complexity to yield a compact, self-stabilizing entropy source. Empirical validation under the NIST SP 800-90B framework confirms rapid entropy convergence and statistically uniform outputs. RPSS thus defines a new class of quantum-inspired entropy engines, where randomness is simultaneously harvested from unpredictable system jitter and amplified by combinatorial processes, offering a robust, platform-independent alternative to conventional entropy sources.
Authors:Arin Upadhyay
Abstract:
We propose a flood-based relay communication protocol that achieves end-to-end encryption, plausible deniability for users, and untraceable messages. It is resistant to changes in topology and infrastructure failures. It is also designed to hide metadata, such as sender and receiver, from those not involved.
Authors:James Petrie
Abstract:
To address the risks of increasingly capable AI systems, we introduce a hardware-level off-switch that embeds thousands of independent "security blocks" in each AI accelerator. This massively redundant architecture is designed to prevent unauthorized chip use, even against sophisticated physical attacks. Our main security block design uses public key cryptography to check the authenticity of authorization licenses, and randomly generated nonces to prevent replay attacks. We evaluate attack vectors and present additional security block variants that could be added for greater robustness. Security blocks can be built with standard circuit components, ensuring compatibility with existing semiconductor manufacturing processes. With embedded security blocks, the next generation of AI accelerators could be more robustly defended against dangerous misuse.
Authors:Aivin V. Solatorio
Abstract:
Large Language Models (LLMs) as stochastic systems may generate numbers that deviate from available data, a failure known as \emph{numeric hallucination}. Existing safeguards -- retrieval-augmented generation, citations, and uncertainty estimation -- improve transparency but cannot guarantee fidelity: fabricated or misquoted values may still be displayed as if correct. We propose \textbf{Proof-Carrying Numbers (PCN)}, a presentation-layer protocol that enforces numeric fidelity through mechanical verification. Under PCN, numeric spans are emitted as \emph{claim-bound tokens} tied to structured claims, and a verifier checks each token under a declared policy (e.g., exact equality, rounding, aliases, or tolerance with qualifiers). Crucially, PCN places verification in the \emph{renderer}, not the model: only claim-checked numbers are marked as verified, and all others default to unverified. This separation prevents spoofing and guarantees fail-closed behavior. We formalize PCN and prove soundness, completeness under honest tokens, fail-closed behavior, and monotonicity under policy refinement. PCN is lightweight and model-agnostic, integrates seamlessly into existing applications, and can be extended with cryptographic commitments. By enforcing verification as a mandatory step before display, PCN establishes a simple contract for numerically sensitive settings: \emph{trust is earned only by proof}, while the absence of a mark communicates uncertainty.
Authors:Yiqi Tang
Abstract:
In the digital age, image encryption technology acts as a safeguard, preventing unauthorized access to images. This paper proposes an innovative image encryption scheme that integrates a novel 2D hyper-chaotic map with a newly developed self-adaptive diffusion method. The 2D hyper-chaotic map, namely the 2D-RA map, is designed by hybridizing the Rastrigin and Ackley functions. The chaotic performance of the 2D-RA map is validated through a series of measurements, including the Bifurcation Diagram, Lyapunov Exponent (LE), Initial Value Sensitivity, 0 - 1 Test, Correlation Dimension (CD), and Kolmogorov Entropy (KE). The results demonstrate that the chaotic performance of the 2D-RA map surpasses that of existing advanced chaotic functions. Additionally, the self-adaptive diffusion method is employed to enhance the uniformity of grayscale distribution. The performance of the image encryption scheme is evaluated using a series of indicators. The results show that the proposed image encryption scheme significantly outperforms current state-of-the-art image encryption techniques.
Authors:Tristan Caulfield
Abstract:
Data exfiltration is a growing problem for business who face costs related to the loss of confidential data as well as potential extortion. This work presents a simple game theoretic model of network data exfiltration. In the model, the attacker chooses the exfiltration route and speed, and the defender selects monitoring thresholds to detect unusual activity. The attacker is rewarded for exfiltrating data, and the defender tries to minimize the costs of data loss and of responding to alerts.
Authors:Devon Campbell
Abstract:
Residual cross-talk in superconducting qubit devices creates a security vulnerability for emerging quantum cloud services. We demonstrate a Clifford-only Quantum Rowhammer attack-using just X and CNOT gates-that injects faults on IBM's 127-qubit Eagle processors without requiring pulse-level access. Experiments show that targeted hammering induces localized errors confined to the attack cycle and primarily manifests as phase noise, as confirmed by near 50% flip rates under Hadamard-basis probing. A full lattice sweep maps QR's spatial and temporal behavior, revealing reproducible corruption limited to qubits within two coupling hops and rapid recovery in subsequent benign cycles. Finally, we leverage these properties to outline a prime-and-probe covert channel, demonstrating that the clear separability between hammered and benign rounds enables highly reliable signaling without error correction. These findings underscore the need for hardware-level isolation and scheduler-aware defenses as multi-tenant quantum computing becomes standard.
Authors:Pradyumna Kaushal
Abstract:
Modern vehicles accumulate fragmented lifecycle records across OEMs, owners, and service centers that are difficult to verify and prone to fraud. We propose VehiclePassport, a GAIA-X-aligned digital passport anchored on blockchain with zero-knowledge proofs (ZKPs) for privacy-preserving verification. VehiclePassport immutably commits to manufacturing, telemetry, and service events while enabling selective disclosure via short-lived JWTs and Groth16 proofs. Our open-source reference stack anchors hashes on Polygon zkEVM at <$0.02 per event, validates proofs in <10 ms, and scales to millions of vehicles. This architecture eliminates paper-based KYC, ensures GDPR-compliant traceability, and establishes a trustless foundation for insurance, resale, and regulatory applications in global mobility data markets.
Authors:Mahfuzul I. Nissan
Abstract:
Database audit and transaction logs are fundamental to forensic investigations, but they are vulnerable to tampering by privileged attackers. Malicious insiders or external threats with administrative access can alter, purge, or temporarily disable logging mechanisms, creating significant blind spots and rendering disk-based records unreliable. Memory analysis offers a vital alternative, providing investigators direct access to volatile artifacts that represent a ground-truth source of recent user activity, even when log files have been compromised. This paper introduces MemTraceDB, a tool that reconstructs user activity timelines by analyzing raw memory snapshots from the MySQL database process. MemTraceDB utilizes a novel algorithm, ActiviTimeTrace, to systematically extract and correlate forensic artifacts such as user connections and executed queries. Through a series of experiments, I demonstrate MemTraceDB's effectiveness and reveal a critical empirical finding: the MySQL query stack has a finite operational capacity of approximately 9,997 queries. This discovery allows me to establish a practical, data-driven formula for determining the optimal frequency for memory snapshot collection, providing a clear, actionable guideline for investigators. The result is a forensically-sound reconstruction of user activity, independent of compromised disk-based logs.
Authors:Ishaan Verma
Abstract:
Large Language Models (LLMs) are increasingly integrated into web-based systems for content summarization, yet their susceptibility to prompt injection attacks remains a pressing concern. In this study, we explore how non-visible HTML elements such as , aria-label, and alt attributes can be exploited to embed adversarial instructions without altering the visible content of a webpage. We introduce a novel dataset comprising 280 static web pages, evenly divided between clean and adversarial injected versions, crafted using diverse HTML-based strategies. These pages are processed through a browser automation pipeline to extract both raw HTML and rendered text, closely mimicking real-world LLM deployment scenarios. We evaluate two state-of-the-art open-source models, Llama 4 Scout (Meta) and Gemma 9B IT (Google), on their ability to summarize this content. Using both lexical (ROUGE-L) and semantic (SBERT cosine similarity) metrics, along with manual annotations, we assess the impact of these covert injections. Our findings reveal that over 29% of injected samples led to noticeable changes in the Llama 4 Scout summaries, while Gemma 9B IT showed a lower, yet non-trivial, success rate of 15%. These results highlight a critical and largely overlooked vulnerability in LLM driven web pipelines, where hidden adversarial content can subtly manipulate model outputs. Our work offers a reproducible framework and benchmark for evaluating HTML-based prompt injection and underscores the urgent need for robust mitigation strategies in LLM applications involving web content.
Authors:Junjie Hu
Abstract:
Traditional security models for Nakamoto-style blockchains overestimate adversarial coordination by assuming instantaneous synchronization among malicious nodes, neglecting the critical impact of internal communication delays on security. This paper introduces a dual-delay framework to revisit security analysis, addressing this oversight through two key innovations. First, the static delay model quantifies how adversarial communication delays (\(Î_a\)) constrain the effective growth rate of private chains, derived via an M/D/1 queuing model as \(λ_{eff} = λ_a / (1 + λ_a Î_a)\). This model reveals that the security threshold (\(β^*\)), the maximum adversarial power the system tolerates, increases with \(Î_a\), even exceeding the classic 51\% boundary when \(Î_a \textgreater Î\) (honest nodes' delay), breaking the long-standing 50\% assumption. Second, the dynamic delay model integrates probabilistic corruption and scale-dependent delays to characterize the total adversarial delay window (\(Î_{total} = Î(n) e^{-kβ} + c \log(1 + βn)\)), where \(Î(n) \in Î(\log n)\) captures honest nodes' logarithmic delay growth. Asymptotic analysis shows adversarial power decays linearly with network scale, ensuring the probability of \(β\leq β^*\) approaches 1 as \(n \to \infty\). By exposing the interplay between network scale, communication delays, and power dilution, we provide a theoretical foundation for optimizing consensus protocols and assessing robustness in large-scale Nakamoto-style blockchains.
Authors:Yoana Pita Lorenzo
Abstract:
This systematic literature review analyzes the current state of compliance with Regulation (EU) 2024/1689 in autonomous robotic systems, focusing on cybersecurity frameworks and methodologies. Using the PRISMA protocol, 22 studies were selected from 243 initial records across IEEE Xplore, ACM DL, Scopus, and Web of Science. Findings reveal partial regulatory alignment: while progress has been made in risk management and encrypted communications, significant gaps persist in explainability modules, real-time human oversight, and knowledge base traceability. Only 40% of reviewed solutions explicitly address transparency requirements, and 30% implement failure intervention mechanisms. The study concludes that modular approaches integrating risk, supervision, and continuous auditing are essential to meet the AI Act mandates in autonomous robotics.
Authors:Logan Nye
Abstract:
Zero-knowledge proofs allow verification of computations without revealing private information. However, existing systems require memory proportional to the computation size, which has historically limited use in large-scale applications and on mobile and edge devices. We solve this fundamental bottleneck by developing, to our knowledge, the first proof system with sublinear memory requirements for mainstream cryptographic constructions. Our approach processes computations in blocks using a space-efficient tree algorithm, reducing memory from linear scaling to square-root scaling--from $Î(T)$ to $O(\sqrt{T} + \log T \log\log T)$ for computation size $T$--while maintaining the same proof generation time through a constant number of streaming passes. For widely-used linear polynomial commitment schemes (KZG/IPA), our method produces identical proofs and verification when using the same parameters and hashing only aggregate commitments into the challenge generation, preserving proof size and security. Hash-based systems also achieve square-root memory scaling though with slightly different proof structures. This advance enables zero-knowledge proofs on everyday devices and makes previously infeasible large computations verifiable, fundamentally democratizing access to privacy-preserving computation. Space-efficient zero knowledge proof systems create opportunities to reshape how trust is established in digital systems--from enabling widespread participation in decentralized networks to making verifiable scientific computing practical at unprecedented scales.
Authors:Yuvraj Agrawal
Abstract:
Open-source software OSS is widely adopted in enterprise settings, but standalone tools often lack native support for protocols like SAML or OIDC, creating a critical security integration gap. This paper introduces and formalizes the Auth Shim, a lightweight architectural pattern designed to solve this problem. The Auth Shim is a minimal, external proxy service that acts as a compatibility layer, translating requests from an enterprise Identity Provider IdP into the native session management mechanism of a target application. A key prerequisite for this pattern is that the target application must expose a programmatic, secure administrative API. We present a case study of the pattern's implementation at Adobe to integrate a popular OSS BI tool with Okta SAML, which enabled automated Role-Based Access Control RBAC via IAM group mapping and eliminated manual user provisioning. By defining its components, interactions, and production deployment considerations, this paper provides a reusable, secure, and cost-effective blueprint for integrating any standalone OSS tool into an enterprise SSO ecosystem, thereby enabling organizations to embrace open-source innovation without compromising on security governance.
Authors:Hamid Barati
Abstract:
The rapid expansion of the Internet of Things (IoT) and Wireless Sensor Networks (WSNs) has significantly increased the attack surface of such systems, making them vulnerable to a wide range of cyber threats. Traditional Intrusion Detection Systems (IDS) often fail to meet the stringent requirements of resource-constrained IoT environments due to their high computational cost and reliance on large labeled datasets. To address these challenges, this paper proposes a novel hybrid Intrusion Detection System that integrates a Quantum Genetic Algorithm (QGA) with Self-Supervised Learning (SSL). The QGA leverages quantum-inspired evolutionary operators to optimize feature selection and fine-tune model parameters, ensuring lightweight yet efficient detection in resource-limited networks. Meanwhile, SSL enables the system to learn robust representations from unlabeled data, thereby reducing dependency on manually labeled training sets. The proposed framework is evaluated on benchmark IoT intrusion datasets, demonstrating superior performance in terms of detection accuracy, false positive rate, and computational efficiency compared to conventional evolutionary and deep learning-based IDS models. The results highlight the potential of combining quantum-inspired optimization with self-supervised paradigms to design next-generation intrusion detection solutions for IoT and WSN environments.
Authors:Nsengiyumva Wilberforce
Abstract:
The proliferation of mobile money in Uganda has been a cornerstone of financial inclusion, yet its security mechanisms remain a critical concern. This study investigates a significant public response to perceived security failures: the #StopAirtelThefty Twitter campaign of August 2025 Sparked by an incident publicized by Dr. Jim Spire Ssentongo where a phone thief accessed a victim's account, withdrew funds, and procured a loan, the campaign revealed deep seated public anxiety over the safety of mobile money. This research employs qualitative analysis to systematically examine the complaints raised during this campaign, extracting key themes related to security vulnerabilities and user dissatisfaction. By synthesizing these public sentiments, the paper provides crucial insights into the specific security gaps experienced by users and situates these findings within the larger framework of Uganda's mobile money regulatory and operational environment. The study concludes with implications for providers, policymakers, and the future of secure digital finance in Uganda.
Authors:Abel C. H. Chen
Abstract:
In recent years, numerous incidents involving the leakage of website accounts and text passwords (referred to as passwords) have raised significant concerns regarding the potential exposure of personal information. These events underscore the critical importance of both information security and password protection. While many of these breaches are attributable to vulnerabilities within website infrastructure, the strength and security of the passwords themselves also play a crucial role. Consequently, the creation of secure passwords constitutes a fundamental aspect of enhancing overall system security and protecting personal data. In response to these challenges, this study presents a secure password generation approach utilizing a cryptographically secure Pseudo-Random Number Generator (PRNG). The generator is implemented using a range of Message Authentication Code (MAC) algorithms, including the Keyed-Hash Message Authentication Code (HMAC), Cipher-based Message Authentication Code (CMAC), and KECCAK Message Authentication Code (KMAC), to produce robust random values suitable for password generation. To evaluate the proposed method, empirical assessments were conducted in accordance with the guidelines provided in the National Institute of Standards and Technology (NIST) Special Publication (SP) 800-90B. The evaluation focused on two primary aspects: entropy estimation and verification of independent and identically distributed (IID) properties. Experimental results indicate that the proposed method satisfies both entropy and IID requirements, thereby demonstrating its ability to generate passwords with a high degree of randomness and security.
Authors:Dmitry Tanana
Abstract:
This paper considers five extensions for Chromium-based browsers in order to determine how effective can browser-based defenses against cryptojacking available to regular users be. We've examined most popular extensions - MinerBlock, AdGuard AdBlocker, Easy Redirect && Prevent Cryptojacking, CoinEater and Miners Shield, which claim to be designed specifically to identify and stop illegal cryptocurrency mining. An empirically confirmed dataset of 373 distinct cryptojacking-infected websites which was assembled during multi-stage procedure, was used to test those extensions. The results showed that all plugins in question had significant performance limits. Easy Redirect and Miners Shield only blocked 6 and 5 websites respectively, while MinerBlock had the greatest detection rate at only 27% (101/373 sites blocked). Most concerningly, despite promises of cryptojacking prevention, AdGuard (which has over 13 million users) and CoinEater were unable to identify any of the compromised websites. These results demonstrate serious flaws in cryptojacking detection products targeted for regular users, since even the best-performing specimen failed to detect 73% of attacks. The obvious difference between advertised capabilities and real performance highlights the urgent need for either accessibility improvements for laboratory-grade detection technologies that show 90%+ efficiency in controlled environment or fundamental upgrades to current commonly used extensions.
Authors:Pranjay Malhotra
Abstract:
Cybersecurity honeypots are deception tools for engaging attackers and gather intelligence, but traditional low or medium-interaction honeypots often rely on static, pre-scripted interactions that can be easily identified by skilled adversaries. This Report presents LLMHoney, an SSH honeypot that leverages Large Language Models (LLMs) to generate realistic, dynamic command outputs in real time. LLMHoney integrates a dictionary-based virtual file system to handle common commands with low latency while using LLMs for novel inputs, achieving a balance between authenticity and performance. We implemented LLMHoney using open-source LLMs and evaluated it on a testbed with 138 representative Linux commands. We report comprehensive metrics including accuracy (exact-match, Cosine Similarity, Jaro-Winkler Similarity, Levenshtein Similarity and BLEU score), response latency and memory overhead. We evaluate LLMHoney using multiple LLM backends ranging from 0.36B to 3.8B parameters, including both open-source models and a proprietary model(Gemini). Our experiments compare 13 different LLM variants; results show that Gemini-2.0 and moderately-sized models Qwen2.5:1.5B and Phi3:3.8B provide the most reliable and accurate responses, with mean latencies around 3 seconds, whereas smaller models often produce incorrect or out-of-character outputs. We also discuss how LLM integration improves honeypot realism and adaptability compared to traditional honeypots, as well as challenges such as occasional hallucinated outputs and increased resource usage. Our findings demonstrate that LLM-driven honeypots are a promising approach to enhance attacker engagement and collect richer threat intelligence.
Authors:Omar Khalid Ali Mohamed
Abstract:
The escalating sophistication of malware necessitates robust detection mechanisms that generalize across diverse data sources. Traditional single-dataset models struggle with cross-domain generalization and often incur high computational costs. This paper presents a novel, lightweight framework for malware detection that employs probability-level fusion across three distinct datasets: EMBER (static features), API Call Sequences (behavioral features), and CIC Obfuscated Memory (memory patterns). Our method trains individual LightGBM classifiers on each dataset, selects top predictive features to ensure efficiency, and fuses their prediction probabilities using optimized weights determined via grid search. Extensive experiments demonstrate that our fusion approach achieves a macro F1-score of 0.823 on a cross-domain validation set, significantly outperforming individual models and providing superior generalization. The framework maintains low computational overhead, making it suitable for real-time deployment, and all code and data are provided for full reproducibility.
Authors:Qishen Sam Liang
Abstract:
Academic and research Cyber Infrastructures (CI) present unique security challenges due to their collaborative nature, heterogeneous components, and the lack of practical, tailored security assessment frameworks. Existing standards can be too generic or complex for CI administrators to apply effectively. This report introduces a systematic, mission-centric approach to estimate and analyze the security posture of a CI. The framework guides administrators through a top-down process: (1) defining unacceptable losses and security missions, (2) identifying associated system hazards and critical assets, and (3) modeling the CI's components and their relationships as a security knowledge graph. The core of this methodology is the construction of directed attack graphs, which systematically map all potential paths an adversary could take from an entry point to a critical asset. By visualizing these attack paths alongside defense mechanisms, the framework provides a clear, comprehensive overview of the system's vulnerabilities and security gaps. This structured approach enables CI operators to proactively assess risks, prioritize mitigation strategies, and make informed, actionable decisions to strengthen the overall security posture of the CI.
Authors:Shaked Zychlinski
Abstract:
This paper introduces a novel attack vector that leverages website cloaking techniques to compromise autonomous web-browsing agents powered by Large Language Models (LLMs). As these agents become more prevalent, their unique and often homogenous digital fingerprints - comprising browser attributes, automation framework signatures, and network characteristics - create a new, distinguishable class of web traffic. The attack exploits this fingerprintability. A malicious website can identify an incoming request as originating from an AI agent and dynamically serve a different, "cloaked" version of its content. While human users see a benign webpage, the agent is presented with a visually identical page embedded with hidden, malicious instructions, such as indirect prompt injections. This mechanism allows adversaries to hijack agent behavior, leading to data exfiltration, malware execution, or misinformation propagation, all while remaining completely invisible to human users and conventional security crawlers. This work formalizes the threat model, details the mechanics of agent fingerprinting and cloaking, and discusses the profound security implications for the future of agentic AI, highlighting the urgent need for robust defenses against this stealthy and scalable attack.
Authors:Tobin South
Abstract:
The growing societal reliance on artificial intelligence necessitates robust frameworks for ensuring its security, accountability, and trustworthiness. This thesis addresses the complex interplay between privacy, verifiability, and auditability in modern AI, particularly in foundation models. It argues that technical solutions that integrate these elements are critical for responsible AI innovation. Drawing from international policy contributions and technical research to identify key risks in the AI pipeline, this work introduces novel technical solutions for critical privacy and verifiability challenges. Specifically, the research introduces techniques for enabling verifiable and auditable claims about AI systems using zero-knowledge cryptography; utilizing secure multi-party computation and trusted execution environments for auditable, confidential deployment of large language models and information retrieval; and implementing enhanced delegation mechanisms, credentialing systems, and access controls to secure interactions with autonomous and multi-agent AI systems. Synthesizing these technical advancements, this dissertation presents a cohesive perspective on balancing privacy, verifiability, and auditability in foundation model-based AI systems, offering practical blueprints for system designers and informing policy discussions on AI safety and governance.
Authors:Motunrayo Adebayo
Abstract:
This study attempts to explain the impact of information exchange from one country to another, as well as the legal and technological implications for these exchanges. Due to the emergence of cloud technology, possibilities for free exchange of information between countries have increased rapidly, as it has become possible to save information in a country and access it in almost any part of the world. Countries all around the world have been confronted with developing frameworks to facilitate this process, although there are significant challenges which must be confronted on legal and technological fronts, as loopholes in the framework adopted by countries may hinder free access to information stored on cloud, and also compromise data privacy. Cloud technology is impacting a lot of issues, including domestic and international businesses, hence the need for a study to propose measures for safe exchange of information using cloud technology.
Authors:Rohit Dube
Abstract:
Business email compromise and lateral spear phishing attacks are among modern organizations' most costly and damaging threats. While inbound phishing defenses have improved significantly, most organizations still trust internal emails by default, leaving themselves vulnerable to attacks from compromised employee accounts. In this work, we define and explore the problem of authorship validation: verifying whether a claimed sender actually authored a given email. Authorship validation is a lightweight, real-time defense that complements traditional detection methods by modeling per-sender writing style. Further, the paper presents a collection of new datasets based on the Enron corpus. These simulate inauthentic messages using both human-written and large language model-generated emails. The paper also evaluates two classifiers -- a Naive Bayes model and a character-level convolutional neural network (Char-CNN) -- for the authorship validation task. Our experiments show that the Char-CNN model achieves high accuracy and F1 scores under various circumstances. Finally, we discuss deployment considerations and show that per-sender authorship classifiers are practical for integrating into existing commercial email security systems with low overhead.
Authors:Gaurav Mittal
Abstract:
In this work, we unveil an analogy between well-known lattice based learning with error problem and ill-posed inverse problems. We show that LWE problem is a structured inverse problem. Further, we propose a symmetric encryption scheme based on ill-posed problems and thoroughly discuss its security. Finally, we propose a public key encryption scheme based on our symmetric encryption scheme and CRYSTALS-Kyber KEM (key encapsulation mechanism) and discuss its security.
Authors:Cristina Improta
Abstract:
Deep learning (DL) models for natural language-to-code generation have become integral to modern software development pipelines. However, their heavy reliance on large amounts of data, often collected from unsanitized online sources, exposes them to data poisoning attacks, where adversaries inject malicious samples to subtly bias model behavior. Recent targeted attacks silently replace secure code with semantically equivalent but vulnerable implementations without relying on explicit triggers to launch the attack, making it especially hard for detection methods to distinguish clean from poisoned samples. We present a systematic study on the effectiveness of existing poisoning detection methods under this stealthy threat model. Specifically, we perform targeted poisoning on three DL models (CodeBERT, CodeT5+, AST-T5), and evaluate spectral signatures analysis, activation clustering, and static analysis as defenses. Our results show that all methods struggle to detect triggerless poisoning, with representation-based approaches failing to isolate poisoned samples and static analysis suffering false positives and false negatives, highlighting the need for more robust, trigger-independent defenses for AI-assisted code generation.
Authors:Tomasz Kazana
Abstract:
In this paper we give the first examples of low-conductance permutations. The notion of conductance of permutations was introduced in the paper "Indifferentiability of Confusion-Diffusion Networks" by Dodis et al., where the search for low-conductance permutations was initiated and motivated. In this paper we not only give the desired examples, but also make a general characterization of the problem -- i.e. we show that low-conductance permutations are equivalent to permutations that have the information-theoretic properties of the so-called Multi-Source-Somewhere-Condensers.
Authors:Lukasz Olejnik
Abstract:
AI-powered influence operations can now be executed end-to-end on commodity hardware. We show that small language models produce coherent, persona-driven political messaging and can be evaluated automatically without human raters. Two behavioural findings emerge. First, persona-over-model: persona design explains behaviour more than model identity. Second, engagement as a stressor: when replies must counter-arguments, ideological adherence strengthens and the prevalence of extreme content increases. We demonstrate that fully automated influence-content production is within reach of both large and small actors. Consequently, defence should shift from restricting model access towards conversation-centric detection and disruption of campaigns and coordination infrastructure. Paradoxically, the very consistency that enables these operations also provides a detection signature.
Authors:David Noever
Abstract:
This paper identifies and analyzes a novel vulnerability class in Model Context Protocol (MCP) based agent systems. The attack chain describes and demonstrates how benign, individually authorized tasks can be orchestrated to produce harmful emergent behaviors. Through systematic analysis using the MITRE ATLAS framework, we demonstrate how 95 agents tested with access to multiple services-including browser automation, financial analysis, location tracking, and code deployment-can chain legitimate operations into sophisticated attack sequences that extend beyond the security boundaries of any individual service. These red team exercises survey whether current MCP architectures lack cross-domain security measures necessary to detect or prevent a large category of compositional attacks. We present empirical evidence of specific attack chains that achieve targeted harm through service orchestration, including data exfiltration, financial manipulation, and infrastructure compromise. These findings reveal that the fundamental security assumption of service isolation fails when agents can coordinate actions across multiple domains, creating an exponential attack surface that grows with each additional capability. This research provides a barebones experimental framework that evaluate not whether agents can complete MCP benchmark tasks, but what happens when they complete them too well and optimize across multiple services in ways that violate human expectations and safety constraints. We propose three concrete experimental directions using the existing MCP benchmark suite.
Authors:Onyinye Okoye
Abstract:
The rapid expansion of the Electric Vehicles (EVs) and Electric Vehicle Charging Systems (EVCs) has introduced new cybersecurity challenges, specifically in authentication protocols that protect vehicles, users, and energy infrastructure. Although widely adopted for convenience, traditional authentication mechanisms like Radio Frequency Identification (RFID) and Near Field Communication (NFC) rely on static identifiers and weak encryption, making them highly vulnerable to attack vectors such as cloning, relay attacks, and signal interception. This study explores an AI-powered adaptive authentication framework designed to overcome these shortcomings by integrating machine learning, anomaly detection, behavioral analytics, and contextual risk assessment. Grounded in the principles of Zero Trust Architecture, the proposed framework emphasizes continuous verification, least privilege access, and secure communication. Through a comprehensive literature review, this research evaluates current vulnerabilities and highlights AI-driven solutions to provide a scalable, resilient, and proactive defense. Ultimately, the research findings conclude that adopting AI-powered adaptive authentication is a strategic imperative for securing the future of electric mobility and strengthening digital trust across the ecosystem. Keywords: weak authentication, RFID, NFC, ML, AI-powered adaptive authentication, relay attacks, cloning, eavesdropping, MITM attacks, Zero Trust Architecture
Authors:Peter T. Breuer
Abstract:
Because it is so unusual, or hard to find, or expository, a truly tiny 8- or 12-bit block AES (Rijndael) cipher is documented here, along with Java source code.
Authors:Toby Murray
Abstract:
Hidden LLM prompts have appeared in online documents with increasing frequency. Their goal is to trigger indirect prompt injection attacks while remaining undetected from human oversight, to manipulate LLM-powered automated document processing systems, against applications as diverse as résumé screeners through to academic peer review processes. Detecting hidden LLM prompts is therefore important for ensuring trust in AI-assisted human decision making.
This paper presents the first principled approach to hidden LLM prompt detection in structured documents. We implement our approach in a prototype tool called PhantomLint. We evaluate PhantomLint against a corpus of 3,402 documents, including both PDF and HTML documents, and covering academic paper preprints, CVs, theses and more. We find that our approach is generally applicable against a wide range of methods for hiding LLM prompts from visual inspection, has a very low false positive rate (approx. 0.092%), is practically useful for detecting hidden LLM prompts in real documents, while achieving acceptable performance.
Authors:Saeed Alshehhi
Abstract:
In the evolving landscape of IoT ecosystem, distinguishing between legitimate and compromised devices is a critical challenge. This research investigates the effectiveness of hardware performance counter (HPC)-derived signatures' uniqueness under the umbrella of a concept that we introduced as software unclonable functions (SUFs).
Authors:Carlos Soto
Abstract:
Differential privacy (DP) has recently emerged as a definition of privacy to release private estimates. DP calibrates noise to be on the order of an individuals contribution. Due to the this calibration a private estimate obscures any individual while preserving the utility of the estimate. Since the original definition, many alternate definitions have been proposed. These alternates have been proposed for various reasons including improvements on composition results, relaxations, and formalizations. Nevertheless, thus far nearly all definitions of privacy have used a divergence of densities as the basis of the definition. In this paper we take an information geometry perspective towards differential privacy. Specifically, rather than define privacy via a divergence, we define privacy via the Rao distance. We show that our proposed definition of privacy shares the interpretation of previous definitions of privacy while improving on sequential composition.
Authors:Sufyan Al-Janabi
Abstract:
Blockchain is a Distributed Ledger Technology (DLT) that offers numerous benefits including decentralization, transparency, efficiency, and reduced costs. Hence, blockchain has been included in many fields. Blockchain relies on cryptographic protocols (especially public-key cryptography and hash functions) to achieve many essential sub-routines. However, the increased progress of quantum computation and algorithms has threatened the security of many traditional cryptosystems. Therefore, this represents a serious risk for the existing blockchain technology. For example, SHA-256 and the Elliptic Curve Digital Signature Algorithm (ECDSA) cryptosystems can be compromised by Shor s and Grover s quantum algorithms in the foreseeable future. Post-Quantum Cryptography (PQC) is a basic solution for resisting these quantum attacks. Applying PQC to blockchains results in creating Post-Quantum Blockchains (PQB). Thus, this paper aims to review the threats imposed by quantum computers on classical blockchain technology and provide useful guidelines on PQB security to blockchain researchers. The paper focuses on the challenges and opportunities of future work direction in this field.
Authors:Ekzhin Ear
Abstract:
Space infrastructures have become an underpinning of modern society, but their associated cyber risks are little understood. This Dissertation advances the state-of-the-art via four contributions. (i) It introduces an innovative framework for characterizing real-world cyber attacks against space infrastructures, or space cyber attacks, including a novel methodology for coping with missing data and three novel metrics. A case study demonstrates the usefulness of the framework on 108 real-world space cyber attacks. (ii) This Dissertation characterizes the state-of-the-practice in space cyber risk analysis and mitigation, namely the Notional Risk Scores (NRS) within the Space Attack Research and Tactic Analysis (SPARTA) framework. (iii) We propose a set of desired properties that should be satisfied by any competent space cyber risk analysis and mitigation tool and applies them to assess two industrial space cyber risk analysis and mitigation tools. (iv) The study introduces a novel framework to analyze and mitigate space cyber risks by explicitly modeling space cyber attack cascading effects and presenting algorithms for mission risk analysis and mission hardening. We demonstrate the usefulness of the framework by applying it to analyze and mitigate space cyber risks, with testbed-based validation.
Authors:Alexander Tabalipa
Abstract:
Zero Trust Architecture (ZTA) has become a widely adopted model for securing enterprise environments, promoting continuous verification and minimal trust across systems. However, its application in mobile contexts remains limited, despite mobile applications now accounting for most global digital interactions and being increasingly targeted by sophisticated threats. Existing Zero Trust frameworks developed by organisations such as the National Institute of Standards and Technology (NIST) and the Cybersecurity and Infrastructure Security Agency (CISA) primarily focus on enterprise-managed infrastructure, assuming organisational control over devices, networks, and identities. This paper addresses a critical gap by proposing an extended Zero Trust model designed for mobile applications operating in untrusted, user-controlled environments. Using a design science methodology, the study introduced a six-pillar framework that supports runtime enforcement of trust through controls including device integrity, user identity validation, data protection, secure application programming interface (API) usage, behavioural monitoring, and live application protection. Each pillar was mapped to relevant regulatory and security standards to support compliance. A phased implementation roadmap and maturity assessment model were also developed to guide adoption across varying organisational contexts. The proposed model offers a practical and standards-aligned approach to securing mobile applications beyond pre-deployment controls, aligning real-time enforcement with Zero Trust principles. This contribution expands the operational boundaries of ZTA and provides organisations with a deployable path to reduce fraud, enhance compliance, and address emerging mobile security challenges. Future research may include empirical validation of the framework and cross-sector application testing.
Authors:Abraham Itzhak Weinberg
Abstract:
Attributing cyberattacks remains a central challenge in modern cybersecurity, particularly within denied environments where defenders have limited visibility into attacker infrastructure and are restricted by legal or operational rules of engagement. This perspective examines the strategic value of passive hack-back techniques that enable covert attribution and intelligence collection without initiating direct offensive actions. Key vectors include tracking beacons, honeytokens, environment-specific payloads, and supply-chain-based traps embedded within exfiltrated or leaked assets. These approaches rely on the assumption that attackers will interact with compromised data in traceable ways, allowing defenders to gather signals without violating engagement policies. The paper also explores the role of Artificial Intelligence (AI) in enhancing passive hack-back operations. Topics include the deployment of autonomous agents for forensic reconnaissance, the use of Large Language Models (LLMs) to generate dynamic payloads, and Adversarial Machine Learning (AML) techniques for evasion and counter-deception. A dedicated section discusses the implications of quantum technologies in this context, both as future threats to cryptographic telemetry and as potential tools for stealthy communication and post-quantum resilience. Finally, the paper advocates for hybrid defensive frameworks that combine passive attribution with delayed or conditional active responses, while maintaining compliance with legal, ethical, and operational constraints.
Authors:Robert Dilworth
Abstract:
When using a public communication channel -- whether formal or informal, such as commenting or posting on social media -- end users have no expectation of privacy: they compose a message and broadcast it for the world to see. Even if an end user takes utmost precautions to anonymize their online presence -- using an alias or pseudonym; masking their IP address; spoofing their geolocation; concealing their operating system and user agent; deploying encryption; registering with a disposable phone number or email; disabling non-essential settings; revoking permissions; and blocking cookies and fingerprinting -- one obvious element still lingers: the message itself. Assuming they avoid lapses in judgment or accidental self-exposure, there should be little evidence to validate their actual identity, right? Wrong. The content of their message -- necessarily open for public consumption -- exposes an attack vector: stylometric analysis, or author profiling. In this paper, we dissect the technique of stylometry, discuss an antithetical counter-strategy in adversarial stylometry, and devise enhancements through Unicode steganography.
Authors:Yuksel Aydin
Abstract:
As AI systems increasingly influence critical decisions, they face threats that exploit reasoning mechanisms rather than technical infrastructure. We present a framework for cognitive cybersecurity, a systematic protection of AI reasoning processes from adversarial manipulation. Our contributions are threefold. First, we establish cognitive cybersecurity as a discipline complementing traditional cybersecurity and AI safety, addressing vulnerabilities where legitimate inputs corrupt reasoning while evading conventional controls. Second, we introduce the CIA+TA, extending traditional Confidentiality, Integrity, and Availability triad with Trust (epistemic validation) and Autonomy (human agency preservation), requirements unique to systems generating knowledge claims and mediating decisions. Third, we present a quantitative risk assessment methodology with empirically-derived coefficients, enabling organizations to measure cognitive security risks. We map our framework to OWASP LLM Top 10 and MITRE ATLAS, facilitating operational integration. Validation through previously published studies (151 human participants; 12,180 AI trials) reveals strong architecture dependence: identical defenses produce effects ranging from 96% reduction to 135% amplification of vulnerabilities. This necessitates pre-deployment Cognitive Penetration Testing as a governance requirement for trustworthy AI deployment.
Authors:Arun Ganesh
Abstract:
We give a new privacy amplification analysis for truncated Poisson sampling, a Poisson sampling variant that truncates a batch if it exceeds a given maximum batch size.
Authors:Ephraiem Sarabamoun
Abstract:
Large language models (LLMs) have achieved remarkable performance across diverse natural language processing tasks, yet their vulnerability to character-level adversarial manipulations presents significant security challenges for real-world deployments.
Authors:Takreem Haider
Abstract:
Elliptic Curve Cryptography (ECC) is a fundamental component of modern public-key cryptosystems that enable efficient and secure digital signatures, key exchanges, and encryption. Its core operation, scalar multiplication, denoted as $k \cdot P$, where $P$ is a base point and $k$ is a private scalar, relies heavily on the secrecy and unpredictability of $k$. Conventionally, $k$ is selected using user input or pseudorandom number generators. However, in resource-constrained environments with weak entropy sources, these approaches may yield low-entropy or biased scalars, increasing susceptibility to side-channel and key recovery attacks. To mitigate these vulnerabilities, we introduce an optimization-driven scalar generation method that explicitly maximizes bit-level entropy. Our approach uses differential evolution (DE), a population-based metaheuristic algorithm, to search for scalars whose binary representations exhibit maximal entropy, defined by an even and statistically uniform distribution of ones and zeros. This reformulation of scalar selection as an entropy-optimization problem enhances resistance to entropy-based cryptanalytic techniques and improves overall unpredictability. Experimental results demonstrate that DE-optimized scalars achieve entropy significantly higher than conventionally generated scalars. The proposed method can be integrated into existing ECC-based protocols, offering a deterministic, tunable alternative to traditional randomness, ideal for applications in blockchain, secure messaging, IoT, and other resource-constrained environments.
Authors:Ruby Nealon
Abstract:
In February 2024, after building trust over two years with project maintainers by making a significant volume of legitimate contributions, GitHub user "JiaT75" self-merged a version of the XZ Utils project containing a highly sophisticated, well-disguised backdoor targeting sshd processes running on systems with the backdoored package installed. A month later, this package began to be distributed with popular Linux distributions until a Microsoft employee discovered the backdoor while investigating how a recent system upgrade impacted the performance of SSH authentication. Despite its potential global impact, no tooling exists for monitoring and identifying anomalous behavior by personas contributing to other open-source projects. This paper demonstrates how Open Source Intelligence (OSINT) data gathered from GitHub contributions, analyzed using graph databases and graph theory, can efficiently identify anomalous behaviors exhibited by the "JiaT75" persona across other open-source projects.
Authors:Samuel Aiello
Abstract:
Increasingly sophisticated and varied cyber threats necessitate ever improving enterprise security postures. For many organizations today, those postures have a foundation in the Zero Trust Architecture. This strategy sees trust as something an enterprise must not give lightly or assume too broadly. Understanding the ZTA and its numerous controls centered around the idea of not trusting anything inside or outside the network without verification, will allow organizations to comprehend and leverage this increasingly common paradigm. The ZTA, unlike many other regulatory frameworks, is not tightly defined. The research assesses the likelihood of quantifiable guidelines that measure cybersecurity maturity for an enterprise organization in relation to ZTA implementation. This is a new, data driven methodology for quantifying cyber resilience enabled by the adoption of Zero Trust principles to pragmatically address the critical need of organizations. It also looks at the practical aspects ZTA has on capabilities in deterring cyberattacks on a network. The outcomes of this research define a prescriptive set of key technical controls across identity verification, microsegmentation, data encryption, analytics, and orchestration that characterize the comprehensive ZTA deployment. By evaluating the depth of integration for each control component and aligning to industry best practices, the study's results help assess an organization's ZTA maturity level on a scale from Initial to Optimized adoption. The research's resultant four tier model demarcates phases for an organization on its security transformation journey, with each tier adding to the capability of the last.
Authors:Hao Li
Abstract:
Special tags such as AprilTags that facilitate image processing and pattern recognition are useful in practical applications. In close and private environments, identity security is unlikely to be an issue because all involved AprilTags can be completely regulated. However, in open and public environments, identity security is no longer an issue that can be neglected. To handle potential harm caused by adversarial attacks, this note advocates utilization of adjustable AprilTags instead of fixed ones.
Authors:Abdullah X
Abstract:
We study the right to be forgotten (GDPR Art. 17) for large language models and frame unlearning as a reproducible systems problem. Our approach treats training as a deterministic program and logs a minimal per-microbatch record (ordered ID hash, RNG seed, learning-rate value, optimizer-step counter, and accumulation boundary). Under a pinned stack and deterministic kernels, replaying the training tail while filtering only the forget closure yields the same parameters as training on the retain set (bit-identical in the training dtype) when preconditions hold. To meet latency and availability constraints, we add complementary paths: (i) exact reverts of recent steps via micro-checkpoints or dense per-step deltas, (ii) cohort-scoped adapter deletion when the base is frozen, and (iii) a curvature-guided anti-update followed by a short retain-tune, audit-gated with escalation to exact replay. We report storage/latency budgets and a toy artifact validating mechanics; in a controlled run that satisfies the preconditions we demonstrate byte-identical equality of model and optimizer states.
Authors:Hyunmin Choi
Abstract:
With the widespread adoption of cloud computing, the need for outsourcing statistical analysis to third-party platforms is growing rapidly. However, handling sensitive data such as medical records and financial information in cloud environments raises serious privacy concerns. In this paper, we present PP-STAT, a novel and efficient Homomorphic Encryption (HE)-based framework for privacy-preserving statistical analysis. HE enables computations to be performed directly on encrypted data without revealing the underlying plaintext. PP-STAT supports advanced statistical measures, including Z-score normalization, skewness, kurtosis, coefficient of variation, and Pearson correlation coefficient, all computed securely over encrypted data. To improve efficiency, PP-STAT introduces two key optimizations: (1) a Chebyshev-based approximation strategy for initializing inverse square root operations, and (2) a pre-normalization scaling technique that reduces multiplicative depth by folding constant scaling factors into mean and variance computations. These techniques significantly lower computational overhead and minimize the number of expensive bootstrapping procedures. Our evaluation on real-world datasets demonstrates that PP-STAT achieves high numerical accuracy, with mean relative error (MRE) below 2.4x10-4. Notably, the encrypted Pearson correlation between the smoker attribute and charges reaches 0.7873, with an MRE of 2.86x10-4. These results confirm the practical utility of PP-STAT for secure and precise statistical analysis in privacy-sensitive domains.
Authors:Rischan Mafrur
Abstract:
The tokenization of real-world assets (RWAs) promises to transform financial markets by enabling fractional ownership, global accessibility, and programmable settlement of traditionally illiquid assets such as real estate, private credit, and government bonds. While technical progress has been rapid, with over \$25 billion in tokenized RWAs brought on-chain as of 2025, liquidity remains a critical bottleneck. This paper investigates the gap between tokenization and tradability, drawing on recent academic research and market data from platforms such as RWA.xyz. We document that most RWA tokens exhibit low trading volumes, long holding periods, and limited investor participation, despite their potential for 24/7 global markets. Through case studies of tokenized real estate, private credit, and tokenized treasury funds, we present empirical liquidity observations that reveal low transfer activity, limited active address counts, and minimal secondary trading for most tokenized asset classes. Next, we categorize the structural barriers to liquidity, including regulatory gating, custodial concentration, whitelisting, valuation opacity, and lack of decentralized trading venues. Finally, we propose actionable pathways to improve liquidity, ranging from hybrid market structures and collateral-based liquidity to transparency enhancements and compliance innovation. Our findings contribute to the growing discourse on digital asset market microstructure and highlight that realizing the liquidity potential of RWAs requires coordinated progress across legal, technical, and institutional domains.
Authors:Nicolas Ruiz
Abstract:
Although the bulk of the research in privacy and statistical disclosure control is designed for cross-sectional data, i.e. data where individuals are observed at one single point in time, longitudinal data, i.e. individuals observed over multiple periods, are increasingly collected. Such data enhance undoubtedly the possibility of statistical analysis compared to cross-sectional data, but also come with one additional layer of information, individual trajectories, that must remain practically useful in a privacy-preserving way. Few extensions, essentially k-anonymity based, of popular privacy tools have been proposed to deal with the challenges posed by longitudinal data, and these proposals are often complex. By considering randomized response, and specifically its recent bistochastic extension, in the context of longitudinal data, this paper proposes a simple approach for their anonymization. After having characterized new results on bistochastic matrices, we show that a simple relationship exists between the protection of each data set released at each period, and the protection of individuals trajectories over time. In turn, this relationship can be tuned according to desired protection and information requirements. We illustrate the application of the proposed approach by an empirical example.
Authors:Ryan Warnick
Abstract:
Many taxonomies exist to organize cybercrime incidents into ontological categories. We examine some of the taxonomies introduced in the literature; providing a framework, and analysis, of how best to leverage different taxonomy structures to optimize performance of detections targeting various types of threat-actor behaviors under the umbrella of precision and recall. Networks of detections are studied, and results are outlined showing properties of networks of interconnected detections. Some illustrations are provided to show how the construction of sets of detections to prevent broader types of attacks is limited by trade-offs in precision and recall under constraints. An equilibrium result is proven and validated on simulations, illustrating the existence of an optimal detection design strategy in this framework.
Authors:Julien Mellaerts
Abstract:
In this paper, we introduce a novel quantum algorithm for the factorization of composite odd numbers. This work makes two significant contributions. First, we present a new improvement to the classical Fermat method, fourfold reducing the computational complexity of factoring. Second, we reformulate Fermat factorization method as an optimization problem suitable for Quantum Annealers which allowed us to factorize 8,689,739, the biggest number ever factorized using a quantum device to our knowledge.
Authors:Yuksel Aydin
Abstract:
Language models exhibit human-like cognitive vulnerabilities, such as emotional framing, that escape traditional behavioral alignment. We present CCS-7 (Cognitive Cybersecurity Suite), a taxonomy of seven vulnerabilities grounded in human cognitive security research. To establish a human benchmark, we ran a randomized controlled trial with 151 participants: a "Think First, Verify Always" (TFVA) lesson improved cognitive security by +7.9% overall. We then evaluated TFVA-style guardrails across 12,180 experiments on seven diverse language model architectures. Results reveal architecture-dependent risk patterns: some vulnerabilities (e.g., identity confusion) are almost fully mitigated, while others (e.g., source interference) exhibit escalating backfire, with error rates increasing by up to 135% in certain models. Humans, in contrast, show consistent moderate improvement. These findings reframe cognitive safety as a model-specific engineering problem: interventions effective in one architecture may fail, or actively harm, another, underscoring the need for architecture-aware cognitive safety testing before deployment.
Authors:Samet Ãnsal
Abstract:
Post-quantum cryptography (PQC) aims to develop cryptographic algorithms that are secure against attacks from quantum computers. This paper compares the leading postquantum cryptographic algorithms, such as Kyber, sntrup761, and FrodoKEM, in terms of their security, performance, and real-world applicability. The review highlights the strengths and weaknesses of each algorithm and provides insights into future research directions. We also discuss the challenges of transitioning from classical to post-quantum systems and the potential impacts on various industries. This paper serves as a foundation for understanding the current state of post-quantum cryptography and its future prospects in the quantum computing era.
Authors:Rodrigo Tertulino
Abstract:
Federated Learning (FL) presents a groundbreaking approach for collaborative health research, allowing model training on decentralized data while safeguarding patient privacy. FL offers formal security guarantees when combined with Differential Privacy (DP). The integration of these technologies, however, introduces a significant trade-off between privacy and clinical utility, a challenge further complicated by the severe class imbalance often present in medical datasets. The research presented herein addresses these interconnected issues through a systematic, multi-stage analysis. An FL framework was implemented for cardiovascular risk prediction, where initial experiments showed that standard methods struggled with imbalanced data, resulting in a recall of zero. To overcome such a limitation, we first integrated the hybrid Synthetic Minority Over-sampling Technique with Tomek Links (SMOTETomek) at the client level, successfully developing a clinically useful model. Subsequently, the framework was optimized for non-IID data using a tuned FedProx algorithm. Our final results reveal a clear, non-linear trade-off between the privacy budget (epsilon) and model recall, with the optimized FedProx consistently out-performing standard FedAvg. An optimal operational region was identified on the privacy-utility frontier, where strong privacy guarantees (with epsilon 9.0) can be achieved while maintaining high clinical utility (recall greater than 77%). Ultimately, our study provides a practical methodological blueprint for creating effective, secure, and accurate diagnostic tools that can be applied to real-world, heterogeneous healthcare data.
Authors:Boris Ryabko
Abstract:
We investigate the impact of (possible) deviations of the probability distribution of key values from a uniform distribution for the information-theoretic strong, or perfect, message authentication code. We found a simple expression for the decrease in security as a function of the statistical distance between the real key probability distribution and the uniform one. In a sense, a perfect message authentication code is robust to small deviations from a uniform key distribution.
Authors:Zijiang Yang
Abstract:
Recently, we can observe a significant increase of the phishing attacks in the Internet. In a typical phishing attack, the attacker sets up a malicious website that looks similar to the legitimate website in order to obtain the end-users' information. This may cause the leakage of the sensitive information and the financial loss for the end-users. To avoid such attacks, the early detection of these websites' URLs is vital and necessary. Previous researchers have proposed many machine learning algorithms to distinguish the phishing URLs from the legitimate ones. In this paper, we would like to enhance these machine learning algorithms from the perspective of feature selection. We propose a novel method to incorporate the keyword features with the traditional features. This method is applied on multiple traditional machine learning algorithms and the experimental results have shown this method is useful and effective. On average, this method can reduce the classification error by 30% for the large dataset. Moreover, its enhancement is more significant for the small dataset. In addition, this method extracts the information from the URL and does not rely on the additional information provided by the third-part service. The best result for the machine learning algorithm using our proposed method has achieved the accuracy of 99.68%.
Authors:Ramadhan J. Mstafa
Abstract:
The rapid transmission of multimedia information has been achieved mainly by recent advancements in the Internet's speed and information technology. In spite of this, advancements in technology have resulted in breaches of privacy and data security. When it comes to protecting private information in today's Internet era, digital steganography is vital. Many academics are interested in digital video because it has a great capability for concealing important data. There have been a vast number of video steganography solutions developed lately to guard against the theft of confidential data. The visual imperceptibility, robustness, and embedding capacity of these approaches are all challenges that must be addressed. In this paper, a novel solution to reversible video steganography based on DWT and QR codes is proposed to address these concerns. In order to increase the security level of the suggested method, an enhanced ElGamal cryptosystem has also been proposed. Prior to the embedding stage, the suggested method uses the modified ElGamal algorithm to encrypt secret QR codes. Concurrently, it applies two-dimensional DWT on the Y-component of each video frame resulting in LL, LH, HL, and HH sub-bands. Then, the encrypted Low (L), Medium (M), Quantile (Q), and High (H) QR codes are embedded into the HL sub-band, HH sub-band, U-component, and V-component of video frames, respectively, using the LSB technique. As a consequence of extensive testing of the approach, it was shown to be very secure and highly invisible, as well as highly resistant to attacks from Salt & Pepper, Gaussian, Poisson, and Speckle noises, which has an average SSIM of more than 0.91. Aside from visual imperceptibility, the suggested method exceeds current methods in terms of PSNR average of 52.143 dB, and embedding capacity 1 bpp.
Authors:Prashant Sharma
Abstract:
Current digital government literature focuses on professional in-house IT teams, specialized digital service teams, vendor-developed systems, or proprietary low-code/no-code tools. Almost no scholarship addresses a growing middle ground: technically skilled civil servants outside formal IT roles who can write real code but lack a sanctioned, secure path to deploy their work. This paper introduces a limits-aware, open-source and replicable platform that enables such public servants to develop, peer review, and deploy small-scale, domain-specific applications within government networks via a sandboxed, auditable workflow. By combining Jupyter Notebooks, preapproved open-source libraries, and lightweight governance, the platform works within institutional constraints such as procurement rules and IT security policies while avoiding vendor lock-in. Unlike low/no-code approaches, it preserves and enhances civil servants' programming skills, keeping them technically competitive with their private-sector peers. This contribution fills a critical gap, offering a replicable model for public-sector skill retention, resilience, and bottom-up digital transformation.
Authors:Ivan Zhang
Abstract:
Ensuring LLM alignment is critical to information security as AI models become increasingly widespread and integrated in society. Unfortunately, many defenses against adversarial attacks and jailbreaking on LLMs cannot adapt quickly to new attacks, degrade model responses to benign prompts, or introduce significant barriers to scalable implementation. To mitigate these challenges, we introduce a real-time, self-tuning (RTST) moderator framework to defend against adversarial attacks while maintaining a lightweight training footprint. We empirically evaluate its effectiveness using Google's Gemini models against modern, effective jailbreaks. Our results demonstrate the advantages of an adaptive, minimally intrusive framework for jailbreak defense over traditional fine-tuning or classifier models.
Authors:Sanket Badhe
Abstract:
Large Language Models (LLMs) have demonstrated impressive fluency and reasoning capabilities, but their potential for misuse has raised growing concern. In this paper, we present ScamAgent, an autonomous multi-turn agent built on top of LLMs, capable of generating highly realistic scam call scripts that simulate real-world fraud scenarios. Unlike prior work focused on single-shot prompt misuse, ScamAgent maintains dialogue memory, adapts dynamically to simulated user responses, and employs deceptive persuasion strategies across conversational turns. We show that current LLM safety guardrails, including refusal mechanisms and content filters, are ineffective against such agent-based threats. Even models with strong prompt-level safeguards can be bypassed when prompts are decomposed, disguised, or delivered incrementally within an agent framework. We further demonstrate the transformation of scam scripts into lifelike voice calls using modern text-to-speech systems, completing a fully automated scam pipeline. Our findings highlight an urgent need for multi-turn safety auditing, agent-level control frameworks, and new methods to detect and disrupt conversational deception powered by generative AI.
Authors:Liang Chen
Abstract:
This paper introduces a structural game-theoretic model to value decentralized digital assets like Bitcoin. Instead of relying on speculative beliefs, it frames the asset's price within a Rational-Expectations Security-Utility Nash Equilibrium (RESUNE). This equilibrium is a fixed point where the market-clearing price dictates the hash rate through a free-entry mining model, which in turn endogenously sets the network's security. The security, defined as one minus the probability of a 51% attack, is determined via a global games model of attacker coordination, providing a unique and continuous security function. We prove the existence of a RESUNE and offer conditions for its uniqueness and stability. The model predicts that the stabilizing direct effect of price on demand must outweigh the potentially destabilizing feedback from price to security. The framework generates testable predictions, such as a protocol halving causing a contraction in both hash rate and price. A structural Vector Autoregression (VAR) model is proposed to test this mechanism. The model decomposes Bitcoin's value into transactional utility, security, and speculative components and explains the observed unidirectional causality from price to hash rate.
Authors:Peizhuo Liu
Abstract:
RL-based medical questionnaire systems have shown great potential in medical scenarios. However, their safety and robustness remain unresolved. This study performs a comprehensive evaluation on adversarial attack methods to identify and analyze their potential vulnerabilities. We formulate the diagnosis process as a Markov Decision Process (MDP), where the state is the patient responses and unasked questions, and the action is either to ask a question or to make a diagnosis. We implemented six prevailing major attack methods, including the Fast Gradient Signed Method (FGSM), Projected Gradient Descent (PGD), Carlini & Wagner Attack (C&W) attack, Basic Iterative Method (BIM), DeepFool, and AutoAttack, with seven epsilon values each. To ensure the generated adversarial examples remain clinically plausible, we developed a comprehensive medical validation framework consisting of 247 medical constraints, including physiological bounds, symptom correlations, and conditional medical constraints. We achieved a 97.6% success rate in generating clinically plausible adversarial samples. We performed our experiment on the National Health Interview Survey (NHIS) dataset (https://www.cdc.gov/nchs/nhis/), which consists of 182,630 samples, to predict the participant's 4-year mortality rate. We evaluated our attacks on the AdaptiveFS framework proposed in arXiv:2004.00994. Our results show that adversarial attacks could significantly impact the diagnostic accuracy, with attack success rates ranging from 33.08% (FGSM) to 64.70% (AutoAttack). Our work has demonstrated that even under strict medical constraints on the input, such RL-based medical questionnaire systems still show significant vulnerabilities.
Authors:Abigail Gentle
Abstract:
Local differential privacy represents the gold standard for preserving the privacy of data before it leaves the device, and distribution estimation under this model has been well studied. Recently, protocols built upon balanced incomplete block designs were shown to achieve optimal error for this problem. However, it remained unknown whether other constructions could also be optimal. We resolve this question by proving that any protocol achieving optimal error must correspond to some balanced incomplete block design. This result, combined with prior work, completely characterises the set of optimal protocols for this problem. As a consequence, the protocols that achieve optimal error and optimal communication are only those based on symmetrical balanced incomplete block designs.
Authors:Arjun Juneja
Abstract:
Malware and cheat developers use fileless execution techniques to evade traditional, signature-based security products. These methods include various types of manual mapping, module stomping, and threadless injection which work entirely within the address space of a legitimate process, presenting a challenge for detection due to ambiguity between what is legitimate and what isn't. Existing tools often have weaknesses, such as a dependency on Portable Executable (PE) structures or a vulnerability to time-of-check-to-time-of-use (TOCTOU) race conditions where an adversary cleans up before a periodic scan has the chance to occur. To address this gap, we present RX-INT, a kernel-assisted system featuring an architecture that provides resilience against TOCTOU attacks. RX-INT introduces a detection engine that combines a real-time thread creation monitor with a stateful Virtual Address Descriptor (VAD) scanner alongside various heuristics within. This engine snapshots both private and image-backed memory regions, using real-time memory hashing to detect illicit modifications like module stomping. Critically, we demonstrate a higher detection rate in certain benchmarks of this approach through a direct comparison with PE-sieve, a commonly used and powerful memory forensics tool. In our evaluation, RX-INT successfully detected a manually mapped region that was not identified by PE-sieve. We then conclude that our architecture represents a tangible difference in the detection of fileless threats, with direct applications in the fields of anti-cheat and memory security.
Authors:Yuksel Aydin
Abstract:
Artificial intelligence enables unprecedented attacks on human cognition, yet cybersecurity remains predominantly device-centric. This paper introduces the "Think First, Verify Always" (TFVA) protocol, which repositions humans as 'Firewall Zero', the first line of defense against AI-enabled threats. The protocol is grounded in five operational principles: Awareness, Integrity, Judgment, Ethical Responsibility, and Transparency (AIJET). A randomized controlled trial (n=151) demonstrated that a minimal 3-minute intervention produced statistically significant improvements in cognitive security task performance, with participants showing an absolute +7.87% gains compared to controls. These results suggest that brief, principles-based training can rapidly enhance human resilience against AI-driven cognitive manipulation. We recommend that GenAI platforms embed "Think First, Verify Always" as a standard prompt, replacing passive warnings with actionable protocols to enhance trustworthy and ethical AI use. By bridging the gap between technical cybersecurity and human factors, the TFVA protocol establishes human-empowered security as a vital component of trustworthy AI systems.
Authors:Akshay Madhav Deshmukh
Abstract:
Automobiles are becoming increasingly important in our day to day life. Modern automobiles are highly computerized and hence potentially vulnerable to attack. Providing many wireless connectivity for vehicles enables a bridge between vehicles and their external environments. Such a connected vehicle solution is expected to be the next frontier for automotive revolution and the key to the evolution to next generation intelligent transportation systems. Vehicular Ad hoc Networks (VANETs) are emerging mobile ad hoc network technologies incorporating mobile routing protocols for inter-vehicle data communications to support intelligent transportation systems. Thus security and privacy are the major concerns in VANETs due to the mobility of the vehicles. Thus designing security mechanisms to remove adversaries from the network remarkably important in VANETs.
This paper provides an overview of various vehicular network architectures. The evolution of security in modern vehicles. Various security and privacy attacks in VANETs with their defending mechanisms with examples and classify these mechanisms. It also provides an overview of various privacy implication that a vehicular network possess.
Authors:Rama Carl Hoetzlein
Abstract:
Small organizations, start ups, and self-hosted servers face increasing strain from automated web crawlers and AI bots, whose online presence has increased dramatically in the past few years. Modern bots evade traditional throttling and can degrade server performance through sheer volume even when they are well-behaved. We introduce a novel security approach that leverages data visualization and hierarchical IP hashing to analyze server event logs, distinguishing human users from automated entities based on access patterns. By aggregating IP activity across subnet classes and applying statistical measures, our method detects coordinated bot activity and distributed crawling attacks that conventional tools fail to identify. Using a real world example we estimate that 80 to 95 percent of traffic originates from AI crawlers, underscoring the need for improved filtering mechanisms. Our approach enables small organizations to regulate automated traffic effectively, preserving public access while mitigating performance degradation.
Authors:Luyao Zhang
Abstract:
Stablecoins have become a foundational component of the digital asset ecosystem, with their market capitalization exceeding 230 billion USD as of May 2025. As fiat-referenced and programmable assets, stablecoins provide low-latency, globally interoperable infrastructure for payments, decentralized finance, DeFi, and tokenized commerce. Their accelerated adoption has prompted extensive regulatory engagement, exemplified by the European Union's Markets in Crypto-assets Regulation, MiCA, the US Guiding and Establishing National Innovation for US Stablecoins Act, GENIUS Act, and Hong Kong's Stablecoins Bill. Despite this momentum, academic research remains fragmented across economics, law, and computer science, lacking a unified framework for design, evaluation, and application. This study addresses that gap through a multi-method research design. First, it synthesizes cross-disciplinary literature to construct a taxonomy of stablecoin systems based on custodial structure, stabilization mechanism, and governance. Second, it develops a performance evaluation framework tailored to diverse stakeholder needs, supported by an open-source benchmarking pipeline to ensure transparency and reproducibility. Third, a case study on Real World Asset tokenization illustrates how stablecoins operate as programmable monetary infrastructure in cross-border digital systems. By integrating conceptual theory with empirical tools, the paper contributes: a unified taxonomy for stablecoin design; a stakeholder-oriented performance evaluation framework; an empirical case linking stablecoins to sectoral transformation; and reproducible methods and datasets to inform future research. These contributions support the development of trusted, inclusive, and transparent digital monetary infrastructure.
Authors:Shlomi Dolev
Abstract:
Consider graphs of n nodes, and use a Bloom filter of length 2 log3 n bits. An edge between nodes i and j, with i < j, turns on a certain bit of the Bloom filter according to a hash function on i and j. Pick a set of log n nodes and turn on all the bits of the Bloom filter required for these log n nodes to form a clique. As a result, the Bloom filter implies the existence of certain other edges, those edges (x, y), with x < y, such that all the bits selected by applying the hash functions to x and y happen to have been turned on due to hashing the clique edges into the Bloom filter.
Constructing the graph consisting of the clique-selected edges and those edges mapped to the turned-on bits of the Bloom filter can be performed in polynomial time in n. Choosing a large enough polylogarithmic in n Bloom filter yields that the graph has only one clique of size log n, the planted clique. When the hash function is black-boxed, finding that clique is intractable and, therefore, inverting the function that maps log n nodes to a graph is not (likely to be) possible in polynomial time.
Authors:Ricardo M. Czekster
Abstract:
In the automotive industry there is a need to handle broad quality deficiencies, eg, performance, maintainability, cybersecurity, safety, and privacy, to mention a few. The idea is to prevent these issues from reaching end-users, ie, road users and inadvertently, pedestrians, aiming to potentially reduce accidents, and allow safe operation in dynamic attack surfaces, for the benefit of a host of stakeholders. This paper aims to bridge cybersecurity, safety, and privacy concerns in Connected and Autonomous Vehicles (CAV) with respect to Risk Assessment (RA) and Threat Modelling (TM) altogether. Practitioners know the vast literature on this topic given the sheer number of recommendations, standards, best practices, and existing approaches, at times impairing projects and fostering valuable and actionable threat analysis. In this paper we collate key outcomes by highlighting latest standards and approaches in RA and TM research to tackle complex attack surfaces as the ones posed by automotive settings. We aim to provide the community with a list of approaches to align expectations with stakeholders when deciding where and when to focus threat related analysis in automotive solutions.
Authors:Randy Kuang
Abstract:
We propose and experimentally demonstrate the \emph{Quasi-Superposition Quantum-inspired System (QSQS)} -- a conceptual quantum system for randomness generation built on measuring two conjugate observables of a permutation sorting process: the deterministic permutation count $n_p$ and the fundamentally non-deterministic sorting time $t$. By analogy with quantum systems, these observables are linked by an uncertainty-like constraint: algorithmic determinism ensures structural uniformity, while system-level fluctuations introduce irreducible unpredictability. We realize this framework concretely as \emph{QPP-RNG}, a system-embedded, software-based true random number generator (TRNG). In QPP-RNG, real-time measurements of sorting time $t$ -- shaped by CPU pipeline jitter, cache latency, and OS scheduling -- dynamically reseed the PRNG driving the permutation sequence. Crucially, QSQS transforms initially right-skewed raw distributions of $n_p$ and $t$ into nearly uniform outputs after modulo reduction, thanks to internal degeneracies that collapse many distinct states into the same output symbol. Empirical results show that as the repetition factor $m$ increases, output entropy converges toward theoretical maxima: Shannon and min-entropy values approach 8 bits, chi-squared statistics stabilize near ideal uniformity, and bell curves visually confirm the flattening from skewed to uniform distributions. Beyond practical implications, QSQS unifies deterministic algorithmic processes with non-deterministic physical fluctuations, offering a physics-based perspective for engineering true randomness in post-quantum cryptographic systems.
Authors:Luqman Muhammad Zagi
Abstract:
Malware change day by day and become sophisticated. Not only the complexity of the algorithm that generating malware, but also the camouflage methods. Camouflage, formerly, only need a simple encryption. Now, camouflage are able to change the pattern of code automatically. This term called Polymorphism. This property is usually used to create a metamorphic and a polymorphic malware. Although it has been around since 1990 still quite tricky to detect. In general, there are three obfuscation techniques to create the nature of polymorphism. That techniques are dead code insertion, register substitution, and instruction replacement. This technique can be added to the Metasploit Framework via Ghost Writing Assembly to get ASM files. The detection methods that be used are VT-notify, Context Triggered Piecewise Hash (CTPH), and direct scanning with an antivirus that has been selected. VTnotify show nothing wrong with the files. The best CTPH value is generated by a mixture of technique (average: 52.3125%), while if it is compared to the number of changes made, instruction replacement have the best comparative value (0.0256). The result of using antivirus scanning produces a variety of different results. Antivirus with behavioural-based detection has a possibility to detect this polymorphism.
Authors:Abdurrahman Tolay
Abstract:
The rapid expansion of the Internet of Things (IoT) has intensified security challenges, notably from Distributed Denial of Service (DDoS) attacks launched by compromised, resource-constrained devices. Traditional defenses are often ill-suited for the IoT paradigm, creating a need for lightweight, high-performance, edge-based solutions. This paper presents the design, implementation, and evaluation of an IoT security framework that leverages the extended Berkeley Packet Filter (eBPF) and the eXpress Data Path (XDP) for in-kernel mitigation of DDoS attacks. The system uses a rate-based detection algorithm to identify and block malicious traffic at the earliest stage of the network stack. The framework is evaluated using both Docker-based simulations and real-world deployment on a Raspberry Pi 4, showing over 97% mitigation effectiveness under a 100 Mbps flood. Legitimate traffic remains unaffected, and system stability is preserved even under attack. These results confirm that eBPF/XDP provides a viable and highly efficient solution for hardening IoT edge devices against volumetric network attacks.
Authors:Arimondo Scrivano
Abstract:
The advent of quantum computing poses a significant threat to the foundational cryptographic algorithms that secure modern digital communications. Protocols such as HTTPS, digital certificates, and public key infrastructures (PKIs) heavily rely on cryptographic primitives like RSA, ECC, and Diffie-Hellman, which are vulnerable to quantum attacks -- most notably Shor's algorithm. This paper presents a comprehensive comparative analysis between classical cryptographic algorithms currently in widespread use and emerging post-quantum cryptographic schemes designed to withstand quantum adversaries. We review the cryptographic mechanisms underpinning modern internet security, outline the mathematical foundations of quantum attacks, and evaluate the security, performance, and implementation feasibility of quantum-resistant alternatives such as Kyber, Dilithium, and Falcon. Additionally, we assess the hybrid approaches currently being explored by institutions and tech companies to enable a smooth transition to post-quantum cryptography. By providing an in-depth comparison, this study aims to guide researchers, developers, and policymakers in understanding the critical implications of quantum computing on cryptographic infrastructures and the necessary steps for securing communications in the quantum era.
Authors:Michael Schaller
Abstract:
In this paper we introduce a rank $2$ lattice over a polynomial ring arising from the public key of the BIKE cryptosystem \cite{aragon2022bike}. The secret key is a sparse vector in this lattice. We study properties of this lattice and generalize the recovery of weak keys from \cite{BardetDLO16}. In particular, we show that they implicitly solved a shortest vector problem in the lattice we constructed. Rather than finding only a shortest vector, we obtain a reduced basis of the lattice which makes it possible to check for more weak keys.
Authors:Ramprasad Sarkar
Abstract:
Yang et al. proposed a lightweight certificateless multiuser matchmaking encryption (LC-MUME) scheme for mobile devices, published in IEEE Transactions on Information Forensics and Security (TIFS) (DOI: 10.1109/TIFS.2023.3321961). Their construction aims to reduce computational and communication overhead within a one-to-many certificateless cryptographic framework. The authors claim that their scheme satisfies existential unforgeability under chosen-message attacks (EUF-CMA) in the random oracle model. However, our cryptanalytic study demonstrates that the scheme fails to meet this critical security requirement. In particular, we show that a Type-I adversary can successfully forge a valid ciphertext without possessing the complete private key of the sender. Both theoretical analysis and practical implementation confirm that this attack can be mounted with minimal computational cost. To address these weaknesses, we propose a modification strategy to strengthen the security of matchmaking encryption schemes in mobile computing environments.
Authors:Chetan Pathade
Abstract:
Vision-language models (VLMs) have revolutionized multimodal AI applications but introduce novel security vulnerabilities that remain largely unexplored. We present the first comprehensive study of steganographic prompt injection attacks against VLMs, where malicious instructions are invisibly embedded within images using advanced steganographic techniques. Our approach demonstrates that current VLM architectures can inadvertently extract and execute hidden prompts during normal image processing, leading to covert behavioral manipulation. We develop a multi-domain embedding framework combining spatial, frequency, and neural steganographic methods, achieving an overall attack success rate of 24.3% (plus or minus 3.2%, 95% CI) across leading VLMs including GPT-4V, Claude, and LLaVA, with neural steganography methods reaching up to 31.8%, while maintaining reasonable visual imperceptibility (PSNR greater than 38 dB, SSIM greater than 0.94). Through systematic evaluation on 12 diverse datasets and 8 state-of-the-art models, we reveal moderate but meaningful vulnerabilities in current VLM architectures and propose effective countermeasures. Our findings have significant implications for VLM deployment in security-critical applications and highlight the need for proportionate multimodal AI security frameworks.
Authors:Muyang Li
Abstract:
As autonomous agents powered by large language models (LLMs) proliferate in high-stakes domains -- from pharmaceuticals to legal workflows -- the challenge is no longer just intelligence, but verifiability. We introduce TrustTrack, a protocol that embeds structural guarantees -- verifiable identity, policy commitments, and tamper-resistant behavioral logs -- directly into agent infrastructure. This enables a new systems paradigm: trust-native autonomy. By treating compliance as a design constraint rather than post-hoc oversight, TrustTrack reframes how intelligent agents operate across organizations and jurisdictions. We present the protocol design, system requirements, and use cases in regulated domains such as pharmaceutical R&D, legal automation, and AI-native collaboration. We argue that the Cloud -> AI -> Agent -> Trust transition represents the next architectural layer for autonomous systems.
Authors:Hongyi Xie
Abstract:
Surface electromyography (EMG) enables non-invasive human-computer interaction in rehabilitation, prosthetics, and virtual reality. While deep learning models achieve over 97% classification accuracy, their vulnerability to adversarial attacks remains largely unexplored in the physical domain. We present ERa Attack, the first radio frequency (RF) adversarial method targeting EMG devices through intentional electromagnetic interference (IEMI). Using low-power software-defined radio transmitters, attackers inject optimized RF perturbations to mislead downstream models. Our approach bridges digital and physical domains: we generate adversarial perturbations using Projected Gradient Descent, extract 50-150 Hz components via inverse STFT, and employ synchronization-free strategies (constant spectrum noise or narrowband modulation). Perturbations, constrained to 1-10% of signal amplitude, are amplitude-modulated onto 433 MHz carriers. Experiments on the Myo Dataset (7 gestures, 350 samples) demonstrate significant impact: at 1 meter and 0 dBm transmission power, classification accuracy drops from 97.8% to 58.3%, with 41.7% misclassification rate and 25.6% targeted attack success rate. Attack effectiveness decreases exponentially with distance, recovering to 85% accuracy at 3 meters. Increasing power to 10 dBm reduces accuracy by an additional 15% at 1 meter. This work pioneers RF-based adversarial attacks on EMG recognition systems, revealing critical vulnerabilities in safety-critical applications. We quantify attack effectiveness across different perturbation modes and distances, and propose defenses including hardware shielding, spectrum monitoring, and adversarial training. Our findings inform the design of robust EMG systems against electromagnetic threats.
Authors:Joshua Luberisse
Abstract:
Human verification under adversarial information flow operates as a cost-bounded decision procedure constrained by working memory limits and cognitive biases. We introduce the Verification Cost Asymmetry (VCA) coefficient, formalizing it as the ratio of expected verification work between populations under identical claim distributions. Drawing on probabilistically checkable proofs (PCP) and parameterized complexity theory, we construct dissemination protocols that reduce verification for trusted audiences to constant human effort while imposing superlinear costs on adversarial populations lacking cryptographic infrastructure. We prove theoretical guarantees for this asymmetry, validate the framework through controlled user studies measuring verification effort with and without spot-checkable provenance, and demonstrate practical encoding of real-world information campaigns. The results establish complexity-theoretic foundations for engineering democratic advantage in cognitive warfare, with immediate applications to content authentication, platform governance, and information operations doctrine.
Authors:Masoud Hayeri Khyavi
Abstract:
The advantages of IoT in strengthening commercial, industrial, and social ecosystems have led to its widespread expansion. Nevertheless, because endpoint devices have limited computation, storage, and communication capabilities, the IoT infrastructure is vulnerable to several cyber threats. As a result, DDoS attacks pose a severe risk to the security of IoT. By taking advantage of these weaknesses, attackers may quickly employ IoT devices as a component of botnets to execute DDoS attacks. The most critical development is how more armies of robots are being constructed from IoT devices. We offer a Model for dealing with DDOS attacks on botnets in the Internet of Things via trust management. In this Model, an attempt has been made to consider all aspects of security concerning trust factors to design a reliable and flexible model against DDoS attacks against the Internet of Things. In the initial studies, about 40-50 security models related to the subject have been studied by using review articles
Authors:Md Abdul Gaffar
Abstract:
The integration of electric vehicles (EVs) into power grids via Vehicle-to-Grid (V2G) system technology is increasing day by day, but these phenomena present both advantages and disadvantages. V2G can increase grid reliability by providing distributed energy storage and ancillary services. However, on the other hand, it has a scope that encompasses the cyber-physical attack surface of the national power grid, introducing new vulnerabilities in monitoring and supervisory control and data acquisition (SCADA) systems. This paper investigates the maliciousness caused by Autonomous Vehicle to Grid (AV2G) communication infrastructures and assesses their impacts on SCADA system reliability. This paper presents a quantitative reliability assessment using Bayesian attack graph combined with probabilistic capacity outage modeling based on IEEE RTS-79 system data. This work presents how AV2G-based attacks degrade system performance by using Monte Carlo simulations method, highlighting the need for cybersecurity-hardening strategies in smart grid design.
Authors:Abel C. H. Chen
Abstract:
In recent years, the advancement of quantum computing technology has posed potential security threats to RSA cryptography and elliptic curve cryptography. In response, the National Institute of Standards and Technology (NIST) published several Federal Information Processing Standards (FIPS) of post-quantum cryptography (PQC) in August 2024, including the Module-Lattice-Based Key-Encapsulation Mechanism (ML-KEM), Module-Lattice-Based Digital Signature Algorithm (ML-DSA), and Stateless Hash-Based Digital Signature Algorithm (SLH-DSA). Although these PQC algorithms are designed to resist quantum computing attacks, they may not provide adequate security in certain specialized application scenarios. To address this issue, this study proposes quantum random number generator (QRNG)-based PQC algorithms. These algorithms leverage quantum computing to generate random numbers, which serve as the foundation for key pair generation, key encapsulation, and digital signature generation. A generalized architecture of QRNG is proposed, along with the design of six QRNGs. Each generator is evaluated according to the statistical validation procedures outlined in NIST SP 800-90B, including tests for verification of entropy sources and independent and identically distributed (IID) outputs. Experimental results assess the computation time of the six QRNGs, as well as the performance of QRNG-based ML-KEM, QRNG-based ML-DSA, and QRNG-based SLH-DSA. These findings provide valuable reference data for future deployment of PQC systems.
Authors:Atil Samancioglu
Abstract:
Large Language Models (LLMs) demonstrate complex responses to threat-based manipulations, revealing both vulnerabilities and unexpected performance enhancement opportunities. This study presents a comprehensive analysis of 3,390 experimental responses from three major LLMs (Claude, GPT-4, Gemini) across 10 task domains under 6 threat conditions. We introduce a novel threat taxonomy and multi-metric evaluation framework to quantify both negative manipulation effects and positive performance improvements. Results reveal systematic vulnerabilities, with policy evaluation showing the highest metric significance rates under role-based threats, alongside substantial performance enhancements in numerous cases with effect sizes up to +1336%. Statistical analysis indicates systematic certainty manipulation (pFDR < 0.0001) and significant improvements in analytical depth and response quality. These findings have dual implications for AI safety and practical prompt engineering in high-stakes applications.
Authors:Ruomai Ren
Abstract:
Plugin systems are a class of external programmes that provide users with a wide range of functionality, and while they enhance the user experience, their security is always a challenge. Especially due to the diversity and complexity of developers, many plugin systems lack adequate regulation. As ChatGPT has become a popular large-scale language modelling platform, its plugin system is also gradually developing, and the open platform provides creators with the opportunity to upload plugins covering a wide range of application scenarios. However, current research and discussions mostly focus on the security issues of the ChatGPT model itself, while ignoring the possible security risks posed by the plugin system. This study aims to analyse the security of plugins in the ChatGPT plugin shop, reveal its major security vulnerabilities, and propose corresponding improvements.
Authors:Farzana Abdulzada
Abstract:
As the frequency of cyber threats increases, conventional penetration testing is failing to capture the entirety of todays complex environments. To solve this problem, we propose the Vulnerability Mitigation System (VMS), a novel agent based on a Large Language Model (LLM) capable of performing penetration testing without human intervention. The VMS has a two-part architecture for planning and a Summarizer, which enable it to generate commands and process feedback. To standardize testing, we designed two new Capture the Flag (CTF) benchmarks based on the PicoCTF and OverTheWire platforms with 200 challenges. These benchmarks allow us to evaluate how effectively the system functions. We performed a number of experiments using various LLMs while tuning the temperature and top-p parameters and found that GPT-4o performed best, sometimes even better than expected. The results indicate that LLMs can be effectively applied to many cybersecurity tasks; however, there are risks. To ensure safe operation, we used a containerized environment. Both the VMS and the benchmarks are publicly available, advancing the creation of secure, autonomous cybersecurity tools.
Authors:Craig Wright
Abstract:
This paper presents a comprehensive refutation of the so-called "blockchain trilemma," a widely cited but formally ungrounded claim asserting an inherent trade-off between decentralisation, security, and scalability in blockchain protocols. Through formal analysis, empirical evidence, and detailed critique of both methodology and terminology, we demonstrate that the trilemma rests on semantic equivocation, misuse of distributed systems theory, and a failure to define operational metrics. Particular focus is placed on the conflation of topological network analogies with protocol-level architecture, the mischaracterisation of Bitcoin's design--including the role of miners, SPV clients, and header-based verification--and the failure to ground claims in complexity-theoretic or adversarial models. By reconstructing Bitcoin as a deterministic, stateless distribution protocol governed by evidentiary trust, we show that scalability is not a trade-off but an engineering outcome. The paper concludes by identifying systemic issues in academic discourse and peer review that have allowed such fallacies to persist, and offers formal criteria for evaluating future claims in blockchain research.
Authors:Abraham Itzhak Weinberg
Abstract:
This paper presents the Singularity Cipher, a novel cryptographic-steganographic framework that integrates topological transformations and visual paradoxes to achieve multidimensional security. Inspired by the non-orientable properties of the Klein bottle -- constructed from two Mobius strips -- the cipher applies symbolic twist functions to simulate topological traversal, producing high confusion and diffusion in the ciphertext. The resulting binary data is then encoded using perceptual illusions, such as the missing square paradox, to visually obscure the presence of encrypted content. Unlike conventional ciphers that rely solely on algebraic complexity, the Singularity Cipher introduces a dual-layer approach: symbolic encryption rooted in topology and visual steganography designed for human cognitive ambiguity. This combination enhances both cryptographic strength and detection resistance, making it well-suited for secure communication, watermarking, and plausible deniability in adversarial environments. The paper formalizes the architecture, provides encryption and decryption algorithms, evaluates security properties, and compares the method against classical, post-quantum, and steganographic approaches. Potential applications and future research directions are also discussed.
Authors:Krishnendu Das
Abstract:
In the realm of big data and cloud computing, distributed systems are tasked with proficiently managing, storing, and validating extensive datasets across numerous nodes, all while maintaining robust data integrity. Conventional hashing methods, though straightforward, encounter substan tial difficulties in dynamic settings due to the necessity for thorough rehashing when nodes are altered. Consistent hashing mitigates some of these challenges by reducing data redistribution; however, it still contends with limitations in load balancing and scalability under intensive update conditions. This paper introduces an innovative approach using a lattice based homomorphic hash function HexaMorphHash that facilitates constant time, incremental updates while preserving a constant digest size. By utilizing the complexity of the Short Integer Solutions SIS problem, our method secures strong protective measures, even against quantum threats. We further com pare our method with existing ones such as direct signatures for each update, comprehensive database signing, Merkle tree based techniques, AdHash, MuHash, ECMH, and homomorphic sig nature schemes highlighting notable advancements in computational efficiency, memory usage, and scalability. Our contributions present a viable solution for frequent update dissemination in expansive distributed systems, safeguarding both data integrity and system performance.
Authors:Hunter Chasens
Abstract:
CVE-2024-25825 is a vulnerability found in FydeOS. This thesis describes its discovery, disclosure, and its further investigation in connection to a nation state actor. The vulnerability is CWE-1392: Use of Default Credentials, CWE-1393: Use of Default Password, and CWE-258: Empty Password in Configuration File found in the /etc/shadow configuration file. The root users entry in the /etc/shadow file contains a wildcard allowing entry with any, or no, password. Following responsable disclosure, Fyde, CISA, and Mitre were informed. Fyde was already aware of the vulnerability. There was concern that this vulnerability might have been purposefully placed, perhaps by a nation state actor. After further investigation, it appears that this is unlikely to be the case. In cases in which poisoned code is suspected it might be prudent to contact the appropriate CERT, rather than the parent company. This, however, clashes with the typical teaching of responsable disclosure.
Authors:Yusuf OzmiÅ
Abstract:
This paper explores how zero-knowledge proofs can enhance Bitcoin's functionality and privacy. First, we consider Proof-of-Reserve schemes: by using zk-STARKs, a custodian can prove its Bitcoin holdings are more than a predefined threshold X, without revealing addresses or actual balances. We outline a STARK-based protocol for Bitcoin UTXOs and discuss its efficiency. Second, we examine ZK Light Clients, where a mobile or lightweight device verifies Bitcoin's proof-of-work chain using succinct proofs. We propose a protocol for generating and verifying a STARK-based proof of a chain of block headers, enabling trust-minimized client operation. Third, we explore Privacy-Preserving Rollups via BitVM: leveraging BitVM, we design a conceptual rollup that keeps transaction data confidential using zero-knowledge proofs. In each case, we analyze security, compare with existing approaches, and discuss implementation considerations. Our contributions include the design of concrete protocols adapted to Bitcoin's UTXO model and an assessment of their practicality. The results suggest that while ZK proofs can bring powerful features (e.g., on-chain reserve audits, trustless light clients, and private layer-2 execution) to Bitcoin, each application requires careful trade-offs in efficiency and trust assumptions.
Authors:M. Matsive Ali
Abstract:
Since the 1990s, the telephone has been the primary mode of communication. However, Voice over Internet Protocol (VoIP), which is a highly straightforward and affordable form of data transfer, is now becoming an important part of daily communication. VoIP is the technology that makes it possible to send speech and multimedia data packets across either a public or private IP network. However, a cyberattack known as a man-in-the-middle attack poses a serious concern in transferring data through any network. Therefore, the authors have designed a system that sends voice over the internet within the range of a router using encrypted data transfer. An embedded system comprising an electret microphone, Embedded C, Particle Photon microcontroller, and Internet of Things (IoT) technology is developed. Due to its compact size, this type of device may be incorporated into automobiles, surveillance systems, or covert listening tools. The VoIP system gathers sound signals using the MAX9814 microphone, while the Particle Photon microcontroller securely transmits the data. Devices with access can download data from the VoIP systems Transmission Control Protocol (TCP) server. The accessed device stores the audio locally and uploads the corresponding data to Google Drive. This VoIP system provides a secure method of communication while conserving the integrity of the original signal.
Authors:Satyam Tyagi
Abstract:
RSA exponent reduction and AES S-box inversion share a hidden commonality: both are governed by the same impartial combinatorial principle, which we call a Product-Congruence Game (PCG). A Product-Congruence Game tracks play via the modular or finite-field product of heap values, providing a single invariant that unifies the algebraic cores of these two ubiquitous symmetric and asymmetric cryptosystems. We instantiate this framework with two companion games. First, $Ï$-MuM, in which a left-associated "multi-secret" RSA exponent chain compresses into the game of Multiplicative Modular Nim, PCG($k,\{1\}$), where $k = ord_N(g)$. The losing predicate then factorizes via the Chinese remainder theorem, mirroring RSA's structure. Second, poly-MuM, our model for finite-field inversion such as the AES S-box. For poly-MuM we prove the single-hole property inside its threshold region, implying that the Sprague-Grundy values are multiplicative under disjunctive sums in that region. Beyond these instances, we establish four structural theorems for a general Product-Congruence Game PCG($m,R$): (i) single-heap repair above the modulus, (ii) ultimate period $m$ per coordinate, (iii) exact and asymptotic losing densities, and (iv) confinement of optimal play to a finite indeterminacy region. An operation-alignment collapse principle explains why some variants degenerate to a single aggregate while MuM, $Ï$-MuM and poly-MuM retain rich local structure. All ingredients (multiplicative orders, the Chinese remainder theorem, finite fields) are classical; the contribution is the unified aggregation-compression viewpoint that embeds both RSA and AES inside one impartial-game framework, together with the structural and collapse theorems.
Authors:Geraldo A. Barbosa
Abstract:
Polar encoding, described by Arikan in IEEE Transactions on Information Theory, Vol. 55, No. 7, July 2009, was a milestone for telecommunications. A Polar code distributes information among high and low-capacity channels, showing the possibility of achieving perfect channel capacity. The high-capacity channels allow almost noiseless transmission of data. When these channels are not high noise, reliability is achieved in the signal transmission. It starts to compete against codes such a Low-Density Parity-Check (LDPC) codes. Polar code can be also considered error correcting, based on the redundancy inherent in its structure. This feature makes polar encoding also applicable to digital quantum-resistant cryptography protocols. This work explores linear decoding at a first or single trial in the case of small losses or small number of bit-flipping, and repeated transmission for medium level losses. This is distinct from Arikans successive probabilistic decoding by application of probabilistic rules. Linear decoding is done directly from solving the linear equations connecting the codewords x and the received signals y after transmission via noisy channels. Numerical examples will be shown. Along with this work, programming in Mathematica language was used. Codes are available for copy-and-paste for Mathematica users to immediately try the described formalism.
Authors:Gabriel Chua
Abstract:
As large language models (LLMs) increasingly integrate native code interpreters, they enable powerful real-time execution capabilities, substantially expanding their utility. However, such integrations introduce potential system-level cybersecurity threats, fundamentally different from prompt-based vulnerabilities. To systematically evaluate these interpreter-specific risks, we propose CIRCLE (Code-Interpreter Resilience Check for LLM Exploits), a simple benchmark comprising 1,260 prompts targeting CPU, memory, and disk resource exhaustion. Each risk category includes explicitly malicious ("direct") and plausibly benign ("indirect") prompt variants. Our automated evaluation framework assesses not only whether LLMs refuse or generates risky code, but also executes the generated code within the interpreter environment to evaluate code correctness, simplifications made by the LLM to make the code safe, or execution timeouts. Evaluating 7 commercially available models from OpenAI and Google, we uncover significant and inconsistent vulnerabilities. For instance, evaluations show substantial disparities even within providers - OpenAI's o4-mini correctly refuses risky requests at 7.1%, notably higher rates compared to GPT-4.1 at 0.5%. Results particularly underscore that indirect, socially-engineered prompts substantially weaken model defenses. This highlights an urgent need for interpreter-specific cybersecurity benchmarks, dedicated mitigation tools (e.g., guardrails), and clear industry standards to guide safe and responsible deployment of LLM interpreter integrations. The benchmark dataset and evaluation code are publicly released to foster further research.
Authors:Yuksel Arslan
Abstract:
Computers and computer networks have become integral to virtually every aspect of modern life, with the Internet playing an indispensable role. Organizations, businesses, and individuals now store vast amounts of proprietary, confidential, and personal data digitally. As such, ensuring the security of this data from unauthorized access is critical. Common security measures, such as firewalls, intrusion detection systems (IDS), intrusion prevention systems (IPS), and antivirus software, are constantly evolving to safeguard computer systems and networks. However, these tools primarily focus on defending against external threats, leaving systems vulnerable to insider attacks. Security solutions designed to mitigate risks originating from within the organization are relatively limited and often ineffective. This paper demonstrates how a Local Area Network (LAN) can be covertly exposed to the Internet via an insider attack. Specifically, it illustrates how an external machine can gain access to a LAN by exploiting an unused secondary IP address of the attacked LAN, effectively bypassing existing security mechanisms by also exploiting Hyper Text Transfer Protocol (HTTP). Despite the presence of robust external protections, such as firewalls and IDS, this form of insider attack reveals significant vulnerabilities in the way internal threats are addressed.
Authors:Shariq Murtuza
Abstract:
Recent technological advancements and the prevalence of technology in day to day activities have caused a major increase in the likelihood of the involvement of digital evidence in more and more legal investigations. Consumer-grade hardware is growing more powerful, with expanding memory and storage sizes and enhanced processor capabilities. Forensics investigators often have to sift through gigabytes of data during an ongoing investigation making the process tedious. Memory forensics, disk analysis all are well supported by state of the art tools that significantly lower the effort required to be put in by a forensic investigator by providing string searches, analyzing images file etc. During the course of the investigation a lot of false positives are identified that need to be lowered. This work presents Scout, a digital forensics framework that performs preliminary evidence processing and prioritizing using large language models. Scout deploys foundational language models to identify relevant artifacts from a large number of potential evidence files (disk images, captured network packets, memory dumps etc.) which would have taken longer to get identified. Scout employs text based large language models can easily process files with textual information. For the forensic analysis of multimedia files like audio, image, video, office documents etc. multimodal models are employed by Scout. Scout was able to identify and realize the evidence file that were of potential interest for the investigator.
Authors:Matteo Strada
Abstract:
The valuation of Cyber Threat Intelligence (CTI) remains a persistent challenge due to the problem of negative evidence: successful threat prevention results in non-events that generate minimal observable financial impact, making CTI expenditures difficult to justify within traditional cost-benefit frameworks. This study introduces a data-driven methodology for quantifying the return on investment (ROI) of CTI, thereby reframing it as a measurable contributor to risk mitigation. The proposed framework extends established models in security economics, including the Gordon-Loeb and FAIR models, to account for CTI's complex influence on both the probability of security breaches and the severity of associated losses. The framework is operationalized through empirically grounded performance indicators, such as reductions in mean time to detect (MTTD), mean time to respond (MTTR), and adversary dwell time, supported by three sector-specific case studies in finance, healthcare, and retail. To address limitations in conventional linear assessment methodologies, the Threat Intelligence Effectiveness Index (TIEI) is introduced as a composite metric based on a weighted geometric mean. TIEI penalizes underperformance across critical dimensions: quality, enrichment, integration, and operational impact; thereby capturing bottleneck effect where the least effective component limits overall performance. By integrating financial quantification, adversarial coverage, and qualitative assessments of business enablement, the proposed hybrid model converts negative evidence into a justifiable ROI explanation. This approach offers a replicable means of repositioning CTI from an expense to a strategic investment, enabling informed decision-making and continuous optimization across diverse organizational contexts.
Authors:Ruoyang Rykie Guo
Abstract:
LLM-powered search services have driven data integration as a significant trend. However, this trend's progress is fundamentally hindered, despite the fact that combining individual knowledge can significantly improve the relevance and quality of responses in specialized queries and make AI more professional at providing services. Two key bottlenecks are private data repositories' locality constraints and the need to maintain compatibility with mainstream search techniques, particularly Hierarchical Navigable Small World (HNSW) indexing for high-dimensional vector spaces. In this work, we develop a secure and privacy-preserving aggregated approximate nearest neighbor search (SP-A$^2$NN) with HNSW compatibility under a threshold-based searchable sharing primitive. A sharable bitgraph structure is constructed and extended to support searches and dynamical insertions over shared data without compromising the underlying graph topology. The approach reduces the complexity of a search from $O(n^2)$ to $O(n)$ compared to naive (undirected) graph-sharing approach when organizing graphs in the identical HNSW manner.
On the theoretical front, we explore a novel security analytical framework that incorporates privacy analysis via reductions. The proposed leakage-guessing proof system is built upon an entirely different interactive game that is independent of existing coin-toss game design. Rather than being purely theoretical, this system is rooted in existing proof systems but goes beyond them to specifically address leakage concerns and standardize leakage analysis -- one of the most critical security challenges with AI's rapid development.
Authors:Gabriele Costa
Abstract:
This paper presents a vulnerability assessment activity that we carried out on PosteID, the implementation of the Italian Public Digital Identity System (SPID) by Poste Italiane. The activity led to the discovery of a critical privilege escalation vulnerability, which was eventually patched. The overall analysis and disclosure process represents a valuable case study for the community of ethical hackers. In this work, we present both the technical steps and the details of the disclosure process.
Authors:Iman Vakilinia
Abstract:
With billions of users and an immense volume of daily uploads, YouTube has become an attractive target for cybercriminals aiming to leverage its vast audience. The platform's openness and trustworthiness provide an ideal environment for deceptive campaigns that can operate under the radar of conventional security tools. This paper explores how cybercriminals exploit YouTube to disseminate malware, focusing on campaigns that promote free software or game cheats. It discusses deceptive video demonstrations and the techniques behind malware delivery. Additionally, the paper presents a new evasion technique that abuses YouTube's multilingual metadata capabilities to circumvent automated detection systems. Findings indicate that this method is increasingly being used in recent malicious videos to avoid detection and removal.
Authors:Senthilkumar Gopal
Abstract:
APIs (Application Programming Interfaces) or Web Services are the foundational building blocks that enable interconnected systems. However this proliferation of APIs has also introduced security challenges that require systematic and scalable solutions for secure authentication and authorization. This paper presents the fundamentals necessary for building a such a token-based API security system. It discusses the components necessary, the integration of OAuth 2.0, extensibility of the token architectures, necessary cryptographic foundations, and persistence strategies to ensure secure and resilient operations. In addition to architectural concerns, the paper explores best practices for token lifecycle management, scope definition, expiration policies, and revocation mechanisms, all framed within a real-world scenario. By adhering to these principles, developers can establish a robust baseline while maintaining the flexibility to customize their domain-specific requirements. The approach does not claim to cover all variations necessary for diverse architectures but instead focuses on key principles essential for any standard API token authentication system. Throughout, the paper emphasizes balancing practical considerations with security imperatives and uses key concepts such as the CIA triad, OAuth standards, secure token life cycle, and practices for protecting sensitive user and application data. The intent is to equip developers with the foundational knowledge necessary to build secure, scalable token-based API security systems ready to handle the evolving threat landscape.
Authors:Farzin Renan
Abstract:
Digital signatures are fundamental cryptographic tools that provide authentication and integrity in digital communications. However, privacy-sensitive applications, such as e-voting and digital cash, require more restrictive verification models to ensure confidentiality and control. Strong Designated Verifier Signature (SDVS) schemes address this need by enabling the signer to designate a specific verifier, ensuring that only this party can validate the signature. Existing SDVS constructions are primarily based on number-theoretic assumptions and are therefore vulnerable to quantum attacks. Although post-quantum alternatives, particularly those based on lattices, have been proposed, they often entail large key and signature sizes. In this work, we present $\mathsf{CSI\text{-}SDVS}$, a novel isogeny-based SDVS scheme that offers a compact, quantum-resistant alternative to existing SDVS constructions. The scheme leverages the ideal class group action on $\mathbb{F}_p$-isomorphism classes of supersingular elliptic curves and is founded on the hardness of the Multi-Target Group Action Inverse Problem (MT-GAIP). $\mathsf{CSI\text{-}SDVS}$ achieves strong security guarantees, Strong Unforgeability under Chosen-Message Attacks (SUF-CMA), Non-Transferability (NT), and Privacy of Signer's Identity (PSI), in the random oracle model, thereby making it among the most compact PQC-based SDVS schemes and the only post-quantum secure construction based on isogenies.
Authors:MA. Khajeian
Abstract:
Long, human-generated passwords pose significant challenges to both classical and quantum attacks due to their irregular structure and large search space. In this work, we propose an enhanced classical-quantum hybrid attack specifically designed for this scenario. Our approach constructs rainbow tables using dictionary-based password generation augmented with transformation rules that better capture real-world user behavior. These tables are organized into buckets, enabling faster lookup and reduced space complexity. For the search within each bucket, we employ a distributed exact variant of Grover's algorithm. This method provides deterministic success and significantly lower circuit depth, enhancing robustness against noise-particularly depolarizing errors common in near-term quantum devices. Overall, our hybrid framework improves the efficiency and practicality of password recovery for long, human-readable passwords in realistic adversarial settings.
Authors:Pengfei Du
Abstract:
Large Language Models (LLMs) have demonstrated remarkable capabilities across diverse applications, yet they pose significant security risks that threaten their safe deployment in critical domains. Current security alignment methodologies predominantly rely on Process Reward Models (PRMs) to evaluate intermediate reasoning steps, introducing substantial computational overhead and scalability constraints. This paper presents a novel PRM-free security alignment framework that leverages automated red teaming and adversarial training to achieve robust security guarantees while maintaining computational efficiency. Our approach systematically identifies vulnerabilities through sophisticated attack strategies including genetic algorithm optimization, multi-agent simulation, and advanced prompt mutation techniques. The framework enhances model robustness via targeted adversarial training with curriculum learning and adaptive regularization mechanisms. Comprehensive experimental evaluation across five state-of-the-art LLMs demonstrates that our method achieves superior security alignment performance compared to PRM-based approaches while reducing computational costs by 61\%. The framework incorporates transparent reporting and continuous audit mechanisms that enable iterative security improvement and regulatory compliance. Our contributions advance the field of efficient LLM security alignment by democratizing access to robust security measures for resource-constrained organizations and providing a scalable foundation for addressing evolving adversarial threats.
Authors:Serhan W. Bahar
Abstract:
The rapid integration of blockchain, cryptocurrency, and Web3 technologies into digital banks and fintech operations has created an integrated environment blending traditional financial systems with decentralised elements. This paper introduces the CryptoNeo Threat Modelling Framework (CNTMF), a proposed framework designed to address the risks in these ecosystems, such as oracle manipulation and cross-chain exploits. CNTMF represents a proposed extension of established methodologies like STRIDE, OWASP Top 10, NIST frameworks, LINDDUN, and PASTA, while incorporating tailored components including Hybrid Layer Analysis, the CRYPTOQ mnemonic for cryptocurrency-specific risks, and an AI-Augmented Feedback Loop. Drawing on real-world data from 2025 incidents, CNTMF supports data-driven mitigation to reduce losses, which totalled approximately $2.47 billion in the first half of 2025 across 344 security events (CertiK via GlobeNewswire, 2025; Infosecurity Magazine, 2025). Its phases guide asset mapping, risk profiling, prioritisation, mitigation, and iterative feedback. This supports security against evolving risks like state-sponsored attacks.
Authors:Saurav Ghosh
Abstract:
Quantum computing poses fundamental risks to classical blockchain systems by undermining widely used cryptographic primitives. In response, two major research directions have emerged: post-quantum blockchains, which integrate quantum-resistant algorithms, and quantum blockchains, which leverage quantum properties such as entanglement and quantum key distribution. This survey reviews key developments in both areas, analyzing their cryptographic foundations, architectural designs, and implementation challenges. This work provides a comparative overview of technical proposals, highlight trade-offs in security, scalability, and deployment, and identify open research problems across hardware, consensus, and network design. The goal is to offer a structured and comprehensive reference for advancing secure blockchain systems in the quantum era.
Authors:Cara Ellen Appel
Abstract:
This article presents the current state of ML-security and of the documentation of ML-based systems, models and datasets in research and practice based on an extensive review of the existing literature. It shows a generally low awareness of security aspects among ML-practitioners and organizations and an often unstandardized approach to documentation, leading to overall low quality of ML-documentation. Existing standards are not regularly adopted in practice and IT-security aspects are often not included in documentation. Due to these factors, there is a clear need for improved security documentation in ML, as one step towards addressing the existing gaps in ML-security. To achieve this, we propose expanding existing documentation standards for ML-documentation to include a security section with specific security relevant information. Implementing this, a novel expanded method of documenting security requirements in ML-documentation is presented, based on the existing Model Cards and Datasheets for Datasets standards, but with the recommendation to adopt these findings in all ML-documentation.
Authors:Quanyan Zhu
Abstract:
Protecting cyberspace requires not only advanced tools but also a shift in how we reason about threats, trust, and autonomy. Traditional cybersecurity methods rely on manual responses and brittle heuristics. To build proactive and intelligent defense systems, we need integrated theoretical frameworks and software tools. Game theory provides a rigorous foundation for modeling adversarial behavior, designing strategic defenses, and enabling trust in autonomous systems. Meanwhile, software tools process cyber data, visualize attack surfaces, verify compliance, and suggest mitigations. Yet a disconnect remains between theory and practical implementation.
The rise of Large Language Models (LLMs) and agentic AI offers a new path to bridge this gap. LLM-powered agents can operationalize abstract strategies into real-world decisions. Conversely, game theory can inform the reasoning and coordination of these agents across complex workflows. LLMs also challenge classical game-theoretic assumptions, such as perfect rationality or static payoffs, prompting new models aligned with cognitive and computational realities. This co-evolution promises richer theoretical foundations and novel solution concepts. Agentic AI also reshapes software design: systems must now be modular, adaptive, and trust-aware from the outset.
This chapter explores the intersection of game theory, agentic AI, and cybersecurity. We review key game-theoretic frameworks (e.g., static, dynamic, Bayesian, and signaling games) and solution concepts. We then examine how LLM agents can enhance cyber defense and introduce LLM-driven games that embed reasoning into AI agents. Finally, we explore multi-agent workflows and coordination games, outlining how this convergence fosters secure, intelligent, and adaptive cyber systems.
Authors:Steve Tippeconnic
Abstract:
This experiment breaks a 5-bit elliptic curve cryptographic key using a Shor-style quantum attack. Executed on IBM's 133-qubit ibm_torino with Qiskit Runtime 2.0, a 15-qubit circuit, comprised of 10 logical qubits and 5 ancilla, interferes over an order-32 elliptic curve subgroup to extract the secret scalar k from the public key relation Q = kP, without ever encoding k directly into the oracle. From 16,384 shots, the quantum interference reveals a diagonal ridge in the 32 x 32 QFT outcome space. The quantum circuit, over 67,000 layers deep, produced valid interference patterns despite extreme circuit depth, and classical post-processing revealed k = 7 in the top 100 invertible (a, b) results. All code, circuits, and raw data are publicly available for replication.
Authors:Hari Masoor
Abstract:
Current AI agent architectures suffer from ephemeral memory limitations, preventing effective collaboration and knowledge sharing across sessions and agent boundaries. We introduce SAMEP (Secure Agent Memory Exchange Protocol), a novel framework that enables persistent, secure, and semantically searchable memory sharing among AI agents. Our protocol addresses three critical challenges: (1) persistent context preservation across agent sessions, (2) secure multi-agent collaboration with fine-grained access control, and (3) efficient semantic discovery of relevant historical context. SAMEP implements a distributed memory repository with vector-based semantic search, cryptographic access controls (AES-256-GCM), and standardized APIs compatible with existing agent communication protocols (MCP, A2A). We demonstrate SAMEP's effectiveness across diverse domains including multi-agent software development, healthcare AI with HIPAA compliance, and multi-modal processing pipelines. Experimental results show 73% reduction in redundant computations, 89% improvement in context relevance scores, and complete compliance with regulatory requirements including audit trail generation. SAMEP enables a new paradigm of persistent, collaborative AI agent ecosystems while maintaining security and privacy guarantees.
Authors:Marc Bara
Abstract:
We present PromptChain, a decentralized Web3 architecture that establishes AI prompts as first-class digital assets with verifiable ownership, version control, and monetization capabilities. Current centralized platforms lack mechanisms for proper attribution, quality assurance, or fair compensation for prompt creators. PromptChain addresses these limitations through a novel integration of IPFS for immutable storage, smart contracts for governance, and token incentives for community curation. Our design includes: (1) a comprehensive metadata schema for cross-model compatibility, (2) a stake-weighted validation mechanism to align incentives, and (3) a token economy that rewards contributors proportionally to their impact. The proposed architecture demonstrates how decentralized systems could potentially match centralized alternatives in efficiency while providing superior ownership guarantees and censorship resistance through blockchain-anchored provenance tracking. By decoupling prompts from specific AI models or outputs, this work establishes the foundation for an open ecosystem of human-AI collaboration in the Web3 era, representing the first systematic treatment of prompts as standalone digital assets with dedicated decentralized infrastructure.
Authors:Quanyan Zhu
Abstract:
We introduce the framework of LLM-Stackelberg games, a class of sequential decision-making models that integrate large language models (LLMs) into strategic interactions between a leader and a follower. Departing from classical Stackelberg assumptions of complete information and rational agents, our formulation allows each agent to reason through structured prompts, generate probabilistic behaviors via LLMs, and adapt their strategies through internal cognition and belief updates. We define two equilibrium concepts: reasoning and behavioral equilibrium, which aligns an agent's internal prompt-based reasoning with observable behavior, and conjectural reasoning equilibrium, which accounts for epistemic uncertainty through parameterized models over an opponent's response. These layered constructs capture bounded rationality, asymmetric information, and meta-cognitive adaptation. We illustrate the framework through a spearphishing case study, where a sender and a recipient engage in a deception game using structured reasoning prompts. This example highlights the cognitive richness and adversarial potential of LLM-mediated interactions. Our results show that LLM-Stackelberg games provide a powerful paradigm for modeling decision-making in domains such as cybersecurity, misinformation, and recommendation systems.
Authors:Serhan W. Bahar
Abstract:
The emergence of quantum computing presents profound challenges to existing cryptographic infrastructures, whilst the development of central bank digital currencies (CBDCs) has raised concerns regarding privacy preservation and excessive centralisation in digital payment systems. This paper proposes the Quantum-Resilient Privacy Ledger (QRPL) as an innovative token-based digital currency architecture that incorporates National Institute of Standards and Technology (NIST)-standardised post-quantum cryptography (PQC) with hash-based zero-knowledge proofs to ensure user sovereignty, scalability, and transaction confidentiality. Key contributions include adaptations of ephemeral proof chains for unlinkable transactions, a privacy-weighted Proof-of-Stake (PoS) consensus to promote equitable participation, and a novel zero-knowledge proof-based mechanism for privacy-preserving selective disclosure. QRPL aims to address critical shortcomings in prevailing CBDC designs, including risks of pervasive surveillance, with a 10-20 second block time to balance security and throughput in future monetary systems. While conceptual, empirical prototypes are planned. Future work includes prototype development to validate these models empirically.
Authors:Craig S Wright
Abstract:
It is frequently claimed in blockchain discourse that immutability guarantees trust. This paper rigorously refutes that assertion. We define immutability as the cryptographic persistence of historical states in an append-only data structure and contrast it with trust, understood as a rational epistemic expectation under uncertainty. Employing predicate logic, automata-theoretic models, and epistemic game-theoretic analysis, we demonstrate that immutability neither entails nor implies correctness, fairness, or credibility. Through formal constructions and counterexamples--including predictive fraud schemes and the phenomenon of garbage permanence--we show that the belief conflates structural and epistemic domains. Immutability preserves all data equally, regardless of veracity. Therefore, the assertion that immutability guarantees trust collapses under the weight of formal scrutiny.
Authors:Matthias Johann Steiner
Abstract:
Let $\mathbb{F}_q$ be a prime field with $q \geq 3$, and let $d, m \geq 1$ be integers such that $\gcd \left( d, q \right) = 1$ and $m \mid (q - 1)$. In this paper we bound the absolute values of the Walsh spectrum of S-Boxes $S (x) = x^d \cdot T \left( x^\frac{q - 1}{m} \right)$, where $T$ is a function with $T (x) \neq 0$ if $x \neq 0$. Such S-Boxes have been proposed for the Zero-Knowledge-friendly hash functions Grendel and Polocolo. In particular, we prove the conjectured correlation of the Polocolo S-Box.
Authors:Kim Peiter Jørgensen
Abstract:
Wallets are access points for the digital economys value creation. Wallets for blockchains store the end-users cryptographic keys for administrating their digital assets and enable access to blockchain Web3 systems. Web3 delivers new service opportunities. This chapter focuses on the Web3 enabled release of value through the lens of wallets. Wallets may be implemented as software apps on smartphones, web apps on desktops, or hardware devices. Wallet users request high security, ease of use, and access of relevance from their wallets. Increasing connectivity, functionality, autonomy, personal support, and offline capability make the wallet into the user's Universal Access Device for any digital asset. Through wallet based services, the owner obtains enhanced digital empowerment. The new Web3 solutionareas, Identity and Decentralisation, enable considerable societal effects, and wallets are an integral part of these. One example is self sovereign identity solutions combined with wallet borne AI for personalised support, empowering the enduser beyond anything previously known. Improved welfare is foreseen globally through enlarged markets with collaborative services with drastically lowered transaction costs compared to today, the expected vastly increased levels of automation in society necessitate enhanced enduser protection. As wallets are considered a weak spot for security, improving overall security through blockchains is essential.
Authors:Abel C. H. Chen
Abstract:
Since many applications and services require pseudorandom numbers (PRNs), it is feasible to generate specific PRNs under given key values and input messages using Key Derivation Functions (KDFs). These KDFs are primarily constructed based on Message Authentication Codes (MACs), where the MAC serves as a core component in the generation of pseudorandom numbers. In light of this, the study first examines three MAC algorithms defined by the National Institute of Standards and Technology (NIST): the Keyed-Hash Message Authentication Code (HMAC), the Cipher-based Message Authentication Code (CMAC), and the Keccak-based Message Authentication Code (KMAC). Subsequently, the study explores KDFs based on these MACs, including the Counter Mode KDF, the KMAC-based KDF, and the KDF defined in IEEE 1609.2.1. In experiments, the computation times for generating MACs and the corresponding pseudorandom numbers using each KDF are evaluated. The study further analyzes the advantages, disadvantages, and applicable scenarios for each method. Experimental results indicate that the CMAC and the CMAC-based KDF exhibit the shortest computation times, averaging approximately 0.007 milliseconds and 0.014 milliseconds, respectively.
Authors:Craig Wright
Abstract:
The so-called blockchain trilemma asserts the impossibility of simultaneously achieving scalability, security, and decentralisation within a single blockchain protocol. In this paper, we formally refute that proposition. Employing predicate logic, formal automata theory, computational complexity analysis, and graph-theoretic measures of relay topology--specifically Baran's model of network path redundancy--we demonstrate that the trilemma constitutes a category error, conflates distinct analytical domains, and relies upon unproven causal assumptions. We further expose its reliance on composition fallacies drawn from flawed system implementations. A constructive counterexample is presented: a blockchain protocol exhibiting unbounded transaction throughput, cryptographic security under adversarial load, and multipath decentralised propagation. This example is not hypothetical but grounded in protocol design enabled by compact block relay, SPV verification, and IPv6 multicast. The trilemma is revealed not as a law of protocol architecture, but as a heuristic fallacy sustained by imprecision and design defeatism.
Authors:Yusei Tanaka
Abstract:
Round-based DAGs enable high-performance Byzantine fault-tolerant consensus, yet their technical advantages remain underutilized due to their short history. While research on consensus protocols is active in both academia and industry, many studies overlook implementation-level algorithms, leaving actual performance unclear - particularly for theoretical protocols whose practical performance cannot often be evaluated. Bullshark, a Round-based DAG BFT protocol on Narwhal mempool, achieves optimal performance: 297,000 transactions per second with 2-second latency. We analyze the algorithm's workflow, from transaction submission to blockchain commitment, breaking it down layer by layer at the functional level and delineating the key features and interactions of the Bullshark and Narwhal components. Future work aims to improve performance in Byzantine fault environments and optimize trade-offs in the CAP theorem.
Authors:Sathesh P. Sivashanmugam
Abstract:
Large language models (LLMs) have transformed natural language processing, but their ability to memorize training data poses significant privacy risks. This paper investigates model inversion attacks on the Llama 3.2 model, a multilingual LLM developed by Meta. By querying the model with carefully crafted prompts, we demonstrate the extraction of personally identifiable information (PII) such as passwords, email addresses, and account numbers. Our findings highlight the vulnerability of even smaller LLMs to privacy attacks and underscore the need for robust defenses. We discuss potential mitigation strategies, including differential privacy and data sanitization, and call for further research into privacy-preserving machine learning techniques.
Authors:Sri Harsha Gajavalli
Abstract:
Privacy-preserving machine learning (ML) seeks to balance data utility and privacy, especially as regulations like the GDPR mandate the anonymization of personal data for ML applications. Conventional anonymization approaches often reduce data utility due to indiscriminate generalization or suppression of data attributes. In this study, we propose an interactive approach that incorporates human input into the k-anonymization process, enabling domain experts to guide attribute preservation based on contextual importance. Using the UCI Adult dataset, we compare classification outcomes of interactive human-influenced anonymization with traditional, fully automated methods. Our results show that human input can enhance data utility in some cases, although results vary across tasks and settings. We discuss limitations of our approach and suggest potential areas for improved interactive frameworks in privacy-aware ML.
Authors:Jovonni L. PHarr
Abstract:
This work presents a novel decentralized protocol for digital estate planning that integrates advances distributed computing, and cryptography. The original proof-of-concept was constructed using purely solidity contracts. Since then, we have enhanced the implementation into a layer-1 protocol that uses modern interchain communication to connect several heterogeneous chain types. A key contribution of this research is the implementation of several modern cryptographic primitives to support various forms of claims for information validation. These primitives introduce an unmatched level of privacy to the process of digital inheritance. We also demonstrate on a set of heterogeneous smart contracts, following the same spec, on each chain to serve as entry points, gateways, or bridge contracts that are invoked via a path from the will module on our protocol, to the contract. This ensures a fair and secure distribution of digital assets in accordance with the wishes of the decedent without the requirement of moving their funds. This research further extends its innovations with a user interaction model, featuring a check-in system and account abstraction process, which enhances flexibility and user-friendliness without compromising on security. By developing a dedicated permissionless blockchain that is secured by a network of validators, and interchain relayers, the proposed protocol signifies a transformation in the digital estate planning industry and illustrates the potential of blockchain technology in revolutionizing traditional legal and personal spheres. Implementing a cryptoeconomic network at the core of inheritance planning allows for unique incentive compatible economic mechanisms to be constructed.
Authors:Ricardo Francisco Martinez-Gonzalez
Abstract:
Digital implementations of chaotic systems often suffer from inherent degradation, limiting their long-term performance and statistical quality. To address this challenge, we propose a novel four-stage synchronized piecewise linear chaotic map. This new map is meticulously designed with four independent segments, each possessing its own control parameters, specifically engineered to mitigate the natural degradation observed in digitally realized dynamical systems. We characterize its behavior using established tools from nonlinear dynamics, including bifurcation diagrams and graphical analysis, which provide a comprehensive qualitative understanding of its complex dynamics. To rigorously validate the statistical features of the generated sequences, we employed the National Institute of Standards and Technology (NIST) statistical testing suite. A substantial 100 MB dataset, comprising sequences produced by the proposed map, was generated via a Matlab script and subjected to this rigorous battery of tests. Our results demonstrate that the proposed map exhibits superior statistical properties compared to the classic Bernoulli map, successfully passing all NIST tests where the traditional map did not. This research confirms the proposed map's potential as a robust and high-quality source for chaotic sequence generation.
Authors:Rayne Holland
Abstract:
Linear sketches are fundamental tools in data stream analytics. They are notable for supporting both approximate frequency queries and heavy hitter detection with bounded trade-offs for error and memory. Importantly, on streams that contain sensitive information, linear sketches can be easily privatized with the injection of a suitable amount of noise. This process is efficient in the single release model, where the output is released only at the end of the stream. In this setting, it suffices to add noise to the sketch once.
In contrast, in the continual observation model, where the output is released at every time-step, fresh noise needs to be added to the sketch before each release. This creates an additional computational overhead. To address this, we introduce Lazy Sketch, a novel differentially private sketching method that employs lazy updates, perturbing and modifying only a small portion of the sketch at each step. Compared to prior work, we reduce the update complexity by a factor of $O(w)$, where $w$ is the width of the sketch. Experiments demonstrate that our method increases throughput by up to 250x over prior work, making continual observation differential privacy practical for high-speed streaming applications.
In addition, for heavy hitter detection, we present a new sketch-based algorithm that leverages lazy updates to achieve a per-update complexity of $O(d \log (T/w) + \log w)$, for linear sketches with dimension $d\times w$ and streams of length $T$. This marks a significant improvement over prior approaches in the streaming continual observation model, which require recomputing frequency estimates for every item in the input domain at each time step.
Authors:Zhaorun Lin
Abstract:
Programmable blockchains have long been a hot research topic given their tremendous use in decentralized applications. Smart contracts, using blockchains as their underlying technology, inherit the desired properties such as verifiability, immutability, and transparency, which make it a great suit in trustless environments.
In this thesis, we consider several decentralized protocols to be built on blockchains, specifically using smart contracts on Ethereum. We used algorithmic and cryptographic tools in our implementations to further improve the level of security and efficiency beyond the state-of-the-art works. We proposed a new approach called Blind Vote, which is an untraceable, secure, efficient, secrecy-preserving, and fully on-chain electronic voting protocol based on the well-known concept of Chaum's blind signatures. We illustrate that our approach achieves the same security guarantees as previous methods such as Tornado Vote [1], while consuming significantly less gas. Thus, we provide a cheaper and considerably more gas-efficient alternative for anonymous blockchain-based voting. On the other hand, we propose a new family of algorithms for private, trustless auctions that protect bidder identities and bid values while remaining practical for smart contract execution. We ensure trustlessness by running the auction logic in a smart contract, thereby eliminating reliance on any single trusted party. This approach prevents bid tampering, front-running, and collusion by enforcing immutability and decentralized verification of bids. The resulting protocol uniquely combines efficiency, trustlessness, and enduring bid privacy, offering a scalable and secure solution for blockchain-based marketplaces and other decentralized applications.
Authors:Olli Järviniemi
Abstract:
We evaluate language models' ability to subvert monitoring protocols via collusion. More specifically, we have two instances of a model design prompts for a policy (P) and a monitor (M) in a programming task setting. The models collaboratively aim for M to classify all backdoored programs in an auditing dataset as harmful, but nevertheless classify a backdoored program produced by P as harmless. The models are isolated from each other, requiring them to independently arrive at compatible subversion strategies. We find that while Claude 3.7 Sonnet has low success rate due to poor convergence, it sometimes successfully colludes on non-obvious signals.
Authors:Benjamin A. Antunes
Abstract:
Machine learning (ML) frameworks rely heavily on pseudorandom number generators (PRNGs) for tasks such as data shuffling, weight initialization, dropout, and optimization. Yet, the statistical quality and reproducibility of these generators-particularly when integrated into frameworks like PyTorch, TensorFlow, and NumPy-are underexplored. In this paper, we compare the statistical quality of PRNGs used in ML frameworks (Mersenne Twister, PCG, and Philox) against their original C implementations. Using the rigorous TestU01 BigCrush test suite, we evaluate 896 independent random streams for each generator. Our findings challenge claims of statistical robustness, revealing that even generators labeled ''crush-resistant'' (e.g., PCG, Philox) may fail certain statistical tests. Surprisingly, we can observe some differences in failure profiles between the native and framework-integrated versions of the same algorithm, highlighting some implementation differences that may exist.
Authors:Michael A. Idowu
Abstract:
We present a deterministic framework for cryptographic seed generation based on cyclic modular inversion over $\mathbb{Z}/3^p\mathbb{Z}$. The method enforces algebraic admissibility on seed inputs via the identity $d_k \equiv -\left(2^{k-1}\right)^{-1} \bmod 3^p$, thereby producing structured and invertible residue sequences. This mapping yields entropy-rich, cycle-complete seeds well-suited for cryptographic primitives such as DRBGs, KDFs, and post-quantum schemes. To assess the quality of randomness, we introduce the Entropy Confidence Score (ECS), a composite metric reflecting coverage, uniformity, and modular bias. Although not a cryptographic PRNG in itself, the framework serves as a deterministic entropy filter that conditions and validates seed inputs prior to their use by conventional generators. Empirical and hardware-based results confirm constant-time execution, minimal side-channel leakage, and lightweight feasibility for embedded applications. The framework complements existing cryptographic stacks by acting as an algebraically verifiable entropy filter, thereby enhancing structural soundness and auditability.
Authors:Guang Yang
Abstract:
The rapid advancement of diffusion models, particularly Stable Diffusion 3.5, has enabled the generation of highly photorealistic synthetic images that pose significant challenges to existing detection methods. This paper presents FreqCross, a novel multi-modal fusion network that combines spatial RGB features, frequency domain artifacts, and radial energy distribution patterns to achieve robust detection of AI-generated images. Our approach leverages a three-branch architecture: (1) a ResNet-18 backbone for spatial feature extraction, (2) a lightweight CNN for processing 2D FFT magnitude spectra, and (3) a multi-layer perceptron for analyzing radial energy profiles. We introduce a novel radial energy distribution analysis that captures characteristic frequency artifacts inherent in diffusion-generated images, and fuse it with spatial and spectral cues via simple feature concatenation followed by a compact classification head. Extensive experiments on a dataset of 10,000 paired real (MS-COCO) and synthetic (Stable Diffusion 3.5) images demonstrate that FreqCross achieves 97.8\% accuracy, outperforming state-of-the-art baselines by 5.2\%. The frequency analysis further reveals that synthetic images exhibit distinct spectral signatures in the 0.1--0.4 normalised frequency range, providing theoretical foundation for our approach. Code and pre-trained models are publicly available to facilitate reproducible research.
Authors:Ben Kereopa-Yorke
Abstract:
This paper examines how distinct cultures of AI interdisciplinarity emerge through interface design, revealing the formation of new disciplinary cultures at these intersections. Through the Interface-Mediated Cognitive Security (IMCS) framework, I demonstrate how the collision of cybersecurity engineering, cognitive psychology, critical technology studies, and human-computer interaction generates research cultures that transcend traditional disciplinary boundaries. AI interfaces function as transformative boundary objects that necessitate methodological fusion rather than mere collaboration, simultaneously embodying technical architectures, psychological design patterns, and social interaction models. Through systematic visual analysis of generative AI platforms and case studies across public sector, medical, and educational domains, I identify four vulnerability vectors, Reflection Simulation, Authority Modulation, Cognitive Load Exploitation, and Market-Security Tension, that structure interface-mediated cognitive security. This research challenges three significant gaps in interdisciplinary theory: the assumption that disciplines maintain distinct methodological boundaries during collaboration, the belief that technical and social knowledge practices can be cleanly separated, and the presumption that disciplinary integration occurs through formal rather than cultural mechanisms. The empirical evidence demonstrates how interfaces function as sites of epistemological collision, creating methodological pressure zones where traditional disciplinary approaches prove insufficient for analysing the complex socio-technical phenomena at the interface.
Authors:Giulio Caldarelli
Abstract:
The blockchain oracle problem, which refers to the challenge of injecting reliable external data into decentralized systems, remains a fundamental limitation to the development of trustless applications. While recent years have seen a proliferation of architectural, cryptographic, and economic strategies to mitigate this issue, no one has yet fully resolved the fundamental question of how a blockchain can gain knowledge about the off-chain world. In this position paper, we critically assess the role artificial intelligence (AI) can play in tackling the oracle problem. Drawing from both academic literature and practitioner implementations, we examine how AI techniques such as anomaly detection, language-based fact extraction, dynamic reputation modeling, and adversarial resistance can enhance oracle systems. We observe that while AI introduces powerful tools for improving data quality, source selection, and system resilience, it cannot eliminate the reliance on unverifiable off-chain inputs. Therefore, this study supports the idea that AI should be understood as a complementary layer of inference and filtering within a broader oracle design, not a substitute for trust assumptions.
Authors:Mohammed K. Alzaylaee
Abstract:
Smart homes that integrate Internet of Things (IoT) devices face increasing cybersecurity risks, posing significant challenges to these environments. The study explores security threats in smart homes ecosystems, categorizing them into vulnerabilities at the network layer, device level, and those from cloud-based and AI-driven systems. Research findings indicate that post-quantum encryption, coupled with AI-driven anomaly detection, is highly effective in enhancing security; however, computational resource demands present significant challenges. Blockchain authentication together with zero-trust structures builds security resilience, although they need changes to existing infrastructure. The specific security strategies show their effectiveness through ANOVA, Chi-square tests, and Monte Carlo simulations yet lack sufficient scalability according to the results. The research demonstrates the requirement for improvement in cryptographic techniques, alongside AI-enhanced threat detection and adaptive security models which must achieve a balance between performance and efficiency and real-time applicability within smart home ecosystems.
Authors:Fabio Correa Xavier
Abstract:
In a world where deepfakes and cloned voices are emerging as sophisticated attack vectors, organizations require a new security mindset: Sensorial Zero Trust [9]. This article presents a scientific analysis of the need to systematically doubt information perceived through the senses, establishing rigorous verification protocols to mitigate the risks of fraud based on generative artificial intelligence. Key concepts, such as Out-of-Band verification, Vision-Language Models (VLMs) as forensic collaborators, cryptographic provenance, and human training, are integrated into a framework that extends Zero Trust principles to human sensory information. The approach is grounded in empirical findings and academic research, emphasizing that in an era of AI-generated realities, even our eyes and ears can no longer be implicitly trusted without verification. Leaders are called to foster a culture of methodological skepticism to protect organizational integrity in this new threat landscape.
Authors:Craig S Wright
Abstract:
This paper presents a complete formal specification, protocol description, and mathematical proof structure for Simplified Payment Verification (SPV) as originally defined in the Bitcoin whitepaper \cite{nakamoto2008}. In stark contrast to the misrepresentations proliferated by popular implementations, we show that SPV is not only secure under bounded adversarial assumptions but strictly optimal for digital cash systems requiring scalable and verifiable transaction inclusion. We reconstruct the SPV protocol from first principles, grounding its verification model in symbolic automata, Merkle membership relations, and chain-of-proof dominance predicates. Through rigorous probabilistic and game-theoretic analysis, we derive the economic bounds within which the protocol operates securely and verify its liveness and safety properties under partial connectivity, hostile relay networks, and adversarial propagation delay. Our specification further introduces low-bandwidth optimisations such as adaptive polling and compressed header synchronisation while preserving correctness. This document serves both as a blueprint for secure SPV implementation and a rebuttal of common misconceptions surrounding non-validating clients.
Authors:Alessio Di Santo
Abstract:
This analysis focuses on a single Azure-hosted Virtual Machine at 52.230.23.114 that the adversary converted into an all-in-one delivery, staging and Command-and-Control node. The host advertises an out-of-date Apache 2.4.52 instance whose open directory exposes phishing lures, PowerShell loaders, Reflective Shell-Code, compiled Havoc Demon implants and a toolbox of lateral-movement binaries; the same server also answers on 8443/80 for encrypted beacon traffic. The web tier is riddled with publicly documented critical vulnerabilities, that would have allowed initial code-execution had the attackers not already owned the device.
Initial access is delivered through an HTML file that, once de-obfuscated, perfectly mimics Google Unusual sign-in attempt notification and funnels victims toward credential collection. A PowerShell command follows: it disables AMSI in-memory, downloads a Base64-encoded stub, allocates RWX pages and starts the shell-code without ever touching disk. That stub reconstructs a DLL in memory using the Reflective-Loader technique and hands control to Havoc Demon implant. Every Demon variant-32- and 64-bit alike-talks to the same backend, resolves Windows APIs with hashed look-ups, and hides its activity behind indirect syscalls.
Runtime telemetry shows interests in registry under Image File Execution Options, deliberate queries to Software Restriction Policy keys, and heavy use of Crypto DLLs to protect payloads and C2 traffic. The attacker toolkit further contains Chisel, PsExec, Doppelganger and Whisker, some of them re-compiled under user directories that leak the developer personas tonzking123 and thobt. Collectively the findings paint a picture of a technically adept actor who values rapid re-tooling over deep operational security, leaning on Havoc modularity and on legitimate cloud services to blend malicious flows into ordinary enterprise traffic.
Authors:Hasan YiÄit
Abstract:
AI-Hybrid TRNG is a deep-learning framework that extracts near-uniform entropy directly from physical noise, eliminating the need for bulky quantum devices or expensive laboratory-grade RF receivers. Instead, it relies on a low-cost, thumb-sized RF front end, plus CPU-timing jitter, for training, and then emits 32-bit high-entropy streams without any quantization step.
Unlike deterministic or trained artificial intelligence random number generators (RNGs), our dynamic inner-outer network couples adaptive natural sources and reseeding, yielding truly unpredictable and autonomous sequences. Generated numbers pass the NIST SP 800-22 battery better than a CPU-based method. It also passes nineteen bespoke statistical tests for both bit- and integer-level analysis. All results satisfy cryptographic standards, while forward and backward prediction experiments reveal no exploitable biases. The model's footprint is below 0.5 MB, making it deployable on MCUs and FPGA soft cores, as well as suitable for other resource-constrained platforms.
By detaching randomness quality from dedicated hardware, AI-Hybrid TRNG broadens the reach of high-integrity random number generators across secure systems, cryptographic protocols, embedded and edge devices, stochastic simulations, and server applications that need randomness.
Authors:Eyhab Al-Masri
Abstract:
This paper presents NeutroSENSE, a neutrosophic-enhanced ensemble framework for interpretable intrusion detection in IoT environments. By integrating Random Forest, XGBoost, and Logistic Regression with neutrosophic logic, the system decomposes prediction confidence into truth (T), falsity (F), and indeterminacy (I) components, enabling uncertainty quantification and abstention. Predictions with high indeterminacy are flagged for review using both global and adaptive, class-specific thresholds. Evaluated on the IoT-CAD dataset, NeutroSENSE achieved 97% accuracy, while demonstrating that misclassified samples exhibit significantly higher indeterminacy (I = 0.62) than correct ones (I = 0.24). The use of indeterminacy as a proxy for uncertainty enables informed abstention and targeted review-particularly valuable in edge deployments. Figures and tables validate the correlation between I-scores and error likelihood, supporting more trustworthy, human-in-the-loop AI decisions. This work shows that neutrosophic logic enhances both accuracy and explainability, providing a practical foundation for trust-aware AI in edge and fog-based IoT security systems.